Implement the missing overload to handle UTF-8 specific data types,
including char8_t (C++20), char, uchar and signed char.
Introduce the helper function 'assign_helper_char8' which handles the
non-contiguous_iterator case. The contiguous_iterator case is already
handled by the QAnyStringView overload.
Include 'qstringconverter.h' at the end of the file, since it can't
be included at the top due to diamond dependency conflicts.
QStringDecoder is an implementation detail we don't want users to
depend on when using assign(it, it). It would be unnatural to not
be able to use a function just because we didn't include an
apparently unrelated header.
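
As a rough illustration of why the non-contiguous case needs stateful decoding, here is a minimal standard-library sketch. It is not the actual 'assign_helper_char8' and does not use QStringDecoder; the name assign_utf8 and the use of std::u16string are assumptions made for the example. It replaces truncated sequences with U+FFFD but does not reject overlong or out-of-range encodings.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes read one at a time from any input
// iterator and stores the result as UTF-16 code units in out.
template <typename InputIt>
void assign_utf8(std::u16string &out, InputIt first, InputIt last)
{
    out.clear();
    char32_t cp = 0;   // code point being assembled across iterations
    int pending = 0;   // continuation bytes still expected
    auto emit = [&out](char32_t c) {
        if (c < 0x10000) {
            out.push_back(static_cast<char16_t>(c));
        } else {       // encode as a surrogate pair
            c -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (c >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (c & 0x3FF)));
        }
    };
    for (; first != last; ++first) {
        const auto b = static_cast<unsigned char>(*first);
        if (pending > 0) {
            if ((b & 0xC0) == 0x80) {
                cp = (cp << 6) | (b & 0x3F);
                if (--pending == 0)
                    emit(cp);
                continue;
            }
            emit(0xFFFD);  // sequence was cut short
            pending = 0;   // reinterpret b as a lead byte below
        }
        if (b < 0x80)
            emit(b);
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; pending = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; pending = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; pending = 3; }
        else
            emit(0xFFFD);  // stray continuation or invalid lead byte
    }
    if (pending > 0)
        emit(0xFFFD);      // input ended in the middle of a sequence
}
```

The decoder state (cp, pending) has to survive between iterator increments, which is why a single-pass range cannot simply be handed to a view-based conversion.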
[ChangeLog][QtCore][QString] Enabled assign() for UTF-8 data types.
Fixes: QTBUG-114208
Change-Id: Ia39bbb70ca105a6bbf1a131b2533f29a919ff66d
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
By saying what's special about some of them
Pick-to: 6.6 6.5
Change-Id: I17bf2e12a27bf55f621020ddf3819ee9e606847d
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
QString::fromJsString() -> QString::fromEcmaString()
QString::toJsString() -> QString::toEcmaString()
For API naming compatibility with QByteArray::fromEcmaUint8Array().
Pick-to: 6.6
Change-Id: If6e2121e31e630d6728ed24e41d14b763f395aaa
Reviewed-by: Piotr Wierciński <piotr.wiercinski@qt.io>
Reviewed-by: Mikołaj Boc <Mikolaj.Boc@qt.io>
Reviewed-by: Lorn Potter <lorn.potter@gmail.com>
When appending to an empty string or byte array, we optimize and
copy the internal pointer. But if the other string/byte array was
created with fromRawData this might be temporary data on the stack/heap
and might be de-allocated or overwritten before the string/byte array
is used or is forced to make a deep-copy. This would lead to incorrect
data being used.
This is easy to overlook if you plan to append multiple strings
together, potentially supplied through an argument. Upon appending a
second string it would make a full copy, but there might not be a
guarantee for that. So, it's hard for users to avoid this pitfall!
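
The pitfall can be modelled without Qt. The sketch below uses a hypothetical Text class (not QString or QByteArray) that either owns its characters or points at caller-owned memory, and assumes the remedy is to deep-copy when the appended object wraps raw data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a string that may wrap caller-owned memory.
class Text
{
public:
    Text() = default;
    static Text fromRawData(const char *p, std::size_t n)
    {
        Text t;
        t.raw_ = p;      // no copy: the caller keeps ownership
        t.size_ = n;
        return t;
    }
    bool isRaw() const { return raw_ != nullptr; }
    std::size_t size() const { return size_; }
    const char *data() const { return raw_ ? raw_ : owned_.data(); }

    Text &append(const Text &other)
    {
        if (size_ == 0 && !other.isRaw()) {
            *this = other;   // shortcut is fine: other owns its data
        } else {
            own();           // never keep pointing at raw memory
            owned_.append(other.data(), other.size());
            size_ = owned_.size();
        }
        return *this;
    }

private:
    void own()
    {
        if (raw_) {
            owned_.assign(raw_, size_);
            raw_ = nullptr;
        }
    }
    const char *raw_ = nullptr;
    std::string owned_;
    std::size_t size_ = 0;
};
```

With this shape, overwriting or freeing the raw buffer after append() no longer affects the result.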
Fixes: QTBUG-115752
Pick-to: 6.6 6.5 6.2
Change-Id: Ia9aa5f463121c2ce2e0e8eee8a6c8612b7297f2b
Reviewed-by: Ahmad Samir <a.samirh78@gmail.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
There was a gap in its numbering, and the quick brown fix could do
with some competition.
Change-Id: I1283bbb6ba321ae2b65b4459327f2428a45f85cc
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
The same _data() will be re-used with trim().
Change-Id: Ie9b794b7e8d40552d9cacb71df0f8a151d4348a5
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
In C++20 std::basic_string_view has gained a range constructor (like
QStringView always had), but that range constructor has been made
explicit. This means we can't just pass a QString(View) to a function
taking a u16string_view. The consensus seems to be that types that
should implicitly convert towards stdlib's string views should do that
via implicit conversion operators. This patch adds them for
* QByteArrayView => std::string_view
* QString(View) => std::u16string_view
* QUtf8StringView => std::string_view or std::u8string_view, depending
on the storage_type
QLatin1StringView doesn't have a matching std:: view so I'm not enabling
its conversion.
QByteArray poses a challenge, in that it already defines a conversion
towards const char *. (One can disable that conversion with a macro.)
That conversion makes it impossible to support:
    QByteArray ba;
    std::string_view sv1(ba); // 1
    std::string_view sv2 = ba; // 2
because:
* if only operator const char *() is defined, then (2) doesn't work
(situation right now);
* if both conversions to const char * and string_view are defined, then
(1) is ambiguous on certain compilers (MSVC, QCC). Interestingly
enough, not on GCC/Clang, but only in C++17 and later modes.
I can't kill the conversion towards const char * (API break, and we use
it *everywhere* in Qt), hence, QByteArray does not get the implicit
conversion, at least not in this patch.
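
A minimal sketch of the conversion-operator approach, using a hypothetical Utf16View class in place of QStringView (the class and the countUnits function are invented for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical UTF-16 view type standing in for QStringView.
class Utf16View
{
public:
    constexpr Utf16View(const char16_t *p, std::size_t n) : p_(p), n_(n) {}

    // Implicit on purpose: lets the view bind to std::u16string_view
    // parameters without relying on the explicit C++20 range constructor.
    constexpr operator std::u16string_view() const noexcept
    {
        return std::u16string_view(p_, n_);
    }

private:
    const char16_t *p_;
    std::size_t n_;
};

// Any API written against the standard view type.
inline std::size_t countUnits(std::u16string_view sv) { return sv.size(); }
```

Both direct-initialization and copy-initialization of a std::u16string_view from such a type go through the operator, so callers need no explicit cast.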
[ChangeLog][QtCore][QByteArrayView] Added an implicit conversion
operator towards std::string_view.
[ChangeLog][QtCore][QString] Added an implicit conversion operator
towards std::u16string_view.
[ChangeLog][QtCore][QStringView] Added an implicit conversion operator
towards std::u16string_view.
[ChangeLog][QtCore][QUtf8StringView] Added an implicit conversion
operator towards std::string_view (QUtf8StringView is using char
as its storage type in Qt 6). Note that QUtf8StringView is planned to
use char8_t in Qt 7, therefore it is expected that the conversion will
change towards std::u8string_view in Qt 7.
Change-Id: I6d3b64d211a386241ae157765cd1b03f531f909a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
We are gradually enabling more tests for WebAssembly platform
for better test coverage.
Long linking time is no longer an issue due to test batching.
Change-Id: I7ee9f877ecda726bc23d8dd2507c616bb381ebc1
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Lorn Potter <lorn.potter@gmail.com>
Add the boilerplate standalone test prelude to each test, so that they
can be opened with an IDE without the qt-cmake-standalone-test script,
but directly with qt-cmake or cmake.
Boilerplate was added using the following scripts:
https://git.qt.io/alcroito/cmake_refactor
Manual adjustments were made where the code was inserted in the wrong
location.
Task-number: QTBUG-93020
Change-Id: I28b6d3815c5f43d2c33ea65764f6f3f8f129eaf3
Reviewed-by: Amir Masoud Abdol <amir.abdol@qt.io>
Reviewed-by: Joerg Bornemann <joerg.bornemann@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This is no longer range-length preserving, so adapt the
documentation.
For the non-contiguous iterator case, it's actually ok to always
resize(0) and then append(), because, unlike for QList and QVLA, the
resize(0) doesn't actually iterate the container to destroy
elements. It just sets some members and conveniently detach()es for
us.
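
The resize(0)-then-append strategy can be sketched with std::u16string; assign_range is a hypothetical free function, not the QString member, and the detach() aspect has no counterpart here.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical free-function version of assign(it, it) for single-pass input.
template <typename InputIt>
std::u16string &assign_range(std::u16string &s, InputIt first, InputIt last)
{
    s.resize(0);   // cheap for character types: nothing to destroy per element
    for (; first != last; ++first)
        s.push_back(static_cast<char16_t>(*first));  // append unit by unit
    return s;
}
```

Because the range can only be traversed once, its length is unknown up front, so the string grows as elements arrive.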
The char8_t case is even more complicated, since we can, atm, not
include qstringconverter.h into qstring.h, yet qstringconverter is
required for stateful UTF-8 decoding in the input_iterator case. So
that's postponed to yet another patch, and maybe won't make it into
6.6. But I feel it's important to have at least one
non-length-preserving version of assign(it, it) in before release lest
users come to rely on this documented (and de-facto) feature of the
step-2 assign().
Fixes: QTBUG-106198
Pick-to: 6.6
Change-Id: Id458776e91b16fb2c80196e339cb817adee5d6d9
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Restrict the permissible value_types to those QStringView can take,
plus QLatin1Char. All of these implicitly convert to QChar and give
the correct result, even when converted char-by-char.
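
One way to express such a restriction is a compile-time trait. The sketch below is an assumption about the general shape only: the trait name, the accepted list (char16_t, unsigned short, and a 16-bit wchar_t) and the assign_checked function are invented, and QChar and QLatin1Char are left out because the example avoids Qt headers.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>

// Hypothetical trait: true for character types that already are UTF-16
// code units, so converting them one by one cannot change the meaning.
template <typename Char>
inline constexpr bool is_assignable_char_v =
    std::is_same_v<Char, char16_t>
    || std::is_same_v<Char, unsigned short>
    || (std::is_same_v<Char, wchar_t> && sizeof(wchar_t) == sizeof(char16_t));

template <typename InputIt>
void assign_checked(std::u16string &s, InputIt first, InputIt last)
{
    using Char = std::remove_cv_t<typename std::iterator_traits<InputIt>::value_type>;
    static_assert(is_assignable_char_v<Char>,
                  "value_type must be a UTF-16 code unit type");
    s.clear();
    for (; first != last; ++first)
        s.push_back(static_cast<char16_t>(*first));
}
```

A plain char range is rejected at compile time here, since its encoding is ambiguous when converted char-by-char.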
Task-number: QTBUG-106198
Pick-to: 6.6
Change-Id: Icb44244cb08af391161c4309467d4e0d2d3d3d62
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Reviewed-by: Dennis Oberst <dennis.oberst@qt.io>
This seems to work with prepend(char), but not with prepend("data"),
cf. QTBUG-114167.
Task-number: QTBUG-114167
Pick-to: 6.5 6.6
Change-Id: I7aa4dca7c2b5938c2e5ad416231945c23140d659
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Implemented assign() methods for QString to align with the
criteria of std::basic_string, addressing the previously missing
functionality. This is a subset of the overloads provided by the
standard.
Reference:
https://en.cppreference.com/w/cpp/string/basic_string/assign
The assign(it, it) overload is a bit more complicated and will be
added in follow-up patches.
[ChangeLog][QtCore][QString] Added assign().
Task-number: QTBUG-106198
Change-Id: Ia1481d184865f46db872cf94c266fef83b962351
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Now the tst_qstring is compiled three times:
- with QT_NO_CAST_FROM_ASCII defined
- with QT_RESTRICTED_CAST_FROM_ASCII defined
- with neither of the above defined
so as to cover more code paths.
Pick-to: 6.5
Task-number: QTBUG-109228
Change-Id: I65eca0f6f6aea66fed6eeda1eb77a50a97210807
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
I got tired of being told off by the inanity 'bot for faithfully
reflecting existing #if-ery in new #if-ery. Retain only the
documentation and definition of the deprecated define.
Change-Id: I47f47b76bd239a360f27ae5afe593dfad8746538
Reviewed-by: Ahmad Samir <a.samirh78@gmail.com>
Reviewed-by: Tor Arne Vestbø <tor.arne.vestbo@qt.io>
Drive-by changes:
- Cleanup creating a QChar[], by creating a char16_t[] and
reinterpret_cast'ing it
- Use human-readable Unicode characters where possible
Pick-to: 6.5
Change-Id: Ice2c36ff3ea4b6a5562cf907a7809166a51abd28
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Drive-by change: use UTF-16 instead of UTF-8 for Eastern Arabic
Numerals; neither is human-readable, but UTF-16 needs one code unit
instead of the two for UTF-8, so fewer \x escapes.
Pick-to: 6.5
Change-Id: I721f3989b7d776ddc4f9d337b21dca9d398fcc0d
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Constructing from const char* etc is already covered by
constructorQByteArray.
I took a guess that the "// b(10)" comment is about testing constructing
a QString from a QChar[] that has an explicit \0 character. I tried
finding what the initial intent was but the trail went cold at the
"Initial import from the monolithic Qt" commit.
Pick-to: 6.5
Change-Id: I15bcdb24e55286eb6cd3056af0714a1eed581635
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
I.e. the second arg to QCOMPARE isn't what's being tested.
Drive-by changes:
- More _L1 usage, less blocky and easier to read
- QCOMPARE's second arg can be a View, it is smart enough and can
compare them just fine
- Replace a "//15 chars" comment with a QCOMPARE check
Pick-to: 6.5
Change-Id: I4f4b84b16b543df37b0ba2f9dd781b045b2ed397
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
- Port macros to QTest data rows in separate unittests
- Move DOUBLE_TEST-related data to toDouble() unittest
- Drop one redundant unittest:
QTest::newRow("const-charstar") << (const char*)0;
Pick-to: 6.5
Change-Id: Ie809895e9f5d58c2d3ec419689f409b55e24fcf7
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
And switch to using test data rows (rooting out two macros in the
process).
Pick-to: 6.5
Change-Id: Ib31e6b59f90f0983c0efc4bef7cb246aedfcab5b
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Change test data to compile with NO_CAST_FROM_ASCII unconditionally where
casting from ASCII isn't what's being tested by a unittest.
The goal is to add a variant of tst_qstring that is compiled with
QT_NO_CAST_FROM_ASCII so that the unittests cover that code path too.
The commits are split into smaller chunks (where there is a common
link between changed code, that code is put in a commit, otherwise I
kept the number of changed lines below ~150) to make reviewing them
easier.
Pick-to: 6.5
Change-Id: I14256f1bde7749a3023753dbb7ed8be72cb6bc14
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
I.e. don't detach in the replace() overloads that delegate to
replace_helper() if this string is shared, instead create a new string
and copy characters from this string to it, along with the "after"
string, then swap it with this.
Do the same thing if "before" is shorter than "after" and there isn't
enough capacity to do the replacement without reallocating.
Use std::copy* and std::move*, which both fall back to
memmove/memcpy but have a C++ API, which is more readable.
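
A minimal sketch of the build-a-new-string-and-swap idea, using std::u16string rather than QString. The function name is hypothetical, and the shared/capacity checks that decide when to take this path are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: replaces s[pos, pos + len) with after by writing the
// result into a fresh buffer, so every character is copied exactly once.
// Assumes pos + len <= s.size().
inline void replace_via_new_buffer(std::u16string &s, std::size_t pos,
                                   std::size_t len, std::u16string_view after)
{
    std::u16string result(s.size() - len + after.size(), u'\0');
    auto out = std::copy_n(s.begin(), pos, result.begin());  // prefix
    out = std::copy(after.begin(), after.end(), out);        // replacement
    std::copy(s.begin() + pos + len, s.end(), out);          // suffix
    s.swap(result);                                          // publish the result
}
```

Detaching first would copy the whole old string and then move the tail again to make room, which is the double work this approach avoids.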
[ChangeLog][QtCore][QString] Using replace() on a currently shared
QString is now done more efficiently
Task-number: QTBUG-106184
Change-Id: If74ffa1ed47636dc23d543d6dc123d8f2b21d537
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
UTF-8 data is variable-width. Ideally we want to write characters at most
once, so insert directly into the QString buffer if inserting at the end
(by delegating to append(QUtf8SV)), and use an intermediate buffer to
hold the converted data before inserting anywhere else.
Task-number: QTBUG-108546
Change-Id: Iabfaeecaf34a1ba11946bd67951e69a45d954d6d
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Instead of detaching when the string is shared, or if the insertion
would cause a reallocation, create a new string and copy characters to
it as needed, then swap it with "this" string. This is more efficient
than detaching which would copy the whole string before inserting, as
some characters would be copied multiple times.
Use detachAndGrow(), otherwise QStringBuilder unittests fail:
    PASS : tst_QStringBuilder1::initTestCase()
    FAIL! : tst_QStringBuilder1::scenario() 'prepends < max_prepends' returned FALSE. ()
    Loc: [tests/auto/corelib/text/qstringbuilder/qstringbuilder1/stringbuilder.cpp(61)]
    PASS : tst_QStringBuilder1::cleanupTestCase()
The issue is that now when inserting, if the string is going to be
reallocated, we create a new string, so the freeSpaceAtBegin()
optimization doesn't work the same way.
    void checkItWorksWithFreeSpaceAtBegin(const String &chunk, const Separator &separator)
    {
        // GIVEN: a String with freeSpaceAtBegin() and less than chunk.size() freeSpaceAtEnd()
        String str;
        int prepends = 0;
        const int max_prepends = 10;
        while (str.data_ptr().freeSpaceAtBegin() < chunk.size() && prepends++ < max_prepends)
            str.prepend(chunk);
        QVERIFY(prepends < max_prepends);
        ...
        ...
each str.prepend() would have reallocated.
[ChangeLog][QtCore][QString] Calling insert() on a currently shared
string is now done more efficiently.
Task-number: QTBUG-106186
Change-Id: I07ce8d6bde50919fdc587433e624ace9cee05be8
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
If the object is shared, instead of detaching, copy characters from
"this" to a new object except for the characters that would be erased.
This is more efficient than detaching (which would copy the whole data
then erase).
- Extend tst_QString::removeIf() to catch a corner-case (that I saw
with tst_QByteArray::removeIf()).
- Add q_uninitialized_remove_copy_if, which works like
std::remove_copy_if but for uninitialized memory like
q_uninitialized_relocate_n (but copies rather than relocates/moves).
With the same static_assert from q_relocate_overlap_n that the type
destructor is non-throwing.
Added q_uninitialized_remove_copy_if in this commit rather than a
separate one so that it's unittested by its usage in eraseIf().
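
For readers unfamiliar with the pattern, here is a hedged sketch of what such a helper could look like. It is not Qt's q_uninitialized_remove_copy_if; the signature and the exception handling are assumptions based on the description above.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical analogue: like std::remove_copy_if, but the destination is
// raw storage, so elements are constructed in place rather than assigned.
template <typename InputIt, typename T, typename Pred>
T *uninitialized_remove_copy_if(InputIt first, InputIt last, T *out, Pred pred)
{
    static_assert(std::is_nothrow_destructible_v<T>,
                  "cleanup on exception relies on a non-throwing destructor");
    T *const begin = out;
    try {
        for (; first != last; ++first) {
            if (!pred(*first)) {
                ::new (static_cast<void *>(out)) T(*first);  // copy-construct
                ++out;
            }
        }
    } catch (...) {
        std::destroy(begin, out);  // undo the partial copy
        throw;
    }
    return out;  // one past the last constructed element
}
```

The source range is left untouched, which is what makes this usable on shared data.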
[ChangeLog][QtCore][QString, QByteArray] Removing characters from a
currently shared string or byte array is now done more efficiently
Task-number: QTBUG-106181
Task-number: QTBUG-106183
Change-Id: Icc0ed31633cef71d482b97e0d2d20d763163d383
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The QChar::toLatin1() args in:
str.replace(index, len, QChar(after[0]).toLatin1())
s2.replace(ch.toLatin1(), after, cs)
will be converted to QChar, so it's always calling the same QString
overload; I argue that we're not testing QChar implicit conversions
here.
Change-Id: I3962cab2b34684f970638575e6bd15dd1067a8c6
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
When the match finds a surrogate pair as the first true Unicode character,
then we need to skip both code units of the pair in order to restart the
search. PCRE2 does not allow us to search for individual UTF-16 code
units.
That actually means that counting "." gives us the count of Unicode
characters.
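
The pair-skipping rule can be shown without PCRE2. The sketch below counts code points in a UTF-16 sequence by advancing two code units over each valid surrogate pair; the function names are chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

constexpr bool isHighSurrogate(char16_t c) { return (c & 0xFC00) == 0xD800; }
constexpr bool isLowSurrogate(char16_t c) { return (c & 0xFC00) == 0xDC00; }

// Counts Unicode characters: a valid surrogate pair counts once and both of
// its code units are skipped; a lone surrogate counts as one character.
inline std::size_t countCodePoints(std::u16string_view s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++count) {
        const bool pair = isHighSurrogate(s[i]) && i + 1 < s.size()
                          && isLowSurrogate(s[i + 1]);
        i += pair ? 2 : 1;
    }
    return count;
}
```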
Fixes: QTBUG-110586
Pick-to: 5.15 6.2 6.4 6.5
Change-Id: I194d0a32c94148f398e6fffd173d5b5be8137e19
Reviewed-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
If the string is shared, instead of detaching, create a new string and
copy the characters from this string, replacing the ones matching "before"
with "after", to the new string.
Change-Id: I2c33690230d40f3121e60e242666460559258b7b
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Following QRect, add functions converting QString to a native
emscripten::val and back: fromJsString(), toJsString()
Change-Id: I2d0625ede3bbf7249e2e91b8de298b5b91df8ba2
Reviewed-by: Morten Johan Sørvig <morten.sorvig@qt.io>
QtBase didn't contain any checks for QT_RESTRICTED_CAST_FROM_ASCII, so
a recent addition to the QString::append/insert/prepend overload set
made calls with C string literal arguments ambiguous without the CI
noticing. We had a similar problem with QString::multiArg.
To increase test coverage, we now run tst_qstring two times:
- without any define
- with QT_RESTRICTED_CAST_FROM_ASCII (lots of changes necessary)
Most removals are expected, because they disable tests that check the
implicit conversions from QByteArray and const char*, but the
relational operators with QLatin1String objects might warrant fixing.
In some places, when the conversion wasn't the functionality under
test, replaced C string literals or QByteArrays with QLatin1String.
We should also test with QT_NO_CAST_FROM_ASCII, but that's even larger
surgery.
QString doesn't have a ctor from std::nullptr_t, so QString s =
nullptr; doesn't compile in C++17 mode, but does in C++20 mode, due to
the const char8_t* ctor.
Pick-to: 6.5 6.4 6.2 5.15
Change-Id: I0c5a31719a4b8dd585dd748e0ca0d99964866064
Reviewed-by: Alexey Edelev <alexey.edelev@qt.io>
Reviewed-by: hjk <hjk@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
... in lieu of <cctype>'s toupper(), which is locale-dependent, and
out-of-line.
The code doesn't run into the toupper(i) issue in the Türkiye locale,
because we don't run tests in that locale and because 'i' is not a
valid format specifier, but don't let the next reader of the code
guess when the use of toAsciiUpper() provides unambiguous guidance.
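
For illustration, a locale-independent ASCII upper-casing function can be as small as the following sketch. asciiUpper is a hypothetical name; Qt's own toAsciiUpper() may be implemented differently.

```cpp
#include <cassert>

// Hypothetical equivalent: only 'a'..'z' are changed, and the result never
// depends on the global C locale, unlike <cctype>'s toupper().
constexpr char asciiUpper(char ch)
{
    return (ch >= 'a' && ch <= 'z') ? static_cast<char>(ch - 'a' + 'A') : ch;
}
```

Being constexpr and inline, it also avoids the out-of-line call mentioned above.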
Task-number: QTBUG-109235
Pick-to: 6.5
Change-Id: I8988f5190441e1ae5cb57370952cda70ca6bb658
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Same fix as in tst_qbytearray's QCOMPARE() in
cb9715557c.
Pick-to: 6.4 6.2
Change-Id: I2222d9015ae7121a2fbcf5b936b27de20e873064
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
For both the [4, 7] and [8,15] length cases, we can perform the same
technique: perform two overlapped loads, zero-extend, then perform two
overlapped stores. The 8-character case could be done in a single
load/store pair, but is not worth the extra conditionals. And it should
have the exact same performance numbers whether we use non-overlapping
4-character operations or completely-overlapping 8-character ones (I
*think* the full overlap is actually better).
The 4-character operation is new in this commit. That reduces the
non-vectorized, unrolled part to at most 3 characters.
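
The overlapping technique can be sketched in portable scalar C++. This is not the SIMD code from the commit: widen4 stands in for one 4-character load, zero-extend and store, and both function names are invented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Zero-extends exactly four Latin-1 bytes to four UTF-16 code units. A SIMD
// version would do this with one load, one widening step and one store.
inline void widen4(const unsigned char *src, char16_t *dst)
{
    for (int i = 0; i < 4; ++i)
        dst[i] = src[i];
}

// Handles any length n in [4, 7] with two fixed-size chunks and no loop over
// n: the second chunk is anchored at the end and overlaps the first.
inline void widen4to7(const unsigned char *src, char16_t *dst, std::size_t n)
{
    widen4(src, dst);                  // characters [0, 4)
    widen4(src + n - 4, dst + n - 4);  // characters [n - 4, n)
}
```

The overlapped positions are simply written twice with the same value, which is why the order of the two stores does not matter as long as source and destination do not alias. The same shape with 8-character chunks covers lengths in [8, 15].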
Change-Id: Ib42b3adc93bf4d43bd55fffd16c257ada774236a
Reviewed-by: Lars Knoll <lars@knoll.priv.no>
Overloading insert is a bit tricky since the size might change after
the conversion so either the tail has to be moved twice or a temporary
buffer is needed. For now, add an inefficient but simple overload as in
the case of the const char *s overload, and do the performance
optimization in a follow-up task (QTBUG-108546).
[ChangeLog][QtCore][QString] Added insert(QUtf8StringView) overload.
Task-number: QTBUG-103302
Change-Id: If01c216ff626da29abb43eb68d4de82824f3bfba
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The += operator is already overloaded to handle QStringView and
QLatin1String - add the missing QUtf8StringView overload.
[ChangeLog][QtCore][QString] Added operator+=(QUtf8StringView)
overload.
Task-number: QTBUG-103302
Change-Id: Iec6940bad7866310c826a130b98accebc3c82aa8
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Add the missing overload, among other things it is needed to
implement QTBUG-103302.
[ChangeLog][QtCore][QString] Added append(QUtf8StringView)
overload.
Task-number: QTBUG-103302
Change-Id: I576f73c1919e3a1f1a315d0f82c708e835686eb1
Reviewed-by: Marc Mutz <marc.mutz@qt.io>