QtBase didn't contain any checks for QT_RESTRICTED_CAST_FROM_ASCII, so
a recent addition to the QString::append/insert/prepend overload set
made calls with C string literal arguments ambiguous without the CI
noticing. We had a similar problem with QString::multiArg.
To increase test coverage, we now run tst_qstring two times:
- without any define
- with QT_RESTRICTED_CAST_FROM_ASCII (lots of changes necessary)
Most removals are expected, because they disable tests that check the
implicit conversions from QByteArray and const char*, but the
relational operators with QLatin1String objects might warrant fixing.
In some places, when the conversion wasn't the functionality under
test, replaced C string literals or QByteArrays with QLatin1String.
We should also test with QT_NO_CAST_FROM_ASCII, but that's even larger
surgery.
QString doesn't have a ctor from std::nullptr_t, so QString s =
nullptr; doesn't compile in C++17 mode, but does in C++20 mode, due to
the const char8_t* ctor.
Pick-to: 6.5 6.4 6.2 5.15
Change-Id: I0c5a31719a4b8dd585dd748e0ca0d99964866064
Reviewed-by: Alexey Edelev <alexey.edelev@qt.io>
Reviewed-by: hjk <hjk@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
... in lieu of <cctype>'s toupper(), which is locale-dependent, and
out-of-line.
The code doesn't run into the toupper(i) issue in the Türkiye locale,
because we don't run tests in that locale and because 'i' is not a
valid format specifier, but don't let the next reader of the code
guess when the use of toAsciiUpper() provides unambiguous guidance.
Task-number: QTBUG-109235
Pick-to: 6.5
Change-Id: I8988f5190441e1ae5cb57370952cda70ca6bb658
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Same fix as in tst_qbytearray's QCOMPARE() in
cb9715557c.
Pick-to: 6.4 6.2
Change-Id: I2222d9015ae7121a2fbcf5b936b27de20e873064
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
For both the [4, 7] and [8,15] length cases, we can perform the same
technique: perform two overlapped loads, zero-extend, then perform two
overlapped stores. The 8-character case could be done in a single
load/store pair, but is not worth the extra conditionals. And it should
have the exact same performance numbers whether we use non-overlapping
4-character operations or completely-overlapping 8-character ones (I
*think* the full overlap is actually better).
The 4-character operation is new in this commit. That reduces the
non-vectorized, unrolled to at most 3 characters.
Change-Id: Ib42b3adc93bf4d43bd55fffd16c257ada774236a
Reviewed-by: Lars Knoll <lars@knoll.priv.no>
Overloading insert is a bit tricky since the size might change after
the conversion so either the tail has to be moved twice or a temporary
buffer is needed. For now, add an ineffective but simple overload as in
the case of the const char *s overload, and do the performance
optimization in a follow-up task (QTBUG-108546).
[ChangeLog][QtCore][QString] Added insert(QUtf8StringView) overload.
Task-number: QTBUG-103302
Change-Id: If01c216ff626da29abb43eb68d4de82824f3bfba
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The += operator is already overloaded to handle QStringView and
QLatin1String - add the missing QUtf8StringView overload.
[ChangeLog][QtCore][QString] Added operator+=(QUtf8StringView)
overload.
Task-number: QTBUG-103302
Change-Id: Iec6940bad7866310c826a130b98accebc3c82aa8
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Add the missing overload, among other things it is needed to
implement QTBUG-103302.
[ChangeLog][QtCore][QString] Added append(QUtf8StringView)
overload.
Task-number: QTBUG-103302
Change-Id: I576f73c1919e3a1f1a315d0f82c708e835686eb1
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
During the implementation of QString::append(QUtf8StringView) it has
become apparent that the testing is insufficient as it did not warn
about an extra growth. The following tests have been added that append:
- y-umlaut and greek letter small theta (2 UTF-8 code units => 1 UTF-16)
- devanagri letter ssa (3 UTF-8 code units => 1 UTF-16)
- chakma digit zero (4 UTF-8 code units => 2 UTF-16 code units)
- some combinations of the above
Pick-to: 6.4 6.2
Task-number: QTBUG-103302
Change-Id: I981213c296bafc81663b08c0f1f339bbd8a96485
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
During the implementation of QString::append(QUtf8StringView) it has
become apparent that the testing is insufficient as it did not warn
about an extra growth. The following tests have been added that append:
- y-umlaut and greek letter small theta (2 UTF-8 code units => 1 UTF-16)
- devanagri letter ssa (3 UTF-8 code units => 1 UTF-16)
- chakma digit zero (4 UTF-8 code units => 2 UTF-16 code units)
- some combinations of the above
Pick-to: 6.4 6.2
Task-number: QTBUG-103302
Change-Id: I3d81cf10b7eb74433ce5bea9b92ce6bce1230dcd
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
During the implementation of QString::append(QUtf8StringView) it has
become apparent that the testing is insufficient as it did not warn
about an extra growth. The following tests have been added that append:
- y-umlaut and greek letter small theta (2 UTF-8 code units => 1 UTF-16)
- devanagri letter ssa (3 UTF-8 code units => 1 UTF-16)
- chakma digit zero (4 UTF-8 code units => 2 UTF-16 code units)
- some combinations of the above
Note that this also affects operator_pluseq_data, which is basically
a wrapper around append_data.
Pick-to: 6.4 6.2
Task-number: QTBUG-103302
Change-Id: I09ed950e3f0e71ae9ae85a455f42e130887f1109
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
It's not intuitive, so check lest people break it.
Pick-to: 6.4 6.2
Change-Id: I2435cd69be7b77a6ae59cdc7b5fb99658cfc42fd
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
- If this string is not shared, modify it directly
- If this string is shared, instead of detaching copy the characters
from this string, except the ones that are going to be removed, to a
new string and swap it. This is more efficient than detaching, which
would copy the whole string including the characters that are going
to be removed.
This affects:
remove(const QString &str, Qt::CaseSensitivity cs)
remove(QLatin1StringView str, Qt::CaseSensitivity cs)
Adjust the unittests to test both code paths.
[ChangeLog][QtCore][QString] Improved the performance of
QString::remove() by avoiding unnecessary data copying. Now, if this
string is (implicitly) shared with another, instead of copying
everything and then removing what we don't want, the characters from
this string are copied to the destination, except the ones that need to
be removed.
Task-number: QTBUG-106181
Change-Id: Id8eba59a44bab641cc8aa662eb45063faf201183
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
- If the string isn't shared, don't call detach(), instead remove characters
matching ch, and resize()
- If the string is shared, create a new string, and copy all characters
except the ones that would be removed, see task for details
Update unittets so that calls to this overload of remove() test both code
paths (replace() calls remove(QChar, cs) internally).
Drive-by change: use QCOMPARE() instead of QTEST()
Task-number: QTBUG-106181
Change-Id: I1fa08cf29baac2560fca62861fc4a81967b54e92
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This is a semantic patch using ClangTidyTransformator as in
qtbase/df9d882d41b741fef7c5beeddb0abe9d904443d8, but extended to
handle typedefs and accesses through pointers, too:
const std::string o = "object";
auto hasTypeIgnoringPointer = [](auto type) { return anyOf(hasType(type), hasType(pointsTo(type))); };
auto derivedFromAnyOfClasses = [&](ArrayRef<StringRef> classes) {
auto exprOfDeclaredType = [&](auto decl) {
return expr(hasTypeIgnoringPointer(hasUnqualifiedDesugaredType(recordType(hasDeclaration(decl))))).bind(o);
};
return exprOfDeclaredType(cxxRecordDecl(isSameOrDerivedFrom(hasAnyName(classes))));
};
auto renameMethod = [&] (ArrayRef<StringRef> classes,
StringRef from, StringRef to) {
return makeRule(cxxMemberCallExpr(on(derivedFromAnyOfClasses(classes)),
callee(cxxMethodDecl(hasName(from), parameterCountIs(0)))),
changeTo(cat(access(o, cat(to)), "()")),
cat("use '", to, "' instead of '", from, "'"));
};
renameMethod(<classes>, "count", "size");
renameMethod(<classes>, "length", "size");
except that the on() matcher has been replaced by one that doesn't
ignoreParens().
a.k.a qt-port-to-std-compatible-api V5 with config Scope: 'Container'.
Added two NOLINTNEXTLINEs in tst_qbitarray and tst_qcontiguouscache,
to avoid porting calls that explicitly test count().
Change-Id: Icfb8808c2ff4a30187e9935a51cad26987451c22
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
- If this string isn't shared, don't call detach, instead use ->erase() as
needed
- If this string is shared, create a new string, and copy all elements
except the ones that would be removed, see task for details
Update unittest to test both code paths.
Task-number: QTBUG-106181
Change-Id: I4c73ff17a6fa89ddcf6966f9c5bf789753f6d39e
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
%.f should be handled like %.0f. You probably don't want it for strings,
though.
Fixes: QTBUG-107991
Pick-to: 6.2 6.4
Change-Id: I07ec23f3cb174fb197c3fffd1721a941fbcf15e1
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
We've been requiring C++17 since Qt 6.0, and our qAsConst use finally
starts to bother us (QTBUG-99313), so time to port away from it
now.
Since qAsConst has exactly the same semantics as std::as_const (down
to rvalue treatment, constexpr'ness and noexcept'ness), there's really
nothing more to it than a global search-and-replace, with manual
unstaging of the actual definition and documentation in dist/,
src/corelib/doc/ and src/corelib/global/.
Task-number: QTBUG-99313
Change-Id: I4c7114444a325ad4e62d0fcbfd347d2bbfb21541
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
There were two data8 rows; and no data9, so that was easy to fix.
Change-Id: I8191de142e1a3be57bf1ad97e63d5780f2859fea
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Two test cases were called "base 2, negative"; one of them use -1 as
value, so s/negative/minus 1/ for it.
Change-Id: Ia5da3952d93976262cc8423d4e75ec19dab9a088
Reviewed-by: Mate Barany <mate.barany@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
And include qcore_mac_p.h where needed.
Task-number: QTBUG-99313
Change-Id: Idb1b005f1b5938e8cf329ae06ffaf0d249874db2
Reviewed-by: Tor Arne Vestbø <tor.arne.vestbo@qt.io>
* QVariant::Type -> QMetaType::Type.
* Guard the test for deprecated fromUtf16(const ushort *) overload with
QT_DEPRECATED_SINCE check.
* Use fromUtf16(const char16_t *) overload in other places.
As a drive-by: fix formatting in the affected lines.
Task-number: QTBUG-104858
Change-Id: I9fa3a935bca36e97f934f673e2fc07b451c72872
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The new name describes the behavior in a better way.
[ChangeLog][Build System] The QT_DISABLE_DEPRECATED_BEFORE macro is
renamed to QT_DISABLE_DEPRECATED_UP_TO. The old name is deprecated, but
is still recognized if it is defined during configuration and the new
name is not defined.
Task-number: QTBUG-104944
Change-Id: Ifc34323e0bbd9e3dc2f86c3e80d4d0940ebccbb8
Reviewed-by: Alexandru Croitor <alexandru.croitor@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
CMakeLists.txt and .cmake files of significant size
(more than 2 lines according to our check in tst_license.pl)
now have the copyright and license header.
Existing copyright statements remain intact
Task-number: QTBUG-88621
Change-Id: I3b98cdc55ead806ec81ce09af9271f9b95af97fa
Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
QString has several functions taking a QRegularExpression: indexOf(),
contains(), and so on. Some of those have an out-argument of type
QRegularExpressionMatch, to report the details of the match (if any).
For instance:
QRegularExpression re(...);
QRegularExpressionMatch match;
if (string.contains(re, &match))
use(match);
The code used to route the implementation of these functions through
QStringView (which has the very same functions). This however opens
up a lifetime problem with temporary strings:
if (getString().contains(re, &match))
use(match); // match is dangling
Here `match` is dangling because it is referencing data into the
destroyed temporary -- nothing is keeping the string alive. This is
against the rules we've decided for Qt, and it's also asymmetric with
the corresponding code that uses QRegularExpression directly instead:
match = re.match(getString());
if (match.hasMatch())
use(match); // not dangling
... although we've documented not to do this. (In light of the decision
we've made w.r.t. temporaries, the documentation is wrong anyways.)
Here QRE takes a copy of the string and stores it in the match object,
thus keeping it alive.
Hence, extend the implementation of the QString functions to keep a
(shallow) copy of the string. To keep the code shared as much as
possible with QStringView, in theory one could have a function taking a
std::variant<QString, QStringView> and that uses the currently active
member. However I've found that std::variant here creates some abysmal
codegen, so instead I went for a simpler approach -- pass a QStringView
and an optional pointer to a QString. Use the latter if it's loaded.
QStringView has some inline code that calls into exported functions, so
I can't change the signature of them without breaking BC; I'm instead
adding new unexported functions and a Qt 7 note to unify them.
Change-Id: I7c65885a84069d0fbb902dcc96ddff543ca84562
Fixes: QTBUG-103940
Pick-to: 6.2 6.3 6.4
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Replace the current license disclaimer in files by
a SPDX-License-Identifier.
Files that have to be modified by hand are modified.
License files are organized under LICENSES directory.
Task-number: QTBUG-67283
Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
[ChangeLog][QtCore] Deprecated _qs and _qba literal operators
for QString and QByteArray in favor of _s and _ba in the
Qt::Literals::StringLiterals namespace.
Task-number: QTBUG-101408
Change-Id: I26aee0055e3b4c1860de6eda8e0eb857c5b3e11a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
[ChangeLog][QtCore] Added literal operators for _s and _ba for QString
and QByteArray respectively in the Qt::Literals::StringLiterals
namespace.
Task-number: QTBUG-101408
Change-Id: I5cd4e7f36f614ea805cfecc27b91c5d981cd3794
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Reviewed-by: Andrei Golubev <andrei.golubev@qt.io>
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Update docs and add tests.
[ChangeLog][QtCore] Documented existing support for 'F' format when
converting floating-point numbers to strings in QLocale::toString(),
hence equally for QString's floating-point formatting. Previously it
was supported but the documentation neglected to mention it; it only
differs from 'f' for infinities and NaN.
Change-Id: Ic946c0f7b9e86fdf512daa3124bea57fc664b34b
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Sona Kurazyan <sona.kurazyan@qt.io>
As a drive-by, fixed misleading wording used in docs.
[ChangeLog][QtCore][Potentially Source-Incompatible Changes][QLatin1String]
Added QLatin1String(std::nullptr_t) constructor, which makes
QLatin1String(0) call ambiguous. To fix the ambiguity, nullptr
must be passed instead of 0.
Task-number: QTBUG-98433
Change-Id: I2b888aa23469343d78aa640dc39a6028b77165dd
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Just as a minor debugging helper: when warning that an invalid
regular expression object is being used to match, also print
the used regular expression pattern.
Change-Id: I0f99bcf4ca87ec67d04ed91d9dc315814f56d392
Fixes: QTBUG-76670
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
- current INTEGRITY development pack don't support denormals for float and double.
All values are rounded to 0.
Task-number: QTBUG-99123
Pick-to: 6.2 6.3
Change-Id: Iaaacdc4210c7ac2ec3ec337c61164a1ade0efb01
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
qstrncmp() would stop at the first null character, which isn't correct.
The tests that had been disabled in tst_qstring.cpp (with an inaccurate
comment) were actually passing. I've added one more to ensure that the
terminating null is compared where needed.
[ChangeLog][QtCore][QLatin1String and QUtf8StringView] Fixed a
couple of bugs where two QLatin1Strings or two QUtf8StringViews
would stop their comparisons at the first embedded null
character, instead of comparing the full string. This issue
affected both classes' relational operators (less than, greater
than, etc.) and QUtf8StringView's operator== and operator!=.
Pick-to: 5.15 6.2 6.3
Change-Id: I0e5f6bec596a4a78bd3bfffd16c90ecea71ea68e
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
We want to preserve nullness where possible. Test that various ctors
do the right thing when presented with null input.
Pick-to: 6.3
Change-Id: Ia1a1d4fb3c919b4fed2d9b87827815a1b5072c54
Reviewed-by: Fabian Kosmale <fabian.kosmale@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The only documented replacements for Q*String*::arg() are sequences like
%1, %2, %3 -- where the n-th number is expressed using a sequence of
ASCII digits [1].
The code parsing the replacements however used the QChar::digitValue()
function. That function simply checks if a QChar has a *Unicode digit
value* (no matter what its block/category is), and if so, returns the
corresponding digit value as an int (otherwise returns -1).
The result of this is that a sequence like "%¹" or "%१" actually
triggered substitutions (both count as "1"). Similarly, QChars with
a digit value would be parsed as part of longer sequences like "%1²"
(counting as "12" (!)).
This behavior is weird, undocumented, and extremely likely the usual
backstabbing by Unicode by using "convenience" QChar methods -- that is,
never *intended* by the implementation.
This commit deprecates (via warnings) such usages, which for the time
being are left working as before (in the name of backwards
compatibility). At the same time: given it's extremely unlikely that
someone would be deliberately relying on this behavior, it implements
the desired change of behavior (only accept sequences of ASCII digits)
starting from Qt 6.6, that is, after the next LTS.
Throughout Qt 6's lifetime users will still be able to control arg()'s
behavior by setting an env variable, but that variable (and the support
for Unicode digits) will disappear in Qt 7.
To summarize:
* Qt 6.3->6.5: default is Unicode digits, env var to control
* Qt 6.6->6.x: default is ASCII digits, env var to control
* Qt 7: only ASCII digits, no env var
[1] That's the name Unicode gives to them, cf. https://www.unicode.org/charts/PDF/U0000.pdf
[ChangeLog][QtCore][Deprecation Notices] The arg() functions
featured in Qt string classes have always been documented to require
replacements tokens to be sequences of ASCII digits (like %1, %2, %34,
and so on). A coding oversight made it accept sequences of arbitrary
characters with a Unicode digit value instead. For instance, "%2੩" is
interpreted as the 23rd substitution; and "%1²" is interpreted as the
12th substitution. This behavior is deprecated, and will result in
runtime warnings. Starting from Qt 6.6, arg()'s behavior will be changed
to accept only ASCII digits by default. That means that "%1²" is going
to be interpreted as substitution number 1 followed by the "²" character
(which does not get substituted, so it gets left as-is in the result).
Users can restore the previous semantics (accept Unicode digits) by
setting the QT_USE_UNICODE_DIGIT_VALUES_IN_STRING_ARG environment
variable to a non-zero value. In Qt 7, arg() will only support sequences
of ASCII digits. Note that from Qt 6.3 users can also set
QT_USE_UNICODE_DIGIT_VALUES_IN_STRING_ARG to zero; this will make arg()
use ASCII digits only, in preparation for the future change of defaults.
Change-Id: I8a044b629bcca6996e76018c9faf7c6748ae04e8
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
QLocale and QString tests had copies of a TransientLocale; we've
recently improved the QLocale one. Rather than duplicating those
rather complicated improvements, finally share a common version.
In the process, I noticed that setlocale() only returns the prior
value when passed nullptr as the new value; so rework the
implementation to get that right, so that it now correctly restores
the prior locale. That, in turn, means there's now a later call to
setlocale(), when we actually set the changed setting, which may
invalidate the earlier return; so copy it to a QByteArray before the
second call.
Included Ivan Solovev's improved version of how to reset the locale,
since TransientLocale needs it.
Pick-to: 6.2
Change-Id: I4cb1efbda42f0e2cdd934e04b3b3732ce0f45a06
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Remove third-party code in favor of STL. Implement (for now)
strtou?ll() as inlines on strntou?ll() calling strlen() for the size
parameter. (This is not entirely safe, as a string lacking
'\0'-termination but with at least some non-matching text after the
numeric portion would formerly be parsed just fine, but would now
produce a crash. However, strtou?ll() are internal and callers should
be ensuring '\0'-termination.)
Task-number: QTBUG-74286
Change-Id: I0c8ca7d4f6110367e93b4c0164854a82c5a545e1
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
When a needle has length 1 (because it's a QChar/char16_t, or because
it's a string-like of length 1) then an ad-hoc search algorithm is
used. This algorithm had a off-by-one, by not allowing to match at
the last position of a haystack (in case `from` was `haystack.size()`).
That is inconsistent with the general search of substring needles
(and what QByteArray does). Fix that case and amend wrong tests.
This in turn unveiled the fact that the algorithm was unable to cope
with 0-length haystacks (whops), so fix that as well. Drive-by, add a
similar fix for QByteArray.
Amends 6cee204d56.
Pick-to: 6.2
Change-Id: I6b3effc4ecd74bcbcd33dd2e550da2df7bf05ae3
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
There's no need of duplicating code all over the place; QString can
reuse the implementation of the indexOf/contains/count/lastIndexOf
family of functions already existing for QStringView.
For simplicity, the warning messages (that our autotests actually check)
have been made more generic, rather than introducing some other
parameter (as in, "which class is using this functionality so to emit
a more precise warning"), which would have just complicated things as
the implementation of these functions is exported and used by inline
QStringView member functions.
Change-Id: I85cd94a31c82b00d61341b3058b954749a2d6c6b
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
When trying to fix 0-length matches at the end of a QString,
be83ff65c4 actually introduced a
regression due to how lastIndexOf interprets its `from` parameter.
The "established" (=legacy) interpretation of a negative `from` is that
it is supposed to indicate that we want the last match at offset `from +
size()`. With the default from of -1, that means we want a match
starting at most at position `size() - 1` inclusive, i.e. *at* the last
position in the string. The aforementioned commit changed that, by
allowing a match at position `size()` instead, and this behavioral
change broke code.
The problem the commit tried to fix was that empty matches *are* allowed
to happen at position size(): the last match of regexp // inside the
string "test" is indeed at position 4 (the regexp matches 5 times).
Changing the meaning of negative from to include that last position (in
general: to include position `from+size()+1` as the last valid matching
position, in case of a negative `from`) has unfortunately broken client
code. Therefore, we need to revert it. This patch does that, adapting
the tests as necessary (drive-by: a broken #undef is removed).
Reverting the patch however is not sufficient. What we are facing here
is an historical API mistake that forces the default `from` (-1) to
*skip* the truly last possible match; the mistake is that thre is simply
no way to pass a negative `from` and obtain that match. This means that
the revert will now cause code like this:
str.lastIndexOf(QRE("")); // `from` defaulted to -1
NOT to return str.size(), which is counter-intuitive and wrong. Other
APIs expose this inconsistency: for instance, using
QRegularExpressionIterator would actually yield a last match at position
str.size(). Similarly, using QString::count would return `str.size()+1`.
Note that, in general, it's still possible for clients to call
str.lastIndexOf(~~~, str.size())
to get the "truly last" match.
This patch also tries to fix this case ("have our cake and eat it").
First and foremost, a couple of bugs in QByteArray and QString code are
fixed (when dealing with 0-length needles).
Second, a lastIndexOf overload is added. One overload is the "legacy"
one, that will honor the pre-existing semantics of negative `from`. The
new overload does NOT take a `from` parameter at all, and will actually
match from the truly end (by simply calling `lastIndexOf(~~~, size())`
internally).
These overloads are offered for all the existing lastIndexOf()
overloads, not only the ones taking QRE.
This means that code simply using `lastIndexOf` without any `from`
parameter get the "correct" behavior for 0-length matches, and code that
specifies one gets the legacy behavior. Matches of length > 0 are not
affected anyways, as they can't match at position size().
[ChangeLog][Important Behavior Changes] A regression in the behavior of
the lastIndexOf() function on text-related containers and views
(QString, QStringView, QByteArray, QByteArrayView, QLatin1String) has
been fixed, and the behavior made consistent and more in line with
user expectations. When lastIndexOf() is invoked with a negative `from`
position, the last match has now to start at the last character in the
container/view (before, it was at the position *past* the last
character). This makes a difference when using lastIndexOf() with a
needle that has 0 length (for instance an empty string, a regular
expression that can match 0 characters, and so on); any other case is
unaffected. To retrieve the "truly last" match, one can pass a
positive `from` offset to lastIndexOf() (basically, pass `size()` as the
`from` parameter). To make calls such as `text.lastIndexOf(~~~);`, that
do not pass any `from` parameter, behave properly, a new lastIndexOf()
overload has been added to all the text containers/views. This overload
does not take a `from` parameter at all, and will search starting from
one character past the end of the text, therefore returning a correct
result when used with needles that may yield 0-length matches. Client
code may need to be recompiled in order to use this new overload.
Conversely, client code that needs to skip the "truly last" match now
needs to pass -1 as the `from` parameter instead of relying on the
default.
Change-Id: I5e92bdcf1a57c2c3cca97b6adccf0883d00a92e5
Fixes: QTBUG-94215
Pick-to: 6.2
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>