When trying to fix 0-length matches at the end of a QString,
be83ff65c4 actually introduced a
regression due to how lastIndexOf interprets its `from` parameter.
The "established" (=legacy) interpretation of a negative `from` is that
it is supposed to indicate that we want the last match at offset `from +
size()`. With the default from of -1, that means we want a match
starting at most at position `size() - 1` inclusive, i.e. *at* the last
position in the string. The aforementioned commit changed that, by
allowing a match at position `size()` instead, and this behavioral
change broke code.
The problem the commit tried to fix was that empty matches *are* allowed
to happen at position size(): the last match of regexp // inside the
string "test" is indeed at position 4 (the regexp matches 5 times).
Changing the meaning of negative from to include that last position (in
general: to include position `from+size()+1` as the last valid matching
position, in case of a negative `from`) has unfortunately broken client
code. Therefore, we need to revert it. This patch does that, adapting
the tests as necessary (drive-by: a broken #undef is removed).
Reverting the patch however is not sufficient. What we are facing here
is an historical API mistake that forces the default `from` (-1) to
*skip* the truly last possible match; the mistake is that thre is simply
no way to pass a negative `from` and obtain that match. This means that
the revert will now cause code like this:
str.lastIndexOf(QRE("")); // `from` defaulted to -1
NOT to return str.size(), which is counter-intuitive and wrong. Other
APIs expose this inconsistency: for instance, using
QRegularExpressionIterator would actually yield a last match at position
str.size(). Similarly, using QString::count would return `str.size()+1`.
Note that, in general, it's still possible for clients to call
str.lastIndexOf(~~~, str.size())
to get the "truly last" match.
This patch also tries to fix this case ("have our cake and eat it").
First and foremost, a couple of bugs in QByteArray and QString code are
fixed (when dealing with 0-length needles).
Second, a lastIndexOf overload is added. One overload is the "legacy"
one, that will honor the pre-existing semantics of negative `from`. The
new overload does NOT take a `from` parameter at all, and will actually
match from the truly end (by simply calling `lastIndexOf(~~~, size())`
internally).
These overloads are offered for all the existing lastIndexOf()
overloads, not only the ones taking QRE.
This means that code simply using `lastIndexOf` without any `from`
parameter get the "correct" behavior for 0-length matches, and code that
specifies one gets the legacy behavior. Matches of length > 0 are not
affected anyways, as they can't match at position size().
[ChangeLog][Important Behavior Changes] A regression in the behavior of
the lastIndexOf() function on text-related containers and views
(QString, QStringView, QByteArray, QByteArrayView, QLatin1String) has
been fixed, and the behavior made consistent and more in line with
user expectations. When lastIndexOf() is invoked with a negative `from`
position, the last match has now to start at the last character in the
container/view (before, it was at the position *past* the last
character). This makes a difference when using lastIndexOf() with a
needle that has 0 length (for instance an empty string, a regular
expression that can match 0 characters, and so on); any other case is
unaffected. To retrieve the "truly last" match, one can pass a
positive `from` offset to lastIndexOf() (basically, pass `size()` as the
`from` parameter). To make calls such as `text.lastIndexOf(~~~);`, that
do not pass any `from` parameter, behave properly, a new lastIndexOf()
overload has been added to all the text containers/views. This overload
does not take a `from` parameter at all, and will search starting from
one character past the end of the text, therefore returning a correct
result when used with needles that may yield 0-length matches. Client
code may need to be recompiled in order to use this new overload.
Conversely, client code that needs to skip the "truly last" match now
needs to pass -1 as the `from` parameter instead of relying on the
default.
Change-Id: I5e92bdcf1a57c2c3cca97b6adccf0883d00a92e5
Fixes: QTBUG-94215
Pick-to: 6.2
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>