qt5base-lts/tests/auto/corelib/text/qstring
Giuseppe D'Angelo 54536bb5ae QString::arg: deprecate use of arbitrary Unicode digits as replacements
The only documented replacements for Q*String*::arg() are sequences like
%1, %2, %3 -- where the n-th number is expressed using a sequence of
ASCII digits [1].

The code parsing the replacements however used the QChar::digitValue()
function. That function simply checks if a QChar has a *Unicode digit
value* (no matter what its block/category is), and if so, returns the
corresponding digit value as an int (otherwise returns -1).

The result of this is that a sequence like "%¹" or "%१" actually
triggered substitutions (both count as "1"). Similarly, QChars with
a digit value would be parsed as part of longer sequences like "%1²"
(counting as "12" (!)).

This behavior is weird, undocumented, and extremely likely the usual
backstabbing by Unicode by using "convenience" QChar methods -- that is,
never *intended* by the implementation.

This commit deprecates (via warnings) such usages, which for the time
being are left working as before (in the name of backwards
compatibility). At the same time: given it's extremely unlikely that
someone would be deliberately relying on this behavior, it implements
the desired change of behavior (only accept sequences of ASCII digits)
starting from Qt 6.6, that is, after the next LTS.

Throughout Qt 6's lifetime users will still be able to control arg()'s
behavior by setting an env variable, but that variable (and the support
for Unicode digits) will disappear in Qt 7.

To summarize:

* Qt 6.3->6.5: default is Unicode digits, env var to control
* Qt 6.6->6.x: default is ASCII digits, env var to control
* Qt 7: only ASCII digits, no env var

[1] That's the name Unicode gives to them, cf. https://www.unicode.org/charts/PDF/U0000.pdf

[ChangeLog][QtCore][Deprecation Notices] The arg() functions
featured in Qt string classes have always been documented to require
replacements tokens to be sequences of ASCII digits (like %1, %2, %34,
and so on). A coding oversight made it accept sequences of arbitrary
characters with a Unicode digit value instead. For instance, "%2੩" is
interpreted as the 23rd substitution; and "%1²" is interpreted as the
12th substitution. This behavior is deprecated, and will result in
runtime warnings. Starting from Qt 6.6, arg()'s behavior will be changed
to accept only ASCII digits by default. That means that "%1²" is going
to be interpreted as substitution number 1 followed by the "²" character
(which does not get substituted, so it gets left as-is in the result).
Users can restore the previous semantics (accept Unicode digits) by
setting the QT_USE_UNICODE_DIGIT_VALUES_IN_STRING_ARG environment
variable to a non-zero value. In Qt 7, arg() will only support sequences
of ASCII digits. Note that from Qt 6.3 users can also set
QT_USE_UNICODE_DIGIT_VALUES_IN_STRING_ARG to zero; this will make arg()
use ASCII digits only, in preparation for the future change of defaults.

Change-Id: I8a044b629bcca6996e76018c9faf7c6748ae04e8
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-11-30 19:33:34 +02:00
..
.gitignore
CMakeLists.txt CMake: Regenerate projects to use new qt_internal_ API 2020-09-23 16:59:06 +02:00
double_data.h
tst_qstring_mac.mm Replace QtTest headers with QTest 2020-12-22 15:20:30 +01:00
tst_qstring.cpp QString::arg: deprecate use of arbitrary Unicode digits as replacements 2021-11-30 19:33:34 +02:00