This class is designed as C++20-style generator / lazy sequence, and
the new return value of QString{,View}::tokenize().
It thus is more similar to a hand-coded loop around indexOf() than
QString::split(), which returns a container (the filling of which
allocates memory).
The template arguments of QStringTokenizer intricately depend on the
arguments with which it is constructed, so QStringTokenizer cannot be used
directly without C++17 CTAD. To work around this issue, add a factory
function, qTokenize().
LATER:
- ~Optimize QLatin1String needles (avoid repeated L1->UTF16 conversion)~
(out of scope for QStringTokenizer, should be solved in the respective
indexOf())
- Keep per-instantiation state:
* Boyer-Moore table
[ChangeLog][QtCore][QStringTokenizer] New class.
[ChangeLog][QtCore][qTokenize] New function.
Change-Id: I7a7a02e9175cdd3887778f29f2f91933329be759
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Since QString::split() is not going away in Qt 6, we should aim
to provide API symmetry here, and ease porting existing code from
QString(Ref) to use QStringView.
This is easier than having to port everything to use tokenize() at
the same time. tokenize() will however lead to better performance
and thus should be preferred.
Change-Id: I1eb43300a90167c6e9389ab56f416f2bf7edf506
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
The idea is pretty simple -- add QRegularExpression matching over
QStringView. When matching over a QString, keep the string
alive (by taking a copy), and set the view onto that string.
Otherwise, just use the view provided by the user (who is then
responsible for ensuring the data stays valid while matching).
Do just minor refactorings to support this use case in a cleaner
fashion.
In QRegularExpressionMatch drop the QStringRef-returning methods, as
they cannot work any more -- in the general case there won't be a
QString to build a QStringRef from.
[ChangeLog][QtCore][QRegularExpression] All the APIs dealing
with QStringRef have been ported to QStringView, following
QStringRef deprecation in Qt 6.0.
Change-Id: Ic367991d9583cc108c045e4387c9b7288c8f1ffd
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Make the API more symmetric with regards to both QString and QStringRef.
Change-Id: Ia67c53ba708f6c33874d1a127de8e2857ad9b5b8
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Make the API more symmetric with regards to both QString and QStringRef.
Having this available helps making QStringView more of a drop-in
replacement for QStringRef. QStringRef is planned to get removed in Qt 6.
Change-Id: Ife036c0b55970078f42e1335442ff9ee5f4a2f0d
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
The name CET is locale-dependent; but QLocale doesn't know about
localization of time zone names. Such abbreviated zone names are, in
any case, potentially ambiguous - various zones around the world have
collisions - so they can't be relied on.
QTimeZone's various backends have differing handlings of how to
abbreviate zone names (MS's provides no abbreviated names at all); and
it appears macOS actually follows the relevant localizations.
So it is hopeless to hard-code the expected zone abbreviations.
Changed the tests to consult QTimeZone for the abbreviation and
compare what it gets with the results of checks which should match
this. This is less stringent, but it is at least robustly correct,
thereby getting rid of assorted kludges and #if-ery.
Pick-to: 5.15
Task-number: QTBUG-70149
Change-Id: I0c565de3fd8b5987c8f5a3f785ebd8f6e941e055
Reviewed-by: Tor Arne Vestbø <tor.arne.vestbo@qt.io>
resize() to a smaller size does not reallocate in Qt 5 if the container
is not shared. Match this here.
As a drive-by also fix resize calls on raw data strings to ensure
they are null terminated after the resize.
Change-Id: Ic4d8830e86ed3f247020d7ece3217cebd344ae96
Reviewed-by: Marc Mutz <marc.mutz@kdab.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Even in Qt 5, remove() can be passed an alias to *this. In Qt 6, with
the advent of substring sharing, this will become even more
pronounced. Use the same fix as was already used in QString::insert().
Pick-to: 5.15
Change-Id: I1a0d3d99fd7dff6e727661646d2cbfdc94df2682
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
As best as I can guess, this used the QString::operator=(char), which
I locally removed. Before that lands in Qt, remove this ... wtf?
Pick-to: 5.15
Change-Id: Ie083fe69500d6b5b633416f89f5dd1d7068c20b2
Reviewed-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Centralize, rather than keeping adding constructors from any
array-like container.
A more robust implementation, likely following the converting
constructor for std::span ([span.cons]), is out of scope for C++17
and will require C++20's ranges and concepts.
[ChangeLog][QtCore][QStringView] QStringView can now be constructed
from any contiguous container, as long as they hold string-like data.
For instance, it's now possible to create a QStringView object
from a std::vector<char16_t>, a QVarLengthArray<ushort> and so on.
Change-Id: I7043eb194f617e98bd1f8af1237777a93a6c5e75
Reviewed-by: Marc Mutz <marc.mutz@kdab.com>
It has been the case for both QStringLiteral and QByteArrayLiteral
since Qt 5.0, and Q_ARRAY_LITERAL since Qt 6.0.
Since it's definitely surprising, add a note in the docs, which
is "somehow" consistent with the interpretation of capacity as
the biggest possible size before we reallocate. Since it's 0,
any manipulation of the size will cause a reallocation.
(Alternatively: the capacity() is for how many elements memory was
requested from the free store. No memory was allocated, so 0...)
Task-number: QTBUG-84069
Change-Id: I5c7d21a22d1bd8b8d9b71143e33d537ca0224acd
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Add an operator QVariant() to QRegExp to keep things at source compatible
as possible.
Add a hack to QVariant::load/save() to recognize the old typeid
for QRegExp and stream them correctly as long as the streaming operators
for QRegExp are registered.
Also move the datastream test for QRegExp to tst_qregexp, and adjust it to
the qvariant changes.
Change-Id: I120b38a7541b43ec07a21b17f7f35c55f071eb75
Reviewed-by: Fabian Kosmale <fabian.kosmale@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Replacement methods do now exist in QRegExp, or
for QRegularExpression when porting to it.
Remove all autotests associated with the old methods.
Change-Id: I3ff1e0da4b53adb64d5a48a30aecd8b960f5e633
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The prepares for the removal of those methods from QString and
QStringList. The new methods in QRegExp are left as a porting help.
Change-Id: Ieffa33a79caf53b83029e9b070c4eb5cadca1418
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Move pre/and post condition handling out of the main loop
to make that one as fast as possible.
Remove special handling of a corner case when the input length
is zero, where the utf8 decoder did something else than all
other decoders.
Change-Id: I94992767ea15405b38f7953adadaa6ff98b20b6f
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
There's no real dependency to QTextCodec in those files anymore.
Change-Id: Ifaf19ab554fd108fa26095db4e2bd4a3e9ea427f
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Document QStringConverter, QStringDecoder and QStringEncoder.
In addition, do some touches to the API, renaming one enum value,
add a flags argument to one constructor and make some members private.
Change-Id: I8f99dc3d98fb8860cf6fa46301e34b7eb400511b
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This is a replacement for Qt::codecForHtml().
Change-Id: I31f03518fd9c70507cbd210a8bcf405b6a0106b1
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Add method that tries to determine the encoding of the data
from an initial byte order mark.
Change-Id: I348c51a3d4db9b434af53359b739a7e17acfc760
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Add static methods that allow converting between a name for an
encoding and the Encoding enum.
Change-Id: I12bc503cf757ea31d3ca8d5e1f1216efddcb16d4
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Add a constructor, that allows constructing a string converter by
name. This is required in some cases and also makes it possible to
(in the future) extend the API to 3rd party encodings.
Also add a name() accessor.
Change-Id: I606d6ce9405ee967f76197b803615e27c5b001cf
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Feed the data one by one to the encoder or decoder to
verify that the handling of incremental decoding is
correct.
Change-Id: I565e4f1872e00859026334f7662b6778772e159d
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
IgnoreHeader was a rather badly defined enum, in addition the
utf8 and utf16 codecs where handling BOMs somewhat different
for stateless decoding.
Fix this by introducing explicit flags for writing a bom when
encoding and not skipping the initial bom when decoding.
Source compatibility for QTextCodec is done with a couple of
static constexpr variables.
Change-Id: I0b2d94f84c937cec1e0494c16ef448c00382691d
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The new QStringEncoder and QStringDecoder classes
(with a common QStringConverter base class) are
there to replace QTextCodec in Qt 6.
It currently uses a trivial wrapper around the utf
encoding functionality.
Added some autotests, mostly copied from the text codec
tests.
Change-Id: Ib6eeee55fba918b9424be244cbda9dfd5096f7eb
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The former messes in bad ways with the overload set (it, fatally,
attracts char16_t, e.g.). The latter was probably added in response to
ambiguities between (char) and (QChar). While it's harmless now,
remove it, since it no longer pulls its weight.
The no-ascii warning is now coming from QChar(char), so the protection
isn't lost.
[ChangeLog][QtCore][QString] The += operators taking char and
QChar::SpecialCharacter have been removed as they cause adding a
char16_t to QString to call the char overload, losing information. The
append() function was not affected.
Change-Id: I57116314bcc71c0d9476159513c0c10048239db3
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
The fromUtf16(ushort*) and fromUcs4(uint*) overloads are going
to be deprecated. Use the newer fromUtf16(char16_t*) and
fromUcs4(char32_t*) overloads.
As a drive-by, use std::end()/std::size() where applicable.
Change-Id: I5a93e38cae4a2e33d49c90d06c5f14f7cb7ce90c
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
The fromUcs2() named ctor is designed to replace all the non-char
integral-type constructors of QChar which make it very hard to control
the implicit QChar conversions, which have caused a few bugs in Qt
itself. As a classical named contructor, it simply returns QChar.
The fromUcs4() named "ctor", however, needs to expand surrogate pairs,
and thus can't just return QChar. Instead, it returns a small struct
that contains one or two char16_t's, can be iterated over and be
implicitly converted to QStringView. To avoid bikeshedding the name
(FromUcs4Result, of course :), it's defined inline and thus can't be
named outside the function. This function replaces most uses of
QChar::requiresSurrogates() in QtBase.
[ChangeLog][QtCore][QChar] Added fromUcs2(), fromUcs4().
Change-Id: I803708c14001040f75cb599e33c24a3fb8d2579c
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
wchar_t hasn't been QStringView's storage_type for a very long time
(at least since 5.10). And in Qt 6, we require char16_t support not
only in the compiler, but also in the stdlib, so drop the guards and
the alternative code paths.
Change-Id: I99f28b575f61c16a2497840708beaa4b54a80f57
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
They don't work with std::span, since std::span has neither
const_iterator nor cbegin()/cend().
To fix, perfectly forward the return type with decltype(auto) instead
of mentioning T::const_iterator, and rely on the const-qualfication of
the help::c{r,}{begin,end}() functions' arguments to select the
correct T::{r,}{begin,end}() overload.
Change-Id: I4992d4bd521d2dc0f9ea51ae70cde8286ae543a5
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
It clearly belonged in tst_QDate::toDateTime(), for which it adds a
few more test-case and in which it inspires some further testing.
The new testing of case-insensitivity doesn't work if the format
contains stray non-format characters, so added a new data column to
take care of that.
Pick-to: 5.15
Change-Id: I73619be02091c97024a84cb963c7029e9fd0569a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
We don't rely on a latin1 locale anymore for the test,
and the other code was not doing anything.
Change-Id: I08bc08d200c9e037884d8b680dfbb24c129f3d2e
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Alex Blasche <alexander.blasche@qt.io>
[ChangeLog][QtCore][QString] Now supports appending, prepending
and inserting QStringViews.
Change-Id: I7538c050c67590f27d91443eda0b94a4b80b62f2
Reviewed-by: Anton Kudryavtsev <antkudr@mail.ru>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
This is a follow-up to commit ed2b110b6a
to fix indexing errors. Added the test that should have accompanied
that commit, which found some bugs, and refined the Indian number
formatting test (on which it's based).
Make variable i in the loops that insert grouping characters in a
number be consistently a *character* offset - which, when each digit
is a surrogate pair, isn't the same as an index into the
QString. Apply the needed scaling when indexing with it, not when
setting it or decrementing it. Don't assume the separator has the same
width as a digit.
Differences in index no longer give the number of digits between two
points in a string, so actively track how many digits we've seen in a
group when converting a numeric string to the C locale. Partially
cleaned up the code for that in the process (more shall follow when I
sort out digit grouping properly, without special-casing India).
Change-Id: I13d0f24efa26e599dfefb5733e062088fa56d375
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Conflicts:
examples/opengl/doc/src/cube.qdoc
src/corelib/global/qlibraryinfo.cpp
src/corelib/text/qbytearray_p.h
src/corelib/text/qlocale_data_p.h
src/corelib/time/qhijricalendar_data_p.h
src/corelib/time/qjalalicalendar_data_p.h
src/corelib/time/qromancalendar_data_p.h
src/network/ssl/qsslcertificate.h
src/widgets/doc/src/graphicsview.qdoc
src/widgets/widgets/qcombobox.cpp
src/widgets/widgets/qcombobox.h
tests/auto/corelib/tools/qscopeguard/tst_qscopeguard.cpp
tests/auto/widgets/widgets/qcombobox/tst_qcombobox.cpp
tests/benchmarks/corelib/io/qdiriterator/qdiriterator.pro
tests/manual/diaglib/debugproxystyle.cpp
tests/manual/diaglib/qwidgetdump.cpp
tests/manual/diaglib/qwindowdump.cpp
tests/manual/diaglib/textdump.cpp
util/locale_database/cldr2qlocalexml.py
util/locale_database/qlocalexml.py
util/locale_database/qlocalexml2cpp.py
Resolution of util/locale_database/ are based on:
https://codereview.qt-project.org/c/qt/qtbase/+/294250
and src/corelib/{text,time}/*_data_p.h were then regenerated by
running those scripts.
Updated CMakeLists.txt in each of
tests/auto/corelib/serialization/qcborstreamreader/
tests/auto/corelib/serialization/qcborvalue/
tests/auto/gui/kernel/
and generated new ones in each of
tests/auto/gui/kernel/qaddpostroutine/
tests/auto/gui/kernel/qhighdpiscaling/
tests/libfuzzer/corelib/text/qregularexpression/optimize/
tests/libfuzzer/gui/painting/qcolorspace/fromiccprofile/
tests/libfuzzer/gui/text/qtextdocument/sethtml/
tests/libfuzzer/gui/text/qtextdocument/setmarkdown/
tests/libfuzzer/gui/text/qtextlayout/beginlayout/
by running util/cmake/pro2cmake.py on their changed .pro files.
Changed target name in
tests/auto/gui/kernel/qaction/qaction.pro
tests/auto/gui/kernel/qaction/qactiongroup.pro
tests/auto/gui/kernel/qshortcut/qshortcut.pro
to ensure unique target names for CMake
Changed tst_QComboBox::currentIndex to not test the
currentIndexChanged(QString), as that one does not exist in Qt 6
anymore.
Change-Id: I9a85705484855ae1dc874a81f49d27a50b0dcff7
The name of the option may cause confusion due to the fact
that it's not _fully_ anchoring the match, only anchoring it
at the offset passed to match() -- in other words, it's a
"left" anchoring. Deprecate the old name and introduce
a new one that should explain the situation better.
[ChangeLog][QtCore][QRegularExpression] The AnchoredMatchOption
match option has been deprecated in favor of
AnchorAtOffsetMatchOption, which should better describe
that the match is only anchored at the offset.
Change-Id: Ib751e5e488f2d0309a2da6496378247dfa4648de
Reviewed-by: Samuel Gaist <samuel.gaist@idiap.ch>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
In particular, this changed the US currency formats for negative
amounts to be parenthesised versions of the positive amount forms,
rather than having a minus sign after the $ sign. Test updated.
[ChangeLog][QtCore][QLocale] Currency formats are now based on CLDR's
accounting formats, where they were previously mostly based (more or
less by accident) on standard formats. In particular, this now means
negative currency formats are specified, where available, where they
(mostly) were not previously.
Task-number: QTBUG-79902
Change-Id: Ie0c07515ece8bd518a74a6956bf97ca85e9894eb
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
We should not implicitly convert a QString to a QLocale object. It can
easily create unwanted side effects.
Change-Id: I7bd9b4a4e4512c0e60176ee4d241d172f00fdc32
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
The call of _control87 would crash because of the previous test.
Change-Id: I254efe9c2e9892a473a02663e5ff7016791d5d6d
Reviewed-by: Tony Sarajärvi <tony.sarajarvi@qt.io>
Include WordBreakTest.html, since a test uses sample strings from it,
albeit without actually reading the file.
Had to comment out more of the new tests, as at Revision 24, pending
an update to harfbuzz and the text boundary detection code.
Task-number: QTBUG-79631
Task-number: QTBUG-79418
Task-number: QTBUG-82747
Change-Id: I0082294b09d67ffdc6a9b5c15acf77ad3b86f65f
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
No surprises, as char16_t is transparently handled by QChar overloads.
Ok, one surprise: we seem to have QChar <> QByteArray relational
operators, but they don't work for char16_t. Probably members of
QChar, so LHS implicit conversions are disabled. Didn't investigate,
because it needs to be fixed at some point anyway, but that point is
not now.
Change-Id: I74e1c9bdd168e6480e18d7d86c1f13412e718a32
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>