13 Commits

Author SHA1 Message Date
Marc Mutz
ee63557112 QString/View: add tokenize() member functions
[ChangeLog][QtCore][QString, QStringView, QLatin1String] Added tokenize().

Change-Id: I5fbeab0ac1809ff2974e565129b61a6bdfb398bc
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-06-06 02:07:28 +00:00
Edward Welbourne
9dd8e655cd Limit QByteArray's 8-bit support to ASCII
Previously it handled Latin-1, which made it incompatible with UTF-8,
which is now our preferred 8-bit encoding. For Qt6 it is limited to
ASCII. Adjusted tests to match. QLatin1String::compare() turned out
to be relying on qstrnicmp()'s Latin-1 handling.

Removed some spurious Q_UNLIKELY()s and tidied up code a little in the
process.

[ChangeLog][QtCore][Important Behavior Changes] Encoding-dependent
features of QByteArray are now limited to ASCII, where previously
they worked for the whole of Latin-1. This affects case-insensitive
comparison, notably including qstricmp() and qstrnicmp(), and
case-transforming functions.
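
To illustrate the behavior change described above, here is a minimal sketch in standard C++ of a case-insensitive comparison that folds only ASCII letters. The names asciiLower() and asciiCaseCompare() are hypothetical and are not Qt functions; this is not the code of qstrnicmp(), only a model of the ASCII-only rule.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not Qt code: lower-cases A-Z only and leaves every
// other byte, including 0x80-0xFF, untouched.
constexpr unsigned char asciiLower(unsigned char c) noexcept
{
    return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c + ('a' - 'A')) : c;
}

// Hypothetical ASCII-only, length-limited, case-insensitive comparison.
// Returns a negative value, zero or a positive value, like strncmp().
int asciiCaseCompare(const char *lhs, const char *rhs, std::size_t len) noexcept
{
    for (std::size_t i = 0; i < len; ++i) {
        const unsigned char l = asciiLower(static_cast<unsigned char>(lhs[i]));
        const unsigned char r = asciiLower(static_cast<unsigned char>(rhs[i]));
        if (l != r)
            return l < r ? -1 : 1;
        if (l == 0)
            return 0; // both strings ended before len bytes
    }
    return 0;
}

// Usage sketch: ASCII letters still fold, Latin-1 letters no longer do.
inline void example()
{
    assert(asciiCaseCompare("Qt", "qT", 2) == 0);
    // 0xC9 and 0xE9 are 'É' and 'é' in Latin-1; under an ASCII-only rule
    // they are just two different bytes.
    assert(asciiCaseCompare("\xC9", "\xE9", 1) != 0);
}
```

Leaving bytes above 0x7F alone is what keeps such a comparison from altering the bytes of multi-byte UTF-8 sequences.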

Fixes: QTBUG-84323
Change-Id: I2925d9908f8654599195a2860847b17083911b41
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
2020-06-04 10:39:53 +02:00
Marc Mutz
6a3c6f939f Long live QStringTokenizer!
This class is designed as a C++20-style generator / lazy sequence, and
is the new return value of QString{,View}::tokenize().

It thus is more similar to a hand-coded loop around indexOf() than
QString::split(), which returns a container (the filling of which
allocates memory).
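
The contrast between the two approaches can be sketched in standard C++17, without Qt. splitEager() and LazyTokens are hypothetical names chosen for this illustration; they are not Qt API and do not reproduce QStringTokenizer's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: eager splitting. It fills and returns a container,
// so it allocates memory for the result.
std::vector<std::u16string_view> splitEager(std::u16string_view haystack,
                                            std::u16string_view needle)
{
    assert(!needle.empty()); // an empty separator would never advance
    std::vector<std::u16string_view> parts;
    std::size_t start = 0;
    for (;;) {
        const std::size_t hit = haystack.find(needle, start);
        if (hit == std::u16string_view::npos) {
            parts.push_back(haystack.substr(start));
            return parts;
        }
        parts.push_back(haystack.substr(start, hit - start));
        start = hit + needle.size();
    }
}

// Hypothetical helper: lazy tokenizing. Each call to next() performs one
// search, like one iteration of a hand-written loop around find(), and
// nothing is stored apart from the current position.
class LazyTokens
{
public:
    LazyTokens(std::u16string_view haystack, std::u16string_view needle)
        : m_haystack(haystack), m_needle(needle)
    {
        assert(!m_needle.empty());
    }

    // Stores the next token in `token` and returns true, or returns false
    // once the input is exhausted.
    bool next(std::u16string_view &token)
    {
        if (m_done)
            return false;
        const std::size_t hit = m_haystack.find(m_needle, m_start);
        if (hit == std::u16string_view::npos) {
            token = m_haystack.substr(m_start);
            m_done = true;
        } else {
            token = m_haystack.substr(m_start, hit - m_start);
            m_start = hit + m_needle.size();
        }
        return true;
    }

private:
    std::u16string_view m_haystack;
    std::u16string_view m_needle;
    std::size_t m_start = 0;
    bool m_done = false;
};
```

Both helpers yield the same tokens in the same order; the lazy one lets the caller stop early without paying for tokens it never looks at.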

The template arguments of QStringTokenizer intricately depend on the
arguments with which it is constructed, so QStringTokenizer cannot be used
directly without C++17 CTAD. To work around this issue, add a factory
function, qTokenize().
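
The factory-function workaround can be sketched with a simplified class in standard C++14. Tokenizer and makeTokenizer() are hypothetical stand-ins for illustration; the real QStringTokenizer and qTokenize() have more template machinery than is shown here.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class: its template arguments are simply the (decayed)
// types of the constructor arguments.
template <typename Haystack, typename Needle>
class Tokenizer
{
public:
    Tokenizer(Haystack haystack, Needle needle)
        : m_haystack(std::move(haystack)), m_needle(std::move(needle)) {}

    const Haystack &haystack() const noexcept { return m_haystack; }
    const Needle &needle() const noexcept { return m_needle; }

private:
    Haystack m_haystack;
    Needle m_needle;
};

// Hypothetical factory: function templates deduce their arguments in every
// C++ version, so callers need neither C++17 CTAD nor explicit template
// arguments.
template <typename Haystack, typename Needle>
Tokenizer<std::decay_t<Haystack>, std::decay_t<Needle>>
makeTokenizer(Haystack &&haystack, Needle &&needle)
{
    return Tokenizer<std::decay_t<Haystack>, std::decay_t<Needle>>(
        std::forward<Haystack>(haystack), std::forward<Needle>(needle));
}

// Usage sketch: the caller never spells out the template arguments.
inline void example()
{
    auto t = makeTokenizer(std::string("a b"), ' ');
    static_assert(std::is_same<decltype(t), Tokenizer<std::string, char>>::value,
                  "template arguments are deduced from the call");
    assert(t.haystack() == "a b");
    assert(t.needle() == ' ');
}
```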

LATER:
- ~Optimize QLatin1String needles (avoid repeated L1->UTF16 conversion)~
  (out of scope for QStringTokenizer, should be solved in the respective
  indexOf())
- Keep per-instantiation state:
  * Boyer-Moore table

[ChangeLog][QtCore][QStringTokenizer] New class.

[ChangeLog][QtCore][qTokenize] New function.

Change-Id: I7a7a02e9175cdd3887778f29f2f91933329be759
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2020-06-03 19:13:54 +02:00
Lars Knoll
a1056096fc Add support for count() to QStringView
Make the API more symmetric with regard to both QString and QStringRef.

Change-Id: Ia67c53ba708f6c33874d1a127de8e2857ad9b5b8
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2020-05-30 23:11:09 +02:00
Lars Knoll
beaef85b8d Add toInt() and friends to QStringView
Make the API more symmetric with regard to both QString and QStringRef.
Having this available helps make QStringView more of a drop-in
replacement for QStringRef. QStringRef is slated for removal in Qt 6.

Change-Id: Ife036c0b55970078f42e1335442ff9ee5f4a2f0d
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2020-05-29 19:01:58 +02:00
Marc Mutz
d67551ff5b tst_qstringapisymmetry: test split() with char16_t seps
Change-Id: I6744291b88d5334764da78375899313ac771294b
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
2020-05-08 16:52:59 +02:00
Sona Kurazyan
7e1dacc27a Port qtbase/tests/auto/corelib/text tests to CMake
Task-number: QTBUG-78220
Change-Id: I497da6ed489854bdee5a1ead9a3f34118c78d001
Reviewed-by: Alexandru Croitor <alexandru.croitor@qt.io>
2020-04-27 14:34:51 +02:00
Marc Mutz
c58249c327 tst_qstringapisymmetry: start testing char16_t, too
No surprises, as char16_t is transparently handled by QChar overloads.

Ok, one surprise: we seem to have QChar <> QByteArray relational
operators, but they don't work for char16_t. Probably members of
QChar, so LHS implicit conversions are disabled. Didn't investigate,
because it needs to be fixed at some point anyway, but that point is
not now.

Change-Id: I74e1c9bdd168e6480e18d7d86c1f13412e718a32
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2020-03-05 10:51:39 +03:00
Marc Mutz
50f865e33f tst_qstringapisymmetry: fix indexOf/contains/lastIndexOf tests
... to not fold QChar tests into QString ones.

This is needed for adding char16_t tests.

Change-Id: I2507d7d68a39ff96cf033eadde10e383dc976dda
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2020-03-05 07:50:33 +00:00
Marc Mutz
b2f79cceb1 QLatin1String/QStringView: add (missing) member compare()
[ChangeLog][QtCore][QLatin1String] Added compare().

[ChangeLog][QtCore][QStringView] Added compare() overloads
taking QLatin1String, QChar.

Change-Id: Ie2aa400299cb63495e65ce29b2a32133066de826
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-03-05 07:50:02 +00:00
Marc Mutz
728cc964f3 QString/QByteArray: make all symmetry-checked member-compare() combinations noexcept
In QByteArray, they were just not marked as such.

In QString and QStringRef, the implicit conversion from QChar to
QString would destroy the noexcept guarantee. Add a QChar overload,
delegating to QStringView.
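
Why an implicit conversion defeats noexcept, and why a dedicated overload restores it, can be shown with a small standard C++ sketch. Str and StrWithCharOverload are hypothetical stand-ins built on std::string; they are not the Qt classes.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in: compare() is noexcept, but a char argument first
// has to be converted through an allocating, potentially throwing constructor.
struct Str
{
    Str(const char *s) : data(s) {}
    Str(char c) : data(1, c) {} // implicit conversion, may allocate

    int compare(const Str &other) const noexcept { return data.compare(other.data); }

    std::string data;
};

// Same class with a dedicated char overload: no temporary is constructed,
// so the whole call expression is noexcept.
struct StrWithCharOverload
{
    StrWithCharOverload(const char *s) : data(s) {}
    StrWithCharOverload(char c) : data(1, c) {}

    int compare(const StrWithCharOverload &other) const noexcept
    { return data.compare(other.data); }
    int compare(char c) const noexcept
    { return data.compare(0, std::string::npos, &c, 1); }

    std::string data;
};

// The noexcept operator looks at the full expression, including the
// constructor call needed for the implicit conversion.
static_assert(!noexcept(std::declval<const Str &>().compare('x')),
              "conversion through Str(char) is potentially throwing");
static_assert(noexcept(std::declval<const StrWithCharOverload &>().compare('x')),
              "the char overload avoids the conversion");

// Usage sketch: both variants return the same results.
inline void example()
{
    assert(Str("a").compare('a') == 0);
    assert(StrWithCharOverload("a").compare('a') == 0);
    assert(StrWithCharOverload("a").compare('b') < 0);
}
```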

Added docs for the new overloads, copying from the nearest neighbor so
as to not look out of place. All string classes use different wording
for these functions. A cleanup of this state of affairs is out of the
scope of this patch.

Change-Id: I0b7b1d037aa229bcaf29b793841a18caf977d66b
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-03-05 07:49:40 +00:00
Marc Mutz
6b9a1824a4 Extend tst_qstringapisymmetry for member compare()
There were a few surprises:

- The QByteArray::compare() overloads are missing noexcept (will add)
- ibid., called with non-ASCII content and CaseInsensitive fails
  (this was discussed on the ML, with tentative agreement that
  it's a feature, not a bug; waiting for QUtf8String(View) for a
  fix, then).
- As was the case when we did this exercise with the relational
  operators, QString(Ref)/QChar is not noexcept (will fix)

These have been QEXPECT_FAIL'ed.

Not much of the Cartesian product is implemented at all, yet. The
missing combinations have been #ifdef'ed with NOT_YET_IMPLEMENTED to
see what's still missing.

Change-Id: I7d9b21e292b98f980aacdc6248e88188f7472ba2
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-02-09 17:43:20 +00:00
Edward Welbourne
a9aa206b7b Move text-related code out of corelib/tools/ to corelib/text/
This includes byte array, string, char, unicode, locale, collation and
regular expressions.

Change-Id: I8b125fa52c8c513eb57a0f1298b91910e5a0d786
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
2019-07-10 17:05:30 +02:00