qt5base-lts/tests/auto/corelib/tools
Giuseppe D'Angelo bcd1b7fe8e Fix QString::toUcs4 returning invalid data when encountering stray surrogates
Code units 0xD800 .. 0xDFFF are not valid UCS-4, so we can't simply return
them. Instead, if we encounter a stray surrogate, replace it with U+FFFD,
which is what Unicode recommends anyway.

References:

§3.9 Unicode Encoding Forms

    D76: Unicode scalar value: Any Unicode code point except high-surrogate
    and low-surrogate code points.

    As a result of this definition, the set of Unicode scalar values consists
    of the ranges 0 to D7FF_16 and E000_16 to 10FFFF_16, inclusive.

    [...]

    UTF-32 encoding form: The Unicode encoding form that assigns each Unicode
    scalar value to a single unsigned 32-bit code unit with the same numeric
    value as the Unicode scalar value.
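
The two ranges quoted above translate directly into a range check. The following is a minimal sketch, not part of the commit; the function name `isUnicodeScalarValue` is a hypothetical helper and not a Qt API.

```cpp
// Hypothetical helper (not a Qt API): returns true if cp is a Unicode scalar
// value, i.e. it lies in 0..0xD7FF or 0xE000..0x10FFFF and is therefore a
// valid UTF-32 / UCS-4 code unit.
constexpr bool isUnicodeScalarValue(char32_t cp)
{
    return cp <= 0xD7FF || (cp >= 0xE000 && cp <= 0x10FFFF);
}

// Compile-time checks at the boundaries of the two ranges.
static_assert(isUnicodeScalarValue(0xD7FF), "last value before the surrogates");
static_assert(!isUnicodeScalarValue(0xD800), "first surrogate is excluded");
static_assert(!isUnicodeScalarValue(0xDFFF), "last surrogate is excluded");
static_assert(isUnicodeScalarValue(0xE000), "first value after the surrogates");
static_assert(!isUnicodeScalarValue(0x110000), "beyond U+10FFFF");
```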

§ C.2 Encoding Forms in ISO/IEC 10646

    UCS-4. UCS-4 stands for “Universal Character Set coded in 4 octets.” It is
    now treated simply as a synonym for UTF-32, and is considered the canonical
    form for representation of characters in 10646.

§ 3.9 Unicode Encoding Forms (Best Practices for Using U+FFFD)
and
§ 5.22 Best Practice for U+FFFD Substitution

    Whenever an unconvertible offset is reached during conversion of a code
    unit sequence:

    1. The maximal subpart at that offset should be replaced by a single
    U+FFFD.

    2. The conversion should proceed at the offset immediately after the
    maximal subpart.

    [...]

    Whenever an unconvertible offset is reached during conversion of a code
    unit sequence to Unicode:

    1. Find the longest code unit sequence that is the initial subsequence of
    some sequence that could be converted. If there is such a sequence, replace
    it with a single U+FFFD; otherwise replace a single code unit with a single
    U+FFFD.

    2. The conversion should proceed at the offset immediately after the
    subsequence which has been replaced.
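
For UTF-16 input, the substitution rule quoted above reduces to a simple case: a stray surrogate is a single code unit, so each one becomes exactly one U+FFFD and conversion resumes at the next code unit. The sketch below illustrates that behaviour using only the standard library. It is an assumption-laden illustration, not the code from this commit; `utf16ToUcs4`, `isHighSurrogate`, `isLowSurrogate` and `kReplacementChar` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names throughout; this is not Qt's implementation.
constexpr char32_t kReplacementChar = 0xFFFD;

inline bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Converts UTF-16 code units to UCS-4, replacing every stray surrogate
// with a single U+FFFD and continuing at the following code unit.
std::u32string utf16ToUcs4(const std::u16string &in)
{
    std::u32string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char16_t u = in[i];
        if (isHighSurrogate(u) && i + 1 < in.size() && isLowSurrogate(in[i + 1])) {
            // Well-formed pair: combine both halves into one code point.
            const char32_t high = static_cast<char32_t>(u) - 0xD800;
            const char32_t low = static_cast<char32_t>(in[i + 1]) - 0xDC00;
            out.push_back(0x10000 + (high << 10) + low);
            ++i; // the low surrogate has been consumed as well
        } else if (isHighSurrogate(u) || isLowSurrogate(u)) {
            // Stray surrogate: never emit it, substitute U+FFFD instead.
            out.push_back(kReplacementChar);
        } else {
            // Any other code unit maps to the same numeric value.
            out.push_back(static_cast<char32_t>(u));
        }
    }
    return out;
}

// Small usage example: a lone high surrogate between two letters.
inline void checkStraySurrogateExample()
{
    const std::u16string input{u'a', static_cast<char16_t>(0xD800), u'b'};
    assert(utf16ToUcs4(input) == U"a\uFFFDb");
}
```

Note that a high surrogate followed by another high surrogate replaces only the first one; the second is examined again on the next iteration and may still pair with a following low surrogate.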

[ChangeLog][QtCore][QString] QString::toUcs4 no longer returns invalid
UCS-4 code units belonging to the surrogate range (U+D800 to U+DFFF)
when the QString contains malformed UTF-16 data. Instead, U+FFFD
is returned in place of the malformed subsequence.

Change-Id: I19d7af03e749fea680fd5d9635439bc9d56558a9
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2014-02-07 15:00:36 +01:00
Name | Last commit | Date
qalgorithms | tst_QAlgorithms: fix compilation with C++11 enabled | 2013-11-17 09:48:17 +01:00
qarraydata | Fix QArrayData check | 2014-01-26 20:03:34 +01:00
qbitarray | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qbytearray | Base64: Implement the "base64url" encoding and the stripping of '=' | 2013-09-14 03:20:25 +02:00
qbytearraymatcher | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qbytedatabuffer | don't erroneously claim that gui support is needed | 2013-10-16 17:10:15 +02:00
qcache | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qchar | Introduce QChar::JoiningType enum and QChar::joiningType() method | 2014-01-29 23:19:47 +01:00
qcollator | QCollator: enable move semantics | 2013-11-17 09:47:07 +01:00
qcommandlineparser | Merge remote-tracking branch 'origin/stable' into dev | 2013-12-24 00:56:59 +01:00
qcontiguouscache | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qcryptographichash | Whitespace cleanup: remove trailing whitespace | 2013-03-16 20:22:50 +01:00
qdate | QDate - Fix parsing Qt::ISODate | 2014-01-11 20:45:22 +01:00
qdatetime | Merge remote-tracking branch 'origin/stable' into dev | 2014-01-20 18:18:59 +01:00
qeasingcurve | Fix MSVC-warnings about double to float truncation. | 2014-01-24 20:26:39 +01:00
qelapsedtimer | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qexplicitlyshareddatapointer | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qfreelist | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qhash | QHash/QSet - fix QHash::erase when the hash is shared | 2013-08-24 15:36:30 +02:00
qline | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qlinkedlist | QLinkedList - extend the auto test. | 2013-08-29 08:16:44 +02:00
qlist | QList - fix insert with iterator on shared instance | 2013-08-24 15:36:30 +02:00
qlocale | Revert "test: marked tst_qlocale as insignificant on Windows" | 2014-01-30 18:04:59 +01:00
qmap | Add first/last accessors to QMap | 2013-09-08 16:13:16 +02:00
qmargins | Add missing operators QMargins -=,+= (int). | 2013-10-15 18:20:37 +02:00
qmessageauthenticationcode | tst_qmessageauthenticationcode: Fix warning about character conversion. | 2013-06-08 10:29:34 +02:00
qpair | Change copyrights from Nokia to Digia | 2012-09-22 19:20:11 +02:00
qpoint | Add static dotProduct methods to the QPoint(F) classes | 2013-01-26 00:09:14 +01:00
qpointf | Fix QPointF::division autotest | 2013-08-19 14:24:28 +02:00
qqueue | Whitespace cleanup: remove trailing whitespace | 2013-03-16 20:22:50 +01:00
qrect | Merge remote-tracking branch 'origin/stable' into dev | 2013-01-22 18:40:13 +01:00
qregexp | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qregularexpression | QRegularExpression: print a warning if (?J) is used in a pattern | 2013-02-12 22:40:21 +01:00
qringbuffer | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qscopedpointer | Revert "Implement move-ctor and move-assignment-op for QScopedPointer" | 2013-09-05 08:20:19 +02:00
qscopedvaluerollback | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qset | Added initializer list constructors for Qt associative containers. | 2013-01-24 11:38:54 +01:00
qsharedpointer | Add QT_NO_PROCESS guards in tests where they are missing | 2013-09-03 08:42:24 +02:00
qsize | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qsizef | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qstl | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qstring | Fix QString::toUcs4 returning invalid data when encountering stray surrogates | 2014-02-07 15:00:36 +01:00
qstring_no_cast_from_bytearray | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qstringbuilder | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qstringiterator | Long live QStringIterator! | 2014-02-07 04:47:04 +01:00
qstringlist | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qstringmatcher | Update copyright year in Digia's license headers | 2013-01-18 09:07:35 +01:00
qstringref | New QStringRef methods. | 2013-09-11 08:06:21 +02:00
qtextboundaryfinder | Update UCD source files up to Unicode 6.3.0 | 2014-01-14 15:38:43 +01:00
qtime | Use the short time format of the current locale on Windows | 2013-12-16 22:26:37 +01:00
qtimeline | Fixed bug in QTimeLine::setPaused(false) | 2013-03-13 14:51:05 +01:00
qtimezone | Merge remote-tracking branch 'origin/stable' into dev | 2013-12-16 16:59:33 +01:00
qvarlengtharray | Add QVarLengthArray::{indexOf,lastIndexOf,contains} functions | 2014-01-09 17:42:22 +01:00
qvector | Fix crash when constructing a QVector with an empty initializer list. | 2014-01-18 11:16:40 +01:00
tools.pro | Long live QStringIterator! | 2014-02-07 04:47:04 +01:00