Giuseppe D'Angelo bcd1b7fe8e Fix QString::toUcs4 returning invalid data when encountering stray surrogates
Code units 0xD800 .. 0xDFFF are not UCS-4, so we can't happily return them.
Instead, if we encounter a stray surrogate, replace it with 0xFFFD, which
is what Unicode recommends anyhow.

References:

§3.9 Unicode Encoding Forms

    D76: Unicode scalar value: Any Unicode code point except high-surrogate
    and low-surrogate code points.

    As a result of this definition, the set of Unicode scalar values consists
    of the ranges 0 to D7FF_16 and E000_16 to 10FFFF_16, inclusive.

    [...]

    UTF-32 encoding form: The Unicode encoding form that assigns each Unicode
    scalar value to a single unsigned 32-bit code unit with the same numeric
    value as the Unicode scalar value.
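
As an illustration of definition D76 quoted above, a scalar-value check could be sketched as follows. The helper name isUnicodeScalarValue is hypothetical and is not part of Qt or of this commit; it only restates the two ranges given in the definition.

```cpp
#include <cstdint>

// Hypothetical helper (not a Qt API): true if cp is a Unicode scalar value,
// i.e. any code point in 0..0xD7FF or 0xE000..0x10FFFF (surrogates excluded).
constexpr bool isUnicodeScalarValue(std::uint32_t cp)
{
    return cp <= 0xD7FF || (cp >= 0xE000 && cp <= 0x10FFFF);
}
```

Under this definition, a value such as 0xD800 is not a valid UTF-32 (UCS-4) code unit, which is why returning it from a UCS-4 conversion is a bug.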

§ C.2 Encoding Forms in ISO/IEC 10646

    UCS-4. UCS-4 stands for “Universal Character Set coded in 4 octets.” It is
    now treated simply as a synonym for UTF-32, and is considered the canonical
    form for representation of characters in 10646.

§ 3.9 Unicode Encoding Forms (Best Practices for Using U+FFFD)
and
§ 5.22 Best Practice for U+FFFD Substitution

    Whenever an unconvertible offset is reached during conversion of a code
    unit sequence:

    1. The maximal subpart at that offset should be replaced by a single
    U+FFFD.

    2. The conversion should proceed at the offset immediately after the
    maximal subpart.

    [...]

    Whenever an unconvertible offset is reached during conversion of a code
    unit sequence to Unicode:

    1. Find the longest code unit sequence that is the initial subsequence of
    some sequence that could be converted. If there is such a sequence, replace
    it with a single U+FFFD; otherwise replace a single code unit with a single
    U+FFFD.

    2. The conversion should proceed at the offset immediately after the
    subsequence which has been replaced.
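
The substitution rule quoted above can be sketched for UTF-16 input roughly as follows. This is a minimal standalone illustration, not the code changed by this commit: the function name utf16ToUcs4 is hypothetical, and it works on std::u16string rather than QString so that it has no Qt dependency. In UTF-16 a stray surrogate is always a single code unit, so each one maps to exactly one U+FFFD.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch (not the Qt implementation): convert UTF-16 code units
// to UCS-4, replacing every stray surrogate with U+FFFD.
std::vector<std::uint32_t> utf16ToUcs4(const std::u16string &in)
{
    constexpr std::uint32_t Replacement = 0xFFFD;
    std::vector<std::uint32_t> out;
    out.reserve(in.size());

    for (std::size_t i = 0; i < in.size(); ++i) {
        const std::uint32_t u = in[i];
        const bool isHigh = (u >= 0xD800 && u <= 0xDBFF);
        const bool isLow = (u >= 0xDC00 && u <= 0xDFFF);

        if (!isHigh && !isLow) {
            // Ordinary BMP code unit: already a scalar value.
            out.push_back(u);
        } else if (isHigh && i + 1 < in.size()
                   && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Well-formed surrogate pair: combine into one code point.
            const std::uint32_t low = in[i + 1];
            out.push_back(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00));
            ++i; // consume the low surrogate as well
        } else {
            // Stray high or low surrogate: replace this single code unit
            // and continue at the next offset.
            out.push_back(Replacement);
        }
    }
    return out;
}
```

With this sketch, the input { 'a', 0xD800, 'b' } would yield { 0x61, 0xFFFD, 0x62 }, while a valid pair such as 0xD83D 0xDE00 would yield the single code point 0x1F600.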

[ChangeLog][QtCore][QString] QString::toUcs4 no longer returns invalid
UCS-4 code units belonging to the surrogate range (U+D800 to U+DFFF)
when the QString contains malformed UTF-16 data. Instead, U+FFFD
is returned in place of each malformed subsequence.

Change-Id: I19d7af03e749fea680fd5d9635439bc9d56558a9
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2014-02-07 15:00:36 +01:00
bin Merge "Merge remote-tracking branch 'origin/release' into stable" into refs/staging/stable 2013-06-15 22:39:25 +02:00
config.tests Merge "Merge remote-tracking branch 'origin/stable' into dev" into refs/staging/dev 2014-01-21 17:57:54 +01:00
dist Merge remote-tracking branch 'origin/stable' into dev 2013-12-24 00:56:59 +01:00
doc Ask qdoc not to parse Q_DECL_UNUSED 2014-01-14 18:52:14 +01:00
examples Fix MSVC-warnings about double to float truncation. 2014-01-24 20:26:39 +01:00
lib Initial import from the monolithic Qt. 2011-04-27 12:05:43 +02:00
mkspecs Update the macro that MSVC 2013 defines for AVX code generation 2014-02-01 00:58:58 +01:00
qmake Fix configure & qmake compilation with a future MSVC version 2014-02-01 06:56:45 +01:00
src Fix QString::toUcs4 returning invalid data when encountering stray surrogates 2014-02-07 15:00:36 +01:00
tests Fix QString::toUcs4 returning invalid data when encountering stray surrogates 2014-02-07 15:00:36 +01:00
tools Fix configure & qmake compilation with a future MSVC version 2014-02-01 06:56:45 +01:00
util Introduce QChar::JoiningType enum and QChar::joiningType() method 2014-01-29 23:19:47 +01:00
.gitattributes Update the git-archive export options 2012-09-07 15:39:31 +02:00
.gitignore GitIgnore updates 2013-12-09 17:28:18 +01:00
.qmake.conf Enable -Werror for all of qtbase 2013-09-04 01:50:10 +02:00
.tag Update the git-archive export options 2012-09-07 15:39:31 +02:00
configure Set QMAKE_X11_PREFIX also when Qt configure with -qt-xcb 2014-02-04 20:57:06 +01:00
configure.bat get rid of syncqt wrapper scripts 2013-05-13 21:54:48 +02:00
header.BSD Update copyright year in Digia's license headers 2013-01-18 09:07:35 +01:00
header.FDL Update copyright year in Digia's license headers 2013-01-18 09:07:35 +01:00
header.LGPL Update copyright year in Digia's license headers 2013-01-18 09:07:35 +01:00
header.LGPL-ONLY Update copyright year in Digia's license headers 2013-01-18 09:07:35 +01:00
INSTALL INSTALL: Fix URL of Installing Qt documentation 2013-04-11 16:09:07 +02:00
LGPL_EXCEPTION.txt Change copyrights from Nokia to Digia 2012-09-22 19:20:11 +02:00
LICENSE.FDL Initial import from the monolithic Qt. 2011-04-27 12:05:43 +02:00
LICENSE.GPL Add the LICENSE.GPL file to the module referenced from license headers 2012-05-20 22:41:08 +02:00
LICENSE.LGPL Update copyright year in Digia's license headers 2013-01-18 09:07:35 +01:00
LICENSE.PREVIEW.COMMERCIAL Update LICENSE.PREVIEW.COMMERCIAL license 2013-06-03 20:04:26 +02:00
qtbase.pro Merge remote-tracking branch 'origin/release' into stable 2013-12-05 17:42:33 +01:00
sync.profile generate qfeatures.h at build time 2013-10-29 15:37:58 +01:00