Commit Graph

9 Commits

Author SHA1 Message Date
Lars Knoll
57037145f5 Streamline the code in the conversion to and from utf8
Move pre- and postcondition handling out of the main loop
to make the loop itself as fast as possible.

Remove the special handling of the zero-length input corner
case, where the UTF-8 decoder behaved differently from all
other decoders.

Change-Id: I94992767ea15405b38f7953adadaa6ff98b20b6f
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:51:45 +02:00
Lars Knoll
124d587bb9 Document the string converter classes
Document QStringConverter, QStringDecoder and QStringEncoder.

In addition, touch up the API: rename one enum value, add a
flags argument to one constructor, and make some members private.

Change-Id: I8f99dc3d98fb8860cf6fa46301e34b7eb400511b
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:50:29 +02:00
Lars Knoll
b8db123341 Add a method to determine the encoding for encoded HTML data
This is a replacement for Qt::codecForHtml().

Change-Id: I31f03518fd9c70507cbd210a8bcf405b6a0106b1
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:49:05 +02:00
Lars Knoll
13af1312f7 Add QStringConverter::encodingForData()
Add a method that tries to determine the encoding of the data
from an initial byte order mark.

Change-Id: I348c51a3d4db9b434af53359b739a7e17acfc760
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:48:55 +02:00
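
The byte order mark check described in the commit above can be illustrated with a small standalone sketch. This is not Qt's implementation: the `Encoding` enum and the `encodingFromBom()` and `startsWith()` helpers are hypothetical names made up for this example, and the real QStringConverter::encodingForData() may behave differently.

```cpp
#include <cstddef>
#include <initializer_list>
#include <optional>
#include <string_view>

// Hypothetical enum for this sketch; not Qt's QStringConverter::Encoding.
enum class Encoding { Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// True if `data` begins with exactly the given byte sequence.
static bool startsWith(std::string_view data,
                       std::initializer_list<unsigned char> bom)
{
    if (data.size() < bom.size())
        return false;
    std::size_t i = 0;
    for (unsigned char b : bom) {
        if (static_cast<unsigned char>(data[i++]) != b)
            return false;
    }
    return true;
}

// Returns the encoding implied by a leading byte order mark, or
// std::nullopt when the data does not start with a recognized BOM.
std::optional<Encoding> encodingFromBom(std::string_view data)
{
    // The 4-byte UTF-32 marks are checked first, because the UTF-32LE
    // BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
    if (startsWith(data, {0xFF, 0xFE, 0x00, 0x00}))
        return Encoding::Utf32LE;
    if (startsWith(data, {0x00, 0x00, 0xFE, 0xFF}))
        return Encoding::Utf32BE;
    if (startsWith(data, {0xEF, 0xBB, 0xBF}))
        return Encoding::Utf8;
    if (startsWith(data, {0xFF, 0xFE}))
        return Encoding::Utf16LE;
    if (startsWith(data, {0xFE, 0xFF}))
        return Encoding::Utf16BE;
    return std::nullopt;
}
```

Note that FF FE 00 00 is inherently ambiguous: it could also be a UTF-16LE BOM followed by U+0000. This sketch resolves the ambiguity in favor of UTF-32LE.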
Lars Knoll
a639bcda1e Add methods to convert between encoding and name to QStringConverter
Add static methods that allow converting between a name for an
encoding and the Encoding enum.

Change-Id: I12bc503cf757ea31d3ca8d5e1f1216efddcb16d4
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:48:49 +02:00
Lars Knoll
3ce9162ab5 Construct a string converter by name
Add a constructor that allows constructing a string converter by
name. This is required in some cases and also makes it possible to
extend the API to third-party encodings in the future.

Also add a name() accessor.

Change-Id: I606d6ce9405ee967f76197b803615e27c5b001cf
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:48:42 +02:00
Lars Knoll
940665eff5 Add some incremental tests
Feed the data to the encoder or decoder one element at a time
to verify that incremental encoding and decoding are handled
correctly.

Change-Id: I565e4f1872e00859026334f7662b6778772e159d
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:48:13 +02:00
Lars Knoll
cab0d57d1e Clean up the Flag handling in QStringConverter
IgnoreHeader was a rather badly defined enum value. In addition,
the UTF-8 and UTF-16 codecs were handling BOMs somewhat differently
for stateless decoding.

Fix this by introducing explicit flags for writing a BOM when
encoding and for not skipping the initial BOM when decoding.

Source compatibility for QTextCodec is maintained with a couple of
static constexpr variables.

Change-Id: I0b2d94f84c937cec1e0494c16ef448c00382691d
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:47:33 +02:00
Lars Knoll
f64a6bd638 Start work on a new API to replace QTextCodec
The new QStringEncoder and QStringDecoder classes
(with a common QStringConverter base class) are
there to replace QTextCodec in Qt 6.

They are currently a trivial wrapper around the UTF
encoding functionality.

Added some autotests, mostly copied from the text codec
tests.

Change-Id: Ib6eeee55fba918b9424be244cbda9dfd5096f7eb
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-05-14 07:46:14 +02:00