Completes the inplace converters so that we can rely on inplace
conversions to succede as long as the image depth is the same.
Change-Id: Ia1ae34b5de1bc16e87ff5403bdacfcae44a22791
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
Cleaning up smoothscale code. Upscaling is improved using existing
optimized interpolation methods, and downscale is given SSE4.1
optimized versions.
Change-Id: I7cdc256c26850948aef7dae26fda1622be6b8179
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
Improves the conversion from RGB888 to RGB32 on platforms without SIMD
versions. This includes the fallback used on non-neon ARM devices.
Besides image conversion the routine is also used for decoding JPEG.
On x86 this version is within 0.7x of the speed of the SSSE3 version.
Change-Id: Id131994d7c3c4f879d89e80f9d6c435bb5535ed7
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
There are many direct QImage conversions that doesn't need to be direct
but only are because they are faster than the generic conversion. This
patch optimizes the generic conversions and then removes all the direct
conversions that are now no faster than the generic.
Change-Id: I3dc5f44cc7f6358fd66420e9974eebaf2c7ca59c
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
Qt copyrights are now in The Qt Company, so we could update the source
code headers accordingly. In the same go we should also fix the links to
point to qt.io.
Outdated header.LGPL removed (use header.LGPL21 instead)
Old header.LGPL3 renamed to header.LGPL3-COMM to match actual licensing
combination. New header.LGPL-COMM taken in the use file which were
using old header.LGPL3 (src/plugins/platforms/android/extract.cpp)
Added new header.LGPL3 containing Commercial + LGPLv3 + GPLv2 license
combination
Change-Id: I6f49b819a8a20cc4f88b794a8f6726d975e8ffbe
Reviewed-by: Matti Paaso <matti.paaso@theqtcompany.com>
The autovectorized versions of premultiplying conversions are almost
twice as fast with SSE4.1 as with SSE2. Therefore this patch lets
compilers that can make those versions convenient without duplicating
code do that and lets us use them when available.
Change-Id: I699035963abe55a38b9ef8ba7b4a8c961c8dfcdd
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
We still have a bunch of Q_WS_ ifdefs in our code, which are easy to
mistake for Q_OS_ ifdefs when quickly scanning the code. By renaming
the ifdefs we make it clear that the code in question is dead.
In incremental follow-ups, we can then selectively either remove, or
port, the pieces that are dead code.
Change-Id: Ib5ef3e9e0662d321f179f3e25122cacafff0f41f
Reviewed-by: Marc Mutz <marc.mutz@kdab.com>
With auto-vectorization enabled in QtGui, the 32bit version of
qPremultiply is faster than the 64bit version since it can be vectorized
wider (4x on 128bit as opposed to 2x). Since all our important 64bit
targets have SIMD, that makes the 64bit version pointless.
Change-Id: I4e9070a3a3c8e2b54f17a95ba0aee0405cbb8ec9
Reviewed-by: Marc Mutz <marc.mutz@kdab.com>
The class hasn't been used for a while anymore. Since it's
private, simply remove it from QtGui.
Change-Id: Ia0911d1c8b8836d963a51c8e354c96bc1ee4093f
Reviewed-by: Gunnar Sletta <gunnar@sletta.org>
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@theqtcompany.com>
There's a change in Qt 5.4.0 that makes Qt compile with its own set of
D-Bus headers, which means QT_CFLAGS_DBUS may be empty. Thus, we can't
compile or link if we're using the actual libdbus-1 API to build the
test.
This commit makes these unit tests use the same dynamic loading
mechanism.
Change-Id: I56b2a7320086ef88793f6552cb54ca6224010451
Reviewed-by: Simon Hausmann <simon.hausmann@digia.com>
This covers the only real additions over QVector: push and pop. Really, there
isn't too much specific to benchmark here, but we're interested in one specific
case: that of pushing and popping a single item repeatedly.
With the current QVector behavior, this causes constant deallocation, which
makes it morbidly slow. This behavior will be reviewed in a subsequent commit.
Results (not that anyone really cares) for me:
PASS : tst_QStack::qstack_push()
RESULT : tst_QStack::qstack_push():
1.9 msecs per iteration (total: 61, iterations: 32)
PASS : tst_QStack::qstack_pop()
RESULT : tst_QStack::qstack_pop():
8.2 msecs per iteration (total: 66, iterations: 8)
PASS : tst_QStack::qstack_pushpopone()
RESULT : tst_QStack::qstack_pushpopone():
80 msecs per iteration (total: 80, iterations: 1)
Change-Id: I3530888abbfcfcef39318d6be6d5b07306a4704e
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Wait for the subprocess to print "ready" before assuming that it is
ready to receive calls. waitForStarted() will return as soon as the
child is running, but it may not have registered on D-Bus yet.
This also solves the synchronization problem more elegantly than how
tst_qdbusmarshall.cpp was trying to do it.
Change-Id: I548dfba2677cc5a34ba50f4310c4d5baa98093b2
Reviewed-by: Simon Hausmann <simon.hausmann@digia.com>
The executables are not in the same dir as on Unix, so we need to use
QFINDTESTDATA to find them. The DESTDIR setting prevents qmake from
placing the executables in a "debug/" subdir.
Change-Id: I1d6d10e6f6f109f55fd9809dcf83da0386f38772
Reviewed-by: Simon Hausmann <simon.hausmann@digia.com>
No need to loop twice to add the "native" entries, since they are added
by the helper function anyway.
Change-Id: I9caabc6fc4973a90b483840815769b1351947a89
Reviewed-by: Friedemann Kleint <Friedemann.Kleint@theqtcompany.com>
The enum was made public in f84b00c6d2, but this
makes it follow the convention to camel case acronyms too before it's too late
to change it.
Change-Id: Ibb81e9221cb73fe0502d0a26f2d73512dd142f08
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
QMetaType::type(const char *) requires that the string argument is
0-terminated. This new overload makes it possible to query the type
of a string with an explicit length.
In particular, QByteArrays constructed by QByteArray::fromRawData(),
for example from a substring of a normalized method signature (the
"int" part of "mySlot(int"), can now be queried without making a copy
of the string.
Also, Qt5 meta-objects represent type names as QByteArray literals,
which can be fed directly to this new QMetaType::type() overload (no
need to call strlen).
Change-Id: I60d35aa6bdc0f77e0997f98b0e30e12fd3d5e100
Reviewed-by: Jędrzej Nowacki <jedrzej.nowacki@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
dbmsType was previously kept as a private variable in QSqlDriverPrivate,
however it's particularly useful for QODBC users.
[ChangeLog][QtSql][QSqlDriver] Add support for determining DBMS type from SQL driver.
Change-Id: If1c221520da9ac4ccef85a02db078679d76eac92
Reviewed-by: Mark Brand <mabrand@mabrand.nl>
Avoids allocating a QString for every char being written out.
The benchmark went from 5.5 ms per iteration to 0.8 ms,
and from 40 million instructions to 6 million.
Found using Milian Wolff's heaptrack tool.
Change-Id: I1784c47b944454bc947a607a22c39d249372ed55
Reviewed-by: Adam Majer <adamm@zombino.com>
Reviewed-by: Marc Mutz <marc.mutz@kdab.com>
Do a check first if we need to transform before doing the transform.
This means we won't detach when transforming data that is already
correct.
And instead of using QChar, use our own hand-rolled table. In a proper
LTO build, the QChar calls would be resolved to a lookup of the Unicode
data, but not many people do LTO builds, Therefore, this means a great
speed-up is achieved by simply avoiding the function call. The extra
gain in performance comes from the simpler translation table instead of
the more complex full-Unicode data.
Also as a consequence, this changes the handling of two characters in
Latin 1: 'ß' should be uppercased to "SS" but we won't do it, and 'ÿ'
can't be uppercased in Latin 1 ('Ÿ' is outside the range).
Benchmarking is included. Comparing the Qt 5.4 algorithm to the new code
is almost 20x faster. Other alternatives are included in the benchmark
and are all faster than the current code, though slower than the new
one. While all of them could compress the tables to be smaller or shared
between uppercasing and lowercasing, they would also expand to more code
(though probably less than the extra bytes required in the full
translation table). In the trade-off, I decided to go with simplicity
and most efficient code.
Change-Id: I002d98318d236de0d27ffbea39d662cbed359985
Reviewed-by: Marc Mutz <marc.mutz@kdab.com>
The comparison, Latin 1 and UTF-8 benchmarks contained in this file are
stale. The implementation changed in Qt 5.3 and this benchmark couldn't
be updated (test data too large for Qt).
Please contact Thiago Macieira to obtain the benchmarks and test data.
Change-Id: I48c19b1f1711eb73c953a30ed4da510e97a62472
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
clang warning: adding 'int' to a string does not append to the string
Change-Id: I6dc393269a52e9482fde106c17132336cf5ce226
Reviewed-by: Jędrzej Nowacki <jedrzej.nowacki@digia.com>
Cost of a type lookup for core built-in types is really small, just few
cpu instructions, but the benchmark was testing create() and destroy()
functions (in a different fashion) which by definition allocate and
de-allocate memory. These memory operations are significantly more
expensive which obfuscate the results.
Change-Id: I33c679f57e6c2b57e98328f076dfe249ab7bcde8
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
Reviewed-by: Stephen Kelly <steveire@gmail.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The "rbit" instruction requires ARMv6T2 or higher. This was found in the
CI when building the imx6 target:
Compiler: arm-poky-linux-gnueabi-g++
Flags: -mfloat-abi=hard -mfpu=neon
Errors from the assembler:
{standard input}:3078: Error: selected processor does not support ARM mode `rbit r3,r3'
{standard input}:7341: Error: selected processor does not support ARM mode `rbit ip,ip'
That compiler defaults to ARMv5T. That's obviously wrong for an i.MX 6,
which is a Cortex-A9 (ARMv7), but the correction applies for older
processors.
Change-Id: I56c276fa411977dd7cd867d62adf021e4909302c
Reviewed-by: Jędrzej Nowacki <jedrzej.nowacki@digia.com>
QRingBuffer is a fully inlined class used in many I/O classes.
So, it must be as fast and small as possible. To this end, a lot of
unnecessary special cases were replaced by generic structures.
Change-Id: Ic189ced3b200924da158ce511d69d324337d01b6
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
main.cpp(332) : warning C4307: '*' : integral constant overflow
tst_qpainter.cpp(1293) : warning C4305: '+=' : truncation from 'double' to 'float'
tst_qpainter.cpp(1474) : warning C4305: '+=' : truncation from 'double' to 'float'
tst_qtbench.cpp(155) : warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
main.cpp(68) : warning C4189: 'fontHeight' : local variable is initialized but not referenced
Change-Id: If6aadd50df7c5cf7d0f33791c9247730a47ddd27
Reviewed-by: Joerg Bornemann <joerg.bornemann@digia.com>
The check has to detect if boost header is present in the system we are
building for.
Change-Id: I700a11df208c8852ba094d8bff387ad21fa309b2
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
Some quick benchmarks against GNU coreutils 8.21 and OpenSSL 1.0.1e
(time in µs; time for coreutils and OpenSSL include the loading of the
executable):
Qt Coreutils OpenSSL
n SHA-1 SHA-224 SHA-512 SHA-1 SHA-224 SHA-512 SHA-1 SHA-224 SHA-512
0 0 0 0 717 716 700 2532 2553 2522
64k 120 484 381 927 1074 966 2618 2782 2694
Diff 120 484 381 210 358 266 86 229 172
The numbers for Qt are pretty stable and vary very little; the numbers
for the other two vary quite a bit, since they involve launching and
executing separate processes. We can take the lesson that we're in the
same ballpark for SHA-1 and we should investigate whether our SHA2
implementation is sufficiently optimized.
Change-Id: Ib081d002ed57c4f43741eca45ff5cd13b97b6276
Reviewed-by: Richard J. Moore <rich@kde.org>
Currently the only supported SPDY version is 3.0.
The feature needs to be enabled explicitly via
QNetworkRequest::SpdyAllowedAttribute. Whether SPDY actually was used
can be determined via QNetworkRequest::SpdyWasUsedAttribute from a
QNetworkReply once it has been started (i.e. after the encrypted()
signal has been received). Whether SPDY can be used will be
determined during the SSL handshake through the TLS NPN extension
(see separate commit).
The following things from SPDY have not been enabled currently:
* server push is not implemented, it has never been seen in the wild;
in that case we just reject a stream pushed by the server, which is
legit.
* settings are not persisted across SPDY sessions. In practice this
means that the server sends a small message upon session start
telling us e.g. the number of concurrent connections.
* SSL client certificates are not supported.
Task-number: QTBUG-18714
[ChangeLog][QtNetwork] Added support for the SPDY protocol (version
3.0).
Change-Id: I81bbe0495c24ed84e9cf8af3a9dbd63ca1e93d0d
Reviewed-by: Richard J. Moore <rich@kde.org>
When drawing to and from the less common formats most of the cpu time
is spend in conversion. The conversion method is rather slow due to
using variable shifts and masks that the compiler does not have a chance
to optimize.
This patch changes the conversion methods to being templates fed by
constexpr methods. This allows the compiler to fully optimize the methods
yielding 2x->5x speedups.
The reliance on constexpr however means the optimized methods are only
used under C++11.
Change-Id: I2ec77c4c1c03f12ee463a694a2b59db0f0b52db1
Reviewed-by: Gunnar Sletta <gunnar.sletta@jollamobile.com>
According to my profiling of Qt Creator, qHash and the SHA-1 calculation
are the hottest spots remaining in QtCore. The current qHash function is
not really vectorizable. We could come up with a different algorithm
that is more SIMD-friendly, but since we have the CRC32 instruction that
can read 32- and 64-bit entities, we're set.
This commit also updates the benchmark for QHash and benchmarks both
the hashing function itself and the QHash class. The updated
benchmarks for the CRC32 on my machine shows that the hashing function
is *always* improved, but the hashing isn't always. In particular, the
current algorithm is better for the "numbers" case, for which the data
sample differs in very few bits. The new code is 33% slower for that
particular case.
On average, the improvement (including the "numbers" case) is:
compared to qHash only QHash
Qt 5.0 function 2.54x 1.06x
Qt 4.x function 4.34x 1.34x
Java function 2.71x 1.11x
Test machine: Sandybridge Core i7-2620M @ 2.66 GHz with turbo disabled
for the benchmarks
Change-Id: Ia80b98c0e20d785816f7a7f6ddf40b4b302c7297
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
the diff -w for this commit is empty.
Started-by: Thiago Macieira <thiago.macieira@intel.com>
Change-Id: I77bb84e71c63ce75e0709e5b94bee18e3ce6ab9e
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
For the conflicts in msvc_nmake.cpp the ifdefs are extended since we
need to support windows phone in the target branch while it is not there
in the current stable branch (as of Qt 5.2).
Conflicts:
configure
qmake/generators/win32/msvc_nmake.cpp
src/3rdparty/angle/src/libEGL/Surface.cpp
src/angle/src/common/common.pri
src/corelib/global/qglobal.h
src/corelib/io/qstandardpaths.cpp
src/plugins/platforms/qnx/qqnxintegration.cpp
src/plugins/platforms/qnx/qqnxscreeneventhandler.h
src/plugins/platforms/xcb/qglxintegration.h
src/widgets/kernel/win.pri
tests/auto/corelib/thread/qreadwritelock/tst_qreadwritelock.cpp
tests/auto/corelib/tools/qdatetime/tst_qdatetime.cpp
tests/auto/gui/text/qtextdocument/tst_qtextdocument.cpp
tools/configure/configureapp.cpp
Change-Id: I00b579eefebaf61d26ab9b00046d2b5bd5958812
Tweak a handful of tests which didn't compile on this platform.
Change-Id: I208d9eb289dfb226746c6d0163c3ea752485033b
Reviewed-by: Friedemann Kleint <Friedemann.Kleint@digia.com>
Reviewed-by: Frederik Gladhorn <frederik.gladhorn@digia.com>