As the comment says, Haswell is a nice divider and is a good
optimization target.
I'm using -march=core-avx2 instead of -march=haswell because the latter
form was only added to GCC 4.9 but we still support 4.7 and that has
support for AVX2.
This commit changes the AVX2-optimized code in QtGui to Haswell-
optimized instead. That means, for example, that qdrawhelper_avx2.cpp
can now use the FMA instructions.
Change-Id: If025d476890745368955fffd153129c1716ba006
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
"AVX512MIC" (Many Integrated Cores) is the set of AVX-512 features found
on the Intel Xeon Phi coprocessors (codename "Knights Landing"), which
is an unlikely architecture for Qt to run on.
The two profiles with VL came from study of early GCC code and are no
longer applicable. GCC source code now shows both VBMI and IFMA as part
of the -march=cannonlake feature set.
Change-Id: Iff4151c519c144d580c4fffd153a0f268919fe2c
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Right now we are using the C++ compiler here, which relies on
the compilers automatically switching mode. This behavior is also
deprecated in newer clang versions and block clang developer
builds.
Task-number: QTBUG-64822
Done-with: Thiago Macieira <thiago.macieira@intel.com>
Change-Id: I4d3c00ac528a45934c85777f42d243d0fe367c92
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@qt.io>
The instruction is "RDRAND", but the feature name, according to GCC, is
RDRND, so I had to change some macros in qsimd_p.h.
Change-Id: Icd0e0d4b27cb4e5eb892fffd14b5166779137e63
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@qt.io>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
The AES instructions were first introduced with the Westmere shrink
(22nm) of the Nehalem architecture. The SHA instructions are still
pending on Intel architecture, but is available on AMD family 17h (gcc
argument -march=znver1).
Both features operate on SSE registers, so that's why the MSVC command-
line argument is the SSE2 one and the configure-time tests depend on
features.sse2.
The qmake feature names end in "ni" because "aes" and "sha" are too
simple and could clash with other uses. The QT_COMPILER_SUPPORTS_ macro
doesn't have the "NI" suffix because it has to match the GCC/Clang
predefined macro.
Change-Id: I445bb15619f6401494e8fffd149dbd1f862ff51c
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
Use F16C or ARM FP16 if available at compile time.
Configure check added because older clang compilers have F16C defines
and flags but not all the intrinsics.
Change-Id: I71f358b8fd003e70ab8fcf35097414591e485112
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Due to common subexpression elimination instruction set extensions may
leak from the objects where they were enabled when doing link-time
optimizations.
To avoid that this patch disables LTCG/LTO on files built with extra
instruction set extensions.
Change-Id: Ie34ad900be7fb04a0dc4d3562187ee170c183333
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@theqtcompany.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This was a long-time coming.
One innovation from this commit is that it will add the source to
SOURCES if the compiler is already generating code for that specific
target. That is currently always the case for Neon, and the MIPS DSPs
since that is the only condition in which configure will enable those
targets. And because of qt_module.prf, it's also always the case for
SSE2 (but not for SSE3 or higher).
So simplify the .pri files by removing always-true conditions.
Change-Id: Ib24af74717b652c9a6be246e3c17a839470f37da
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
We don't actually detect whether the compiler can create Neon code or
provides Neon intrinsics. Most of them do, so that test would be mostly
moot. We removed the detection previously because we couldn't
automatically enable Neon due to leakage of instructions outside the
areas protected at runtime.
Instead, we rely on the mkspec properly passing the necessary flags that
enable Neon support.
This commit does not change that. All it does is verify whether the arch
detection found "neon" as part of the target CPU features. In other
words, it moves the test that was in simd.prf to configure.
It does fix the Neon detection in configure.exe, which was always
failing for trying to run a test that didn't exist
(config.tests/unix/neon).
Change-Id: Id561dfb2db7d3dca7b8c29afef63181693bdc0aa
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
The integrated assembler of clang does not understand some/all of the
ARM macro assembler syntax used in pixman-arm-neon-asm.S. By default,
this integrated assembler is used when using the "clang" command as a
driver. This patch turns off the integrated assembler of clang for that
file.
Change-Id: Ic06801266b5a4b097ca835d815bcc5d5fc672946
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Unlike MSVC, ICC is capable of selecting each of the processor feature
levels, so let's define the right macros.
Version 9.1 is really old and not supported, so we don't need to keep
the old workaround.
The compiler has been complaining that option -GX is deprecated and will
be removed, so update it to use the same as MSVC does.
Change-Id: I4158fcf2331c1d27462bb1cb19725c7136efab4a
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Matches the compiler capabilities better and will catch all GCC-like
compilers (including Clang, LLVM and Intel CC on Unix).
Task-number: QTBUG-38544
Change-Id: I102966d307a4e167b6dcf3da08359e656f3af45e
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
If one doesn't set CONFIG+=neon in it's mkspec but the compiler enables
it, Qt fails during linking of libQtGui,
because simd.prf isn't executed as needed (doesn't run into neon{..}).
Although a mkspec which enabled neon (-mfpu=neon) without CONFIG+=neon
could be considered broken,
it's still simpler to just 'fix' Qt to not fail unexpected.
Follow-up to e5066a3a2e
Task-number: QTBUG-37264
Change-Id: I3aa0afbe430547971e76c2c988697c133d69796b
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
ICC 8 and 9 are positively ancient. I doubt anyone is using them for
Qt, let alone Qt 5. ICC 11 through 13 haven't supported OS X.
ICC now masquerades as Clang, so we need to let qmake and
qcompilerdetection.h know about it.
Change-Id: If0d2bd8b6a4a45250c15c9472c062effc76f17de
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
http://msdn.microsoft.com/en-us/library/b0084kay(v=vs.120).aspx says:
__AVX__ Defined when /arch:AVX is specified.
Now we know what flag it is, we don't need to use our _M_AVX flag
anymore. We're also now assuming that Microsoft will follow the same
pattern for AVX2 (i.e., __AVX2__), so this commit also removes the
check for _M_AVX2.
The other defines that were defined alongside AVX2 are removed because
they have no use currently in Qt.
Change-Id: I64a026b2206dbd0d2dffa7c803bee969c9b94a94
Reviewed-by: Friedemann Kleint <Friedemann.Kleint@digia.com>
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Changed MIPS DSP portion of the mkspecs/features/simd.prf file in order
to fix the corrupted build system for MIPS platforms.
List of the additionally optimized functions
from file src/gui/painting/qdrawhelper.cpp:
- qt_blend_rgb16_on_rgb16
- qt_fetchUntransformed_888
- qt_fetchUntransformed_444
- qt_fetchUntransformed_argb8565
from file src/gui/image/qimage.cpp:
- convert_ARGB_to_ARGB_PM_inplace
from file src/corelib/qstring.cpp:
- ucstrncmp
- toLatin1_helper
- fromLatin1_helper
Change-Id: I5c47a69784917eee29a8dbd2718828a390b27c93
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Various global changes, primarily preprocessor flow, to support the
WinRT platform.
Change-Id: I3fa9cf91d5fb24019362e88fcf205e31b4f810b5
Reviewed-by: Andrew Knight <andrew.knight@digia.com>
This reverts commit 1ef74a763a, as
assembler files in SOURCES break compiling with -pch, as we don't create
a respective PCH.
instead, compile assembler code with QMAKE_CC, not QMAKE_CXX.
the reason why this change is needed in the first place is not clear to
me, but i guess that CXX defines some c++-related macros when
preprocessing the file, which breaks further down the line. this is
counter-intuitive, as the g++ frontend should treat the same sources the
same way as the gcc frontend (differences should be limited the the ld
invocation).
Task-number: QTBUG-29765
Change-Id: Ic0116b3a5fa621f12ac41cadf3062ff00b538e85
Reviewed-by: Rafael Roquetto <rafael.roquetto@kdab.com>
Reviewed-by: Joerg Bornemann <joerg.bornemann@digia.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
For now there's only one architecture per host/target, but
this will change once we start detecting the CPU features
for both device and simulator on iOS.
For convenience we set QT_CPU_FEATURES to the resolved
value of the current architecture, so that simd.prf still
can use QT_CPU_FEATURES directly.
Change-Id: I28e8b339a5c30a630e276165254dba09a3da6940
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@digia.com>
This guarantees that qmake gives them the proper flags (non C++) while
building these asm files.
Change-Id: I41150f543b8fac81bcd0da963b4d0e0a19b9db2f
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>
So you can get AVX/NEON etc source compiled by assigning to the
corresponding variable (e.g. AVX_SOURCES).
This was previously used in just the gui module, but other
external modules might like it too.
Change-Id: I51aa64760c469c7dc4c71e6f089c2ddef4f509c5
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Oswald Buddenhagen <oswald.buddenhagen@nokia.com>