skia2/gyp/opts.gyp
digit@google.com eec9dbcace arm: First step towards dynamic NEON support.
This patch adds minimal support for dynamic ARM NEON support,
i.e. the ability to probe the CPU at runtime for NEON and
provide alternate code paths when it is available.

- Add include/core/SkUtilsArm.h, which declares a few helper
  macros (e.g. SK_NEON_ARM_IS_DYNAMIC), plus the handy
  function 'sk_cpu_arm_has_neon()' which returns true if
  the target CPU supports the ARM NEON instruction set.

  Note that the header is in include/core/ because it will
  have to be included from NEON-specific code under src/code/

  It would probably be more logical to put it under include/opts/
  instead, but this would require moving all the NEON-specific
  stuff under src/code/ into src/opts/, which is not trivial
  due to the way the code is currently architected.

- Add src/core/SkUtilsArm.cpp which implements
  'sk_cpu_arm_has_neon' for ARM-based Linux systems, only
  when SK_NEON_ARM_IS_DYNAMIC is true.

  (For other cases, 'sk_cpu_arm_has_neon' is an inline function
   that returns a constant 'true' or 'false' value).

  There is no user-level accessible CPUID instruction on ARM,
  so do all CPU feature probing by parsing /proc/cpuinfo.
  This is Linux-specific.

  For Debug build types, the CPU probing result is printed
  to the Android log (or Linux command-line) for easier
  debugging.

- Create a new 'opts_neon' target (static library) which shall
  contain all the NEON-specific code paths for the library.

  This is necessary because -mfpu=neon impacts also non-scalar
  code. Just like with -mssse3 on x86, we can't build the rest
  of the library with this flag.

  Note that for now, we only include memset16_neon and
  memset32_neon in this library.

- Modify opts_check_arm.cpp to implement SK_ARM_NEON_IS_DYNAMIC
  properly.

Compared to a 'xoom' build, the only difference is the use of
NEON-optimized memset16/32 functions. Later patches will move
more NEON-specific code paths to 'opts_neon'.
Review URL: https://codereview.appspot.com/6247058

git-svn-id: http://skia.googlecode.com/svn/trunk@4069 2bbb7eff-a529-9590-31e7-b0007b416f81
2012-05-30 13:54:41 +00:00

153 lines
5.0 KiB
Python

{
'targets': [
# Due to an unfortunate intersection of lameness between gcc and gyp,
# we have to build the *_SSE2.cpp files in a separate target. The
# gcc lameness is that, in order to compile SSE2 intrinsics code, it
# must be passed the -msse2 flag. However, with this flag, it may
# emit SSE2 instructions even for scalar code, such as the CPUID
# test used to test for the presence of SSE2. So that, and all other
# code must be compiled *without* -msse2. The gyp lameness is that it
# does not allow file-specific CFLAGS, so we must create this extra
# target for those files to be compiled with -msse2.
#
# This is actually only a problem on 32-bit Linux (all Intel Macs have
# SSE2, Linux x86_64 has SSE2 by definition, and MSC will happily emit
# SSE2 from instrinsics, while generating plain ol' 386 for everything
# else). However, to keep the .gyp file simple and avoid platform-specific
# build breakage, we do this on all platforms.
# For about the same reason, we need to compile the ARM opts files
# separately as well.
{
'target_name': 'opts',
'type': 'static_library',
'include_dirs': [
'../include/config',
'../include/core',
'../src/core',
'../src/opts',
],
'conditions': [
[ 'skia_target_arch == "x86"', {
'conditions': [
[ 'skia_os in ["linux", "freebsd", "openbsd", "solaris"]', {
'cflags': [
'-msse2',
],
}],
],
'sources': [
'../src/opts/opts_check_SSE2.cpp',
'../src/opts/SkBitmapProcState_opts_SSE2.cpp',
'../src/opts/SkBlitRow_opts_SSE2.cpp',
'../src/opts/SkBlitRect_opts_SSE2.cpp',
'../src/opts/SkUtils_opts_SSE2.cpp',
],
'dependencies': [
'opts_ssse3',
],
}],
[ 'skia_target_arch == "arm" and armv7 == 1', {
# The assembly uses the frame pointer register (r7 in Thumb/r11 in
# ARM), the compiler doesn't like that.
'cflags!': [
'-fno-omit-frame-pointer',
],
'cflags': [
'-fomit-frame-pointer',
],
'variables': {
'arm_neon_optional%': '<(arm_neon_optional>',
},
'sources': [
'../src/opts/opts_check_arm.cpp',
'../src/opts/memset.arm.S',
'../src/opts/SkBitmapProcState_opts_arm.cpp',
'../src/opts/SkBlitRow_opts_arm.cpp',
],
'conditions': [
[ 'arm_neon == 1 or arm_neon_optional == 1', {
'dependencies': [
'opts_neon',
]
}]
],
}],
[ 'skia_target_arch == "arm" and armv7 != 1', {
'sources': [
'../src/opts/SkBitmapProcState_opts_none.cpp',
'../src/opts/SkBlitRow_opts_none.cpp',
'../src/opts/SkUtils_opts_none.cpp',
],
}],
],
},
# For the same lame reasons as what is done for skia_opts, we have to
# create another target specifically for SSSE3 code as we would not want
# to compile the SSE2 code with -mssse3 which would potentially allow
# gcc to generate SSSE3 code.
{
'target_name': 'opts_ssse3',
'type': 'static_library',
'include_dirs': [
'../include/config',
'../include/core',
'../src/core',
],
'conditions': [
[ 'skia_os in ["linux", "freebsd", "openbsd", "solaris"]', {
'cflags': [
'-mssse3',
],
}],
# TODO(epoger): the following will enable SSSE3 on Macs, but it will
# break once we set OTHER_CFLAGS anywhere else (the first setting will
# be replaced, not added to)
[ 'skia_os in ["mac"]', {
'xcode_settings': {
'OTHER_CFLAGS': ['-mssse3',],
},
}],
[ 'skia_target_arch == "x86"', {
'sources': [
'../src/opts/SkBitmapProcState_opts_SSSE3.cpp',
],
}],
],
},
# NEON code must be compiled with -mfpu=neon which also affects scalar
# code. To support dynamic NEON code paths, we need to build all
# NEON-specific sources in a separate static library. The situation
# is very similar to the SSSE3 one.
{
'target_name': 'opts_neon',
'type': 'static_library',
'include_dirs': [
'../include/config',
'../include/core',
'../src/core',
],
'cflags!': [
'-fno-omit-frame-pointer',
'-mfpu=vfp', # remove them all, just in case.
'-mfpu=vfpv3',
'-mfpu=vfpv3-d16',
],
'cflags': [
'-fomit-frame-pointer',
'-mfpu=neon',
],
'sources': [
'../src/opts/memset16_neon.S',
'../src/opts/memset32_neon.S',
],
},
],
}
# Local Variables:
# tab-width:2
# indent-tabs-mode:nil
# End:
# vim: set expandtab tabstop=2 shiftwidth=2: