make __packed memory access default for gcc

It's always as good or better then memcpy()
but depends on gcc-specific extension.

solves https://github.com/facebook/zstd/issues/620
This commit is contained in:
Yann Collet 2017-03-30 12:22:17 -07:00
parent 3d4ee35da4
commit e2827775ee

View File

@ -63,16 +63,15 @@
* Method 1 : `__packed` statement. It depends on compiler extension (ie, not portable).
* This method is safe if your compiler supports it, and *generally* as fast or faster than `memcpy`.
* Method 2 : direct access. This method is portable but violate C standard.
* It can generate buggy code on targets which generate assembly depending on alignment.
* It can generate buggy code on targets which assembly generation depends on alignment.
* But in some circumstances, it's the only known way to get the most performance (ie GCC + ARMv6)
* See https://fastcompression.blogspot.fr/2015/08/accessing-unaligned-memory.html for details.
* Prefer these methods in priority order (0 > 1 > 2)
*/
#ifndef LZ4_FORCE_MEMORY_ACCESS /* can be defined externally, on command line for example */
#ifndef LZ4_FORCE_MEMORY_ACCESS /* can be defined externally */
# if defined(__GNUC__) && ( defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) )
# define LZ4_FORCE_MEMORY_ACCESS 2
# elif defined(__INTEL_COMPILER) || \
(defined(__GNUC__) && ( defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__) ))
# elif defined(__INTEL_COMPILER) || defined(__GNUC__)
# define LZ4_FORCE_MEMORY_ACCESS 1
# endif
#endif