glibc/sysdeps
Noah Goldstein 1bd8b8d58f x86: Optimize memcmp-evex-movbe.S for frontend behavior and size
No bug.

The frontend optimizations are to:
1. Reorganize logically connected basic blocks so they are either in
   the same cache line or adjacent cache lines.
2. Avoid cases when basic blocks unnecessarily cross cache lines.
3. Try to 32-byte align any basic blocks possible without sacrificing
   code size. Smaller / less hot basic blocks are used for this.
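
As a rough illustration of point 2, the check below tests whether a basic block occupying `size` bytes from address `start` spans more than one cache line. This is only a sketch: the 64-byte line size and the helper name `crosses_cache_line` are assumptions made for the example and are not part of the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes (typical for current x86 parts). */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns 1 if the block [start, start + size)
   touches more than one cache line, 0 otherwise.  */
static int
crosses_cache_line (uintptr_t start, size_t size)
{
  if (size == 0)
    return 0;
  uintptr_t first_line = start / CACHE_LINE_SIZE;
  uintptr_t last_line = (start + size - 1) / CACHE_LINE_SIZE;
  return first_line != last_line;
}
```

For example, an 8-byte block starting at offset 60 crosses a line boundary, while the same block starting at offset 32 does not.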

Overall code size shrunk by 168 bytes. This should make up for any
extra costs due to aligning to 64 bytes.

In general, performance previously varied a great deal depending on
whether entry alignment % 64 was 0, 16, 32, or 48. These changes
essentially make it so that the current implementation is at least
equal to the best alignment of the original for any arguments.

The only additional optimization is in the page cross case. The branch
on the equal case was removed from the size in [4, 7] case. In
addition, the [4, 7] and [2, 3] cases were swapped, as [4, 7] is
likely the hotter argument size.
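
One way to picture a branch-free comparison for sizes in [4, 7] is the C sketch below. It is not the assembly from this commit: the helper names `load_be32` and `memcmp_4_7` are hypothetical, and the final `(x > y) - (x < y)` step is simply a common branchless idiom chosen for illustration. The idea is to load the first four and the last four bytes of each buffer as big-endian values (the two loads overlap when the size is below 8), combine them into one 64-bit integer per buffer, and compare the integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read 4 bytes as a big-endian 32-bit value.  */
static uint32_t
load_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Sketch of memcmp for 4 <= n <= 7 with no branch on equality.
   The result is negative, zero, or positive like memcmp.  */
static int
memcmp_4_7 (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1;
  const unsigned char *b = s2;
  /* High half: first 4 bytes.  Low half: last 4 bytes (may overlap).  */
  uint64_t x = ((uint64_t) load_be32 (a) << 32) | load_be32 (a + n - 4);
  uint64_t y = ((uint64_t) load_be32 (b) << 32) | load_be32 (b + n - 4);
  /* Both comparisons yield 0 or 1, so the equal case needs no branch.  */
  return (x > y) - (x < y);
}
```

Because the values are big-endian, the first differing byte decides the integer comparison, which matches the byte-wise ordering that memcmp requires.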

test-memcmp and test-wmemcmp are both passing.
2021-10-12 12:02:12 -05:00
..
aarch64 elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
alpha elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
arc elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
arm elf: Fix elf_get_dynamic_info definition 2021-10-12 13:25:43 -03:00
csky elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
generic Add run-time check for indirect external access 2021-10-07 10:26:48 -07:00
gnu Remove "Contributed by" lines 2021-09-03 22:06:44 +05:30
hppa elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
htl htl: Reimplement GSCOPE 2021-09-16 01:04:17 +02:00
hurd Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
i386 elf: Fix elf_get_dynamic_info definition 2021-10-12 13:25:43 -03:00
ia64 elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
ieee754 Fixed inaccuracy of j0f (BZ #28185) 2021-10-05 13:45:37 +02:00
m68k elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
mach Add fmaximum, fminimum functions 2021-09-28 23:31:35 +00:00
microblaze elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
mips elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
nios2 elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
nptl nptl: Use FUTEX_LOCK_PI2 when available 2021-10-01 08:09:13 -03:00
posix posix: Remove spawni.c 2021-09-27 12:44:25 -03:00
powerpc elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
pthread elf: Avoid deadlock between pthread_create and ctors [BZ #28357] 2021-10-04 15:07:05 +01:00
riscv elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
s390 elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
sh elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
sparc elf: Avoid nested functions in the loader [BZ #27220] 2021-10-07 11:55:02 -07:00
unix Fix nios2 localplt failure 2021-10-11 21:47:32 +00:00
wordsize-32 Disable symbol hack in libc_nonshared.a 2021-09-27 07:46:25 -07:00
wordsize-64 Remove "Contributed by" lines 2021-09-03 22:06:44 +05:30
x86 elf: Remove Intel MPX support (lazy PLT, ld.so profile, and LD_AUDIT) 2021-10-11 11:14:02 -07:00
x86_64 x86: Optimize memcmp-evex-movbe.S for frontend behavior and size 2021-10-12 12:02:12 -05:00