mirror of https://github.com/mstorsjo/fdk-aac.git synced 2025-06-05 22:39:13 +02:00

Add aarch64 assembly optimization (ARMv8a 64 bits)

The fixmuldiv functions don't need inline assembly to be fast
on this architecture; the compilers (both clang and GCC) figure
out the optimal instructions on their own (a two-instruction
sequence). When the compiler emits the instructions itself
instead of being handed inline assembly, it can interleave them
with other instructions, which improves scheduling and makes the
code even faster than the inline assembly version.
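
As a rough illustration of the point above, a plain C fixed-point
multiply of this kind might look like the sketch below. The names
fix_mul_div2 and fix_mul are hypothetical and the code is not copied
from this commit; it only assumes that the operation is a 32x32
multiply that keeps the high 32 bits of the 64-bit product. On
AArch64, compilers typically lower the first function to a widening
multiply followed by a shift (for example smull and asr), though the
exact output depends on the compiler and flags.

```c
#include <stdint.h>

/* Hypothetical plain-C helper (illustrative name, not from the commit).
 * Computes (a * b) >> 32, i.e. the high word of the 64-bit product.
 * Right-shifting a negative value is implementation-defined in C;
 * GCC and clang both perform an arithmetic shift here. */
static inline int32_t fix_mul_div2(int32_t a, int32_t b) {
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Hypothetical Q31 multiply built on top of the helper above:
 * doubles the high word to approximate (a * b) >> 31.
 * The shift is done on an unsigned value to avoid signed overflow. */
static inline int32_t fix_mul(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)fix_mul_div2(a, b) << 1);
}
```

For example, fix_mul_div2(0x40000000, 0x40000000) yields 0x10000000.
Because these are ordinary inline functions, the compiler is free to
schedule the multiply and the shift around neighbouring code, which
is the effect the commit message describes.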

Overall, this gives a speedup of about 50%.
Lexyan
2016-09-03 15:38:08 +02:00
committed by Martin Storsjo
parent a0bd8aa3b6
commit 1d686c3a23
5 changed files with 249 additions and 0 deletions


@@ -198,6 +198,14 @@ amm-info@iis.fraunhofer.de
#undef POW2COEFF_16BIT
#undef LDCOEFF_16BIT
#elif defined(__aarch64__) || defined(__AARCH64EL__)
#define ARCH_PREFER_MULT_32x32
#define ARCH_PREFER_MULT_32x16
#define SINETABLE_16BIT
#define POW2COEFF_16BIT
#define LDCOEFF_16BIT
#define WINDOWTABLE_16BIT
#elif defined(__x86__) /* cppp replaced: elif */
#define ARCH_PREFER_MULT_32x16
#define SINETABLE_16BIT