New log implementation

The new implementation is provided under !__OBSOLETE_MATH and uses
ISO C99 code.  With default settings the worst case error in nearest
rounding mode is 0.519 ULP with inlined fma and fma contraction.  It uses
a 2 KB lookup table; on aarch64 the .text+.rodata size of libm.a
increases by 1703 bytes.  The w_log.c wrapper is disabled since error
handling is inline in the new code.

New __HAVE_FAST_FMA and __HAVE_FAST_FMA_DEFAULT feature macros were
added to enable selecting between the code path that uses fma and the
one that does not.  Targets are supposed to set
__HAVE_FAST_FMA_DEFAULT if they have a single-instruction fma and the
compiler can actually inline it (gcc has the __FP_FAST_FMA macro, but
that does not guarantee inlining with -fno-builtin-fma).
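
A minimal sketch of how code might select between the two paths with
these macros.  The helper example_mul_add is hypothetical and not part
of this commit; the #ifndef fallbacks only mirror the default chain
described above so that the snippet also builds outside newlib.

```c
#include <math.h>

/* Assumption: fall back to "slow fma" when the target headers have not
   already defined the macros, as the defaults in this commit do.  */
#ifndef __HAVE_FAST_FMA_DEFAULT
# define __HAVE_FAST_FMA_DEFAULT 0
#endif
#ifndef __HAVE_FAST_FMA
# define __HAVE_FAST_FMA __HAVE_FAST_FMA_DEFAULT
#endif

/* Hypothetical helper: computes x*y + z.  */
static inline double
example_mul_add (double x, double y, double z)
{
#if __HAVE_FAST_FMA
  /* Fused path: one rounding, expected to inline to a single
     instruction on targets that set the macro.  */
  return fma (x, y, z);
#else
  /* Generic path: a separate multiply and add, avoiding a possibly
     slow library call to fma.  */
  return x * y + z;
#endif
}
```

A user could force the fused path at build time by passing
-D__HAVE_FAST_FMA=1, since the default only applies when the macro is
not already set.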

Improvements on Cortex-A72:
latency: 1.9x
throughput: 2.3x
Szabolcs Nagy
2018-06-26 15:27:50 +01:00
committed by Corinna Vinschen
parent fb929067db
commit e5791079c6
9 changed files with 744 additions and 4 deletions


@@ -65,6 +65,17 @@
double and single precision arithmetics has similar latency and it
has no legacy SVID matherr support, only POSIX errno and fenv
exception based error handling.
__HAVE_FAST_FMA_DEFAULT
Default value for __HAVE_FAST_FMA if that's not set by the user.
It should be set here based on predefined feature macros.
__HAVE_FAST_FMA
It should be set to 1 if the compiler can inline an fma call as a
single instruction. Some math code has a separate faster code
path assuming the target has single instruction fma.
*/
#if (defined(__arm__) || defined(__thumb__)) && !defined(__MAVERICK__)
@@ -80,6 +91,9 @@
# endif
# if __ARM_FP & 0x8
# define __OBSOLETE_MATH_DEFAULT 0
# if __ARM_FEATURE_FMA
# define __HAVE_FAST_FMA_DEFAULT 1
# endif
# endif
#else
# define __IEEE_BIG_ENDIAN
@@ -96,6 +110,7 @@
#define __IEEE_BIG_ENDIAN
#endif
#define __OBSOLETE_MATH_DEFAULT 0
#define __HAVE_FAST_FMA_DEFAULT 1
#endif
#ifdef __epiphany__
@@ -460,6 +475,14 @@
#define __OBSOLETE_MATH __OBSOLETE_MATH_DEFAULT
#endif
#ifndef __HAVE_FAST_FMA_DEFAULT
/* Assume slow fma by default. */
#define __HAVE_FAST_FMA_DEFAULT 0
#endif
#ifndef __HAVE_FAST_FMA
#define __HAVE_FAST_FMA __HAVE_FAST_FMA_DEFAULT
#endif
#ifndef __IEEE_BIG_ENDIAN
#ifndef __IEEE_LITTLE_ENDIAN
#error Endianess not declared!!