STYLE REQUIREMENTS
==================

1. Most code in this sub-directory is expected to be upstreamed into glibc, so
   the GNU Coding Standards and glibc-specific conventions should be followed
   to ease upstreaming.

2. ABI and symbols: the code should be written so that it is suitable for
   inclusion into a libc with minimal changes. For example, internal symbols
   should be hidden and in the implementation-reserved namespace according to
   ISO C and POSIX rules. If possible, the built shared libraries and static
   library archives should be usable to override libc symbols at link time (or
   at runtime via LD_PRELOAD). This requires the symbols to follow the glibc ABI
   (other than symbol versioning). This cannot be done reliably for static
   linking, so it is a best-effort requirement.

3. API: include headers should be suitable for benchmarking and testing code
   and should not conflict with libc headers.


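As an illustration of the symbol conventions in item 2 above, the following is
a minimal sketch, assuming a GCC-compatible compiler. The names HIDDEN,
__example_poly and example_exp_approx are hypothetical and are not taken from
this library.

```c
/* Hypothetical macro: hidden visibility keeps a symbol out of the dynamic
   symbol table of a shared library.  Assumes a GCC-compatible compiler.  */
#if defined(__GNUC__)
# define HIDDEN __attribute__ ((visibility ("hidden")))
#else
# define HIDDEN
#endif

/* Hypothetical internal helper.  The leading double underscore places the
   name in the namespace that ISO C reserves for the implementation, which is
   appropriate for code meant to live inside a libc.  */
HIDDEN double __example_poly (double x);

HIDDEN double
__example_poly (double x)
{
  /* 1 + x + x^2/2 in Horner form, purely for illustration.  */
  return 1.0 + x * (1.0 + x * 0.5);
}

/* Hypothetical public entry point: the only symbol meant to be exported.  */
double
example_exp_approx (double x)
{
  return __example_poly (x);
}
```

With this layout only the public entry point is visible to users of a shared
library build, so internal helpers cannot clash with application symbols.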
CONTRIBUTION GUIDELINES FOR math SUB-DIRECTORY
==============================================

1. Math functions have quality and performance requirements.

2. Quality:
   - Worst-case ULP error should be small over the entire input domain. For
     most common double precision scalar functions the target is < 0.66 ULP
     error, and < 1 ULP for single precision. Even a performance-optimized
     function variant should not have > 5 ULP error if the goal is to be a
     drop-in replacement for a standard math function. This should be tested
     statistically (or on all inputs if that is possible in a reasonable
     amount of time). The ulp tool is for this, and runulp.sh should be
     updated for new functions.

   - All standard rounding modes need to be supported, but in non-default
     rounding modes the quality requirement can be relaxed. (Non-nearest rounded
     computation can be slow and inaccurate but has to be correct for conformance
     reasons.)

   - Special cases and error handling need to follow ISO C Annex F requirements,
     POSIX requirements, IEEE 754-2008 requirements and glibc requirements:
     https://www.gnu.org/software/libc/manual/html_mono/libc.html#Errors-in-Math-Functions
     This should be tested by direct tests (the glibc test system may be used
     for this).

   - Error handling code should be decoupled from the approximation code as much
     as possible. (There are helper functions that take care of errno as well
     as exception raising.)

   - Vector math code does not need to work in non-nearest rounding modes, and
     error handling side effects (fenv exceptions and errno) need not happen,
     but the result should be correct (within the quality requirements, which
     are lower for vector code than for scalar code).

   - Error bounds of the approximation should be clearly documented.

   - The code should build and pass tests on arm, aarch64 and x86_64 GNU/Linux
     systems. (Routines and features can be disabled on specific targets, but
     the build must complete.) On aarch64, both little- and big-endian targets
     are supported, as well as valid combinations of architecture extensions.
     The configurations that should be tested depend on the contribution.

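The ULP error bounds in the quality list above can be made concrete with a
small measurement helper. This is a minimal sketch and not the implementation
of the ulp tool: the name ulp_error_f is hypothetical, the reference value is
assumed to come from a higher precision computation, and inf and nan inputs
are not handled.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: error of the single precision result GOT against a
   higher precision reference WANT, measured in units in the last place (ULP)
   of the float binade that contains WANT.  Assumes WANT is finite.  */
static double
ulp_error_f (float got, double want)
{
  int e;
  /* frexp gives |want| = m * 2^e with 0.5 <= m < 1 for nonzero want.  */
  frexp (want, &e);
  /* For zero and for the subnormal range the spacing of floats is fixed,
     so clamp the exponent to that of the smallest normal float.  */
  if (want == 0.0 || e < FLT_MIN_EXP)
    e = FLT_MIN_EXP;
  /* Spacing of floats in [2^(e-1), 2^e) is 2^(e - FLT_MANT_DIG).  */
  double ulp = ldexp (1.0, e - FLT_MANT_DIG);
  return fabs ((double) got - want) / ulp;
}
```

A test would evaluate this over many inputs (or all inputs where feasible) and
compare the maximum against the documented bound for the function.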
3. Performance:
   - Common math code should be benchmarked on modern aarch64 microarchitectures
     over typical inputs.

   - Performance improvements should be documented (relative numbers can be
     published; it is enough to use the mathbench microbenchmark tool, which
     should be updated for new functions).

   - Attention should be paid to the compilation flags: for aarch64, fma
     contraction should be on and math errno turned off so that some builtins
     can be inlined.

   - The code should be reasonably performant on x86_64 too. For example, some
     rounding instructions and fma may not be available on x86_64, so such
     builtins turn into libc calls with slow code. Such a slowdown is not
     acceptable; a faster fallback should be present, because glibc and bionic
     use the same code on all targets. (This does not apply to vector math
     code.)
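
One possible shape for such a fallback is sketched below for rounding to the
nearest integer. This is a sketch under stated assumptions, not necessarily
how this library does it: the macro EXAMPLE_HAVE_FAST_ROUND and the function
example_round_nearest are hypothetical, and the fallback path is only valid in
the default round-to-nearest mode for |x| < 2^51.

```c
#include <math.h>

/* Hypothetical configuration macro: set to 1 on targets where rint is
   inlined as a single instruction, 0 where it would become a libc call.  */
#ifndef EXAMPLE_HAVE_FAST_ROUND
# define EXAMPLE_HAVE_FAST_ROUND 0
#endif

/* Hypothetical helper: round X to the nearest integer, ties to even.  */
static inline double
example_round_nearest (double x)
{
#if EXAMPLE_HAVE_FAST_ROUND
  return rint (x);
#else
  /* Adding 1.5 * 2^52 moves the value into a range where the spacing of
     doubles is 1, so the addition itself rounds to an integer.  Subtracting
     the constant recovers the rounded value.  The volatile is a guard so
     that the compiler cannot simplify the expression back to x.  */
  const double shift = 0x1.8p52;
  volatile double y = x + shift;
  return y - shift;
#endif
}
```

The fallback costs two additions and no function call, which is the kind of
trade-off meant by a faster fallback on targets without the instruction.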