diff --git a/libc/docs/headers/math/index.rst b/libc/docs/headers/math/index.rst
index a707b37894afc..7ea8401d932eb 100644
--- a/libc/docs/headers/math/index.rst
+++ b/libc/docs/headers/math/index.rst
@@ -376,6 +376,173 @@ Legends:
   TODO(lntue): Add a new page to discuss about the algorithms used in the
   implementations and include the link here.
 
+GPU Conformance
+===============
+
+* Conformance tests are located at: `offload/unittests/Conformance `_.
+
+* The math functions for GPUs are compiled with the following optimization options: ``LIBC_MATH_SKIP_ACCURATE_PASS``, ``LIBC_MATH_INTERMEDIATE_COMP_IN_FLOAT``, ``LIBC_MATH_SMALL_TABLES``, ``LIBC_MATH_NO_ERRNO``, and ``LIBC_MATH_NO_EXCEPT``.
+
+* The conformance test results for higher math functions on GPUs are reported in the table below. The results show the maximum observed ULP distance when comparing a given GPU implementation against the corresponding correctly rounded implementation from LLVM libc, which is computed on the host CPU and serves as the reference. For comparison purposes, results for CUDA Math and HIP Math against the same reference are also included.
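Reviewer note: the "maximum observed ULP distance" reported here can be illustrated with a short sketch. The snippet below uses one common definition of ULP distance for single-precision values (the difference between sign-corrected bit patterns). That definition is an assumption made for illustration only; the conformance tests may define the metric differently. The helper names `float32_to_ordered_int`, `ulp_distance_f32`, and `max_ulp_distance` are hypothetical and do not come from the test suite.

```python
import struct


def float32_to_ordered_int(x: float) -> int:
    """Map a float32 value to an integer that grows monotonically with x.

    Assumption: x is finite and representable in float32; NaN and
    infinities are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if bits & 0x80000000:
        # Negative values: drop the sign bit and negate, so that -0.0
        # maps to 0 and larger magnitudes map to smaller integers.
        return -(bits & 0x7FFFFFFF)
    return bits


def ulp_distance_f32(result: float, reference: float) -> int:
    """Number of float32 steps between result and reference."""
    return abs(float32_to_ordered_int(result) - float32_to_ordered_int(reference))


def max_ulp_distance(results, references) -> int:
    """Largest ULP distance over paired results and reference values."""
    return max(ulp_distance_f32(r, ref) for r, ref in zip(results, references))
```

Under this definition, a function passes when `max_ulp_distance` over all tested inputs does not exceed the ULP tolerance listed for it.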
+
++------------------------+-------------+---------------+-----------------------------------------------------------------------------------+
+| Function               | Test Method | ULP Tolerance | Max ULP Distance                                                                  |
+|                        |             |               +--------------------+--------------------+--------------------+--------------------+
+|                        |             |               | LLVM libc          | LLVM libc          | CUDA Math          | HIP Math           |
+|                        |             |               | (AMDGPU)           | (CUDA)             | (CUDA)             | (AMDGPU)           |
++========================+=============+===============+====================+====================+====================+====================+
+| acos                   | Randomized  | 4             | 6 (FAILED)         | 6 (FAILED)         | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acosf                  | Exhaustive  | 4             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acosf16                | Exhaustive  | 2             | 1                  | 1                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acoshf                 | Exhaustive  | 4             | 1                  | 1                  | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acoshf16               | Exhaustive  | 2             | 0                  | 0                  |                    | 0                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acospif16              | Exhaustive  | 2             | 0                  | 0                  |                    |                    |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asin                   | Randomized  | 4             | 6 (FAILED)         | 6 (FAILED)         | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinf                  | Exhaustive  | 4             | 1                  | 1                  | 1                  | 3                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinf16                | Exhaustive  | 2             | 0                  | 0                  |                    | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinhf                 | Exhaustive  | 4             | 1                  | 1                  | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinhf16               | Exhaustive  | 2             | 1                  | 1                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanf                  | Exhaustive  | 5             | 0                  | 0                  | 1                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanf16                | Exhaustive  | 2             | 1                  | 1                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atan2f                 | Randomized  | 6             | 1                  | 1                  | 2                  | 3                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanhf                 | Exhaustive  | 5             | 0                  | 0                  | 3                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanhf16               | Exhaustive  | 2             | 0                  | 0                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cbrt                   | Randomized  | 2             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cbrtf                  | Exhaustive  | 2             | 0                  | 0                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| cos | Randomized | 4 | 1 | 1 | 2 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| cosf | Exhaustive | 4 | 1 | 1 | 2 | 2 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| cosf16 | Exhaustive | 2 | 1 | 1 | | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| coshf | Exhaustive | 4 | 0 | 0 | 2 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| coshf16 | Exhaustive | 2 | 1 | 0 | | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| cospif | Exhaustive | 4 | 0 | 0 | 1 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| cospif16 | Exhaustive | 2 | 0 | 0 | | | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| erff | Exhaustive | 16 | 0 | 0 | 1 | 2 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| exp | Randomized | 3 | 1 | 1 | 1 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| expf | Exhaustive | 3 | 0 | 0 | 2 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| 
expf16 | Exhaustive | 2 | 1 | 1 | | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| exp10 | Randomized | 3 | 1 | 1 | 1 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| exp10f | Exhaustive | 3 | 0 | 0 | 2 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| exp10f16 | Exhaustive | 2 | 1 | 1 | | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| exp2 | Randomized | 3 | 1 | 1 | 1 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| exp2f | Exhaustive | 3 | 1 | 1 | 2 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| exp2f16 | Exhaustive | 2 | 1 | 1 | | 0 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| expm1 | Randomized | 3 | 0 | 0 | 1 | 2 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| expm1f | Exhaustive | 3 | 1 | 1 | 1 | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| expm1f16 | Exhaustive | 2 | 1 | 1 | | 1 | ++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+ +| hypot | Randomized | 4 | 0 | 0 | 2 | 1 | 
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| hypotf                 | Randomized  | 4             | 0                  | 0                  | 1                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| hypotf16               | Exhaustive  | 2             | 0                  | 0                  |                    |                    |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log                    | Randomized  | 3             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| logf                   | Exhaustive  | 3             | 1                  | 1                  | 1                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| logf16                 | Exhaustive  | 2             | 1                  | 1                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log10                  | Randomized  | 3             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log10f                 | Exhaustive  | 3             | 1                  | 1                  | 2                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log10f16               | Exhaustive  | 2             | 1                  | 1                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log1p                  | Randomized  | 2             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log1pf                 | Exhaustive  | 2             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log2                   | Randomized  | 3             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log2f                  | Exhaustive  | 3             | 0                  | 0                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log2f16                | Exhaustive  | 2             | 1                  | 1                  |                    | 0                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| powf (integer exp.)    | Randomized  | 16            | 0                  | 0                  | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| powf (real exp.)       | Randomized  | 16            | 0                  | 0                  | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sin                    | Randomized  | 4             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinf                   | Exhaustive  | 4             | 1                  | 1                  | 1                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinf16                 | Exhaustive  | 2             | 1                  | 1                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincos (cos part)      | Randomized  | 4             | 1                  | 1                  | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincos (sin part)      | Randomized  | 4             | 1                  | 1                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincosf (cos part)     | Exhaustive  | 4             | 1                  | 1                  | 2                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincosf (sin part)     | Exhaustive  | 4             | 1                  | 1                  | 1                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinhf                  | Exhaustive  | 4             | 1                  | 1                  | 3                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinhf16                | Exhaustive  | 2             | 1                  | 1                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinpif                 | Exhaustive  | 4             | 0                  | 0                  | 1                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinpif16               | Exhaustive  | 2             | 0                  | 0                  |                    |                    |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tan                    | Randomized  | 5             | 2                  | 2                  | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanf                   | Exhaustive  | 5             | 0                  | 0                  | 3                  | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanf16                 | Exhaustive  | 2             | 1                  | 1                  |                    | 2                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanhf                  | Exhaustive  | 5             | 0                  | 0                  | 2                  | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanhf16                | Exhaustive  | 2             | 0                  | 0                  |                    | 1                  |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanpif                 | Exhaustive  | 6             | 0                  | 0                  |                    |                    |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanpif16               | Exhaustive  | 2             | 1                  | 1                  |                    |                    |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+
+Notes:
+
+* Exhaustive tests check every representable point in the input space. This method is used for half-precision functions and single-precision univariate functions.
+* Randomized tests check a large, deterministic subset of the input space, typically using 2\ :sup:`32` samples. This method is used for functions with larger input spaces, such as single-precision bivariate and double-precision functions.
+* ULP tolerances are based on The Khronos Group, `The OpenCL C Specification v3.0.19 `_, Sec. 7.4, Khronos Registry [July 10, 2025].
+* The AMD GPU used for testing is AMD Radeon RX 6950 XT.
+* The NVIDIA GPU used for testing is NVIDIA RTX 4000 SFF Ada Generation.
 
 Performance
 ===========