Summary:
These wrapper headers were originally intended to declare the functions
that are available on the GPU, as provided by the LLVM libc implementation.
The original plan was that LLVM libc would report which functions were
supported and then the offload interface would mark those as supported.
The problem is that these wrapper headers are very difficult to make
work given the libc extensions that every implementation adds, so they
were extremely fragile.
OpenMP already declares all functions used inside of a target region as
implicitly host / device, while these headers weren't even used for CUDA
/ HIP yet anyway. The only things we need to define right now are the
stdio FILE types. If we want to make this work for CUDA we'd need to
define these manually, but we're a ways off and that's way easier
because they do proper overloading.
Closes #161461
- This is my first time contributing to libc's POSIX support, so for
reference I used the `clock_gettime` implementation for Linux. For
convenience, here is the description of the `clock_settime` function
[behavior](https://www.man7.org/linux/man-pages/man3/clock_settime.3.html)
This patch adds the implementation for `inet_aton` function. Since this
function is not explicitly included in POSIX, I have marked it with
`llvm_libc_ext`. It is widely available and commonly used, and can also
be used to implement `inet_addr`, which is included in POSIX.
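To illustrate the last point, `inet_addr` can be layered on top of `inet_aton` roughly as follows. This is a minimal sketch written against the system's `<arpa/inet.h>`, not the code from this patch; the name `my_inet_addr` is hypothetical.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>

// Hypothetical helper: inet_addr expressed in terms of inet_aton.
// inet_aton returns nonzero on success and 0 for a malformed address.
in_addr_t my_inet_addr(const char *cp) {
  assert(cp != nullptr);
  in_addr addr;
  if (inet_aton(cp, &addr) == 0)
    return INADDR_NONE; // the error value inet_addr reports
  return addr.s_addr;   // already in network byte order
}
```

One known limitation of this layering is that `255.255.255.255` is a valid address whose value equals `INADDR_NONE`, which is why callers often prefer `inet_aton` directly.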
The VSCode instructions were stale after the transition to the runtimes
directory. This updates them with all the options given on the Full Host
Build page.
Tested:
Built libc target.
Closes #159614
**Changes:**
- Initial implementation of rsqrt for single precision float
**Some small unrelated style changes to this PR (that I missed in my
rsqrtf16 PR):**
- Added extra - to the top comments to make it look nicer in
libc/shared/math/rsqrtf16.h
- Put rsqrtf16 inside of libc/src/__support/math/CMakeLists.txt in
sorted order
- Rearranged libc_math_function rsqrtf16 in Bazel to match alphabetical
order
This PR includes only one of the fxdivi functions (rdivi). It uses a
polynomial for the initial approximation, followed by 4 Newton-Raphson
iterations to calculate the reciprocal, and finally multiplies the
numerator by it to get the result.
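The shape of that computation can be sketched as below. This is only an illustration in `double` arithmetic, not the fixed-point code from the PR: the function name `nr_divide`, the linear initial guess `48/17 - 32/17 * m`, and the use of `std::frexp` for normalization are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Sketch: approximate n / d through a reciprocal refined by Newton-Raphson.
double nr_divide(double n, double d) {
  assert(d > 0.0 && "this sketch handles positive divisors only");
  int e;
  double m = std::frexp(d, &e);               // d = m * 2^e, m in [0.5, 1)
  double x = 48.0 / 17.0 - (32.0 / 17.0) * m; // polynomial initial guess for 1/m
  for (int i = 0; i < 4; ++i)
    x = x * (2.0 - m * x);                    // relative error squares each step
  return n * std::ldexp(x, -e);               // 1/d = (1/m) * 2^-e
}
```

Because each iteration squares the relative error, a modest initial guess is enough for four iterations to reach the working precision.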
---------
Signed-off-by: Shreeyash Pandey <shreeyash335@gmail.com>
This patch enhances the GPU support documentation page (`support.html`)
by adding a new, detailed section for `math.h`. This new section
presents the results of the GPU math conformance tests, providing
quantitative data on the accuracy of the supported higher math
functions.
This is an implementation for template functions of localtime.
Update for this pull request: The implementation has been removed from
this pull request and will be added in a new one, because this pull
request is getting big. This pull request will only contain the template
functions needed to implement localtime.
Update: The implementation is available in
https://github.com/zimirza/llvm-project/tree/localtime_implementation.
---------
Co-authored-by: Зишан Мирза <zmirza@tutanota.de>
Co-authored-by: Zishan Mirza <zmirza@posteo.de>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- scalbnbf16
- scalblnbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- totalorderbf16
- totalordermagbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- canonicalizebf16
- iscanonicalbf16
- fdimbf16
- copysignbf16
- issignalingbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- modfbf16
- remainderbf16
- remquobf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
Co-authored-by: OverMighty <its.overmighty@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- frexpbf16
- ilogbbf16
- ldexpbf16
- llogbbf16
- logbbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- nearbyintbf16
- rintbf16
- lrintbf16
- llrintbf16
- lroundbf16
- llroundbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- getpayloadbf16
- setpayloadbf16
- setpayloadsigbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- nextafterbf16
- nextdownbf16
- nexttowardbf16
- nextupbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
Co-authored-by: OverMighty <its.overmighty@gmail.com>
This PR adds the following basic math functions for BFloat16 type along
with the tests:
- fromfpbf16
- fromfpxbf16
- ufromfpbf16
- ufromfpxbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
Use LIBC_ERRNO_MODE_SYSTEM_INLINE instead as the default for the "public
packaging" (i.e. release mode) of an overlay build. The Bazel build has
already switched to use it by default in
5ccc734fa0355f971f8f515457a0bece33ab6642. This should be a safe change,
as LIBC_ERRNO_MODE_SYSTEM_INLINE works as a drop-in (but simpler)
LIBC_ERRNO_MODE_SYSTEM replacement. Remove the associated code paths and
config settings.
Fixes issue #143454.
Part of https://github.com/llvm/llvm-project/issues/145349. Required to
allow `atexit` to work. As part of `HermeticTestUtils.cpp`, there is a
reference to `atexit()`, which eventually instantiates an instance of a
Mutex.
Instead of copying the implementation from
`libc/src/__support/threads/gpu/mutex.h`, we allow platforms to select
an implementation based on configurations, allowing the GPU and
single-threaded baremetal platforms to share an implementation. This can
be configured or overridden.
Later, when the threading API is more complete, we can add an option to
support multithreading (or set it as the default), but having
single-threading (in tandem) is in line with other libraries for
embedded devices.
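The idea of selecting a mutex implementation by configuration can be sketched as follows. Everything here is hypothetical: `SingleThreadedMutex`, the `EXAMPLE_MULTITHREADED` switch, and `bump` are illustrative names, not the types or options introduced by this change.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical mutex for a platform that only ever runs one thread.
// It needs no atomics; it only tracks state to catch misuse in debug builds.
struct SingleThreadedMutex {
  bool locked = false;
  void lock() {
    assert(!locked && "re-locking would deadlock on a single thread");
    locked = true;
  }
  void unlock() {
    assert(locked);
    locked = false;
  }
};

// Hypothetical configuration switch; not the option name used by the patch.
#ifndef EXAMPLE_MULTITHREADED
using Mutex = SingleThreadedMutex;
#else
using Mutex = std::mutex;
#endif

int counter = 0;
Mutex counter_lock;

// Callers are written once against `Mutex` and do not care which one is used.
int bump() {
  std::lock_guard<Mutex> guard(counter_lock);
  return ++counter;
}
```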
Reverts llvm/llvm-project#146226
The MPFR test uses `mpfr_asinpi` which requires MPFR 4.2.0 or later, but
the Buildbots are running an older version of MPFR.
See https://lab.llvm.org/buildbot/#/builders/104/builds/27743 for
example.
I said I was going to revert the PR until we have a workaround for older
versions of MPFR, but then I forgot and I just disabled the entrypoints
which doesn't fix the Buildbot builds.
The function is implemented using the following Taylor series, generated
using [python-sympy](https://www.sympy.org/en/index.html). It is very
accurate for $|x| \in [0, 0.5]$ and has been verified using Geogebra.
Range reduction is used for the remaining range (0.5, 1].
$$
\frac{\arcsin(x)}{\pi} \approx
\begin{aligned}[t]
& 0.318309886183791x \\
& + 0.0530516476972984x^3 \\
& + 0.0238732414637843x^5 \\
& + 0.0142102627760621x^7 \\
& + 0.00967087327815336x^9 \\
& + 0.00712127941391293x^{11} \\
& + 0.00552355646848375x^{13} \\
& + 0.00444514782463692x^{15} \\
& + 0.00367705242846804x^{17} \\
& + 0.00310721681820837x^{19} + O(x^{21})
\end{aligned}
$$
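A minimal sketch of evaluating this series with Horner's scheme in $x^2$ is shown below. It uses `double` for illustration, which may differ from the precision the patch targets, and the names `ASINPI_COEFFS` and `asinpi_series` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Coefficients of x^1, x^3, ..., x^19, copied from the series above.
constexpr double ASINPI_COEFFS[] = {
    0.318309886183791,   0.0530516476972984,  0.0238732414637843,
    0.0142102627760621,  0.00967087327815336, 0.00712127941391293,
    0.00552355646848375, 0.00444514782463692, 0.00367705242846804,
    0.00310721681820837};

// Evaluates asin(x)/pi for |x| <= 0.5 as x * Q(x^2) using Horner's scheme.
double asinpi_series(double x) {
  assert(std::fabs(x) <= 0.5);
  double u = x * x;
  double q = 0.0;
  for (int i = 9; i >= 0; --i)
    q = q * u + ASINPI_COEFFS[i];
  return x * q;
}
```

Factoring out `x` and evaluating in `u = x * x` halves the polynomial degree and keeps the result exactly odd in `x`.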
Closes #132210
This PR fixes broken links in all files describing libc usage modes.
Please let me know if there are any other places that need updating.
---------
Co-authored-by: shubhp@perlmutter <shubhp@perlmutter.com>
Add `CLOCKS_PER_SEC` and the older `CLK_TCK`. Allows the user to define
a `__CLK_TCK` to override if necessary.
Also add an extra column for embedded AArch64 in `time.rst`
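For reference, the usual use of `CLOCKS_PER_SEC` is to convert a tick count from `clock()` into seconds. A minimal sketch follows; the helper name `ticks_to_seconds` is hypothetical.

```cpp
#include <cassert>
#include <ctime>

// Convert a tick count, as returned by clock(), into seconds.
double ticks_to_seconds(std::clock_t ticks) {
  assert(CLOCKS_PER_SEC > 0);
  return static_cast<double>(ticks) / CLOCKS_PER_SEC;
}
```

A typical call site would be `ticks_to_seconds(std::clock() - start)`, where `start` was captured earlier with `std::clock()`.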
Main algorithm:
The Taylor series expansion of `asin(x)` is:
```math
\begin{align*}
asin(x) &= x + x^3 / 6 + 3x^5 / 40 + ... \\
&= x \cdot P(x^2) \\
&= x \cdot P(u) &\text{, where } u = x^2.
\end{align*}
```
For the fast path, we perform range reduction mod 1/64 and use degree-7
(minimax + Taylor) polynomials to approximate `P(x^2)`.
When `|x| >= 0.5`, we use the transformation:
```math
u = \frac{1 - |x|}{2}
```
and apply the half-angle formula to reduce `asin(x)` to:
```math
\begin{align*}
asin(x) &= sign(x) \cdot \left( \frac{\pi}{2} - 2 \cdot asin(\sqrt{u}) \right) \\
&= sign(x) \cdot \left( \frac{\pi}{2} - 2 \cdot \sqrt{u} \cdot P(u) \right).
\end{align*}
```
Since `0.5 <= |x| <= 1`, `|u| <= 0.5`, so we can reuse the polynomial
evaluation of `P(u)` from the `|x| < 0.5` case.
For the accurate path, we redo the computations in 128-bit precision
with degree-15 (minimax + Taylor) polynomials to approximate `P(u)`.
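The two branches can be sketched as below. This is an illustration only: it sums the plain Taylor series for `P(u)` instead of the minimax polynomials and the mod-1/64 reduction described above, it works in `double` throughout, and it takes `u = (1 - |x|) / 2` for the large-argument branch so that `u` stays in `[0, 0.25]`. The names `asin_poly` and `asin_sketch` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// P(u) such that asin(x) = x * P(x^2), summed from the Taylor series.
// Successive coefficients satisfy c_{k+1} = c_k * (2k+1)^2 / ((2k+2)(2k+3)).
double asin_poly(double u) {
  double term = 1.0, sum = 1.0; // term holds c_k * u^k
  for (int k = 0; k < 24; ++k) {
    double odd = 2.0 * k + 1.0;
    term *= u * odd * odd / ((odd + 1.0) * (odd + 2.0));
    sum += term;
  }
  return sum;
}

double asin_sketch(double x) {
  assert(std::fabs(x) <= 1.0);
  double ax = std::fabs(x);
  if (ax < 0.5)
    return x * asin_poly(x * x);       // small arguments: direct evaluation
  double u = (1.0 - ax) / 2.0;         // 0 <= u <= 0.25
  const double half_pi = std::acos(0.0);
  double r = half_pi - 2.0 * std::sqrt(u) * asin_poly(u);
  return std::copysign(r, x);          // restore the sign of x
}
```

Both branches call the same `asin_poly` on an argument no larger than 0.25, which is what allows a single polynomial to cover the whole domain.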
This PR implements the following 8 functions along with the tests.
```c++
int idivr(fract, fract);
long int idivlr(long fract, long fract);
int idivk(accum, accum);
long int idivlk(long accum, long accum);
unsigned int idivur(unsigned fract, unsigned fract);
unsigned long int idivulr(unsigned long fract, unsigned long fract);
unsigned int idivuk(unsigned accum, unsigned accum);
unsigned long int idivulk(unsigned long accum, unsigned long accum);
```
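For readers without a fixed-point-capable compiler, the intent of these functions can be emulated on raw integers. The sketch below assumes a Q15 layout for `fract` (15 fractional bits) and truncation toward zero; both are assumptions for illustration, and `Q15` and `idivr_emulated` are hypothetical names rather than part of this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for `fract`: the value represented is raw / 2^15.
struct Q15 {
  std::int16_t raw;
};

// Both operands share the same scale, so it cancels in the quotient and
// the integer result is just the quotient of the raw representations.
int idivr_emulated(Q15 n, Q15 d) {
  assert(d.raw != 0 && "division by zero is not handled in this sketch");
  return static_cast<int>(n.raw) / static_cast<int>(d.raw);
}
```

For example, dividing 0.5 (raw 16384) by 0.25 (raw 8192) gives the integer 2.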
ref: https://www.iso.org/standard/51126.html

Fixes #129125
---------
Signed-off-by: krishna2803 <kpandey81930@gmail.com>