Summary:
Currently, we implement the `rand` function using thread-local storage.
This is somewhat problematic because not every target supports TLS, and
even more do not support non-zero initializers on TLS.
The C standard states that the `rand()` function need not be thread,
safe. However, many implementations provide thread-safety anyway.
There's some confusing language in the 'rationale' section of
https://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html,
but given that `glibc` uses a lock, I think we should make this thread
safe as well. it mentions that threaded behavior is desirable and can be
done in the two ways:
1. A single per-process sequence of pseudo-random numbers that is shared
by all threads that call rand()
2. A different sequence of pseudo-random numbers for each thread that
calls rand()
The current implementation is (2.) and this patch moves it to (1.). This
is beneficial for the GPU case and more generic support. The downside is
that it's slightly slower to do these atomic operations, the fast path
will be two atomic reads and an atomic write.
This refactors some of the FreeListHeap, FreeList, and Block classes to
have constexpr ctors so we can constinit a global allocator that does
not require running some global function or global ctor to initialize.
This is needed to prevent worrying about initialization order and any
other module-ctor can invoke malloc without worry.
This is the actual freelist allocator which utilizes the generic
FreeList and the Block classes. We will eventually wrap the malloc
interface around this.
This is a part of #94270 to land in smaller patches.
This implements a traditional freelist to be used by the freelist
allocator. It operates on spans of bytes which can be anything. The
freelist allocator will store Blocks inside them.
This is a part of #94270 to land in smaller patches.
A block represents a chunk of memory used by the freelist allocator. It
contains header information denoting the usable space and pointers as
offsets to the next and previous block.
On it's own, this doesn't do much. This is a part of
https://github.com/llvm/llvm-project/pull/94270 to land in smaller
patches.
This is a subset of pigweed's freelist allocator implementation.
- added at_quick_exit function
- used helper file exit_handler which reuses code from atexit
- atexit now calls helper functions from exit_handler
- test cases and dependencies are added
---------
Co-authored-by: Aaryan Shukla <aaryanshukla@google.com>
- In /libc/src/__support/ OSUtil, changed quick_exit to just exit, and
put in namespace
LIBC_NAMESPACE::internal.
- In /libc/src/stdlib added quick_exit
- Added test files for quick_exit
Adds tests for `inf` and `nan` values to the tests for `strfrom*()`
functions.
Also marks some variables as `[[maybe_unused]]` to silence unused
variable warnings.
Follow up to #85438.
Implements the functions `strfromd()` and `strfroml()` introduced in
C23, and unifies the testing framework for `strfrom*()` functions.
Fixes#84244.
Implements the function `strfromf()` introduced in C23, and adds shared
utilities for implementation of other `strfrom*()` functions, including
`strfromd()` and `strfroml()`.
Summary:
A lot of these tests failed previously and were disabled. However we
have fixed some things since then and many of these seem to pass.
Additionally, the last remaining math tests that failed seemed to be due
to the exception handling. For now we just set it to be 'errno'.
These pass locally when tested on a gfx1030, gfx90a, and sm_89
architecture. Hopefully these pass correctly on the sm_60 bot as I've
had things fail on that one only before.
Summary:
This is a massive patch because it reworks the entire build and
everything that depends on it. This is not split up because various bots
would fail otherwise. I will attempt to describe the necessary changes
here.
This patch completely reworks how the GPU build is built and targeted.
Previously, we used a standard runtimes build and handled both NVPTX and
AMDGPU in a single build via multi-targeting. This added a lot of
divergence in the build system and prevented us from doing various
things like building for the CPU / GPU at the same time, or exporting
the startup libraries or running tests without a full rebuild.
The new appraoch is to handle the GPU builds as strict cross-compiling
runtimes. The first step required
https://github.com/llvm/llvm-project/pull/81557 to allow the `LIBC`
target to build for the GPU without touching the other targets. This
means that the GPU uses all the same handling as the other builds in
`libc`.
The new expected way to build the GPU libc is with
`LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`.
The second step was reworking how we generated the embedded GPU library
by moving it into the library install step. Where we previously had one
`libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This
patch includes the necessary clang / OpenMP changes to make that not
break the bots when this lands.
We unfortunately still require that the NVPTX target has an `internal`
target for tests. This is because the NVPTX target needs to do LTO for
the provided version (The offloading toolchain can handle it) but cannot
use it for the native toolchain which is used for making tests.
This approach is vastly superior in every way, allowing us to treat the
GPU as a standard cross-compiling target. We can now install the GPU
utilities to do things like use the offload tests and other fun things.
Some certain utilities need to be built with
`--target=${LLVM_HOST_TRIPLE}` as well. I think this is a fine
workaround as we
will always assume that the GPU `libc` is a cross-build with a
functioning host.
Depends on https://github.com/llvm/llvm-project/pull/81557
Having libc_errno outside of the namespace causes versioning issues when
trying to link the tests against LLVM-libc. Most of this patch is just
moving libc_errno inside the namespace in tests. This isn't necessary in
the function implementations since those are already inside the
namespace.
This patch provides specific test macros to deal with `errno`.
This will help abstract away the differences between unit test and integration/hermetic tests in #79319.
In one case we use `libc_errno` which is a struct, in the other case we deal directly with `errno`.
The semantics for casting can range from "bitcast" (same representation)
to "different representation", to "type promotion". Here we remove the
cast operator and force usage of `get_val` as the only function to get
the floating point value, making the intent clearer and more consistent.
This patch reduces the scope of `FPBits` exported variables and
functions.
It also moves storage up into `FPRep` and tries to make the default and
specialized versions of `FPBits` more uniform.
The next step is to move the specialization from `FPBits` to `FPRep` so
we can manipulate floating point representations through `FPType`
alone - that is - independently from the host architecture.
Summary:
This patch partially implements the `rand` function on the GPU. This is
partial because the GPU currently doesn't support thread local storage
or static initializers. To implement this on the GPU. I use 1/8th of the
local / shared memory quota to treak the shared memory as thread local
storage. This is done by simply allocating enough storage for each
thread in the block and indexing into this based off of the thread id.
The downside to this is that it does not initialize `srand` correctly to
be `1` as the standard says, it is also wasteful. In the future we
should figure out a way to support TLS on the GPU so that this can be
completely common and less resource intensive.
Summary:
This patch improves the implementation of the standard `rand()` function
by implementing it in terms of the xorshift64star pRNG as described in
https://en.wikipedia.org/wiki/Xorshift#xorshift*. This is a good,
general purpose random number generator that is sufficient for most
applications that do not require an extremely long period. This patch
also correctly initializes the seed to be `1` as described by the
standard. We also increase the `RAND_MAX` value to be `INT_MAX` as the
standard only specifies that it can be larger than 32768.
Other libc implementations support underscores in NaN(n-char-sequence)
strings. Us not supporting that is causing fuzz failures, so this patch
solves the problem.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D156927
This patch adds the reentrent qsort entrypoint, qsort_r. This is done by
extending the qsort functionality and moving it to a shared utility
header. For this reason the qsort_r tests focus mostly on the places
where it differs from qsort, since they share the same sorting code.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D152467
This patch adds support for the `malloc` and `free` functions. These
currently aren't implemented in-tree so we first add the interface
filies.
This patch provides the most basic support for a true `malloc` and
`free` by using the RPC interface. This is functional, but in the future
we will want to implement a more intelligent system and primarily use
the RPC interface more as a `brk()` or `sbrk()` interface only called
when absolutely necessary. We will need to design an intelligent
allocator in the future.
The semantics of these memory allocations will need to be checked. I am
somewhat iffy on the details. I've heard that HSA can allocate
asynchronously which seems to work with my tests at least. CUDA uses an
implicit synchronization scheme so we need to use an explicitly separate
stream from the one launching the kernel or the default stream. I will
need to test the NVPTX case.
I would appreciate if anyone more experienced with the implementation details
here could chime in for the HSA and CUDA cases.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D151735
There were regressions in the testing framework due to none of the
functioning buildbots having a 32 bit long. This allowed the 32 bit
version of the strtointeger function to go untested. This patch adds
tests for strtoint32 and strtoint64, which are internal testing
functions that use constant integer sizes. It also fixes the tests to
properly handle these situations.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D151935
Along the way, couple of additional things have been done:
1. Move `ErrnoSetterMatcher.h` to `test/UnitTest` as all other matchers live
there now.
2. `ErrnoSetterMatcher` ignores matching `errno` on GPUs.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D151129
The previous string to float tests didn't check correctness, but due to
the atof differential test proving unreliable the strtofloat fuzz test
has been changed to use MPFR for correctness checking. Some minor bugs
have been found and fixed as well.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D150905
This changes the `stdlib` tests to the new `add_libc_test` framework.
This applies to all but the exit tests.
Depends on D149691
Reviewed By: sivachandra, michaelrj
Differential Revision: https://reviews.llvm.org/D149705
Previously, only those unit tests which belonged to a suite were run as
part of libc-unit-tests. It meant that unit tests not part of any suite
were not being tested. This change makes all unit tests run as part of
libc-unit-tests. The convenience function to add a libc unit test suite
has been removed and add_custom_target is used in its place. One of the
bit-rotting test has along been fixed. Math exhaustive and differential
tests are skipped under full build.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D148784
Same issue as was fixed in commit 3d95323, but for hexadecimal floats.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D148545
String to float has a condition to prevent overflowing the exponent with
the E notation. To do this it checks if adding that exponent to the
exponent found by parsing the number is greater than the maximum
exponent for the given size of float. The if statements had a gap on
exactly the maximum exponent value that caused it to be treated as the
minimum exponent value. This patch fixes those conditions.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D148152
Previously if you specified an exponent of more than 10000 in string to
float (e.g. "1e10001") it would treat it as 10000. A bug was discovered
where by having more than 10000 zeroes after a decimal point and an
exponent of more than 10000 this would cause the code to return the
incorrect value.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D147474
Previously the check to just return MAX or MIN used the caclulated
number being the maximum absolute value. This was right in every case
except for an unsigned conversion being passed its maximum value with a
negative sign on the front. This should return -MAX, but was returning
just MAX.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D147171
The differential fuzzer found that glibc and our libc disagree on the
result for "0x30000002222225p-1077", with ours being rounded up and
theirs rounded down. Ours is more correct for the nearest rounding mode,
so only a test is added.
Reviewed By: lntue, sivachandra
Differential Revision: https://reviews.llvm.org/D145821
This part of the effort to make all test related pieces into the `test`
directory. This helps is excluding test related pieces in a straight
forward manner if LLVM_INCLUDE_TESTS is OFF. Future patches will also move
the MPFR wrapper and testutils into the 'test' directory.
one of the tests in StrtolTest.h is intended to detect that 32 bit longs
are handled correctly, but it wasn't using the correct value for errno
causing failures.
Differential Revision: https://reviews.llvm.org/D140577
Previously the tests were copy/pasted into several files, this changes
them to be instead templated and sharing one file.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D140441