Summary:
I initially didn't report these as supported because they didn't provide
expected behavior and were very wasteful. The recent patch moved them to
a lock-free atomic implementation so they can now actually be used.
Summary:
Currently, we implement the `rand` function using thread-local storage.
This is somewhat problematic because not every target supports TLS, and
even more do not support non-zero initializers on TLS.
The C standard states that the `rand()` function need not be thread,
safe. However, many implementations provide thread-safety anyway.
There's some confusing language in the 'rationale' section of
https://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html,
but given that `glibc` uses a lock, I think we should make this thread
safe as well. it mentions that threaded behavior is desirable and can be
done in the two ways:
1. A single per-process sequence of pseudo-random numbers that is shared
by all threads that call rand()
2. A different sequence of pseudo-random numbers for each thread that
calls rand()
The current implementation is (2.) and this patch moves it to (1.). This
is beneficial for the GPU case and more generic support. The downside is
that it's slightly slower to do these atomic operations, the fast path
will be two atomic reads and an atomic write.
Summary:
This patch partially implements the `rand` function on the GPU. This is
partial because the GPU currently doesn't support thread local storage
or static initializers. To implement this on the GPU. I use 1/8th of the
local / shared memory quota to treak the shared memory as thread local
storage. This is done by simply allocating enough storage for each
thread in the block and indexing into this based off of the thread id.
The downside to this is that it does not initialize `srand` correctly to
be `1` as the standard says, it is also wasteful. In the future we
should figure out a way to support TLS on the GPU so that this can be
completely common and less resource intensive.
Summary:
This patch improves the implementation of the standard `rand()` function
by implementing it in terms of the xorshift64star pRNG as described in
https://en.wikipedia.org/wiki/Xorshift#xorshift*. This is a good,
general purpose random number generator that is sufficient for most
applications that do not require an extremely long period. This patch
also correctly initializes the seed to be `1` as described by the
standard. We also increase the `RAND_MAX` value to be `INT_MAX` as the
standard only specifies that it can be larger than 32768.
It resolves to thread_local on all platform except for the GPUs on which
it resolves to nothing. The use of thread_local in the source code has been
replaced with the new macro.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D151486
This provides the reference implementation of rand and srand. In future
this will likely be upgraded to something that supports full ints.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D135187