This is based on ideas from @nafi to:
- use a branchless version of 'cmp' for 'uint32_t',
- completely resolve the lexicographic comparison through vector
operations when wide types are available. We also get rid of byte
reloads and serializing '__builtin_ctzll'.
I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in sub-sequent patches.
The code been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.
Reviewed By: nafi3000
Differential Revision: https://reviews.llvm.org/D148717
Many math functions need to check for floating point rounding modes to
return correct values. Currently most of them use the internal implementation
of `fegetround`, which is platform-dependent and blocking math functions to be
enabled on platforms with unimplemented `fegetround`. In this change, we add
platform independent rounding mode checks and switching math functions to use
them instead. https://github.com/llvm/llvm-project/issues/63016
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D152280
This is based on ideas from @nafi to:
- use a branchless version of 'cmp' for 'uint32_t',
- completely resolve the lexicographic comparison through vector
operations when wide types are available. We also get rid of byte
reloads and serializing '__builtin_ctzll'.
I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in sub-sequent patches.
The code been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.
Reviewed By: nafi3000
Differential Revision: https://reviews.llvm.org/D148717
str method of FPBits class is only used for pretty printing its objects
in tests. It brings cpp::string dependency to FPBits class, which is not ideal
for embedded use case. We move str method to a free function in test utils and
remove this dependency of FPBits class.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D152607
A few of these tests were disabled due to failing on NVPTX. After
looking into it the vast majority of these cases were due to
insufficient stack memory. This can be worked around by increasing the
stack size in the loader or by reducing the memory usage in the case of
large string constants.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D152583
This is a bit of cleanup before working on logging via stream operator (i.e., `EXPECT_XXX() << ...`).
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D152503
The constants in the ryu constant tables were not marked constexpr,
only static const. This caused problems when compiling with GCC.
Differential Revision: https://reviews.llvm.org/D152487
The Mega Table that printf uses for long doubles with some flags is too
large for the linters, and so has been split out from the main patch.
The main patch: https://reviews.llvm.org/D150399
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D152470
This test started failing on Nvidia, we need to disable it to keep the
bot green until we can investigate the root cause.
Differential Revision: https://reviews.llvm.org/D152481
The results for the %Lf tests were calculated on 80 bit long double
systems, meaning the results are not necessarily accurate for float128
systems. In future these will be fixed, but for the moment I'm just
turning them off.
Differential Revision: https://reviews.llvm.org/D152471
This patch adds three options for printf decimal long doubles, and these
can also apply to doubles.
1. Use a giant table which is fast and accurate, but takes up ~5MB).
2. Use dyadic floats for approximations, which only gives ~50 digits of
accuracy but is very fast.
3. Use large integers for approximations, which is accurate but very
slow.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D150399
To generate fma instructions for x86-64 targets, we need both -mavx2
and -mfma.
Reviewed By: brooksmoses
Differential Revision: https://reviews.llvm.org/D152410
Recent upstream changes to -fsanitize=function find more instances of
function type mismatches. One case is with the comparator passed to this
class. Libraries like boringssl will tend to pass comparators that take
pointers to varying types while this comparator expects to accept const
void pointers. Ideally those tools would pass a function that strictly
accepts const void*s to avoid UB, or we'd have something like qsort_r/s.
Differential Revision: https://reviews.llvm.org/D152322
The `exit` entrypoint calls into `quick_exit` which is marked noreturn
in some cases. This will cause errors because we then have control flow
externally. This warning can be silenced by using a
`__builtin_unreachable` instruction accordingly.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D152323
We should more consistently use inline assembly using the LIBC wrappers.
It's much safer to mark all of these volatile as well.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D152294
ea8f4b98419750c8cc7c60ea43b570adf47b3f78 broke some build configurations
because it was enabled by default and some people are using a just built
libc/clang/LLVM to work on other projects where having a just built LLVM
libc in one of Clang's default include directories can make things
unusable.
Differential Revision: https://reviews.llvm.org/D152190
A previous patch added general support for printing via the RPC
interface. we should consolidate this functionality and get rid of the
old opcode that was used for simple testing.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D152211
If CUDA is not found this string will expand into nothing. We need to
surround it with a string otherwise it will cause build failures.
Differential Revision: https://reviews.llvm.org/D152209
This patch adds the initial support required to support basic priting in
`stdio.h` via `puts` and `fputs`. This is done using the existing LLVM C
library `File` API. In this sense we can think of the RPC interface as
our system call to dump the character string to the file. We carry a
`uintptr_t` reference as our native "file descriptor" as it will be used
as an opaque reference to the host's version once functions like
`fopen` are supported.
For some unknown reason the declaration of the `StdIn` variable causes
both the AMDGPU and NVPTX backends to crash if I use the `READ` flag.
This is not used currently as we only support output now, but it needs
to be fixed
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D151282
This patch adds support for the `malloc` and `free` functions. These
currently aren't implemented in-tree so we first add the interface
filies.
This patch provides the most basic support for a true `malloc` and
`free` by using the RPC interface. This is functional, but in the future
we will want to implement a more intelligent system and primarily use
the RPC interface more as a `brk()` or `sbrk()` interface only called
when absolutely necessary. We will need to design an intelligent
allocator in the future.
The semantics of these memory allocations will need to be checked. I am
somewhat iffy on the details. I've heard that HSA can allocate
asynchronously which seems to work with my tests at least. CUDA uses an
implicit synchronization scheme so we need to use an explicitly separate
stream from the one launching the kernel or the default stream. I will
need to test the NVPTX case.
I would appreciate if anyone more experienced with the implementation details
here could chime in for the HSA and CUDA cases.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D151735
This is based on ideas from @nafi to:
- use a branchless version of 'cmp' for 'uint32_t',
- completely resolve the lexicographic comparison through vector
operations when wide types are available. We also get rid of byte
reloads and serializing '__builtin_ctzll'.
I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in sub-sequent patches.
The code been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.
Reviewed By: nafi3000
Differential Revision: https://reviews.llvm.org/D148717
This patch moves the location of libllvmlibc.a within the build tree to
within ./lib/<target triple>. This more closely matches the behavior of
other runtime builds and allows for clang in the same build tree to
automatically be able to link against llvmlibc since this path is by
default included by the driver.
Also removes the LIBC_BINARY_DIR CMake flag since it isn't used anywhere
in the tree (based on a quick grep).
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D151624
The C standard asserts that the `errno` value is an l-value thread local
integer. We cannot provide a generic thread local integer on the GPU
currently without some workarounds. Previously, we worked around this by
implementing the `errno` value as a special consumer class that made all
the writes disappear. However, this is problematic for internal tests.
Currently there are build failures because of this handling and it's
only likely to cause more problems the more we do this.
This patch instead makes the internal target used for testing export the
`errno` value as a simple global integer. This allows us to use and test
the `errno` interface correctly assuming we run with a single thread.
Because this is only used for the non-exported target we still do not
provide this feature in the version that users will use so we do not
need to worrk about it being incorrect in general.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D152015
There were regressions in the testing framework due to none of the
functioning buildbots having a 32 bit long. This allowed the 32 bit
version of the strtointeger function to go untested. This patch adds
tests for strtoint32 and strtoint64, which are internal testing
functions that use constant integer sizes. It also fixes the tests to
properly handle these situations.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D151935
Update implementation status table for Date and Time Functions to include different targets.
Reviewed By: jeffbailey
Differential Revision: https://reviews.llvm.org/D151809
This patch simply moves the special handling for `linux` files to a
subdirectory. This is done to make it easier in the future to extend
this support to targets (like the GPU) that will have different
dependencies.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D151231
This reverts commit d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6.
Adds the patch by @hans from
https://github.com/llvm/llvm-project/issues/62719
This patch fixes the Windows build.
d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6 reverted the reviews
D144509 [CMake] Bumps minimum version to 3.20.0.
This partly undoes D137724.
This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193
Note this does not remove work-arounds for older CMake versions, that
will be done in followup patches.
D150532 [OpenMP] Compile assembly files as ASM, not C
Since CMake 3.20, CMake explicitly passes "-x c" (or equivalent)
when compiling a file which has been set as having the language
C. This behaviour change only takes place if "cmake_minimum_required"
is set to 3.20 or newer, or if the policy CMP0119 is set to new.
Attempting to compile assembly files with "-x c" fails, however
this is workarounded in many cases, as OpenMP overrides this with
"-x assembler-with-cpp", however this is only added for non-Windows
targets.
Thus, after increasing cmake_minimum_required to 3.20, this breaks
compiling the GNU assembly for Windows targets; the GNU assembly is
used for ARM and AArch64 Windows targets when building with Clang.
This patch unbreaks that.
D150688 [cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump
The build uses other mechanism to select the runtime.
Fixes#62719
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D151344
Along the way, couple of additional things have been done:
1. Move `ErrnoSetterMatcher.h` to `test/UnitTest` as all other matchers live
there now.
2. `ErrnoSetterMatcher` ignores matching `errno` on GPUs.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D151129
This passed locally but unfortauntely it seems some tests are not ready
to be made hermetic. Revert for now until we can investigate
specifically which tests are failing and mark those as `UNIT_TEST_ONLY`.
This reverts commit 417ea79e792a87d53f5ac4f5388af4b25aa04d7d.
This patch enables us to run the floating point tests as hermetic.
Importantly we now use the internal versions of the `fesetround` and
`fegetround` functions.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D151123
Strict warnings require explicit static_cast to counteract
default widening of types narrower than int.
Functions in header files should have vague linkage (inline
keyword), not internal linkage (static) or external linkage
(no inline keyword) even for template functions. Note these
don't use the LIBC_INLINE macro since this is only for test code.
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D151494
It resolves to thread_local on all platform except for the GPUs on which
it resolves to nothing. The use of thread_local in the source code has been
replaced with the new macro.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D151486