Usage of uninitialized memory is a top memory safety issue in C++ codebases.
Help mitigate this somewhat by default initialize stack allocations to a
pattern (0xAA repeating).
Clang has received optimizations to sink these into control flow paths that
access such values to minimize the overhead of these added initializations.
If there's a measurable slowdown, we can add
-ftrivial-auto-var-init-max-size=<N> for some value N bytes if we have any
large stack allocations, or add attribute uninitialized to any variable
declarations.
Unsupported until GCC 12.1 / Clang 8.
Increases file size of libc.a from a full build by +8.79Ki (+0.2%).
fixes https://github.com/llvm/llvm-project/issues/78743
- For normal objects, the patch removes `RTTI` and exceptions in `fullbuild`
- For FP tests, the patch adds links to `stdc++` and `gcc_s` if `MPFR` is used.
This reverts commit 6886a52d6dbefff77f33de12ff85d654e2557f81.
Most of the errors observed in postsubmit have been addressed. We can
fix-forward the remaining ones.
Link: https://lab.llvm.org/buildbot/#/changes/117129
Summary:
A previous patch added a dependency on the stack protectors, this was
not built on the GPU targets so every test was disabled. It turns out
that disabled tests still get targets so we need to specifically check
if the it is in the target's set of entrypoints before we can use it.
Another patch, because the build-bot was down, snuck in that prevented
the new math tests from being run. The problem is that the `signal.h`
header requires target specific definitions but was being used
unconditionally. I have made changes that disable building this header
if the file is not defined in the config. This required disbaling the
signal_to_string utility, so that will simply be missing from targets
that don't define it.
This allows individual object files to override the common compile
commands in
their local CMakeLists' add_object_library call.
For example, the common compile commands contain -Wall and -Wextra.
Before
this patch, the per object COMPILE_OPTIONS were prepended to these, so
that
builds of individual object files could not individually disable
specific
diagnostics from those groups explicitly.
After this patch, the per-object file compile objects are appended to
the list
of compiler flags, enabling this use case.
ARGN is a bit of cmake magic; let's be explicit in the APPEND that we're
appending the compile options.
Link: #77007
* separate initialization routines into _start and do_start for all
architectures.
* lift do_start as a separate object library to avoid code duplication.
* (addtionally) address the problem of building hermetic libc with
-fstack-pointer-*
The `crt1.o` is now a merged result of three components:
```
___
|___ x86_64
| |_______ start.cpp.o <- _start (loads process initial stack and aligns stack pointer)
| |_______ tls.cpp.o <- init_tls, cleanup_tls, set_thread_pointer (TLS related routines)
|___ do_start.cpp.o <- do_start (sets up global variables and invokes the main function)
```
Profiling cmake shows that a significant time configuring `libc` folder
is spent on running `get_object_files_for_test` in the `test` folder (13
sec in `libc/test` folder / 16 sec in `libc` folder). By caching all
needed objects for each target instead of resolving every time, the time
cmake spends on configuring `libc/test` folder is reduced to ~1s.
Summary:
The GPU tests tend to fail when run massively in parallel. This is why
we use a CMake job pool to limit it to 1 in most cases. We should
default to the configuration that is most likely to work, that being a
single thread. There aren't enough GPU tests for this to be a massive
increase in test time on the bots, so we should default to what works
guaranteed.
This reverts commit 606653091d1a66d1a83a1bfdea2883cc8d46687e.
Post submit buildbots are now red. We can use these explicit errors to better
clean up existing warnings, then reland this.
Link: #73966
A recent commit introduced warnings observable when building unit tests.
If the
unit tests don't fail when warnings are introduced into the build, then
we
might fail to notice them in the stream of output from check-libc.
Link: https://github.com/llvm/llvm-project/pull/72763/files#r1410932348
There are some basic vectorization features in standard architecture
specifications. Such as SSE/SSE2 for x86-64, or NEON for aarch64. Even
though such features are almost always available, we still need some
methods to test fallback routines without any vectorization.
Previous attempt in hsearch adds a DISABLE_SSE2_OPT flag that tries to
compile the code with -mno-sse2 in order to test specific table scanning
routines. However, it turns out that such flag may have some unwanted
side effects hindering portability.
This PR introduces PREFER_GENERIC as an alternative. When a target is
built with PREFER_GENERIC, cmake will define a macro
__LIBC_PREFER_GENERIC such that developers can selectively choose the
fallback routine based on the macro.
This patch implements `hcreate(_r)/hsearch(_r)/hdestroy(_r)` as
specified in https://man7.org/linux/man-pages/man3/hsearch.3.html.
Notice that `neon/asimd` extension is not yet added in this patch.
- The implementation is largely simplified from rust's
[`hashbrown`](https://github.com/rust-lang/hashbrown/blob/master/src/raw/mod.rs)
as we only consider fix-sized insertion-only hashtables. Technical
details are provided in code comments.
- This patch also contains a portable string hash function, which is
derived from [`aHash`](https://github.com/tkaitchuck/aHash)'s fallback
routine. Not using any SIMD acceleration, it has a good enough quality
(passing all SMHasher tests) and is not too bad in speed.
- Some general functionalities are added, such as `memory_size`,
`offset_to`(alignment), `next_power_of_two`, `is_power_of_two`.
`ctz/clz` are extended to support shorter integers.
Floating point properties are a combination of target OS, target
architecture and compiler support.
- Adding target OS detection,
- Moving floating point type detection to its own file.
This is in preparation of adding support for `_Float16` which requires
testing compiler **version** and target architecture.
Summary:
This patch includes the necessary changes to make the `libc` tests
running on AMD GPUs run using the newer code object version. The 'code
object version' is AMD's internal ABI for making kernel calls. The move
from 4 to 5 changed how we handle arguments for builtins such as
obtaining the grid size or setting up the size of the private stack.
Fixes: https://github.com/llvm/llvm-project/issues/72517
Currently the only way to add or remove entrypoints is to modify the
entrypoints.txt file for the current target. This isn't ideal since
a user would have to carry a diff for this file when updating their
checkout. This patch adds a basic mechanism to allow the user to remove
entrypoints without modifying the repository.
Summray:
A recent patch upgrades the NVPTX ctor / dtor lowering pass to emit
kernels so other languages can call them. We do this manually in `libc`
so we do not need this. Use the provided flag to disable this step to
keep the created kernels cleaner.
Summary:
This patch fixes some code duplication on the GPU. The GPU build wanted
to enable timing for hermetic tests so it built some special case
handling into the test suite. Now that `clock` is supported on the
target we can simply link against the external interface. Because we
include `clock.h` for the CLOCKS_PER_SEC macro we remap the C entrypoint
to the internal one if it ends up called. This should allow hermetic
tests to run with timing if it is supported.
Summary:
This patch simply adds the `-fconvergent-functions` flag to the GPU
compilation. This is in relation to the behaviour of SIMT
architectures under divergence. With the flag, we assume every function
is convergent by default and rely on the compiler's divergence analysis
to transform it if possible.
Fixes: https://github.com/llvm/llvm-project/issues/63853
Summary:
There were a few tests that weren't enabled on the GPU. This is because
the logic caused them to be skipped as we don't use CPU featured on the
host. This also disables the logic making multiple versions of the
memory functions.
Summary:
Previously this code was applied to the integration tests but did not
copy the logic that stopped this from being passed to the GPU build.
Copy the full line to avoid the warnings and prevent any libraries from
being included.
This patch enables the compilation of libc for rv32 by unifying the
current rv64 and rv32 implementation into a single rv implementation.
We updated the cmake file to match the new riscv32 arch and force
LIBC_TARGET_ARCHITECTURE to be "riscv" whenever we find "riscv32" or
"riscv64". This is required as LIBC_TARGET_ARCHITECTURE is used in the
path for several platform specific implementations.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D148797
printf_core.parser is not yet updated to use the printf config options. It
does not use them currently anyway and the corresponding parser_test
should be updated to respect the config options.
This patch set the integration test's linking options to be the same one
used in the hermetic tests.
In particular, by removing -nostdlib the tests are linked with
libgcc/compiler-rt and this fixes an issue undefined reference to
__udivdi3 and __umoddi3 in rv32.
${CMAKE_CROSSCOMPILING_EMULATOR} will be used in the new rv32 buildbot
and is prepended automatically when we call add_custom_target in CMake,
except when we use a custom command.
There are two places where custom commands are used in libc, so we
explicitly add the ${CMAKE_CROSSCOMPILING_EMULATOR} variable there.
Other systems that don't use ${CMAKE_CROSSCOMPILING_EMULATOR} are
unaffected
We want to activate `llvm-header-guard` (#66477) but the current CMake
configuration includes paths that should be `isystem`. This PR restricts
the number of `-I` passed to the clang command line and correctly marks
the llvm libc include path as `isystem`.
Summary:
The GPU build has a lot of magic around how we package the output.
Generally, the GPU needs to exist as a secondary fatbinary image for
offloading languages. This is because offloading languages pretend like
offloading to an accelerator is a single file. This then needs to be put
into a single file to make it mesh with the existing build
infrastructure. To work with this, the `libc` makes an installed version
of the library that simply embeds the GPU code into an empty stub file.
This wasn't being updated correctly, which lead to the installed `libc`
static library not being updated correctly when the underlying file was
changed. The previous behaviour only updated when the entrypoint itself
was modified, but not any of its headers. By adding a dependcy on the
actual *object* file we should now capture the regular CMake semantics.
This is part of a libc wide CMake cleanup which aims to eliminate
certain explicitly duplicated logic which is available in CMake-3.20.
This change in particular makes the entrypoint aliases real library
targets so that they can be treated as normal library targets by other
libc build rules.
The internal header library target with name suffix `.__header_library`
has been removed as it serves no purpose now. It was added to make older
versions of CMake happy.
Summary:
There is currently effort to change over the default AMDGPU code object
version https://github.com/llvm/llvm-project/pull/65410. However, this
unfortunately causes problems in the LLVM LibC test suite that leads to
a hang while executing. This is most likely a bug to do with indirect
call optimization, as it can be avoided without optimizations or with
manually preventing inlining in the AMDGPU startup code.
This patch sets the AMDGPU code object version to be four explicitly on
the LibC test suite. This should unblock the efforts to move the default
to 5 without breaking the test suite. This isn't a great solution, but
there is currently some time pressure to get COV5 landed and this seems
to be the easiest solution.
The options added via COMPILE_OPTIONS will be treated as INTERFACE
options. This will help in setting compile options based on libc config
options in future patches.
Summary:
AMDGPU binaries use a "code object" as the ABI indicator. We are
currently trying to move over to a newer code object. We want these
library functions to use the "generic" or default ABI such that it is
specified when linked into the user application. Currently this will
default to v4 as the startup code will use whatever the current default
is.
Summary:
A previous introduced a new object type for the GPU functions
implemented by an external vendor library. This was done so they we did
not attempt to run tests on functions which we did not implement,
however this accidentally stopped them from being included in the actual
output. Fix this by checking the new type as well.
The long term goal is to remove this vendor handling altogether, but is
being used as a short-term solution to provide a math library on the
GPU which currently lacks one.
Few printf config options have been setup using this new config system
along with their baremetal overrides. A follow up patch will add generation
of doc/config.rst, which will contain the full list of libc config options
and short description explaining how they affect the libc.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D159158
It's very important that the GPU build does not include any system
directories. We currently use `-ffreestanding` to disable a lot of these
features, but we can still accidentally include them if they are not
provided by `libc` yet. This patch changes this to use `-nostdinc` to
disable all standard search paths. Then we use the compiler's resource
directory to pick up the provided headers like `stdint.h`.
Differential Revision: https://reviews.llvm.org/D159445
We currently remap vendor implementations of math functions to provide a
temporarily functional `libm.a` for the GPU. However, we should not run
tests on any files that depend on these vendor implementations as they
are not under our control and are not always present.
The goal in the future is to remove the need for this by replacing all
the vendor functionality, but for now this is a workaround.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D158213
We use `find_program` to identify a few programs we use for offloading.
Namely, `clang-offload-packger`, `amdgpu-arch`, and `nvptx-arch`.
Currently the logic allows these to bind to any tool matching this name,
so it will find it on the system. This meant that if the installation
was deleted or it found a broken binary the compilation would fail. We
should only pull these from the current LLVM binary directory.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158203
This more closely matches the stricter warnings used for
this same code in the Fuchsia build.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D156630