Currently `add_lit_target` sets the `USES_TERMINAL` CMake option. When
using Ninja, this forces all lit testsuite targets into the
single-threaded `console` pool.
This PR adds a new option `LLVM_PARALLEL_LIT` which drops the
`USES_TERMINAL` flag, allowing Ninja to run them in parallel.
The default setting (`LLVM_PARALLEL_LIT=OFF`) retains the existing
behavior of serial testsuite execution.
Add options --set-symbol-visibility and --set-symbols-visibility to
manually change the visibility of symbols.
There is already an option to set the visibility of newly added symbols
via --add-symbol and --new-symbol-visibility. This option will allow to
change the visibility of already existing symbols.
This patch adds a LLVM-EXEGESIS-LOOP-REGISTER snippet annotation which
allows a user to specify the register to use for the loop counter in the
loop repetition mode. This allows for executing snippets that don't work
with the default value (currently R8 on X86).
Primary change is to add a flag `--pretty-pgo-analysis-map` to
llvm-readobj and llvm-objdump that prints block frequencies and branch
probabilities in the same manner as BFI and BPI respectively. This can
be helpful if you are manually inspecting the outputs from the tools.
In order to print, I moved the `printBlockFreqImpl` function from
Analysis to Support and renamed it to `printRelativeBlockFreq`.
https://reviews.llvm.org/D63820 added
llvm/docs/CommandGuide/llvm-objcopy.rst with clearer semantics, e.g.
```
Read a list of names from the file <filename> and mark defined symbols with those names as global in the output
instead of the help message
Read a list of symbols from <filename> and marks them global" (omits "defined")
Rename sections called <old> to <new> in the output
instead of the help message
Rename a section from old to new (multiple sections may be named <old>
```
Sync the help messages to incorporate the CommandGuide improvement.
While here, switch to the conventional imperative sentences for a few
options. Additionally, mark some options as grp_coff or grp_macho.
This is a fix for commit e9cdd165d7bc ("[docs] Remove the Packaging
"Tips" which seems to be about pre-cmake ./configure (#82958)"). The
doc was removed but not the references to it.
Following the changes made for:
- https://reviews.llvm.org/D154130: [lit][clang] Avoid realpath on Windows due to MAX_PATH limitations
in commit:
- 05d613ea931b6de1b46dfe04b8e55285359047f4
Python 3.8.0 or newer is now required by at least the following tests
when they are run on Windows from a substitute (virtual) drive. A
substitute drive is often used as a workaround for `MAX_PATH`
limitations on Windows. These tests are impacted because they use the
lit `%{?:real}` path expansion syntax to expand symbolic links and
substitute drives. This path expansion is implemented with Python's
`os.path.realpath()` function which changed behavior in Python 3.8.0
with regard to expansion of substitute drives. The changes mentioned
above rely on the newer Python behavior.
- `clang/test/Lexer/case-insensitive-include-absolute.c`
- `clang/test/Lexer/case-insensitive-include-win.c`
This change updates the LLVM Getting Started guide to note this newer
Python version dependency for this relatively niche case. Python 3.6.0
remains the minimum required Python version otherwise.
This patch also pick the MatchContext framework from DAGCombiner to an
indiviual header file to make the framework be used from other files in
llvm/lib/CodeGen/SelectionDAG/.
When a section has the SHF_COMPRESSED flag, -p/-x dump the compressed
content by default. In GNU readelf, if --decompress/-z is specified,
-p/-x will dump the decompressed content. This patch implements the
option.
Close#82507
This reverts commit 0a6c74e21cc6750c843310ab35b47763cddaaf32.
This created a lot of post-commit failures due to buildbots running
older versions of Python.
As per the RFC
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-python-version/67571,
raise the minimum Python version to ensure that the Python syntax
doesn't become overly obsolete, to enable new Python feature usage,
and to improve the maintainability of CI.
One of the primary use cases for this higher Python version is to enable
python type annotations that are more aligned with current Python
best practices. This is not only important for our own internal Python
for testing, but for the Python bindings that are exposed to users.
This patch implements the codegen support of zabha (Byte and Halfword
Atomic Memory Operations) v1.0-rc1 extension.
See also https://github.com/riscv/riscv-zabha/blob/v1.0-rc1/zabha.adoc.
---------
Co-authored-by: Craig Topper <craig.topper@sifive.com>
The dot is too confusing for tools. Output temporaries would have
'10.3-generic' so tools could parse it as an extension, device libs &
the associated clang driver logic are also confused by the dot.
After discussions, we decided it's better to just remove the '.' from
the target name than fix each issue one by one.
This allows for accessing the function/basic block that a blockaddress
constant refers to
Due to the difficulties of fully supporting cloning BlockAddress values
in echo.cpp, tests are instead done using a unit test.
This previously was up for review at
https://github.com/llvm/llvm-project/pull/77390.
These generic targets include multiple GPUs and will, in the future,
provide a way to build once and run on multiple GPU, at the cost of less
optimization opportunities.
Note that this is just doing the compiler side of things, device libs an
runtimes/loader/etc. don't know about these targets yet, so none of them
actually work in practice right now. This is just the initial commit to
make LLVM aware of them.
This contains the documentation changes for both this change and #76954
as well.
Summary:
The previous patch did very simple folding that only worked for driectly
used branches. This patch improves this by traversing the use-def chain
to sipmlify every constant subexpression until it reaches a terminator
we can delete. The support should work for all expected cases now.
Reverts llvm/llvm-project#79586
This broke the AMDGPU OpenMP Offload buildbot.
The typical error message was that the GPU attempted to read beyong the
largest legal address.
Error message:
AMDGPU fatal error 1: Received error in queue 0x7f8363f22000:
HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to
access memory beyond the largest legal address.
Summary:
The `__nvvm_reflect` function is used to guard invalid code that varies
between architectures. One problem with this feature is that if it is
used without optimizations, it will leave invalid code in the module
that will then make it to the backend. The `__nvvm_reflect` pass is
already mandatory, so it should do some trivial branch removal to ensure
that constants are handled correctly. This dead branch elimination only
works in the trivial case of a compare on a branch and does not touch
any conditionals that were not realted to the `__nvvm_reflect` call in
order to preserve `O0` semantics as much as possible. This should allow
the following to work on NVPTX targets
```c
int foo() {
if (__nvvm_reflect("__CUDA_ARCH") >= 700)
asm("valid;\n");
}
```
Relanding after fixing a bug.
This reverts commit 9211e67da36782db44a46ccb9ac06734ccf2570f.
Summary:
This seemed to crash one one of the CUDA math tests. Revert until it can
be fixed.
Summary:
The `__nvvm_reflect` function is used to guard invalid code that varies
between architectures. One problem with this feature is that if it is
used without optimizations, it will leave invalid code in the module
that will then make it to the backend. The `__nvvm_reflect` pass is
already mandatory, so it should do some trivial branch removal to ensure
that constants are handled correctly. This dead branch elimination only
works in the trivial case of a compare on a branch and does not touch
any conditionals that were not realted to the `__nvvm_reflect` call in
order to preserve `O0` semantics as much as possible. This should allow
the following to work on NVPTX targets
```c
int foo() {
if (__nvvm_reflect("__CUDA_ARCH") >= 700)
asm("valid;\n");
}
```
Both LLVM_LINK_LLVM_DYLIB and LLVM_PARALLEL_LINK_JOBS help with some
common gotchas. It seems worth documenting them here explicitly.
Based on a review comment, also "refactor" the documentation to avoid duplication.