Adds BOLT_TARGETS_TO_BUILD, which defaults to the intersection of
X86;AArch64 and LLVM_TARGETS_TO_BUILD, but allows configuration to
alter that -- for instance, omitting one of those two targets even if
LLVM supports both.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D148847
This commit exposes the transformation behind the pattern.
This is useful for applying it once, in a more targeted way,
to a specific op.
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D148758
The BOLT runtime is hard-coded for x86_64 Linux or x86_64 Darwin
(it uses x86_64 syscalls and hardcodes syscall numbers).
Make it very clear that it is for that specific pair of systems.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D148825
On z/OS, we need to pass the location of the unwind interface header when building cxxabi. The CMake macro `LIBCXXABI_LIBUNWIND_INCLUDES_INTERNAL` is available for this purpose, but it is only used in conjunction with `LIBCXXABI_USE_LLVM_UNWINDER`. For the external unwind library we need to use `LIBCXXABI_LIBUNWIND_INCLUDES_INTERNAL` unconditionally whenever it is set.
Reviewed By: #libc_abi, muiez, phosek, SeanP
Differential Revision: https://reviews.llvm.org/D147460
The host registration is a convenient way to get CUDA kernels
running, but it may be slow and does not work for all buffers
(like global constants). This revision uses the proper
alloc/copy/dealloc chains for buffers, using asynchronous chains
to increase overlap. The host registration mechanism is
kept under a flag for the output, just for experimentation
purposes while this project ramps up.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D148682
Currently the compiler calculates the compensation cost for the
extractelement instructions removed during vectorization. But if an
extractelement instruction is used in several nodes, its compensation
may be calculated several times.
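A minimal sketch of the double-counting problem, not the SLP vectorizer's actual code. The names `ExtractId`, `Node`, and `compensationCost` and the flat per-instruction cost are all hypothetical. It shows one way to count each extractelement once even when several nodes use it:

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins: an extractelement is identified by an integer id,
// and a tree node is just the list of extractelements it uses.
using ExtractId = int;
using Node = std::vector<ExtractId>;

// Sums a flat per-instruction compensation, counting each extractelement
// only once even if it appears in several nodes.
int compensationCost(const std::vector<Node> &Nodes, int CostPerExtract) {
  assert(CostPerExtract >= 0 && "sketch assumes a non-negative cost");
  std::unordered_set<ExtractId> Seen;
  int Total = 0;
  for (const Node &N : Nodes)
    for (ExtractId E : N)
      if (Seen.insert(E).second) // true only the first time E is seen
        Total += CostPerExtract;
  return Total;
}
```

With nodes `{1, 2}`, `{2, 3}` and `{1}`, the sketch charges for three instructions rather than five.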
Differential Revision: https://reviews.llvm.org/D148806
Recent commit 8f833f88ab modified the installation rpath and did not set `BUILD_WITH_INSTALL_RPATH` correctly on AIX, which led to installation failures on AIX. This patch sets `BUILD_WITH_INSTALL_RPATH` on AIX to fix the installation failures.
Reviewed By: buttaface, daltenty
Differential Revision: https://reviews.llvm.org/D148866
The current code did not properly account for integer matrices. Check
if the operands are floating point or integer matrices and use FAdd/Add
accordingly.
This is already done for other cases, like multiplies.
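The selection rule can be sketched without any LLVM types. The enums and the function name below are hypothetical and only illustrate choosing the opcode from the element kind. They are not the code in this patch:

```cpp
// Hypothetical stand-ins for the matrix element type and the IR opcode.
enum class ElementKind { Integer, FloatingPoint };
enum class Opcode { Add, FAdd };

// Floating-point matrices use FAdd; integer matrices use Add.
Opcode additionOpcodeFor(ElementKind Kind) {
  return Kind == ElementKind::FloatingPoint ? Opcode::FAdd : Opcode::Add;
}
```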
Fixes #62281.
I received a crash report in DiagnosticManager that was caused by a
nullptr diagnostic having been added. The API allows passing in a null
unique_ptr, but all the methods are written assuming that all pointers
are dereferenceable. This patch makes it impossible to add a null
diagnostic.
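One way to enforce this is to drop null pointers at the point of insertion. The sketch below uses simplified, hypothetical classes (`Diagnostic`, `DiagnosticList`), not LLDB's real DiagnosticManager interface:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a diagnostic.
struct Diagnostic {
  std::string Message;
};

class DiagnosticList {
public:
  // Ignores null pointers so that every stored element is dereferenceable.
  void add(std::unique_ptr<Diagnostic> D) {
    if (D)
      Diags.push_back(std::move(D));
  }

  std::size_t size() const { return Diags.size(); }

  // Safe to dereference unconditionally because add() filters nulls.
  std::string joined() const {
    std::string Out;
    for (const auto &D : Diags) {
      assert(D && "null diagnostics are rejected by add()");
      Out += D->Message + "\n";
    }
    return Out;
  }

private:
  std::vector<std::unique_ptr<Diagnostic>> Diags;
};
```

Filtering in `add()` keeps the invariant in one place, so the reading methods need no null checks of their own.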
rdar://107633615
Differential Revision: https://reviews.llvm.org/D148823
This patch recommits 0827e2fa3fd15b49fd2d0fc676753f11abb60cab after
reverting it in ed7ada259f665a742561b88e9e6c078e9ea85224. Added a
workaround for `TargetLowering::AddrMode` no longer being an aggregate
in C++20.
`AArch64TargetLowering::isLegalAddressingMode` has a number of
defects, including accepting an addressing mode that consists of
only an immediate operand, and not checking the offset range for an
addressing mode of the form `1*ScaledReg + Offs`.
This patch fixes the above issues.
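The two checks can be sketched as follows. This is a heavily simplified illustration, not the real `isLegalAddressingMode`. The `AddrMode` struct here is a hypothetical cut-down version, and the immediate ranges assume the usual AArch64 load/store encodings (a 9-bit signed unscaled offset, or a 12-bit unsigned offset scaled by the access size):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, cut-down addressing mode: BaseReg + Scale*ScaledReg + Offset.
struct AddrMode {
  bool HasBaseReg;
  std::int64_t Scale; // 0 means there is no scaled register
  std::int64_t Offset;
};

// Assumed ranges: [-256, 255] unscaled, or a multiple of the access size
// whose scaled value fits in 12 unsigned bits.
bool isLegalImmOffset(std::int64_t Offset, std::int64_t AccessBytes) {
  if (Offset >= -256 && Offset <= 255)
    return true;
  return Offset >= 0 && Offset % AccessBytes == 0 &&
         Offset / AccessBytes <= 4095;
}

bool isLegalAddrMode(const AddrMode &AM, std::int64_t AccessBytes) {
  assert(AccessBytes > 0 && "access size must be positive");
  bool HasBase = AM.HasBaseReg;
  std::int64_t Scale = AM.Scale;
  // Treat 1*ScaledReg without a base as a plain base register, so that
  // 1*ScaledReg + Offs gets the same offset range check as BaseReg + Offs.
  if (!HasBase && Scale == 1) {
    HasBase = true;
    Scale = 0;
  }
  if (!HasBase)
    return false; // immediate only (or a scaled register with no base)
  if (Scale == 0)
    return isLegalImmOffset(AM.Offset, AccessBytes);
  // Register offset: no immediate, scale of 1 or the access size.
  return AM.Offset == 0 && (Scale == 1 || Scale == AccessBytes);
}
```

For an 8-byte access, the sketch rejects a bare immediate and rejects `1*ScaledReg + 32768`, while accepting `1*ScaledReg + 32760`.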
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D143895
Change-Id: I41a520c13ce21da503ca45019979bfceb8b648fa
The add-return.s test is failing on s390x.
See also e30ce634f75c01cc8784cb0c4972c42987178c1d.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D148807
Moved computeExtractCost to ShuffleCostEstimator class as another step
for unifying actual codegen/cost estimation for buildvectors.
Differential Revision: https://reviews.llvm.org/D148864
D97673 implemented salvaging of dbg.value inside coroutine funclets, but
left the original function untouched. Before, only dbg.addr and dbg.decl
would get salvaged.
D121324 implemented salvaging of dbg.addr and dbg.decl in the original
function as well, but not of dbg.values.
This patch unifies salvaging in the original function and related
funclets, so that all intrinsics are salvaged in all functions. This is
particularly useful for ABIs where the original function is also
rewritten to receive the frame pointer as an argument.
Differential Revision: https://reviews.llvm.org/D148745
The justification in isAddRecNeverPoison() no longer applies, as
it dates back to a time when LLVM had an unconditional forward
progress guarantee. However, we also no longer need it, because we
can exploit branch on poison UB instead.
For a single exit loop (without abnormal exits) we know that all
instructions dominating the exit will be executed, so if any of
them trigger UB on poison that means that addrec is not poison.
This is slightly stronger than the previous code, because a) we
don't need the exit to also be the latch and b) we don't need the
value to be used in the exit branch in particular; any UB-producing
instruction is fine.
I don't expect much practical impact from this change; this is
mainly to clarify the reasoning behind this logic.
Differential Revision: https://reviews.llvm.org/D148633
Use the clang driver on MinGW, where clang-cl is not usable. The MSVC
target still uses clang-cl to minimize changes to existing test runners.
Differential Revision: https://reviews.llvm.org/D147432
Similar to D125411, but for ARM64X.
ARM64X PE binaries are hybrids containing both ARM64EC and pure ARM64
variants in one file. They are usually linked by passing separate
ARM64EC and ARM64 object files to the linker. Linked binaries use the
ARM64 machine type and contain additional CHPE metadata in their load config.
CHPE metadata support is not part of this patch; I plan to send that later.
Using ARM64X as a machine type of object files themselves is somewhat
ambiguous, but such files are allowed by MSVC. It treats them as ARM64
or ARM64EC object, depending on the context. Such objects can be
produced with cvtres.exe -machine:arm64x.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D148517
This is useful for examining ARM64EC static libraries and allows
better llvm-lib testing. Changes to the Archive class will also be
useful for LLD to support ARM64EC, where it will need to use one
map or the other, depending on linking target (or both, in case of
ARM64X, but separately as they are in different namespaces).
Reviewed By: jhenderson, efriedma
Differential Revision: https://reviews.llvm.org/D146534
ARM64EC allows having both pure ARM64 objects and ARM64EC objects in
the same archive. This allows using a single static library for linking
pure ARM64, pure ARM64EC or mixed modules (what MS calls ARM64X:
a single module that may be used in both modes). To achieve that,
such static libraries need two separate symbol maps. The usual map
contains only pure ARM64 symbols, while a new /<ECSYMBOLS>/ section
contains EC symbols. The EC symbol map has a very similar format to the
usual map, except it doesn't contain object offsets and uses offsets
from the regular map instead. This is true even for a pure ARM64EC
static library: it will simply have 0 symbols in the symbol map.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D143541
This reverts commit 67b298f6d82e0b4bb648ac0dabe895e816a77ef1.
We got linker errors with undefined symbols during a compiler release
and tracked it down to this change. I am in the process of understanding
what is happening and getting a reproducer.
Sorry for reverting this again.
I will reopen #61065 until we fix this.
Previously if a register had fields we would always print them after the
value if the register was asked for by name.
```
(lldb) register read MDCR_EL3
MDCR_EL3 = 0x00000000
= {
ETBAD = 0
<...>
RLTE = 0
}
```
This can be quite annoying if there are a whole lot of fields but you
want to see the register in a specific format.
```
(lldb) register read MDCR_EL3 -f i
MDCR_EL3 = 0x00000000 unknown udf #0x0
= {
ETBAD = 0
<...lots of fields...>
```
This pushes the interesting bit far up the terminal. To solve this,
don't print fields if the user passes --format. If they're doing that
then I think it's reasonable to assume they know what they want and only
want to see that output.
This also gives users a way to silence fields without changing the
format, by doing `register read foo -f x`, in case the fields are not
useful or they are trying to work around a crash.
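The printing rule can be sketched as a small function. `formatRegister` and its parameters are hypothetical and only illustrate the decision, not LLDB's actual implementation:

```cpp
#include <string>

// Hypothetical helper: Value is the already formatted "NAME = ..." line and
// Fields is the rendered field block (empty if the register has no fields).
// Fields are appended only when the user did not pass --format.
std::string formatRegister(const std::string &Value, const std::string &Fields,
                           bool FormatGiven) {
  if (FormatGiven || Fields.empty())
    return Value;
  return Value + "\n" + Fields;
}
```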
I have customised the help text for --format for register read to explain this:
```
-f <format> ( --format <format> )
Specify a format to be used for display. If this is set, register fields will not be displayed.
```
Reviewed By: jasonmolenda
Differential Revision: https://reviews.llvm.org/D148790
-Wcast-qual does not trigger on the following code in Clang, but does
in GCC:
    const auto i = 42;
    using T = int*;
    auto p = T(&i);
The expected behavior is that a functional cast should trigger
the warning just as the equivalent C-style cast does, because
the meaning is the same, and nothing about the functional cast
makes it easier to recognize that a const_cast is occurring.
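For comparison, the sketch below puts the functional cast next to the C-style cast and the explicit `const_cast` it stands for. The function names are hypothetical. All three return the same pointer, and only the last one makes the removal of `const` visible at the call site:

```cpp
const int Value = 42;
using T = int *;

// Functional cast: silently drops const, like the C-style cast below.
T viaFunctionalCast() { return T(&Value); }

// C-style cast: the form -Wcast-qual is meant to flag.
T viaCStyleCast() { return (T)&Value; }

// Explicit const_cast: the conversion the two forms above perform implicitly.
T viaConstCast() { return const_cast<T>(&Value); }
```

Reading through the returned pointer is fine, but writing through it would be undefined behavior because `Value` is a const object.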
Fixes https://github.com/llvm/llvm-project/issues/62083
Differential Revision: https://reviews.llvm.org/D148276
Some applications make heavy use of the crc32 operation (e.g., as part
of a hash function), so having a FastISel path avoids fallbacks to
SelectionDAG and improves compile times, in our case by ~1.5%.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D148023
LICM does not use ORE from the pass manager; it constructs its
own instance. As such, explicitly requiring the analysis in the
pipeline is unnecessary.
We don't have `c++` anymore in the Docker image, but the script does
require $CXX to be in the environment so that should always work.
Differential Revision: https://reviews.llvm.org/D148830