In review of bbde6b, I had originally proposed that we support the
legacy text format. As review evolved, it became clear this had been a
bad idea (too much complexity), but in order to let that patch finally
move forward, I approved the change with the variant. This change undoes
the variant, and updates all the tests to just use the array form.
This flag applies to G_PTR_ADD instructions and indicates that the operation
implements an inbounds getelementptr operation, i.e., the pointer operand is in
bounds with respect to the allocated object it is based on, and the arithmetic
does not change that.
It is set when the IRTranslator lowers inbounds GEPs (currently only in some
cases, to be extended with a future PR), and in the
(build|materialize)ObjectPtrOffset functions.
Inbounds information is useful in ISel when we have instructions that perform
address computations whose intermediate steps must be in the same memory region
as the final result. A follow-up patch will start using it for AMDGPU's flat
memory instructions, where the immediate offset must not affect the memory
aperture of the address.
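As a rough illustration of how ISel could consume such a flag, here is a minimal, self-contained sketch. It does not use LLVM's real types: `PtrAddFlag`, `PtrAdd`, and `canFoldIntoImmediate` are hypothetical names, and the rule shown (fold only when the addition is inbounds and the offset fits) is an assumption about the intended use, not the follow-up patch itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for instruction flags; names and values are
// illustrative and are not LLVM's MIFlag definitions.
enum PtrAddFlag : uint8_t {
  NoFlags = 0,
  NoUWrap = 1 << 0,
  InBounds = 1 << 1,
};

// A simplified pointer addition: base + constant offset, plus flags.
struct PtrAdd {
  int64_t Offset;
  uint8_t Flags;
};

// Fold the offset into an instruction's immediate field only if the
// intermediate address is known to stay in the same object (inbounds) and
// the offset fits into the (assumed non-negative) immediate range.
inline bool canFoldIntoImmediate(const PtrAdd &PA, int64_t MaxImm) {
  assert(MaxImm >= 0 && "immediate range must be non-negative");
  bool StaysInObject = (PA.Flags & InBounds) != 0;
  return StaysInObject && PA.Offset >= 0 && PA.Offset <= MaxImm;
}
```

With this model, an addition that only carries a no-unsigned-wrap flag is not folded, since that flag alone says nothing about the memory region of the intermediate address.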
This is analogous to a concurrent effort in SDAG: #131862
(related: #140017, #141725).
For SWDEV-516125.
The "at construction" binop folds in SelectionDAG::getNode() have
different behaviour when compared to the equivalent LLVM IR. This PR
makes the behaviour consistent while also extending the coverage to
include signed/unsigned max/min operations.
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.
This PR also switches call sites that split large memory accesses from
the generic (build|materialize)PtrAdd functions to the new functions.
Since a memory access has to fit into an object in memory, pointer
arithmetic to an offset within a large memory access also yields an
address in that object.
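The argument for splitting can be sketched with plain integers. The following is a simplified model, not the legalizer code: `Part` and `splitAccess` are hypothetical names. Every part starts at an offset smaller than the total access size, so `base + offset` stays within the accessed object.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One piece of a split memory access, relative to the original base pointer.
struct Part {
  uint64_t Offset;
  uint64_t Size;
};

// Split an access of TotalSize bytes into parts of at most PartSize bytes.
// Each Offset is < TotalSize and Offset + Size <= TotalSize, so every part
// lies inside the range covered by the original access.
inline std::vector<Part> splitAccess(uint64_t TotalSize, uint64_t PartSize) {
  assert(PartSize > 0 && "parts must be non-empty");
  std::vector<Part> Parts;
  uint64_t Offset = 0;
  while (Offset < TotalSize) {
    uint64_t Size = std::min(PartSize, TotalSize - Offset);
    Parts.push_back({Offset, Size});
    Offset += Size;
  }
  return Parts;
}
```

For example, a 20-byte access split into 8-byte parts yields parts at offsets 0, 8, and 16, with the last part being 4 bytes long.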
Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.
Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.
For SWDEV-516125.
This is another prune of dead code -- we never generate debug intrinsics
nowadays, therefore there's no need for these codepaths to run.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
This PR exposes the backend pass config to plugins via a callback.
Plugin authors can register a callback that is triggered before
the target backend adds its passes to the pipeline. In the callback
they then get access to the `TargetMachine`, the `PassManager`, and the
`TargetPassConfig`. This allows plugins to call
`TargetPassConfig::insertPass`, which is honored in the subsequent
`addPass` of the main backend. We implemented this using the legacy pass
manager since backends still use it as the default.
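The interaction between the callback and the later `addPass` calls can be modeled as follows. This is a toy sketch under stated assumptions: `PassConfigModel`, `PreCodeGenCallback`, `buildPipeline`, and the pass names are all hypothetical, and passes are identified by name here, whereas the real `TargetPassConfig` works with pass IDs.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of a pass config; not the real
// TargetPassConfig interface.
class PassConfigModel {
  std::vector<std::pair<std::string, std::string>> Insertions;
  std::vector<std::string> Pipeline;

public:
  // Request that NewPass runs right after AfterPass once that pass is added.
  void insertPass(const std::string &AfterPass, const std::string &NewPass) {
    Insertions.emplace_back(AfterPass, NewPass);
  }

  // Add a pass and honor any insertions registered for it.
  void addPass(const std::string &Name) {
    assert(!Name.empty() && "pass needs a name");
    Pipeline.push_back(Name);
    for (const auto &I : Insertions)
      if (I.first == Name)
        Pipeline.push_back(I.second);
  }

  const std::vector<std::string> &pipeline() const { return Pipeline; }
};

using PreCodeGenCallback = std::function<void(PassConfigModel &)>;

// Run the plugin callbacks first, then let the "backend" add its passes.
inline std::vector<std::string>
buildPipeline(const std::vector<PreCodeGenCallback> &Callbacks) {
  PassConfigModel Config;
  for (const auto &CB : Callbacks)
    CB(Config);
  for (const char *Name : {"isel", "regalloc", "asm-printer"})
    Config.addPass(Name);
  return Config.pipeline();
}
```

A callback that calls `insertPass("isel", "my-plugin-pass")` ends up with its pass placed directly after `isel` in the resulting pipeline, because the insertion is recorded before the backend adds its passes.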
This reverts commit 8ac7210b7f0ad49ae7809bf6a9faf2f7433384b0.
This breaks building the AArch64 backend, e.g. see
https://github.com/llvm/llvm-project/pull/144947
Revert to unbreak the build.
Also reverts follow-up commit 1e76f012db3ccfaa05e238812e572b5b6d12c17e.
If a kernel is known to be executing only a single lane, IR
UniformityAnalysis will take note of that (via
GCNTTIImpl::hasBranchDivergence) and report that all values are uniform.
SelectionDAG's built-in divergence tracking should do the same.
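The intended rule can be written down in a few lines. This is only a sketch of the idea with hypothetical names (`KernelModel`, `isDivergent`); it is not the SelectionDAG implementation.

```cpp
#include <cassert>

// Hypothetical model of a kernel: the maximum number of lanes that can be
// active at the same time.
struct KernelModel {
  unsigned MaxLanes;
};

// With a single lane there is nothing to diverge from.
inline bool hasBranchDivergence(const KernelModel &K) {
  assert(K.MaxLanes >= 1 && "a kernel executes at least one lane");
  return K.MaxLanes > 1;
}

// A value can only be divergent if the kernel has branch divergence at all;
// otherwise every value is reported as uniform.
inline bool isDivergent(const KernelModel &K, bool DependsOnLaneId) {
  return hasBranchDivergence(K) && DependsOnLaneId;
}
```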
Add SDPatternMatch matcher and unit test coverage for `ISD::LOAD`
opcode.
This only matches the loaded value i.e. ResNo 0 and not the output
chain.
e.g.
```
m_Load(m_Value(), m_Value(), m_Value())
```
The first value is the input chain, the second is the base pointer, and
the last value is the offset.
`m_Result<N>` matches an SDValue that is the N-th result of the defining
SDNode. This is useful for more fine-grained matching on SDNodes with
multiple results.
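To illustrate what matching on a result number means, here is a toy model that does not use SDPatternMatch. `NodeModel`, `ValueModel`, `LoadOpcode`, and `matchResult` are hypothetical; the only assumption taken from the descriptions above is that a load has the loaded value as result 0 and the output chain as another result.

```cpp
#include <cassert>

// A node with several results, and a value that names one of them.
struct NodeModel {
  int Opcode;
  unsigned NumResults;
};

struct ValueModel {
  const NodeModel *Node;
  unsigned ResNo;
};

constexpr int LoadOpcode = 1; // Illustrative opcode number.

// Match only if the value is the N-th result of a node with the given opcode.
template <unsigned N> bool matchResult(const ValueModel &V, int Opcode) {
  assert(V.Node && V.ResNo < V.Node->NumResults && "invalid value");
  return V.Node->Opcode == Opcode && V.ResNo == N;
}
```

In this model, `matchResult<0>` accepts the loaded value of a load node and rejects its output chain, while `matchResult<1>` does the opposite.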
-----
Inspired by #145481
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/CGData` and
`llvm/CodeGen` libraries. These annotations currently have no meaningful
impact on the LLVM build; however, they are a prerequisite to support an
LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
The bulk of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed by formatting with `git clang-format`.
The following manual adjustments were also applied after running IDS on
Linux:
- Add `LLVM_ABI` to a subset of private class methods and fields that
require export
- Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported
instantiated templates defined via X-macro
- Add `LLVM_ABI_FRIEND` to friend member functions declared with
`LLVM_ABI`
- Explicitly make classes non-copyable where needed due to IDS adding
`LLVM_ABI` at the class level
- Add `#include "llvm/Support/Compiler.h"` to files where it was not
auto-added by IDS due to no pre-existing block of include statements.
- Add `LLVM_ABI` to a small number of symbols that require export but
are not declared in headers
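The class-level annotation and the non-copyable adjustment can be sketched as follows. `DEMO_ABI` and `Counter` are hypothetical stand-ins (the real macro is `LLVM_ABI`, documented in the file linked above), and the sketch assumes GCC, Clang, or MSVC. My understanding is that exporting a whole class on Windows can require its implicitly declared copy operations to be generated, which fails for classes with move-only members, hence the explicit deletion.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// DEMO_ABI is a hypothetical stand-in for an export annotation.
#if defined(_WIN32)
#define DEMO_ABI __declspec(dllexport)
#else
#define DEMO_ABI __attribute__((visibility("default")))
#endif

class DEMO_ABI Counter {
  std::unique_ptr<int> Value; // Move-only member.

public:
  Counter() : Value(new int(0)) {}

  // Spell out that the class is non-copyable, so that exporting the whole
  // class never requires copy operations to be generated.
  Counter(const Counter &) = delete;
  Counter &operator=(const Counter &) = delete;

  int increment() {
    assert(Value && "counter storage must exist");
    return ++*Value;
  }
};

static_assert(!std::is_copy_constructible<Counter>::value,
              "Counter is intentionally non-copyable");
```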
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
This continues s-barannikov's work TableGen-erating SDNode descriptions.
This takes the initial patch from #119709 and moves documentation and the
rest of the AArch64ISD nodes to TableGen. Some issues were found by the
generated SDNode verification added in this patch. These issues have been
described and fixed in the following PRs:
- #140706
- #140711
- #140713
- #140715
---------
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
A few files of llvm dir had duplicate headers included. This patch
removes those redundancies.
---------
Co-authored-by: Akash Agrawal <akashag@qti.qualcomm.com>
This adds a GISelValueTrackingPrinterPass that can print the known bits
and sign bit of each def in a function. It is built on the new pass
manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older
class to GISelValueTrackingAnalysisLegacy.
The first 2 functions from the AArch64GISelMITest are ported over to an
mir test to show it working. It also runs successfully on all files in
llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can
hopefully be used to test GlobalISel known bits analysis more directly
in common cases, without jumping through the hoops that the C++ tests
require.
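For readers unfamiliar with known bits, the following self-contained sketch shows the kind of information such a printer reports. It is an 8-bit toy model with hypothetical names (`KnownBits8`, `knownAnd`, `toString`), and the `0`/`1`/`?` output format is my own choice for illustration, not necessarily what the pass prints.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// For each bit: known to be 0, known to be 1, or unknown (neither mask set).
struct KnownBits8 {
  uint8_t Zero; // Bits known to be 0.
  uint8_t One;  // Bits known to be 1.
};

inline KnownBits8 knownConstant(uint8_t V) {
  return {static_cast<uint8_t>(~V), V};
}

inline KnownBits8 unknownValue() { return {0, 0}; }

// AND: a result bit is 0 if either input bit is 0, and 1 only if both are 1.
inline KnownBits8 knownAnd(KnownBits8 A, KnownBits8 B) {
  return {static_cast<uint8_t>(A.Zero | B.Zero),
          static_cast<uint8_t>(A.One & B.One)};
}

// OR: a result bit is 1 if either input bit is 1, and 0 only if both are 0.
inline KnownBits8 knownOr(KnownBits8 A, KnownBits8 B) {
  return {static_cast<uint8_t>(A.Zero & B.Zero),
          static_cast<uint8_t>(A.One | B.One)};
}

// Print most significant bit first, using '?' for unknown bits.
inline std::string toString(KnownBits8 K) {
  assert((K.Zero & K.One) == 0 && "a bit cannot be both 0 and 1");
  std::string S;
  for (int Bit = 7; Bit >= 0; --Bit) {
    uint8_t Mask = static_cast<uint8_t>(1u << Bit);
    S += (K.Zero & Mask) ? '0' : (K.One & Mask) ? '1' : '?';
  }
  return S;
}
```

Masking an unknown value with the constant `0x0F` gives `0000????` in this model: the upper four bits are known to be zero and the lower four remain unknown.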
This patch fixes:
third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
error: comparison of integers of different signs: 'const unsigned
long' and 'const int' [-Werror,-Wsign-compare]
The current implementation always creates a 1-bit constant for the
result of the `G_ICMP`, which will cause issues if the destination
register size is larger than that. With asserts enabled, it will cause a
crash in `buildConstant`:
```
llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp:322: virtual MachineInstrBuilder llvm::MachineIRBuilder::buildConstant(const DstOp &, const ConstantInt &): Assertion `EltTy.getScalarSizeInBits() == Val.getBitWidth() && "creating constant with the wrong size"' failed.
```
Reordering `OS` and `PassMgrF` should fix the asan failure that's caused
by OS being destroyed before `PassMgrF` deletes the AsmPrinter.
As shown in [this asan run](https://lab.llvm.org/buildbot/#/builders/52/builds/7340/steps/12/logs/stdio)
```
This frame has 15 object(s):
[32, 48) 'PassMgrF' (line 154)
[64, 1112) 'Buf' (line 155)
[1248, 1304) 'OS' (line 156) <== Memory access at offset 1280 is inside this variable
```
which indicates an ordering problem.
This should help to fix all the sanitizer failures caused by the test
`X86MCInstLowerTest.cpp` that's introduced by [this
PR](https://github.com/llvm/llvm-project/pull/133352#issuecomment-2780173791).
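The ordering rule behind the fix is that local variables are destroyed in reverse order of declaration. A minimal sketch, using a hypothetical `Tracker` type in place of the real stream and pass manager:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records its name in a log when it is destroyed.
struct Tracker {
  std::string Name;
  std::vector<std::string> &Log;

  Tracker(std::string N, std::vector<std::string> &L)
      : Name(std::move(N)), Log(L) {}
  ~Tracker() { Log.push_back(Name); }
};

// Declaring OS before PassMgrF means PassMgrF is destroyed first, so anything
// PassMgrF deletes during its destruction can still use OS.
inline std::vector<std::string> destructionOrder() {
  std::vector<std::string> Log;
  {
    Tracker OS("OS", Log);
    Tracker PassMgrF("PassMgrF", Log);
  }
  assert(Log.size() == 2 && "both objects must have been destroyed");
  return Log;
}
```

`destructionOrder()` returns `PassMgrF` followed by `OS`, which is the order the reordering is meant to achieve.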
In `X86MCInstLower::LowerMachineOperand`, a new `MCSymbol` can be
created in `GetSymbolFromOperand(MO)` where `MO.getType()` is
`MachineOperand::MO_ExternalSymbol`
```
case MachineOperand::MO_ExternalSymbol:
return LowerSymbolOperand(MO, GetSymbolFromOperand(MO));
```
at
725a7b664b/llvm/lib/Target/X86/X86MCInstLower.cpp (L196)
However, this newly created symbol will not be marked properly with its
`IsExternal` field since `Ctx.getOrCreateSymbol(Name)` doesn't know if
the newly created `MCSymbol` is for `MachineOperand::MO_ExternalSymbol`.
Looking at other backends, for example, this is what `AArch64MCInstLower`
does to handle `MO_ExternalSymbol`:
14c36db16f/llvm/lib/Target/AArch64/AArch64MCInstLower.cpp (L366-L367)
14c36db16f/llvm/lib/Target/AArch64/AArch64MCInstLower.cpp (L145-L148)
It creates/gets the MCSymbol from `AsmPrinter.OutContext` instead of
from `Ctx`. Moreover, `Ctx` for `AArch64MCInstLower` is the same as
`AsmPrinter.OutContext`.
8e7d6baf0e/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (L100).
This applies to almost all the other backends except X86 and M68k.
```
$git grep "MCInstLowering("
lib/Target/AArch64/AArch64AsmPrinter.cpp:100: : AsmPrinter(TM, std::move(Streamer)), MCInstLowering(OutContext, *this),
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:223: AMDGPUMCInstLower MCInstLowering(OutContext, STI, *this);
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:257: AMDGPUMCInstLower MCInstLowering(OutContext, STI, *this);
lib/Target/AMDGPU/R600MCInstLower.cpp:52: R600MCInstLower MCInstLowering(OutContext, STI, *this);
lib/Target/ARC/ARCAsmPrinter.cpp:41: MCInstLowering(&OutContext, *this) {}
lib/Target/AVR/AVRAsmPrinter.cpp:196: AVRMCInstLower MCInstLowering(OutContext, *this);
lib/Target/BPF/BPFAsmPrinter.cpp:144: BPFMCInstLower MCInstLowering(OutContext, *this);
lib/Target/CSKY/CSKYAsmPrinter.cpp:41: : AsmPrinter(TM, std::move(Streamer)), MCInstLowering(OutContext, *this) {}
lib/Target/Lanai/LanaiAsmPrinter.cpp:147: LanaiMCInstLower MCInstLowering(OutContext, *this);
lib/Target/Lanai/LanaiAsmPrinter.cpp:184: LanaiMCInstLower MCInstLowering(OutContext, *this);
lib/Target/MSP430/MSP430AsmPrinter.cpp:149: MSP430MCInstLower MCInstLowering(OutContext, *this);
lib/Target/Mips/MipsAsmPrinter.h:126: : AsmPrinter(TM, std::move(Streamer)), MCInstLowering(*this) {}
lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:695: WebAssemblyMCInstLower MCInstLowering(OutContext, *this);
lib/Target/X86/X86MCInstLower.cpp:2200: X86MCInstLower MCInstLowering(*MF, *this);
```
This patch makes `X86MCInstLower` and `M68kMCInstLower` take their
`Ctx` from `AsmPrinter.OutContext` instead of from
`MF.getContext()`, to be consistent with all the other backends.
I think that in the normal use case (probably anything other than our
unconventional case), a single LLVM module is handled all the way
through the codegen pipeline until the end of code emission (AsmPrinter),
so `AsmPrinter.OutContext` is the same as the MachineFunction's MCContext
and this change is NFC.
----
This fixes an error while running the generated code in ORC JIT for our
use case with
[MCLinker](https://youtu.be/yuSBEXkjfEA?si=HjgjkxJ9hLfnSvBj&t=813) (see
more details below):
https://github.com/llvm/llvm-project/pull/133291#issuecomment-2759200983
We (Mojo) are trying to do MC-level linking: we break an LLVM
module into multiple submodules to compile and codegen in parallel
(technically into *.o files with a symbol linkage type change), but
instead of archiving all of them into one `.a` file, we want to fix the
symbol linkage type and still produce one *.o file. The parallel codegen
pipeline generates the codegen data structures in their own `MCContext`
(which is `Ctx` here). So if functions `f` and `g` get split into
different submodules, they will have different `Ctx`. And when we try to
create an external symbol with the same name for each of them with
`Ctx.getOrCreateSymbol(SymName)`, we will get two different `MCSymbol*`
because `f` and `g`'s `MCContext` are different and they can't see each
other. This is unfortunately not what we want for external symbols.
Using `AsmPrinter.OutContext` helps: since it is shared, if we get or
create the `MCSymbol` there, we'll be able to deduplicate.
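The deduplication issue can be reproduced with a toy context. This is a simplified model, not `MCContext`: `SymbolModel` and `ContextModel` are hypothetical, and only the get-or-create behavior is modeled.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct SymbolModel {
  std::string Name;
};

// Each context owns its symbols; lookups never see another context's table.
class ContextModel {
  std::map<std::string, std::unique_ptr<SymbolModel>> Symbols;

public:
  SymbolModel *getOrCreateSymbol(const std::string &Name) {
    assert(!Name.empty() && "symbol needs a name");
    std::unique_ptr<SymbolModel> &Slot = Symbols[Name];
    if (!Slot)
      Slot.reset(new SymbolModel{Name});
    return Slot.get();
  }
};
```

Two per-submodule contexts hand out two different symbol objects for the same name, while repeated lookups in one shared context return the same pointer, which is the property needed for external symbols.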
Fixes https://github.com/llvm/llvm-project/issues/118847
Implements matchers for reassociatable opcodes as well as helpers for
commonly used reassociatable binary matchers.
---------
Co-authored-by: Min-Yih Hsu <min@myhsu.dev>
This adds a Flags parameter to BinaryOp_match, allowing it to detect
different flags like Disjoint. A m_GDisjointOr is added to detect ORs
with the disjoint flag, and m_GAddLike then matches either a m_GAdd or a
m_GDisjointOr.
The rest is trying to allow matching `const MachineInstr&`, as opposed
to non-const references.
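The flag-aware matching can be sketched without MIPatternMatch. All names below (`Inst`, `matchBinaryOp`, `matchDisjointOr`, `matchAddLike`) are hypothetical; the sketch only mirrors the described behavior: an opcode match that optionally requires flags, and an add-like match built from it, both taking the instruction by const reference.

```cpp
#include <cassert>

enum Opcode { OpAdd, OpOr };
enum InstFlag : unsigned { FlagNone = 0, FlagDisjoint = 1u << 0 };

struct Inst {
  Opcode Op;
  unsigned Flags;
};

// Match an opcode and require that all requested flags are set.
inline bool matchBinaryOp(const Inst &I, Opcode Op,
                          unsigned RequiredFlags = FlagNone) {
  assert((RequiredFlags & ~static_cast<unsigned>(FlagDisjoint)) == 0 &&
         "unknown flag requested");
  return I.Op == Op && (I.Flags & RequiredFlags) == RequiredFlags;
}

inline bool matchDisjointOr(const Inst &I) {
  return matchBinaryOp(I, OpOr, FlagDisjoint);
}

// An OR whose operands share no set bits computes the same value as an ADD.
inline bool matchAddLike(const Inst &I) {
  return matchBinaryOp(I, OpAdd) || matchDisjointOr(I);
}
```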
The standard libcalls for half to float and float to half conversion are
__extendhfsf2 and __truncsfhf2. However, LLVM currently uses
__gnu_h2f_ieee and __gnu_f2h_ieee instead. As far as I can tell, these
libcalls are an ARM-ism and only provided by libgcc on that platform.
compiler-rt always provides both libcalls.
Use the standard libcalls by default, and only use the __gnu libcalls on
ARM.
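The selection described above amounts to the following. This is a sketch only: `HalfConvLibcalls` and `halfConversionLibcalls` are hypothetical names, and the single `IsARM` boolean is a simplification of however the target is actually identified.

```cpp
#include <cassert>
#include <string>

struct HalfConvLibcalls {
  std::string Extend; // half -> float
  std::string Trunc;  // float -> half
};

// Use the standard names by default and the __gnu names only on ARM.
inline HalfConvLibcalls halfConversionLibcalls(bool IsARM) {
  HalfConvLibcalls Names =
      IsARM ? HalfConvLibcalls{"__gnu_h2f_ieee", "__gnu_f2h_ieee"}
            : HalfConvLibcalls{"__extendhfsf2", "__truncsfhf2"};
  assert(!Names.Extend.empty() && !Names.Trunc.empty());
  return Names;
}
```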
This patch attempts to reland
https://github.com/llvm/llvm-project/pull/120780 while addressing the
issues that caused the patch to be reverted.
Namely:
1. The patch had included code from the llvm/Passes directory in the
llvm/CodeGen directory.
2. The patch increased the backend compile time by 2% due to adding a
very expensive include in MachineFunctionPass.h
The patch has been re-structured so that there is no dependency between
the llvm/Passes and llvm/CodeGen directory, by moving the base class,
`class DroppedVariableStats` to the llvm/IR directory.
The expensive include in MachineFunctionPass.h has been replaced with
forward declarations. The other header includes were pulling a ton of
code into MachineFunctionPass.h, so this should resolve any issues when
it comes to the compile time increase.
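The forward-declaration technique can be shown in a single file. The names `StatsModel` and `FunctionPassModel` are hypothetical; the "header" and "source" parts are marked in comments since the sketch is not split across files.

```cpp
#include <cassert>

// "Header" part: a forward declaration is enough, because only a pointer to
// the type is stored and no member of it is accessed here.
class StatsModel;

class FunctionPassModel {
  StatsModel *Stats = nullptr;

public:
  void setStats(StatsModel *S) {
    assert(S && "stats must not be null");
    Stats = S;
  }
  bool hasStats() const { return Stats != nullptr; }
  int droppedCount() const; // Defined where StatsModel is complete.
};

// "Source file" part: the full definition is only needed here.
class StatsModel {
public:
  int Dropped = 0;
};

inline int FunctionPassModel::droppedCount() const {
  return Stats ? Stats->Dropped : 0;
}
```

Files that include only the "header" part no longer pay for parsing the full definition of the forward-declared class.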