We introduced VariantKinds after MCSymbolRefExpr::VariantKind and then
deprecated the VariantKind naming in favor of AtSpecifier (#133214).
Rename the function and type to use the recommended convention.
This change is a step towards fixing one long-standing problem with
LLVM's debug WASM codegen: excessive use of locals. One local for each
temporary value in IR (roughly speaking).
This has a lot of problems:
1) It makes it easy to hit engine limitations of 50K locals with certain
code patterns and large functions.
2) It makes for larger binaries that are slower to load and slower to
compile to native code.
3) It makes certain compilation strategies (spill all WASM locals to
stack, for example) for debug code excessively expensive and makes debug
WASM code either run very slow, or be less debuggable.
4) It slows down LLVM itself.
This change addresses these partially by running a limited version of
the stackification pass for unoptimized code, one that gets rid of the
most 'obviously' unnecessary locals.
Care needs to be taken to not impact LLVM's ability to produce high
quality debug variable locations with this pass. To that end:
1) We only allow stackification when it doesn't require moving any
instructions.
2) We disable stackification of any locals that are used in
DEBUG_VALUEs, or as a frame base.
I have verified on a moderately large example that the baseline and the
diff produce the same kinds (local/global/stack) of locations, and the
only differences are due to the shifting of instruction offsets, with
many local.[get|set]s not being present anymore.
Even with this quite conservative approach, the results are pretty good:
1) 30% reduction in raw code size, up to 10x reduction in the number of
locals for select large methods (~1000 => ~100).
2) ~10% reduction in instructions retired for an "llc -O0" run on a
moderately sized input.
Construct RuntimeLibcallsInfo instead of manually creating a map.
This was repeating the setting of the RETURN_ADDRESS. This removes
an obstacle to generating libcall information with tablegen.
This is also not great, since it's setting a static map which
would be broken if there were ever a triple with a different libcall
configuration.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
A sub-set of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.
The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementation is required here
because they do not `#include "llvm/Support/TargetSelect.h"`, which
contains the declarations for this functions and was already updated
with `LLVM_ABI` in a previous patch. I considered patching these files
with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a bunch of preprocessor x-macro
stuff in it I was concerned it would unnecessarily impact compile times.
In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
The refactored code accidentally tokenized a string instead of just
concatenating it.
Add a regression test and some assertions to ensure consistency.
Fixes#143890 .
This flag was used to let us incrementally introduce debug records
into LLVM, however everything is now using records. It serves no
purpose now, so delete it.
- try_emplace(Key) is shorter than insert(std::make_pair(Key, 0)).
- try_emplace performs value initialization without value parameters.
- We overwrite values on successful insertion anyway.
Remove the MCSubtargetInfo argument from applyFixup, introduced in
https://reviews.llvm.org/D45962 , as it's only required by ARM. Instead,
add const MCFragment & so that ARMAsmBackend can retrieve
MCSubtargetInfo via a static member function.
Additionally, remove the MCAssembler argument, which is also only
required by ARM.
Additionally, make applyReloc non-const. Its arguments now fully cover
addReloc's functionality.
Register assembly printer passes in the pass registry.
This makes it possible to use `llc -start-before=<target>-asm-printer ...` in tests.
Adds a `char &ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.
Previous logic did not handle the case where the result bit size was
between 32 and 64 bits inclusive. I updated the if-statements for more
precise handling.
An alternative solution would have been to abort FastISel in case the
result type is not legal for FastISel.
Resolves: #64222.
This PR began as an investigation into the root cause of
https://github.com/ziglang/zig/issues/20966.
Godbolt link showing incorrect codegen on 20.1.0:
https://godbolt.org/z/cEr4vY7d4.
The "using" decl seems to be unused since it was first introduced in:
commit a8e1135baa9074f7c088c8e1999561f88699b56e
Author: Heejin Ahn <aheejin@gmail.com>
Date: Thu Jan 9 22:36:10 2025 -0800
If the SDNode is used it can pick up the wrong results number, for
example looking at the known bits of the first result where it should be
looking at the second. The SDValue is already present as the
SelectCodeCommon checks move from parent to child, pass the SDValue
through to CheckNodePredicate as Op so that it can use it if necessary.
SDNode *N is still generated, keeping most PatFrags the same.
Fixes#137274
This makes them more consistent with the checks performed by regular loads. We can't simply add IsNonExtLoad to the existing atomic_load_8/16/32/64 as that would affect out of tree targets.
Replace "concept based polymorphism" with simpler PImpl idiom.
This pursues two goals:
* Enforce static type checking. Previously, target implementations hid
base class methods and type checking was impossible. Now that they
override the methods, the compiler will complain on mismatched
signatures.
* Make the code easier to navigate. Previously, if you asked your
favorite LSP server to show a method (e.g. `getInstructionCost()`), it
would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`,
`TTIImplBase`, and target overrides. Now it is two less :)
There are three commits to hopefully simplify the review.
The first commit removes `TTI::Model`. This is done by deriving
`TargetTransformInfoImplBase` from `TTI::Concept`. This is possible
because they implement the same set of interfaces with identical
signatures.
The first commit makes `TargetTransformImplBase` polymorphic, which
means all derived classes should `override` its methods. This is done in
second commit to make the first one smaller. It appeared infeasible to
extract this into a separate PR because the first commit landed
separately would result in tons of `-Woverloaded-virtual` warnings (and
break `-Werror` builds).
The third commit eliminates `TTI::Concept` by merging it with the only
derived class `TargetTransformImplBase`. This commit could be extracted
into a separate PR, but it touches the same lines in
`TargetTransformInfoImpl.h` (removes `override` added by the second
commit and adds `virtual`), so I thought it may make sense to land these
two commits together.
Pull Request: https://github.com/llvm/llvm-project/pull/136674
isAnyExtLoad/isZExtLoad/isSignExtLoad are able to emit predicate checks
from tablegen now so we should use them.
The next step would be to add isNonExtLoad versions and migrate all
remaining uses of atomic_load_8/16/32/64 to that.
These are not diagnosed because implementations hide the methods of the base class rather than overriding them.
This works as long as a hiding function is callable with the same arguments as the same function from the base class.
Pull Request: https://github.com/llvm/llvm-project/pull/136655
Making `TargetTransformInfo::Model::Impl` `const` makes sure all
interface methods are `const`, in `BasicTTIImpl`, its bases, and in all
derived classes.
Pull Request: https://github.com/llvm/llvm-project/pull/136598
The main change is making `thisT` method `const`, the rest of the
changes is fixing compilation errors (*).
(*) There are two tricky methods, `getVectorInstrCost()` and
`getIntImmCost()`.
They have several overloads; some of these overloads are typically
pulled in to derived classes using the `using` directive, and then
hidden by methods in the derived class.
The compiler does not complain if the hiding methods are not marked as
`const`, which means that clients will use the methods from the base
class. If after this change your target fails cost model tests, this
must be the reason. To resolve the issue you need to make all hiding
overloads `const`. See the second commit in this PR.
Pull Request: https://github.com/llvm/llvm-project/pull/136575
We will increase the use of raw relocation types and eliminate fixup
kinds that correspond to relocation types. The getFixupKindInfo
functions will return an rvalue instead. Let's update the return type
from a const reference to a value type.
- Use range for loop to iterate over instructions.
- Emit generated code in anonymous namespace instead of `llvm` and
reduce the scope of this to just the type declarations.
- Emit generated tables as static constexpr
- Replace code to search in operand table with `std::search`.
- Skip the last "null" entry in PrefixTable and use range for loop to
search PrefixTable in the .cpp code.
- Do not generate `WebAssemblyInstructionTableSize` definition as its
already defined in the .cpp file.
- Remove {} for single statement loop/if/else bodies.
Move wasm-specific members outside of MCSymbolRefExpr::VariantKind (a
legacy interface I am eliminating). Most changes are mechanic and
similar to what I've done for many ELF targets (e.g. X86 #132149)
Notes:
* `fixSymbolsInTLSFixups` is replaced with `setTLS` in
`WebAssemblyWasmObjectWriter::getRelocType`, similar to what I've done
for many ELF targets.
* `SymA->setUsedInGOT()` in `recordRelocation` is moved to
`getRelocType`.
While here, rename "Modifier' to "Specifier":
> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.
Pull Request: https://github.com/llvm/llvm-project/pull/133116
The WebAssemblyLowerRefTypesIntPtrConv pass currently uses `undef` to
represent trap instructions. These can instead be represented by the
`poison` value.
Commit 52eb11f925ddeba4e1b3840fd636ee87387f3ada temporarily introduced
getSymSpecifier to prepare for "MCValue: Replace MCSymbolRefExpr members
with MCSymbol" (d5893fc2a7e1191afdb4940469ec9371a319b114). The
refactoring is now complete.
Some targets encode the relocation specifier within SymA using
MCSymbolRefExpr::SubclassData. They will cast the specifier
to *MCExpr::Specifier.
Migrate away from the confusing MCSymbolRefExpr::VariantKind.
Note: getAccessVariant is a deprecated method to get the relocation
specifier.
This commit is the result of investigation and discussion on
WebAssembly/wide-arithmetic#6 where alternatives to the `i64.add128`
instruction were discussed but ultimately deferred to a future proposal.
In spite of this though I wanted to apply a few changes to the LLVM
backend here with `wide-arithmetic` enabled for a few minor changes:
* A lowering for the `ISD::UADDO` node is added which uses `add128`
where the upper bits of the two operands are constant zeros and the
result of the 128-bit addition is the result of the overflowing
addition.
* The high bits of a `I64_ADD128` node are now flagged as "known zero"
if the upper bits of the inputs are also zero, assisting this `UADDO`
lowering to ensure the backend knows that the carry result is a 1-bit
result.
A few tests were then added to showcase various lowerings for various
operations that can be done with wide-arithmetic. They don't all
optimize super well at this time but I wanted to add them as a reference
here regardless to have them on-hand for future evaluations if
necessary.
Unlike in Itanium EH IR, WinEH IR's unwinding instructions (e.g.
`invoke`s) can have multiple possible unwind destinations.
For example:
```ll
entry:
invoke void @foo()
to label %cont unwind label %catch.dispatch
catch.dispatch: ; preds = %entry
%0 = catchswitch within none [label %catch.start] unwind label %terminate
catch.start: ; preds = %catch.dispatch
%1 = catchpad within %0 [ptr null]
...
terminate: ; preds = %catch.dispatch
%2 = catchpad within none []
...
...
```
In this case, if an exception is not caught by `catch.dispatch` (and
thus `catch.start`), it should next unwind to `terminate`.
`findUnwindDestination` in ISel gathers the list of this unwind
destinations traversing the unwind edges:
ae42f07103/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (L2089-L2150)
But we don't use that, and instead use our custom
`findWasmUnwindDestinations` that only adds the first unwind
destination, `catch.start`, to the successor list of `entry`, and not
`terminate`:
ae42f07103/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (L2037-L2087)
The reason behind it was, as described in the comment block in the code,
it was assumed that there always would be an `invoke` that connects
`catch.start` and `terminate`. In case of `catch (type)`, there will be
`call void @llvm.wasm.rethrow()` in `catch.start`'s predecessor that
unwinds to the next destination. For example:
0db702ac8e/llvm/test/CodeGen/WebAssembly/exception.ll (L429-L430)
In case of `catch (...)`, `__cxa_end_catch` can throw, so it becomes an
`invoke` that unwinds to the next destination. For example:
0db702ac8e/llvm/test/CodeGen/WebAssembly/exception.ll (L537-L538)
So the unwind ordering relationship between `catch.start` and
`terminate` here would be preserved.
But turns out this assumption does not always hold. For example:
```ll
entry:
invoke void @foo()
to label %cont unwind label %catch.dispatch
catch.dispatch: ; preds = %entry
%0 = catchswitch within none [label %catch.start] unwind label %terminate
catch.start: ; preds = %catch.dispatch
%1 = catchpad within %0 [ptr null]
...
call void @_ZSt9terminatev()
unreachable
terminate: ; preds = %catch.dispatch
%2 = catchpad within none []
call void @_ZSt9terminatev()
unreachable
...
```
In this case there is no `invoke` that connects `catch.start` to
`terminate`. So after `catch.dispatch` BB is removed in ISel,
`terminate` is considered unreachable and incorrectly removed in DCE.
This makes Wasm just use the general `findUnwindDestination`. In that
case `entry`'s successor is going to be [`catch.start`, `terminate`]. We
can get the first unwind destination by just traversing the list from
the front.
---
This required another change in WinEHPrepare. WinEHPrepare demotes all
PHIs in EH pads because they are funclets in Windows and funclets can't
have PHIs. When used in Wasm they are not funclets so we don't need to
do that wholesale but we still need to demote PHIs in `catchswitch` BBs
because they are deleted during ISel. (So we created
[`-demote-catchswitch-only`](a5588b6d20/llvm/lib/CodeGen/WinEHPrepare.cpp (L57-L59))
option for that)
But turns out we need to remove PHIs that have a `catchswitch` BB as an
incoming block too:
```ll
...
catch.dispatch:
%0 = catchswitch within none [label %catch.start] unwind label %terminate
catch.start:
...
somebb:
...
ehcleanup ; preds = %catch.dispatch, %somebb
%1 = phi i32 [ 10, %catch.dispatch ], [ 20, %somebb ]
...
```
In this case the `phi` in `ehcleanup` BB should be demoted too because
`catch.dispatch` BB will be removed in ISel so one if its incoming block
will be gone. This pattern didn't manifest before presumably due to how
`findWasmUnwindDestinations` worked. (In this example, in our
`findWasmUnwindDestinations`, `catch.dispatch` would have had only one
successor, `catch.start`. But now `catch.dispatch` has both
`catch.start` and `ehcleanup` as successors, revealing this bug. This
case is
[represented](ab87206c4b/llvm/test/CodeGen/WebAssembly/exception.ll (L445))
by `rethrow_terminator` function in `exception.ll` (or
`exception-legacy.ll`) and without the WinEHPrepare fix it will crash.
---
Discovered by the reproducer provided in #126916, even though the bug
reported there was not this one.