241 Commits

Author SHA1 Message Date
Benjamin Maxwell
81c06d198e
Reland "[AArch64][SME] Port all SME routines to RuntimeLibcalls" (#153417)
This updates everywhere we emit/check an SME routines to use
RuntimeLibcalls to get the function name and calling convention.
2025-08-18 14:53:40 +01:00
Nikita Popov
48beed5b71
Revert "[AArch64][SME] Port all SME routines to RuntimeLibcalls" (#153392)
This introduced a 5% compile-time regression on AArch64, see
https://llvm-compile-time-tracker.com/compare.php?from=b9138bde3562de5c28a239dbd303caf2406678c6&to=271688b87abe7cf45aceaff8266270a25eb7b436&stat=instructions:u.

Reverts llvm/llvm-project#152505.
2025-08-13 11:54:39 +00:00
Benjamin Maxwell
271688b87a
[AArch64][SME] Port all SME routines to RuntimeLibcalls (#152505)
This updates everywhere we emit/check an SME routines to use
RuntimeLibcalls to get the function name and calling convention.

Note: RuntimeLibcallEmitter had some issues with emitting non-unique
variable names for sets of libcalls, so I tweaked the output to avoid
the need for variables.
2025-08-13 08:48:59 +01:00
David Stuttard
c7c0229480
Revert "[AMDGPU] SelectionDAG divergence tracking should take into account Target divergency. (#147560)" (#152548)
This reverts commit 9293b65a616b8de432a654d046e802540b146372.
2025-08-08 09:05:59 +01:00
Justin Bogner
3f066f5fcf
[HLSL][DirectX] Extract HLSLBinding out of DXILResource. NFC (#150633)
We extract the binding logic out of the DXILResource analysis passes into the
FrontendHLSL library. This will allow us to use this logic for resource and
root signature bindings in both the DirectX backend and the HLSL frontend.
2025-07-31 08:35:47 -07:00
alex-t
9293b65a61
[AMDGPU] SelectionDAG divergence tracking should take into account Target divergency. (#147560)
This is the next attempt to upstream this:
https://github.com/llvm/llvm-project/pull/144947
The las one caused build errors in AArch64.
Issue was resolved.
2025-07-09 00:06:58 +02:00
Craig Topper
3c13257f32
[RISCV] Rename XTHeadVdot instructions to match their mnemonic. NFC (#146953)
We were using the extension name as a prefix rather than TH_.
2025-07-03 13:43:34 -07:00
Craig Topper
d0d84c4150
[RISCV] Add SF_ to SiFive instructions in RISCVGenInstrInfo.inc. NFC (#146939) 2025-07-03 13:38:27 -07:00
Diana Picus
a201f8872a
[AMDGPU] Replace dynamic VGPR feature with attribute (#133444)
Use a function attribute (amdgpu-dynamic-vgpr) instead of a subtarget
feature, as requested in #130030.
2025-06-24 11:09:36 +02:00
Andrew Rogers
19658d1474
[llvm] annotate interfaces in llvm/Target for DLL export (#143615)
## Purpose

This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.

## Background

This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

A sub-set of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.

The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementation is required here
because they do not `#include "llvm/Support/TargetSelect.h"`, which
contains the declarations for this functions and was already updated
with `LLVM_ABI` in a previous patch. I considered patching these files
with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a bunch of preprocessor x-macro
stuff in it I was concerned it would unnecessarily impact compile times.

In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.

## Validation

Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-06-17 13:28:45 -07:00
Helena Kotas
f9ae8aaff2
[DirectX] Detect resources with identical overlapping binding (#140645)
This change uses resource name during DXIL resource binding analysis to detect when two (or more) resources have identical overlapping binding.

The DXIL resource analysis just detects that there is a problem with the binding and sets the `hasOverlappingBinding` flag. Full error reporting will happen later in DXILPostOptimizationValidation pass (llvm/llvm-project#110723).
2025-05-28 14:07:51 -07:00
Benjamin Maxwell
6a477f6577
[AArch64] TableGen-erate SDNode descriptions (#140472)
This continues s-barannikov's work TableGen-erating SDNode descriptions. 
This takes the initial patch from #119709 and moves documentation and the
rest of the AArch64ISD nodes to TableGen. Some issues were found by the
generated SDNode verification added in this patch. These issues have been 
described and fixed in the following PRs:

- #140706 
- #140711 
- #140713 
- #140715

---------

Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-05-28 12:02:58 +01:00
Helena Kotas
27675ccdd6
[DirectX] Add resource name argument to llvm.dx.handlefrom[implicit]binding intrinsics (#139991)
Adds resource name argument to `llvm.dx.handlefrombinding` and `llvm.dx.handlefromimplicitbinding` intrinsics.
SPIR-V currently does not seem to need the resource names so this change only affects DirectX binding intrinsics.

Part 2/4 of https://github.com/llvm/llvm-project/issues/105059
2025-05-27 22:57:01 -07:00
Rahul Joshi
52c2e45c11
[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101) 2025-05-23 08:30:29 -07:00
Benjamin Maxwell
647db1b02d
Reland "[AArch64][SME] Split SMECallAttrs out of SMEAttrs" (#138671)
SMECallAttrs is a new helper class that holds all the SMEAttrs for a
call. The interfaces to query actions needed for the call (e.g. change
streaming mode) have been moved to the SMECallAttrs class.

The main motivation for this change is to make the split between the
caller, callee, and callsite attributes more apparent.

Before this change, we would always merge callsite and callee
attributes. The main reason to do this was to handle indirect calls,
however, we also occasionally used callsite attributes on direct calls
in tests (mainly to avoid creating multiple function declarations). With
this patch, we now explicitly handle indirect calls and disallow
incompatible attributes on direct calls (so this patch is not entirely
an NFC).

Same as #137239, but with a change to avoid inferring SME attributes for
function definitions. This allows stubbing the SME ABI routines in C/C++
(and matches the old behaviour).
2025-05-15 08:37:08 +01:00
Helena Kotas
c66f401e1e
[DirectX] Implement DXILResourceBindingAnalysis (#137258)
`DXILResourceBindingAnalysis` analyses explicit resource bindings in the
module and puts together lists of used virtual register spaces and
available virtual register slot ranges for each binding type. It also
stores additional information found during the analysis such as whether
the module uses implicit bindings or if any of the bindings overlap.

This information will be used in `DXILResourceImplicitBindings` pass
(coming soon) to assign register slots to resources with implicit
bindings, and in a post-optimization validation pass that will raise
diagnostic about overlapping bindings.

Part 1/2 of #136786
2025-05-09 10:42:31 -07:00
Benjamin Maxwell
703b479f16
Revert "[AArch64][SME] Split SMECallAttrs out of SMEAttrs" (#138664)
Reverts llvm/llvm-project#137239

This broke implementing SME ABI routines in C/C++ (used for some stubs),
see: https://lab.llvm.org/buildbot/#/builders/94/builds/6859
2025-05-06 10:28:13 +01:00
Benjamin Maxwell
cadf652857
[AArch64][SME] Split SMECallAttrs out of SMEAttrs (#137239)
SMECallAttrs is a new helper class that holds all the SMEAttrs for a
call. The interfaces to query actions needed for the call (e.g. change
streaming mode) have been moved to the SMECallAttrs class.

The main motivation for this change is to make the split between the
caller, callee, and callsite attributes more apparent.

Before this change, we would always merge callsite and callee
attributes. The main reason to do this was to handle indirect calls,
however, we also occasionally used callsite attributes on direct calls
in tests (mainly to avoid creating multiple function declarations). With
this patch, we now explicitly handle indirect calls and disallow
incompatible attributes on direct calls (so this patch is not entirely
an NFC).
2025-05-06 09:36:26 +01:00
Kazu Hirata
b4fac94181
[llvm] Remove unused using decls (NFC) (#138386) 2025-05-03 07:05:02 -07:00
Benjamin Maxwell
8c7a2ce01a
[AArch64][SME] Allow spills of ZT0 around SME ABI routines again (#136726)
In #132722 spills of ZT0 were disabled around all SME ABI routines to
avoid a case where ZT0 is spilled before ZA is enabled (resulting in a
crash).

It turns out that the ABI does not promise that routines will preserve
ZT0 (however in practice they do), so generally disabling ZT0 spills for
ABI routines is not correct.

The case where a crash was possible was "aarch64_new_zt0" functions with
ZA disabled on entry and a ZT0 spill around __arm_tpidr2_save. In this
case, ZT0 will be undefined at the call to __arm_tpidr2_save, so this
patch avoids the ZT0 spill by marking the callsite with
"aarch64_zt0_undef". This attribute only applies to callsites and marks
that at the point the call is made ZT0 is not defined, so does not need
preserving.
2025-04-25 13:33:09 +01:00
Diana Picus
5bad5d84a1
Reland [AMDGPU] Support block load/store for CSR #130013 (#137169)
Add support for using the existing SCRATCH_STORE_BLOCK and
SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved
VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It
does not include WWM registers - those will be saved and restored
individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because
the memory is not compacted if certain registers don't have to be
transferred (this will happen in practice for calling conventions where
the callee and caller saved registers are interleaved in groups of 8).
However, if the registers at the end of the block of 32 don't have to be
transferred, we don't need to use a whole 128-byte stack slot - we can
trim some space off the end of the range.

In order to implement this feature, we need to rely less on the
target-independent code in the PrologEpilogInserter, so we override
several new methods in SIFrameLowering. We also add new pseudos,
SI_BLOCK_SPILL_V1024_SAVE/RESTORE.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the
SCRATCH_LOAD_BLOCK instructions will have all the registers that are not
transferred added as implicit uses. This is done in order to inform
LiveRegUnits that those registers are not available before the restore
(since we're not really restoring them - so we can't afford to scavenge
them). Unfortunately, this trick doesn't work with the save, so before
the save all the registers in the block will be unavailable (see the
unit test).

This was reverted due to failures in the builds with expensive checks
on, now fixed by always updating LiveIntervals and SlotIndexes in
SILowerSGPRSpills.
2025-04-25 11:29:27 +02:00
Ashley Coleman
f12fb2ff74
[HLSL] Analyze updateCounter usage (#135669)
Fixes https://github.com/llvm/llvm-project/issues/135667

Analyze and annotate `ResourceInfo` with the derived direction of calls
to updateCounter (if any).

This change only sets the value. Any diagnostics that should be raised
must be done somewhere else.
2025-04-24 13:17:24 -06:00
Diana Picus
6bb2f90557
Revert "[AMDGPU] Support block load/store for CSR" (#136846)
Reverts llvm/llvm-project#130013 due to failures with expensive checks
on.
2025-04-23 14:01:00 +02:00
Diana Picus
4a58071d87
[AMDGPU] Support block load/store for CSR (#130013)
Add support for using the existing `SCRATCH_STORE_BLOCK` and
`SCRATCH_LOAD_BLOCK` instructions for saving and restoring callee-saved
VGPRs. This is controlled by a new subtarget feature, `block-vgpr-csr`.
It does not include WWM registers - those will be saved and restored
individually, just like before. This patch does not change the ABI.

Use of this feature may lead to slightly increased stack usage, because
the memory is not compacted if certain registers don't have to be
transferred (this will happen in practice for calling conventions where
the callee and caller saved registers are interleaved in groups of 8).
However, if the registers at the end of the block of 32 don't have to be
transferred, we don't need to use a whole 128-byte stack slot - we can
trim some space off the end of the range.

In order to implement this feature, we need to rely less on the
target-independent code in the PrologEpilogInserter, so we override
several new methods in `SIFrameLowering`. We also add new pseudos,
`SI_BLOCK_SPILL_V1024_SAVE/RESTORE`.

One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the
SCRATCH_LOAD_BLOCK instructions will have all the registers that are not
transferred added as implicit uses. This is done in order to inform
LiveRegUnits that those registers are not available before the restore
(since we're not really restoring them - so we can't afford to scavenge
them). Unfortunately, this trick doesn't work with the save, so before
the save all the registers in the block will be unavailable (see the
unit test).
2025-04-23 10:33:36 +02:00
Philip Reames
f2ecd86e34
[Analysis] Remove implicit LocationSize conversion from uint64_t (#133342)
This change removes the uint64_t constructor on LocationSize
preventing implicit conversion, and fixes up the using APIs to adapt to
the change. Note that I'm adding a couple of explicit conversion points
on routines where passing in a fixed offset as an integer seems likely
to have well understood semantics.

We had an unfortunate case which arose if you tried to pass a TypeSize
value to a parameter of LocationSize type. We'd find the implicit
conversion path through TypeSize -> uint64_t -> LocationSize which works
just fine for fixed values, but looses information and fails assertions
if the TypeSize was scalable. This change breaks the first link in that
implicit conversion chain since that seemed to be the easier one.
2025-04-18 07:46:31 -07:00
Ashley Coleman
e3369a8dc9
[NFC][HLSL] Rename ResourceBinding Types (#134165)
Non-functional change as first step in
https://github.com/llvm/wg-hlsl/pull/207

Removes `Binding` from "Resource Instance" types
2025-04-04 16:51:35 -06:00
Rahul Joshi
a8a33bab69
[NFC][SPIRV] Misc code cleanup in SPIRV Target (#133764)
- Use static instead of anonymous namespace for file local functions.
- Enclose file-local classes in anonymous namespace.
- Eliminate `llvm::` qualifier when file has `using namespace llvm`.
- Eliminate namespace surrounding entire code in
SPIRVConvergenceRegionAnalysis.cpp file.
- Eliminate call to `initializeSPIRVStructurizerPass` from the pass
constructor (https://github.com/llvm/llvm-project/issues/111767)
2025-04-01 08:35:06 -07:00
Alex Bradbury
71a977d0d6
[RISCV] Add shift-add (SH1ADD, ...) to isCopyInstrImpl (#133443)
As with #132002, these do show up in a compilation of llvm-test-suite
(including SPEC 2017). We remove 30-40 static instances so this isn't
anything earth shattering.

rs2 is always added to the other shifted (and potentially extended)
operand unmodified, so rs1==zero is equivalent to a copy.
2025-03-28 15:46:50 +00:00
Alex Bradbury
a481452cd8
[RISCV] Add OR/XOR/SUB to RISCVInstrInfo::isCopyInstrImpl (#132002)
This adds coverage for additional instructions in isCopyInstrImpl, for
now picking just those where I can observe that there is a codegen
difference for SPEC.

This allows MachineCopyPropagation to successfully eliminate no-op moves in this form.
2025-03-28 12:59:18 +00:00
Jessica Clarke
acdb0c1f99 [test][DXIL] Add to LLVM_LINK_COMPONENTS to fix BUILD_SHARED_LIBS build 2025-03-21 22:14:36 +00:00
Diana Picus
1f84495255
[AMDGPU] Update target helpers & GCNSchedStrategy for dynamic VGPRs (#130047)
In dynamic VGPR mode, we can allocate up to 8 blocks of either 16 or 32
VGPRs (based on a chip-wide setting which we can model with a Subtarget
feature). Update some of the subtarget helpers to reflect this.

In particular:
- getVGPRAllocGranule is set to the block size
- getAddresableNumVGPR will limit itself to 8 * size of a block

We also try to be more careful about how many VGPR blocks we allocate.
Therefore, when deciding if we should revert scheduling after a given
stage, we check that we haven't increased the number of VGPR blocks that
need to be allocated.

---------

Co-authored-by: Jannik Silvanus <jannik.silvanus@amd.com>
2025-03-19 10:29:38 +01:00
Nikita Popov
f137c3d592
[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940)
This avoids doing a Triple -> std::string -> Triple round trip in lots
of places, now that the Module stores a Triple.
2025-03-12 17:35:09 +01:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
Alex Bradbury
dd662d8028
[RISCV] Handle ADD in RISCVInstrInfo::isCopyInstrImpl (#81123)
Split out from #77610 and features a test, as a buggy version of this
caused a regression when landing that patch (the previous version had a
typo picking the wrong register as the source).

This is also motivated by future changes to MachineCopyPropagation which will use this information to determine if we have been left with a nop mv.
2025-03-05 12:29:04 +00:00
Farzon Lotfi
dc764f5c68
[DirectX] initialize registers properties by calling addRegisterClass and computeRegisterProperties (#128818)
This fixes #126784 for the DirectX backend.
This bug was marked critical for DX so it is going to go in first. At
least one register class needs to be added via `addRegisterClass` for
`RegClassForVT` to be valid.
Further for costing information used by loop unroll and other
optimizations to be valid we need to call `computeRegisterProperties`.
This change does both of these.

The test cases confirm that we can fetch costing information off of
`getRegisterInfo` and that `DirectXTargetLowering` maps `i32` typed
registers to `DXILClassRegClass`.
2025-02-27 10:35:14 -05:00
Ashley Coleman
02c9dae814
[HLSL] Add support to lookup a ResourceBindingInfo from its use (#126556)
Adds `findByUse` which takes a `llvm::Value` from a use and resolves it
(as best as possible) back to the creation of that resource.

It may return multiple ResourceBindingInfo if the use comes from
branched control flow.

Fixes #125746
2025-02-18 17:29:23 -07:00
Matt Arsenault
ab2d330fea
TableGen: Generate reverseComposeSubRegIndices (#127050)
This is necessary to enable composing subregisters in peephole-opt.
For now use a brute force table to find the return value. The worst
case target is AMDGPU with a 399 x 399 entry table.
2025-02-17 22:11:26 +07:00
Vyacheslav Levytskyy
df122fc734
[SPIR-V] Change a way SPIR-V Backend API works with user facing options (#124745)
This PR fixes https://github.com/llvm/llvm-project/issues/124703:
* added a new API call `SPIRVTranslate` that is to replace entirely old
`SPIRVTranslateModule` after existing clients switch into the new
function;
* the new `SPIRVTranslate` doesn't require option parsing, replacing the
`Opts` argument with explicit `CodeGenOptLevel` and `Triple` arguments;
* the old `SPIRVTranslateModule` call is a wrapper for `SPIRVTranslate`,
it doesn't require option parsing either and doesn't hold any logic
inside except for converting string options into `CodeGenOptLevel` and
`Triple` arguments;
* usage of the extensions list is reworked to avoid writes to the global
cl::opt variable `lib/Target/SPIRV/SPIRVSubtarget.cpp::Extensions` --
instead a new class member in SPIRVSubtarget.cpp is implemented that
allows to replace supported extensions after SPIRVSubtarget.cpp is
created;
* both API calls don't require option parsing and don't write to global
cl::opt variables.

Other related/required changes:
* SPIRV::Capability::Shader is marked as an capability of lesser
priority for OpenCL environment (to remediate absence of the
"avoid-spirv-capabilities" command line option in API calls);
* unit tests are updated and extended to cover testing of a newer API
call;
* old API call is marked with TODO to remove it after existing clients
switch into the new function.
2025-01-28 17:33:11 +01:00
Vyacheslav Levytskyy
ac94fade60
[SPIR-V] Rename internal command line flags for optimization level and mtriple used when passing options into the translate API call (#123975)
Rename internal command line flags for optimization level and mtriple
used when passing options into the translate API call.
2025-01-22 23:16:49 +01:00
Vyacheslav Levytskyy
3ff9368e58
[SPIR-V] Ensure that Module resource is managed locally wrt. a unit test case and fix a memory leak (#123725)
Adding SPIRV to LLVM_ALL_TARGETS
(https://github.com/llvm/llvm-project/pull/119653) revealed a series of
minor compilation problems and sanitizer complaints. This PR is to move
unit tests resources (a Module ptr) from the class-scope to a local
scope of the class member function to be sure that before the test env
is teared down the ptr is released.
2025-01-21 12:37:02 +01:00
Emma Pilkington
dc0e258fe4
[AMDGPU] Remove Dwarf encodings for subregisters (#117891)
Previously, registers and subregisters mapped to the same Dwarf
encoding. We don't really have any way to refer to subregisters directly
from Dwarf, the expression emitter should instead use DW_OPs to stencil
out the subregister from the whole register. This was also confusing
tools that need to map back to the llvm reg (e.g. dwarfdump), since
getLLVMRegNum() would arbitrarily return the _LO16 register.
2025-01-06 14:51:16 -05:00
paperchalice
1562b70eaf
Reapply "[DomTreeUpdater] Move critical edge splitting code to updater" (#119547)
This relands commit #115111.
Use traditional way to update post dominator tree, i.e. break critical
edge splitting into insert, insert, delete sequence.
When splitting critical edges, the post dominator tree may change its
root node, and `setNewRoot` only works in normal dominator tree...
See

6c7e5827ed/llvm/include/llvm/Support/GenericDomTree.h (L684-L687)
2024-12-13 11:43:09 +08:00
paperchalice
553058f825
Revert "[DomTreeUpdater] Move critical edge splitting code to updater" (#119512)
Reverts llvm/llvm-project#115111 Causes #119511
2024-12-11 14:25:17 +08:00
paperchalice
79047fac65
[DomTreeUpdater] Move critical edge splitting code to updater (#115111)
Support critical edge splitting in dominator tree updater. Continue the
work in #100856.

Compile time check:
https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au
2024-12-11 11:31:42 +08:00
Jon Roelofs
b6c22a4e58
Add processor aliases back to -print-supported-cpus and -mcpu=help (#118581)
They were accidentally dropped in
https://github.com/llvm/llvm-project/pull/96249

rdar://140853882
2024-12-09 09:18:31 -08:00
Nathan Gauër
45b567be8d
[SPIR-V] Add partial order tests, assert reducible (#117887)
Add testing for the visitor and added a note explaining irreducible CFG
are not supported.
Related to #116692

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-11-28 16:33:01 +01:00
Nathan Gauër
53326ee0cf
[SPIR-V] Fix block sorting with irreducible CFG (#116996)
Block sorting was assuming reducible CFG. Meaning we always had a best
node to continue with. Irreducible CFG makes breaks this assumption, so
the algorithm looped indefinitely because no node was a valid candidate.

Fixes #116692

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-11-28 13:42:57 +01:00
Sander de Smalen
318c69de52 Reland "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)"
The issue with slow compile-time was caused by an assert in
AArch64RegisterInfo.cpp. The assert invokes 'checkAllSuperRegsMarked'
after adding all the reserved registers. This call gets very expensive
after adding the _HI registers due to the way the function searches
in the 'Exception' list, which is expected to be a small list but isn't
(the patch added 190 _HI regs).

It was possible to rewrite the code in such a way that the _HI registers
are marked as reserved after the check. This makes the problem go away
entirely and restores compile-time to what it was before (tested for
`check-runtimes`, which previously showed a ~5x slowdown).

This reverts commits:
  1434d2ab215e3ea9c5f34689d056edd3d4423a78
  2704647fb7986673b89cef1def729e3b022e2607
2024-11-27 13:31:59 +00:00
Vitaly Buka
1434d2ab21
Revert "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)" (#117307)
Details in #114827

This reverts commit c1c68baf7e0fcaef1f4ee86b527210f1391b55f6.
2024-11-22 11:48:25 -08:00
Matin Raayai
bb3f5e1fed
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.

cc @arsenm @aeubanks
2024-11-14 13:30:05 -08:00