1723 Commits

Author SHA1 Message Date
Vladimir Vereschaka
19d681177f
Revert "[MC][TableGen] Expand Opcode field of MCInstrDesc" (#180321)
Reverts llvm/llvm-project#179652

This PR causes out-of-memory build failures on many Windows
builders.
2026-02-06 21:58:50 -08:00
sstipano
13d8870d45
[MC][TableGen] Expand Opcode field of MCInstrDesc (#179652)
Increase width of Opcode to `int` from `short` to allow more capacity.
2026-02-06 20:21:48 +01:00
Craig Topper
65cc6951d5
Reapply "[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)"
This includes a fix to use size_t instead of uint64_t in one place.
2026-02-03 14:20:42 -08:00
Craig Topper
19cf75c72f
Revert "[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)"
This reverts commit caab98284166784459a2fb76df7bca3f1d35e41e.

This is failing some build bots.
2026-02-03 13:45:00 -08:00
Craig Topper
caab982841
[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)
The operand lists for these opcodes require 1 byte per operand and are
usually small values that fit in 3-4 bits. This makes their storage
inefficient. In addition, many EmitNode/MorphNodeTo in the isel table
will use the same list of operand numbers.

This patch proposes to separate the operand lists into their own table
where they can be de-duplicated. The OPC_EmitNode/MorphNodeTo in the
main table will only store an index into this smaller table.

This is a reduced version of a suggestion from this very old FIXME.
d8d4096c0b/llvm/utils/TableGen/DAGISelMatcherGen.cpp (L1070)

For RISC-V this reduces the main table from 1437353 bytes to 1276015
bytes plus a 929 byte operand list table. A savings of about 11%.

For X86 this reduces the main table from 719237 bytes to 623612 bytes
plus a 1042 byte operand list table. A savings of about 13%.

I expect further savings could be had by moving more bytes over.
2026-02-03 13:13:13 -08:00
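The de-duplication idea described in caab982841 can be sketched as follows. This is a minimal illustration assuming C++17, not the TableGen emitter itself: `OperandListTable`, `intern`, and `operand` are hypothetical names, and the real table layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical builder: interns operand-number lists into one shared table
// and hands back the offset of each list within that table.
class OperandListTable {
  std::vector<uint8_t> Table;                          // concatenated lists
  std::map<std::vector<uint8_t>, size_t> OffsetOfList; // list -> offset

public:
  // Returns the offset of Ops in the shared table, appending the list only
  // if an identical one has not been seen before.
  size_t intern(const std::vector<uint8_t> &Ops) {
    auto [It, Inserted] = OffsetOfList.try_emplace(Ops, Table.size());
    if (Inserted)
      Table.insert(Table.end(), Ops.begin(), Ops.end());
    return It->second;
  }

  // Reads back operand I of the list stored at Offset.
  uint8_t operand(size_t Offset, size_t I) const {
    assert(Offset + I < Table.size() && "operand index out of range");
    return Table[Offset + I];
  }

  size_t sizeInBytes() const { return Table.size(); }
};
```

In this sketch the main table would store only the value returned by `intern`, so two nodes with the same operand list share one copy of it.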
Nicolai Hähnle
af836ff60c
[CodeGen] Add getTgtMemIntrinsic overload for multiple memory operands (NFC) (#175843)
There are target intrinsics that logically require two MMOs, such as
llvm.amdgcn.global.load.lds, which is a copy from global memory to LDS,
so there's both a load and a store to different addresses.

Add an overload of getTgtMemIntrinsic that produces intrinsic info in a
vector, and implement it in terms of the existing (now protected)
overload.

GlobalISel and SelectionDAG paths are updated to support multiple MMOs.
The main part of this change is supporting multiple MMOs in
MemIntrinsicNodes.

Converting the backends to use the new overload is a fairly mechanical step
that is done in a separate change, in the hope of reducing merge pain
during review and for downstreams. A later change will then enable
using multiple MMOs in AMDGPU.
2026-02-02 21:58:42 +00:00
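The overload pattern described in af836ff60c, a vector-producing entry point implemented in terms of the older single-result hook, might look roughly like this. All types here are trimmed-down, hypothetical stand-ins (`IntrinsicInfo`, `TargetLoweringSketch`, the intrinsic IDs, and the address-space numbers are illustrative, not the LLVM declarations).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the real intrinsic-info record.
struct IntrinsicInfo {
  bool IsLoad = false;
  bool IsStore = false;
  unsigned AddrSpace = 0;
};

class TargetLoweringSketch {
public:
  virtual ~TargetLoweringSketch() = default;

  // New entry point: may describe several memory operands. The default
  // forwards to the legacy single-operand hook.
  virtual void getTgtMemIntrinsic(std::vector<IntrinsicInfo> &Infos,
                                  unsigned IntrinsicID) const {
    IntrinsicInfo Info;
    if (getTgtMemIntrinsic(Info, IntrinsicID))
      Infos.push_back(Info);
  }

protected:
  // Legacy hook, now only reachable through the vector overload.
  virtual bool getTgtMemIntrinsic(IntrinsicInfo &Info,
                                  unsigned IntrinsicID) const {
    return false;
  }
};

// A target that has not been converted yet: overrides only the legacy hook.
class LegacyTarget : public TargetLoweringSketch {
protected:
  bool getTgtMemIntrinsic(IntrinsicInfo &Info,
                          unsigned IntrinsicID) const override {
    if (IntrinsicID != 1)
      return false;
    Info.IsLoad = true;
    return true;
  }
};

// A converted target: one intrinsic that both loads and stores.
class CopyTarget : public TargetLoweringSketch {
public:
  void getTgtMemIntrinsic(std::vector<IntrinsicInfo> &Infos,
                          unsigned IntrinsicID) const override {
    if (IntrinsicID != 2)
      return;
    IntrinsicInfo Load;
    Load.IsLoad = true;
    Load.AddrSpace = 1; // illustrative "global" address space
    IntrinsicInfo Store;
    Store.IsStore = true;
    Store.AddrSpace = 3; // illustrative "LDS" address space
    Infos.push_back(Load);
    Infos.push_back(Store);
  }
};
```

Callers go through the base-class interface, so unconverted targets keep working while converted ones can report one entry per memory operand.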
Craig Topper
7e48b14d1e
[SelectionDAGISel] Avoid unnecessary MatchScope copy. NFC (#178957)
Add the MatchScope to the vector first, then write its fields.
2026-01-30 14:57:45 -08:00
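The "add to the vector first, then write its fields" approach from 7e48b14d1e can be illustrated with a reduced example. This sketch assumes C++17 and uses `std::vector` as a stand-in for LLVM's `SmallVector`; the `MatchScope` fields shown are a simplified subset chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for the selector's scope record.
struct MatchScope {
  size_t FailIndex = 0;
  std::vector<int> NodeStack;
  size_t NumRecordedNodes = 0;
};

// Construct the scope in place at the back of the vector, then fill it in,
// so the record is written once in its final location instead of being
// built in a local variable and copied in afterwards.
inline void pushScope(std::vector<MatchScope> &Scopes, size_t FailIndex,
                      const std::vector<int> &NodeStack, size_t NumRecorded) {
  MatchScope &NewEntry = Scopes.emplace_back();
  NewEntry.FailIndex = FailIndex;
  NewEntry.NodeStack.assign(NodeStack.begin(), NodeStack.end());
  NewEntry.NumRecordedNodes = NumRecorded;
}
```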
Craig Topper
43e52b7803
[SelectionDAGISel] Use size_t for MatcherIndex. NFC (#178828)
There's some evidence that this improves compile time on stage2-O3.
https://llvm-compile-time-tracker.com/compare.php?from=75f03a62d1f9b0081fff57ceebb29a3ae1560a61&to=d9cdb41d51f7010ba710403e2d1e30c969e4f88b&stat=instructions:u
2026-01-30 10:28:24 -08:00
Rahul Joshi
26f962465e
[LLVM][CodeGen] Remove pass initialization calls from pass constructors (#173061)
- Remove pass initialization calls from pass constructors.
- For some passes, add the initialization to `initializeCodeGen` or
`initializeGlobalISel`.
- Remove redundant initializations from llc and X86 target for some
passes.
2026-01-21 08:44:51 -08:00
Matt Arsenault
aa57ee958d
CodeGen: Use LibcallLoweringInfo for stack protector insertion (#176829)
Thread LibcallLoweringInfo into the TargetLowering hooks used
by the stack protector passes.
2026-01-20 12:37:31 +01:00
Matt Arsenault
f734f42bb0
DAG: Take LibcallLoweringInfo from analysis (#176800)
Previously this was taking a duplicate copy of this information
from TargetLowering. This moves the bulk of libcall checks to use
the new analysis. There are still a few straggler uses in misc.
passes in a few backends (mainly AArch64 has some libcall emission
in FinalizeISel and PrologEpilogInserter).
2026-01-19 22:35:53 +01:00
Matt Arsenault
2c9cc88e25
FastISel: Thread LibcallLoweringInfo through (#176799)
Boilerplate change to prepare to take LibcallLoweringInfo from
an analysis. For now, it just sets it from the copy inside of
TargetLowering.
2026-01-19 20:44:48 +00:00
Matt Arsenault
01e6245af4
DAG: Avoid querying libcall info from TargetLowering (#176268)
Libcall lowering decisions should come from the LibcallLoweringInfo
analysis. Query this through the DAG, so eventually the source
can be the analysis. For the moment this is just a wrapper around
the TargetLowering information.
2026-01-16 09:02:49 +00:00
Craig Topper
08de4fd0d4
[SelectionDAG] Move HwMode expansion from tablegen to SelectionISel. (#174471)
The way HwMode is currently implemented, tablegen duplicates each
pattern that is dependent on hardware mode. The HwMode predicate is
added as a pattern predicate on the duplicated pattern.
    
RISC-V uses HwMode on the GPR register class which means almost every
isel pattern is affected by HwMode. This results in the isel table
being nearly twice the size it would be if we only had a single GPR
size.

This patch proposes to do the expansion at instruction selection time
instead. To accomplish this new opcodes like OPC_CheckTypeByHwMode
are added to the isel table. The unique combinations of types and HwMode
are converted to an index that is the payload for the new opcodes.
TableGen emits a new virtual function getValueTypeByHwMode that uses
this index and the current HwMode to look up the type.

This reduces the size of the isel table on RISC-V from ~2.38 million
bytes to ~1.38 million bytes.

I did not add an OPC_SwitchTypeByHwMode opcode yet. If the VT requires a
hardware mode, we emit an OPC_Scope+OPC_CheckTypeByHwMode instead. I
expect adding an OPC_SwitchTypeByHwMode could further reduce the table
size. I will investigate this as a follow up.
    
Many of the matcher classes in tablegen now use ValueTypeByHwMode
instead of MVT. This may have an impact on the memory usage and runtime of
tablegen. We can mitigate some of this by splitting the matchers into MVT and
ValueTypeByHwMode versions. We can also explore alternate data
structures for ValueTypeByHwMode instead of a std::map. Maybe a sorted vector.

A similar change can be made to GlobalISel as a follow up.
2026-01-15 09:35:02 -08:00
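The index-based lookup described in 08de4fd0d4 can be pictured with a small table. This is only a sketch of the idea: the enum values, the table contents, and `checkTypeByHwMode` are hypothetical, and the generated `getValueTypeByHwMode` in LLVM is a virtual function with a different signature.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-ins for MVT values and hardware modes; names are illustrative only.
enum class VT { i32, i64 };
enum HwMode : unsigned { DefaultMode = 0, RV64Mode = 1, NumHwModes = 2 };

// One row per unique (type, HwMode) combination; the row number is the
// payload a table opcode would carry.
constexpr std::array<std::array<VT, NumHwModes>, 2> TypeByHwMode = {{
    {VT::i32, VT::i64}, // index 0: register-width integer
    {VT::i32, VT::i32}, // index 1: same type in every mode
}};

// Resolves a table index to a concrete type for the current hardware mode.
constexpr VT getValueTypeByHwMode(size_t Index, HwMode Mode) {
  assert(Index < TypeByHwMode.size() && "bad type index");
  return TypeByHwMode[Index][Mode];
}

// What a check opcode might do at selection time: compare the node's type
// against the type the pattern expects in the active mode.
constexpr bool checkTypeByHwMode(VT NodeVT, size_t Index, HwMode Mode) {
  return NodeVT == getValueTypeByHwMode(Index, Mode);
}
```

Because the mode is resolved at selection time, a pattern is stored once with an index rather than once per hardware mode.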
Craig Topper
ba76e02d67
[SelectionDAGISel] Remove unused opcodes. NFC (#175621)
Nodes with 0 results should always have OPFL_Chain. We don't need other
CompressedNodeInfo opcodes for 0 results.
2026-01-12 13:09:58 -08:00
Craig Topper
1c1310d022
[SelectionDAG] Use a simpler version of decodeSLEB128 in GetSignedVBR to improve compile time. NFC
This reduces a compile time regression after 7f780046e2 seen here
https://llvm-compile-time-tracker.com/compare.php?from=01e75e2eafcea00c448e289154d8adeecc1c6c3a&to=7f780046e2ae79218dd6abc2d008c1a9eeddedc7&stat=instructions:u

decodeSLEB128 has some code to check for errors that we don't need.
Writing a simpler version seems to recover most of the regression.
2025-12-31 13:02:42 -08:00
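A decoder without error checking, in the spirit of 1c1310d022, could be as small as the sketch below. `decodeSignedVBR` is a hypothetical name and this is not the code from the patch; it assumes the table is well-formed, which is only reasonable for data the compiler generated itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes one SLEB128 value starting at Table[Idx] and advances Idx past it.
// There are no bounds or overflow checks: the input is trusted.
inline int64_t decodeSignedVBR(const uint8_t *Table, size_t &Idx) {
  assert(Table && "null table");
  uint64_t Value = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    Byte = Table[Idx++];
    Value |= uint64_t(Byte & 0x7f) << Shift; // low 7 bits are payload
    Shift += 7;
  } while (Byte & 0x80); // top bit set means another byte follows
  // Sign-extend if the highest payload bit of the last byte was set.
  if (Shift < 64 && (Byte & 0x40))
    Value |= ~uint64_t(0) << Shift;
  return static_cast<int64_t>(Value);
}
```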
Craig Topper
775251a807
[SelectionDAG] Remove OPC_EmitStringInteger from isel. (#173936)
Instead emit this as an OPC_EmitInteger, but print the string
when the value is known to be 0..63 (when we don't need a VBR).
Also print the string into a comment when comments are not omitted
so it isn't lost when a VBR is needed.
2025-12-31 08:40:57 -08:00
Craig Topper
7f780046e2
[SelectionDAG] Use SLEB128 for signed integers in isel table instead of 'signed rotated'. NFC (#173936)
Previously, we used a VBR that stored the sign bit in bit 0 followed
by the absolute value in subsequent bits.

This patch changes it to use SLEB128 which discards redundant sign
bits, but keeps the bits in the same positions. This uses the same
number of bytes to encode values so doesn't change the table size.

My goal is to remove OPC_EmitStringInteger as a special opcode type.
Instead, we can print the string directly with OPC_EmitInteger for
any string that has an enum value of 0..63.
2025-12-30 22:22:35 -08:00
Craig Topper
01e75e2eaf
Revert "[SelectionDAG] Use SLEB128 for signed integers in isel table instead of 'signed rotated'. NFC (#173928)"
This reverts commit 3ff2637d867a6cc23ea5d5127b065efb8299d196.

I accidentally merged another PR into this during a rebase. Reverting
to commit it correctly.
2025-12-30 22:13:58 -08:00
Craig Topper
3ff2637d86
[SelectionDAG] Use SLEB128 for signed integers in isel table instead of 'signed rotated'. NFC (#173928)
Previously, we used a VBR that stored the sign bit in bit 0 followed by
the absolute value in subsequent bits.

This patch changes it to use SLEB128 which discards redundant sign bits,
but keeps the bits in the same positions. This uses the same number of
bytes to encode values so doesn't change the table size.

My goal is to remove OPC_EmitStringInteger as a special opcode type.
Instead, we can print the string directly with OPC_EmitInteger for any
string that has an enum value of 0..63.
2025-12-30 21:12:45 -08:00
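The two encodings compared in 3ff2637d86 can be written side by side. This is a sketch under stated assumptions: the function names are hypothetical, `INT64_MIN` is deliberately excluded from the "signed rotated" form, and the right shift of a negative value is assumed to be arithmetic (guaranteed from C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Emits Val as an unsigned VBR: 7 payload bits per byte, low bits first,
// with the top bit set on every byte except the last.
inline std::vector<uint8_t> encodeVBR(uint64_t Val) {
  std::vector<uint8_t> Out;
  while (Val >= 0x80) {
    Out.push_back(uint8_t((Val & 0x7f) | 0x80));
    Val >>= 7;
  }
  Out.push_back(uint8_t(Val));
  return Out;
}

// "Signed rotated": sign in bit 0, absolute value in the bits above it.
inline std::vector<uint8_t> encodeSignedRotated(int64_t Val) {
  assert(Val != INT64_MIN && "not handled in this sketch");
  uint64_t U = Val >= 0 ? uint64_t(Val) << 1 : (uint64_t(-Val) << 1) | 1;
  return encodeVBR(U);
}

// SLEB128: two's-complement bits stay in place; emission stops once the
// remaining bits are all copies of the sign bit.
inline std::vector<uint8_t> encodeSLEB128(int64_t Val) {
  std::vector<uint8_t> Out;
  bool More;
  do {
    uint8_t Byte = Val & 0x7f;
    Val >>= 7; // arithmetic shift keeps the sign
    More = !((Val == 0 && !(Byte & 0x40)) || (Val == -1 && (Byte & 0x40)));
    Out.push_back(uint8_t(More ? (Byte | 0x80) : Byte));
  } while (More);
  return Out;
}
```

For example, -3 becomes the single byte `0x07` in the rotated form and `0x7d` in SLEB128, and 100 takes two bytes in both.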
Craig Topper
0bd5975132
[SelectionDAG] Use uint8_t instead of unsigned char for isel MatcherTable. (#174014)
These are really the same type, but uint8_t is more accurate since we
make assumptions that a table element is 8 bits when we emit VBRs.
2025-12-30 14:49:54 -08:00
Craig Topper
52b4470454
[SelectionDAG] Use SmallVector::assign instead of clear+append. NFC (#173946)
2025-12-30 08:44:32 -08:00
Craig Topper
44514f7917
[SelectionDAG] Rename OPC_EmitInteger8->OPC_EmitIntegerI8. NFC (#173832)
Same for OPC_EmitInteger16/32/64 and OPC_EmitStringInteger32.

This matches OPC_CheckTypeI32, OPC_EmitRegisterI32, etc.
2025-12-29 11:12:44 -08:00
Matt Arsenault
9ad39dd116
AMDGPU: Avoid crashing on statepoint-like pseudoinstructions (#170657)
At the moment the MIR tests are somewhat redundant. The waitcnt
one is needed to ensure we actually have a load, given we are
currently just emitting an error on ExternalSymbol. The asm printer
one is more redundant for the moment, since it's stressed by the IR
test. However I am planning to change the error path for the IR test,
so it will soon not be redundant.
2025-12-29 19:08:08 +01:00
Craig Topper
65f9374cec
[SelectionDAG] Use emplace_back. NFC (#173824)
This avoids using push_back+std::pair/make_pair.
2025-12-29 09:04:49 -08:00
Daniel Paoliello
644fd3b665
[FastISel] Don't select a CallInst as a BasicBlock in the SelectionDAG fallback if it has bundled ops (#162895)
This was discovered while looking at the codegen for x64 when Control
Flow Guard is enabled.

When using `SelectionDAG`, LLVM would generate the following sequence
for a CF guarded indirect call:
```
	leaq	target_func(%rip), %rax
	rex64 jmpq	*__guard_dispatch_icall_fptr(%rip) # TAILCALL
```

However, when Fast ISel was used the following is generated:
```
	leaq	target_func(%rip), %rax
	movq	__guard_dispatch_icall_fptr(%rip), %rcx
	rex64 jmpq	*%rcx                   # TAILCALL
```

This was happening despite Fast ISel aborting and falling back to
`SelectionDAG`.

The root cause for this code gen is that `SelectionDAGISel` has a
special case when Fast ISel aborts when lowering a `CallInst` where it
tries to lower the instruction as its own basic block, which for such a
CF Guard call means that it is lowering an indirect call to
`__guard_dispatch_icall_fptr` without observing that the function was
being loaded into a pointer in the preceding (and bundled) instruction.

The fix for this is to not use the special case when a `CallInst` has
bundled instructions: it's better to allow the call and its bundled
instructions to be lowered together by `SelectionDAG` instead.
2025-12-15 14:38:20 -08:00
Peter Collingbourne
6227eb90da
Add IR and codegen support for deactivation symbols.
Deactivation symbols are a mechanism for allowing object files to disable
specific instructions in other object files at link time. The initial use
case is for pointer field protection.

For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

Reviewers: ojhunt, nikic, fmayer, arsenm, ahmedbougacha

Reviewed By: fmayer

Pull Request: https://github.com/llvm/llvm-project/pull/133536
2025-11-26 12:37:09 -08:00
Matt Arsenault
db20a7f2bc
DAG: Fix constructing a temporary TargetTransformInfo instance (#168480)
2025-11-20 01:19:23 -05:00
Craig Topper
8d6a1def4d
[SelectionDAGISel] Don't merge input chains if it would put a token factor in the way of a glue. (#167805)
In the new test, we're trying to fold a load and an X86ISD::CALL. The
call has a CopyToReg glued to it. The load and the call have different
input chains so they need to be merged. This results in a TokenFactor
that gets put between the CopyToReg and the final CALLm instruction. The
DAG scheduler can't handle that.

The load here was created by legalization of the extract_element using a
stack temporary store and load. A normal IR load would be chained into
the call sequence by SelectionDAGBuilder. This would usually have the load
chained in before the CopyToReg. The store/load created by legalization
don't get chained into the rest of the DAG.

Fixes #63790
2025-11-13 09:25:53 -08:00
Craig Topper
99a726ea51
[SelectionDAGISel] Const correct ChainNodesMatched argument to HandleMergeInputChains. NFC (#167807)
2025-11-12 22:56:57 -08:00
Craig Topper
7171a9cfc4
[SelectionDAG] Fix typo in comment glueged->glued. NFC (#167006)
This was the result of a search and replace when "flag" was renamed to
"glue". This originally said "flagged".
2025-11-10 08:37:29 -08:00
Daniel Thornburgh
5f08fb4d72
[IR] llvm.reloc.none intrinsic for no-op symbol references (#147427)
This intrinsic emits a BFD_RELOC_NONE relocation at the point of call,
which allows optimizations and languages to explicitly pull in symbols
from static libraries without there being any code or data that has an
effectual relocation against such a symbol.

See issue #146159 for context.
2025-11-06 08:52:46 -08:00
Grigory Pastukhov
2c3f0e541d
[CodeGen] Preserve branch weights from PGO profile during instruction selection at -O0 (#161620)
Branch probabilities from PGO profile data were not preserved during
instruction selection at -O0 because BranchProbabilityInfo was only
requested when OptLevel != None.
2025-10-09 11:16:43 -07:00
Mikołaj Piróg
91f4db77b0
[SDAG] Use useDebugInstrRef instead of shouldUseDebugInstrRef (#160686)
`shouldUseDebugInstrRef` can return a different value than
`useDebugInstrRef`, since the former depends on the opt level, which can
change. Inconsistent usage can lead to errors later.

I believe that using `should...` instead of `use...` here is a result of
a minor error during this:
https://github.com/llvm/llvm-project/pull/94149/files#diff-8ec547e1244562c5837ed180dd9bed61b3cd960ef90bb6002ea2db41a67ed693

Notice how, before the change, `InstrRef` is assigned the value from
`should...` *before* the opt change. Now it's done after. The opt change
happens here:
```cpp
bool SelectionDAGISelLegacy::runOnMachineFunction(MachineFunction &MF) {
...
  // Decide what flavour of variable location debug-info will be used, before
  // we change the optimisation level.
  MF.setUseDebugInstrRef(MF.shouldUseDebugInstrRef());
....

  return Selector->runOnMachineFunction(MF);
}
```

Then `runOnMachineFunction` uses `should...`, which after the opt change may
return a different value than it did previously.
2025-10-08 18:13:32 +02:00
Min-Yih Hsu
c2c2e4ec90
[SelectionDAG] Add support to dump DAGs with sorted nodes (#161097)
An alternative approach to #149732, which sorts the DAG before dumping
it. That approach runs a risk of altering the codegen result as we don't
know if any of the downstream DAG users relies on the node ID, which was
updated as part of the sorting.

The new method proposed by this PR does not update the node ID or any of
the DAG's internal states: the newly added
`SelectionDAG::getTopologicallyOrderedNodes` is a const member function
that returns a list of all nodes in their topological order.
2025-10-03 16:18:21 -07:00
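The non-mutating ordering described in c2c2e4ec90 can be sketched with Kahn's algorithm over a toy graph. This is an illustration only, not the `SelectionDAG::getTopologicallyOrderedNodes` implementation: the `Node` type is hypothetical, and the sketch assumes "topological order" means operands come before their users.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Toy node: an ID that must not change, plus pointers to its operands.
struct Node {
  int Id;
  std::vector<const Node *> Operands;
};

// Returns the nodes ordered so every operand precedes its users. All
// bookkeeping lives in local maps; the nodes are const and their IDs are
// left untouched.
inline std::vector<const Node *>
getTopologicallyOrdered(const std::vector<const Node *> &AllNodes) {
  std::unordered_map<const Node *, size_t> Pending; // unvisited operand count
  std::unordered_map<const Node *, std::vector<const Node *>> Users;
  std::vector<const Node *> Order;
  Order.reserve(AllNodes.size());

  for (const Node *N : AllNodes) {
    Pending[N] = N->Operands.size();
    for (const Node *Op : N->Operands)
      Users[Op].push_back(N);
    if (N->Operands.empty())
      Order.push_back(N); // leaves are ready immediately
  }

  // Order doubles as the worklist: entries from index I onward are ready
  // but their users have not been released yet.
  for (size_t I = 0; I < Order.size(); ++I)
    for (const Node *U : Users[Order[I]])
      if (--Pending[U] == 0)
        Order.push_back(U);

  assert(Order.size() == AllNodes.size() && "graph has a cycle");
  return Order;
}
```

Keeping the counters in side tables is what lets the function take const nodes, which mirrors the commit's point that dumping should not disturb any state other code may rely on.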
Benjamin Maxwell
5899bca6ba
[AArch64][SME] Resume streaming-mode on entry to exception handlers (#156638)
This patch adds a new `TargetLowering` hook `lowerEHPadEntry()` that is
called at the start of lowering EH pads in SelectionDAG. This allows the
insertion of target-specific actions on entry to exception handlers.

This is used on AArch64 to insert SME streaming-mode switches at landing
pads. This is needed as exception handlers are always entered with
PSTATE.SM off, and the function needs to resume the streaming mode of
the function body.
2025-09-04 12:55:12 +01:00
David Stuttard
c7c0229480
Revert "[AMDGPU] SelectionDAG divergence tracking should take into account Target divergency. (#147560)" (#152548)
This reverts commit 9293b65a616b8de432a654d046e802540b146372.
2025-08-08 09:05:59 +01:00
Sam Elliott
bccd34f323
[SelectionDAG] Correctly Mark Required Analyses (#147649)
llvm/llvm-project#147560 changed the legacy SelectionDAG pass to always
require TargetTransformInfoWrapperPass (rather than only when assertions
are enabled). `SelectionDAGISelLegacy::getAnalysisUsage` was not updated
in that PR, which was causing crashes on assertions-disabled builds that
are hard to track down.

This makes the required update, which should avoid crashes being seen on
some buildbots and by some users.
2025-07-08 21:40:29 -07:00
alex-t
9293b65a61
[AMDGPU] SelectionDAG divergence tracking should take into account Target divergency. (#147560)
This is the next attempt to upstream this:
https://github.com/llvm/llvm-project/pull/144947
The last one caused build errors in AArch64.
The issue has been resolved.
2025-07-09 00:06:58 +02:00
Florian Hahn
bfd457588a
Revert "[AMDGPU] SelectionDAG divergence tracking should take into account Target divergency. (#144947)"
This reverts commit 8ac7210b7f0ad49ae7809bf6a9faf2f7433384b0.

This breaks building the AArch64 backend, e.g. see
https://github.com/llvm/llvm-project/pull/144947

Revert to unbreak the build.

Also reverts follow-up commit 1e76f012db3ccfaa05e238812e572b5b6d12c17e.
2025-07-03 19:25:01 +01:00
alex-t
8ac7210b7f
[AMDGPU] SelectionDAG divergence tracking should take into account Target divergency. (#144947)
If a kernel is known to be executing only a single lane, IR
UniformityAnalysis will take note of that (via
GCNTTIImpl::hasBranchDivergence) and report that all values are uniform.
SelectionDAG's built-in divergence tracking should do the same.
2025-07-03 18:37:37 +02:00
Matt Arsenault
48155f93dd
CodeGen: Emit error if getRegisterByName fails (#145194)
This avoids using report_fatal_error and standardizes the error
message in a subset of the error conditions.
2025-06-23 16:33:35 +09:00
Orlando Cazalet-Hyams
36038a1048
[RemoveDIs][NFC] Remove dbg intrinsic handling code from SelectionDAG ISel (#144702)
2025-06-18 16:04:18 +01:00
Jeremy Morse
9eb0020555
[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)
Seeing how we can't generate any debug intrinsics any more: delete a
variety of codepaths where they're handled. For the most part these are
plain deletions, in others I've tweaked comments to remain coherent, or
added a type to (what was) type-generic-lambdas.

This isn't all the DbgInfoIntrinsic call sites but it's most of the
simple scenarios.

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-17 15:55:14 +01:00
Omair Javaid
e1e1836bbd
[CodeGen] Inline stack guard check on Windows (#136290)
This patch optimizes the Windows security cookie check mechanism by
moving the comparison inline and only calling __security_check_cookie
when the check fails. This reduces the overhead of making a DLL call
for every function return.

Previously, we implemented this optimization through a machine pass
(X86WinFixupBufferSecurityCheckPass) in PR #95904 submitted by
@mahesh-attarde. We have reverted that pass in favor of this new
approach. We have also abandoned the AArch64-specific implementation
of the same pass in PR #121938 in favor of this more general solution.

The old machine instruction pass approach:
- Scanned the generated code to find __security_check_cookie calls
- Modified these calls by splitting basic blocks
- Added comparison logic and conditional branching
- Required complex block management and live register computation

The new approach:
- Implements the same optimization during instruction selection
- Directly emits the comparison and conditional branching
- No need for post-processing or basic block manipulation
- Disables optimization at -Oz.

Thanks @tamaspetz, @efriedma-quic and @arsenm for their help.
2025-06-12 19:38:42 +05:00
Matt Arsenault
742e84dc5d
SelectionDAG: Use unique_ptr for SwiftErrorValueTracking (#142532)
2025-06-03 19:15:03 +09:00
Matt Arsenault
36b710a7e5
CodeGen: Convert some assorted errors to use reportFatalUsageError (#142031)
The test coverage is lacking for many of these errors.
2025-05-30 08:06:53 +02:00
Jon Roelofs
346a72f2ca
[LLVM] Add color to SDNode ID's when dumping (#141295)
This is especially helpful for the recursive 'Cannot select:' dumps,
where colors help distinguish nodes at a quick glance.
2025-05-24 09:40:29 -07:00
Kazu Hirata
3bc174ba77
[CodeGen] Remove unused includes (NFC) (#141320)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-24 00:00:00 -07:00
Rahul Joshi
1fdf02ad5a
[LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002)
Add per-property has<Prop>/set<Prop>/reset<Prop> functions to
MachineFunctionProperties.
2025-05-22 08:07:52 -07:00
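The per-property accessors described in 1fdf02ad5a can be modeled on a small bitset-backed class. This is a reduced, hypothetical model (`FunctionProperties` and the macro are illustrative), showing only three properties and not claiming to match how LLVM generates the functions.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Reduced model of a property set: a few named bits plus generic accessors.
class FunctionProperties {
public:
  enum class Property : unsigned { IsSSA, NoPHIs, Selected, NumProperties };

  // Generic accessors keyed by enum value.
  bool hasProperty(Property P) const { return Bits.test(index(P)); }
  FunctionProperties &set(Property P) {
    Bits.set(index(P));
    return *this;
  }
  FunctionProperties &reset(Property P) {
    Bits.reset(index(P));
    return *this;
  }

// Stamps out has<Prop>/set<Prop>/reset<Prop> for one property.
#define PROPERTY_ACCESSORS(Prop)                                               \
  bool has##Prop() const { return hasProperty(Property::Prop); }               \
  FunctionProperties &set##Prop() { return set(Property::Prop); }              \
  FunctionProperties &reset##Prop() { return reset(Property::Prop); }

  PROPERTY_ACCESSORS(IsSSA)
  PROPERTY_ACCESSORS(NoPHIs)
  PROPERTY_ACCESSORS(Selected)
#undef PROPERTY_ACCESSORS

private:
  static size_t index(Property P) { return static_cast<size_t>(P); }
  std::bitset<static_cast<size_t>(Property::NumProperties)> Bits;
};
```

With the convenience accessors, a call site can write `Props.setSelected()` instead of spelling out the enum value, and the setters chain because they return the object.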