37373 Commits

Author SHA1 Message Date
Daniel Paoliello
16e051f0b9
[win] NFC: Rename EHCatchret to EHCont to allow for EH Continuation targets that aren't catchret instructions (#129953)
This change splits out the renaming and comment updates from #129612 as a non-functional change.
2025-03-06 09:28:44 -08:00
Craig Topper
81089f0fd1 [CodeGen] Use Register::id(). NFC 2025-03-06 09:08:21 -08:00
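The idea behind the `Register::id()` cleanups in this series can be sketched with a toy wrapper. `ToyRegister` and `tableIndex` below are hypothetical names for illustration, not LLVM's classes; the sketch assumes the goal is that every conversion to a raw integer is spelled out at the call site.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a register wrapper; not llvm::Register.
class ToyRegister {
  unsigned Reg;

public:
  constexpr explicit ToyRegister(unsigned R = 0) : Reg(R) {}

  // The only way to get the raw number out is to ask for it by name.
  constexpr unsigned id() const { return Reg; }
  constexpr bool isValid() const { return Reg != 0; }
};

// With no implicit conversion, accidental integer arithmetic on a register
// does not compile.
static_assert(!std::is_convertible<ToyRegister, unsigned>::value,
              "ToyRegister must not convert to unsigned implicitly");

// Code that really needs an integer (for example, to index a table) says so.
inline unsigned tableIndex(ToyRegister R) { return R.id(); }
```

A caller writes `tableIndex(ToyRegister(5))` and gets `5` back; passing a bare `5` is rejected because the constructor is `explicit`.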
Craig Topper
bdf50f0292 [CodeGen] Use Register or MCRegister. NFC 2025-03-06 09:08:21 -08:00
Craig Topper
d0b8f5d8b3 [RegisterBankInfo] Use MCRegister instead of Register for getMinimalPhysRegClass. NFC 2025-03-06 09:07:53 -08:00
Craig Topper
68b1fe8628 [LivePhysRegs] Use MCRegister instead of MCPhysReg in interface. NFC 2025-03-06 09:07:53 -08:00
Matt Arsenault
b21663cb5b
SplitKit: Take register class directly from instruction definition (#129727)
This fixes an expensive check failure after 8476a5d480304. The issue
was essentially that getRegClassConstraintEffectForVReg was sometimes
not doing anything useful. If the register passed to it is not present
in the instruction, it is a no-op and returns the original class. The
Edit->getReg() register may not be the register as it appears in either
the use or def instruction. It may be some split register, so take
the register directly from the instruction being rematerialized.

Also directly query the constraint from the def instruction, with a
hardcoded operand index. This isn't ideal, but all the other
rematerialize code makes the same assumption.

So far I've been unable to reproduce this with a standalone MIR test. In
the original case, stop-before=greedy and running the one pass is not
working.
2025-03-06 20:06:35 +07:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibility purposes, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
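A minimal sketch of the design described above, using hypothetical toy types (`ToyTriple`, `ToyModule`) rather than LLVM's real classes: the string is parsed once when the triple is set, and both directions of conversion are explicit. The three-field split is an assumption made only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed triple; not llvm::Triple.
struct ToyTriple {
  std::string Arch, Vendor, OS;
  std::string Raw;

  ToyTriple() = default;

  // Explicit: a std::string never silently becomes a ToyTriple.
  explicit ToyTriple(const std::string &S) : Raw(S) {
    const std::size_t A = S.find('-');
    Arch = S.substr(0, A);
    if (A == std::string::npos)
      return;
    const std::size_t B = S.find('-', A + 1);
    Vendor = S.substr(A + 1, B == std::string::npos ? B : B - A - 1);
    if (B == std::string::npos)
      return;
    OS = S.substr(B + 1);
  }

  // Explicit conversion back to text.
  const std::string &str() const { return Raw; }
};

// The module stores the parsed form, so queries do not re-parse the string.
struct ToyModule {
  ToyTriple TT;

  void setTargetTriple(ToyTriple T) { TT = std::move(T); }
  const ToyTriple &getTargetTriple() const { return TT; }
};
```

Callers then write `M.setTargetTriple(ToyTriple(Str))` or `M.getTargetTriple().str()` explicitly, which makes any remaining string round trips visible.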
Fangrui Song
fe56c4c019 [MC] Remove unneeded VK_None argument from MCSymbolRefExpr::create. NFC 2025-03-05 23:14:04 -08:00
Craig Topper
58670aa79a [FastISel] Use Register. NFC
This focuses on the common interfaces and tablegen. More changes
are needed to individual targets.
2025-03-05 09:13:02 -08:00
Paul Walker
a537724069
[LLVM][DAGCombine] Remove combiner-vector-fcopysign-extend-round. (#129878)
This option was added to improve test coverage for SVE lowering code
that is impossible to reach otherwise. Given it is not possible to
trigger a bug without it and the generated code is universally worse
with it, I figure the option has no value and should be removed.
2025-03-05 15:31:34 +00:00
Benjamin Maxwell
0228b778a4
[SDAG] Add missing SoftenFloatRes legalization for FMODF (#129264)
This is needed on some ARM platforms.
2025-03-05 13:45:48 +00:00
Nikita Popov
a614f2b489 [StackProtector] Fix domtree verification in NewPM
Use DTU.getDomTree() to make sure the DTU is flushed.
2025-03-05 12:55:27 +01:00
Nikita Popov
53c157939e
[StackProtector] Fix phi handling in HasAddressTaken() (#129248)
Despite the name, the HasAddressTaken() heuristic identifies not only
allocas that have their address taken, but also those that have accesses
that cannot be proven to be in-bounds.

However, the current handling for phi nodes is incorrect. Phi nodes are
only visited once, and will perform the analysis using whichever
(remaining) allocation size is passed the first time the phi node is
visited. If it is later visited with a smaller remaining size, which may
lead to out of bounds accesses, it will not be detected.

Fix this by keeping track of the smallest seen remaining allocation size
and redo the analysis if it is decreased. To avoid degenerate cases
(including via loops), limit the number of allowed decreases to a small
number.
2025-03-05 12:45:13 +01:00
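The fix described above can be sketched on a toy pointer-use graph. Everything here (`ToyNode`, `PhiState`, `mayBeOutOfBounds`, and the limit of 4) is a hypothetical illustration under simplifying assumptions, not the StackProtector code: each node advances the pointer by some bytes, optionally accesses some bytes, and forwards the pointer to its users.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

struct ToyNode {
  bool IsPhi = false;
  uint64_t Advance = 0;    // bytes the pointer moves forward at this node
  uint64_t AccessSize = 0; // bytes accessed here (0 means no access)
  std::vector<int> Users;  // nodes that use this node's pointer
};

struct PhiState {
  uint64_t MinRemaining; // smallest remaining size seen so far
  unsigned Decreases;    // how many times it has been lowered
};

constexpr unsigned MaxPhiDecreases = 4; // assumed "small number"

// Returns true if an access may be out of bounds. Conservatively returns
// true once a phi has been re-analyzed too many times.
bool mayBeOutOfBounds(const std::vector<ToyNode> &Nodes, int N,
                      uint64_t Remaining,
                      std::unordered_map<int, PhiState> &Phis) {
  const ToyNode &Node = Nodes[N];
  if (Node.IsPhi) {
    auto Res = Phis.insert(std::make_pair(N, PhiState{Remaining, 0}));
    if (!Res.second) {
      PhiState &State = Res.first->second;
      if (Remaining >= State.MinRemaining)
        return false; // already analyzed with an equal or smaller size
      State.MinRemaining = Remaining; // smaller size: redo the analysis
      if (++State.Decreases > MaxPhiDecreases)
        return true; // bound the work, including around loops
    }
  }
  if (Node.Advance > Remaining)
    return true;
  Remaining -= Node.Advance;
  if (Node.AccessSize > Remaining)
    return true;
  for (int U : Node.Users)
    if (mayBeOutOfBounds(Nodes, U, Remaining, Phis))
      return true;
  return false;
}
```

With a visit-once policy, a phi first reached with 16 bytes remaining would never be re-checked when a second path reaches it with only 4 bytes remaining; tracking the minimum catches that case.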
Aiden Grossman
f1dbc45210 [MLGO] Properly Handle Counting Evictions of Candidates
This patch makes it so that onEviction actually gets called when the
model ends up selecting the candidate to evict. Where we were handling
this previously ended up being dead code as we would return earlier with
MCRegister::NoRegister.

Fixes #129841.
2025-03-05 07:19:50 +00:00
Kazu Hirata
40c65e8589
[CodeGen] Avoid repeated hash lookups (NFC) (#129821) 2025-03-04 22:17:00 -08:00
Craig Topper
efb966e929 [MIRParser] Use Register::id(). Pass Twine by reference. NFC 2025-03-04 22:02:58 -08:00
Krzysztof Drewniak
e697c99b63
[AMDGPU] Add custom MachineValueType entries for buffer fat pointers (#127692)
The old hack of returning v5/v6i32 for the fat and strided buffer
pointers was causing issues during vectorization queries that expected
to be able to construct a VectorType from the return value of `MVT
getPointerType()`. One example is in the test attached to this PR, which
used to crash.

Now, we define the custom MVT entries, the 160-bit
amdgpuBufferFatPointer and 192-bit amdgpuBufferStridedPointer, which are
used to represent ptr addrspace(7) and ptr addrspace(9) respectively.

Neither of these types will be present at the time of lowering to a
SelectionDAG or other MIR - MVT::amdgpuBufferFatPointer is eliminated by
the LowerBufferFatPointers pass and amdgpu::bufferStridedPointer is not
currently used outside of the SPIR-V translator (which does its own
lowering).

An alternative solution would be to add MVT::i160 and MVT::i192. We
elect not to do this now as it would require changes to unrelated code
and runs the risk of breaking any SelectionDAG code that assumes that
the MVT series are all powers of two (and so can be split apart and
merged back together) in ways that wouldn't be obvious if someone tried
to use MVT::i160 in codegen. If i160 is added at some future point,
these custom types can be retired.
2025-03-04 17:19:06 -06:00
Craig Topper
6ca2a9f2df
[CodeGen] Use Register in SDep interface. NFC (#129734) 2025-03-04 12:26:28 -08:00
Lucas Ramirez
03677f63a7
[MachineScheduler] Optional scheduling of single-MI regions (#129704)
Following 15e295d the machine scheduler no longer filters out single-MI
regions when emitting regions to schedule. While this has no functional
impact at the moment, it generally has a negative compile-time impact
(see #128739).

Since all targets but AMDGPU do not care for this behavior, this
introduces an off-by-default flag to `ScheduleDAGInstrs` to control
whether such regions are going to be scheduled, effectively reverting
15e295d for all targets but AMDGPU (currently the only target enabling
this flag).
2025-03-04 17:46:44 +01:00
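A toy sketch of an off-by-default flag that controls whether single-instruction regions are handed to the scheduler. `ToyRegion`, `regionsToSchedule`, and the parameter name are hypothetical; the real flag lives on `ScheduleDAGInstrs` and its exact name is not given in the message above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical region descriptor: only the instruction count matters here.
struct ToyRegion {
  unsigned NumInstrs;
};

// Selects the regions worth scheduling. Single-instruction regions are
// skipped unless the caller opts in, mirroring an off-by-default flag.
inline std::vector<ToyRegion>
regionsToSchedule(const std::vector<ToyRegion> &All,
                  bool ScheduleSingleMIRegions = false) {
  std::vector<ToyRegion> Out;
  for (const ToyRegion &R : All) {
    if (R.NumInstrs == 1 && !ScheduleSingleMIRegions)
      continue; // nothing to reorder, so skip to save compile time
    Out.push_back(R);
  }
  return Out;
}
```

A target that wants the previous behavior would pass `true`; all others keep the cheaper default.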
James Chesterman
e3c8e17b07 Reland "[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083)"
This relands commit 7a06681398a33d53ba6d661777be8b4c1d19acb7.
2025-03-04 11:09:33 +00:00
Vikram Hegde
e0eb4edad6
[CodeGen][NewPM] Port "FixupStatepointCallerSaved" pass to NPM (#129541) 2025-03-04 15:47:43 +05:30
Kazu Hirata
ef94d8a0f2
[CodeGen] Avoid repeated hash lookups (NFC) (#129652) 2025-03-04 01:49:48 -08:00
Kazu Hirata
7a06681398 Revert "[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083)"
This reverts commit 2bef21f24ba932a757a644470358c340f4bcd113.

Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/127083
2025-03-04 01:44:09 -08:00
James Chesterman
2bef21f24b
[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083)
Add generic DAG combine for ISD::PARTIAL_REDUCE_U/SMLA nodes. Transforms
the DAG from:
PARTIAL_REDUCE_MLA(Acc, MUL(EXT(MulOpLHS), EXT(MulOpRHS)), Splat(1)) to
PARTIAL_REDUCE_MLA(Acc, MulOpLHS, MulOpRHS).
2025-03-04 09:09:15 +00:00
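The equivalence behind this combine can be checked on a toy numeric model. The lane mapping used below (each product is added to accumulator lane `I % Acc.size()`) is an assumption made for illustration, and all function names are hypothetical; the sketch only shows that multiplying extended operands and then reducing against a splat of 1 gives the same sums as feeding the narrow operands to a multiply-accumulate reduction directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Wide = std::vector<int32_t>;
using Narrow = std::vector<int8_t>;

inline Wide signExtend(const Narrow &V) { return Wide(V.begin(), V.end()); }
inline Wide splat(int32_t V, std::size_t N) { return Wide(N, V); }

inline Wide mul(const Wide &A, const Wide &B) {
  Wide R(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    R[I] = A[I] * B[I];
  return R;
}

// "Before" form: reduce already-wide operands into the accumulator.
inline Wide partialReduceMLA(Wide Acc, const Wide &LHS, const Wide &RHS) {
  assert(!Acc.empty() && LHS.size() == RHS.size());
  for (std::size_t I = 0; I < LHS.size(); ++I)
    Acc[I % Acc.size()] += LHS[I] * RHS[I]; // assumed lane mapping
  return Acc;
}

// "After" form: the extension is folded into the reduction itself.
inline Wide partialReduceSMLA(Wide Acc, const Narrow &LHS, const Narrow &RHS) {
  assert(!Acc.empty() && LHS.size() == RHS.size());
  for (std::size_t I = 0; I < LHS.size(); ++I)
    Acc[I % Acc.size()] += int32_t(LHS[I]) * int32_t(RHS[I]);
  return Acc;
}
```

For any inputs, `partialReduceMLA(Acc, mul(signExtend(A), signExtend(B)), splat(1, A.size()))` and `partialReduceSMLA(Acc, A, B)` produce the same vector in this model.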
Kazu Hirata
a5bbfcf0c9
[GlobalISel] Avoid repeated hash lookups (NFC) (#129653) 2025-03-04 00:08:40 -08:00
Matt Arsenault
39bf765bb6
DAG: Use phi to create vregs instead of the constant input (#129464)
For most targets, the register class comes from the type so this
makes no difference. For AMDGPU, the selected register class depends
on the divergence of the value. For a constant phi input, this will
always be false. The heuristic for whether to treat the value as
a scalar or vector constant based on the uses would then incorrectly
think this is a scalar use, when really the phi is a copy from S to V.

This avoids an intermediate s_mov_b32 plus a copy in some cases. These
would often, but not always, fold out in mi passes.

This only adjusts the constant input case. It may make sense to do
this for the non-constant case as well.
2025-03-04 14:44:54 +07:00
Akshat Oke
af4ec59f8d
[CodeGen][NPM] Port ExpandPostRAPseudos to NPM (#129509) 2025-03-04 11:49:09 +05:30
Akshat Oke
3aab3fe56f
[NPM][NFC] Chain PreservedAnalyses methods (#129505) 2025-03-04 10:23:01 +05:30
Matt Arsenault
4670f0d827
MachineVerifier: Print name of failing subregister index (#129491)
I'm not sure of a good example to test the "does not fully support"
case.
2025-03-04 11:25:34 +07:00
Matt Arsenault
8476a5d480
SplitKit: Fix rematerialization undoing subclass based split (#122110)
This fixes an allocation failure in the new test.

In cases where getLargestLegalSuperClass can inflate the register class,
rematerialization could effectively undo a split which was done to
inflate the register class, if the defining instruction can only write a
subclass and the use can read the superclass.

Some of the x86 test changes look like improvements, but some are
likely regressions.

I'm not entirely sure this is the correct place to fix this. It also
seems more complicated than necessary, but the decision to change
the register class is far removed from the point where the decision
to split the virtual register is made. I'm also not sure if this
should be considering the register classes of all the use indexes
in getUseSlots, rather than just checking if this use index instruction
reads the register.
2025-03-04 10:04:14 +07:00
Jeffrey Byrnes
3963d21482
[MachineSink] Fix typo in loop sinking (#127133)
Failure to sink a candidate should not block us from attempting to sink
other candidates. There are mechanisms in place to handle the case where
the instruction that failed to sink uses an instruction that gets sunk (we
do not delete the original instruction corresponding to the sunk
instruction if it still has uses).
2025-03-03 17:30:12 -08:00
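The behavior described above, where one failed candidate does not stop the rest from being tried, can be sketched as a loop that skips rather than exits. `sinkAll` and the `TrySink` callback are hypothetical, and whether the original typo was literally a `break` in place of a `continue` is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Tries every candidate in order and returns how many were sunk.
inline unsigned sinkAll(const std::vector<int> &Candidates,
                        const std::function<bool(int)> &TrySink) {
  unsigned NumSunk = 0;
  for (int C : Candidates) {
    if (!TrySink(C))
      continue; // move on to the next candidate instead of giving up
    ++NumSunk;
  }
  return NumSunk;
}
```

With an early exit in place of `continue`, a failure on the first candidate would leave every later candidate untried.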
Kazu Hirata
bcb0c3a291
[CodeGen] Avoid repeated hash lookups (NFC) (#129465) 2025-03-03 07:27:16 -08:00
Vikram Hegde
6abe148bac
[CodeGen][NewPM] Port "RemoveRedundantDebugValues" to NPM (#129005) 2025-03-03 19:57:50 +07:00
Benjamin Maxwell
55fdeccc45
[SDAG][X86] Remove hack needed to avoid missing x87 FPU stack pops (#128055)
Suppose a (two-result) node like `FMODF` or `FFREXP` is expanded to a
library call whose function prototype is like `float(float, float*)`,
that is, it returns one float from the call and another via an output
pointer. The first result of the node maps to the value returned by
value and the second result maps to the value returned via the output
pointer.

If only the second result is used after the expansion, we hit an issue
on x87 targets:

```
// Before expansion: 
t0, t1 = fmodf x
return t1  // t0 is unused
```

Expanded result:
```
ptr = alloca
ch0 = call modf ptr
t0, ch1 = copy_from_reg, ch0 // t0 unused
t1, ch2 = ldr ptr, ch1
return t1
```

So far things are alright, but the DAGCombiner optimizes this to:
```
ptr = alloca
ch0 = call modf ptr
// copy_from_reg optimized out
t1, ch1 = ldr ptr, ch0
return t1
```

On most targets this is fine. The optimized out `copy_from_reg` is
unused and is a NOP. However, x87 uses a floating-point stack, and if
the `copy_from_reg` is optimized out it won't emit a pop needed to
remove the unused result.

The prior solution for this was to attach the chain from the
`copy_from_reg` to the root, which did work, however, the root is not
always available (it's set to null during legalize types). So the
alternate solution in this patch is to replace the `copy_from_reg` with
an `X86ISD::POP_FROM_X87_REG` within the X86 call lowering. This node is
the same as `copy_from_reg` except this node makes it explicit that it
may lower to an x87 FPU stack pop. Optimizations should be more cautious
when handling this node than a normal CopyFromReg to avoid removing a
required FPU stack pop.

```
ptr = alloca
ch0 = call modf ptr
t0, ch1 = pop_from_x87_reg, ch0 // t0 unused
t1, ch2 = ldr ptr, ch1
return t1
```

Using this node ensures a required x87 FPU pop is not removed due to the
DAGCombiner.

This is an alternate solution for #127976.
2025-03-03 12:23:28 +00:00
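The problem and the fix can be mimicked with a toy model that tracks x87 stack depth. None of these types are SelectionDAG types: `ToyOp`, `removeDeadCopies`, and `x87DepthAfter` are hypothetical, and the cleanup rule is deliberately simplified to "an unused plain copy is removable, a pop node is not".

```cpp
#include <cassert>
#include <vector>

enum class ToyOp { CallReturningOnX87, CopyFromReg, PopFromX87Reg, Load };

struct ToyNode {
  ToyOp Kind;
  bool ResultUsed;
};

// Toy cleanup: drops an unused CopyFromReg, but always keeps PopFromX87Reg
// because it may stand for a required stack pop.
inline std::vector<ToyNode> removeDeadCopies(const std::vector<ToyNode> &Nodes) {
  std::vector<ToyNode> Kept;
  for (const ToyNode &N : Nodes)
    if (!(N.Kind == ToyOp::CopyFromReg && !N.ResultUsed))
      Kept.push_back(N);
  return Kept;
}

// Simulated stack depth: the call pushes its return value, and either
// flavor of copy pops it. A nonzero result means a value was left behind.
inline int x87DepthAfter(const std::vector<ToyNode> &Nodes) {
  int Depth = 0;
  for (const ToyNode &N : Nodes) {
    if (N.Kind == ToyOp::CallReturningOnX87) {
      ++Depth;
    } else if (N.Kind == ToyOp::CopyFromReg ||
               N.Kind == ToyOp::PopFromX87Reg) {
      assert(Depth > 0 && "pop from an empty stack");
      --Depth;
    }
  }
  return Depth;
}
```

In this model, a sequence using `CopyFromReg` with an unused result ends with a leftover stack entry after cleanup, while the same sequence using `PopFromX87Reg` ends with an empty stack.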
Akshat Oke
77f44a9642
[CodeGen][NewPM] Port MachineSink to NPM (#115434)
Targets can set the EnableSinkAndFold option in CGPassBuilderOptions for
the NPM pipeline in buildCodeGenPipeline(... &Opts, ...)
2025-03-03 15:49:37 +05:30
Pedro Lobo
75bfdebdee
[SelectionDAG] Use poison instead of undef for dbg.values (#127915)
`undef dbg.values` can be replaced with `poison dbg.values`.
2025-03-03 10:54:26 +01:00
Matt Arsenault
cb113a7812
RegisterCoalescer: Avoid repeated getRegClass on all paths (#129490) 2025-03-03 16:05:54 +07:00
Craig Topper
5387a77f8b [CodeGen] Use MCRegister in CalleeSavedInfo. NFC 2025-03-02 23:46:18 -08:00
Craig Topper
caa798cb1e [GlobalISel] Use Register. NFC 2025-03-02 23:46:18 -08:00
Craig Topper
7cee4c7c59 [CallingConvLower] Use MCRegister. NFC 2025-03-02 23:46:18 -08:00
Craig Topper
9f8e148a6c [CalcSpillWeights] Use Register. NFC 2025-03-02 23:46:17 -08:00
Craig Topper
3fe22559c7 [InlineSpiller] Use Register. NFC 2025-03-02 23:46:17 -08:00
Craig Topper
49ba565913 [IfConversion] Use MCRegister. NFC 2025-03-02 23:46:16 -08:00
Craig Topper
8a9a363ffb [MIRCanonicalizerPass] Use MCRegister. NFC 2025-03-02 23:46:16 -08:00
Huibin Wang
59138a603f
[DAGCombiner] Cleanup MatchFunnelPosNeg by using SDPatternMatch matchers (#129482)
Fixes issue: https://github.com/llvm/llvm-project/issues/129034
2025-03-03 14:35:38 +07:00
chrisPyr
71f4c7dabe
[NFC]Make file-local cl::opt global variables static (#126486)
#125983
2025-03-03 13:46:33 +07:00
Craig Topper
aaaaa4d256 [MachineLICM] Use Register. NFC 2025-03-02 22:33:26 -08:00
Craig Topper
dd9bb32b97 [MachineCSE] Const correct some function arguments. NFC 2025-03-02 22:33:26 -08:00
Craig Topper
a70175ab93 [CodeGen] Use MCRegister and Register. NFC 2025-03-02 22:33:26 -08:00
Craig Topper
13cce8c0bc [CodeGen] Use Register::id() to avoid implicit cast. NFC 2025-03-02 22:33:26 -08:00