35846 Commits

Author SHA1 Message Date
Simon Pilgrim
54b20cbb95
[DAG] computeKnownBits - abds(x, y) will be zero in the upper bits if x and y are sign-extended (#94448)
As reported on #94442 - if x and y have more than one signbit, then the upper bits of their absolute difference are guaranteed to be zero

Sibling PR to #94382

Alive2: https://alive2.llvm.org/ce/z/7_z2Vc

Fixes #94442
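
The claim above can be checked exhaustively at a small width. The sketch below is not the LLVM implementation: it assumes 8-bit values, models the sign-bit count as the number of leading bits equal to the sign bit, and uses helper names made up for illustration. It checks that the absolute difference always has at least `min(signbits) - 1` leading zero bits; the exact count the patch reports is not stated here.

```python
BITS = 8  # assumed width, chosen small enough for an exhaustive check
MASK = (1 << BITS) - 1


def to_signed(raw):
    """Interpret the low BITS bits of raw as a two's-complement integer."""
    raw &= MASK
    return raw - (1 << BITS) if raw >> (BITS - 1) else raw


def num_sign_bits(raw):
    """Count the leading bits that equal the sign bit (sign bit included)."""
    raw &= MASK
    sign = raw >> (BITS - 1)
    count = 0
    for i in range(BITS - 1, -1, -1):
        if (raw >> i) & 1 != sign:
            break
        count += 1
    return count


def leading_zeros(raw):
    """Count the leading zero bits of the low BITS bits of raw."""
    return BITS - (raw & MASK).bit_length()


def abds(x_raw, y_raw):
    """Signed absolute difference, wrapped to BITS bits."""
    return abs(to_signed(x_raw) - to_signed(y_raw)) & MASK


def abds_upper_bits_are_zero():
    """Check every pair of BITS-wide inputs against the claimed bound."""
    for x in range(1 << BITS):
        for y in range(1 << BITS):
            sign_bits = min(num_sign_bits(x), num_sign_bits(y))
            if leading_zeros(abds(x, y)) < sign_bits - 1:
                return False
    return True
```

With `x = 0xFE` (-2) and `y = 0x03` (3) the inputs have 7 and 6 sign bits, and the result 5 has exactly 5 leading zeros, so the bound is tight for that pair.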
2024-06-05 11:57:55 +01:00
Simon Pilgrim
e635520be8
[DAG] computeKnownBits - abs(x) will be zero in the upper bits if x is sign-extended (#94382)
As reported on https://github.com/llvm/llvm-project/issues/94344 - if x has more than one signbit, then the upper bits of its absolute value are guaranteed to be zero

Alive2: https://alive2.llvm.org/ce/z/a87fHU

Fixes #94344
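
A short range argument shows why this holds: a value with at least `s` sign bits at width `N` lies in `[-2**(N - s), 2**(N - s))`, so its magnitude fits in the low `N - s + 1` bits. The sketch below assumes `N = 8` and uses hypothetical function names; it is an illustration of the reasoning, not the code in the patch.

```python
BITS = 8  # assumed width for the illustration


def abs_fits_in_low_bits(sign_bits):
    """Check abs(x) for every BITS-wide x with at least sign_bits sign bits.

    Such values lie in [-2**(BITS - sign_bits), 2**(BITS - sign_bits)), so
    the magnitude should fit in the low BITS - sign_bits + 1 bits.
    """
    bound = 1 << (BITS - sign_bits)
    low_bits = BITS - sign_bits + 1
    return all(abs(x) >> low_bits == 0 for x in range(-bound, bound))


def known_zero_upper_bits(sign_bits):
    """Number of upper bits of abs(x) that this argument proves to be zero."""
    return sign_bits - 1
```

With a single sign bit nothing is proven, which matches the `abs(INT_MIN)` case where the top bit of the result stays set.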
2024-06-05 10:58:54 +01:00
Fangrui Song
cb09b5f3d5 [MC] Disable MCAssembler based constant folding for compact unwind and emitJumpTableEntry
Similar to commit 245491a9f384e4c53421196533c2a2b693efaf8d for DwarfDebug.

This completely disables the expensive MCFragment walk code in
`AttemptToFoldSymbolOffsetDifference` when compiling sqlite3.i for
macOS.

In the future, we should try enabling the MCFragment walk only for
constructs like `.if . -_start == 1` and `.subsection a-b` and
remove these `setUseAssemblerInfoForParsing`.
2024-06-04 15:06:12 -07:00
Nikita Popov
deab451e7a
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.

This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
2024-06-04 08:31:03 +02:00
paperchalice
9b0e1c2ca2
[NewPM][CodeGen] Port finalize-isel to new pass manager (#94214)
It should preserve more analysis results, but it happens immediately
after instruction selection.
2024-06-04 09:23:52 +08:00
Keith Smiley
cac5d0e938
[CodeGen] Fix compiler conditional combination (#94297)
Previously this assumed that `LLVM_ENABLE_ABI_BREAKING_CHECKS` would
always be enabled in this case; if it is not, `TTI` does not exist.

Introduced in 7652a59407018c057cdc1163c9f64b5b6f0954eb
2024-06-04 09:16:00 +08:00
Rahman Lavaee
8ec1161fe6
[Codegen, BasicBlockSections] Avoid cloning blocks which have their machine block address taken. (#94296)
These blocks usually show up in the form of branches within inline
assembly. Since it's hard to rewire them, we fully omit paths with such
blocks from path cloning.
2024-06-03 17:22:43 -07:00
paperchalice
7652a59407
Reland "[NewPM][CodeGen] Port selection dag isel to new pass manager" (#94149)
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly
2024-06-04 08:10:58 +08:00
Jon Roelofs
0b4af3a5f4
[llvm][SelectionDAG] Relax llvm.ptrmask's size check on arm64_32 (#94125)
Since pointers in memory, as well as the index type are both 32 bits,
but in registers pointers are 64 bits, the mask generated by
llvm.ptrmask needs to be zero-extended.

Fixes: #94075
Fixes: rdar://125263567
2024-06-03 15:26:30 -07:00
Xuan Zhang
16c925ab5f
[MachineOutliner] Efficient Implementation of MachineOutliner::findCandidates() (#90260)
This reduces the time complexity of the main loop of the `findCandidates()`
method from $O(n^2)$ to $O(n \log n)$.

For small $n$, the modification does not regress the build time, but it
helps significantly when $n$ is large.

For one application, this reduces the runtime of the main loop from 120
seconds to 28 seconds.

This is the first commit for an enhanced version of machine outliner --
see
[RFC](https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-1-fulllto-part-2-thinlto-nolto-to-come/78732).
2024-06-03 07:41:49 -07:00
Michael Maitland
0f669154e1
[GlobalMerge] Add MinSize feature to the GlobalMerge Pass. (#93686)
We add a feature that prevents the GlobalMerge pass from considering
data smaller than a minimum size in bytes for merging.

The MinSize is set in 3 ways:
1. If global-merge-min-data-size is explicitly set, then it uses that
value.
2. If SmallDataLimit is set and non-zero, then SmallDataLimit + 1 is
used.
3. Otherwise, 0 is used, which means all sizes are considered for
merging.
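
The three rules above can be sketched as a small selection function. This is a hedged illustration only: the function and parameter names are hypothetical, an unset `global-merge-min-data-size` is modelled as `None`, and an unset SmallDataLimit is modelled as `0`.

```python
from typing import Optional


def select_min_size(explicit_min_size: Optional[int], small_data_limit: int) -> int:
    """Pick the minimum size (in bytes) following the three rules in order."""
    if explicit_min_size is not None:
        # Rule 1: an explicitly set option always wins.
        return explicit_min_size
    if small_data_limit != 0:
        # Rule 2: skip anything that fits within the small data limit.
        return small_data_limit + 1
    # Rule 3: no minimum, every size is considered.
    return 0


def is_merge_candidate(size_in_bytes: int, min_size: int) -> bool:
    """Data smaller than min_size is not considered for merging."""
    return size_in_bytes >= min_size
```

For example, with a small data limit of 8 and no explicit option, the minimum becomes 9, so an 8-byte global is left alone while a 9-byte one remains a candidate.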

We found that this feature allowed us to see the benefit of the
GlobalMerge pass while eliminating some merging that was not beneficial.
This feature allowed us to enable the GlobalMerge pass on RISC-V in our
downstream by default because it led to improvements on multiple
benchmark suites.

I plan to post a separate patch to propose enabling this by default on
RISC-V. But I do not want that discussion to be part of the discussion
of adding this feature, so I am keeping the patches separate.
2024-06-03 09:10:56 -04:00
Dhruv Chawla
e12bf36d23
[GISel][CombinerHelper] Combine op(trunc(x), trunc(y)) -> trunc(op(x, y)) (#89023) 2024-06-03 10:42:10 +05:30
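
The rewrite in the title is valid when `op` works on each bit independently. The sketch below assumes `op` is one of and, or, xor (which opcodes the patch actually handles is not stated here) and checks the identity exhaustively for a truncation from 8 to 4 bits; all names are made up for illustration.

```python
import operator

WIDE_BITS = 8    # assumed source width
NARROW_BITS = 4  # assumed truncated width
NARROW_MASK = (1 << NARROW_BITS) - 1


def trunc(value):
    """Keep only the low NARROW_BITS bits."""
    return value & NARROW_MASK


def combine_holds(op):
    """Check op(trunc(x), trunc(y)) == trunc(op(x, y)) for all wide inputs."""
    return all(
        op(trunc(x), trunc(y)) == trunc(op(x, y))
        for x in range(1 << WIDE_BITS)
        for y in range(1 << WIDE_BITS)
    )


# Bitwise operations for which the identity is expected to hold.
BITWISE_OPS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}
```

The same check fails for a right shift, because bits above the truncation point can move into the low half, so the identity is not a property of arbitrary binary operations.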
Joshua Cao
ab08df2292
[IR] Do not set none for function uwtable (#93387)
This avoids the pitfall where we set the uwtable to none:
```
func.setUWTableKind(llvm::UWTableKind::None)
```
`Attribute::getAsString()` would see an unknown attribute and fail an
assertion. In this patch, we assert that we do not see a None uwtable
kind.

This also skips the check of `UWTableKind::Async`. It is dominated by
the check of `UWTableKind::Default`, which has the same enum value
(nfc).
2024-06-02 15:02:11 -07:00
Simon Pilgrim
c9a86fa9a6 [DAG] canCreateUndefOrPoison - fix missing argument typo
We were missing the PoisonOnly argument, so Depth + 1 was being passed in its place and the default Depth = 0 argument was then silently used.

Fixes #94145 and serves as the test case for 9e22c7a0ea87228dffcdfd7ab62724f72e0b3e30
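
This class of bug is easy to reproduce in any language with positional and default arguments. The following is a Python analogy, not the LLVM code: the function is a hypothetical stand-in with a simplified parameter list that only reports how its arguments were bound.

```python
def can_create_undef_or_poison(op, poison_only=False, depth=0):
    """Hypothetical stand-in that just reports how its arguments were bound."""
    return {"op": op, "poison_only": bool(poison_only), "depth": depth}


def buggy_recursive_call(op, poison_only, depth):
    # The flag is omitted, so depth + 1 lands in the poison_only slot
    # and depth silently falls back to its default of 0.
    return can_create_undef_or_poison(op, depth + 1)


def fixed_recursive_call(op, poison_only, depth):
    # Passing the flag explicitly puts depth + 1 back in the right slot.
    return can_create_undef_or_poison(op, poison_only, depth + 1)
```

Both calls are accepted without complaint, which is why the slip went unnoticed: the depth value is simply converted to a truthy flag.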
2024-06-02 10:34:48 +01:00
paperchalice
8917afaf0e
Revert "[NewPM][CodeGen] Port selection dag isel to new pass manager" (#94146)
This reverts commit de37c06f01772e02465ccc9f538894c76d89a7a1 to
de37c06f01772e02465ccc9f538894c76d89a7a1

It still breaks EXPENSIVE_CHECKS build. Sorry.
2024-06-02 14:31:52 +08:00
paperchalice
d2cdc8ab45
[NewPM][CodeGen] Port selection dag isel to new pass manager (#83567)
Port selection dag isel to new pass manager.
Only `AMDGPU` and `X86` support the new pass version. In the new pass manager, `-verify-machineinstrs` belongs to the verify instrumentation and is enabled by default.
2024-06-02 09:12:33 +08:00
Simon Pilgrim
9e22c7a0ea [DAG] canCreateUndefOrPoison - only compute shift amount knownbits when not poison
Since #93182 we can now call computeKnownBits inside getValidMaximumShiftAmount to determine the bounds of the shift amount, ensuring that it wasn't poison, meaning if we did freeze the shift amount, isGuaranteedNotToBeUndefOrPoison would then fail, as we can't call computeKnownBits through FREEZE for potentially poison values.

I'm still reducing a decent test case but wanted to get the buildbot fix ASAP.
2024-06-01 19:05:27 +01:00
Simon Pilgrim
2b1dfd2b35
[DAG] Replace getValid*ShiftAmountConstant helpers with getValid*ShiftAmount helpers to support KnownBits analysis (#93182)
The getValidShiftAmountConstant/getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant helpers only worked with constant shift amounts, which could be problematic after type legalization (e.g. v2i64 might be partially scalarized or split into v4i32 on some targets such as 32-bit x86, Thumb2 MVE).

This patch proposes we generalize these helpers to work with ConstantRange+KnownBits if a scalar/buildvector constant isn't available.

Most restrictions are the same - the helper fails if any shift amount is out of bounds, getValidShiftConstant must be a specific constant uniform etc.

However, getValidMinimumShiftAmount/getValidMaximumShiftAmount now can return bounds values that aren't values in the actual data, as they are based off the common KnownBits of every vector element.

This addresses feedback on #92096
2024-06-01 16:48:26 +01:00
Yingwei Zheng
47fd32f81c
[DAGCombine] Fix type mismatch in (shl X, cttz(Y)) -> (mul (Y & -Y), X) (#94008)
Proof: https://alive2.llvm.org/ce/z/J7GBMU

Same as https://github.com/llvm/llvm-project/pull/92753, the types of
LHS and RHS in shift nodes may differ.
+ When VT is smaller than ShiftVT, it is safe to use trunc.
+ When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
proofs.

Fixes issue
https://github.com/llvm/llvm-project/pull/85066#issuecomment-2142553617.
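
The two bullet points above can be checked with a small model. This is a sketch under stated assumptions, not the DAGCombine code: `None` stands for poison, a shift by at least the width of VT is treated as poison, plain `cttz` of zero returns the width of ShiftVT, and all names and widths are hypothetical.

```python
def mask(bits):
    return (1 << bits) - 1


def cttz(y, bits):
    """Count trailing zeros of a bits-wide value; returns bits when y == 0."""
    y &= mask(bits)
    return bits if y == 0 else (y & -y).bit_length() - 1


def source(x, y, vt_bits, shift_bits, zero_is_poison):
    """Model of shl X, cttz(Y); None stands for poison."""
    if y == 0 and zero_is_poison:
        return None
    amount = cttz(y, shift_bits)
    if amount >= vt_bits:
        return None  # over-wide shift
    return (x << amount) & mask(vt_bits)


def target(x, y, vt_bits, shift_bits):
    """Model of mul (Y & -Y), X with the factor converted to VT."""
    lowest_set_bit = (y & -y) & mask(shift_bits)
    # Masking to VT acts as trunc when VT is narrower; when VT is wider
    # it changes nothing, which corresponds to a zext.
    factor = lowest_set_bit & mask(vt_bits)
    return (x * factor) & mask(vt_bits)


def refines(vt_bits, shift_bits, zero_is_poison):
    """True if target matches source wherever source is not poison."""
    for x in range(1 << vt_bits):
        for y in range(1 << shift_bits):
            src = source(x, y, vt_bits, shift_bits, zero_is_poison)
            if src is not None and src != target(x, y, vt_bits, shift_bits):
                return False
    return True
```

In this model `refines(4, 8, False)` holds (trunc case), `refines(8, 4, True)` holds (zext with zero as poison), and `refines(8, 4, False)` fails at `Y = 0`, where the source shifts by 4 but the target multiplies by 0.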
2024-06-01 19:04:55 +08:00
Yingwei Zheng
0864501b97
[GISel] Convert zext nneg to sext if it is cheaper (#93856)
This patch converts `zext nneg` to `sext` on RISCV to use free sext.
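
The rewrite is sound because zero extension and sign extension agree whenever the source value is non-negative, which is what `nneg` asserts. A minimal check, assuming an 8-bit to 16-bit extension and hypothetical helper names:

```python
SRC_BITS = 8   # assumed source width
DST_BITS = 16  # assumed destination width


def zext(raw):
    """Zero-extend the low SRC_BITS bits to DST_BITS."""
    return raw & ((1 << SRC_BITS) - 1)


def sext(raw):
    """Sign-extend the low SRC_BITS bits to DST_BITS."""
    raw &= (1 << SRC_BITS) - 1
    if raw >> (SRC_BITS - 1):
        raw -= 1 << SRC_BITS
    return raw & ((1 << DST_BITS) - 1)


def nneg_values_agree():
    """zext and sext give the same result for every non-negative input."""
    return all(zext(v) == sext(v) for v in range(1 << (SRC_BITS - 1)))
```

For inputs with the sign bit set the two differ (for `0x80`, `0x0080` versus `0xFF80`), which is why the `nneg` flag is required.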

---------

Co-authored-by: Thorsten Schütt <schuett@gmail.com>
2024-06-01 15:02:43 +08:00
Ahmed Bougacha
cc548ec47c
[AArch64][PAC] Lower authenticated calls with ptrauth bundles. (#85736)
This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)

This allows the generation of combined authenticating calls
on AArch64 (in the BLRA* PAuth instructions), while preventing
the raw just-authenticated function pointer from being
exposed to attackers.

This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure, eventually selecting a BLRA
pseudo.  The pseudo encapsulates the safe discriminator
computation, which together with the real BLRA* call get emitted
in late pseudo expansion in AsmPrinter.

Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls.  Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.
However this doesn't currently support PAuth_LR tail-call variants.

This also adopts an x8+ allocation order for GPR64noip, matching
GPR64.
2024-05-31 14:08:10 -07:00
Egor Pasko
cab81dd038
[EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines (#92171)
Move EntryExitInstrumenter(PostInlining=true) to as late as possible and
EntryExitInstrumenter(PostInlining=false) to an early pre-inlining stage
(but skip for ThinLTO post-link).

This should fix the issues reported in
https://github.com/rust-lang/rust/issues/92109 and
https://github.com/llvm/llvm-project/issues/52853. These are caused
by https://reviews.llvm.org/D97608.
2024-05-31 12:48:45 -07:00
Jay Foad
b1be480b03 [DAGCombiner] Move CanReassociate down to first use. NFC. 2024-05-31 09:44:47 +01:00
Jianjian Guan
db6de1a20f
[DAGCombiner][VP] Add DAGCombine for VP_MUL (#80105)
Use visitMUL to combine VP_MUL, share most logic of MUL with VP_MUL.

Migrate from https://reviews.llvm.org/D121187
2024-05-31 10:17:11 +08:00
David Green
b5db2e1969 [MCP] Remove unused TII argument. NFC
Last used in e35fbf5c04f4719db8ff7c7a993cbf96bb706903.
2024-05-30 15:01:02 +01:00
Roger Ferrer Ibáñez
05e6bb40eb
[SelectionDAG] Add an ISD::CLEAR_CACHE node to lower llvm.clear_cache (#93795)
The current way of lowering `llvm.clear_cache` is a bit unusual. As
suggested by Matt Arsenault we are better off using an ISD node.

This change introduces a new `ISD::CLEAR_CACHE`, registers a new libcall
by default named `__clear_cache` and the default legalisation is a
libcall.

This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE`
needed by RISC-V on some platforms.
2024-05-30 14:55:32 +02:00
Him188
8bce40b1eb
[AArch64][GISel] Support SVE with 128-bit min-size for G_LOAD and G_STORE (#92130)
This patch adds basic support for scalable vector types in load & store
instructions for AArch64 with GISel.

Only scalable vector types with a 128-bit base size are supported, e.g.
`<vscale x 4 x i32>`, `<vscale x 16 x i8>`.

This patch adapted some ideas from a similar abandoned patch
[https://github.com/llvm/llvm-project/pull/72976](https://github.com/llvm/llvm-project/pull/72976).
2024-05-30 09:10:43 +01:00
Shubham Sandeep Rastogi
89129201fe
[NFC] Move DIExpressionCursor to DebugInfoMetadata.h (#69768)
This is an NFC patch to move DIExpressionCursor to DebugInfoMetadata.h, so
that it can be used by classes in that header file.

Specifically, I want to use DIExpressionCursor in a subsequent patch:
https://github.com/llvm/llvm-project/pull/71718
2024-05-29 15:36:33 -07:00
Matt Arsenault
f68fdb84e1
DAG: Fix losing flags on select when expanding select_cc (#93662)
This was only preserving the flags on the setcc, not the new select.
This was missing presumably due to getSelect not having a flags argument
until recently. Avoids regressions in a future commit.
2024-05-29 22:02:27 +02:00
Craig Topper
8aceb7a53d
[ValueTypes] Remove MVT::MAX_ALLOWED_VALUETYPE. NFC (#93654)
Despite the comment, this isn't used to size bit vectors or tables.
That's done by VALUETYPE_SIZE. MAX_ALLOWED_VALUETYPE is only used by
some static_asserts that compare it to VALUETYPE_SIZE.

This patch removes it and most of the static_asserts. I left one where I
compared VALUETYPE_SIZE to token, which is the first type that isn't part
of the VALUETYPE range. This isn't strictly needed; we'd probably catch a
duplication error from VTEmitter.cpp first.
2024-05-29 11:47:21 -07:00
aengelke
9fe7aef188
[CodeGen] Don't check attrs for stack realign (#92564)
shouldRealignStack/canRealignStack are repeatedly called in PEI (through
hasStackRealignment). Checking function attributes is expensive, so
cache this data in the MachineFrameInfo, which had most data already.

This slightly changes the semantics of `MachineFrameInfo::ForcedRealign`
to be also true when the `stackrealign` attribute is set.
2024-05-29 20:38:34 +02:00
Simon Pilgrim
4e251e7cad Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFC. 2024-05-29 17:57:34 +01:00
Thorsten Schütt
6d90ac1e06
[GlobalIsel] Combine freeze (#93239) 2024-05-29 18:05:33 +02:00
Nikita Popov
e20f0fe29f [WasmEHPrepare] Explicitly create inbounds GEP (NFCI)
These are known to be inbounds, create them as such. NFCI because
constant expression construction currently already infers this.

Also drop the unnecessary zero-index GEP: This is equivalent to
the pointer itself nowadays.
2024-05-29 16:13:36 +02:00
Yingwei Zheng
24ddce62c8
[GISel] Legalize bitreverse with types smaller than 8 bits (#92998)
This patch adds support for lowering `bitreverse` with types smaller
than 8 bits. It also fixes an existing assertion failure in
`llvm::APInt::getSplat`: https://godbolt.org/z/7crs8xrcG

The lowering logic is copied from SDAG:

2034f2fc87/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (L9384-L9398)
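
For widths below 8 bits the byte-oriented swap tricks do not apply, so a per-bit formulation is one way to express the operation. The sketch below is a generic bit-by-bit reversal with hypothetical names; it is not a reproduction of the referenced TargetLowering.cpp lines.

```python
def bitreverse(value, bits):
    """Reverse the low `bits` bits of value, one bit at a time."""
    result = 0
    for i in range(bits):
        j = bits - 1 - i            # destination position of bit i
        bit = (value >> i) & 1      # extract bit i
        result |= bit << j          # place it at the mirrored position
    return result
```

For a 4-bit value, `0b0001` maps to `0b1000`, and applying the function twice returns the original value.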
2024-05-29 21:42:08 +08:00
Yingwei Zheng
9e8ecce88e
[DAGCombine] Transform shl X, cttz(Y) to mul (Y & -Y), X if cttz is unsupported (#85066)
This patch folds `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is
unsupported by the target.
Alive2: https://alive2.llvm.org/ce/z/AtLN5Y
Fixes https://github.com/llvm/llvm-project/issues/84763.
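
The fold relies on the identity `1 << cttz(Y) == Y & -Y` for non-zero `Y`. A small exhaustive check, assuming 8-bit operands of the same type and excluding `Y == 0` (where the shift amount would equal the bit width); function names are hypothetical:

```python
BITS = 8  # assumed width; X and Y share the same type here
MASK = (1 << BITS) - 1


def cttz(y):
    """Count trailing zeros of a non-zero value."""
    return (y & -y).bit_length() - 1


def shl_cttz(x, y):
    """Original form: shl X, cttz(Y)."""
    return (x << cttz(y)) & MASK


def mul_lowest_bit(x, y):
    """Folded form: mul (Y & -Y), X."""
    return (x * (y & -y)) & MASK


def fold_holds():
    """Compare both forms for every X and every non-zero Y."""
    return all(
        shl_cttz(x, y) == mul_lowest_bit(x, y)
        for x in range(1 << BITS)
        for y in range(1, 1 << BITS)
    )
```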
2024-05-29 18:26:54 +08:00
Matt Arsenault
aef0bdd36d
DAG: Preserve flags when expanding fminimum/fmaximum (#93550)
The operation selection logic here doesn't really work when vector types
need to be split. This was also dropping the flags, and losing nnan made
the combine from select back to fmin/fmax unrecoverable. Preserve the
flags to assist a future commit.
2024-05-29 12:26:27 +02:00
Pengcheng Wang
4e0bd3fab4
[MachineLICM] Hoist copies of constant physical register (#93285)
Previously, we only checked whether the source was a virtual register,
which prevented some potential hoists.

We can see some improvements in AArch64/RISCV tests.
2024-05-29 14:10:01 +08:00
Heejin Ahn
c179d50fd3
[WebAssembly] Add exnref type (#93586)
This adds (back) the exnref type restored in the new EH proposal adopted
in Oct 2023 CG meeting:

https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md
2024-05-28 16:10:11 -07:00
Matt Arsenault
98fa0f6981 DAG: Handle vector splitting for fminnum_ieee/fmaxnum_ieee
Avoids regression in future commit which starts producing
illegal instances.
2024-05-28 22:58:24 +02:00
Craig Topper
1c3a3f0e79
[LegalizeTypes] Use VP_AND and VP_SHL/VP_SRA to promote operands of VP arithmetic. (#92799)
This adds VPSExtPromotedInteger and VPZExtPromotedInteger and uses them
to promote many arithmetic operations.
    
VPSExtPromotedInteger uses a shift pair because we don't have
VP_SIGN_EXTEND_INREG yet.
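
The two promotion styles can be illustrated on plain integers. This is a sketch only: it assumes an 8-bit value promoted into a 32-bit register whose upper bits hold arbitrary data, it ignores the mask and vector-length operands that VP nodes carry, and the function names are hypothetical.

```python
NARROW_BITS = 8   # assumed original type width
WIDE_BITS = 32    # assumed promoted type width
NARROW_MASK = (1 << NARROW_BITS) - 1
WIDE_MASK = (1 << WIDE_BITS) - 1


def zext_promoted(value):
    """Zero-extend in register with a single AND."""
    return value & NARROW_MASK


def sext_promoted(value):
    """Sign-extend in register with a shift pair (SHL followed by SRA)."""
    shift = WIDE_BITS - NARROW_BITS
    shifted = (value << shift) & WIDE_MASK      # SHL: move the sign bit to the top
    if shifted >> (WIDE_BITS - 1):              # reinterpret as signed for SRA
        shifted -= 1 << WIDE_BITS
    return (shifted >> shift) & WIDE_MASK       # SRA: replicate the sign bit down
```

For the promoted value `0x123456FF`, `zext_promoted` yields `0x000000FF` and `sext_promoted` yields `0xFFFFFFFF`, regardless of the junk in the upper 24 bits.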
2024-05-28 12:49:42 -07:00
Matt Arsenault
196a080982 DAG: Handle fminnum_ieee/fmaxnum_ieee in basic legalization
Handle these in promote float and vector widening. Currently we happen
to avoid emitting these unless legal or custom. Avoids regression in
a future commit which wants to unconditionally emit these.
2024-05-28 20:31:17 +02:00
Matt Arsenault
16a5fd3fdb
DAG: Use flags in isLegalToCombineMinNumMaxNum (#93555) 2024-05-28 18:57:38 +02:00
AtariDreams
d582958618
Revert "[Legalizer] Check full condition for UMIN and UMAX just like the code below does for SMIN and SMAX" (#93573)
Reverts llvm/llvm-project#87932
2024-05-28 12:25:43 -04:00
Simon Pilgrim
8a395b00b8 [DAG] Use auto* for cast/dyn_cast results (style). NFC. 2024-05-28 10:03:04 +01:00
Shengchen Kan
eeb2f72a49
[SelectionDAG][X86] Fix the assertion failure in Release build after #91747 (#93434)
In #91747, we changed the SDNode from `X86ISD::SUB` (FROM) to
`X86ISD::CCMP` (TO) in the DAGCombine. The value type of `X86ISD::SUB`
can be `i8, i32` while the value type of `X86ISD::CCMP` is i32. This
breaks the assumption that the value type should match after the combine
and triggers the error

```
SelectionDAG.cpp:10942: void
llvm::SelectionDAG::transferDbgValues(llvm::SDValue, llvm::SDValue,
unsigned int, unsigned int, bool): Assertion `FromNode && ToNode &&
"Can't modify dbg values"' failed.
```

when running tests

llvm/test/CodeGen/X86/apx/ccmp.ll
llvm/test/CodeGen/X86/apx/ctest.ll

in Release build when LLVM_ENABLE_ASSERTIONS is on.

In this patch, we fix it by creating a merged value.
2024-05-27 11:33:23 +08:00
AtariDreams
70bf139651
[Legalizer] Check full condition for UMIN and UMAX just like the code below does for SMIN and SMAX (#87932) 2024-05-26 15:07:31 -04:00
Shengchen Kan
331eb8a004
[X86][CodeGen] Support lowering for CCMP/CTEST (#91747)
DAG combine for `CCMP` and `CTESTrr`:

```
and/or(setcc(cc0, flag0), setcc(cc1, sub (X, Y)))
->
setcc(cc1, ccmp(X, Y, ~cflags/cflags, cc0/~cc0, flag0))

and/or(setcc(cc0, flag0), setcc(cc1, cmp (X, 0)))
 ->
setcc(cc1, ctest(X, X, ~cflags/cflags, cc0/~cc0, flag0))
```
 where `cflags` is determined by `cc1`.

Generic DAG combine:
```
cmp(setcc(cc, X), 0)
brcond ne
->
X
brcond cc

sub(setcc(cc, X), 1)
brcond ne
->
X
brcond ~cc
```

Post DAG transform:  `ANDrr/rm + CTESTrr -> CTESTrr/CTESTmr`


Pattern match for `CTESTri`:
```
X= and A, B
ctest(X, X, cflags, cc0/, flag0)
->
ctest(A, B, cflags, cc0/, flag0)
```

`CTESTmi` is already handled by the memory folding mechanism in MIR.
2024-05-26 18:32:23 +08:00
Craig Topper
a1c9b9673c
[SelectionDAG][RISCV][VE] Rename VP_ASHR->VP_SRA VP_LSHR->VP_SRL. (#93221)
This maintains consistency with the non-VP ISD opcodes.
2024-05-24 09:03:19 -07:00
Simon Pilgrim
729fdb6bb6 [DAG] visitFunnelShift - pull out repeated SDLoc. 2024-05-24 14:50:42 +01:00