36677 Commits

Author SHA1 Message Date
Mirko
4d3c427f33
[CodeGen] Use first EHLabel as a stop gate for live range shrinking (#114195)
This fixes issue #114194

The issue happens during the `LiveRangeShrink` pass, which runs early,
before phi elimination. LandingPads, which are lowered to EHLabels, need
to be the first non phi instruction in an EHPad. In case of a phi node
being in front of the EHLabel and a use being after the EHLabel, we
hoist the use in front of the label.

This results in a portion of the landingpad missing due to being hoisted
in front of the label.
2024-11-01 19:13:18 -07:00
Philip Reames
69edef1ab9 [DAG] Simplify control flow in SelectionDAGBuilder::visitShuffleVector [NFC]
If we've handled ==, and < above, the only case left can be >.  We don't
need to branch on this, and can instead assert and reduce indentation,
and simplify reasoning about the fallthrough path.
2024-11-01 08:59:15 -07:00
Wang Qiang
b77e40265c
[llvm][NFC] Fix typos: replace “avaliable” with “available” across various files (#114524)
This pull request corrects multiple occurrences of the typo "avaliable"
to "available" across the LLVM and Clang codebase. These changes improve
the clarity and accuracy of comments and documentation. Specific
modifications are in the following files:

1. clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp:
Updated comments in readability checks for cognitive complexity.
2. llvm/include/llvm/ExecutionEngine/Orc/ExecutionUtils.h: Corrected
documentation for JITDylib responsibilities.
3. llvm/include/llvm/Target/TargetMacroFusion.td: Fixed descriptions for
FusionPredicate variables.
4. llvm/lib/CodeGen/SafeStack.cpp: Improved comments on DominatorTree
availability.
5. llvm/lib/Target/RISCV/RISCVSchedSiFive7.td: Enhanced resource usage
descriptions for vector units.
6. llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp: Updated invariant
description in shift-detect idiom logic.
7. llvm/test/MC/ARM/mve-fp-registers.s: Amended ARM MVE register
availability notes.
8. mlir/lib/Bytecode/Reader/BytecodeReader.cpp: Adjusted forward
reference descriptions for bytecode reader operations.

These changes have no impact on code functionality, focusing solely on
documentation clarity.

Co-authored-by: wangqiang <wangqiang1@kylinos.cn>
2024-11-01 13:25:04 +00:00
Thorsten Schütt
8e3772744d
[GlobalISel][AArch64] Legalize G_INSERT_VECTOR_ELT for SVE (#114470)
There are patterns for:
* {nxv2s32, s32, s64},
* {nxv4s16, s16, s64},
* {nxv2s16, s16, s64}
2024-11-01 06:10:26 +01:00
Matt Arsenault
9cc298108a
AtomicExpand: Copy metadata from atomicrmw to cmpxchg (#109409)
When expanding an atomicrmw with a cmpxchg, preserve any metadata
attached to it. This will avoid unwanted double expansions
in a future commit.

The initial load should also probably receive the same metadata
(which for some reason is not emitted as an atomic).
2024-10-31 11:54:07 -07:00
Antonio Frighetto
19c8475871 [SelectionDAG] Add preliminary plumbing for samesign flag
Extend recently-added poison-generating IR flag to codegen as well.
2024-10-31 19:47:50 +01:00
Simon Pilgrim
9fb4bc5bf4
[DAG] SimplifyMultipleUseDemandedBits - ignore SRL node if we're just demanding known sign bits (#114389)
Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node.

We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.
2024-10-31 16:40:29 +00:00
Sriraman Tallam
c7ef002bc6
Fix performance bug in buildLocationList (#109343)
In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
sub-optimal and shows up as significantly time consuming when
compiling large functions.

This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.

When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label corresponding to the function
whereas the MBBSectionRanges map uses the function end label.

For example:

.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv

The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".

It is alright to skip this as we already check the section corresponding
to the debugloc entry.

Added a new test case to check that if this works correctly when the
variable's value is mutated in the entry section.
2024-10-31 09:00:25 -07:00
Zaara Syeda
ccddd13602
Enable aggressive constant merge in GlobalMerge for AIX (#113956)
Enable merging all constants without looking at use in GlobalMerge by
default to replace PPCMergeStringPool pass on AIX.
2024-10-31 11:22:48 -04:00
Matt Arsenault
db5bcb24c2
GlobalISel: Fix combine duplicating atomic loads (#111730)
The sext_inreg (load) combine was not deleting the old load instruction,
and it would never be deleted if volatile or atomic.
2024-10-31 07:55:12 -07:00
goldsteinn
1e072ae289
[CGP] [CodeGenPrepare] Folding urem with loop invariant value plus offset (#104724)
This extends the existing fold:

```
for(i = Start; i < End; ++i)
   Rem = (i nuw+- IncrLoopInvariant) u% RemAmtLoopInvariant;
```
 ->
```
Rem = (Start nuw+- IncrLoopInvariant) % RemAmtLoopInvariant;
for(i = Start; i < End; ++i, ++rem)
   Rem = rem == RemAmtLoopInvariant ? 0 : Rem;
```

To work with a non-zero `IncrLoopInvariant`.

This is a common usage in cases such as:

```
for(i = 0; i < N; ++i)
    if ((i + 1) % X) == 0)
        do_something_occasionally_but_not_first_iter();
```

Alive2 w/ i4/unrolled 6x (needs to be ran locally due to timeout):
https://alive2.llvm.org/ce/z/6tgyN3

Exhaust proof over all uint8_t combinations in C++:
https://godbolt.org/z/WYa561388
2024-10-31 09:14:33 -05:00
Benjamin Maxwell
89a8c71db6
[SDAG] Support expanding FSINCOS to vector library calls (#114039)
This shares most of its code with the scalar sincos expansion. It allows
expanding vector FSINCOS nodes to a library call from the specified
`-vector-library`. The upside of this is it will mean the vectorizer
only needs to handle the sincos intrinsic, which has no memory effects,
and this can handle lowering the intrinsic to a call that takes output
pointers.
2024-10-31 12:41:43 +00:00
dnsampaio
28d0718033
[DAGCombiner] Add combine avg from shifts (#113909)
This teaches dagcombiner to fold:
`(asr (add nsw x, y), 1) -> (avgfloors x, y)`
`(lsr (add nuw x, y), 1) -> (avgflooru x, y)`

as well the combine them to a ceil variant:
`(avgfloors (add nsw x, y), 1) -> (avgceils x, y)` 
`(avgflooru (add nuw x, y), 1) -> (avgceilu x, y)`

iff valid for the target.

Removes some of the ARM MVE patterns that are now dead code.
It adds the avg opcodes to `IsQRMVEInstruction` as to preserve the
immediate splatting as before.
2024-10-31 10:57:27 +01:00
Craig Topper
00cbb68fb7 [LegalizeDAG] Use getSignedConstant. NFC 2024-10-30 21:43:16 -07:00
Thorsten Schütt
6effab990c
Revert "[GlobalISel][AArch64] Legalize G_INSERT_VECTOR_ELT for SVE" (#114353)
Reverts llvm/llvm-project#114310
2024-10-31 05:41:16 +01:00
Thorsten Schütt
6bf214b7c6
[GlobalISel][AArch64] Legalize G_INSERT_VECTOR_ELT for SVE (#114310)
There are patterns for:
* {nxv2s32, s32, s64},
* {nxv4s16, s16, s64},
* {nxv2s16, s16, s64}
2024-10-31 04:56:41 +01:00
Craig Topper
f0bae562dc
[GISel] Return const APInt & from getIConstantFromReg. NFC (#114320)
This matches what the call to ConstantInt::getValue() returns. Let the
caller make a copy if needed.
2024-10-30 19:15:51 -07:00
Kazu Hirata
f582cd3dc7 [SelectionDAG] Fix a warning
This patch fixes:

  llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:1489:17: error:
  unused variable 'Flags' [-Werror,-Wunused-variable]
2024-10-30 17:49:51 -07:00
Yingwei Zheng
cf9d1c1486
[SDAG] Simplify SDNodeFlags with bitwise logic (#114061)
This patch allows using enumeration values directly and simplifies the
implementation with bitwise logic. It addresses the comment in
https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.
2024-10-31 08:10:07 +08:00
Thorsten Schütt
b3bb6f18bb
[GlobalISel] Import samesign flag (#114267)
Credits: https://github.com/llvm/llvm-project/pull/111419

Fixes icmp-flags.mir

First attempt: https://github.com/llvm/llvm-project/pull/113090

Revert: https://github.com/llvm/llvm-project/pull/114256
2024-10-30 19:56:25 +01:00
Thorsten Schütt
4b028773b2
Revert "[GlobalISel] Import samesign flag" (#114256)
Reverts llvm/llvm-project#113090
2024-10-30 17:03:17 +01:00
Thorsten Schütt
72b115301d
[GlobalISel] Import samesign flag (#113090)
Credits: https://github.com/llvm/llvm-project/pull/111419
2024-10-30 16:34:01 +01:00
Petar Avramovic
84b7bcfcac
GlobalISel/MachineIRBuilder: Construct DstOp with VRegAttrs (#113581)
Allow construction of DstOp with VRegAttrs.
Also allow construction with register class or bank and LLT.
Intended to be used in lowering code for reg-bank-select where
new registers need to have both register bank and LLT.
Add support for new type of DstOp in CSEMIRBuilder.
2024-10-30 14:15:42 +01:00
Jay Foad
cea9dd833c
[CodeGen] Change MachineInstr::isConstantValuePHI to return Register. NFC. (#112901) 2024-10-30 11:58:59 +00:00
Simon Pilgrim
f7b5f0c805 [DAG] Fold (and X, (rot (not Y), Z)) -> (and X, (not (rot Y, Z)))
On ANDNOT capable targets we can always do this profitably, without ANDNOT we only attempt this if we don't introduce an additional NOT

Followup to #112547
2024-10-30 10:46:12 +00:00
Akshat Oke
44d0e9522a
[CodeGen][NewPM] Port TailDuplicate pass to NPM (#113293) 2024-10-30 11:48:40 +05:30
Ellis Hoag
9cc5a4bf66
Remove llvm::shouldOptForSize() from Utils.h (#112630)
Remove `llvm::shouldOptForSize()` from `Utils.h` since we can use
`llvm::shouldOptimizeForSize()` from `SizeOpts.h` instead.

Depends on https://github.com/llvm/llvm-project/pull/112626
2024-10-29 14:23:47 -05:00
Afanasyev Ivan
4e1b9d34f9
[mir-strip-debug] Fix debug location info strip for bundled instructions (#113676)
Fix bug that `mir-strip-debug` pass does not remove debug location from
bundled instructions.

Problem arises during testing that debug info does not affect
optimization passes output (`llvm-lit` with ` -Dllc="llc
-debugify-and-strip-all-safe"`), when pass operates on MIR with bundled
instructions + memory operands.

Let mir test check looks like:

```
CHECK-NEXT: BUNDLE {
CHECK-NEXT:   $r3 = LD $r1, $r2 :: (load (s64) from %ir.a, !tbaa !2)
CHECK-NEXT: }
```

So as `mir-strip-debug` pass does not process bundled instructions,
running `llc -debugify-and-strip-all-safe` on the test will produce the
following output:

```
BUNDLE {
  $r3 = LD $r1, $r2, debug-location !DILocation(line: 3, column: 1, scope: <0x608cb2b99b10>) :: (load (s64) from %ir.a, !tbaa !2)
}
```

And test will fail, but it shouldn't.

Seems like the root cause is that `mir-strip-debug` pass should remove
debug location from bundled instructions.
2024-10-29 10:26:15 -07:00
Matt Arsenault
88e23eb2cf
DAG: Fix legalization of vector addrspacecasts (#113964) 2024-10-29 08:08:50 -05:00
Jay Foad
2443549b85
[IR] Remove some uses of StructType::setBody. NFC. (#113685)
It is simple to create the struct body up front, now that we have
transitioned to opaque pointers.
2024-10-29 11:44:53 +00:00
Benjamin Maxwell
c3260c65e8
[IR] Add llvm.sincos intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float }          @llvm.sincos.f32(float  %Val)
declare { double, double }        @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
2024-10-29 10:52:20 +00:00
Matt Arsenault
1ceccbb0dd
VirtRegRewriter: Add implicit register defs for live out undef lanes (#112679)
If an undef subregister def is live into another block, we need to
maintain a physreg def to track the liveness of those lanes. This
would manifest a verifier error after branch folding, when the cloned
tail block use no longer had a def.

We need to detect interference with other assigned intervals to avoid
clobbering the undef lanes defined in other intervals, since the undef
def didn't count as interference. This is pretty ugly and adds a new
dependency on LiveRegMatrix, keeping it live for one more pass. It also
adds a lot of implicit operand spam (we really should have a better
representation for this).

There is a missing verifier check for this situation. Added an xfailed
test that demonstrates this. We may also be able to revert the changes
in 47d3cbcf842a036c20c3f1c74255cdfc213f41c2.

It might be better to insert an IMPLICIT_DEF before the instruction
rather than using the implicit-def operand.

Fixes #98474
2024-10-28 17:33:53 -07:00
Ellis Hoag
6ab26eab4f
Check hasOptSize() in shouldOptimizeForSize() (#112626) 2024-10-28 09:45:03 -07:00
Simon Pilgrim
056cf936a7
[DAG] Fold (and X, (bswap/bitreverse (not Y))) -> (and X, (not (bswap/bitreverse Y))) (#112547)
On ANDNOT capable targets we can always do this profitably, without ANDNOT we only attempt this if we don't introduce an additional NOT

Fixes #112425
2024-10-28 11:52:44 +00:00
Jack Styles
933a56674e
[PAuthLR] Add Missing Break Statement for MachineOperand Switch Statement (#113883)
There was a missing break, which led to an unannotated fallthrough when
merging #112171. This has caused sanitizer builds to fail.

This adds the missing break in the switch statement to ensure that the
fallthrough does not occur.
2024-10-28 09:08:48 +00:00
Jack Styles
86f76c3b17
[AArch64][Libunwind] Add Support for FEAT_PAuthLR DWARF Instruction (#112171)
As part of FEAT_PAuthLR, a new DWARF Frame Instruction was introduced,
`DW_CFA_AARCH64_negate_ra_state_with_pc`. This instructs Libunwind that
the PC has been used with the signing instruction. This change includes
three commits
- Libunwind support for the newly introduced DWARF Instruction
- CodeGen Support for the DWARF Instructions
- Reversing the changes made in #96377. Due to
`DW_CFA_AARCH64_negate_ra_state_with_pc`'s requirements to be placed
immediately after the signing instruction, this would mean the CFI
Instruction location was not consistent with the generated location when
not using FEAT_PAuthLR. The commit reverses the changes and makes the
location consistent across the different branch protection options.
While this does have a code size effect, this is a negligible one.

For the ABI information, see here:
853286c7ab/aadwarf64/aadwarf64.rst (id23)
2024-10-28 08:22:38 +00:00
Aiden Grossman
7c9cf0c6f0 [SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Emit error on bad option combinatons
This patch makes it so that specifying all or none for -pgo-analysis-map
along with an explicit option causes an error as this set of options
does not really make sense.
2024-10-26 08:15:34 +00:00
Aiden Grossman
38caf282ab
[SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Add none and all options to PGO Map (#111221)
This patch adds none and all options to the -pgo-analysis-map flag,
which do basically what they say on the tin. The none option is added to
enable forcing the pgo-analysis-map by overriding an earlier invocation
of the flag. The all option is just added for convenience.
2024-10-25 15:39:52 -07:00
Gaëtan Bossu
a0c318938a
[CodeGen][NFC] Properly split MachineLICM and EarlyMachineLICM (#113573)
Both are based on MachineLICMBase, and the functionality there is
"switched" based on a PreRegAlloc flag. This commit is simply about
trusting the original value of that flag, defined by the `MachineLICM`
and `EarlyMachineLICM` classes.

The `PreRegAlloc` flag used to be overwritten it based on MRI.isSSA(),
which is un-reliable due to how it is inferred by the MIRParser. I see
that we can now define isSSA in MIR (thanks @gargaroff ), meaning the
fix isn’t really needed anymore, but redefining that flag still feels
wrong.

Note that I'm looking into upstreaming more changes to MachineLICM, see
[the discourse
thread](https://discourse.llvm.org/t/extending-post-regalloc-machinelicm/82725).
2024-10-25 11:19:22 -07:00
Tex Riddell
c03d09ce3e
[aarch64] atan2 intrinsic lowering (p5) (#112611)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `VecFuncs.def`: define intrinsic to sleef/armpl mapping
- `LegalizerHelper.cpp`: add missing fewerElementsVector handling for
the new atan2 intrinsic
- `AArch64ISelLowering.cpp`: Add arch64 specializations for lowering
like neon instructions
- `AArch64LegalizerInfo.cpp`: Legalize atan2.

Part 5 for Implement the atan2 HLSL Function #70096.
2024-10-24 17:53:12 -07:00
Daniel Hoekwater
1b8cff9a52
Reland "CFIFixup] Factor CFI remember/restore insertion into a helper (NFC)" (#113387)
The original patch (ac1a01f) dereferenced an end iterator, breaking some
tests (e.g. https://lab.llvm.org/buildbot/#/builders/17/builds/3116).
This updated patch only accesses the iterator when it's valid.
2024-10-24 17:07:27 -04:00
Dimitry Andric
4bce21480f
Ensure !NDEBUG with LLVM_ENABLE_ABI_BREAKING_CHECKS does not segfault (#113588)
In SelectionDAG, `TargetTransformInfo::hasBranchDivergence()` can be
called when both `NDEBUG` and `LLVM_ENABLE_ABI_BREAKING_CHECKS` are
enabled. In that case, the class member `TTI` is still initialized to
`nullptr`, causing a segfault.

Fix this by ensuring that all the calls to `hasBranchDivergence` and
`VerifyDAGDivergence` only occur when `NDEBUG` is disabled, and
`LLVM_ENABLE_ABI_BREAKING_CHECKS` is enabled.
2024-10-24 19:30:38 +02:00
Zaara Syeda
f3131c99bf
[GlobalMerge] Aggressively merge constants to reduce TOC entries (#111756)
Symbols that get mapped into the read-only section are loaded as part of
the text segment and will always need a TOC entry to be addressable. Add
an option to aggressively merge these read only globals to reduce TOC
usage.
2024-10-24 10:16:39 -04:00
Nuno Lopes
509af087cc replace 2 placeholder uses of undef with poison [NFC] 2024-10-24 09:01:25 +01:00
Kazu Hirata
141574bacb
[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#113415) 2024-10-23 10:44:09 -07:00
Vladimir Radosavljevic
401d123a1f
[MCP] Optimize copies when src is used during backward propagation (#111130)
Before this patch, redundant COPY couldn't be removed for the following
case:
```
  $R0 = OP ...
  ... // Read of %R0
  $R1 = COPY killed $R0
```
This patch adds support for tracking the users of the source register
during backward propagation, so that we can remove the redundant COPY in
the above case and optimize it to:
```
  $R1 = OP ...
  ... // Replace all uses of %R0 with $R1
```
2024-10-23 13:37:02 +02:00
Akshat Oke
c4c60c0db9
[CodeGen][NewPM] Port OptimizePHIs to NPM (#113433) 2024-10-23 16:55:21 +05:30
Augusto Noronha
8234f8ae26
[DebugInfo] Emit linkage name into DWARF for types for Swift (#112802)
Store Swift mangled names in DW_AT_linkage_name. The Swift compiler
emits only the type mangled name in debug information, and LLDB uses
those mangled names as keys to look up size, alignment, fields, etc
from either reflection metadata or Swift modules.

Additionally, emit types linkage names for types into the accelerator
table if they exist and they're different from the display name.
2024-10-22 16:47:58 -07:00
Heejin Ahn
5c92f2331c
[WebAssembly] Fix MIR printing of reference types (#113028)
When printing a memory operand in MIR, this line

d37bc32a65/llvm/lib/CodeGen/MachineOperand.cpp (L1247)
calls this
d37bc32a65/llvm/include/llvm/Support/Alignment.h (L238)
which assumes `Rhs` (the size in this case) is positive.

But Wasm reference types' size is set to 0:
d37bc32a65/llvm/include/llvm/CodeGen/ValueTypes.td (L326-L328)

`getSize() > 0` condition was added with the Wasm reference types
support in
46667a1003,
and it looks it was removed in #84751. This revives the condition so
that Wasm reference types will not crash the MIR printer.
2024-10-22 13:48:00 -07:00
Daniel Hoekwater
f66bc4d3f1
Revert "Reland [CFIFixup] Factor CFI remember/restore insertion into a helper (NFC)" (#113340)
Reverts llvm/llvm-project#113328

This change breaks a number of builds (e.g
https://lab.llvm.org/buildbot/#/builders/25/builds/3504), for some
reason. Reverting to do some troubleshooting.
2024-10-22 12:50:15 -04:00