12861 Commits

Author SHA1 Message Date
Kazu Hirata
07eb7b7692
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
2025-08-18 07:01:29 -07:00
Kazu Hirata
cbf5af9668
[llvm] Remove unused includes (NFC) (#154051)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-08-17 23:46:35 -07:00
Nikita Popov
01bc742185
[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817)
This ensures that the required fields are set, and also makes the
construction more convenient.
2025-08-15 18:06:07 +02:00
Matt Arsenault
4aae7bc625
ARM: Move half convert libcall config to tablegen (#153389) 2025-08-14 17:35:58 +09:00
Elvis Wang
01fac67e2a
[TTI] Add cost kind to getAddressComputationCost(). NFC. (#153342)
This patch add cost kind to `getAddressComputationCost()` for #149955.

Note that this patch also remove all the default value in `getAddressComputationCost()`.
2025-08-14 16:01:44 +08:00
Matt Arsenault
bbcac029db
ARM: Move more aeabi libcall config into tablegen (#152109) 2025-08-14 15:43:15 +09:00
David Green
d9d9d9ad19
[ARM][MVE] Add shuffle costs for LDn and STn instructions. (#145304)
LD2 is represented in IR as deinterleave-shuffle(load), and ST2 as
store(interleave-shuffle). Whilst the shuffle would be expensive in
general for MVE (it does not have zip/uzp instructions), it should be
treated as cheap when part of the LD2/ST2 pattern. This borrows some
code from the AArch64 backed to produce lower costs. (Some of which
still shows as higher than it should - that just shows how broken the
generic shuffle costs are at the moment, they would be lower if
getShuffleCost was called directly as opposed to going through
getInstructionCost).
2025-08-14 06:59:37 +01:00
Matt Arsenault
32f1fe3770
ARM: Move calling conv config to RuntimeLibcalls (#152065)
Consolidate module level ABI into RuntimeLibcalls
2025-08-14 08:36:03 +09:00
David Green
06d2d1e156
[ARM] Protect against odd sized vectors in isVTRNMask and friends (#153413)
Fixes the issue reported on #153138, where odd-sized vectors would cause
the checks to iterate off the end of the mask.
2025-08-13 20:57:46 +01:00
Kazu Hirata
3fe9332134
[ARM] Remove unnecessary casts (NFC) (#153357)
getMSRMask() and getBankedReg() already return unsigned.
2025-08-13 10:37:14 -07:00
Daniel Paoliello
2a82e23146
Fix handling of dontcall attributes for arches that lower calls via fastSelectInstruction (#153302)
Recently my change to avoid duplicate `dontcall` attribute errors
(#152810) caused the Clang `Frontend/backend-attribute-error-warning.c`
test to fail on Arm32:
<https://lab.llvm.org/buildbot/#/builders/154/builds/20134>

The root cause is that, if the default `IFastSel` path bails, then
targets are given the opportunity to lower instructions via
`fastSelectInstruction`. That's the path taken by Arm32 and since its
implementation of `selectCall` didn't call `diagnoseDontCall` no error
was emitted.

I've checked the other implementations of `fastSelectInstruction` and
the only other one that lowers call instructions in WebAssembly, so I've
fixed that too.
2025-08-12 16:12:22 -07:00
Min-Yih Hsu
ca05058b49
[IA][RISCV] Recognize deinterleaved loads that could lower to strided segmented loads (#151612)
Turn the following deinterleaved load patterns
```
%l = masked.load(%ptr, /*mask=*/110110110110, /*passthru=*/poison)
%f0 = shufflevector %l, [0, 3, 6, 9]
%f1 = shufflevector %l, [1, 4, 7, 10]
%f2 = shufflevector %l, [2, 5, 8, 11]
```
into
```
%s = riscv.vlsseg2(/*passthru=*/poison, %ptr, /*mask=*/1111)
%f0 = extractvalue %s, 0
%f1 = extractvalue %s, 1
%f2 = poison
```
The mask `110110110110` is regarded as 'gap mask' since it effectively skips the entire third field / component.

Similarly,  turning the following snippet
```
%l = masked.load(%ptr, /*mask=*/110000110000, /*passthru=*/poison)
%f0 = shufflevector %l, [0, 3, 6, 9]
%f1 = shufflevector %l, [1, 4, 7, 10]
```
into
```
%s = riscv.vlsseg2(/*passthru=*/poison, %ptr, /*mask=*/1010)
%f0 = extractvalue %s, 0
%f1 = extractvalue %s, 1
```

Right now this patch only tries to detect gap mask from a constant mask supplied to a masked.load/vp.load.
2025-08-12 14:08:18 -07:00
Luke Lau
acb86fb9e0
[TTI] Consistently pass the pointer type to getAddressComputationCost. NFCI (#152657)
In some places we were passing the type of value being accessed, in
other cases we were passing the type of the pointer for the access.

The most "involved" user is
LoopVectorizationCostModel::getMemInstScalarizationCost, which is the
only call site that passes in the SCEV, and it passes along the pointer
type.

This changes call sites to consistently pass the pointer type, and
renames the arguments to clarify this.

No target actually checks the contents of the type passed, only to see
if it's a vector or not, so this shouldn't have an effect.
2025-08-11 18:00:12 +08:00
Nikita Popov
e92b7e9641
[CodeGen] Provide original IR type to CC lowering (NFC) (#152709)
It is common to have ABI requirements for illegal types: For example,
two i64 argument parts that originally came from an fp128 argument may
have a different call ABI than ones that came from a i128 argument.

The current calling convention lowering does not provide access to this
information, so backends come up with various hacks to support it (like
additional pre-analysis cached in CCState, or bypassing the default
logic entirely).

This PR adds the original IR type to InputArg/OutputArg and passes it
down to CCAssignFn. It is not actually used anywhere yet, this just does
the mechanical changes to thread through the new argument.
2025-08-11 08:57:53 +02:00
AZero13
6a425f1e54
[ARM] Have custom lowering for ucmp and scmp (#149315)
Limited to non-thumb1 for scmp at the moment, since there is no good way
to do it.
2025-08-08 06:51:18 +01:00
Kazu Hirata
62fc0028bf
[Target] Remove unnecessary casts (NFC) (#152262)
value() already returns uint64_t.
2025-08-06 07:11:07 -07:00
eleviant
907b7d0f07
[ARM] Fix inline asm register validation for vector types (#152175)
Patch allows following piece of code to be successfully compiled:
```
register uint8x8_t V asm("d3") = vdup_n_u8(0xff);
```
2025-08-06 10:30:49 +02:00
Matt Arsenault
342bf58f93
RuntimeLibcalls: Add entries for __security_check_cookie (#151843)
Avoids hardcoding string name based on target, and gets
the entry in the centralized list of emitted calls.
2025-08-06 10:26:36 +09:00
Matt Arsenault
d44754c344
ARM: Remove redundant or buggy config of __aeabi_d2h (#152126)
This was set if `TT.isTargetAEABI()`. This was previously set above
if `TM.isAAPCS_ABI() && (TT.isTargetAEABI() || TT.isTargetGNUAEABI() ||
                         TT.isTargetMuslAEABI() || TT.isAndroid())`.

So this could differ based on a manually specified -target-abi flag due
to the `isAAPCS_ABI` part of the original condition. I'm guessing
these should be consistent, so either this second group of
setLibcallImpl
calls should have been guarded by the `isAAPCS_ABI` check, or the first
condition should remove it.

There doesn't appear to be any meaningful test coverage using the
manually specified ABI option, so #152108 tries to remove it
2025-08-06 08:48:01 +09:00
Matt Arsenault
1392edcc07
ARM: Remove idiv runtime call aliases (#152098)
Really only the i32 variants exist. We don't need synthetic
aliases for illegal types which will be promoted.
2025-08-05 17:49:22 +09:00
Matt Arsenault
e848959cb5
ARM: Remove CPU from computeTargetABI (#151983)
The target CPU is a subtarget / function level concept, which
should not influence the module level ABI decisions. No tests fail
so it appears nothing is relying on this.
2025-08-05 10:21:50 +09:00
Craig Topper
f58bc72759 Revert "[X86][ARM][RISCV][XCore][M68K] Invert the low bit to get the inverse predicate (NFC) (#151748)"
This reverts commit 518703806286c98bac7b84156738839f8bd55bef.

Failing M68k build bot.
2025-08-04 15:24:52 -07:00
AZero13
5187038062
[X86][ARM][RISCV][XCore][M68K] Invert the low bit to get the inverse predicate (NFC) (#151748)
All these platforms defined their predicate in such a way to allow bit
twiddling to get inverse predicates
2025-08-04 14:45:04 -07:00
Fangrui Song
85f00707dd MCSymbolMachO: Migrate away from classof
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
2025-08-03 17:48:50 -07:00
Fangrui Song
e640ca8b9a MCSymbolELF: Migrate away from classof
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
2025-08-03 15:45:36 -07:00
Fangrui Song
5570ce5cef MCSymbolELF: Migrate away from classof
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
2025-08-03 15:17:13 -07:00
Fangrui Song
bc0f696b1f MCSymbol: Replace isELF with MCContext::isELF 2025-08-03 12:40:55 -07:00
Fangrui Song
d3589edafc MCAsmBackend::applyFixup: Change Data to indicate the relocated location
`Data` now references the first byte of the fixup offset within the current fragment.

MCAssembler::layout asserts that the fixup offset is within either the
fixed-size content or the optional variable-size tail, as this is the
most the generic code can validate without knowing the target-specific
fixup size.

Many backends applyFixup assert
```
assert(Offset + Size <= F.getSize() && "Invalid fixup offset!");
```

This refactoring allows a subsequent change to move the fixed-size
content outside of MCSection::ContentStorage, fixing the
-fsanitize=pointer-overflow issue of #150846

Pull Request: https://github.com/llvm/llvm-project/pull/151724
2025-08-02 09:27:06 -07:00
AZero13
23022a4683
[SelectionDAG] Move sign pattern check from AArch64 and ARM to general SelectionDAG (#151736)
This works on all cases much like the XOR case above it in SelectionDAG.
2025-08-01 14:46:51 -07:00
Fangrui Song
491c7bdd58 MCAsmBackend::applyFixup: Replace Data.getSize() with F.getSize()
to facilitate replacing `MutableArrayRef<char> Data` (fragment content)
with the relocated location. This is necessary to fix the
pointer-overflow sanitizer issue and reland #150846
2025-08-01 00:31:51 -07:00
Prabhu Rajasekaran
17ccb849f3
[llvm] Extract and propagate callee_type metadata
Update MachineFunction::CallSiteInfo to extract numeric CalleeTypeIds
from callee_type metadata attached to indirect call instructions.

Reviewers: nikic, ilovepi

Reviewed By: ilovepi

Pull Request: https://github.com/llvm/llvm-project/pull/87575
2025-07-30 14:56:39 -07:00
Fangrui Song
96a9e8c9fa ARM: Migrate away from MCFragment::addFixup 2025-07-28 21:43:48 -07:00
Ellis Hoag
819f020b28
Use F.hasOptSize() instead of checking optsize directly (#147348) 2025-07-28 08:38:52 -07:00
Nikita Popov
fe0dbe0f29
[CodeGen] More consistently expand float ops by default (#150597)
These float operations were expanded for scalar f32/f64/f128, but not
for f16 and more problematically, not for vectors. A small subset of
them was separately set to expand for vectors.

Change these to always expand by default, and adjust targets to mark
these as legal where necessary instead.

This is a much safer default, and avoids unnecessary legalization
failures because a target failed to manually mark them as expand.

Fixes https://github.com/llvm/llvm-project/issues/110753.
Fixes https://github.com/llvm/llvm-project/issues/121390.
2025-07-28 09:46:00 +02:00
Fangrui Song
e9de1ee9f5 MC: Move useCodeAlign from MCSection to MCAsmInfo
To centralize assembly-related virtual functions to MCAsmInfo and move
toward making MCSection non-virtual.
2025-07-26 11:34:10 -07:00
Fangrui Song
2571924ad6 MCSectionELF: Remove classof
The object file format specific derived classes are used in context like
MCStreamer and MCObjectTargetWriter where the type is statically known.
We don't use isa/dyn_cast and we want to eliminate
MCSection::SectionVariant in the base class.
2025-07-25 09:50:19 -07:00
Fangrui Song
da7ec1ef0e ARMELFStreamer: Simplify annotateTLSDescriptorSequence with addFixup 2025-07-25 00:39:39 -07:00
eleviant
a4796b14fc
[ARM] Emit error message when incompatible reg is specified (#147559)
At the moment the following piece of code causes undefined behavior:
```
int a;
void b() {
   register float d2 asm("d2") = a;
   asm("" ::"r"(d2));
}
```
This happens because variable and register types are incompatible.
2025-07-24 19:19:25 +02:00
Philip Reames
dbd9eae95a
[IA] Support vp.store in lowerinterleavedStore (#149605)
Follow up to 28417e64, and the whole line of work started with 4b81dc7.

This change merges the handling for VPStore - currently in
lowerInterleavedVPStore - into the existing dedicated routine used in
the shuffle lowering path. This removes the last use of the dedicated
lowerInterleavedVPStore and thus we can remove it.

This contains two changes which are functional.

First, like in 28417e64, merging support for vp.store exposes the
strided store optimization for code using vp.store.

Second, it seems the strided store case had a significant missed
optimization. We were performing the strided store at the full unit
strided store type width (i.e. LMUL) rather than reducing it to match
the input width. This became obvious when I tried to use the mask
created by the helper routine as it caused a type incompatibility.

Normally, I'd try not to include an optimization in an API rework, but
structuring the code to both be correct for vp.store and not optimize
the existing case turned out be more involved than seemed worthwhile. I
could pull this part out as a pre-change, but its a bit awkward on it's
own as it turns out to be somewhat of a half step on the possible
optimization; the full optimization is complex with the old code
structure.

---------

Co-authored-by: Craig Topper <craig.topper@sifive.com>
2025-07-22 15:50:17 -07:00
Fangrui Song
ba6b705620 MC: Replace getOrCreateDataFragment with getCurrentFragment
Add an assert to ensure `CurFrag` is either null or an `FT_Data` fragment.

Follow-up to 39c8cfb70d203439e3296dfdfe3d41f1cb2ec551.
Extracted from #149721
2025-07-20 11:02:26 -07:00
Philip Reames
28417e6459
[IA] Support vp.load in lowerInterleavedLoad [nfc-ish] (#149174)
This continues in the direction started by commit 4b81dc7. We
essentially merges the handling for VPLoad - currently in
lowerInterleavedVPLoad - into the existing dedicated routine. This
removes the last use of the dedicate lowerInterleavedVPLoad and thus we
can remove it.

This isn't quite NFC as the main callback has support for the strided
load optimization whereas the VPLoad specific version didn't. So this
adds the ability to form a strided load for a vp.load deinterleave with
one shuffle used.
2025-07-17 17:29:28 -07:00
Kazu Hirata
2da59287aa
[Target] Remove unnecessary casts (NFC) (#149342)
getFunction().getParent() already returns Module *.
2025-07-17 15:24:25 -07:00
Fangrui Song
6d0f573535 MCFragment: Remove MCDataFragment/MCRelaxableFragment type aliases
Follow-up to #148544
2025-07-15 22:14:39 -07:00
Fangrui Song
dc3a4c0fcf
MC: Restructure MCFragment as a fixed part and a variable tail
Refactor the fragment representation of `push rax; jmp foo; nop; jmp foo`,
previously encoded as
`MCDataFragment(nop); MCRelaxableFragment(jmp foo); MCDataFragment(nop); MCRelaxableFragment(jmp foo)`,

to

```
MCFragment(fixed: push rax, variable: jmp foo)
MCFragment(fixed: nop, variable: jmp foo)
```

Changes:

* Eliminate MCEncodedFragment, moving content and fixup storage to MCFragment.
* The new MCFragment contains a fixed-size content (similar to previous
  MCDataFragment) and an optional variable-size tail.
* The variable-size tail supports FT_Relaxable, FT_LEB, FT_Dwarf, and
  FT_DwarfFrame, with plans to extend to other fragment types.
  dyn_cast/isa should be avoided for the converted fragment subclasses.
* In `setVarFixups`, source fixup offsets are relative to the variable part's start.
  Stored fixup (in `FixupStorage`) offsets are relative to the fixed part's start.
  A lot of code does `getFragmentOffset(Frag) + Fixup.getOffset()`,
  expecting the fixup offset to be relative to the fixed part's start.
* HexagonAsmBackend::fixupNeedsRelaxationAdvanced needs to know the
  associated instruction for a fixup. We have to add a `const MCFragment &` parameter.
* In MCObjectStreamer, extend `absoluteSymbolDiff` to apply to
  FT_Relaxable as otherwise there would be many more FT_DwarfFrame
  fragments in -g compilations.

https://llvm-compile-time-tracker.com/compare.php?from=28e1473e8e523150914e8c7ea50b44fb0d2a8d65&to=778d68ad1d48e7f111ea853dd249912c601bee89&stat=instructions:u

```
stage2-O0-g instructins:u geomeon (-0.07%)
stage1-ReleaseLTO-g (link only) max-rss geomean (-0.39%)
```

```
% /t/clang-old -g -c sqlite3.i -w -mllvm -debug-only=mc-dump &| awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 59675
Align 2215
Data 29700
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
% /t/clang-new -g -c sqlite3.i -w -mllvm -debug-only=mc-dump &| awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 32287
Align 2215
Data 2312
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
```

Pull Request: https://github.com/llvm/llvm-project/pull/148544
2025-07-15 21:56:55 -07:00
Kazu Hirata
39dd6cdd57
[ARM] Remove an unnecessary cast (NFC) (#148869)
TII is already of const ARMBaseInstrInfo *.  This patch removes AII in
favor of TII.
2025-07-15 20:47:38 -07:00
Brad Smith
0d2e11f3e8
Remove Native Client support (#133661)
Remove the Native Client support now that it has finally reached end of life.
2025-07-15 13:22:33 -04:00
Fangrui Song
5ba458c559 MCFixup: Replace getTargetKind with getKind 2025-07-15 00:21:07 -07:00
Fangrui Song
0b674f4c52 MCFixup: Replace getTargetKind with getKind
MCFixupKind is now a type alias (fixup kinds are inherently
target-specific). getTargetKind is no longer necessary.
2025-07-15 00:08:45 -07:00
Kazu Hirata
7c83d66719
[llvm] Remove unused includes (NFC) (#148768)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-14 22:19:14 -07:00
jjasmine
6640b0a293
[WebAssembly] Add patterns for relaxed madd (#147487)
[WebAssembly] Fold fadd contract (fmul contract) to relaxed madd w/
-mattr=+simd128,+relaxed-simd

Fixes #121311

- Precommit test for #121311
- Fold fadd contract (fmul contract) to relaxed madd w/
-mattr=+simd128,+relaxed-simd
- Move PatFrag of fadd_contract in ARM.td and WebAssembly.td to
TargetSelectionDAG.td for reuse of pattern
2025-07-15 00:56:28 +08:00