35598 Commits

Author SHA1 Message Date
Sudharsan Veeravalli
e005a09df5
[RISCV][TypePromotion] Don't generate truncs if PromotedType is greater than Source Type (#86941)
We currently check whether the source and promoted types differ before
generating truncate instructions. This does not work for RV64, where the
promoted type is i64: it led to a crash caused by generating truncate
instructions from i32 to i64.

Fixes #86400
2024-03-28 21:22:05 -07:00
Craig Topper
23d45e55ed
[MCP] Remove dead copies from basic blocks with successors. (#86973)
Previously we wouldn't remove dead copies from basic blocks with
successors. The comment said we didn't want to trust the live-in lists.
The comment is very old so I'm not sure if that's still a concern today.

This patch checks the live-in lists and removes copies from
MaybeDeadCopies if they are referenced by any live-ins in any
successors. We only do this if the tracksLiveness property is set. If
that property is not set, we retain the old behavior.
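
The rule described above can be sketched roughly as follows. This is a toy Python model, not the pass itself: `copies_safe_to_delete`, the dictionary of copies keyed by destination register, and the register names are all hypothetical, and register aliasing/overlap is ignored.

```python
def copies_safe_to_delete(maybe_dead_copies, successors_live_ins, tracks_liveness):
    """Return the subset of maybe-dead copies that may be removed.

    maybe_dead_copies: dict mapping destination register -> copy text (hypothetical model)
    successors_live_ins: list of sets, one live-in set per successor block
    tracks_liveness: stand-in for the tracksLiveness property
    """
    # Old behavior: with successors present and no trusted liveness, remove nothing.
    if successors_live_ins and not tracks_liveness:
        return {}
    # A copy whose destination is live into any successor is not dead.
    live = set().union(*successors_live_ins)
    return {reg: copy for reg, copy in maybe_dead_copies.items() if reg not in live}


copies = {"x10": "x10 = COPY x11", "x12": "x12 = COPY x13"}
print(copies_safe_to_delete(copies, [{"x10"}, {"x5"}], tracks_liveness=True))
```

Only the copy into `x12` survives as removable here, because `x10` is live into a successor.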
2024-03-28 14:43:49 -07:00
Craig Topper
f90813543b
[MCP] Use MachineInstr::all_defs instead of MachineInstr::defs in hasOverlappingMultipleDef. (#86889)
defs does not return the defs for inline assembly. We need to use
all_defs to find them.

Fixes #86880.
2024-03-28 08:37:19 -07:00
Jonas Paulsson
94b5c118b3
[ISel] Move handling of atomic loads from SystemZ to DAGCombiner (NFC). (#86484)
The folding of sign/zero extensions into an atomic load by specifying an
extension type is not target specific, and therefore belongs in the
DAGCombiner rather than in the SystemZ backend.

- Handle atomic loads similarly to regular loads by adding
AtomicLoadExtActions with set/get methods.
- Move SystemZ extendAtomicLoad() to DagCombiner.cpp.
2024-03-28 16:14:35 +01:00
Luke Lau
856e815ca1
[DAGCombiner] Set disjoint flag in add->or and xor->or combines (#86925)
We check DAG.haveNoCommonBitsSet so the operands will be known to be
disjoint.

I couldn't think of a codegen test case since most targets aren't
checking hasDisjoint yet, apart from RISCV in the or_is_add pattern, but
it also falls back to computeKnownBits.
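
The identity behind these combines can be checked exhaustively on small values: when two operands share no set bits, add, or, and xor all produce the same result. A minimal sketch in Python, where `have_no_common_bits_set` is an illustrative stand-in for the check named above and the 4-bit width is an arbitrary choice:

```python
def have_no_common_bits_set(a: int, b: int) -> bool:
    """True when no bit position is set in both a and b."""
    return (a & b) == 0


def check_disjoint_identity(width: int = 4) -> bool:
    """Verify a + b == a | b == a ^ b for all disjoint pairs of `width`-bit values."""
    mask = (1 << width) - 1
    for a in range(1 << width):
        for b in range(1 << width):
            if not have_no_common_bits_set(a, b):
                continue
            # No carries can occur, so the sum never leaves the bit width.
            if ((a + b) & mask) != (a | b) or (a ^ b) != (a | b):
                return False
    return True


print(check_disjoint_identity(4))
```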
2024-03-28 18:08:59 +08:00
Shilei Tian
a8b90c047d
[GlobalISel] Update MachineIRBuilder::buildAtomicRMW interface (#86851) 2024-03-27 17:41:30 -04:00
Craig Topper
acab142751 [LegalizeDAG] Freeze index when converting insert_elt/insert_subvector to load/store on stack.
We try to clamp the index to be within the bounds of the stack object
we create, but if we don't freeze it, poison can propagate into the
clamp code. This can cause the access to leave the bounds of the
stack object.

We have other instances of this issue in type legalization and extract_elt/subvector,
but posting this patch first for direction check.

Fixes #86717
2024-03-27 13:01:23 -07:00
Craig Topper
baf66ec061
[Target][RISCV] Add HwMode support to subregister index size/offset. (#86368)
This is needed to provide proper size and offset for the GPRPair subreg
indices on RISC-V. The size of a GPR already uses HwMode. Previously we
said the subreg indices have unknown size and offset, but this stops
DwarfExpression::addMachineReg from being able to find the registers
that make up the pair.

I believe this fixes https://github.com/llvm/llvm-project/issues/85864
but need to verify.
2024-03-27 12:19:28 -07:00
Craig Topper
1c965801c4
[LegalizeDAG] Merge PerformInsertVectorEltInMemory into ExpandInsertToVectorThroughStack. NFC (#86755)
These functions are very similar. We can share them like we do for
EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR.
2024-03-27 09:39:35 -07:00
Simon Pilgrim
78f0871bee Revert rG58de1e2c5eee548a9b365e3b1554d87317072ad9 "Fix stack layout for frames larger than 2gb (#84114)"
This is failing on some EXPENSIVE_CHECKS buildbots
2024-03-27 16:16:15 +00:00
Wesley Wiser
58de1e2c5e
Fix stack layout for frames larger than 2gb (#84114)
For very large stack frames, the offset from the stack pointer to a local can be more than 2^31 which overflows various `int` offsets in the frame lowering code.

This patch updates the frame lowering code to calculate the offsets as 64-bit values and resolves the overflows, resulting in the correct codegen for very large frames.

Fixes #48911
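
To illustrate the overflow described above: an offset just past 2^31 wraps to a negative number when stored in a 32-bit signed integer, but is preserved in a 64-bit one. The helper `wrap_signed` and the example offset are hypothetical, chosen only to show the arithmetic.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Interpret `value` as a two's-complement signed integer of `bits` bits."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


# Hypothetical offset from the stack pointer to a local in a frame larger than 2 GiB.
offset = (1 << 31) + 16

print(wrap_signed(offset, 32))  # wrapped, negative
print(wrap_signed(offset, 64))  # preserved
```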
2024-03-27 15:05:58 +00:00
Justin Cady
26464f2662
[FreeBSD] Mark __stack_chk_guard dso_local except for PPC64 (#86665)
Adjust logic of 1cb9f37a17ab to match freebsd/freebsd-src@9a4d48a645.

D113443 is the original attempt to bring this FreeBSD patch to
llvm-project, but it never landed. This change is required to build
FreeBSD kernel modules with -fstack-protector using a standard LLVM
toolchain. The FreeBSD kernel loader does not handle
R_X86_64_REX_GOTPCRELX relocations.

Fixes #50932.
2024-03-27 09:03:46 -04:00
Simon Pilgrim
9247f3185c [DAG] foldAddSubOfSignBit - reuse existing SDLoc instead of regenerating it. NFC. 2024-03-27 12:22:31 +00:00
Simon Pilgrim
51388fbab1 [DAG] visitSub - reuse existing SDLoc instead of regenerating it. NFC. 2024-03-27 12:22:30 +00:00
Michael Maitland
d345599c28 [GISEL][NFC] Use getElementCount instead of getNumElements in more places
These cases in particular are done as a precommit to support
legalization, regbank selection, and instruction selection for extends,
splat vectors, and integer compares in #85938.
2024-03-26 17:41:46 -07:00
Michael Maitland
54a9f0e441
[RISCV][GISEL] Legalize, regbankselect, and instruction-select G_VSCALE (#85967)
G_VSCALE should be lowered using VLENB. If the type is not sXLen it
should be lowered using a G_VSCALE on the narrow type and a G_MUL.
regbank select and instruction select are straightforward so we really
only need to add tests to show it works.
2024-03-26 20:17:22 -04:00
Craig Topper
09155ac290 [LegalizeDAG] Remove unneeded temporary SDValues from PerformInsertVectorEltInMemory. NFC
There were 3 temporaries that just renamed the function's 3 well-named
arguments to Tmp1-3. It looks like this was done when the code was extracted
from elsewhere into a separate function 15 years ago.
2024-03-26 16:25:24 -07:00
Emil Pedersen
0e5c504d3d
[DebugInfo] [SelectionDAG] Fix handling of duplicate dbg values (#86598)
Before this fix, a duplicate llvm.dbg.value intrinsic referring to an
argument, after an alloca, would be generated with `$noreg`, losing
debug information. Instead, we silently drop the second debug info, so
it doesn't break the first one.

rdar://125375717
2024-03-26 12:09:22 -07:00
Simon Pilgrim
1c9d5c25ae [DAG] foldAddSubBoolOfMaskedVal - reuse existing SDLoc instead of regenerating it. NFC. 2024-03-26 18:33:30 +00:00
Thorsten Schütt
da6cc4a24f
[CodeGen] Add nneg and disjoint flags (#86650)
MachineInstr learned the new flags.
2024-03-26 18:44:34 +01:00
David Green
47f4a07a2f
[GlobalISel] Add Knownbits for G_LOAD/ZEXTLOAD/SEXTLOAD with range metadata (#86431)
Similar to #80829 for GlobalISel.
2024-03-26 13:42:08 +00:00
Il-Capitano
308ed0233a
[Intrinsics] Make patchpoint.i64 generic on its return type (#85911)
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
This patch makes `patchpoint.i64` an overloadable intrinsic, allowing
result values that can fit in a single register (e.g. integers,
pointers, floats).
2024-03-26 19:08:52 +05:30
Simon Pilgrim
5fc619b5ee [DAG] Update ISD::AVG folds to use hasOperation to allow Custom matching prior to legalization
Fixes issue where AVX1 targets weren't matching 256-bit AVGCEILU cases.
2024-03-26 10:41:07 +00:00
Simon Pilgrim
c7198e0af3
[DAG] Fold insert_subvector(N0, extract_subvector(N0, N2), N2) --> N0 (#86487)
Handle the case where we've ended up inserting back into the source vector we extracted the subvector from.
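
A small model of why this fold is safe, using Python lists in place of vectors. The helpers are illustrative only; in particular, the subvector length is passed explicitly here as an assumption of the sketch, whereas the real node takes it from its result type.

```python
def extract_subvector(vec, idx, length):
    """Return `length` elements of `vec` starting at element `idx`."""
    return vec[idx:idx + length]


def insert_subvector(vec, sub, idx):
    """Return a copy of `vec` with `sub` written over the elements starting at `idx`."""
    return vec[:idx] + sub + vec[idx + len(sub):]


n0 = [10, 11, 12, 13, 14, 15, 16, 17]
for n2 in (0, 2, 4):
    # Writing back what was just read from the same position changes nothing.
    assert insert_subvector(n0, extract_subvector(n0, n2, 4), n2) == n0
```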
2024-03-26 10:03:42 +00:00
David Green
fbc247367a
[AArch64][GlobalISel] Legalization for small anyext/sext/zext (#86438)
Similar to #85625, some of the codegen is still far from optimal but
this helps fix quite a few fallback cases.
2024-03-26 09:48:06 +00:00
David Green
4d315ff382
[GlobalISel] Add CTLZ known bits. (#86436)
Replicated from SDAG.
2024-03-26 09:11:35 +00:00
Bevin Hansson
14c30189fb
[ExpandLargeFpConvert] Fix incorrect values in fp-to-int conversion. (#86514)
The IR for a double-to-i129 conversion looks like this in one of the
blocks in compiler-rt:

  %cmp5.i = icmp ult i16 %3, -129, !dbg !24

But in ExpandLargeFpConvert, it looks like:

  %13 = icmp ult i129 %12, 4294967167, !dbg !19

ExpandLargeFpConvert is wrong; the value should have been
signed before negating, but instead we get a very large
unsigned value. Another value in the same pass also has this
issue.
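
A quick arithmetic check of the two constants quoted above: 4294967167 equals 2^32 - 129, which is the bit pattern of -129 read as an unsigned 32-bit number. The 32-bit width is inferred from that arithmetic rather than stated in the message, and the helper names are illustrative.

```python
def zero_extend(value: int, from_bits: int) -> int:
    """Read the low `from_bits` bits of `value` as an unsigned number."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value: int, from_bits: int) -> int:
    """Read the low `from_bits` bits of `value` as a two's-complement signed number."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    return (value ^ sign_bit) - sign_bit


# -129 seen as an unsigned 32-bit pattern gives the constant from the bad IR.
print(zero_extend(-129, 32))

# Treating that pattern as signed first recovers -129, which in i129 is 2**129 - 129.
print(zero_extend(sign_extend(4294967167, 32), 129) == (1 << 129) - 129)
```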
2024-03-26 10:08:22 +01:00
Shilei Tian
0a4299403e
[GlobalISel] Fold G_CTTZ if possible (#86224)
This patch tries to fold `G_CTTZ` if possible.
2024-03-25 16:55:37 -04:00
Michael Maitland
9056ce8804 Revert "[RISCV][GISEL] Legalize G_VSCALE"
This reverts commit 47681506ded30fada68f180b5e80f740bc76abcd. It is not
consistent with SelectionDAG.
2024-03-25 11:46:02 -07:00
Michael Maitland
47681506de [RISCV][GISEL] Legalize G_VSCALE
G_VSCALE should be lowered using VLENB.
2024-03-25 10:44:58 -07:00
Michael Maitland
865294b2e6
[CodeGen][MISched] Add misched post-regalloc bidirectional scheduling (#77138)
This PR is stacked on #76186.

This PR keeps the default strategy as top-down since that is what
existing targets expect. It can be enabled using
`-misched-postra-direction=bidirectional`.

It is up to targets to decide whether they would like to enable this
option for themselves.
2024-03-25 10:10:35 -04:00
AtariDreams
f5a067bb90
[SelectionDAG]: Deduce KnownNeverZero from SMIN and SMAX (#85722) 2024-03-25 10:35:28 +00:00
houndlord
9632e1515c
Match fixed width ISD::AVGFLOORS + ISD::AVGCEILS patterns (#86222) 2024-03-24 15:33:16 +00:00
Owen Anderson
7c9b5228da
Only check assertions that were meant to apply to the normal case of non-splat vector SREM expansion when we aren't hitting the special case. (#86238)
Fixes https://github.com/llvm/llvm-project/issues/84830
Introduced in https://github.com/llvm/llvm-project/pull/82706
2024-03-23 21:49:29 -05:00
Harvin Iriawan
57146daeaa
[CodeGen] Update for scalable MemoryType in MMO (#70452)
Remove getSizeOrUnknown call when MachineMemOperand is created. For Scalable
TypeSize, the MemoryType created becomes a scalable_vector.

2 MMOs that have scalable memory access can then use the updated BasicAA that
understands scalable LocationSize.

Original Patch by Harvin Iriawan
Co-authored-by: David Green <david.green@arm.com>
2024-03-23 12:56:25 +00:00
Evgenii Kudriashov
d365a45cb3
[GlobalISel] Introduce G_TRAP, G_DEBUGTRAP, G_UBSANTRAP (#84941)
Here we introduce three new GMIR instructions to cover a set of trap
intrinsics. The idea behind it is that generic intrinsics shouldn't be
used with G_INTRINSIC opcode.

These new instructions can match perfectly with existing trap ISD nodes.
It allows X86, AArch64, RISCV and Mips to reuse SelectionDAG patterns for
selection and avoid manual selection. However AMDGPU is an exception. It
selects traps during legalization regardless of whether SelectionDAG or GlobalISel is used.

Since there are not many places where traps are used, this change
attempts to clean up all the usages of G_INTRINSIC with trap intrinsics. So,
there is no stage when both G_TRAP and
G_INTRINSIC_W_SIDE_EFFECTS(@llvm.trap) are allowed.
2024-03-23 13:12:44 +01:00
Yingwei Zheng
6c1932ffd8
[LLVM] Pass APInt by const reference. NFC. (#86278)
This patch adjusts argument passing for `APInt` to improve the
compile-time.
Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=32d6611af69bf4e76373f9bc7d9649650f760e48&stat=instructions:u
2024-03-23 14:57:35 +08:00
Craig Topper
fb329f1844
[Target] Move SubRegIdxRanges from MCSubtargetInfo to TargetInfo. (#86245)
I'm planning to add HwMode support to SubRegIdxRanges for RISC-V GPR
pairs. The MC layer is currently unaware of the HwMode for registers and
I'd like to keep it that way.

This information is not used by the MC layer so I think it is safe to
move it.
2024-03-22 11:15:45 -07:00
Simon Pilgrim
ceabaa7e7a [DAG] Fix some missing formatting when I rewrote the SUB(MAX,MIN) -> ABD patterns. NFC. 2024-03-22 11:48:03 +00:00
XChy
cb4453dc69
[SelectionDAG] Prevent combination on inconsistent type in combineCarryDiamond (#84888)
Fixes #84831
When matching a carry pattern with `getAsCarry`, it may produce a carry-out
of a different type. This patch checks for that case and exits early.

I'm new to DAG, any suggestion is appreciated.
2024-03-22 16:05:20 +05:30
Craig Topper
c67ed2f1e1
[SelectionDAG][RISCV] Use TypeSize version of ComputeValueVTs in TargetLowering::LowerCallTo. (#86166)
This is needed to support non-intrinsic functions returning tuple types
which are represented as structs with scalable vector types in IR.

I suspect this may have been broken since
https://reviews.llvm.org/D158115
2024-03-21 20:35:08 -07:00
Jonas Paulsson
7564566779 Reapply "Move assertion for AdjustsStack from PEI to MachineVerifier (#85698)"
- The check is now actually done in both PEI and the MachineVerifier.
- More .mir tests trivially updated with "adjustsStack: true" as needed.
2024-03-21 20:24:57 -04:00
Simon Pilgrim
6942927609 [DAG] combineConcatVectorOfScalars - stop always creating UNDEF nodes. NFC.
Noticed in debug logs - most calls to visitVECTOR_SHUFFLE resulted in wasteful UNDEF node creations, despite almost never being used.
2024-03-21 16:37:48 +00:00
Simon Pilgrim
e4fa2e3562
[DAG] isGuaranteedNotToBeUndefOrPoisonForTargetNode - add fallback implementation (#86125)
Allow targets to rely on TargetLowering::isGuaranteedNotToBeUndefOrPoisonForTargetNode to test nodes for canCreateUndefOrPoisonForTargetNode + all arguments are isGuaranteedNotToBeUndefOrPoison.

Targets can still perform this themselves for specific special case nodes (e.g. target shuffles).

Matches the fallback in SelectionDAG::isGuaranteedNotToBeUndefOrPoison
2024-03-21 15:11:59 +00:00
Jonas Paulsson
b4b5e8277a
Check for all frame instructions in finalize isel. (#85945)
Check for all frame instructions in finalize isel, not just for the
frame setup opcode. This was proven necessary, see #78001 
for discussion.
2024-03-21 11:00:08 -04:00
AtariDreams
7e72cafd68
[SelectionDAG] Add MaskedValueIsZero check to allow folding of zero extended variables we know are safe to extend (#85573)
Add ones for every high bit that will be cleared.

This will allow us to evaluate variables that have their bits known to
see if they have no risk of overflow despite the shift amount being
greater than the difference between the two types.
2024-03-21 16:45:17 +05:30
Simon Pilgrim
23de3862dc [DAG] visitSUB - use sd_match to match SUB(MAX,MIN) -> ABD pattern. NFC.
Seriously simplifies the commutation matching logic.
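
The pattern being matched rests on a simple identity: the difference between the larger and the smaller of two values is their absolute difference, whichever order the operands appear in. A brief check over a small range, using unbounded Python integers (so fixed-width wrapping is deliberately not modeled) and hypothetical helper names:

```python
def abd(a: int, b: int) -> int:
    """Absolute difference of a and b."""
    return abs(a - b)


def sub_of_max_min(a: int, b: int) -> int:
    """The SUB(MAX, MIN) form of the same computation."""
    return max(a, b) - min(a, b)


# The identity holds for every operand order, which is why commuted forms all match.
assert all(
    sub_of_max_min(a, b) == abd(a, b) == sub_of_max_min(b, a)
    for a in range(-8, 8)
    for b in range(-8, 8)
)
```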
2024-03-21 09:55:50 +00:00
Simon Pilgrim
11aa95f83b [DAG] visitSUB - pull out repeated getScalarSizeInBits() calls. NFC. 2024-03-21 09:55:50 +00:00
Simon Pilgrim
7b5a5be2a7 [DAG] visitSUB/visitSUBO - move getAsNonOpaqueConstant into the if() where it's used. NFC.
Noticed while beginning some cleanup for moving to pattern matchers
2024-03-21 09:19:12 +00:00
Madhur Amilkanthwar
7bb87d5338
[AArch64][GlobalISel] Take abs scalar codegen closer to SDAG (#84886)
This patch improves codegen for the scalar (< 128 bits) version
of the llvm.abs intrinsic by using the existing non-XOR based lowering.
This takes the generated code closer to SDAG.

Codegen with GISel for > 128-bit types is not very good
with this method, so it is not applied there.
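
The message does not spell out either lowering, so the following is only an illustration of the general distinction: the widely known XOR-based expansion of abs next to a compare-and-negate form. Both function names are hypothetical, and Python's unbounded integers mean fixed-width wrapping of the minimum value is not modeled.

```python
def abs_xor_based(x: int, bits: int = 32) -> int:
    """abs via sign mask: s = x >> (bits - 1); result = (x ^ s) - s."""
    s = x >> (bits - 1)  # arithmetic shift: -1 for negative x, 0 otherwise
    return (x ^ s) - s


def abs_compare_negate(x: int) -> int:
    """abs via compare and conditional negate (a non-XOR form)."""
    return -x if x < 0 else x


# Both forms agree on every value in a small signed range.
assert all(abs_xor_based(x, 8) == abs_compare_negate(x) for x in range(-127, 128))
```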
2024-03-21 09:54:03 +05:30