36384 Commits

Author SHA1 Message Date
Jay Foad
f77f60400f [CodeGen] Remove checks that implicit operands are implicit 2024-09-03 13:09:17 +01:00
Him188
0748f4227c
[AArch64][GlobalISel] Legalize 128-bit types for FABS (#104753)
This patch adds a common lower action for `G_FABS`, which generates `and
x8, x8, #0x7fffffffffffffff` to reset the sign bit. The action does not
support vectors since `G_AND` does not support fp128.


This approach is different than what SDAG is doing. SDAG stores the
value onto stack, clears the sign bit in the most significant byte, and
loads the value back into register. This involves multiple memory ops
and sounds slower.
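
As a rough illustration of the bitwise approach described above, the sketch below models an fp128 value as two 64-bit halves and clears the sign bit with a single AND on the high half. `Fp128Bits` and `fabsBits` are hypothetical names used only for this example; they are not part of the patch.

```cpp
#include <cstdint>

// Hypothetical model of an IEEE binary128 value as two 64-bit halves.
// The sign bit is bit 63 of the high half.
struct Fp128Bits {
  uint64_t Lo;
  uint64_t Hi;
};

// Clear the sign bit with one AND on the high half, mirroring the
// `and x8, x8, #0x7fffffffffffffff` instruction quoted in the message.
inline Fp128Bits fabsBits(Fp128Bits V) {
  V.Hi &= 0x7fffffffffffffffULL;
  return V;
}
```

The low half is untouched, so no store/load round trip through the stack is needed in this model.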
2024-09-03 12:47:26 +01:00
Michael Marjieh
00c198b2ca
[MachinePipeliner] Make Recurrence MII More Accurate (#105475)
Current RecMII calculation is bigger than it needs to be. The
calculation was refined in this patch.
2024-09-03 16:15:17 +09:00
Craig Topper
366ac8c090
[LegalizeVectorOps] Defer UnrollVectorOp in ExpandFNEG to caller. (#106783)
Make ExpandFNEG return SDValue() when it doesn't expand. The caller
already knows how to Unroll when Results is empty.
2024-09-02 16:16:12 -07:00
Sam Tebbs
44cfbef1b3
[AArch64] Lower partial add reduction to udot or svdot (#101010)
This patch introduces lowering of the partial add reduction intrinsic to
a udot or svdot for AArch64. This also involves adding a
`shouldExpandPartialReductionIntrinsic` target hook, which AArch64 will
return false from in the cases that it can be lowered.
2024-09-02 14:06:14 +01:00
Antonio Frighetto
e4e0dfb0c2 [CGP] Undo constant propagation of pointers across calls
It may be profitable to revert SCCP propagation of C++ static values,
if such constants are pointers, in order to avoid redundant pointer
computation, since the method returning the constant is non-removable.
2024-09-02 09:33:23 +02:00
Craig Topper
cd3667d1db
[CodeGen] Update a few places that were passing Register to raw_ostream::operator<< (#106877)
These would implicitly cast the register to `unsigned`. Switch most of
them to use printReg, which gives more readable output. Change some
others to use Register::id() so we can eventually remove the implicit
cast to `unsigned`.
2024-09-02 00:19:19 -07:00
Yingwei Zheng
affc0c64b6
[SDAG] Expand vector [u|s]cmp in VectorLegalizer (#106883)
Address comment
https://github.com/llvm/llvm-project/pull/106747#issuecomment-2322922855.
2024-09-01 22:35:52 +08:00
Craig Topper
a3e2936173 [SelectionDAGISel] Use MCRegister and Register for LiveInMap. NFC
This matches the MachineBasicBlock liveins used to populate it.
2024-08-31 14:00:17 -07:00
Brandon Wu
db67a66e8e
Revert "[RISCV] RISCV vector calling convention (2/2)" (#97994)
This reverts commit 91dd844aa499d69c7ff75bf3156e2e3593a88057.

Stacked on https://github.com/llvm/llvm-project/pull/97993
2024-08-31 19:02:35 +08:00
Brandon Wu
dc03ee3cbb
[llvm][RISCV] Add RISCV vector tuple type to value types(MVT) (#97993)
Summary:
This patch handles the types(MVT) in `selectionDAG` for RISCV vector
tuples.
As described in the previous patch handling llvm types, the MVTs also have
32 variants:
```
riscv_nxv1i8x2, riscv_nxv1i8x3, riscv_nxv1i8x4, riscv_nxv1i8x5, riscv_nxv1i8x6, riscv_nxv1i8x7, riscv_nxv1i8x8,
riscv_nxv2i8x2, riscv_nxv2i8x3, riscv_nxv2i8x4, riscv_nxv2i8x5, riscv_nxv2i8x6, riscv_nxv2i8x7, riscv_nxv2i8x8,
riscv_nxv4i8x2, riscv_nxv4i8x3, riscv_nxv4i8x4, riscv_nxv4i8x5, riscv_nxv4i8x6, riscv_nxv4i8x7, riscv_nxv4i8x8,
riscv_nxv8i8x2, riscv_nxv8i8x3, riscv_nxv8i8x4, riscv_nxv8i8x5, riscv_nxv8i8x6, riscv_nxv8i8x7, riscv_nxv8i8x8,
riscv_nxv16i8x2, riscv_nxv16i8x3, riscv_nxv16i8x4,
riscv_nxv32i8x2.
```

Detail:
An intuitive way to model vector tuple type is using nested scalable
vector, e.g. `nElts=NF, EltTy=nxv2i32`. However, it's not compatible with
what we've done to handle scalable vector in TargetLowering, so it would
need more effort to change the code to handle this concept.
Another approach is encoding the `MinNumElts` info in `sz` of `MVT`, e.g.
`nElts=NF, sz=(NF*MinNumElts*8)`, this makes it much easier to handle and
changes less code.

This patch adopts the latter approach.
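
The size encoding and the count of 32 variants can be checked with a small sketch. It assumes that `nxv8i8` corresponds to LMUL = 1 and that tuples follow the RVV segment rule NF * LMUL <= 8 with 2 <= NF <= 8; both are assumptions taken from the RVV specification, not from this patch, and the function names are hypothetical.

```cpp
// Size in bits of a tuple type with NF fields of MinNumElts i8 elements,
// following the `sz=(NF*MinNumElts*8)` encoding quoted above.
constexpr unsigned tupleSizeInBits(unsigned NF, unsigned MinNumElts) {
  return NF * MinNumElts * 8;
}

// Count the (MinNumElts, NF) pairs allowed under the assumed rule
// NF * LMUL <= 8, where LMUL = MinNumElts / 8 (kept in integers below).
constexpr unsigned countTupleTypes() {
  unsigned Count = 0;
  for (unsigned MinNumElts = 1; MinNumElts <= 32; MinNumElts *= 2)
    for (unsigned NF = 2; NF <= 8; ++NF)
      if (NF * MinNumElts <= 64)
        ++Count;
  return Count;
}

// Agrees with the 32 variants listed in the message.
static_assert(countTupleTypes() == 32, "expected 32 tuple variants");
```

For example, `riscv_nxv2i8x3` has NF = 3 and MinNumElts = 2, so `tupleSizeInBits(3, 2)` gives 48.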

Stacked on https://github.com/llvm/llvm-project/pull/97992
2024-08-31 19:01:29 +08:00
Craig Topper
4f9ea258c4 [AsmPrinter] Don't store Dwarf register in Register. 2024-08-30 19:39:09 -07:00
Vitaly Buka
982d2445f2
Revert "AtomicExpand: Allow incrementally legalizing atomicrmw" (#106792)
Reverts llvm/llvm-project#103371

There is `heap-use-after-free`, commented on
206b5aff44a95754f6dd7a5696efa024e983ac59

Maybe `if (Next == E || BB != Next->getParent()) {` is enough,
but I'm not sure what the intent was there.
2024-08-30 13:51:53 -07:00
Philip Reames
c315d787e3 [VP] Reduce duplicate code in vp.reduce expansions
Primary goal is having one way of doing this, to ensure that we don't
end up with accidental divergence.
2024-08-30 12:34:56 -07:00
Craig Topper
c25293c6dd
[LegalizeVectorOps][RISCV] Don't promote VP_FABS/FNEG/FCOPYSIGN. (#106659)
Promoting canonicalizes NaNs which changes the semantics. Bitcast to
integer and use logic ops instead.
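
A minimal scalar sketch of the "bitcast to integer and use logic ops" idea, assuming `float` is IEEE binary32. The helper names are hypothetical; the actual legalizer operates on vector DAG nodes, not on C++ floats.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern and back (memcpy avoids UB).
inline uint32_t toBits(float F) {
  uint32_t B;
  std::memcpy(&B, &F, sizeof B);
  return B;
}
inline float fromBits(uint32_t B) {
  float F;
  std::memcpy(&F, &B, sizeof F);
  return F;
}

constexpr uint32_t SignMask = 0x80000000u;

// Each operation touches only the sign bit, so NaN payloads pass through.
inline float fabsViaBits(float X) { return fromBits(toBits(X) & ~SignMask); }
inline float fnegViaBits(float X) { return fromBits(toBits(X) ^ SignMask); }
inline float copysignViaBits(float Mag, float Sgn) {
  return fromBits((toBits(Mag) & ~SignMask) | (toBits(Sgn) & SignMask));
}
```

Because no floating-point arithmetic is performed, a NaN input keeps its payload bits, which is the property that promotion to a wider type can lose.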
2024-08-30 09:44:51 -07:00
Matt Arsenault
206b5aff44
AtomicExpand: Allow incrementally legalizing atomicrmw (#103371)
If a lowering changed control flow, resume the legalization
loop at the first newly inserted block.

This will allow incrementally legalizing atomicrmw and cmpxchg.

The AArch64 test might be a bugfix. Previously it would lower
the vector FP case as a cmpxchg loop, but the resulting cmpxchgs now get
lowered, whereas previously they weren't. Maybe it shouldn't be reporting cmpxchg
for the expand type in the first place though.
2024-08-30 19:11:45 +04:00
Philip Reames
924907bc6a
[DAG] Prefer 0.0 over -0.0 as neutral value for FADD w/NoSignedZero (#106616)
When getting a neutral value, we can prefer using a positive zero over a
negative zero if nsz is set on the FADD (or reduction). A positive zero
should be cheaper to materialize on basically all targets.

Arguably, we should be doing this kind of canonicalization in
DAGCombine, but we don't do that for any of the other reduction
variants, so this seems like the path of least resistance. This does mean
that we can only do this for "fast" reductions. Just nsz isn't enough,
as that goes through the SEQ_FADD path where the IR level start value
isn't folded away.
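
Why nsz matters here can be seen with a small sequential reduction, sketched below under the default IEEE round-to-nearest mode. `reduceFAdd` is a hypothetical helper, not LLVM code.

```cpp
#include <cmath>
#include <vector>

// Sequential FADD reduction with an explicit start (neutral) value.
inline double reduceFAdd(double Start, const std::vector<double> &Elts) {
  double Acc = Start;
  for (double E : Elts)
    Acc += E;
  return Acc;
}
```

With a start value of -0.0, reducing `{-0.0}` yields -0.0; with a start value of +0.0 it yields +0.0. The two start values agree for every other non-NaN input, so +0.0 is only a valid substitute when the sign of a zero result is allowed to differ, which is what nsz permits.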

If folks think this is too RISCV specific, let me know. There's a trivial
RISCV specific implementation. I went with the generic one as I thought
this might benefit other targets.
2024-08-30 07:56:14 -07:00
Kai Luo
d2b8969b75
[EdgeBundles] Correct MBB label name in output graph when -view-edge-bundles. NFC. (#106661)
With `-view-edge-bundles`, before the change, the dot file output is
kinda like
```dot
digraph {
        "%bb.0" [ shape=box ]
        0 -> "%bb.0"
        "%bb.0" -> 1
        "%bb.0" -> "%bb.1" [ color=lightgray ]
        "%bb.0" -> "%bb.6" [ color=lightgray ]
        "%bb.1" [ shape=box ]
        1 -> "%bb.1"
        "%bb.1" -> 1
        "%bb.1" -> "%bb.2" [ color=lightgray ]
        "%bb.1" -> "%bb.6" [ color=lightgray ]
        "%bb.2" [ shape=box ]
        1 -> "%bb.2"
        "%bb.2" -> 1
        "%bb.2" -> "%bb.3" [ color=lightgray ]
        "%bb.3" [ shape=box ]
        1 -> "%bb.3"
        "%bb.3" -> 2
        "%bb.3" -> "%bb.4" [ color=lightgray ]
        "%bb.4" [ shape=box ]
        2 -> "%bb.4"
        "%bb.4" -> 2
        "%bb.4" -> "%bb.4" [ color=lightgray ]
        "%bb.4" -> "%bb.5" [ color=lightgray ]
        "%bb.5" [ shape=box ]
        2 -> "%bb.5"
        "%bb.5" -> 1
        "%bb.5" -> "%bb.6" [ color=lightgray ]
        "%bb.5" -> "%bb.3" [ color=lightgray ]
        "%bb.6" [ shape=box ]
        1 -> "%bb.6"
        "%bb.6" -> 3
}
```
However, the graph output by graphviz is

![t](https://github.com/user-attachments/assets/24056c0a-3ba9-49c3-a5da-269f3140e619)
The node name corresponding to the MBB is incorrect.
After the change, the node name is consistent with MBB's name.

![s](https://github.com/user-attachments/assets/38c649d1-7222-4de1-971c-56f7721ab64c)
2024-08-30 15:56:27 +08:00
Max Beck-Jones
1693d8eb9a
[AArch64][SelectionDAG] Vector splitting and promotion for histogram intrinsic (#103037)
Adds support for wider-than-legal vector types for the histogram
intrinsic (llvm.experimental.vector.histogram.add) by splitting the
vector. Also adds integer promotion for the Inc operand.
2024-08-30 08:54:12 +01:00
Craig Topper
aa91d90cb0
[LegalizeVectorOps][PowerPC] Use xor to expand fneg. (#106595)
This preserves the semantics of fneg and matches what we do in
LegalizeDAG.

I kept the legal FSUB check to force unrolling for some targets that
don't have FSUB but have XOR. On AArch64, using xor broke some tests that
expected to see a (v1f64 (fma (insertvector_elt (f64 (fneg
(extractvectorelt X)))))) pattern.
2024-08-29 15:00:23 -07:00
Craig Topper
4ca817d051
[GlobalISel] Add bail outs for scalable vectors to some combines. (#106496)
These combines call getNumElements() which isn't valid for scalable
vectors.
2024-08-29 14:02:53 -07:00
Craig Topper
d5c292d8ef
[GISel][RISCV] Correctly handle scalable vector shuffles of pointer vectors in IRTranslator. (#106580) 2024-08-29 12:35:50 -07:00
Dávid Ferenc Szabó
e9eaf19eb6
[CodeGen] Allow mixed scalar type constraints for inline asm (#65465)
GCC supports code like "asm volatile ("" : "=r" (i) : "0" (f))" where i
is integer type and f is floating point type. Currently this code
produces an error with Clang. The change allows mixed scalar types
between input and output constraints.

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2024-08-29 22:53:28 +04:00
Philip Reames
74b4ec17e2
[VP] Remove VP_PROPERTY_REDUCTION and VP_PROPERTY_CMP [nfc] (#105551)
These lists are quite static and several of the parameters are actually
constant across all users. Heavy use of macros is undesirable, and not
idiomatic in LLVM, so let's just use the naive switch cases.

I'll probably continue with removing the other property macros. These
two just happened to be the two I actually had to figure out for an
unrelated change.
2024-08-29 09:57:58 -07:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for `this` pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
Stephen Tozer
5fef40c2c4 Reapply "[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (#105524)"
Fixes the previous buildbot error by adding an explicit triple to the test,
ensuring that llc can produce a valid object file.

This reverts commit 926f0979af4f6172d4ed3dea5603aa97c800bef1.
2024-08-29 15:08:37 +01:00
Stephen Tozer
926f0979af Revert "[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (#105524)"
Reverted (along with the NFC followup fix) due to buildbot failure:
https://lab.llvm.org/buildbot/#/builders/160/builds/4142

This reverts commit 3ef37e2f8f672393ee409fde8309198df0981735, and commit
616f7d3d4f6d9bea6f776e357c938847e522a681.
2024-08-29 12:26:25 +01:00
Stephen Tozer
3ef37e2f8f
[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (#105524)
Fixes: https://github.com/llvm/llvm-project/issues/104695

This patch adds the is_stmt flag to line table entries for the first
instruction with a non-0 line location in each basic block, to ensure
that it will be used for stepping even if the last instruction in the
previous basic block had the same line number; this is important for
cases where the new BB is reachable from BBs other than the preceding
block.
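
The rule can be sketched on a simplified line-table model. Everything here is hypothetical: `Inst`, `Block`, and `markIsStmt` are illustration names, and the "set is_stmt when the line changes" baseline is an assumed simplification of the real emitter.

```cpp
#include <vector>

// Simplified model: each instruction has a source line (0 means "no line")
// and an is_stmt flag to be computed.
struct Inst {
  unsigned Line;
  bool IsStmt;
};
using Block = std::vector<Inst>;

// Baseline (assumed): set is_stmt when the line differs from the last one
// emitted. Added rule: always set it on the first non-line-0 instruction
// of a block, even if the previous block ended on the same line.
inline void markIsStmt(std::vector<Block> &Blocks) {
  unsigned PrevLine = 0;
  for (Block &BB : Blocks) {
    bool SeenLineInBB = false;
    for (Inst &I : BB) {
      if (I.Line == 0)
        continue; // line-0 instructions are skipped in this model
      I.IsStmt = !SeenLineInBB || I.Line != PrevLine;
      SeenLineInBB = true;
      PrevLine = I.Line;
    }
  }
}
```

In a two-block input where the first block ends on line 5 and the second block starts (after a line-0 instruction) on line 5 again, the second block's first line-5 instruction still gets is_stmt, so a debugger can stop there when the block is entered from elsewhere.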
2024-08-29 11:29:20 +01:00
Matt Arsenault
7b7b0b95b2
DAG: Check if is_fpclass is custom, instead of isLegalOrCustom (#105577)
For some reason, isOperationLegalOrCustom is not the same as
isOperationLegal || isOperationCustom. Unfortunately, it checks
if the type is legal which makes it useless for custom lowering
on non-legal types (which is always ppcf128).

Really the DAG builder shouldn't be going to expand this in the
builder, it makes it difficult to work with. It's only here to work
around the DAG requiring legal integer types the same size as
the FP type after type legalization.
2024-08-29 14:05:43 +04:00
Freddy Ye
3a5c578966
[MachineLoopInfo] Fix getLoopID to handle multi latches. (#106195)
This patch also fixed `CodegenPrepare` to preserve loop metadata when
merging blocks.

This fixes issue #102632
2024-08-29 08:44:22 +08:00
Vitaly Buka
0281339159
Revert "[CodeGen] Use MachineInstr::{all_uses,all_defs} (NFC)" (#106451)
Reverts llvm/llvm-project#106404

Breaks:
https://lab.llvm.org/buildbot/#/builders/169/builds/2590
https://lab.llvm.org/buildbot/#/builders/164/builds/2454
2024-08-28 13:40:34 -07:00
Changpeng Fang
41b55071a1
DAG: Change round-mode operand type to i32 for FPTRUNC_ROUND (#106424)
We need this immediate type to be consistent. This is the pre-commit for
https://github.com/llvm/llvm-project/pull/105761
2024-08-28 11:16:41 -07:00
Kazu Hirata
a4989cd603
[CodeGen] Use MachineInstr::{all_uses,all_defs} (NFC) (#106404) 2024-08-28 11:07:31 -07:00
Craig Topper
829c47f4e0 [InterleavedAccess] Use SmallVectorImpl references. NFC
Instead of repeating SmallVector size in multiple places.
2024-08-28 09:37:59 -07:00
Maciej Gabka
95d2d1cba0
Move stepvector intrinsic out of experimental namespace (#98043)
This patch moves the stepvector intrinsic out of the experimental
namespace.

This intrinsic has existed in LLVM for several years now and is widely used.
2024-08-28 12:48:20 +01:00
Piyou Chen
2def1c4458
[RISCV][MCP] Remove redundant move from tail duplication (#89865)
Tail duplication generates a redundant move before return because
MachineCopyPropagation can't recognize a COPY after post-RA
pseudoExpand.

This patch makes MachineCopyPropagation recognize `%0 = ADDI %1, 0` as a
COPY.
2024-08-28 08:32:54 +08:00
Kyungwoo Lee
f9ad249460
[StableHash] Implement stable global name for the hash computation (#106156)
LLVM often extends global names by adding suffixes to distinguish unique
identities. However, these suffixes are not always stable across
different runs and build environments. To address this issue, I
implemented `get_stable_name` to ignore such suffixes and obtain the
original name. This approach is not new, as PGO or Bolt already handle
this issue similarly. Using the stable name obtained from
`get_stable_name`, I implemented `stable_hash_name` while utilizing the
same underlying `xxh3_64bit` algorithm as before.
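
A rough sketch of the idea, not the actual implementation: strip an unstable suffix, then hash what remains. The suffix markers `.llvm.` and `.__uniq.` are assumptions for illustration, the function names are hypothetical, and FNV-1a is swapped in for `xxh3_64bit` to keep the example self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <string_view>

// Drop everything from the last occurrence of each assumed suffix marker.
inline std::string_view stableNameSketch(std::string_view Name) {
  for (std::string_view Marker :
       {std::string_view(".llvm."), std::string_view(".__uniq.")}) {
    std::size_t Pos = Name.rfind(Marker);
    if (Pos != std::string_view::npos)
      Name = Name.substr(0, Pos);
  }
  return Name;
}

// Hash the stable part of the name with 64-bit FNV-1a (a stand-in hash).
inline uint64_t stableHashNameSketch(std::string_view Name) {
  uint64_t H = 0xcbf29ce484222325ULL; // FNV offset basis
  for (unsigned char C : stableNameSketch(Name)) {
    H ^= C;
    H *= 0x100000001b3ULL; // FNV prime
  }
  return H;
}
```

Two names that differ only in such a suffix, for example `foo.llvm.1` and `foo.llvm.2`, hash to the same value in this sketch.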
2024-08-27 15:09:06 -07:00
Kyungwoo Lee
93b8d07a75
[MachineOutliner][NFC] Refactor (#105398)
This patch prepares the NFC groundwork for global outlining using
CGData, which will follow
https://github.com/llvm/llvm-project/pull/90074.

- The `MinRepeats` parameter is now explicitly passed to the
`getOutliningCandidateInfo` function, rather than relying on a default
value of 2. For local outlining, the minimum number of repetitions is
typically 2, but for the global outlining (mentioned above), we will
optimistically create a single `Candidate` for each `OutlinedFunction`
if stable hashes match a specific code sequence. This parameter is
adjusted accordingly in global outlining scenarios.
- I have also implemented `unique_ptr` for `OutlinedFunction` to ensure
safe and efficient memory management within `FunctionList`, avoiding
unnecessary implicit copies.

This depends on https://github.com/llvm/llvm-project/pull/101461.
This is a patch for
https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
2024-08-27 14:38:36 -07:00
Sergei Barannikov
4d7a0abae8
[DataLayout] Change return type of getStackAlignment to MaybeAlign (#105478)
Currently, `getStackAlignment` asserts if the stack alignment wasn't
specified. This makes it inconvenient to use and complicates testing.

This change also makes `exceedsNaturalStackAlignment` method redundant.
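
The shape of the change can be sketched with `std::optional` standing in for `MaybeAlign`. `LayoutSketch` and `exceedsStackAlignment` are hypothetical names for illustration; they are not the `DataLayout` API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a data layout whose stack alignment may be unset.
struct LayoutSketch {
  std::optional<uint64_t> StackAlign; // empty when not specified

  // Return the optional instead of asserting when the value is missing.
  std::optional<uint64_t> getStackAlignment() const { return StackAlign; }
};

// A separate "exceeds stack alignment" query reduces to one expression
// on the optional, which is why a dedicated method becomes redundant.
inline bool exceedsStackAlignment(const LayoutSketch &DL, uint64_t Alignment) {
  std::optional<uint64_t> SA = DL.getStackAlignment();
  return SA && Alignment > *SA;
}
```

Callers that previously had to guard against the assertion can test the optional directly, and tests can construct a layout without specifying a stack alignment.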
2024-08-27 22:59:33 +03:00
Simon Pilgrim
4baf29e81e [DAG] Handle cases where a shift amount is larger than the pre-extended value bitwidth
In the (zext (shl (zext x), cst)) -> (shl (zext x), cst) fold, don't use a bitmask / MaskedValueIsZero as we can't guarantee that the shift amount is in bounds.
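
For intuition about when this fold preserves the value, the sketch below compares the unfolded and folded forms on concrete widths. The i8/i16/i32 widths and the function names are chosen for illustration only; this does not reproduce the DAG combine or the reported bug.

```cpp
#include <cassert>
#include <cstdint>

// Unfolded form: i8 -> i16, shift in i16, then i16 -> i32.
inline uint32_t zextShlZext(uint8_t X, unsigned C) {
  assert(C < 16 && "shift amount must be in range for the i16 shift");
  // Truncating to 16 bits models bits being shifted out of the i16 value.
  uint16_t Mid = static_cast<uint16_t>(static_cast<uint32_t>(X) << C);
  return Mid;
}

// Folded form: i8 -> i32, shift in i32.
inline uint32_t shlZext(uint8_t X, unsigned C) {
  assert(C < 32 && "shift amount must be in range for the i32 shift");
  return static_cast<uint32_t>(X) << C;
}
```

With `X = 0xFF`, a shift of 8 gives `0xFF00` in both forms, while a shift of 12 gives `0xF000` unfolded and `0xFF000` folded. In this model the fold is only safe when the shift amount does not exceed the number of zero bits added by the inner extension.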

Fixes #106202
2024-08-27 18:12:24 +01:00
Craig Topper
f6b0c09214
[LiveDebugVariables] Use VirtRegMap::hasPhys. NFC (#106186)
Use hasPhys instead of MCRegister::isPhysicalRegister.

I think the MCRegister returned from getPhys can only contain a physical
register or 0. hasPhys checks that the register returned from getPhys is non-zero.
So I think they are equivalent in this usage.
2024-08-27 09:07:20 -07:00
chuongg3
d58bd21150
[GlobalISel] Look between instructions to be matched (#101675)
When a pattern is matched in TableGen, a check is run called
isObviouslySafeToFold(). One of the conditions that it checks for is
whether the instructions that are being matched are consecutive, so the
instruction's insertion point does not change.

This patch allows the movement of the insertion point of a load
instruction if none of the intervening instructions are stores or have
side-effects.
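
The relaxed condition can be sketched as a scan over the instructions between the load and its user. `InstSketch` and `canSinkLoadTo` are hypothetical names; the real check works on `MachineInstr` and may consider more properties than the two modeled here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, minimal instruction model.
struct InstSketch {
  bool IsLoad;
  bool MayStore;
  bool HasSideEffects;
};

// Returns true if the load at LoadIdx could be moved down to UseIdx:
// every instruction strictly between them must neither store nor have
// side effects. Adjacent instructions trivially pass.
inline bool canSinkLoadTo(const std::vector<InstSketch> &Insts,
                          std::size_t LoadIdx, std::size_t UseIdx) {
  if (LoadIdx >= UseIdx || UseIdx >= Insts.size() || !Insts[LoadIdx].IsLoad)
    return false;
  for (std::size_t I = LoadIdx + 1; I < UseIdx; ++I)
    if (Insts[I].MayStore || Insts[I].HasSideEffects)
      return false;
  return true;
}
```

The previous behaviour corresponds to accepting only the adjacent case; the sketch additionally accepts gaps made of harmless instructions.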
2024-08-27 16:56:40 +01:00
Kiran
c50d11e6d9 Revert "[ARM] musttail fixes"
committed by accident, see #104795

This reverts commit a2088a24dad31ebe44c93751db17307fdbe1f0e2.
2024-08-27 11:17:17 +01:00
Kiran
ad468da038 Revert "Seperate frontend changes, add debug directives, remove redundant stuff from tests"
This reverts commit 1a908c6be3317bbbac73e6a6fc52cabefbdebf7d.
2024-08-27 10:46:18 +01:00
Kiran
1a908c6be3 Seperate frontend changes, add debug directives, remove redundant stuff from tests 2024-08-27 10:44:06 +01:00
Kiran
a2088a24da [ARM] musttail fixes
Backend:
- Caller and callee arguments no longer have to match, just to take up the same space, as they can be changed before the call
- Allowed tail calls if caller and callee both (or neither) use sret, whereas before it would be disallowed if either used sret
- Allowed tail calls if byval args are used
- Added debug trace for IsEligibleForTailCallOptimisation

Frontend (clang):
- Do not generate extra alloca if sret is used with musttail, as the space for the sret is allocated already

Change-Id: Ic7f246a7eca43c06874922d642d7dc44bdfc98ec
2024-08-27 10:44:06 +01:00
Piyou Chen
b01c006f73
[TII][RISCV] Add renamable bit to copyPhysReg (#91179)
The renamable flag is useful during MachineCopyPropagation, but the renamable
flag is dropped after lowerCopy in some cases.

This patch introduces extra arguments to pass the renamable flag to
copyPhysReg.
2024-08-27 10:08:43 +08:00
Philip Reames
824cffe152 [GC] Rename gc_args to gc_live [nfc]
Better reflect the recent history of the code, and improve readability
for when I have to glance back at this to answer a question.
2024-08-26 14:06:04 -07:00
Kazu Hirata
399d7cce37
[CodeGen] Use MachineInstr::all_defs (NFC) (#106017) 2024-08-26 07:22:17 -07:00
Craig Topper
c503758ab6 [CodeGen] Use std::pair<MCRegister, Register> to match return from MRI.liveins(). NFC
MachineRegisterInfo::liveins returns std::pair<MCRegister, Register>.
Don't convert to std::pair<unsigned, unsigned>.
2024-08-25 15:28:08 -07:00