7667 Commits

Author SHA1 Message Date
David Green
4dcf33b6c2 [AArch64] Cleanup and GISel coverage for lrint tests. NFC 2024-04-10 18:13:57 +01:00
Dinar Temirbulatov
990c4bc95f
[AArch64][SVE2] Generate SVE2 BSL instruction in LLVM for bit-twiddling. (#83514)
Allow folding or/and-and into a BSL instruction for scalable vectors.
2024-04-10 11:07:59 +01:00
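The fold above targets the bitwise-select idiom. A minimal Python sketch of the semantics, assuming the matched shape is `(a & mask) | (b & ~mask)` on 64-bit lanes; the name `bsl` and the lane width are illustrative assumptions, and this models the arithmetic only, not LLVM's pattern matcher:

```python
MASK64 = (1 << 64) - 1  # assumed lane width for this sketch


def bsl(mask: int, a: int, b: int) -> int:
    """Bitwise select: bits of a where mask is 1, bits of b where mask is 0."""
    return ((a & mask) | (b & ~mask)) & MASK64


# The or/and-and form and the select agree bit for bit.
example = bsl(0xFF00, 0xABCD, 0x1234)
```

Here the upper byte comes from `a` and the lower byte from `b`, which is the behaviour a single select instruction can provide in place of two ANDs and an OR.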
Dinar Temirbulatov
528943f153
[AArch64][SME] Allow memory operations lowering to custom SME functions. (#79263)
This change allows lowering memcpy, memset, and memmove to custom SME
versions provided by LibRT.
2024-04-09 17:27:46 +01:00
Sam Tebbs
fb8dbd1fb6
[AArch64] Remove copy in SVE/SME predicate spill and fill (#81716)
7dc20ab introduced an extra COPY when spilling and filling a PNR
register, which can't be elided as the input (PNR predicate) and output
(PPR predicate) register classes differ. The patch adds a new register
class that covers both PPR and PNR so that STR_PXI and LDR_PXI can
take either of them, removing the need for the copy.
2024-04-09 16:17:27 +01:00
Eli Friedman
7ad481e76c
Revert "[AArch64] Add support for -ffixed-x30" (#88019)
This reverts commit e770153865c53c4fd72a68f23acff33c24e42a08.

This wasn't reviewed, and the functionality in question was
intentionally rejected the last time it was discussed in
https://reviews.llvm.org/D56305 .
2024-04-08 15:16:00 -07:00
Leonard Grey
c23135c548
-fsanitize=function: fix .subsections_via_symbols (#87527)
-fsanitize=function emits a signature and function hash before a
function. Similar to 7f6e2c9, these can be sheared off when
`.subsections_via_symbols` is used.

This change uses the same technique 7f6e2c9 introduced for prefixes:
emitting a symbol for the metadata, then marking the actual function
entry as an .alt_entry symbol.
2024-04-08 16:05:52 -04:00
Daniil Kovalev
89eb1a5a8e
[test][AArch64][CodeGen] Delete redundant check lines (#87965)
llvm/test/CodeGen/AArch64/elf-globals-pic.ll:

Since https://reviews.llvm.org/D91734, elf-globals-static.ll test
contains several `CHECK-PIC` lines. They do not seem to bring any value
since there are no FileCheck run lines checking against this prefix. The
right place for such tests should be elf-globals-pic.ll, which already
contains check lines being deleted in this commit. Both
elf-globals-pic.ll and elf-globals-static.ll were created after
splitting arm64-elf-globals.ll in 6dbd0ea. The `CHECK-PIC` lines in
elf-globals-static.ll appear to be left over because git treated
elf-globals-pic.ll as a new file and elf-globals-static.ll as a rename
of arm64-elf-globals.ll.

llvm/test/CodeGen/AArch64/tagged-globals-pic.ll:

Similar to elf-globals-pic.ll, contains unneeded
`CHECK-SELECTIONDAGISEL` and `CHECK-GLOBALISEL` directives not checked
by any FileCheck invocation. These directives are present in
tagged-globals-static.ll. Both tests have been in the code tree since
fd32639, when tagged-globals.ll was split into
tagged-globals-{pic|static}.ll.
2024-04-08 22:27:50 +03:00
Matt Arsenault
8cb642bf18 GlobalISel: Regenerate test checks 2024-04-08 08:32:04 -04:00
David Green
9fd2e2c2fd
[DAG][AArch64] Support masked loads/stores with nontemporal flags (#87608)
SVE has some non-temporal masked loads and stores. The metadata coming
from the nodes is not copied to the MMO at the moment though, meaning it
will generate a normal instruction. This patch ensures that the right
flags are set if the instruction has non-temporal metadata.
2024-04-08 08:53:27 +01:00
David Green
ac321cbb03
[AArch64][GlobalISel] Legalize Insert vector element (#81453)
This attempts to standardize and extend some of the insert vector
element lowering. Most notably:
- More types are handled by splitting illegal vectors.
- The index type for G_INSERT_VECTOR_ELT is canonicalized to
  TLI.getVectorIdxTy(), similar to extract_vector_element.
- Some of the existing patterns now have the index type specified to
  make sure they can apply to GISel too.
- The C++ selection code has been removed, relying on tablegen patterns.
- G_INSERT_VECTOR_ELT with small GPR input elements is pre-selected to
  use an i32 type, allowing the existing patterns to apply.
- Variable index inserts are lowered in post-legalizer lowering,
  expanding into a stack store and reload.
2024-04-08 08:44:13 +01:00
darkbuck
8e98435ae9
[GISel][Combine] Enhance combining on G_BUILD_VECTOR
Reviewers: aemerson, arsenm

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/87831
2024-04-06 18:33:01 -04:00
Sizov Nikita
d38bff460a
[AArch64] SimplifyDemandedBitsForTargetNode - add AArch64ISD::BICi handling (#76644)
Fold BICi if all destination bits are already known to be zeroes

```llvm
define <8 x i16> @haddu_known(<8 x i8> %a0, <8 x i8> %a1) {
  %x0 = zext <8 x i8> %a0 to <8 x i16>
  %x1 = zext <8 x i8> %a1 to <8 x i16>
  %hadd = call <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16> %x0, <8 x i16> %x1)
  %res = and <8 x i16> %hadd, <i16 511, i16 511, i16 511, i16 511,i16 511, i16 511, i16 511, i16 511>
  ret <8 x i16> %res
}
declare <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16>, <8 x i16>)
```

```
haddu_known:                            // @haddu_known
        ushll   v0.8h, v0.8b, #0
        ushll   v1.8h, v1.8b, #0
        uhadd   v0.8h, v0.8h, v1.8h
        bic     v0.8h, #254, lsl #8     // will be removed: the high bits are known zero after zero extension
        ret
```

Fixes #53881
Fixes #53622
2024-04-06 21:41:24 +01:00
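The reasoning behind the BICi fold can be checked exhaustively for the example above. A small Python sketch, assuming `uhadd` is the unsigned halving add `(a + b) >> 1` and that `bic v0.8h, #254, lsl #8` clears the bits in `0xFE00` of each 16-bit lane; the helper names are hypothetical:

```python
BIC_IMM = 254 << 8  # 0xFE00, the bits cleared by the bic in the example


def uhadd(a: int, b: int) -> int:
    """Unsigned halving add on values that were zero-extended from i8."""
    return (a + b) >> 1


def bic16(x: int, imm: int) -> int:
    """Clear the bits of imm in a 16-bit lane."""
    return x & ~imm & 0xFFFF


def bic_is_redundant() -> bool:
    """True if the bic never changes the result for any pair of i8 inputs."""
    return all(
        bic16(uhadd(a, b), BIC_IMM) == uhadd(a, b)
        for a in range(256)
        for b in range(256)
    )
```

Because both inputs are at most 255, the halving add is at most 255 too, so every bit the `bic` would clear is already zero.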
Matt Arsenault
4cb110a84f
[RFC] IR: Support atomicrmw FP ops with vector types (#86796)
Allow using atomicrmw fadd, fsub, fmin, and fmax with vectors of
floating-point type. AMDGPU supports atomic fadd for <2 x half> and <2 x
bfloat> on some targets and address spaces.

Note this only supports the proper floating-point operations; float
vector typed xchg is still not supported. cmpxchg still only supports
integers, so this inserts bitcasts for the loop expansion.

I have support for fp vector typed xchg, and vector of int/ptr
separately implemented but I don't have an immediate need for those
beyond feature consistency.
2024-04-06 15:27:45 -04:00
Amara Emerson
60fc4ac67a [GlobalISel] Don't form anyextending atomic loads.
Until we can reliably check the legality and improve our selection of these,
don't form them at all.
2024-04-05 13:34:59 -07:00
Michael Liao
a1b2f0cc44 Reland "[GlobalISel] Fix the infinite loop issue in commute_int_constant_to_rhs"
- That test needs to disable combine rules by name and hence requires `asserts`.
2024-04-05 10:34:12 -04:00
Eli Friedman
c83f23d6ab
[AArch64] Fix heuristics for folding "lsl" into load/store ops. (#86894)
The existing heuristics were assuming that every core behaves like an
Apple A7, where any extend/shift costs an extra micro-op... but in
reality, nothing else behaves like that.

On some older Cortex designs, shifts by 1 or 4 cost extra, but all other
shifts/extensions are free. On all other cores, as far as I can tell,
all shifts/extensions for integer loads are free (i.e. the same cost as
an unshifted load).

To reflect this, this patch:

- Enables aggressive folding of shifts into loads by default.

- Removes the old AddrLSLFast feature, since it applies to everything
except A7 (and even if you are explicitly targeting A7, we want to
assume extensions are free because the code will almost always run on a
newer core).

- Adds a new feature AddrLSLSlow14 that applies specifically to the
Cortex cores where shifts by 1 or 4 cost extra.

I didn't add support for AddrLSLSlow14 on the GlobalISel side because it
would require a bunch of refactoring to work correctly. Someone can pick
this up as a followup.
2024-04-04 11:25:44 -07:00
Daniil Kovalev
d97d560fbf
[AArch64][PAC][MC][ELF] Support PAuth ABI compatibility tag (#85236)
Depends on #87545

Emit `GNU_PROPERTY_AARCH64_FEATURE_PAUTH` property in
`.note.gnu.property` section depending on
`aarch64-elf-pauthabi-platform` and `aarch64-elf-pauthabi-version` llvm
module flags.
2024-04-04 21:05:03 +03:00
Gulfem Savrun Yeniceri
be8fd86f6a Revert "[GlobalISel] Fix the infinite loop issue in commute_int_constant_to_rhs"
This reverts commit 1f01c580444ea2daef67f95ffc5fde2de5a37cec
because combine-commute-int-const-lhs.mir test failed in
multiple builders.
https://lab.llvm.org/buildbot/#/builders/124/builds/10375
https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751607530180046481/overview
2024-04-04 16:39:31 +00:00
darkbuck
1f01c58044
[GlobalISel] Fix the infinite loop issue in commute_int_constant_to_rhs
- When both operands are constant, the matcher runs into an infinite
  loop as the commutation should be applied only when LHS is a constant
  and RHS is not.

Reviewers: arsenm

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/87426
2024-04-03 20:52:21 -04:00
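The loop described above can be reproduced with a toy rewrite rule. A hedged Python sketch, where operands are modelled as `int` for constants and `str` for non-constants; `commute_naive`, `commute_fixed`, and `run_to_fixpoint` are hypothetical names and not the GlobalISel combiner API:

```python
def is_const(v) -> bool:
    """In this toy model, Python ints stand in for constant operands."""
    return isinstance(v, int)


def commute_naive(lhs, rhs):
    """Swap whenever the LHS is a constant (loops if both are constants)."""
    if is_const(lhs):
        return rhs, lhs, True
    return lhs, rhs, False


def commute_fixed(lhs, rhs):
    """Swap only when the LHS is a constant and the RHS is not."""
    if is_const(lhs) and not is_const(rhs):
        return rhs, lhs, True
    return lhs, rhs, False


def run_to_fixpoint(rule, lhs, rhs, max_steps=100):
    """Return the number of rewrites applied, or None if the cap is hit."""
    for step in range(max_steps):
        lhs, rhs, changed = rule(lhs, rhs)
        if not changed:
            return step
    return None
```

With two constants, `commute_naive` keeps swapping until the step cap is reached, while `commute_fixed` leaves the operands alone.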
David Green
52ae02db40 [AArch64] Add a test for non-temporal masked loads / stores. NFC 2024-04-03 19:31:25 +01:00
aniplcc
d650fcd6bf
[DAG] SimplifyDemandedVectorElts - add ISD::AVGCEILS/AVGCEILU/AVGFLOORS/AVGFLOORU nodes (#86284)
Fixes #84768
2024-04-03 15:00:50 +01:00
David Green
6288f36c16
[AArch64][GlobalISel] Basic add_sat and sub_sat vector handling. (#80650)
This tries to fill in the basic vector handling for sadd_sat/uadd_sat
and ssub_sat/usub_sat. It just handles the basics, marking legal types
and clamping illegally sized vectors to legal ones.
2024-04-03 08:44:51 +01:00
Ryotaro KASUGA
ea4a11926b
Reapply "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)" (#87312)

Fix broken test.

This reverts commit b8ead2198f27924f91b90b6c104c1234ccc8972e.
2024-04-03 09:28:09 +09:00
Kevin P. Neal
737fc353d2 [FPEnv][AArch64] Correct strictfp test.
Correct strictfp tests to follow the rules documented in the LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics

These tests needed the strictfp attribute added to some function
definitions and some function calls.

Test changes verified with D146845.
2024-04-02 09:35:44 -04:00
Il-Capitano
0ef7437780
[SelectionDAG][Statepoint] Fix truncation of gc.statepoint ID argument (#85908)
The ID argument of `gc.statepoint` gets incorrectly truncated to 32 bits
during code generation.
This is fixed by using `uint64_t` instead of `unsigned` for the `ID`
member in `SelectionDAGBuilder::StatepointLoweringInfo`, and a
`patchpoint` test case is extended to check for 64 bit ID generation in
stackmaps.
2024-04-02 09:28:19 -04:00
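The effect of storing the ID in a narrower type can be illustrated numerically. A minimal Python sketch, assuming `unsigned` is 32 bits wide on the host; the helper names are hypothetical:

```python
def store_as_unsigned32(value: int) -> int:
    """Model storing a value in a 32-bit unsigned field (assumed width)."""
    return value & 0xFFFFFFFF


def store_as_uint64(value: int) -> int:
    """Model storing a value in a uint64_t field."""
    return value & 0xFFFFFFFFFFFFFFFF


statepoint_id = 0x1_0000_0001  # an ID that needs more than 32 bits
```

With the narrower field the ID above collapses to 1, so two distinct IDs could end up indistinguishable in the stackmap; the 64-bit field preserves it.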
Thorsten Schütt
8bb9443333
[GlobalIsel] Combine G_EXTRACT_VECTOR_ELT (#85321)
preliminary steps
2024-04-02 09:01:24 +02:00
Gulfem Savrun Yeniceri
b8ead2198f Revert "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)"
This reverts commit a4dec9d6bc67c4d8fbd4a4f54ffaa0399def9627
because the test failed in the following builder:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751864477467126481/overview
2024-04-01 18:27:41 +00:00
Ryotaro KASUGA
a4dec9d6bc
[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)
`RegisterClassInfo::getRegPressureSetLimit` has been changed to return a
smaller value than before so the limit may become negative in later
calculations. As a workaround, change to use
`TargetRegisterInfo::getRegPressureSetLimit`.
Also improve tests.
2024-04-01 17:04:44 +09:00
Vitaly Buka
20f56e1f8e
[CodeGen] Add default lowering for llvm.allow.{runtime,ubsan}.check() (#86049)
RFC:
https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641
2024-03-31 22:19:33 -07:00
Jacek Caban
799e1d6a12
[IR] Use EXPORTAS for ARM64EC mangled symbols with dllexport attribute. (#81940)
We currently just use the mangled name. This works fine, because the
linker should detect that and demangle it for the export table. However,
MSVC is more specific and passes the demangled name as well, with
EXPORTAS. This PR aims to match that. MSVC doesn't use quotes in this
case, so I added '#' to the list of characters that don't need them.
2024-03-30 16:48:39 +01:00
Shilei Tian
3a106e5b2c
[GlobalISel] Fold G_ICMP if possible (#86357)
This patch tries to fold `G_ICMP` if possible.
2024-03-29 15:59:50 -04:00
Marc Auberer
d3bc9cc99b
[AArch64][GISEL] Regenerate select tests with inline register classes (#87013)
Use inline register class syntax for select test file.
2024-03-29 15:45:06 +01:00
Thorsten Schütt
84299df301
[GlobalIsel] add trunc flags (#87045)
https://github.com/llvm/llvm-project/pull/85592
2024-03-29 13:38:08 +01:00
Wang Pengcheng
610b9e23c5
[SDAG] Use shifts if ISD::MUL is illegal when lowering ISD::CTPOP (#86505)
We can avoid libcalls.

Fixes #86205
2024-03-29 15:38:39 +08:00
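To illustrate the idea of trading the multiply for shifts, here is a hedged Python sketch of the classic 32-bit bit-twiddling popcount in two variants. It assumes the usual parallel-sum expansion and is not claimed to match the exact node sequence LLVM emits; both function names are hypothetical:

```python
def _byte_sums(x: int) -> int:
    """Reduce a 32-bit value to four per-byte bit counts (each 0..8)."""
    x &= 0xFFFFFFFF
    x = x - ((x >> 1) & 0x55555555)
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)
    return (x + (x >> 4)) & 0x0F0F0F0F


def popcount32_mul(x: int) -> int:
    """Final horizontal sum via a multiply, then take the top byte."""
    return ((_byte_sums(x) * 0x01010101) & 0xFFFFFFFF) >> 24


def popcount32_shift(x: int) -> int:
    """Final horizontal sum via shifts and adds, no multiply needed."""
    x = _byte_sums(x)
    x = x + (x >> 8)
    x = x + (x >> 16)
    return x & 0x3F  # the total is at most 32, which fits in 6 bits
```

The shift variant costs a few more simple operations, but it avoids falling back to a library call on targets without a legal multiply.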
Marc Auberer
c482fad2c1
[AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972)
Fixes #86917

`FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended
up in an llvm_unreachable assertion.
2024-03-28 23:08:38 +01:00
Craig Topper
23d45e55ed
[MCP] Remove dead copies from basic blocks with successors. (#86973)
Previously we wouldn't remove dead copies from basic blocks with
successors. The comment said we didn't want to trust the live-in lists.
The comment is very old so I'm not sure if that's still a concern today.

This patch checks the live-in lists and removes copies from
MaybeDeadCopies if they are referenced by any live-ins in any
successors. We only do this if the tracksLiveness property is set. If
that property is not set, we retain the old behavior.
2024-03-28 14:43:49 -07:00
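The filtering step described above can be sketched as a set operation. This is a simplified Python model that identifies each copy by its destination register name and ignores register aliasing; `copies_safe_to_delete` and its parameters are hypothetical and only cover blocks that have successors:

```python
def copies_safe_to_delete(maybe_dead_copies, successor_live_ins, tracks_liveness):
    """Return destination registers whose copies can be removed.

    maybe_dead_copies: set of destination register names (assumed model).
    successor_live_ins: iterable of live-in register sets, one per successor.
    tracks_liveness: if False, keep the old behaviour and delete nothing.
    """
    if not tracks_liveness:
        return set()
    live = set()
    for live_ins in successor_live_ins:
        live |= set(live_ins)
    # A copy stays if any successor lists its destination as live-in.
    return {reg for reg in maybe_dead_copies if reg not in live}
```

For example, with candidates `{"x1", "x2"}` and successors whose live-ins are `{"x2"}` and `{"x3"}`, only the copy into `x1` would be deleted.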
Eli Friedman
036e7ee9d1 [NFC][AArch64] Regenerate regression tests. 2024-03-27 17:08:02 -07:00
Florian Hahn
b9cd48f96a
Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)"
This reverts commit df75183d70e029352a49c93f275db703c81a65c1.

Revert for now as this appears to cause failures on some buildbots,
e.g.:
https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio
2024-03-27 21:22:15 +00:00
Craig Topper
acab142751 [LegalizeDAG] Freeze index when converting insert_elt/insert_subvector to load/store on stack.
We try to clamp the index to be within the bounds of the stack object
we create, but if we don't freeze it, poison can propagate into the
clamp code. This can cause the access to leave the bounds of the
stack object.

We have other instances of this issue in type legalization and extract_elt/subvector,
but posting this patch first for direction check.

Fixes #86717
2024-03-27 13:01:23 -07:00
Craig Topper
0d7ea50d20 [AArch64] Pre-commit test for #86717. NFC 2024-03-27 13:01:23 -07:00
David Green
36e74cfdbd
[AArch64] Clear kill flags when removing FMOVDr. (#86308)
The uses of OldDef/NewDef may not be killed in the same place they
previously were after they are replaced, and so need to be cleared.
2024-03-27 18:36:02 +00:00
Julian Nagele
df75183d70
[TBAA] Add verifier for tbaa.struct metadata (#86709)
Adds logic to the IR verifier that checks whether !tbaa.struct nodes are
well-formed. That is, it checks that the operands of !tbaa.struct nodes
are in groups of three, that each group of three operands consists of
two integers and a valid tbaa node, and that the regions described by
the offset and size operands are non-overlapping.

PR: https://github.com/llvm/llvm-project/pull/86709
2024-03-27 10:30:27 +01:00
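The three checks listed above can be sketched outside of LLVM. A hedged Python model, assuming operands arrive as a flat list of `(offset, size, tag)` triples and that a "valid tbaa node" can be stood in for by a string starting with `!`; `verify_tbaa_struct` and `is_valid_tag` are hypothetical names, not the IR verifier's API:

```python
def is_valid_tag(tag) -> bool:
    """Stand-in for real TBAA tag validation (assumption of this sketch)."""
    return isinstance(tag, str) and tag.startswith("!")


def verify_tbaa_struct(operands):
    """Return a list of error strings; an empty list means well-formed."""
    if len(operands) % 3 != 0:
        return ["operands must come in groups of three"]
    errors = []
    regions = []
    for i in range(0, len(operands), 3):
        offset, size, tag = operands[i:i + 3]
        if not (isinstance(offset, int) and isinstance(size, int)):
            errors.append(f"group {i // 3}: offset and size must be integers")
            continue
        if not is_valid_tag(tag):
            errors.append(f"group {i // 3}: third operand must be a tbaa node")
        regions.append((offset, size))
    # Sort by offset, then each region must end before the next one starts.
    regions.sort()
    for (off1, size1), (off2, _) in zip(regions, regions[1:]):
        if off1 + size1 > off2:
            errors.append(f"regions at offsets {off1} and {off2} overlap")
    return errors
```

Sorting by offset first means only neighbouring regions need to be compared when checking for overlap.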
Thorsten Schütt
da6cc4a24f
[CodeGen] Add nneg and disjoint flags (#86650)
MachineInstr learned the new flags.
2024-03-26 18:44:34 +01:00
Il-Capitano
308ed0233a
[Intrinsics] Make patchpoint.i64 generic on its return type (#85911)
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
This patch makes `patchpoint.i64` an overloadable intrinsic, allowing
result values that can fit in a single register (e.g. integers,
pointers, floats).
2024-03-26 19:08:52 +05:30
Sander de Smalen
f914e8e77c
[AArch64][SME] Add coalescer barrier for args/results in locally streaming functions. (#85388)
Similar to how we protected FP/fixed-vector arguments and results from
calls, we should do the same for arguments/results from locally-streaming
functions such that those are not spilled/filled as ZPR registers.

This may cause a small regression (additional spills/fills), which is
addressed by #85386.
2024-03-26 11:40:31 +00:00
David Green
fbc247367a
[AArch64][GlobalISel] Legalization for small anyext/sext/zext (#86438)
Similar to #85625, some of the codegen is still far from optimal but
this helps fix quite a few fallback cases.
2024-03-26 09:48:06 +00:00
David Green
4d315ff382
[GlobalISel] Add CTLZ known bits. (#86436)
Replicated from SDAG.
2024-03-26 09:11:35 +00:00
David Green
96819daa3d
[AArch64] Handle v2i16 and v2i8 in concat load combine. (#86264)
This extends the concat load patch from
https://reviews.llvm.org/D121400, which was later moved to a combine, to
handle v2i8 and v2i16 concat loads too.
2024-03-25 17:10:23 +00:00
houndlord
9632e1515c
Match fixed width ISD::AVGFLOORS + ISD::AVGCEILS patterns (#86222) 2024-03-24 15:33:16 +00:00
David Green
e8d5223ce4 [AArch64] Additional GISel test coverage. NFC 2024-03-24 12:32:47 +00:00