llvm-project

Author	SHA1	Message	Date
David Green	4dcf33b6c2	[AArch64] Cleanup and GISel coverage for lrint tests. NFC	2024-04-10 18:13:57 +01:00
Dinar Temirbulatov	990c4bc95f	[AArch64][SVE2] Generate SVE2 BSL instruction in LLVM for bit-twiddling. (#83514 ) Allow folding or/and-and into a BSL instruction for scalable vectors.	2024-04-10 11:07:59 +01:00
Dinar Temirbulatov	528943f153	[AArch64][SME] Allow memory operations lowering to custom SME functions. (#79263 ) This change allows memcpy, memset, and memmove to be lowered to custom SME versions provided by LibRT.	2024-04-09 17:27:46 +01:00
Sam Tebbs	fb8dbd1fb6	[AArch64] Remove copy in SVE/SME predicate spill and fill (#81716 ) 7dc20ab introduced an extra COPY when spilling and filling a PNR register, which can't be elided as the input (PNR predicate) and output (PPR predicate) register classes differ. The patch adds a new register class that covers both PPR and PNR so that STR_PXI and LDR_PXI can take either of them, removing the need for the copy.	2024-04-09 16:17:27 +01:00
Eli Friedman	7ad481e76c	Revert "[AArch64] Add support for -ffixed-x30" (#88019 ) This reverts commit e770153865c53c4fd72a68f23acff33c24e42a08. This wasn't reviewed, and the functionality in question was intentionally rejected the last time it was discussed in https://reviews.llvm.org/D56305 .	2024-04-08 15:16:00 -07:00
Leonard Grey	c23135c548	-fsanitize=function: fix .subsections_via_symbols (#87527 ) -fsanitize=function emits a signature and function hash before a function. Similar to 7f6e2c9, these can be sheared off when `.subsections_via_symbols` is used. This change uses the same technique 7f6e2c9 introduced for prefixes: emitting a symbol for the metadata, then marking the actual function entry as an .alt_entry symbol.	2024-04-08 16:05:52 -04:00
Daniil Kovalev	89eb1a5a8e	[test][AArch64][CodeGen] Delete redundant check lines (#87965 ) llvm/test/CodeGen/AArch64/elf-globals-pic.ll: Since https://reviews.llvm.org/D91734, elf-globals-static.ll test contains several `CHECK-PIC` lines. They do not seem to bring any value since there are no FileCheck run lines checking against this prefix. The right place for such tests should be elf-globals-pic.ll, which already contains check lines being deleted in this commit. Both elf-globals-pic.ll and elf-globals-static.ll were created after splitting arm64-elf-globals.ll in 6dbd0ea, and having `CHECK-PIC` lines in elf-globals-static.ll seems like an issue that occurred because git treated elf-globals-pic.ll as a new file and elf-globals-static.ll as a rename of arm64-elf-globals.ll. llvm/test/CodeGen/AArch64/tagged-globals-pic.ll: Similar to elf-globals-pic.ll, contains unneeded `CHECK-SELECTIONDAGISEL` and `CHECK-GLOBALISEL` directives not checked by any FileCheck invocation. These directives are present in tagged-globals-static.ll. Both tests are present in the code tree since fd32639 when tagged-globals.ll was split into tagged-globals-{pic|static}.ll.	2024-04-08 22:27:50 +03:00
Matt Arsenault	8cb642bf18	GlobalISel: Regenerate test checks	2024-04-08 08:32:04 -04:00
David Green	9fd2e2c2fd	[DAG][AArch64] Support masked loads/stores with nontemporal flags (#87608 ) SVE has some non-temporal masked loads and stores. The metadata coming from the nodes is not copied to the MMO at the moment though, meaning it will generate a normal instruction. This patch ensures that the right flags are set if the instruction has non-temporal metadata.	2024-04-08 08:53:27 +01:00
David Green	ac321cbb03	[AArch64][GlobalISel] Legalize Insert vector element (#81453 ) This attempts to standardize and extend some of the insert vector element lowering. Most notably: - More types are handled by splitting illegal vectors. - The index type for G_INSERT_VECTOR_ELT is canonicalized to TLI.getVectorIdxTy(), similar to extract_vector_element. - Some of the existing patterns now have the index type specified to make sure they can apply to GISel too. - The C++ selection code has been removed, relying on tablegen patterns. - G_INSERT_VECTOR_ELT with small GPR input elements are pre-selected to use an i32 type, allowing the existing patterns to apply. - Variable index inserts are lowered in post-legalizer lowering, expanding into a stack store and reload.	2024-04-08 08:44:13 +01:00
darkbuck	8e98435ae9	[GISel][Combine] Enhance combining on G_BUILD_VECTOR Reviewers: aemerson, arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/87831	2024-04-06 18:33:01 -04:00
Sizov Nikita	d38bff460a	[AArch64] SimplifyDemandedBitsForTargetNode - add AArch64ISD::BICi handling (#76644 ) Fold BICi if all destination bits are already known to be zeroes ```llvm define <8 x i16> @haddu_known(<8 x i8> %a0, <8 x i8> %a1) { %x0 = zext <8 x i8> %a0 to <8 x i16> %x1 = zext <8 x i8> %a1 to <8 x i16> %hadd = call <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16> %x0, <8 x i16> %x1) %res = and <8 x i16> %hadd, <i16 511, i16 511, i16 511, i16 511,i16 511, i16 511, i16 511, i16 511> ret <8 x i16> %res } declare <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16>, <8 x i16>) ``` ``` haddu_known: // @haddu_known ushll v0.8h, v0.8b, #0 ushll v1.8h, v1.8b, #0 uhadd v0.8h, v0.8h, v1.8h bic v0.8h, #254, lsl #8 <-- this one will be removed as we know high bits are zero extended ret ``` Fixes #53881 Fixes #53622	2024-04-06 21:41:24 +01:00
Matt Arsenault	4cb110a84f	[RFC] IR: Support atomicrmw FP ops with vector types (#86796 ) Allow using atomicrmw fadd, fsub, fmin, and fmax with vectors of floating-point type. AMDGPU supports atomic fadd for <2 x half> and <2 x bfloat> on some targets and address spaces. Note this only supports the proper floating-point operations; float vector typed xchg is still not supported. cmpxchg still only supports integers, so this inserts bitcasts for the loop expansion. I have support for fp vector typed xchg, and vector of int/ptr separately implemented but I don't have an immediate need for those beyond feature consistency.	2024-04-06 15:27:45 -04:00
Amara Emerson	60fc4ac67a	[GlobalISel] Don't form anyextending atomic loads. Until we can reliably check the legality and improve our selection of these, don't form them at all.	2024-04-05 13:34:59 -07:00
Michael Liao	a1b2f0cc44	Reland "[GlobalISel] Fix the infinite loop issue in `commute_int_constant_to_rhs`" - That test needs to disable combine rules by name and hence requires `asserts`.	2024-04-05 10:34:12 -04:00
Eli Friedman	c83f23d6ab	[AArch64] Fix heuristics for folding "lsl" into load/store ops. (#86894 ) The existing heuristics were assuming that every core behaves like an Apple A7, where any extend/shift costs an extra micro-op... but in reality, nothing else behaves like that. On some older Cortex designs, shifts by 1 or 4 cost extra, but all other shifts/extensions are free. On all other cores, as far as I can tell, all shifts/extensions for integer loads are free (i.e. the same cost as an unshifted load). To reflect this, this patch: - Enables aggressive folding of shifts into loads by default. - Removes the old AddrLSLFast feature, since it applies to everything except A7 (and even if you are explicitly targeting A7, we want to assume extensions are free because the code will almost always run on a newer core). - Adds a new feature AddrLSLSlow14 that applies specifically to the Cortex cores where shifts by 1 or 4 cost extra. I didn't add support for AddrLSLSlow14 on the GlobalISel side because it would require a bunch of refactoring to work correctly. Someone can pick this up as a followup.	2024-04-04 11:25:44 -07:00
Daniil Kovalev	d97d560fbf	[AArch64][PAC][MC][ELF] Support PAuth ABI compatibility tag (#85236 ) Depends on #87545 Emit `GNU_PROPERTY_AARCH64_FEATURE_PAUTH` property in `.note.gnu.property` section depending on `aarch64-elf-pauthabi-platform` and `aarch64-elf-pauthabi-version` llvm module flags.	2024-04-04 21:05:03 +03:00
Gulfem Savrun Yeniceri	be8fd86f6a	Revert "[GlobalISel] Fix the infinite loop issue in `commute_int_constant_to_rhs`" This reverts commit 1f01c580444ea2daef67f95ffc5fde2de5a37cec because combine-commute-int-const-lhs.mir test failed in multiple builders. https://lab.llvm.org/buildbot/#/builders/124/builds/10375 https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751607530180046481/overview	2024-04-04 16:39:31 +00:00
darkbuck	1f01c58044	[GlobalISel] Fix the infinite loop issue in `commute_int_constant_to_rhs` - When both operands are constant, the matcher runs into an infinite loop as the commutation should be applied only when LHS is a constant and RHS is not. Reviewers: arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/87426	2024-04-03 20:52:21 -04:00
David Green	52ae02db40	[AArch64] Add a test for non-temporal masked loads / stores. NFC	2024-04-03 19:31:25 +01:00
aniplcc	d650fcd6bf	[DAG] SimplifyDemandedVectorElts - add ISD::AVGCEILS/AVGCEILU/AVGFLOORS/AVGFLOORU nodes (#86284 ) Fixes #84768	2024-04-03 15:00:50 +01:00
David Green	6288f36c16	[AArch64][GlobalISel] Basic add_sat and sub_sat vector handling. (#80650 ) This tries to fill in the basic vector handling for sadd_sat/uadd_sat and ssub_sat/usub_sat. It just handles the basics, marking legal types and clamping illegally sized vectors to legal ones.	2024-04-03 08:44:51 +01:00
Ryotaro KASUGA	ea4a11926b	Reapply "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)" (#87312 ) Fix broken test. This reverts commit b8ead2198f27924f91b90b6c104c1234ccc8972e.	2024-04-03 09:28:09 +09:00
Kevin P. Neal	737fc353d2	[FPEnv][AArch64] Correct strictfp test. Correct strictfp tests to follow the rules documented in the LangRef: https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics These tests needed the strictfp attribute added to some function definitions and some function calls. Test changes verified with D146845.	2024-04-02 09:35:44 -04:00
Il-Capitano	0ef7437780	[SelectionDAG][Statepoint] Fix truncation of `gc.statepoint` ID argument (#85908 ) The ID argument of `gc.statepoint` gets incorrectly truncated to 32 bits during code generation. This is fixed by using `uint64_t` instead of `unsigned` for the `ID` member in `SelectionDAGBuilder::StatepointLoweringInfo`, and a `patchpoint` test case is extended to check for 64 bit ID generation in stackmaps.	2024-04-02 09:28:19 -04:00
Thorsten Schütt	8bb9443333	[GlobalIsel] Combine G_EXTRACT_VECTOR_ELT (#85321 ) preliminary steps	2024-04-02 09:01:24 +02:00
Gulfem Savrun Yeniceri	b8ead2198f	Revert "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030 )" This reverts commit a4dec9d6bc67c4d8fbd4a4f54ffaa0399def9627 because the test failed in the following builder: https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751864477467126481/overview	2024-04-01 18:27:41 +00:00
Ryotaro KASUGA	a4dec9d6bc	[CodeGen] Fix register pressure computation in MachinePipeliner (#87030 ) `RegisterClassInfo::getRegPressureSetLimit` has been changed to return a smaller value than before so the limit may become negative in later calculations. As a workaround, change to use `TargetRegisterInfo::getRegPressureSetLimit`. Also improve tests.	2024-04-01 17:04:44 +09:00
Vitaly Buka	20f56e1f8e	[CodeGen] Add default lowering for llvm.allow.{runtime,ubsan}.check() (#86049 ) RFC: https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641	2024-03-31 22:19:33 -07:00
Jacek Caban	799e1d6a12	[IR] Use EXPORTAS for ARM64EC mangled symbols with dllexport attribute. (#81940 ) We currently just use mangled name. This works fine, because linker should detect that and demangle it for the export table. However, on MSVC, the compiler is more specific and passes demangled name as well, with EXPORTAS. This PR aims to match that. MSVC doesn't use quotes in this case, so I added '#' to the list of characters that don't need it.	2024-03-30 16:48:39 +01:00
Shilei Tian	3a106e5b2c	[GlobalISel] Fold G_ICMP if possible (#86357 ) This patch tries to fold `G_ICMP` if possible.	2024-03-29 15:59:50 -04:00
Marc Auberer	d3bc9cc99b	[AArch64][GISEL] Regenerate select tests with inline register classes (#87013 ) Use inline register class syntax for select test file.	2024-03-29 15:45:06 +01:00
Thorsten Schütt	84299df301	[GlobalIsel] add trunc flags (#87045 ) https://github.com/llvm/llvm-project/pull/85592	2024-03-29 13:38:08 +01:00
Wang Pengcheng	610b9e23c5	[SDAG] Use shifts if ISD::MUL is illegal when lowering ISD::CTPOP (#86505 ) We can avoid libcalls. Fixes #86205	2024-03-29 15:38:39 +08:00
Marc Auberer	c482fad2c1	[AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972 ) Fixes #86917 `FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended up in an llvm_unreachable assertion.	2024-03-28 23:08:38 +01:00
Craig Topper	23d45e55ed	[MCP] Remove dead copies from basic blocks with successors. (#86973 ) Previously we wouldn't remove dead copies from basic blocks with successors. The comment said we didn't want to trust the live-in lists. The comment is very old so I'm not sure if that's still a concern today. This patch checks the live-in lists and removes copies from MaybeDeadCopies if they are referenced by any live-ins in any successors. We only do this if the tracksLiveness property is set. If that property is not set, we retain the old behavior.	2024-03-28 14:43:49 -07:00
Eli Friedman	036e7ee9d1	[NFC][AArch64] Regenerate regression tests.	2024-03-27 17:08:02 -07:00
Florian Hahn	b9cd48f96a	Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709 )" This reverts commit df75183d70e029352a49c93f275db703c81a65c1. Revert for now as this appears to cause failures on some buildbots, e.g.: https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio	2024-03-27 21:22:15 +00:00
Craig Topper	acab142751	[LegalizeDAG] Freeze index when converting insert_elt/insert_subvector to load/store on stack. We try to clamp the index to be within the bounds of the stack object we create, but if we don't freeze it, poison can propagate into the clamp code. This can cause the access to leave the bounds of the stack object. We have other instances of this issue in type legalization and extract_elt/subvector, but posting this patch first for direction check. Fixes #86717	2024-03-27 13:01:23 -07:00
Craig Topper	0d7ea50d20	[AArch64] Pre-commit test for #86717 . NFC	2024-03-27 13:01:23 -07:00
David Green	36e74cfdbd	[AArch64] Clear kill flags when removing FMOVDr. (#86308 ) The uses of OldDef/NewDef may not be killed in the same place they previously were after they are replaced, and so need to be cleared.	2024-03-27 18:36:02 +00:00
Julian Nagele	df75183d70	[TBAA] Add verifier for tbaa.struct metadata (#86709 ) Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: https://github.com/llvm/llvm-project/pull/86709	2024-03-27 10:30:27 +01:00
Thorsten Schütt	da6cc4a24f	[CodeGen] Add nneg and disjoint flags (#86650 ) MachineInstr learned the new flags.	2024-03-26 18:44:34 +01:00
Il-Capitano	308ed0233a	[Intrinsics] Make `patchpoint.i64` generic on its return type (#85911 ) Currently patchpoints can only have two result types, `void` and `i64`. This limits the result to general purpose registers. This patch makes `patchpoint.i64` an overloadable intrinsic, allowing result values that can fit in a single register (e.g. integers, pointers, floats).	2024-03-26 19:08:52 +05:30
Sander de Smalen	f914e8e77c	[AArch64][SME] Add coalescer barrier for args/results in locally streaming functions. (#85388 ) Similar to how we protected FP/fixed-vector arguments and results from calls, we should do the same for arguments/results from locally-streaming functions such that those are not spilled/filled as ZPR registers. This may cause a small regression (additional spills/fills), which is addressed by #85386.	2024-03-26 11:40:31 +00:00
David Green	fbc247367a	[AArch64][GlobalISel] Legalization for small anyext/sext/zext (#86438 ) Similar to #85625, some of the codegen is still far from optimal but this helps fix quite a few fallback cases.	2024-03-26 09:48:06 +00:00
David Green	4d315ff382	[GlobalISel] Add CTLZ known bits. (#86436 ) Replicated from SDAG.	2024-03-26 09:11:35 +00:00
David Green	96819daa3d	[AArch64] Handle v2i16 and v2i8 in concat load combine. (#86264 ) This extends the concat load patch from https://reviews.llvm.org/D121400, which was later moved to a combine, to handle v2i8 and v2i16 concat loads too.	2024-03-25 17:10:23 +00:00
houndlord	9632e1515c	Match fixed width ISD::AVGFLOORS + ISD::AVGCEILS patterns (#86222 )	2024-03-24 15:33:16 +00:00
David Green	e8d5223ce4	[AArch64] Additional GISel test coverage. NFC	2024-03-24 12:32:47 +00:00

7667 Commits