In streaming mode, the @llvm.aarch64.sme.cnts and @llvm.aarch64.sve.cnt
intrinsics are equivalent. For SVE, cnt* is lowered in instCombineIntrinsic
to @llvm.vscale(). This patch lowers the SME intrinsics similarly when in
streaming mode.
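A minimal sketch of the equivalence at the source level (assuming an
SME-enabled compiler and the arm_sme.h ACLE interface; the function name is
illustrative):
```cpp
#include <arm_sme.h>

// In a streaming function the SME count (svcntsw) and the SVE count (svcntw)
// return the same value, so both can be lowered to a multiple of vscale.
uint64_t elems_per_vector(void) __arm_streaming {
  return svcntsw();  // same result as svcntw() while in streaming mode
}
```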
This puts the onus on the caller to ensure the result type is big enough.
In the unlikely event a truncated result is required, the caller should
explicitly truncate a safe value.
We currently combine (AES (EOR (A, B)), 0) into (AES A, B) for Neon
intrinsics when the zero operand appears in the RHS of the AES
instruction.
This patch extends the combine to support the SVE AES intrinsics and the
case where the zero operand appears in the LHS of the AES instruction.
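For illustration, a sketch of the SVE case (assuming an SVE2-AES target; the
function name is illustrative):
```cpp
#include <arm_sve.h>

// AESE xors its two inputs before SubBytes/ShiftRows, so the explicit EOR
// against a zero key operand is redundant and the pair folds to
// aese(state, key). The same holds when the zero is the LHS operand.
svuint8_t aes_round(svuint8_t state, svuint8_t key) {
  return svaese_u8(sveor_u8_x(svptrue_b8(), state, key), svdup_n_u8(0));
}
```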
This patch combines uxt[bhw] intrinsics to and_u when the governing
predicate is all-true or the passthrough is undef (e.g. in cases of
"unknown" merging). This improves code generation because and_u can be
emitted as an AND immediate instruction.
For example, given:
```cpp
svuint64_t foo(svuint64_t x) {
  return svextb_z(svptrue_b64(), x);
}
```
Currently:
```gas
foo:
        ptrue   p0.d
        movi    v1.2d, #0000000000000000
        uxtb    z0.d, p0/m, z0.d
        ret
```
Becomes:
```gas
foo:
        and     z0.d, z0.d, #0xff
        ret
```
The SVE intrinsics support shift amounts greater than or equal to the
element type's bit length, essentially saturating the shift amount to the
bit length. However, the IR instructions consider this undefined behaviour
that results in poison. To account for this we now ignore the result of
simplifications that produce poison. This allows existing code to be used
to simplify the shifts but does mean:
1) We don't simplify cases like "svlsl_s32(x, splat(32)) => 0".
2) We no longer constant fold cases like "svadd(poison, X) => poison".
For (1) we would need dedicated target-specific combines anyway, and the
result of (2) is not specified by the ACLE, so replicating the LLVM IR
behaviour might be confusing to ACLE writers.
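For reference, case (1) corresponds to source like the following sketch (the
function name is illustrative):
```cpp
#include <arm_sve.h>

// The ACLE saturates the shift amount to the element width, so every active
// lane here is 0. The equivalent IR 'shl' by 32 on i32 is poison instead,
// which is why the generic simplification cannot be reused for this case.
svint32_t shift_out_all_bits(svint32_t x) {
  return svlsl_s32_x(svptrue_b32(), x, svdup_n_u32(32));
}
```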
This is the subset of binops (mul and fmul are already enabled) whose
behaviour fully aligns with the equivalent SVE intrinsic. The omissions are
the integer divides and shifts, which are defined to return poison for
values where the intrinsics have a defined result. These will be covered in
a separate PR.
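As a rough illustration of the folding this enables (the function name is
illustrative):
```cpp
#include <arm_sve.h>

// With the binop simplifications enabled, an all-constant predicated add
// like this can be constant folded to a splat of 5 at the IR level.
svint32_t const_add(void) {
  return svadd_s32_x(svptrue_b32(), svdup_n_s32(2), svdup_n_s32(3));
}
```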
SVE operations such as predicated loads are canonicalized to LLVM masked
loads, and likewise canonicalizing ptrue(all) to splat(1) creates further
optimization opportunities for generic LLVM IR passes.
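A small sketch of the kind of code that benefits (the function name is
illustrative):
```cpp
#include <arm_sve.h>

// The predicated load is canonicalized to an LLVM masked load; once
// ptrue(all) is canonicalized to splat(1), generic passes can treat it as
// an ordinary unconditional vector load.
svfloat32_t load_all(const float *p) {
  return svld1_f32(svptrue_b32(), p);
}
```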
After https://github.com/llvm/llvm-project/issues/126928 it is now
possible to rewrite the existing combines, which mostly only handle cases
where an operand is an identity value, to use existing simplify code and
unlock general constant folding.
Affected intrinsics:
llvm.aarch64.sve.fcvt.bf16f32
llvm.aarch64.sve.fcvtnt.bf16f32
The named intrinsics took a predicate based on the smallest element type
when it should have been based on the largest. The intrinsics have been
replaced by v2 equivalents and the affected code ported to use them.
The patch includes changes to getSVEPredicateBitCast() that ensure the
generated code for the auto-upgraded old intrinsics is unchanged.
The "narrowing top" convert instructions leave the bottom half of active
elements untouched and thus the first paramater of their associated
intrinsic remains live even when there are no inactive lanes.
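For example, using the bf16 narrowing-top convert (a sketch; the function
name is illustrative):
```cpp
#include <arm_sve.h>

// Even with an all-true predicate, 'even' stays live: the bottom (even)
// elements of the result are taken from it and only the top (odd) elements
// are written by the conversion.
svbfloat16_t cvtnt_top(svbfloat16_t even, svfloat32_t op) {
  return svcvtnt_bf16_f32_m(even, svptrue_b32(), op);
}
```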
This is mostly NFC but some output does change due to consistently
inserting into poison rather than undef and using i64 as the index
type for inserts.
Concatenating two predicates using uzp1, after converting them to double
length via sve.convert.to/from.svbool, is optimized poorly in the backend,
resulting in additional `and` instructions to zero the lanes. See
https://github.com/llvm/llvm-project/pull/78623/
Combine this pattern to use `llvm.vector.insert` for the concatenation and
get rid of the converts to/from svbool.
Using "eabi" for aarch64 targets is a common mistake and warned by Clang Driver.
We want to avoid it elsewhere as well. Just use the common "aarch64" without
other triple components.
This patch implements IR combines to convert the intrinsics used for _m
C/C++ builtins, when they take an all-active predicate, to their equivalent
_u intrinsics.
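For example (a sketch; the function name is illustrative):
```cpp
#include <arm_sve.h>

// With an all-true governing predicate there are no inactive lanes to merge,
// so the _m builtin's intrinsic can be rewritten to the equivalent _u form,
// giving instruction selection more freedom.
svfloat32_t scale(svfloat32_t a, svfloat32_t b) {
  return svmul_f32_m(svptrue_b32(), a, b);
}
```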
Differential Revision: https://reviews.llvm.org/D152005
Consider: add(pg, a, mul_u(pg, b, c))
Although the multiply's inactive lanes are undefined, they don't
contribute to the final result: the result for the inactive lanes comes
from "a", and thus the above is another form of mla rather than mla_u.
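In ACLE terms the pattern corresponds to something like the following sketch
(names are illustrative):
```cpp
#include <arm_sve.h>

// The inner svmul_x corresponds to the mul_u above. Its inactive lanes are
// irrelevant because the outer merging add takes its inactive lanes from
// 'a', so the pair can still be fused into a single predicated mla.
svint32_t fused(svbool_t pg, svint32_t a, svint32_t b, svint32_t c) {
  return svadd_s32_m(pg, a, svmul_s32_x(pg, b, c));
}
```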
This patch extends the existing IR combines for fmul, fsub and fadd that
rely on an all-active predicate so that they also apply to the equivalent
undef (_u) intrinsics.
Differential Revision: https://reviews.llvm.org/D150768
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you have made
changes to a Python file, the best way to handle that is to run
`git checkout --ours <yourfile>` and then reformat it with `black`.
If you run into any problems, post to Discourse about it and we will try
to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762