llvm-project

Author	SHA1	Message	Date
Florian Hahn	8df64ed777	[LV] Don't consider IV increments uniform if exit value is used outside. In some cases, there might be a chain of uniform instructions producing the exit value. To generate correct code in all cases, consider the IV increment not uniform, if there are users outside the loop. Instead, let VPlan narrow the IV, if possible using the logic from 3ff1d01985752. Test case from #122602 verified with Alive2: https://alive2.llvm.org/ce/z/bA4EGj Fixes https://github.com/llvm/llvm-project/issues/122496. Fixes https://github.com/llvm/llvm-project/issues/122602.	2025-01-12 22:03:21 +00:00
Florian Hahn	3ff1d01985	Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67. Re-applies commit with typos fixed.	2025-01-12 20:10:28 +00:00
Florian Hahn	0ebb3ac7c9	Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac. Typo breaking the build	2025-01-12 19:37:45 +00:00
Florian Hahn	1afba19913	[VPlan] Try to narrow wide and replicating recipes to uniform recipes. Use the existing VPlan-based analysis to identify recipes that only have their first lane demanded and transform them to uniform replicate recipes. This simplifies the generated code in some places and prepares for fixing https://github.com/llvm/llvm-project/issues/122496.	2025-01-12 19:32:01 +00:00
Florian Hahn	44058e5b5f	[LV] Precommit tests for #106441 . Tests for https://github.com/llvm/llvm-project/pull/106441 from https://github.com/llvm/llvm-project/issues/82936.	2025-01-10 18:49:44 +00:00
Florian Hahn	b0697dc1de	[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994 ) Currently we emit early-exit related debug messages/remarks even when there is a single exit. Update to only check isVectorizableEarlyExitLoop if there isn't a single exit block. PR: https://github.com/llvm/llvm-project/pull/121994	2025-01-09 12:05:19 +00:00
Luke Lau	f0d5104c94	[VPlan] Handle some VPInstructions in may{Read,Write}FromMemory (#120058 ) This just copies the same conservative definition from mayWriteToMemory, and enables more VPInstructions to be hoisted out in LICM. I think this should give more accurate costs, and I was able to build llvm-test-suite without the legacy-vplan cost model assertion going off.	2025-01-08 15:17:26 +08:00
Florian Mayer	ef391dbc29	[LV] Drop incorrect inbounds for reverse vector pointer when folding tail (#120730 ) When folding the tail, we may compute an address that we don't in the original scalar loop and it may not be inbounds. Drop Inbounds in that case.	2025-01-07 06:14:01 -08:00
Florian Hahn	f48884ded8	[VPlan] Remove loop region in optimizeForVFAndUF. (#108378 ) Update optimizeForVFAndUF to completely remove the vector loop region when possible. At the moment, we cannot remove the region if it contains * widened IVs: the recipe is needed to generate the step vector * reductions: ComputeReductionResults requires the reduction phi recipe for codegen. Both cases can be addressed by more explicit modeling. The patch also includes a number of updates to allow executing VPlans without a vector loop region. Depends on https://github.com/llvm/llvm-project/pull/110004	2025-01-05 15:50:42 +00:00
Florian Hahn	df4a615c98	[VPlan] Convert induction increment check to be VPlan-based. Check the VPlan directly to determine if a VPValue is an optimizable IV or IV use instead of checking the underlying IR instructions. Split off from https://github.com/llvm/llvm-project/pull/112147. This refactoring enables moving IV end value creation from the legacy fixupIVUsers to a VPlan-based transform. There is one case we now won't optimize, that is IVs with subtracts and non-constant steps. But as this is a minor optimization and doesn't impact correctness, the benefits of performing the check in VPlan should outweigh the missed case.	2025-01-05 11:16:01 +00:00
Florian Hahn	b95cce9904	[VPlan] Update wide induction inc recipes to use same step as Wide IV. Update wide induction increments to use the same step as the corresponding wide induction. This enables detecting induction increments directly in VPlan and removes redundant splats.	2025-01-04 20:04:59 +00:00
Florian Hahn	4a7c0b8afe	[LV] Add X86-specific induction step tests. Adds additional test coverage for induction codegen.	2025-01-04 15:09:04 +00:00
Florian Mayer	62b5cf0410	[Vectorizer] precommit test for miscompilation (#120731) We generate GEPs that are out of bounds but mark them as "inbounds".	2025-01-03 06:37:45 -08:00
Florian Hahn	7f3428d3ed	[VPlan] Compute induction end values in VPlan. (#112145 ) Use createDerivedIV to compute IV end values directly in VPlan, instead of creating them up-front. This allows updating IV users outside the loop as follow-up. Depends on https://github.com/llvm/llvm-project/pull/110004 and https://github.com/llvm/llvm-project/pull/109975. PR: https://github.com/llvm/llvm-project/pull/112145	2024-12-29 19:05:08 +00:00
Florian Hahn	2d038caeeb	[VPlan] Remove stray space when printing VPWidenCastRecipe. printFlags() already takes care of printing a single space if there are no flags. Remove the extra space when printing a recipe without flags.	2024-12-24 20:23:48 +00:00
Florian Hahn	df8efbdbbf	[SCEV] Remove existing predicates implied by newly added ones. (#118185 ) When adding a new predicate to a union predicate, some of the existing predicates may be implied by the new predicate. Remove any existing predicates that are already implied by the new predicate. Depends on https://github.com/llvm/llvm-project/pull/118184 to show the main benefit. PR: https://github.com/llvm/llvm-project/pull/118185	2024-12-20 20:49:37 +00:00
David Sherwood	5845298f94	[LoopVectorize] Teach some X86 cost model tests to use new vplan costs (#120738 ) I've only fixed up the tests where I was able to use a simple sed script to replace the text. Even after this patch lands, there are still over 50 tests that need updating in X86/CostModel!	2024-12-20 15:08:08 +00:00
David Sherwood	c18fda02e1	[LoopVectorize] Use new single string variant of reportVectorizationFailure (#120414 )	2024-12-19 10:07:13 +00:00
Alexander Kornienko	23a239267e	Revert "[InstCombine] Infer nuw for gep inbounds from base of object" (#120460 ) Reverts llvm/llvm-project#119225 due to the lack of sanitizer support, large potential of breaking code containing latent UB, non-trivial localization and investigation, and what seems to be a bad interaction with msan (a test is in the works). Related discussions: https://github.com/llvm/llvm-project/pull/119225#issuecomment-2551904822 https://github.com/llvm/llvm-project/pull/118472#issuecomment-2549986255	2024-12-18 19:06:34 +01:00
Florian Hahn	4ad0fdd163	[VPlan] Remove reverse() of predecessors from VPInstruction::generate. This was originally done to reduce the diff for the change. Remove it and update the remaining tests. NFC modulo reordering of incoming values. Clean up after https://github.com/llvm/llvm-project/pull/114292.	2024-12-17 20:44:32 +00:00
Nikita Popov	1157187496	[VPlan] Propagate all GEP flags (#119899 ) Store GEPNoWrapFlags instead of only InBounds and propagate them.	2024-12-17 13:48:50 +01:00
Nikita Popov	e21ab4d16b	[InstCombine] Infer nuw for gep inbounds from base of object (#119225 ) When we have a gep inbounds from the base of an object (e.g. alloca or global), we know that the index cannot be negative, as this would go out of bounds. As such, we can infer nuw as well. The implementation is a bit stricter than necessary, we could also accept one unknown index followed by known-non-negative indices. Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently incorrectly doesn't require the inbounds for the alloca case, see https://github.com/AliveToolkit/alive2/issues/1138).	2024-12-10 10:00:50 +01:00
Florian Hahn	0e70289f37	[VPlan] Create canonical IV resume value for epilogue in VPlan. (NFCI) Update the code to create induction resume PHIs to also create a resume phi for the canonical induction during epilogue vectorization. This unifies the code for handling induction resume values and removes the need to manually create a resume PHI and return it during epilogue creation. Overall it helps to move the code for updating the canonical induction resume value to the place where all other header phi resume values are updated. This is NFC, modulo order of the created phis.	2024-12-09 23:11:38 +00:00
Nikita Popov	10f315dc9c	[ConstantFolding] Infer getelementptr nuw flag (#119214 ) Infer nuw from nusw and nneg. This is the constant expression variant of https://github.com/llvm/llvm-project/pull/111144. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-09 16:44:05 +01:00
Florian Hahn	7f7f540a48	Reapply "[VPlan] Update scalar induction resume values in VPlan. (#110577 )" This reverts commit f09b16e2671cbcdf7cb7dc7ed705db092a9deda1. The crash when building llvm-test-suite with stage2 should have been fixed by 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.	2024-12-06 19:41:51 +00:00
Nikita Popov	f09b16e267	Revert "[VPlan] Update scalar induction resume values in VPlan. (#110577 )" This reverts commit 0678e2058364ec10b94560d27ec7138dfa003287. This reverts commit 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea. Causes crashes in llvm-test-suite when using stage 2 clang.	2024-12-06 18:01:42 +01:00
Florian Hahn	0678e20583	[VPlan] Update scalar induction resume values in VPlan. (#110577 ) Updated ILV.createInductionResumeValues (now createInductionResumeVPValue) to directly update the VPIRInstructions wrapping the original phis with the created resume values. This is the first step towards modeling them completely in VPlan. Subsequent patches will move creation of the resume values completely into VPlan. Depends on https://github.com/llvm/llvm-project/pull/109975. PR: https://github.com/llvm/llvm-project/pull/110577	2024-12-06 12:26:19 +00:00
Nikita Popov	f7685af4a5	[InstCombine] Move gep of phi fold into separate function This makes sure that an early return during this fold doesn't end up skipping later gep folds.	2024-12-05 15:20:56 +01:00
Nikita Popov	462cb3cd6c	[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144 ) If the gep is nusw (usually via inbounds) and the offset is non-negative, we can infer nuw. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-05 14:36:40 +01:00
Florian Hahn	82821254f5	[LV] Use IVUpdateMayOverflow to set HasNUW. (#111758 ) If IVUpdateMayOverflow is false, we proved that the induction increment cannot overflow in the vector loop. This allows setting NUW in some cases when folding the tail. PR: https://github.com/llvm/llvm-project/pull/111758	2024-11-28 10:12:41 +00:00
Lee Wei	abb9f9fa06	[llvm] Remove `br i1 undef` from some regression tests [NFC] (#117112 ) This PR removes tests with `br i1 undef` under `llvm/tests/Transforms/Loop, Lower`.	2024-11-21 08:06:56 +00:00
David Sherwood	aeb88f6778	Fix test failures introduced by PR #113697 (#116941 ) Don't match the entire floating point debug output since it's prone to rounding errors depending upon the target.	2024-11-20 09:10:51 +00:00
David Sherwood	3097c60928	[LoopVectorize][NFC] Rewrite tests to check output of vplan cost model (#113697 ) Currently it's very difficult to improve the cost model for tail-folded loops because as soon as you add a VPInstruction::computeCost function that adds the costs of instructions such as VPInstruction::ActiveLaneMask and VPInstruction::ExplicitVectorLength the assert in LoopVectorizationPlanner::computeBestVF fails for some tests. This is because the VF chosen by the legacy cost model doesn't match the vplan cost model. See PR #90191. This assert is currently making it difficult to improve the cost model. Hopefully we will be in a position to remove the assert soon, however in order to do that we have to fix up a whole bunch of tests that rely upon the legacy cost model output. I've tried my best to update these tests to use vplan output instead. There is still work needed for the VF=1 case because the vplan cost model is not printed out in this case. I've not attempted to fix those in this patch.	2024-11-19 08:55:39 +00:00
Julian Nagele	a8538b9138	[LV] Vectorize Epilogues for loops with small VF but high IC (#108190 ) - Consider MainLoopVF * IC when determining whether Epilogue Vectorization is profitable - Allow the same VF for the Epilogue as for the main loop - Use an upper bound for the trip count of the Epilogue when choosing the Epilogue VF PR: https://github.com/llvm/llvm-project/pull/108190 --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-11-17 19:35:32 +00:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548 )	2024-11-06 11:53:33 +00:00
Florian Hahn	a353e258ba	[LAA] Don't require Stride == 1/-1 for inbounds pointer AddRecs nowrap. (#113126) If we have a pointer AddRec, the maximum increment is 2^(pointer-index-width - 1) - 1. This means that if incrementing the AddRec wraps, the distance between the previously accessed location and the wrapped location is > 2^(pointer-index-width - 1), i.e. if the GEP for the AddRec is inbounds, this would be poison due to the object being larger than half the pointer index type space. The poison would be immediate UB when the memory access gets executed. Similar reasoning can be applied for decrements. PR: https://github.com/llvm/llvm-project/pull/113126	2024-11-05 22:45:56 +01:00
Florian Hahn	b021464d35	[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975 ) Update VPlan to include the scalar loop header. This allows retiring VPLiveOut, as the remaining live-outs can now be handled by adding operands to the wrapped phis in the scalar loop header. Note that the current version only includes the scalar loop header, no other loop blocks and also does not wrap it in a region block. PR: https://github.com/llvm/llvm-project/pull/109975	2024-10-31 21:36:44 +01:00
David Sherwood	7f498a865f	[CostModel][LoopVectorize] Move some loop vectoriser tests (#113702 ) Many tests that were in test/Analysis/CostModel were actually loop vectoriser tests. I've moved them as follows: Analysis/CostModel/X86 -> Transforms/LoopVectorize/X86/CostModel Analysis/CostModel/AArch64/arith-fp-frem.ll -> Transforms/LoopVectorize/AArch64/arith-fp-frem-costs.ll	2024-10-30 13:50:02 +00:00
Rohit Aggarwal	dfb60bb919	Adding more vector calls for -fveclib=AMDLIBM (#109662) AMD has its own implementation of vector calls. New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos. Please refer to https://github.com/amd/aocl-libm-ose --------- Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>	2024-10-29 10:09:55 +00:00
Florian Hahn	e724226da7	[VPlan] Return cost of 0 for VPWidenCastRecipe without underlying value. In some cases, VPWidenCastRecipes are created but not considered in the legacy cost model, including truncates/extends when evaluating a reduction in a smaller type. Return 0 for such casts for now, to avoid divergences between VPlan and legacy cost models. Fixes https://github.com/llvm/llvm-project/issues/113526.	2024-10-25 21:25:44 +02:00
Florian Hahn	2dfb1c664c	[VPlan] Try to hoist Previous (and operands), if sinking fails for FORs. (#108945) In some cases, Previous (and its operands) can be hoisted. This allows supporting additional cases where sinking all users of the FOR fails, e.g. due to having to sink recipes with side-effects. This fixes a crash where we fail to create a scalar VPlan for a first-order recurrence, but can create a vector VPlan, because the trunc instruction of an IV which generates the previous value of the recurrence has been optimized to a truncated induction recipe, thus hoisting it to the beginning. Fixes https://github.com/llvm/llvm-project/issues/106523. PR: https://github.com/llvm/llvm-project/pull/108945	2024-10-23 13:12:03 -07:00
Florian Hahn	ddbb382a7c	[LV] Regenerate check-lines for some tests.	2024-10-23 04:34:13 +01:00
Florian Hahn	2a6b09e0d3	[LV] Use type from InsertPos for cost computation of interleave groups. Previously the legacy cost model would pick the type for the cost computation depending on the order of the members in the input IR. This is incompatible with the VPlan-based cost model (independent of original IR order) and also doesn't match code-gen, which uses the type of the insert position. Update the legacy cost model to use the type (and address space) from the Group's insert position. This brings the legacy cost model in line with the VPlan-based cost model and fixes a divergence between both models. Note that the X86 cost model seems to assign different costs to groups with i64 and double types. Added a TODO to check. Fixes https://github.com/llvm/llvm-project/issues/112922.	2024-10-18 19:12:40 -07:00
Florian Hahn	b497010854	[VPlan] Use VPInstruction::Name when assigning names (NFCI). This slightly improves the printing of VPInstructions. NFC except debug output.	2024-10-18 05:52:35 +01:00
Yingwei Zheng	095d49da76	[InstCombine] Set `samesign` when converting signed predicates into unsigned (#112642 ) Alive2: https://alive2.llvm.org/ce/z/6cqdt-	2024-10-17 20:43:48 +08:00
Florian Hahn	3860e29e0e	[VPlan] Mark VPVectorPointerRecipe as not having side effects. VectorPointer doesn't read from memory or have any side effects. Mark it accordingly.	2024-10-16 06:10:19 +01:00
Florian Hahn	bb937e276d	[LV] Compute value of escaped induction based on the computed end value. (#110576 ) Update fixupIVUsers to compute the value for escaped inductions using the already computed end value of the induction (EndValue), but subtracting the step. This results in slightly simpler codegen, as we avoid computing the full transformed index at VectorTripCount - 1. PR: https://github.com/llvm/llvm-project/pull/110576	2024-10-10 20:04:46 +01:00
Florian Hahn	01cbbc52dc	[VPlan] Request lane 0 for pointer arg in PtrAdd. After 7f74651, the pointer operand of a PtrAdd may be replicated. Instead of requesting a single scalar, request lane 0, which correctly handles the case when there is a scalar-per-lane. Fixes https://github.com/llvm/llvm-project/issues/111606.	2024-10-09 13:18:54 +01:00
Florian Hahn	36fc291b6e	[VPlan] Implement VPBlendRecipe::computeCost. Implement VPBlendRecipe::computeCost. VPBlendRecipe is currently also used if only the first lane is used. This also requires pre-computing costs for forced scalars and instructions considered profitable to scalarize. For those, the cost will be computed separately in the legacy cost model. This will also be needed when implementing VPReplicateRecipe::computeCost.	2024-10-08 21:33:42 +01:00
Florian Hahn	3ec6f805c5	[VPlan] Don't create GEP x, 0 for interleave group pointers. The GEP with offset 0 is redundant, remove it. This addresses a TODO from 7f74651837b (#106431).	2024-10-08 12:08:13 +01:00

846 Commits
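
Note on 462cb3cd6c and 10f315dc9c: both commits infer `nuw` on a getelementptr from `nusw` plus a non-negative offset. The sketch below checks that implication by brute force in a deliberately simplified model, which is an assumption made for illustration and not LLVM's implementation: pointers are 8 bits wide and the GEP adds a single byte offset. The helper names (`to_signed`, `is_nusw`, `is_nuw`, `find_counterexamples`) are hypothetical.

```python
# Simplified model (an assumption for illustration, not LLVM code):
# a pointer is an unsigned BITS-wide integer and the GEP adds one
# BITS-wide byte offset to it.
BITS = 8
MOD = 1 << BITS


def to_signed(x):
    """Interpret an unsigned BITS-wide value as two's complement."""
    return x - MOD if x >= MOD // 2 else x


def is_nusw(base, offset):
    """No unsigned-signed wrap: unsigned base plus signed offset stays in range."""
    return 0 <= base + to_signed(offset) < MOD


def is_nuw(base, offset):
    """No unsigned wrap: unsigned base plus unsigned offset stays in range."""
    return base + offset < MOD


def find_counterexamples():
    """Return (base, offset) pairs where nusw and nneg hold but nuw does not."""
    return [
        (base, offset)
        for base in range(MOD)
        for offset in range(MOD)
        if to_signed(offset) >= 0
        and is_nusw(base, offset)
        and not is_nuw(base, offset)
    ]


print(find_counterexamples())
```

In this model the search finds no counterexample, because a non-negative offset has the same value under both the signed and the unsigned reading. The non-negativity condition matters: with base 10 and offset 255 (that is, -1), `is_nusw` holds but `is_nuw` does not.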