llvm-project

Author	SHA1	Message	Date
Luke Lau	8107b430ed	[VPlan] Simplify select c, x, x -> x (#133731 ) As noted in 1a9358c090d0507be21c5e9b2d97a23ef1de8ab0, some simplifications can produce a redundant select where the true and false operands are the same, which this patch removes. The is_fpclass test was changed so the condition wasn't made dead.	2025-04-02 10:26:48 +01:00
YunQiang Su	e25187bc3e	LLVM/Test: Add vectorizing testcases for fminimumnum and fmaximumnum (#133843 ) Vectorizing of fminimumnum and fmaximumnum is not supported yet. Let's add the testcases for it now, and we will update the testcases when we support it.	2025-04-02 08:46:02 +08:00
Ramkumar Ramachandra	3a66760d9b	[LV] Improve a test, regen with UTC (#130092 )	2025-04-01 14:11:20 +01:00
Florian Hahn	783a846507	[VPlan] Add VF as operand to VPScalarIVStepsRecipe. Similarly to other recipes, update VPScalarIVStepsRecipe to also take the runtime VF as argument. This removes some unnecessary runtime VF computations for scalable vectors. It will also allow dropping the UF == 1 restriction for narrowing interleave groups required in 577631f0a528.	2025-03-28 21:48:59 +00:00
Hari Limaye	bf5627c85e	[LV] Optimize VPWidenIntOrFpInductionRecipe for known TC (#118828 ) Optimize the IR generated for a VPWidenIntOrFpInductionRecipe to use the narrowest type necessary, when the trip-count of a loop is known to be constant and the only use of the recipe is the condition used by the vector loop's backedge branch.	2025-03-28 14:47:40 +00:00
Florian Hahn	5c26e80e57	[LV] Make cost model tests independent of VPValue numbers. Update tests to not rely on hard-coded VPValue numbers.	2025-03-27 21:15:32 +00:00
Florian Hahn	2c7d40b2f0	[VPlan] Generalize SCALAR-STEPS removal to any unroll factor. Follow-up to dfca6c0d3bf9d1a056 to extend isUnrolled to handle any unrolled VPlan, which means there's a single UF, but it will be > 1 if unrolling took place.	2025-03-26 21:03:50 +00:00
Florian Hahn	577631f0a5	Reapply "[VPlan] Add transformation to narrow interleave groups. (#106441 )" This reverts commit ff3e2ba9eb94217f3ad3525dc18b0c7b684e0abf. The recommitted version limits the transform to cases where no interleaving is taking place, to avoid a mis-compile when interleaving. Original commit message: This patch adds a new narrowInterleaveGroups transform, which tries to convert a plan with interleave groups with VF elements to a plan that instead replaces the interleave groups with wide loads and stores processing VF elements. This effectively is a very simple form of loop-aware SLP, where we use interleave groups to identify candidates. This initial version is quite restricted and hopefully serves as a starting point for how to best model those kinds of transforms. Depends on https://github.com/llvm/llvm-project/pull/106431. Fixes https://github.com/llvm/llvm-project/issues/82936. PR: https://github.com/llvm/llvm-project/pull/106441	2025-03-25 20:57:10 +00:00
Florian Hahn	dfca6c0d3b	[VPlan] Remove no-op SCALAR-STEPS after unrolling. (#123655 ) After unrolling, there may be additional simplifications that can be applied. One example is removing SCALAR-STEPS for the first part where only the first lane is demanded. This removes redundant adds of 0 from a large number of tests (~200), many of which I am still working on updating. In preparation for removing redundant WideIV steps added in https://github.com/llvm/llvm-project/pull/119284. PR: https://github.com/llvm/llvm-project/pull/123655	2025-03-25 12:57:24 +00:00
Martin Storsjö	ff3e2ba9eb	Revert "[VPlan] Add transformation to narrow interleave groups. (#106441 )" This reverts commit dfa665f19c52d98b8d833a8e9073427ba5641b19. This commit caused miscompilations in ffmpeg, see https://github.com/llvm/llvm-project/pull/106441 for details.	2025-03-23 23:27:39 +02:00
Florian Hahn	dfa665f19c	[VPlan] Add transformation to narrow interleave groups. (#106441 ) This patch adds a new narrowInterleaveGroups transform, which tries to convert a plan with interleave groups with VF elements to a plan that instead replaces the interleave groups with wide loads and stores processing VF elements. This effectively is a very simple form of loop-aware SLP, where we use interleave groups to identify candidates. This initial version is quite restricted and hopefully serves as a starting point for how to best model those kinds of transforms. Depends on https://github.com/llvm/llvm-project/pull/106431. Fixes https://github.com/llvm/llvm-project/issues/82936. PR: https://github.com/llvm/llvm-project/pull/106441	2025-03-22 21:40:17 +00:00
Florian Hahn	0d3ba087f7	[LV] Move IV bypass value creation out of ILV (NFC) createInductionAdditionalBypassValues is only used for epilogue vectorization now. Move it out of ILV, which means we do not have to thread through ExpandedSCEVs and also don't have to track the bypass values in ILV. Instead, directly create them if needed after executing the epilogue plan. This moves more of the epilogue-specific logic out of the generic executePlan.	2025-03-22 20:36:45 +00:00
Florian Hahn	870f753f1f	[VPlan] Also materialize broadcasts for backedge-taken-counts (NFC). Also include VPlan's BTC in the set of VPValues to materialize broadcasts for, if it is used.	2025-03-18 22:35:18 +00:00
Luke Lau	eef5ea0c42	[VPlan] Account for dead FOR splice simplification in cost model (#131486 ) Fixes #131359 After #129645, a first-order recurrence will no longer have its splice costed if the VPInstruction::FirstOrderRecurrenceSplice has no users and is dead. The legacy cost model didn't account for this, so this accounts for it in planContainsAdditionalSimplifications to avoid the "VPlan cost model and legacy cost model disagreed" assertion.	2025-03-18 00:00:54 +08:00
Florian Hahn	6a8d5f22ff	[VPlan] Don't access canonical IV in VPWidenPointerInduction::execute. This updates VPWidenPointerInductionRecipe::execute to not use the canonical IV to determine the insert point. Instead, it relies on the current recipe position. In cases where this is not sufficient, set the insert point to the first non-phi instruction, to ensure phis are created together.	2025-03-15 21:32:48 +00:00
Florian Hahn	aadfa9f6c8	[LV] Add additional tests for narrowing interleave groups. Extend test coverage for https://github.com/llvm/llvm-project/pull/106441.	2025-03-15 21:13:49 +00:00
Florian Hahn	62994c3291	[VPlan] Also introduce explicit broadcasts for values from entry VPBB. Update and generalize materializeBroadcasts to also introduce explicit broadcasts for VPValues defined in the Plan's entry block. This fixes a crash when trying to insert the broadcasts generated by VPTransformState::get after the generating instruction, which isn't possible after invoke instructions. Fixes https://github.com/llvm/llvm-project/issues/128838.	2025-03-12 22:03:19 +00:00
Florian Hahn	8132c4f554	[VPlan] Also introduce broadcasts for live-ins used in vec preheader. Slightly generalize materializeLiveInBroadcasts to also introduce broadcasts for live-ins used in the vector preheader. This should cover all live-ins. If the live-in is used in the vector preheader, insert the broadcast at the beginning of the block.	2025-03-11 21:19:14 +00:00
Florian Hahn	8dd160f476	Revert "[VPlan] Fold NOT into predicate of wide compares." (#130347 ) Reverts llvm/llvm-project#129430. This seems to have introduced a divergence between the legacy and VPlan-based cost models: https://lab.llvm.org/buildbot/#/builders/30/builds/17159	2025-03-07 21:18:49 +00:00
Florian Hahn	cb3ce30ca8	[VPlan] Fold NOT into predicate of wide compares. (#129430 ) Add simplification to fold negation into a compare, if the negation is the only user of the compare. This removes a number of redundant negations. Alive2 Proofs for FPCMP test changes: https://alive2.llvm.org/ce/z/WGDz9U PR: https://github.com/llvm/llvm-project/pull/129430	2025-03-07 20:32:43 +00:00
Ramkumar Ramachandra	ddffb74afd	[LV] Strip unreachable SCEV-check blocks (#130079 ) emitSCEVChecks checks if SCEVCheckCond matches zero, and returns nullptr. However, it sets SCEVCheckCond as used before it does this, which prevents it from being removed during cleanup, resulting in unreachable blocks being emitted. Fix this.	2025-03-06 19:30:25 +00:00
Florian Hahn	f937b17e85	[LV] Don't query SCEV for non-invariant values in cost model. This fixes a divergence between VPlan and legacy cost model, matching behavior further up in getInstructionCost as well. Fixes https://github.com/llvm/llvm-project/issues/129236.	2025-03-02 10:55:52 +00:00
Florian Hahn	1e1b9bccc0	[VPlan] Simplify BLEND %a, %b, NOT(%m) -> BLEND %b, %a, %m. (#128375 ) Avoid negations for normalized blends by reordering operands. PR: https://github.com/llvm/llvm-project/pull/128375	2025-02-27 17:43:24 +00:00
Florian Hahn	4277c21059	[VPlan] Introduce explicit broadcasts for live-ins. (#124644 ) Add a new VPInstruction::Broadcast opcode and use it to materialize explicit broadcasts of live-ins. The initial patch only materializes the broadcasts if the vector preheader dominates all uses that need it. Later patches will pick the best valid insert point, thus retiring implicit hoisting of broadcasts from VPTransformState::get(). PR: https://github.com/llvm/llvm-project/pull/124644	2025-02-26 13:57:51 +00:00
Elvis Wang	8009c1fd81	[LV][VPlan] Prevent calculating cost for skipped instructions in precomputeCosts(). (#127966 ) Skip calculating instruction costs for exit conditions in precomputeCosts() when they should be skipped. Reported from: https://github.com/llvm/llvm-project/issues/115744#issuecomment-2670479463 Godbolt for reduced test cases: https://godbolt.org/z/fr4YMeqcv	2025-02-25 11:09:09 +08:00
Florian Hahn	52ded67249	[LAA] Always require non-wrapping pointers for runtime checks. (#127543 ) Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). If the pointer wraps, then we swap start and end of the runtime check, leading to incorrect checks. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on https://github.com/llvm/llvm-project/pull/127410 to avoid more regressions. PR: https://github.com/llvm/llvm-project/pull/127543	2025-02-20 19:00:23 +01:00
Florian Hahn	04b5c63ddf	[LV] Add inbounds to interleave test. In preparation for https://github.com/llvm/llvm-project/pull/127543	2025-02-20 16:33:01 +01:00
Florian Hahn	e5f5517f91	[VPlan] Create IR basic block for middle.block in VPlan. Create an IR BB directly for the middle.block, instead of creating the IR BB during skeleton creation and then replacing the middle VPBB with a VPIRBB. This moves another part of skeleton creation to VPlan and simplifies the code slightly by removing code to disconnect the middle block and vector preheader + the corresponding DT update. NFC modulo IR block naming and block creation order, which changes the IR names for the blocks.	2025-02-15 21:54:16 +01:00
Florian Hahn	e258bca950	[VPlan] Only skip expansion for SCEVUnknown if it isn't an instruction. (#125235 ) Update getOrCreateVPValueForSCEVExpr to only skip expansion of SCEVUnknown if the underlying value isn't an instruction. Instructions may be defined in a loop and using them without expansion may break LCSSA form. SCEVExpander will take care of preserving LCSSA if needed. We could also try to pass LoopInfo, but there are some users of the function where it won't be available and the main benefit from skipping expansion is slightly more concise VPlans. Note that SCEVExpander is now used to expand SCEVUnknown with floats. Adjust the check in expandCodeFor to only check the types and casts if the type of the value is different to the requested type. Otherwise we crash when trying to expand a float and requesting a float type. Fixes https://github.com/llvm/llvm-project/issues/121518. PR: https://github.com/llvm/llvm-project/pull/125235	2025-02-11 13:03:12 +01:00
Simon Pilgrim	70906f0514	[LV][X86] Regenerate interleaved load/store costs. NFC. update_analyze_test_checks has improved the checks since these were last updated. Reduce noise diffs in future patches.	2025-02-09 15:02:41 +00:00
Florian Hahn	32c4493d5f	[VPlan] Add incoming values for all predecessors to ResumePHI (NFCI). Follow-up as discussed when using VPInstruction::ResumePhi for all resume values (#112147). This patch explicitly adds incoming values for each predecessor in VPlan. This simplifies codegen and allows transformations adjusting the predecessors of blocks with NFC modulo incoming block order in phis.	2025-02-09 11:20:20 +00:00
Florian Hahn	1611059f5d	[VPlan] Compute cost for binary op VPInstruction with underlying values. (#125434 ) As exposed by https://github.com/llvm/llvm-project/pull/125094, we are missing cost computation for some binary VPInstructions we created based on original IR instructions. Their cost should be considered. PR: https://github.com/llvm/llvm-project/pull/125434	2025-02-07 15:27:31 +00:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
David Sherwood	c836b8956d	[LoopVectorize][NFC] Disable output for tests that don't need it (#124747 ) There are a lot of tests that do not depend upon the IR output for validation, relying instead on the debug output. For these tests we can add the -disable-output command line argument.	2025-01-29 08:09:50 +00:00
Florian Hahn	713482fccf	[VPlan] Use State.get to extract lane mask for BranchOnMask. Simplifies the code slightly and avoids redundant extracts/broadcasts if the operand is live-in or already scalar.	2025-01-27 21:35:36 +00:00
David Sherwood	b7286dbef9	Reland "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop #96752 " (#123616 ) The last attempt failed a sanitiser build because we were creating a reference to a null Predicates pointer in isDereferenceableAndAlignedInLoop. This was exposed by the unit test IsDerefReadOnlyLoop in unittests/Analysis/LoadsTest.cpp. I fixed this by falling back on getConstantMaxBackedgeTakenCount if Predicates is null - see line 316 in llvm/lib/Analysis/Loads.cpp. There are no other changes.	2025-01-27 11:59:38 +00:00
Florian Hahn	2c87133c62	Reapply "[VPlan] Update final IV exit value via VPlan. (#112147 )" This reverts the revert commit 58326f1d5b5b379590af92dd129b2f3b3e96af46. The build failure in sanitizer stage2 builds has been fixed with 0d39fe6f5bb3edf0bddec09a8c6417377390aeac. Original commit message: Model updating IV users directly in VPlan, replace fixupIVUsers. Now simple extracts are created for all phis in the exit block during initial VPlan construction. A later VPlan transform (optimizeInductionExitUsers) replaces extracts of inductions with their pre-computed values if possible. This completes the transition towards modeling all live-outs directly in VPlan. There are a few follow-ups: * emit extracts initially also for resume phis, and optimize them together with IV exit users * support for VPlans with multiple exits in optimizeInductionExitUsers. Depends on https://github.com/llvm/llvm-project/pull/110004, https://github.com/llvm/llvm-project/pull/109975 and https://github.com/llvm/llvm-project/pull/112145.	2025-01-19 19:32:03 +00:00
Florian Hahn	58326f1d5b	Revert "[VPlan] Update final IV exit value via VPlan. (#112147 )" This reverts commit c2d15ac4d4432788557e77c15ce572ac655a8fec. Causes build failures on PPC stage2 & fuchsia bots https://lab.llvm.org/buildbot/#/builders/168/builds/7650 https://lab.llvm.org/buildbot/#/builders/11/builds/11248	2025-01-18 13:40:33 +00:00
Florian Hahn	c2d15ac4d4	[VPlan] Update final IV exit value via VPlan. (#112147 ) Model updating IV users directly in VPlan, replace fixupIVUsers. Now simple extracts are created for all phis in the exit block during initial VPlan construction. A later VPlan transform (optimizeInductionExitUsers) replaces extracts of inductions with their pre-computed values if possible. This completes the transition towards modeling all live-outs directly in VPlan. There are a few follow-ups: * emit extracts initially also for resume phis, and optimize them together with IV exit users * support for VPlans with multiple exits in optimizeInductionExitUsers. Depends on https://github.com/llvm/llvm-project/pull/110004, https://github.com/llvm/llvm-project/pull/109975 and https://github.com/llvm/llvm-project/pull/112145.	2025-01-18 13:22:34 +00:00
David Sherwood	a00938eedd	Revert "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752 )" (#123057 ) This reverts commit bfedf6460c2cad6e6f966b457d8d27084579dcd8.	2025-01-15 13:56:42 +00:00
David Sherwood	bfedf6460c	[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752 ) Currently when we encounter a negative step in the induction variable isDereferenceableAndAlignedInLoop bails out because the element size is signed greater than the step. This patch adds support for negative steps in cases where we detect the start address for the load is of the form base + offset. In this case the address decrements in each iteration so we need to calculate the access size differently. I have done this by calling getStartAndEndForAccess from LoopAccessAnalysis.cpp. The motivation for this patch comes from PR #88385 where a reviewer requested reusing isDereferenceableAndAlignedInLoop, but that PR itself does support reverse loops. The changed test in LoopVectorize/X86/load-deref-pred.ll now passes because previously we were calculating the total access size incorrectly, whereas now it is 412 bytes and fits perfectly into the alloca.	2025-01-15 12:47:43 +00:00
LiqinWeng	0294dab79e	[LV][VPlan] Add fast flags for selectRecipe (#121023 ) Change the inheritance of class VPWidenSelectRecipe to class VPRecipeWithIRFlags, which allows the select recipe to pass the fastmath flags. The patch in #119847 will add the fastmath flags for the recipe.	2025-01-15 10:10:11 +08:00
Florian Hahn	1de3dc7d23	[LV] Bail out early if BTC+1 wraps. Currently we fail to detect the case where BTC + 1 wraps, i.e. the vector trip count is 0. In those cases, the minimum iteration count check will fail, and the vector code will never be executed. Explicitly check for this condition in computeMaxVF and avoid trying to vectorize altogether. Note that a number of tests needed to be updated, because the vector loop would never be executed given the input IR. Fixes https://github.com/llvm/llvm-project/issues/122558.	2025-01-14 22:07:38 +00:00
Florian Hahn	8df64ed777	[LV] Don't consider IV increments uniform if exit value is used outside. In some cases, there might be a chain of uniform instructions producing the exit value. To generate correct code in all cases, consider the IV increment not uniform, if there are users outside the loop. Instead, let VPlan narrow the IV, if possible using the logic from 3ff1d01985752. Test case from #122602 verified with Alive2: https://alive2.llvm.org/ce/z/bA4EGj Fixes https://github.com/llvm/llvm-project/issues/122496. Fixes https://github.com/llvm/llvm-project/issues/122602.	2025-01-12 22:03:21 +00:00
Florian Hahn	3ff1d01985	Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67. Re-applies commit with typos fixed.	2025-01-12 20:10:28 +00:00
Florian Hahn	0ebb3ac7c9	Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac. Typo breaking the build	2025-01-12 19:37:45 +00:00
Florian Hahn	1afba19913	[VPlan] Try to narrow wide and replicating recipes to uniform recipes. Use the existing VPlan-based analysis to identify recipes that only have their first lane demanded and transform them to uniform replicate recipes. This simplifies the generated code in some places and prepares for fixing https://github.com/llvm/llvm-project/issues/122496.	2025-01-12 19:32:01 +00:00
Florian Hahn	44058e5b5f	[LV] Precommit tests for #106441 . Tests for https://github.com/llvm/llvm-project/pull/106441 from https://github.com/llvm/llvm-project/issues/82936.	2025-01-10 18:49:44 +00:00
Florian Hahn	b0697dc1de	[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994 ) Currently we emit early-exit related debug messages/remarks even when there is a single exit. Update to only check isVectorizableEarlyExitLoop if there isn't a single exit block. PR: https://github.com/llvm/llvm-project/pull/121994	2025-01-09 12:05:19 +00:00
Luke Lau	f0d5104c94	[VPlan] Handle some VPInstructions in may{Read,Write}FromMemory (#120058 ) This just copies the same conservative definition from mayWriteToMemory, and enables more VPInstructions to be hoisted out in LICM. I think this should give more accurate costs, and I was able to build llvm-test-suite without the legacy-vplan cost model assertion going off.	2025-01-08 15:17:26 +08:00

889 Commits