llvm-project

Author	SHA1	Message	Date
Florian Hahn	3026ecaff5	[LV] Also verify loops in vector loop removal tests. Also verify loop info in tests added in 7d6ec3b9680.	2024-12-31 20:11:23 +00:00
Florian Hahn	7d6ec3b968	[LV] Add more tests for vector loop removal. Add missing test coverage of loops where the vector loop region can be removed that include replicate recipes as well as nested loops. Extra test coverage for https://github.com/llvm/llvm-project/pull/108378.	2024-12-31 20:08:54 +00:00
Muhammad Omair Javaid	332d2647ff	Revert "[LV]: Teach LV to recursively (de)interleave. (#89018 )" This reverts commit ccfe0de0e1e37ed369c9bf89dd0188ba0afb2e9a. This breaks LLVM build on AArch64 SVE Linux buildbots https://lab.llvm.org/buildbot/#/builders/143/builds/4462 https://lab.llvm.org/buildbot/#/builders/17/builds/4902 https://lab.llvm.org/buildbot/#/builders/4/builds/4399 https://lab.llvm.org/buildbot/#/builders/41/builds/4299	2024-12-31 03:12:24 +05:00
Florian Hahn	b20b6e9ea9	[LV] Check IR generated for both interleaving and vectorizing in test. Currently the tests would in some cases only check the vectorized IR, but not the interleaved IR, if they are different.	2024-12-30 19:57:26 +00:00
Florian Hahn	c2be48a6ce	[LV] Add additional tests with induction users. Adds test coverage of post-inc IV users with different opcodes.	2024-12-30 17:36:48 +00:00
Florian Hahn	7f3428d3ed	[VPlan] Compute induction end values in VPlan. (#112145 ) Use createDerivedIV to compute IV end values directly in VPlan, instead of creating them up-front. This allows updating IV users outside the loop as follow-up. Depends on https://github.com/llvm/llvm-project/pull/110004 and https://github.com/llvm/llvm-project/pull/109975. PR: https://github.com/llvm/llvm-project/pull/112145	2024-12-29 19:05:08 +00:00
Zequan Wu	4d8f9594b2	Revert "Reland "[LoopVectorizer] Add support for partial reductions" (#120721 )" This reverts commit c858bf620c3ab2a4db53e84b9365b553c3ad1aa6 as it causes an optimization crash at -O2, see https://github.com/llvm/llvm-project/pull/120721#issuecomment-2563192057	2024-12-27 11:51:54 -08:00
Hassnaa Hamdi	ccfe0de0e1	[LV]: Teach LV to recursively (de)interleave. (#89018 ) Currently available intrinsics are only ld2/st2, which don't support interleaving factor > 2. This patch teaches the LV to use ld2/st2 recursively to support high interleaving factors.	2024-12-27 12:42:07 +00:00
Elvis Wang	47e1c87a61	[VPlan] Set debug location for VPReduction/VPWidenIntrinsicRecipe. (#120054 ) This patch adds the missing debug location for VPReduction/VPWidenIntrinsicRecipe.	2024-12-27 10:37:21 +08:00
Florian Hahn	2d038caeeb	[VPlan] Remove stray space when printing VPWidenCastRecipe. printFlags() already takes care of printing a single space if there are no flags. Remove the extra space when printing a recipe without flags.	2024-12-24 20:23:48 +00:00
Sam Tebbs	c858bf620c	Reland "[LoopVectorizer] Add support for partial reductions" (#120721 ) This re-lands the reverted #92418 When the VF is small enough so that dividing the VF by the scaling factor results in 1, the reduction phi execution thinks the VF is scalar and sets the reduction's output as a scalar value, tripping assertions expecting a vector value. The latest commit in this PR fixes that by using `State.VF` in the scalar check, rather than the divided VF. --------- Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>	2024-12-24 12:08:17 +00:00
Florian Hahn	db2307d2d7	[LV] Add tests with dereferenceable assumptions. Add a number of tests with dereferenceable assumptions and different alignment info.	2024-12-22 16:32:40 +00:00
LiqinWeng	86fa35ce7e	[LV][VPlan] Use opcode to retrieve the VPID of the CallRecipe, rather than underlying instruction (#120816 ) This patch may cause the flags in the CallRecipe to be lost after EVL transformation, and it has been addressed in the patch: #119847	2024-12-22 10:28:20 +08:00
Florian Hahn	bc23ef3feb	[LV] Add test showing incorrect debug location for scalar casts.	2024-12-21 22:30:36 +00:00
Florian Hahn	9b496deb90	[VPlan] Set and use debug location for VPPredInstPHIRecipe. Update the recipe so that it always sets its debug location and uses it during IR generation.	2024-12-21 21:57:47 +00:00
Florian Hahn	df8efbdbbf	[SCEV] Remove existing predicates implied by newly added ones. (#118185 ) When adding a new predicate to a union predicate, some of the existing predicates may be implied by the new predicate. Remove any existing predicates that are already implied by the new predicate. Depends on https://github.com/llvm/llvm-project/pull/118184 to show the main benefit. PR: https://github.com/llvm/llvm-project/pull/118185	2024-12-20 20:49:37 +00:00
David Sherwood	5845298f94	[LoopVectorize] Teach some X86 cost model tests to use new vplan costs (#120738 ) I've only fixed up the tests where I was able to use a simple sed script to replace the text. Even after this patch lands, there are still over 50 tests that need updating in X86/CostModel!	2024-12-20 15:08:08 +00:00
Florian Hahn	5f096fd221	Revert "[LoopVectorizer] Add support for partial reductions (#92418 )" This reverts commit 060d62b48aeb5080ffcae1dc56e41a06c6f56701. It looks like this is triggering an assertion when building llvm-test-suite on ARM64 macOS. Reproducer from MultiSource/Benchmarks/Ptrdist/bc/number.c target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-n32:64-S128-Fn32" target triple = "arm64-apple-macosx15.0.0" define void @test(i64 %idx.neg, i8 %0) #0 { entry: br label %while.body while.body: ; preds = %while.body, %entry %n1ptr.0.idx131 = phi i64 [ %n1ptr.0.add, %while.body ], [ %idx.neg, %entry ] %n2ptr.0.idx130 = phi i64 [ %n2ptr.0.add, %while.body ], [ 0, %entry ] %sum.1129 = phi i64 [ %add99, %while.body ], [ 0, %entry ] %n1ptr.0.add = add i64 %n1ptr.0.idx131, 1 %conv = sext i8 %0 to i64 %n2ptr.0.add = add i64 %n2ptr.0.idx130, 1 %1 = load i8, ptr null, align 1 %conv97 = sext i8 %1 to i64 %mul = mul i64 %conv97, %conv %add99 = add i64 %mul, %sum.1129 %cmp94 = icmp ugt i64 %n1ptr.0.idx131, 0 %cmp95 = icmp ne i64 %n2ptr.0.idx130, -1 %2 = and i1 %cmp94, %cmp95 br i1 %2, label %while.body, label %while.end.loopexit while.end.loopexit: ; preds = %while.body %add99.lcssa = phi i64 [ %add99, %while.body ] ret void } attributes #0 = { "target-cpu"="apple-m1" } > opt -p loop-vectorize Assertion failed: ((VF.isScalar() || V->getType()->isVectorTy()) && "scalar values must be stored as (0, 0)"), function set, file VPlan.h, line 284.	2024-12-19 21:46:51 +00:00
Nicholas Guy	060d62b48a	[LoopVectorizer] Add support for partial reductions (#92418 ) Following on from https://github.com/llvm/llvm-project/pull/94499, this patch adds support to the Loop Vectorizer to emit the partial reduction intrinsics where they may be beneficial for the target. --------- Co-authored-by: Samuel Tebbs <samuel.tebbs@arm.com>	2024-12-19 11:42:40 +00:00
David Sherwood	c18fda02e1	[LoopVectorize] Use new single string variant of reportVectorizationFailure (#120414 )	2024-12-19 10:07:13 +00:00
Alexander Kornienko	23a239267e	Revert "[InstCombine] Infer nuw for gep inbounds from base of object" (#120460 ) Reverts llvm/llvm-project#119225 due to the lack of sanitizer support, large potential of breaking code containing latent UB, non-trivial localization and investigation, and what seems to be a bad interaction with msan (a test is in the works). Related discussions: https://github.com/llvm/llvm-project/pull/119225#issuecomment-2551904822 https://github.com/llvm/llvm-project/pull/118472#issuecomment-2549986255	2024-12-18 19:06:34 +01:00
Florian Hahn	0e8d022ffe	[VPlan] Handle exit phis with multiple operands in addUsersInExitBlocks. (#120260 ) Currently the addUsersInExitBlocks incorrectly assumes exit phis only have a single operand, which may not be the case for loops with early exits when they share a common exit block. Also further relax the assertion in fixupIVUsers to allow exit values if they come from the loop latch/middle.block. PR: https://github.com/llvm/llvm-project/pull/120260	2024-12-18 14:47:16 +00:00
Florian Hahn	3e02038948	[LV] Fixup check lines after 13107cb09441.	2024-12-18 09:37:30 +00:00
David Sherwood	13107cb094	[LoopVectorize] Enable more early exit vectorisation tests (#117008 ) PR #112138 introduced initial support for dispatching to multiple exit blocks via split middle blocks. This patch fixes a few issues so that we can enable more tests to use the new enable-early-exit-vectorization flag. Fixes are: 1. The code to bail out for any loop live-out values happens too late. This is because collectUsersInExitBlocks ignores induction variables, which get dealt with in fixupIVUsers. I've moved the check much earlier in processLoop by looking for outside users of loop-defined values. 2. We shouldn't yet be interleaving when vectorising loops with uncountable early exits, since we've not added support for this yet. 3. Similarly, we also shouldn't be creating vector epilogues. 4. Similarly, we shouldn't enable tail-folding. 5. The existing implementation doesn't yet support loops that require scalar epilogues, although I plan to add that as part of PR #88385. 6. The new split middle blocks weren't being added to the parent loop.	2024-12-18 09:25:45 +00:00
Luke Lau	b1f4a0201a	[LV] Update failing test with middle block. NFC	2024-12-18 11:51:48 +08:00
Luke Lau	c2a879ecaa	[VPlan] Fix VPTypeAnalysis cache clobbering in EVL transform (#120252 ) When building SPEC CPU 2017 with RISC-V and EVL tail folding, this assertion in VPTypeAnalysis would trigger during the transformation to EVL recipes: `d8a0709b10/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (L135-L142)` It was caused by this recipe: ``` WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6> ``` Having its type inferred as i16, when ir<%add33> and ir<0> had inferred types of i32 somehow. The cause of this turned out to be because the VPTypeAnalysis cache was getting clobbered: In this transform we were erasing recipes but keeping around the same mapping from VPValue* to Type. In the meantime, new recipes would be created which would have the same address as the old value. They would then incorrectly get the old erased VPValue's cached type: ``` --- before --- 0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6> 0x600001ec5450: <badref> <- some value that was erased --- after --- 0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6> 0x600001ec5450: WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6> <- a new value that happens to have the same address ``` This fixes this by deferring the erasing of recipes till after the transformation. The test case might be a bit flakey since it just happens to have the right conditions to recreate this. I tried to add an assert in inferScalarType that every VPValue in the cache was valid, but couldn't find a way of telling if a VPValue had been erased. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-12-18 11:28:28 +08:00
Luke Lau	4a7f60d328	[VPlan] Handle VPWidenCastRecipe without underlying value in EVL transform (#120194 ) This fixes a crash that shows up when building SPEC CPU 2017 with EVL tail folding on RISC-V. A VPWidenCastRecipe doesn't always have an underlying value, and in the case of this crash this happens whenever a widened cast is created via truncateToMinimalBitwidths. Fix this by just using the opcode stored in the recipe itself. I think a similar issue exists with VPWidenIntrinsicRecipe and how it's widened, but I haven't run into any crashes with it just yet.	2024-12-18 11:28:07 +08:00
Florian Hahn	4ad0fdd163	[VPlan] Remove reverse() of predecessors from VPInstruction::generate. This was originally done to reduce the diff for the change. Remove it and update the remaining tests. NFC modulo reordering of incoming values. Clean up after https://github.com/llvm/llvm-project/pull/114292.	2024-12-17 20:44:32 +00:00
Nikita Popov	1157187496	[VPlan] Propagate all GEP flags (#119899 ) Store GEPNoWrapFlags instead of only InBounds and propagate them.	2024-12-17 13:48:50 +01:00
Florian Hahn	0e528ac404	[VPlan] Use start value operand for FindLastIV reduction phis. Update VPReductionPHIRecipe::execute to use the start value from the start value operand of the recipe. This is needed to make sure we resume from the correct value during epilogue vectorization. At the moment, the start value is set to the sentinel value in adjustRecipesForReductions, as the original start value needs to be used when creating ResumePhi recipes. Fixes a mis-compile introduced by b3cba9be41bfa8 in SPEC2017 on AArch64.	2024-12-16 23:29:49 +00:00
Florian Hahn	0f6d93f8d5	[LV] Add test showing bug in epilogue vectorization of selects. This is causing mis-compiles in SPEC2017 on AArch64 after b3cba9be41bfa8.	2024-12-16 23:23:38 +00:00
Florian Hahn	95e509a989	[VPlan] Add VPWidenInduction recipe as common base class (NFC). (#120008 ) This helps to simplify some existing code and new code (https://github.com/llvm/llvm-project/pull/112145) PR: https://github.com/llvm/llvm-project/pull/120008	2024-12-16 09:40:03 +00:00
Luke Lau	4746395bd7	[VPlan] Omit zero add in VPWidenIntOrFpInductionRecipe (#119668 ) I'm not sure if getStepVector was used for other things in the past where StartIdx was non-zero, but nowadays VPWidenIntOrFpInductionRecipe is the only user of it, and just passes zero to it. I presume InstCombine was already catching this so hopefully removing this won't affect codegen.	2024-12-16 11:55:48 +08:00
Florian Hahn	43045051d4	[VPlan] Modernize VPWidenIntOrFpInductionRecipe printing (NFC). Modernize VPWidenIntOrFpInductionRecipe printing by including the result VPValue and all operand VPValues, similar to VPScalarIVStepsRecipe and VPDerivedIVRecipe.	2024-12-15 20:46:52 +00:00
Florian Hahn	e64650d702	[VPlan] Get types and step from VPWidenPointerInductionRecipe (NFC). Use information directly from operands instead of going through IVDescriptor.	2024-12-15 18:52:10 +00:00
Florian Hahn	2067e604a4	[VPlan] Manage VPWidenPointerInduction debug location via recipe. Update VPWidenPointerInduction to manage its debug location via recipe. This makes sure we emit a proper debug location for VPWidenPointerInductionRecipes.	2024-12-15 14:41:07 +00:00
Florian Hahn	6c98f70b30	[LV] Add test with missing debug location for pointer IV in vector loop.	2024-12-15 14:36:03 +00:00
Florian Hahn	2564f1e199	[VPlan] Simplify Not(Not(A)) -> A. Follow-up simplification to 5fae408d3a4c073ee4.	2024-12-14 20:08:26 +00:00
Florian Hahn	d1dff1dc18	[LV] Remove hard-coded VPValue numbers in test check lines. (NFC) Make tests independent of VPlan value numbers.	2024-12-12 22:33:00 +00:00
Florian Hahn	6c8f41d336	[VPlan] Hook IR blocks into VPlan during skeleton creation (NFC) (#114292 ) As a first step to move towards modeling the full skeleton in VPlan, start by wrapping IR blocks created during legacy skeleton creation in VPIRBasicBlocks and hook them into the VPlan. This means the skeleton CFG is represented in VPlan, just before execute. This allows moving parts of skeleton creation into recipes in the VPBBs gradually. Note that this allows retiring some manual DT updates, as this will be handled automatically during VPlan execution. PR: https://github.com/llvm/llvm-project/pull/114292	2024-12-12 15:58:16 +00:00
Mel Chen	b3cba9be41	[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable (#67812 ) Consider the following loop: ``` int rdx = init; for (int i = 0; i < n; ++i) rdx = (a[i] > b[i]) ? i : rdx; ``` We can vectorize this loop if `i` is an increasing induction variable. The final reduced value will be the maximum `i` for which the condition `a[i] > b[i]` is satisfied, or the start value `init` if it is never satisfied. This patch adds new RecurKind enums - IFindLastIV and FFindLastIV. --------- Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>	2024-12-12 16:48:31 +08:00
Florian Hahn	4993a30365	[LV] Add missing REQUIRES: asserts to test using -debug. Fixup for test added in 5fae408d3a4c07.	2024-12-11 21:39:45 +00:00
Florian Hahn	5fae408d3a	[VPlan] Dispatch to multiple exit blocks via middle blocks. (#112138 ) A more lightweight variant of https://github.com/llvm/llvm-project/pull/109193, which dispatches to multiple exit blocks via the middle blocks. The patch also introduces a bit of required scaffolding to enable early-exit vectorization, including an option. At the moment, early-exit vectorization doesn't come with legality checks, and is only used if the option is provided and the loop has metadata forcing vectorization. This is only intended to be used for testing during bring-up, with @david-arm enabling auto early-exit vectorization plugging in the changes from https://github.com/llvm/llvm-project/pull/88385. PR: https://github.com/llvm/llvm-project/pull/112138	2024-12-11 21:11:05 +00:00
LiqinWeng	77b6910b27	[Test] Fix the failed test of #108351 (#119495 )	2024-12-11 11:43:25 +08:00
LiqinWeng	b759020cc8	[LV][EVL] Support cast instruction with EVL-vectorization (#108351 )	2024-12-11 10:01:41 +08:00
Mel Chen	f4081711f0	[LV][NFC] Add test cases for FindLastIV reduction idiom. (#118519 ) Pre-commit for #67812	2024-12-10 20:09:18 +08:00
Nikita Popov	e21ab4d16b	[InstCombine] Infer nuw for gep inbounds from base of object (#119225 ) When we have a gep inbounds from the base of an object (e.g. alloca or global), we know that the index cannot be negative, as this would go out of bounds. As such, we can infer nuw as well. The implementation is a bit stricter than necessary, we could also accept one unknown index followed by known-non-negative indices. Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently incorrectly doesn't require the inbounds for the alloca case, see https://github.com/AliveToolkit/alive2/issues/1138).	2024-12-10 10:00:50 +01:00
Florian Hahn	0e70289f37	[VPlan] Create canonical IV resume value for epilogue in VPlan. (NFCI) Update the code to create induction resume PHIs to also create a resume phi for the canonical induction during epilogue vectorization. This unifies the code for handling induction resume values and removes the need to explicitly create manually resume PHI and return it during epilogue creation. Overall it helps to move the code for updating the canonical induction resume value to the place where all other header phi resume values are updated. This is NFC, modulo order of the created phis.	2024-12-09 23:11:38 +00:00
Igor Kirillov	337936a83b	[LV] Ignore some costs when loop gets fully unrolled (#106699 ) When VF has a fixed width and equals the number of iterations, and we are not tail folding by masking, comparison instruction and induction operation will be DCEed later. Ignoring the costs of these instructions improves the cost model.	2024-12-09 18:17:52 +00:00
Nikita Popov	10f315dc9c	[ConstantFolding] Infer getelementptr nuw flag (#119214 ) Infer nuw from nusw and nneg. This is the constant expression variant of https://github.com/llvm/llvm-project/pull/111144. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-09 16:44:05 +01:00
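
The select-cmp reduction described in b3cba9be41 (and the sentinel start value mentioned in 0e528ac404) can be illustrated with a small scalar emulation. This is only a sketch under stated assumptions, not the vectorizer's implementation: the function names `findLastScalar` and `findLastLanes`, the fixed `VF = 4`, and the choice of `INT_MIN` as the sentinel are all hypothetical and chosen for illustration. It relies on `i` being non-negative and increasing, so each lane ends up holding the last matching index it saw and the maximum across lanes is the overall last match.

```cpp
#include <algorithm>
#include <array>
#include <limits>
#include <vector>

// Scalar reference: the loop from the commit message.
int findLastScalar(const std::vector<int> &a, const std::vector<int> &b,
                   int init) {
  int rdx = init;
  for (int i = 0; i < static_cast<int>(a.size()); ++i)
    rdx = (a[i] > b[i]) ? i : rdx;
  return rdx;
}

// Lane-wise emulation of a vectorized version (hypothetical VF = 4).
// Each lane starts at a sentinel that no valid index can equal.
int findLastLanes(const std::vector<int> &a, const std::vector<int> &b,
                  int init) {
  constexpr int VF = 4;                                     // assumed width
  constexpr int Sentinel = std::numeric_limits<int>::min(); // assumed sentinel
  std::array<int, VF> lanes;
  lanes.fill(Sentinel);

  const int n = static_cast<int>(a.size());
  int i = 0;
  // "Vector" body: every lane keeps the latest index where the compare held.
  for (; i + VF <= n; i += VF)
    for (int l = 0; l < VF; ++l)
      if (a[i + l] > b[i + l])
        lanes[l] = i + l;

  // Horizontal reduction: the largest index wins because i only increases.
  const int best = *std::max_element(lanes.begin(), lanes.end());
  // If no lane matched, fall back to the original start value.
  int rdx = (best == Sentinel) ? init : best;

  // Scalar remainder loop for the last n % VF iterations.
  for (; i < n; ++i)
    rdx = (a[i] > b[i]) ? i : rdx;
  return rdx;
}
```

Both functions should agree for any pair of equal-length inputs; the lane version only differs in that the start value is substituted after the final max-reduction rather than carried through the loop.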


2820 Commits