llvm-project

Author	SHA1	Message	Date
AZero13	ffd2633061	[InstCombine] Fold mul (shr exact (X, N)), 2^N + 1 -> add (X , shr exact (X, N)) (#112407 ) Alive2 Proofs: https://alive2.llvm.org/ce/z/aJnxyp https://alive2.llvm.org/ce/z/dyeGEv	2025-02-13 14:25:09 +08:00
vporpo	31cb807537	[SandboxVec][BottomUpVec] Fix diamond shuffle with multiple vector inputs (#126965 ) When the operand comes from multiple inputs then we need additional packing code. When the operands are scalar then we can use a single InsertElementInst. But when the operands are vectors then we need a chain of ExtractElementInst and InsertElementInst instructions to insert the vector value into the destination vector. This is what this patch implements.	2025-02-12 14:33:05 -08:00
vporpo	7a7f9190d0	[SandboxVec][Legality] Fix mask on diamond reuse with shuffle (#126963 ) This patch fixes a bug in the creation of shuffle masks when vectorizing vectors in case of a diamond reuse with shuffle. The mask needs to enumerate all elements of a vector, not treat the original vector value as a single element. That is: if vectorizing two <2 x float> vectors into a <4 x float> the mask needs to have 4 indices, not just 2.	2025-02-12 12:29:09 -08:00
vporpo	6d7a84d72b	[SandboxVec][Scheduler] Fix top of schedule (#126820 ) This patch fixes the way the top-of-schedule variable gets set and updated. Before this patch it used to get updated whenever we scheduled a bundle, which is wrong, as the top-of-schedule needs to be maintained across scheduling attempts. It should get reset only when we clear the schedule or when we destroy the current schedule and re-schedule.	2025-02-12 11:52:01 -08:00
Yingwei Zheng	324e27e8ba	[ValueTracking] Infer NonEqual from dominating conditions/assumptions (#117442 ) This patch adds context-sensitive analysis support for `isKnownNonEqual`. It is required for https://github.com/llvm/llvm-project/issues/117436.	2025-02-12 20:15:14 +08:00
Alexey Bataev	10844fb9b0	[SLP]Fix attempt to build the reorder mask for non-adjusted reuse mask When building the reorder for non-single use reuse mask, need to check if the size of the mask is multiple of the number of unique scalars. Otherwise, the compiler may crash when trying to reorder nodes. Fixes #126304	2025-02-11 13:41:25 -08:00
David Green	3ef5348a04	[AArch64] Add a phase-order test for dot patterns. NFC	2025-02-11 21:20:07 +00:00
Elvin Wang	71e623d878	[llvm] Avoid out-of-order evaluation in DebugInfo (#125116 ) This is an upstream proposal from `e60884cb98` We observed malfunctioning StripNonLineTableDebugInfo during debugging and it's caused by out-of-order evaluation, this is a C++ level semantic ambiguity issue, refer https://en.cppreference.com/w/cpp/language/eval_order Solution is simply separating one line into two.	2025-02-11 10:33:07 -08:00
Andreas Jonson	cf87eb9d9b	[ValueTracking] Handle trunc to i1 as condition in dominating condition. (#126414 ) proof: https://alive2.llvm.org/ce/z/gALGmv	2025-02-11 18:11:23 +01:00
David Sherwood	efc72347fd	[AArch64] Improve getPartialReductionCost for fixed-width VFs (#126538 ) NEON does not have a version of udot/sdot that accumulates into 64-bit integer values, so we should return Invalid from getPartialReductionCost for 64-bit types and fixed-width VFs. In theory, if the 64-bit versions of SVE udot/sdot are available we could use those, but we don't currently have lowering support for that.	2025-02-11 15:10:39 +00:00
Florian Hahn	e258bca950	[VPlan] Only skip expansion for SCEVUnknown if it isn't an instruction. (#125235 ) Update getOrCreateVPValueForSCEVExpr to only skip expansion of SCEVUnknown if the underlying value isn't an instruction. Instructions may be defined in a loop and using them without expansion may break LCSSA form. SCEVExpander will take care of preserving LCSSA if needed. We could also try to pass LoopInfo, but there are some users of the function where it won't be available and main benefit from skipping expansion is slightly more concise VPlans. Note that SCEVExpander is now used to expand SCEVUnknown with floats. Adjust the check in expandCodeFor to only check the types and casts if the type of the value is different to the requested type. Otherwise we crash when trying to expand a float and requesting a float type. Fixes https://github.com/llvm/llvm-project/issues/121518. PR: https://github.com/llvm/llvm-project/pull/125235	2025-02-11 13:03:12 +01:00
Shilei Tian	15412d9d83	[FIX] Add `REQUIRES: asserts` to `llvm/test/Transforms/StructurizeCFG/simple-structurizecfg-crash.ll`	2025-02-10 19:43:37 -05:00
Andreas Jonson	9e0077c921	[ValueTracking] Handle not in dominating condition. (#126423 ) General handling of not in dominating condition. proof: https://alive2.llvm.org/ce/z/FjJN8q	2025-02-10 18:14:09 +01:00
Florian Hahn	3706dfef66	[LV] Forget LCSSA phi with new pred before other SCEV invalidation. (#119897 ) `forgetLcssaPhiWithNewPredecessor` performs additional invalidation if there is an existing SCEV for the phi, but earlier `forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate the SCEV for the phi. Change the order to first call `forgetLcssaPhiWithNewPredecessor` to ensure it runs before its SCEV gets invalidated too eagerly. Fixes https://github.com/llvm/llvm-project/issues/119665. PR: https://github.com/llvm/llvm-project/pull/119897	2025-02-10 16:29:42 +00:00
Shilei Tian	71fcc825b4	[NFC][StructurizeCFG] Add a test that can crash StructurizeCFG pass (#126087 ) I tried to fix it in #124051 but failed to do so. This PR adds the test and marks it as xfail.	2025-02-10 11:12:54 -05:00
David Sherwood	0010a3c97e	[NFC][LoopVectorize] Add more partial reduction tests (#126525 ) * Adds variants of dotp (dotp_i8_to_i64_has_neon_dotprod, dotp_i16_to_i64_has_neon_dotprod) that show how the loop vectoriser has generated fixed-width partial reductions without any matching NEON udot instruction. * Adds loops that could also benefit from partial reductions once the work is done to recognise patterns such as %zext = zext i8 %load to i32 %acc.next = add i32 %acc, %zext See zext_add_reduc_i8_i32, etc. I intend to follow up with a patch to add support for vectorising such patterns.	2025-02-10 16:04:43 +00:00
Ramkumar Ramachandra	c6b13a2871	Revert "SCEV: teach isImpliedViaOperations about samesign" (#126506 ) The commit f5d24e6c is buggy, and following miscompiles have been reported: #126409 and https://github.com/llvm/llvm-project/pull/124270#issuecomment-2647222903 Revert it while we investigate.	2025-02-10 13:31:18 +00:00
Nikita Popov	2d31a12dbe	[DSE] Don't use initializes on byval argument (#126259 ) There are two ways we can fix this problem, depending on how the semantics of byval and initializes should interact: * Don't infer initializes on byval arguments. initializes on byval refers to the original caller memory (or having both attributes is made a verifier error). * Infer initializes on byval, but don't use it in DSE. initializes on byval refers to the callee copy. This matches the semantics of readonly on byval. This is slightly more powerful, for example, we could do a backend optimization where byval + initializes will allocate the full size of byval on the stack but not copy over the parts covered by initializes. I went with the second variant here, skipping byval + initializes in DSE (FunctionAttrs already doesn't propagate initializes past byval). I'm open to going in the other direction though. Fixes https://github.com/llvm/llvm-project/issues/126181.	2025-02-10 10:34:03 +01:00
Nikita Popov	7aed53eb19	[ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236 ) The code already guards against values coming from a previous iteration using properlyDominates(). However, addrecs are considered to properly dominate the loop they are defined in. Handle this special case separately, by checking for expressions that have computable loop evolution (this should cover cases like a zext of an addrec as well). I considered changing the definition of properlyDominates() instead, but decided against it. The current definition is useful in other context, e.g. when deciding whether an expression is safe to expand in a given block. Fixes https://github.com/llvm/llvm-project/issues/126012.	2025-02-10 10:07:21 +01:00
Ricardo Jesus	5f84b6edd9	[AArch64] Add MATCH loops to LoopIdiomVectorizePass (#101976 ) This patch adds a new loop to LoopIdiomVectorizePass, enabling it to recognise and vectorise loops such as: ```cpp template<class InputIt, class ForwardIt> InputIt find_first_of(InputIt first, InputIt last, ForwardIt s_first, ForwardIt s_last) { for (; first != last; ++first) for (ForwardIt it = s_first; it != s_last; ++it) if (*first == *it) return first; return last; } ``` These loops match the C++ standard library function `std::find_first_of`.	2025-02-10 08:23:34 +00:00
Elvis Wang	2e3729bf40	[LV] Prevent query the computeCost() when VF=1 in emitInvalidCostRemarks(). (#117288 ) We should only query the computeCost() when the VF is vector.	2025-02-10 08:40:28 +08:00
Hassnaa Hamdi	e9a20f77ee	Reland "[LV]: Teach LV to recursively (de)interleave." (#125094 ) This patch relands the changes from "[LV]: Teach LV to recursively (de)interleave.#122989" Reason for revert: - The patch exposed an assert in the vectorizer related to a VF difference between the legacy cost model and the VPlan-based cost model, caused by the uncalculated cost of a VPInstruction that VPlanTransforms creates as a replacement for an 'or disjoint' instruction. VPlanTransforms makes that instruction change when there are memory interleaving and predicated blocks, but the change did not cause problems before because in most cases the cost difference between the legacy and new models is not noticeable. - Issue is fixed by #125434 Original patch: https://github.com/llvm/llvm-project/pull/89018 Reviewed-by: paulwalker-arm, Mel-Chen	2025-02-09 19:21:54 +00:00
Andreas Jonson	3d140004c7	[ValueTracking] Test for not in dominating condition. (NFC)	2025-02-09 18:16:51 +01:00
Simon Pilgrim	70906f0514	[LV][X86] Regenerate interleaved load/store costs. NFC. update_analyze_test_checks has improved the checks since these were last updated. Reduce noise diffs in future patches.	2025-02-09 15:02:41 +00:00
Andreas Jonson	09a500b3db	[ValueTracking] more test of trunc to i1 as condition in dominating condition. (NFC)	2025-02-09 13:57:41 +01:00
Florian Hahn	32c4493d5f	[VPlan] Add incoming values for all predecessor to ResumePHI (NFCI). Follow-up as discussed when using VPInstruction::ResumePhi for all resume values (#112147). This patch explicitly adds incoming values for each predecessor in VPlan. This simplifies codegen and allows transformations adjusting the predecessors of blocks with NFC modulo incoming block order in phis.	2025-02-09 11:20:20 +00:00
Andreas Jonson	5ecc86bbca	[ValueTracking] test trunc to i1 as condition in dominating condition. (NFC)	2025-02-09 10:35:14 +01:00
Florian Hahn	9266b48c5b	[VPlan] Add outer loop tests with wide phis in inner loop. Add test coverage with phis outside a header block with multiple incoming values.	2025-02-08 18:09:45 +00:00
Vasileios Porpodas	7f2f905361	[SandboxVec] Fix: Add missing lit.local.cfg for target test	2025-02-08 09:06:00 -08:00
vporpo	69b8cf4f06	[SandboxVec][BottomUpVec] Add cost estimation and tr-accept-or-revert pass (#126325 ) The TransactionAcceptOrRevert pass is the final pass in the Sandbox Vectorizer's default pass pipeline. Its job is to check the cost before/after vectorization and accept or revert the IR to its original state. Since we are now starting the transaction in BottomUpVec, tests that run a custom pipeline need to accept the transaction. This is done with the help of the TransactionAlwaysAccept pass (tr-accept).	2025-02-08 08:34:18 -08:00
Florian Hahn	cea799afc6	[LV] Add ordered reduction test with live-in. Extra test for https://github.com/llvm/llvm-project/pull/124644.	2025-02-07 20:50:46 +00:00
Florian Hahn	1611059f5d	[VPlan] Compute cost for binary op VPInstruction with underlying values. (#125434 ) As exposed by https://github.com/llvm/llvm-project/pull/125094, we are missing cost computation for some binary VPInstructions we created based on original IR instructions. Their cost should be considered. PR: https://github.com/llvm/llvm-project/pull/125434	2025-02-07 15:27:31 +00:00
Ramkumar Ramachandra	52b59476cd	SCEV: re-org a test, regen via UTC (#126237 )	2025-02-07 13:19:34 +00:00
Nikita Popov	ae08969a20	[IndVars] Add test for #126012 (NFC)	2025-02-07 12:41:23 +01:00
David Sherwood	1930524bbd	[LoopVectorize] Fix cost model assert when vectorising calls (#125716 ) The legacy and vplan cost models did not agree because VPWidenCallRecipe::computeCost only calculates the cost of the call instruction, whereas LoopVectorizationCostModel::setVectorizedCallDecision in some cases adds on the cost of a synthesised mask argument. However, this mask is always 'splat(i1 true)' which should be hoisted out of the loop during codegen. In order to synchronise the two cost models I have two options: 1) Also add the cost of the splat to the vplan model, or 2) Remove the cost of the splat from the legacy model. I chose 2) because I feel this more closely represents what the final code will look like. There is an argument that we should take account of such broadcast costs in the preheader when deciding if it's profitable to vectorise a loop, however there isn't currently a mechanism to do this. We currently only take account of the runtime checks when assessing profitability and what the minimum trip count should be. However, I don't believe this work needs doing as part of this PR.	2025-02-07 09:36:52 +00:00
James Chesterman	ac158aa13b	[LoopVectorizer] Allow partial reductions to be made in predicated loops (#124268 ) Does a select on the input rather than the output. This way the mask has the same number of lanes as the other operand in the select instruction.	2025-02-07 09:09:10 +00:00
Kazu Hirata	b7feccb31d	[memprof] Dump call site matching information (#125130 ) MemProfiler.cpp annotates the IR with the memory profile so that we can later duplicate context. This patch dumps the entire inline call stack for each call site match.	2025-02-06 23:37:10 -08:00
Mel Chen	4d3148d926	[LV][EVL] Fix the check for legality of folding with EVL. (#125678 ) The current legality check for folding with EVL has incomplete verification for VF. This patch fixes the VF check, ensuring that tail folding with EVL is enabled only when a scalable VF is available. This allows loops that prefer tail folding with EVL but cannot use scalable VF vectorization to still be vectorized using a fixed VF, rather than abandoning vectorization entirely.	2025-02-07 12:53:10 +08:00
Yingwei Zheng	9cd83d6ea2	[InstCombine] Drop samesign in `foldLogOpOfMaskedICmps` (#125829 ) Alive2: https://alive2.llvm.org/ce/z/6zLAYp Note: We can also apply this fix to the logic below (`if (Mask & AMask_NotAllOnes)`), but it seems unreachable.	2025-02-07 11:56:52 +08:00
Joseph Huber	d9500f5032	[OpenMP] Fix the OpenMPOpt pass incorrectly optimizing if definition was missing Summary: This code is intended to block transformations if the call isn't present, however the way it's coded it silently lets it pass if the definition doesn't exist at all. This previously was always valid since we included the runtime as one giant blob so everything was always there, but now that we want to move towards separate ones, it's not quite correct.	2025-02-06 21:38:36 -06:00
vporpo	a0d86b23c0	[SandboxVec][Scheduler] Notify scheduler about instruction creation (#126141 ) This patch implements the vectorizer's callback for getting notified about new instructions being created. This updates the scheduler state, which may involve removing dependent instructions from the ready list and update the "scheduled" flag. Since we need to remove elements from the ready list, this patch also implements the `remove()` operation.	2025-02-06 15:45:44 -08:00
Mingming Liu	5399782508	[IR] Generalize Function's {set,get}SectionPrefix to GlobalObjects, the base class of {Function, GlobalVariable, IFunc} (#125757 ) This is a split of https://github.com/llvm/llvm-project/pull/125756	2025-02-06 14:51:13 -08:00
David Pagan	a5fc7c3ac1	[clang][OpenMP] New OpenMP 6.0 assumption clause, 'no_openmp_constructs' (#125933 ) Add initial parsing/sema support for new assumption clause so clause can be specified. For now, it's ignored, just like the others. Added support for 'no_openmp_constructs' to release notes. Testing - Updated appropriate LIT tests. - Testing: check-all	2025-02-06 12:41:10 -08:00
Lei Wang	068d0c0f4b	[CSSPGO] Turn on call-graph matching by default for CSSPGO (#125938 ) Tested call-graph matching on some of Meta's large services, it works to reuse some renamed function profiles, no negative perf or significant build speed regression observed. Turned it on by default for CSSPGO mode.	2025-02-06 11:54:59 -08:00
Krzysztof Drewniak	e41ffd3420	[NaryReassociate] Fix crash from pointer width / index width confusion (#125923 ) NaryReassociate would crash on expressions like the one in the added test that involved pointers where the size of the type was greater than the index width of the pointer, causing calls to SCEV's zext expression on types that didn't need to be zero-extended. This commit fixes the issue.	2025-02-06 12:48:52 -06:00
Ramkumar Ramachandra	f5d24e6cbe	SCEV: teach isImpliedViaOperations about samesign (#124270 ) Use CmpPredicate::getMatching in isImpliedCondBalancedTypes to pass samesign information to isImpliedViaOperations, and teach it to call CmpPredicate::getPreferredSignedPredicate, effectively making it optimize with samesign information.	2025-02-06 18:14:54 +00:00
Ramkumar Ramachandra	34624d89c0	IndVarSimplify: improve a test, stripping undef (#126069 )	2025-02-06 18:10:20 +00:00
Simon Pilgrim	eb2b453eb7	[VectorCombine] foldInsExtVectorToShuffle - ensure we call getShuffleCost with the input operand type, not the result Typo in #121216 Fixes #126085	2025-02-06 17:41:24 +00:00
Nikita Popov	2f7d3ec023	[InstCombine] Regenerate test checks The output changes in the meantime.	2025-02-06 17:06:11 +01:00
Teresa Johnson	b9e4bde804	[MemProf] Re-enable cloning of callsites in recursive cycles with fixes (#125947 ) This change addresses a number of issues with the support added by PR121985 which were exposed through more exhaustive testing, specifically places that needed updates to perform correct graph updates in the presence of cycles. A new test case is added that reproduces these issues, and the default is flipped back to enabling this handling.	2025-02-06 08:04:42 -08:00
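
Note on ffd2633061 (the InstCombine fold): the rewrite relies on the shift being exact, so that (X >> N) << N gives X back and (X >> N) * (2^N + 1) equals X + (X >> N), including under wrap-around. The sketch below checks this for the logical-shift case on 32-bit unsigned values only. The function names are hypothetical and this is not the InstCombine code; the Alive2 proofs linked in the commit remain the authoritative argument.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: mul (lshr exact X, N), 2^N + 1.
constexpr std::uint32_t mul_form(std::uint32_t x, unsigned n) {
    return (x >> n) * ((std::uint32_t{1} << n) + 1u);
}

// Right-hand side of the fold: add X, (lshr exact X, N).
constexpr std::uint32_t add_form(std::uint32_t x, unsigned n) {
    return x + (x >> n);
}

// "exact" means the shift discards no set bits, i.e. the low N bits of X are zero.
constexpr bool shift_is_exact(std::uint32_t x, unsigned n) {
    return (x & ((std::uint32_t{1} << n) - 1u)) == 0;
}

// 48 >> 4 == 3, and 3 * 17 == 48 + 3 == 51.
static_assert(shift_is_exact(48u, 4), "48 has its low 4 bits clear");
static_assert(mul_form(48u, 4) == add_form(48u, 4), "forms agree for an exact shift");
// The identity also holds when the arithmetic wraps modulo 2^32.
static_assert(mul_form(0xFFFFFFF0u, 4) == add_form(0xFFFFFFF0u, 4), "forms agree under wrap");
// Without exactness the two forms differ: 3 * 17 == 51, but 50 + 3 == 53.
static_assert(mul_form(50u, 4) != add_form(50u, 4), "inexact shift breaks the identity");

// Runtime version of the same check over every 16-bit input whose shift is exact.
// Assumes n is small enough that the shift stays within 32 bits.
void check_fold(unsigned n) {
    for (std::uint32_t x = 0; x <= 0xFFFFu; ++x)
        if (shift_is_exact(x, n))
            assert(mul_form(x, n) == add_form(x, n));
}
```

Calling `check_fold(4)` from a test driver exercises the runtime path; the `static_assert` lines are checked at compile time.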


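Note on 71e623d878 (out-of-order evaluation in DebugInfo): the commit message does not show the offending line, so the following is only a generic illustration of the class of problem it describes. When two calls with side effects appear as arguments of one call, C++ leaves their relative order unspecified; splitting the expression into two statements fixes the order. All names here are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical call with a side effect: appends a tag to the log and returns
// the position at which the tag was stored.
int record(std::string& log, char tag) {
    log.push_back(tag);
    return static_cast<int>(log.size()) - 1;
}

int combine(int a, int b) { return a * 10 + b; }

// One line: the two record() calls may run in either order, so the log can end
// up as "ab" or "ba" depending on the compiler.
int combined_one_line(std::string& log) {
    return combine(record(log, 'a'), record(log, 'b'));
}

// Two statements: 'a' is always recorded before 'b'.
int combined_two_statements(std::string& log) {
    const int first = record(log, 'a');
    const int second = record(log, 'b');
    return combine(first, second);
}

// The split version has one well-defined outcome.
void check_split_version() {
    std::string log;
    const int result = combined_two_statements(log);
    assert(log == "ab");
    assert(result == 1); // combine(0, 1)
}
```

Only the split version has a single defined result; the one-line version is valid C++ but may legitimately differ between compilers.
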
31092 Commits
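
Note on 7a7f9190d0 (shuffle mask on diamond reuse): the message says the mask must enumerate every element of the vectors being combined, so two <2 x float> inputs give a 4-index mask. A minimal sketch of that expansion follows, assuming all inputs have the same lane count. `per_element_mask` is a hypothetical helper, not part of the Sandbox Vectorizer API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expand a per-vector ordering into a per-element shuffle mask.
// vector_order[i] names which input vector supplies the i-th group of lanes;
// every input is assumed to have lanes_per_vector lanes.
std::vector<int> per_element_mask(const std::vector<int>& vector_order,
                                  int lanes_per_vector) {
    std::vector<int> mask;
    mask.reserve(vector_order.size() * static_cast<std::size_t>(lanes_per_vector));
    for (int vec : vector_order)
        for (int lane = 0; lane < lanes_per_vector; ++lane)
            mask.push_back(vec * lanes_per_vector + lane);
    return mask;
}

// Two <2 x float> inputs reused in swapped order need four indices, not two.
void check_mask_expansion() {
    const std::vector<int> swapped = per_element_mask({1, 0}, 2);
    assert((swapped == std::vector<int>{2, 3, 0, 1}));
    const std::vector<int> identity = per_element_mask({0, 1}, 2);
    assert((identity == std::vector<int>{0, 1, 2, 3}));
}
```

Treating each input vector as a single element would have produced the 2-index mask {1, 0}, which is the bug the commit describes.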