llvm-project

Author	SHA1	Message	Date
Sam Tebbs	795e35a653	Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744 ) This relands the reverted #120721 with a fix for cases where neither reduction operand is the reduction phi. Only 63114239cc8d26225a0ef9920baacfc7cc00fc58 and 63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted PR. --------- Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>	2025-01-13 11:20:35 +00:00
Mel Chen	56a37a3c76	[SLPVectorizer] Refactor HorizontalReduction::createOp (NFC) (#121549 ) This patch simplifies select-based integer min/max reductions by utilizing `llvm::getMinMaxReductionPredicate`, and generates intrinsic-based min/max reductions by utilizing `llvm::getMinMaxReductionIntrinsicOp`.	2025-01-13 16:11:31 +08:00
Florian Hahn	8df64ed777	[LV] Don't consider IV increments uniform if exit value is used outside. In some cases, there might be a chain of uniform instructions producing the exit value. To generate correct code in all cases, consider the IV increment not uniform, if there are users outside the loop. Instead, let VPlan narrow the IV, if possible using the logic from 3ff1d01985752. Test case from #122602 verified with Alive2: https://alive2.llvm.org/ce/z/bA4EGj Fixes https://github.com/llvm/llvm-project/issues/122496. Fixes https://github.com/llvm/llvm-project/issues/122602.	2025-01-12 22:03:21 +00:00
Florian Hahn	3ff1d01985	Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67. Re-applies commit with typos fixed.	2025-01-12 20:10:28 +00:00
Florian Hahn	0ebb3ac7c9	Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes." This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac. Typo breaking the build	2025-01-12 19:37:45 +00:00
Florian Hahn	1afba19913	[VPlan] Try to narrow wide and replicating recipes to uniform recipes. Use the existing VPlan-based analysis to identify recipes that only have their first lane demanded and transform them to uniform replicate recipes. This simplifies the generated code in some places and prepares for fixing https://github.com/llvm/llvm-project/issues/122496.	2025-01-12 19:32:01 +00:00
Kazu Hirata	43fdd6e81d	[memprof] Migrate away from PointerUnion::is (NFC) (#122622 ) Note that PointerUnion::is has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> In this patch, I'm calling call().getBase() for an instance of PointerUnion. call() alone would return an instance of IndexCall, which wraps PointerUnion. Note that isa<> cannot directly accept an instance of IndexCall, at least without defining CastInfo. I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.	2025-01-12 11:06:42 -08:00
Ruhung	4f7dc1b55a	[InstCombine] Fold (add (add A, 1), (sext (icmp ne A, 0))) to call umax(A, 1) (#122491 ) Transform (add (add A, 1), (sext (icmp ne A, 0))) into call umax(A, 1). Fixes #121853. Alive2: https://alive2.llvm.org/ce/z/TweTan	2025-01-12 16:51:58 +01:00
Florian Hahn	7f59b4e998	[VPlan] Skip non-induction phi recipes in legalizeAndOptimizeInductions. The body of the loop only applies to wide induction recipes, so skip any other header phi recipes up-front.	2025-01-11 20:33:02 +00:00
Mingjie Xu	876fa60f08	[TySan] Skip instrumentation for function declarations (#122488 ) Skip function declarations for instrumentation. Fixes https://github.com/llvm/llvm-project/issues/122467	2025-01-11 20:15:21 +08:00
Amr Hesham	642e493d4d	[InstCombine] Convert fshl(x, 0, y) to shl(x, and(y, BitWidth - 1)) when BitWidth is pow2 (#122362 ) Convert `fshl(x, 0, y)` to `shl(x, and(y, BitWidth - 1))` when BitWidth is pow2 Alive2 proof: https://alive2.llvm.org/ce/z/3oTEop Fixes: #122235	2025-01-11 11:48:05 +01:00
Ramkumar Ramachandra	f38c40bff3	VT: teach isImpliedCondMatchingOperands about samesign (#122474 ) Move isImplied{True,False}ByMatchingCmp from CmpInst to ICmpInst, so that it can operate on CmpPredicate instead of CmpInst::Predicate, and teach it about samesign. There are two callers of this function, and we choose to migrate the one in ValueTracking, namely isImpliedCondMatchingOperands to CmpPredicate, hence teaching it about samesign, with visible test impact.	2025-01-11 09:08:57 +00:00
Veera	2d5f07c828	[InstCombine] Fold `X udiv Y` to `X lshr cttz(Y)` if Y is a power of 2 (#121386 ) Fixes #115767 This PR folds `X udiv Y` to `X lshr cttz(Y)` if Y is a power of two since bitwise operations are faster than division. Proof: https://alive2.llvm.org/ce/z/qHmLta	2025-01-11 13:56:13 +08:00
Vitaly Buka	8af4d206e0	[NFCI][BoundsChecking] Apply nosanitize on local-bounds instrumentation (#122416 ) Should be NFCI as we run sanitizer, like msan, before local-bounds.	2025-01-10 18:11:19 -08:00
Vasileios Porpodas	25b90c4ef6	[SandboxVec][SeedCollector][NFC] Remove redundant 'else' and move the assertion within the 'if'	2025-01-10 14:54:44 -08:00
Noah Goldstein	0d9c027ad7	[InstCombine] Make `takeLog2` visible in all of InstCombine; NFC Also add `tryGetLog2` helper that encapsulates the common pattern: ``` if (takeLog2(..., /*DoFold=*/false)) { Value * Log2 = takeLog2(..., /*DoFold=*/true); ... } ``` Closes #122498	2025-01-10 16:21:35 -06:00
vporpo	9248428db7	[SandboxVec][DAG][NFC] Refactor setNextNode() and setPrevNode() (#122363 ) This patch updates DAG's `setNextNode()` and `setPrevNode()` to update both nodes of the link.	2025-01-10 13:32:33 -08:00
Han-Kuan Chen	35e76b6a4f	Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 )" This reverts commit f3d6cdc5aebafac3961d4fccbd2ca0e302c6082c.	2025-01-10 10:09:54 -08:00
Alexey Bataev	681c83a2f9	[SLP]Fix mask generation after cost estimation When estimating the cost of entries shuffles for buildvectors, need to rebuild original mask, not a generated submask, used for subregisters analysis. Fixes #122430	2025-01-10 09:32:35 -08:00
Alex MacLean	986f2ac48f	[SLPVectorizer] minor tweaks around lambdas for compatibility with older compilers (#122348 ) Older versions of MSVC do not have great lambda support and are not able to handle uses of class data or lambdas with implicit return types in some cases. These minor changes improve the source's compatibility with older MSVC and don't hurt readability either.	2025-01-10 09:18:28 -08:00
Alexey Bataev	3c9c94a24f	Revert "[SLP]Fix mask generation after cost estimation" This reverts commit 547ba9730bf05df3383150f730a689f2c8336206 to fix buildbots reported in https://lab.llvm.org/buildbot/#/builders/123/builds/11370, https://lab.llvm.org/buildbot/#/builders/133/builds/9492	2025-01-10 08:46:42 -08:00
Alexey Bataev	547ba9730b	[SLP]Fix mask generation after cost estimation When estimating the cost of entries shuffles for buildvectors, need to rebuild original mask, not a generated submask, used for subregisters analysis. Fixes #122430	2025-01-10 08:17:56 -08:00
Nikita Popov	c39500f88c	Revert "[GVN] MemorySSA for GVN: add optional `AllowMemorySSA`" This reverts commit eb63cd62a4a1907dbd58f12660efd8244e7d81e9. This changes the preservation behavior for MSSA when the new flag is not enabled.	2025-01-10 12:57:00 +01:00
Momchil Velikov	eb63cd62a4	[GVN] MemorySSA for GVN: add optional `AllowMemorySSA` Preparatory work to migrate from MemoryDependenceAnalysis towards MemorySSA in GVN. Co-authored-by: Antonio Frighetto <me@antoniofrighetto.com>	2025-01-10 10:43:12 +01:00
Mel Chen	e0f14e11c7	[SLPVectorizer] Refine the scope of RdxOpcode in HorizontalReduction::createOp (NFC) (#122239 ) This patch is one part of unifying IAnyOf and FAnyOf reduction. #118393 The related patch is #118777.	2025-01-10 16:01:36 +08:00
Han-Kuan Chen	f3d6cdc5ae	[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 ) Add TreeEntry::hasState. Add assert for getTreeEntry. Remove the OpValue parameter from the canReuseExtract function. Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.	2025-01-09 23:41:52 -08:00
Vitaly Buka	4c8fdc2954	[nfc][BoundsChecking] Rename BoundsCheckingOptions into Options (#122359 )	2025-01-09 20:38:13 -08:00
Vitaly Buka	9c2de994a1	[nfc][BoundsChecking] Refactor BoundsCheckingOptions (#122346 ) Remove ReportingMode and ReportingOpts.	2025-01-09 20:19:01 -08:00
Han-Kuan Chen	5454ac28b3	Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 )" This reverts commit 760f550de25792db83cd39c88ef57ab6d80a41a0.	2025-01-09 18:41:47 -08:00
Han-Kuan Chen	36b423e0f8	[SLP] NFC. Refactor getSameOpcode and reduce for loop iterations. (#122241 ) Replace Cnt and AltIndex with MainOp and AltOp. Reduce the number of iterations in the for loop.	2025-01-10 09:06:07 +08:00
Han-Kuan Chen	760f550de2	[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 ) Add TreeEntry::hasState. Add assert for getTreeEntry. Remove the OpValue parameter from the canReuseExtract function. Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.	2025-01-10 09:05:39 +08:00
Florian Hahn	7ffb691595	[VPlan] Remove dead ToRemove (NFC).	2025-01-09 22:02:32 +00:00
Thurston Dang	4f42e16516	[hwasan] Omit tag check for null pointers (#122206 ) If the pointer to be checked is statically known to be zero, the tag check will always pass since: 1) the tag is zero 2) shadow memory for address 0 is initialized to 0 and never updated. We can therefore elide the tag check. We perform the elision in two places: 1) the HWASan pass 2) when lowering the CHECK_MEMACCESS intrinsic. Conceivably, the HWASan pass may encounter a "cannot currently statically prove to be null" pointer (and is therefore unable to omit the intrinsic) that later optimization passes convert into a statically known-null pointer. As a last line of defense, we perform elision here too. This also updates the tests from https://github.com/llvm/llvm-project/pull/122186	2025-01-09 13:48:26 -08:00
Teresa Johnson	3055e86c71	[MemProf] Disable cloning of callsites in recursive cycles by default (#122354 ) This disables the support added in PR121985 by default while we investigate a compile time crash.	2025-01-09 12:01:43 -08:00
vporpo	6312beef78	[SandboxVec][BottomUpVec] Use SeedCollector and slice seeds (#120826 ) With this patch we switch from the temporary dummy seeds to actual seeds provided by the seed collector. The seeds get sliced and each slice is used as the starting point for vectorization.	2025-01-09 11:53:48 -08:00
Alexey Bataev	5ff36748cf	[SLP]Fix mask processing for reused gathered scalars Need to sync the mask between cost and actual emission to avoid bugs in mask calculation Fixes #122324	2025-01-09 11:24:48 -08:00
Florian Hahn	b0697dc1de	[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994 ) Currently we emit early-exit related debug messages/remarks even when there is a single exit. Update to only check isVectorizableEarlyExitLoop if there isn't a single exit block. PR: https://github.com/llvm/llvm-project/pull/121994	2025-01-09 12:05:19 +00:00
Nikita Popov	dcdf44aca7	[InstCombine] Remove foldSelectICmpEq() fold (#122098 ) This fold matches complex patterns, for which we have no proof of real-world relevance, and which does not actually handle the originally motivating cases from https://github.com/llvm/llvm-project/issues/71792 either. In https://github.com/llvm/llvm-project/pull/121708 and https://github.com/llvm/llvm-project/pull/121753 we have handled some simpler variants by extending existing folds. I propose to remove this code until we have evidence that it is useful for something.	2025-01-09 12:33:01 +01:00
Sergio Afonso	b79ed8729b	[OpenMP][OMPIRBuilder] Handle non-failing calls properly (#115863 ) The preprocessor definition used to enable asserts and the one that `llvm::Error` and `llvm::Expected` use to ensure all created instances are checked are not the same. By making these checks inside of an `assert` in cases where errors are not expected, certain build configurations would trigger runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_UNREACHABLE_OPTIMIZE=ON`). The `llvm::cantFail()` function, which was intended for this use case, is used by this patch in place of `assert` to prevent these runtime failures. In tests, new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and `EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release builds.	2025-01-09 10:28:16 +00:00
Benjamin Maxwell	f88ef1bd1b	[LV] Teach LoopVectorizationLegality about struct vector calls (#119221 ) This is a split-off from #109833 and only adds code relating to checking if a struct-returning call can be vectorized. This initial patch only allows the case where all users of the struct return are `extractvalue` operations that can be widened. ``` %call = tail call { float, float } @foo(float %in_val) %extract_a = extractvalue { float, float } %call, 0 %extract_b = extractvalue { float, float } %call, 1 ``` Note: The tests require the VFABI changes from #119000 to pass.	2025-01-09 09:27:29 +00:00
Nikita Popov	71f7b972c3	[Local] Make combineAAMetadata() more principled (#122091 ) This moves combineAAMetadata() into Local and implements it via a new AAOnly flag, which will intersect only AA metadata and keep other known metadata. The existing KnownIDs list is dropped, because it is redundant with the switch in combineMetadata(), which already drops unknown metadata. I tried a few variants of this, and ultimately went with the AAOnly flag because this way we make an explicit choice for each metadata kind supported by combineMetadata(), and ignoring the flag gives you conservatively correct behavior. I checked that the memcpy tests still pass if we adjust the logic for MD_memprof/MD_callsite to drop the metadata instead of arbitrarily picking one. Fixes https://github.com/llvm/llvm-project/issues/121495.	2025-01-09 09:34:46 +01:00
Yingwei Zheng	d80bdf7261	[IRBuilder] Add a helper function to intersect FMFs from two instructions (#122059 ) Address review comment in https://github.com/llvm/llvm-project/pull/121899#discussion_r1905765776	2025-01-09 14:36:42 +08:00
Yingwei Zheng	b8337dc4b2	[InstCombine] Handle commuted patterns in `foldBinOpShiftWithShift` (#122126 ) Closes https://github.com/llvm/llvm-project/issues/121775.	2025-01-09 14:36:17 +08:00
Akshat Oke	f6c76d5180	[PM] Remove is_analysis label for LoopSimplify (#121433 ) This reverts part of the changes in #118779	2025-01-09 10:11:14 +05:30
Alexey Bataev	5b76a2e51b	[SLP]Correctly calculate mask for the inserted vector	2025-01-08 15:18:06 -08:00
Alexey Bataev	0d921f96d4	[SLP][NFC]Introduce and use createInsertVector helper function, NFC	2025-01-08 14:26:13 -08:00
David Green	676c641718	[VectorCombine] Use getInstructionCost to cost Shuffle. (#122068 ) This allows it to produce a more accurate cost for the shuffle, using the more accurate calls to getShuffleCost in getInstructionCost. It helps fix some of the regressions from vector combine a little while ago, now that we have better subvector extract costs.	2025-01-08 20:48:40 +00:00
Andreas Jonson	d4182f1b56	[InstCombine] move foldAndOrOfICmpsOfAndWithPow2 into foldLogOpOfMaskedICmps (#121970 )	2025-01-08 18:04:38 +01:00
Alexey Bataev	1160994602	[SLP]Fix a crash for very long GEP chains Need to check if the GEP bases are equal and return false early. Also, need to return false if the lookup is too deep, considering bases equal too. Fixes a crash in the assertion.	2025-01-08 06:47:41 -08:00
Yingwei Zheng	03e7862962	[ValueTracking] Move `getFlippedStrictnessPredicateAndConstant` into ValueTracking. NFC. (#122064 ) Needed by https://github.com/llvm/llvm-project/pull/121958.	2025-01-08 20:02:49 +08:00
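
Three of the InstCombine entries above (4f7dc1b55a, 642e493d4d, 2d5f07c828) state small integer identities. The Alive2 links in those commits are the actual proofs; the following is only a brute-force sanity sketch over 8-bit values, written in Python rather than LLVM IR. The 8-bit width and every helper name (`sext_i1`, `fshl`, `cttz`, the `check_*` functions) are assumptions made for illustration and are not LLVM APIs.

```python
# Brute-force sanity check of three InstCombine folds on 8-bit values.
# All helper names are illustrative; none of them are LLVM APIs.

BITS = 8                  # assumed width; a power of two, as the fshl fold requires
MASK = (1 << BITS) - 1


def sext_i1(bit: bool) -> int:
    """Sign-extend an i1 to BITS bits: true -> all ones, false -> 0."""
    return MASK if bit else 0


def fshl(a: int, b: int, c: int) -> int:
    """Funnel shift left: high BITS bits of (a:b) << (c mod BITS)."""
    concat = (a << BITS) | b
    return ((concat << (c % BITS)) >> BITS) & MASK


def cttz(v: int) -> int:
    """Count trailing zeros of a nonzero value."""
    return (v & -v).bit_length() - 1


def check_add_sext_to_umax() -> bool:
    # (add (add A, 1), (sext (icmp ne A, 0)))  ==  umax(A, 1)
    return all(
        (((a + 1) + sext_i1(a != 0)) & MASK) == max(a, 1)
        for a in range(1 << BITS)
    )


def check_fshl_to_shl() -> bool:
    # fshl(x, 0, y)  ==  shl(x, and(y, BitWidth - 1)) when BitWidth is a power of two
    return all(
        fshl(x, 0, y) == ((x << (y & (BITS - 1))) & MASK)
        for x in range(1 << BITS)
        for y in range(1 << BITS)
    )


def check_udiv_to_lshr() -> bool:
    # X udiv Y  ==  X lshr cttz(Y) when Y is a power of two
    return all(
        x // (1 << k) == x >> cttz(1 << k)
        for x in range(1 << BITS)
        for k in range(BITS)
    )


if __name__ == "__main__":
    print(check_add_sext_to_umax(), check_fshl_to_shl(), check_udiv_to_lshr())
```

Exhaustive enumeration is only feasible because the width is tiny; it says nothing about poison, undef, or flag handling, which the Alive2 proofs do cover.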

38597 Commits