llvm-project

Author	SHA1	Message	Date
Florian Hahn	4fc190351e	[VPlan] Remove unneeded NeedsVectorIV from VPWidenIntOrFpInduction. After recent improvements, all instances of VPWidenIntOrFpInductionRecipe should need a vector IV and there's no need for a separate field.	2023-04-17 13:38:00 +01:00
Bjorn Pettersson	3e38187662	Revert "[Passes] Remove legacy PM versions of InstructionNamer and MetaRenamer" This reverts commit 981ec1faeb508a364cc47c8246b72fc89dd8c1d8. It broke polly build bots. Polly still uses -instnamer with legacy PM.	2023-04-17 14:24:50 +02:00
Nikita Popov	6f7e5c0f1a	Reapply [SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating This exposed a miscompile in GVN, which was fixed by D148129. ----- After D141386, violation of nonnull, range and align metadata results in poison rather than immediate undefined behavior, which means that these are now safe to retain when speculating. We only need to remove UB-implying metadata like noundef. This is done by adding a dropUBImplyingAttrsAndMetadata() helper, which lists the metadata which is known safe to retain on speculation. Differential Revision: https://reviews.llvm.org/D146629	2023-04-17 14:15:14 +02:00
Bjorn Pettersson	981ec1faeb	[Passes] Remove legacy PM versions of InstructionNamer and MetaRenamer	2023-04-17 13:54:20 +02:00
Bjorn Pettersson	21a6890856	[Vectorize] Clean up Transforms/Vectorize.h Removed definitions of vectorizeBasicBlock and VectorizeConfig (possibly a remnant from the BBVectorize pass that was removed way back in 2017). Also reduced amount of include dependencies to Transforms/Vectorize.h.	2023-04-17 13:54:19 +02:00
Bjorn Pettersson	a20f7efbc5	Remove several no longer needed includes. NFCI Mostly removing includes of InitializePasses.h and Pass.h in passes that no longer have support for the legacy PM.	2023-04-17 13:54:19 +02:00
Florian Hahn	02369b75fd	[VPlan] Mark recurrence recipes as not having side-effects. Add support for FirstOrderRecurrenceSplice and VPFirstOrderRecurrencePHI recipes to mayHaveSideEffects. They both don't have side-effects.	2023-04-17 12:30:52 +01:00
Nikita Popov	8cdca96690	[GVN] Adjust metadata for coerced load CSE When reusing a load in a way that requires coercion (i.e. casts or bit extraction) we currently fail to adjust metadata. Unfortunately, none of our existing tooling for this is really suitable, because combineMetadataForCSE() expects both loads to have the same type. In this case we may work on loads of different types and possibly offset memory location. As such, what this patch does is to simply drop all metadata, with the following exceptions: * Metadata for which violation is known to always cause UB. * If the load is !noundef, keep all metadata, as this will turn poison-generating metadata into UB as well. This fixes the miscompile that was exposed by D146629. Differential Revision: https://reviews.llvm.org/D148129	2023-04-17 12:52:31 +02:00
David Sherwood	69ee653313	[LoopVectorize] Take vscale into account when deciding to create epilogues In LoopVectorizationCostModel::isEpilogueVectorizationProfitable we check to see if the chosen main vector loop VF >= 16. If so, we decide to create a vector epilogue loop. However, this doesn't take VScaleForTuning into account because we could be targeting a CPU where vscale > 1, and hence the runtime VF would be a multiple of the known minimum value. This patch multiplies scalable VFs by VScaleForTuning and several tests have been updated that now produce vector epilogues. Differential Revision: https://reviews.llvm.org/D147522	2023-04-17 10:49:40 +00:00
Florian Hahn	83ab5708d1	[LV] Don't sink scalar instructions that may read from memory. The current sinking code doesn't prevent us from sinking a load past an aliasing store. Skip sinking instructions that may read from memory to avoid a mis-compile. See @minimal_bit_widths_with_aliasing_store for an example where 2 loads are sunk past aliasing stores before this fix. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147259	2023-04-17 09:30:25 +01:00
Zain Jaffal	721ecc9d41	[ConstraintElimination] Transfer info from sgt %a, %b to ugt %a, %b if %b > 0 Differential Revision: https://reviews.llvm.org/D148326	2023-04-17 09:27:33 +01:00
Kazu Hirata	7b014a0732	[Scalar] Use range-based for loops (NFC)	2023-04-16 09:05:20 -07:00
Kazu Hirata	c83c4b58d1	[Transforms] Apply fixes from performance-for-range-copy (NFC)	2023-04-16 08:25:28 -07:00
Florian Hahn	668045eb77	[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI). Before this patch, a VPlan contained 2 mappings for Values -> VPValue: 1) Value2VPValue and 2) VPExternalDefs. This duplication is unnecessary and there are already cases where external defs are added to Value2VPValue. This patch replaces all uses of VPExternalDefs with Value2VPValue. It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue) and addVPValue (to addExternalVPValue). At the moment, this is NFC, but will enable additional simplifications in D147783. Depends on D147891. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147892	2023-04-16 15:38:31 +01:00
DianQK	2832d7941f	[SROA] Remove UB-implying metadata when promoting speculative instruction. After D138238 introduced the then/else blocks, we should remove UB-implying metadata for the promoted speculative instruction. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148456	2023-04-16 22:35:52 +08:00
Florian Hahn	2db031528e	[VPlan] Check VPValue step in isCanonical (NFCI). Update the isCanonical() implementations to check the VPValue step operand instead of the step in the induction descriptor. At the moment this is NFC, but it enables further optimizations if the step is replaced by a constant in D147783. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147891	2023-04-16 14:48:03 +01:00
Kazu Hirata	4bac5f8344	Apply fixes from performance-faster-string-find (NFC)	2023-04-16 00:51:27 -07:00
Kazu Hirata	1ca496bd61	Remove redundant initialization of std::optional (NFC)	2023-04-16 00:40:05 -07:00
Kazu Hirata	804467de94	Use isNegative (NFC)	2023-04-15 14:26:24 -07:00
Kazu Hirata	d775fc390d	[InstCombine] Generate better code for std::bit_floor from libstdc++ Without this patch, std::bit_floor<uint32_t> in libstdc++ is compiled as: %eq0 = icmp eq i32 %x, 0 %lshr = lshr i32 %x, 1 %ctlz = tail call i32 @llvm.ctlz.i32(i32 %lshr, i1 false) %sub = sub i32 32, %ctlz %shl = shl i32 1, %sub %sel = select i1 %eq0, i32 0, i32 %shl With this patch: %eq0 = icmp eq i32 %x, 0 %ctlz = call i32 @llvm.ctlz.i32(i32 %x, i1 false) %lshr = lshr i32 -2147483648, %ctlz %sel = select i1 %eq0, i32 0, i32 %lshr This patch recognizes the specific pattern emitted for std::bit_floor in libstdc++. https://alive2.llvm.org/ce/z/piMdFX This patch fixes: https://github.com/llvm/llvm-project/issues/61183 Differential Revision: https://reviews.llvm.org/D145890	2023-04-15 11:32:33 -07:00
Vasileios Porpodas	7e67a9473d	[SLP][NFC] Remove Limit from tryToVectorizeSequence() arguments. Limit turns out to be implemented in the exact same way for all calls to tryToVectorizeSequence(). So this patch removes it and implements it internally as a lambda function. Differential Revision: https://reviews.llvm.org/D148382	2023-04-14 14:58:57 -07:00
Noah Goldstein	82f0827613	[InstCombine] Make `FoldOpIntoSelect` handle non-constants and use condition to deduce constants. Make the fold use the information present in the condition for deducing constants i.e: ``` %c = icmp eq i8 %x, 10 %s = select i1 %c, i8 3, i8 2 %r = mul i8 %x, %s ``` If we fold the `mul` into the select, on the true side we insert `10` for `%x` in the `mul`. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D146349	2023-04-14 13:14:32 -05:00
Bjorn Pettersson	0b911a3dc3	[passes] Remove the legacy PM version of IRCE Differential Revision: https://reviews.llvm.org/D148338	2023-04-14 18:56:20 +02:00
Bjorn Pettersson	b74e89c0d4	[passes] Remove the legacy PM version of AlignmentFromAssumptions Differential Revision: https://reviews.llvm.org/D148337	2023-04-14 18:56:20 +02:00
Bjorn Pettersson	40c60c025c	[Passes] Remove the legacy DemandedBitsWrapperPass Last user of DemandedBitsWrapperPass was the BDCE pass. Since the legacy PM version of BDCE was removed in an earlier commit, this patch removes the now unused DemandedBitsWrapperPass. Differential Revision: https://reviews.llvm.org/D148336	2023-04-14 18:56:20 +02:00
Bjorn Pettersson	fb93f98ffa	[Passes] Remove legacy PM version of BDCE (aka BitTrackingDCEPass) BDCE is not used by the codegen pipeline so we should not need the legacy PM version of the pass any longer. Differential Revision: https://reviews.llvm.org/D148335	2023-04-14 18:56:20 +02:00
Joseph Huber	46ee1021d9	[OpenMP] Replace HeapToShared's initial value with `poison` There's a desire to move away from `undef` in LLVM. Currently we want to have the `addressspace(3)` variables use `poison` instead. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D147719	2023-04-14 09:39:32 -05:00
Florian Hahn	98e50881e9	[Matrix] Refine cost estimate for dot-product. Adjust lowerDotProduct cost estimate to include the cost benefits of: * emitting a wide load * emitting a wide multiply. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D147330	2023-04-14 11:35:01 +01:00
Nikita Popov	62ef97e063	[llvm-c] Remove PassRegistry and initialization APIs Remove C APIs for interacting with PassRegistry and pass initialization. These are legacy PM concepts, and are no longer relevant for the new pass manager. Calls to these initialization functions can simply be dropped. Differential Revision: https://reviews.llvm.org/D145043	2023-04-14 12:12:48 +02:00
Nikita Popov	9fe78db4cd	[FunctionAttrs] Fix nounwind inference for landingpads Currently, FunctionAttrs treats landingpads as non-throwing, and will infer nounwind for functions with landingpads (assuming they can't unwind in some other way, e.g. via resume). There are two problems with this: * Non-cleanup landingpads with catch/filter clauses do not necessarily catch all exceptions. Unless there are catch ptr null or filter [0 x ptr] zeroinitializer clauses, we should assume that we may unwind past this landingpad. This seems like an outright bug. * Cleanup landingpads are skipped during phase one unwinding, so we effectively need to support unwinding past them. Marking these nounwind is technically correct, but not compatible with how unwinding works in reality. Fixes https://github.com/llvm/llvm-project/issues/61945. Differential Revision: https://reviews.llvm.org/D147694	2023-04-14 11:46:00 +02:00
Nikita Popov	a759745169	[InstCombine] Support multiple comparisons in foldAllocaCmp() foldAllocaCmp() needs to fold all comparisons of an alloca at the same time, to ensure that there is a consistent view of the alloca address. Currently, it folds "all" comparisons by limiting to the case where there is only one. This patch switches the algorithm to instead actually collect and fold all comparisons. Something we need to be careful about here is that there may be comparisons where both sides of the icmp are based on the alloca. Such comparisons are comparing offsets of the alloca, and as such can be ignored here, but shouldn't be folded to false. Differential Revision: https://reviews.llvm.org/D144492	2023-04-14 11:32:58 +02:00
Nikita Popov	e4251fc6bb	[LangRef][Local] dereferenceable metadata violation is UB I believe !dereferenceable violation is immediate undefined behavior, but this was not explicitly spelled out in LangRef. We already assume that !dereferenceable is implicitly !noundef and cannot return poison in isGuaranteedNotToBeUndefOrPoison(). The reason why we made dereferenceable implicitly noundef is that the purpose of this metadata is to allow speculation, and that would not be legal on a potential poison pointer. Differential Revision: https://reviews.llvm.org/D148202	2023-04-14 10:54:01 +02:00
Nikita Popov	c508e93327	[InstSimplify] Remove unused ORE argument (NFC)	2023-04-14 10:38:32 +02:00
Nikita Popov	243e62b9d8	[Coroutines] Directly remove unnecessary lifetime intrinsics The insertSpills() code will currently skip lifetime intrinsic users when replacing the alloca with a frame reference. Rather than leaving behind the dead lifetime intrinsics working on the old alloca, directly remove them. This makes sure the alloca can be dropped as well. I noticed this as a regression when converting tests to opaque pointers. Without opaque pointers, this code didn't really do anything, because there would usually be a bitcast in between. The lifetimes would get rewritten to the frame pointer. With opaque pointers, this code now triggers and leaves behind users of the old allocas. Differential Revision: https://reviews.llvm.org/D148240	2023-04-14 10:22:30 +02:00
Max Kazantsev	a39b807d41	[IRCE][NFC] Refactor parseRangeCheckICmp to compute SCEVs instead of Values The motivation is to make an opportunity to compute and return expressions after parsing ICmp into a range check (e.g. Length + 1). Patch by Aleksandr Popov! Differential Revision: https://reviews.llvm.org/D148205	2023-04-14 12:58:51 +07:00
Florian Hahn	7fc0b3049d	[VPlan] Switch to checking sinking legality for recurrences in VPlan. Building on D142885 and D142589, retire the SinkAfter map from the recurrence handling code. It is replaced by checking whether it is possible to sink all users of a recurrence directly in VPlan. This results in simpler code overall and allows handling additional cases (see the improvements in @test_crash). Depends on D142885. Depends on D142589. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142886	2023-04-13 22:00:52 +01:00
Craig Topper	8bba57b1f1	[LoopIdiomRecognize] Remove NUW flag from SCEV in getTripCount. Based on the conversation in D147355. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148170	2023-04-13 11:58:10 -07:00
Alexey Bataev	a4eff2b56c	[SLP][NFC]Remove extra semicolons after function definitions, NFC	2023-04-13 11:33:25 -07:00
Florian Hahn	e6ab86a887	[Matrix] Fix IsSupported check in lowerDotProduct. The check incorrectly checks the RHS while LHS is transformed later. Update to check LHS, which fixes a crash in the newly added test cases.	2023-04-13 19:00:30 +01:00
Alexey Bataev	f82eb7e066	[SLP]Introduce gather cost estimation function. Introduced BoUpSLP::ShuffleCostEstimator::gather function as an initial implementation of the gather/buildvector cost estimation for buildvector nodes. It will allow using the general codegen infrastructure for better cost estimation, and it improves the cost estimation for the gathers/buildvectors. Improved part of D110978. Differential Revision: https://reviews.llvm.org/D148174	2023-04-13 10:16:00 -07:00
Simon Pilgrim	b3480d5ede	[SLP] Compute min/max scalar reduction costs using min/max intrinsics instead of expanded cmp+sel By default these will expand back to cmp/sel, but some targets (X86) have optimized costs for scalar integer min/max patterns which are lower than the default expansion (pre-SSE41 is particularly weak for vector min/max support). Differential Revision: [SLP] Compute min/max scalar reduction costs using min/max intrinsics instead of expanded cmp+sel	2023-04-13 17:00:39 +01:00
Simon Pilgrim	aa754f7e0f	[IR] llvm::createMinMaxOp - create integer min/max intrinsics instead of icmp/sel Based off D148215, when expanding a min/max reduction we should be creating min/max intrinsics directly instead of relying on instcombine to fold them back together. This patch handles integer min/max cases. Hopefully we can add floating point support soon (at least for fastmath/nnan cases) - but we're missing some of the plumbing to pass the correct FMF to the intrinsic at the moment. Differential Revision: https://reviews.llvm.org/D148221	2023-04-13 16:40:43 +01:00
Simon Pilgrim	9e30b87afb	[TTI] getMinMaxReductionCost - add FastMathFlag argument Similar to the getArithmeticReductionCost / getExtendedReductionCost calls (which really don't need to use std::optional<>). This will be necessary to correctly recognize fast/nnan fmax/fmul reductions which can avoid nan handling - which will allow us to remove the fmax/fmin special case in X86TTIImpl::getMinMaxCost and use getIntrinsicInstrCost like we do for integer reductions (63c3895327839ba5b57f5b99ec9e888abf976ac6). Differential Revision: https://reviews.llvm.org/D148149	2023-04-13 10:42:42 +01:00
Jun Zhang	e3175f7f1b	[InstCombine] icmp(X \| OrC, C) --> icmp(X, 0) We can eliminate the or operation based on the predicate and the relation between OrC and C. sge: X \| OrC s>= C --> X s>= 0 iff OrC s>= C s>= 0 sgt: X \| OrC s> C --> X s>= 0 iff OrC s> C s>= 0 sle: X \| OrC s<= C --> X s< 0 iff OrC s> C s>= 0 slt: X \| OrC s< C --> X s< 0 iff OrC s>= C s>= 0 Alive2 links: sge: https://alive2.llvm.org/ce/z/W-6FHE sgt: https://alive2.llvm.org/ce/z/TKK2yJ sle: https://alive2.llvm.org/ce/z/vURQGM slt: https://alive2.llvm.org/ce/z/JAsVfw Related issue: https://github.com/llvm/llvm-project/issues/61538 Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D147597	2023-04-13 17:26:24 +08:00
Max Kazantsev	2124505fe4	[IRCE] Relax restrictions on IRCE's latch exit count It seems that existing logic is too strict about latch block exit count. It is required to be computable, however it is not used in any computations, and effectively the only thing it is used for is to get the type of computed exit count. Sometimes the exit count for latch block is not known, but the loop is still finite because of other exits, and safe bounds are still computable. In this case, we miss an opportunity to apply IRCE. We could instead use a more relaxed version - max symbolic exit count, which, if it exists, is enough to say that the loop is finite, and its type should be good enough. There is a subtlety with type: we do not support latch count type wider than range check type. Because of that, we want to have the narrowest type available. So if it can be computed from latch block immediately, take it. Otherwise, take whatever whole loop provides and hope that its type isn't too wide. Differential Revision: https://reviews.llvm.org/D147910 Reviewed By: danilaml	2023-04-13 16:00:19 +07:00
Bjorn Pettersson	410775ecfd	[Transforms][LTO] Remove some redundant includes. NFC No need to include CallGraphSCCPass.h from the IPO/Inliner. Also removed the include of LegacyPassManager.h in a couple of files that do not really depend on that header file. Differential Revision: https://reviews.llvm.org/D148083	2023-04-13 10:12:00 +02:00
Max Kazantsev	246f8d4be5	[NFC][IRCE] Remove meaningless local variable	2023-04-13 13:04:45 +07:00
Max Kazantsev	d093d34c33	[IRCE][NFC] Remove unused variable IsSigned Patch by Aleksandr Popov! Differential Revision: https://reviews.llvm.org/D148113	2023-04-13 12:08:46 +07:00
Yashwant Singh	aea2a14736	[LoopUnroll] Prevent LoopFullUnrollPass from performing partial/runtime unrolling FullLoopUnroll was performing runtime unrolling in certain cases when '#pragma unroll' was specified. This patch fixes that by introducing a new parameter to tryToUnrollLoop() to differentiate between LoopUnrollPass and FullLoopUnrollPass. Based on the discussion here (https://discourse.llvm.org/t/loop-unroller-fails-to-unroll-loop/69834) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148071	2023-04-13 10:21:24 +05:30
Craig Topper	4b47d875a1	[LV] Optimize trip count SCEV. To calculate the trip count we need to add 1 to the backedge taken count. If we need to widen the backedge count, it's better to do the add before the widening if we can guarantee it won't overflow. The code here is based on similar code I found in LoopIdiomRecognize. This is the vectorizer version of this InstCombine patch D142783. Looking at the IR diffs, this does look like it gets more cases than the InstCombine patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D147355	2023-04-12 16:17:58 -07:00
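
The std::bit_floor rewrite described in commit d775fc390d above can be sanity-checked outside the compiler. The following is a minimal Python sketch, not code from the patch: the helper names `ctlz32`, `bit_floor_before` and `bit_floor_after` are hypothetical, and the two functions only model the 32-bit integer arithmetic of the "without this patch" and "with this patch" IR sequences quoted in that message. The `x == 0` branch stands in for the `select` on `%eq0`.

```python
def ctlz32(x: int) -> int:
    """Count leading zeros of a 32-bit value; returns 32 for x == 0,
    matching llvm.ctlz with the is_zero_poison flag set to false."""
    return 32 - x.bit_length()


def bit_floor_before(x: int) -> int:
    """Model of the original sequence: 1 << (32 - ctlz(x >> 1))."""
    if x == 0:  # select i1 %eq0, i32 0, ...
        return 0
    return (1 << (32 - ctlz32(x >> 1))) & 0xFFFFFFFF


def bit_floor_after(x: int) -> int:
    """Model of the rewritten sequence: 0x80000000 >> ctlz(x).
    The IR constant -2147483648 is 0x80000000 as an unsigned i32."""
    if x == 0:  # select i1 %eq0, i32 0, ...
        return 0
    return 0x80000000 >> ctlz32(x)


def bit_floor_reference(x: int) -> int:
    """Largest power of two not greater than x (0 for x == 0)."""
    return 0 if x == 0 else 1 << (x.bit_length() - 1)


if __name__ == "__main__":
    samples = list(range(1 << 12)) + [0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
    for value in samples:
        assert bit_floor_before(value) == bit_floor_after(value)
        assert bit_floor_after(value) == bit_floor_reference(value)
    print("both sequences agree on", len(samples), "sample values")
```

This only checks sample values; the Alive2 link in the commit message is the actual proof of the transform.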

33436 Commits