llvm-project

Author	SHA1	Message	Date
Nikita Popov	fd63a7d5c8	Revert "ValueTracking: Handle freeze in computeKnownFPClass" This reverts commit 2c8d0048f03d054f13909a26f959ef95b2a0a4de. This is incorrect: computeKnownFPClass() is only known up to poison, and freeze poison may have any FP class.	2023-04-17 12:59:23 +02:00
pvanhout	ae77aceba5	[Analysis] Remove DA & LegacyDA UniformityAnalysis offers all of the same features and much more, there is no reason left to use the legacy DAs. See RFC: https://discourse.llvm.org/t/rfc-deprecate-divergenceanalysis-legacydivergenceanalysis/69538 - Remove LegacyDivergenceAnalysis.h/.cpp - Remove DivergenceAnalysis.h/.cpp + Unit tests - Remove SyncDependenceAnalysis - it was not a real registered analysis and was only used by DAs - Remove/adjust references to the passes in the docs where applicable - Remove TTI hook associated with those passes. - Move tests to UniformityAnalysis folder. - Remove RUN lines for the DA, leave only the UA ones. - Some tests had to be adjusted/removed depending on how they used the legacy DAs. Reviewed By: foad, sameerds Differential Revision: https://reviews.llvm.org/D148116	2023-04-17 09:01:22 +02:00
Noah Goldstein	f688d215e5	[ValueTracking] Add `shl nsw %val, %cnt != 0` if `%val != 0`. Alive2 Link: https://alive2.llvm.org/ce/z/mxZLJn Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D147898	2023-04-14 18:23:47 -05:00
Noah Goldstein	684963b86d	[ValueTracking] Use maximum shift count in `shl` when determining if `shl` can be zero. Previously only return `shl` non-zero if the shift value was `1`. We can expand this if we have some bounds on the shift count. For example: ``` %cnt = and %c, 16 ; Max cnt == 16 %val = or %v, 4 ; val[2] is known one %shl = shl %val, %cnt ; (val.known.one << cnt.maxval) != 0 ``` Differential Revision: https://reviews.llvm.org/D147897	2023-04-14 18:23:45 -05:00
Matt Arsenault	2c8d0048f0	ValueTracking: Handle freeze in computeKnownFPClass	2023-04-14 17:53:41 -04:00
Matt Arsenault	49b931bdc5	ValueTracking: Implement computeKnownFPClass for arithmetic.fence	2023-04-14 17:41:27 -04:00
Matt Arsenault	3dabcdc78b	ValueTracking: Implement computeKnownFPClass for llvm.trunc	2023-04-14 17:41:26 -04:00
Matt Arsenault	656b52a6c6	ValueTracking: Handle non-splat vectors in computeKnownFPClass Avoids some regressions when the implementation of isKnownNeverNaN is replaced with computeKnownFPClass.	2023-04-14 17:41:26 -04:00
Matt Arsenault	e2d68c2fa4	ValueTracking: Implement computeKnownFPClass for canonicalize	2023-04-14 16:17:55 -04:00
Matt Arsenault	cb022084f0	ValueTracking: Handle fptrunc in computeKnownFPClass Handle nan.	2023-04-14 14:36:56 -04:00
Matt Arsenault	a517b4ad2d	InstSimplify: Perform cheaper check first	2023-04-14 14:36:56 -04:00
Matt Arsenault	409ef45000	ValueTracking: Handle extractelement and extractvalue in computeKnownFPClass	2023-04-14 14:36:56 -04:00
Matt Arsenault	c603fd2f39	ValueTracking: Implement computeKnownFPClass for sin/cos	2023-04-14 14:36:55 -04:00
Bjorn Pettersson	40c60c025c	[Passes] Remove the legacy DemandedBitsWrapperPass Last user of DemandedBitsWrapperPass was the BDCE pass. Since the legacy PM version of BDCE was removed in an earlier commit, this patch removes the now unused DemandedBitsWrapperPass. Differential Revision: https://reviews.llvm.org/D148336	2023-04-14 18:56:20 +02:00
Dmitry Makogon	e08f9894ec	[SCEV] Preserve NSW for AddRec multiplied by -1 if it cannot be signed minimum This preserves the NSW flag for AddRecs multiplied by -1 if we can prove via constant ranges that the AddRec cannot be signed minimum. An explanation: Let M be signed minimum. If AddRec's range contains M, then M * (-1) will stay M and (M + 1) * (-1) will be signed maximum, so we get a signed overflow. In all other cases, if an AddRec did not overflow in the signed sense, then AddRec * (-1) would not either. Differential Revision: https://reviews.llvm.org/D148084	2023-04-14 19:36:56 +07:00
Nikita Popov	62ef97e063	[llvm-c] Remove PassRegistry and initialization APIs Remove C APIs for interacting with PassRegistry and pass initialization. These are legacy PM concepts, and are no longer relevant for the new pass manager. Calls to these initialization functions can simply be dropped. Differential Revision: https://reviews.llvm.org/D145043	2023-04-14 12:12:48 +02:00
Nikita Popov	0b88adacd6	[InstSimplify] Add MaxRecurse argument to simplifyInstructionWithOperands (NFC)	2023-04-14 11:19:19 +02:00
Nikita Popov	c508e93327	[InstSimplify] Remove unused ORE argument (NFC)	2023-04-14 10:38:32 +02:00
Florian Hahn	7fc0b3049d	[VPlan] Switch to checking sinking legality for recurrences in VPlan. Building on D142885 and D142589, retire the SinkAfter map from the recurrence handling code. It is replaced by checking whether it is possible to sink all users of a recurrence directly in VPlan. This results in simpler code overall and allows to handle additional cases (see the improvements in @test_crash). Depends on D142885. Depends on D142589. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142886	2023-04-13 22:00:52 +01:00
Matt Arsenault	054cac104f	ValueTracking: Address todo for nan fmul handling in computeKnownFPClass If both operands can't be zero or nan, the result can't be nan.	2023-04-13 14:44:34 -04:00
Matt Arsenault	4d044bfb33	ValueTracking: Handle no-nan check for computeKnownFPClass for fmul Copy the logic from isKnownNeverNaN for fadd/fsub. Leave the extension to handle the zero case for a future change.	2023-04-13 14:44:34 -04:00
Simon Pilgrim	fb8038db73	[TTI] getExtendedReductionCost - replace std::optional<FastMathFlags> args with FastMathFlags Followup to D148149 where it was noticed that the std::optional wrapper wasn't helping with anything (we can just use an empty FastMathFlags()).	2023-04-13 11:26:28 +01:00
Simon Pilgrim	9e30b87afb	[TTI] getMinMaxReductionCost - add FastMathFlag argument Similar to the getArithmeticReductionCost / getExtendedReductionCost calls (which really don't need to use std::optional<>). This will be necessary to correctly recognize fast/nnan fmax/fmul reductions which can avoid nan handling - which will allow us to remove the fmax/fmin special case in X86TTIImpl::getMinMaxCost and use getIntrinsicInstrCost like we do for integer reductions (63c3895327839ba5b57f5b99ec9e888abf976ac6). Differential Revision: https://reviews.llvm.org/D148149	2023-04-13 10:42:42 +01:00
Matt Arsenault	6aca400986	ValueTracking: Handle no-nan check for computeKnownFPClass for fadd/fsub Copy the logic from isKnownNeverNaN for fadd/fsub.	2023-04-12 06:48:58 -04:00
Matt Arsenault	eb8e43a2a1	ValueTracking: Remove outdated todo	2023-04-12 06:48:58 -04:00
Mircea Trofin	f3b5fca12a	[mlgo] Fix the help message for interactive mode default advice This avoids the use-after-free introduced by D147794 and fixed in 437dfa5b0365.	2023-04-11 13:04:11 -07:00
Michael Liao	72fc08a541	[InstCombine] Teach alloca replacement to handle `addrspacecast` - As the address space cast may not be valid on a specific target, `addrspacecast` is not handled when an `alloca` is able to be replaced with the source of memcpy/memmove. This patch addresses that by querying a target hook on whether that address space cast is valid. For example, on most GPU targets, the cast from a global pointer to a generic pointer is valid. - If that cast is allowed (by querying `isValidAddrSpaceCast`), the replacement is enhanced to handle that `addrspacecast` as well. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D147025	2023-04-11 11:47:37 -04:00
John McIver	03dcd9da1a	[InstCombine] Allow splats with poison/undef in llvm::decomposeBitTestICmp This change is made to enable conversion of a masked icmp splat vector containing poison/undef to an equality expression. llvm::decomposeBitTestICmp Alive2 correctness examples using splat/masking vectors: SLT < https://alive2.llvm.org/ce/z/pPTTHh SLE <= https://alive2.llvm.org/ce/z/qQhAmU SGT > https://alive2.llvm.org/ce/z/koFHzF SGE >= https://alive2.llvm.org/ce/z/3SNz2S ULT <u https://alive2.llvm.org/ce/z/W8ktzQ ULE <=u https://alive2.llvm.org/ce/z/G5SdUY UGT >u https://alive2.llvm.org/ce/z/WFwYxq UGE >=u https://alive2.llvm.org/ce/z/DzJszP Tests have been verified using Alive2: icmp-logical.ll: @nomask_splat_and_B_allones https://alive2.llvm.org/ce/z/zmJwQU icmp-logical.ll: @nomask_splat_and_B_mixed https://alive2.llvm.org/ce/z/ktzgzd signed-truncation-check.ll: @positive_vec_undef0 https://alive2.llvm.org/ce/z/-sTRLD Differential Revision: https://reviews.llvm.org/D143032	2023-04-11 09:03:01 +01:00
Mehdi Amini	437dfa5b03	Fix use-after-free in help message: this cl::opt was binding a StringRef to a temporary string Caught by ASAN on a bot: https://lab.llvm.org/buildbot/#/builders/168/builds/12872/steps/14/logs/stdio	2023-04-11 00:26:15 -06:00
Joshua Cao	921b8f40e8	[SCEV][NFC] GetMinTrailingZeros switch case and naming cleanup * combine zext and sext into the one switch case * combine vscale and udiv into one switch case * renames according to LLVM style	2023-04-10 22:56:29 -07:00
Joshua Cao	898a9ca5e9	[SCEV] Strengthen huge constant trip multiples. SCEV determines that loops with trip count >=2^32 have a trip multiple of 1 to guard against huge multiples. This patch strengthens this to instead find the greatest power of 2 divisor that is less than the threshold. Differential Revision: https://reviews.llvm.org/D147868	2023-04-10 20:00:46 -07:00
Joshua Cao	569f7e547d	[SCEV][NFC] Convert check to assert getSmallConstantTripMultiple()	2023-04-10 19:59:01 -07:00
Joshua Cao	585742cbfc	[SCEV] When computing trip count, only zext if necessary This patch improves on https://reviews.llvm.org/D110587. To summarize the patch, given backedge-taken count BC, trip count TC is `BC + 1`. However, we don't know whether `BC + 1` might overflow. So the patch modifies TC computation to `1 + zext(BC)`. This patch only adds the zext if necessary by looking at the constant range. If we can determine that BC cannot be the max value for its bitwidth, then we know adding 1 will not overflow, and the zext is not needed. We apply loop guards before computing TC to get more data. The primary motivation is to support my work on more precise trip multiples in https://reviews.llvm.org/D141823. For example: ``` void test(unsigned n) __builtin_assume(n % 6 == 0); for (unsigned i = 0; i < n; ++i) foo(); ``` Prior to this patch, we had `TC = 1 + zext(-1 + 6 * ((6 umax %n) /u 6))<nuw>`. SCEV range computation is able to determine that the BC cannot be the max value, so the zext is not needed. The result is `TC -> (6 * ((6 umax %n) /u 6))<nuw>`. From here, we would be able to determine that %n is a multiple of 6. There was one change in LoopCacheAnalysis/LoopInterchange required. Before this patch, if a loop has BC = false, it would compute `TC -> 1 + zext(false) -> 1`, which was fine. After this patch, it computes `TC -> 1 + false = true`. CacheAnalysis would then sign extend the `true`, which was not the intended behavior. I modified CacheAnalysis such that it would only zero extend trip counts. This patch is not NFC, but also does not change any SCEV outputs. I would like to get this patch out first to make work with trip multiples easier. Differential Revision: https://reviews.llvm.org/D147117	2023-04-10 19:40:52 -07:00
Mircea Trofin	ab2e7666c2	[mlgo][inl] Interactive mode: optionally tell the default decision This helps training algorithms that may want to sometimes replicate the default decision. The default decision is presented as an extra feature called `inlining_default`. It's not normally exported to save computation time. This is only available in interactive mode. Differential Revision: https://reviews.llvm.org/D147794	2023-04-10 12:20:09 -07:00
Max Kazantsev	5b96b13fdf	[SCEV] Improve AddRecs' range computation in Expensive Range Sharpening mode Apply loop guards to AddRec's start in range computation for non-self-wrapping AddRecs. According to CT measurements, this has a wide negative compile time impact, so we hold it in expensive range sharpening mode where it's not so critical. However, we need to find a way to share benefits of this mode with default mode. Patch by Aleksandr Popov! Differential Revision: https://reviews.llvm.org/D147557 Reviewed By: mkazantsev	2023-04-10 16:37:10 +07:00
Joshua Cao	24170fb8cd	[SCEV][NFC] Fix `Do not use 'else' after 'return'` Follow LLVM coding standards and make clangd emit less warnings.	2023-04-08 15:56:08 -07:00
Philip Reames	0437f88b77	[LAA] Cleanup casting in replaceSymbolicStrideSCEV [nfc]	2023-04-06 09:13:55 -07:00
Philip Reames	2d79b71366	[LAA] Continue moving utilities to sole use to isolate symbolic stride reasoning [nfc]	2023-04-06 08:27:57 -07:00
Dávid Bolvanský	e1f94336e9	Revert "[InlineCost] isKnownNonNullInCallee - handle also dereferenceable attribute" This reverts commit 3b5ff3a67c1f0450a100dca34d899ecd3744cb36.	2023-04-06 16:54:26 +02:00
Dávid Bolvanský	3b5ff3a67c	[InlineCost] isKnownNonNullInCallee - handle also dereferenceable attribute	2023-04-06 16:51:28 +02:00
Philip Reames	800a99c4f4	[LAA] Group implementation of stride speculation into one file [nfc] These utilities are only used in one place, so move them there and make them static.	2023-04-05 20:39:08 -07:00
Philip Reames	c416f6700f	[IVDescriptors] Add pointer InductionDescriptors with non-constant strides (try 2) (JFYI - This has been heavily reframed since original attempt at landing.) This change updates the InductionDescriptor logic to allow matching a pointer IV with a non-constant stride, but also updates the LoopVectorizer to bailout on such descriptors by default. This preserves the default vectorizer behavior. In review, it was pointed out that there's multiple unfortunate performance implications which need to be addressed before this can be enabled. Having a flag allows us to exercise the behavior, and write test cases for logic which is otherwise unreachable (or hard to reach). This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues. Differential Revision: https://reviews.llvm.org/D147336	2023-04-05 09:32:35 -07:00
David Sherwood	b4089cfa2f	[NFC][LoopVectorize] Simplify preferPredicateOverEpilogue interface Given just how many arguments we pass to preferPredicateOverEpilogue and considering this list may grow over time I've decided to pass in a pointer to a new TailFoldingInfo structure instead, similar to what we do with IntrinsicCostAttributes, etc. In addition, many of the arguments we pass in are actually available in the LoopVectorizationLegality class so I've managed to reduce the set of pointers that we need to pass in the TailFoldingInfo struct. Differential Revision: https://reviews.llvm.org/D146127	2023-04-04 14:00:49 +00:00
Craig Topper	1f60c8d025	[IR] Replace calls to ConstantFP::getNullValue with ConstantFP::getZero. NFC There is no getNullValue in ConstantFP. Due to inheritance, we're calling Constant::getNullValue which handles any type including FP. Since we already know we want an FP constant we can use ConstantFP::getZero which might be faster and is a more readable name for an FP zero.	2023-04-03 23:14:02 -07:00
Noah Goldstein	87c97d052c	[InstSimplify] Extend simplifications for `(icmp ({z|s}ext X), C)` where `C` is vector Previous logic only applied for `ConstantInt` which misses all vector cases. New code works for splat/non-splat vectors as well. No change to the underlying simplifications. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D147275	2023-04-03 11:04:57 -05:00
Florian Hahn	0d61ffd350	[Loads] Support SCEVAddExpr as start for pointer AddRec. Extend handling to support `%base + offset` as start for AddRecs in isDereferenceableAndAlignedInLoop. This is done by adjusting AccessSize by the offset and effectively checking if the full object starting from %base to %base + offset + access-size is dereferenceable. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D147260	2023-04-02 12:33:44 +01:00
Nikita Popov	3f53a58597	[ValueTracking] Fix incorrect computeConstantRange() arguments The second argument is ForSigned, not UseInstrInfo.	2023-03-31 16:56:56 +02:00
David Green	965a090f02	Revert "[IVDescriptors] Add pointer InductionDescriptors with non-constant strides" Multiple errors have been reported on https://reviews.llvm.org/rG498aa534f472d28db893aa9a8627d0b46e17f312 Reverting until the correctness issues can be resolved. We are also seeing a lot of performance differences from the patch. Some are looking good, but some are looking pretty bad.	2023-03-31 11:08:50 +01:00
Philip Reames	498aa534f4	[IVDescriptors] Add pointer InductionDescriptors with non-constant strides This matches the handling for integer IVs. I left the non-opaque cases alone, mostly because they're largely irrelevant today. This doesn't actually make much difference in vectorization right now as we immediately fail on aliasing checks (which also bail on non-constant strides). Slightly surprisingly, it's the cases which do need runtime checks which work after this patch as they don't use the same dependency analysis path. This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues.	2023-03-30 11:56:00 -07:00
Kazu Hirata	236c9217a9	Use Dense{Map,Set}::contains (NFC)	2023-03-29 23:01:11 -07:00
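
The non-zero reasoning described in commit 684963b86d (shifting the known-one bits by the maximum shift count) can be checked by brute force at a small bit width. The following is a minimal sketch, not the ValueTracking implementation: the 8-bit width and the helper names `shl`, `shl_known_non_zero`, and `brute_force_non_zero` are assumptions made for illustration only.

```python
# Illustrative model only; WIDTH and all function names are hypothetical.
WIDTH = 8
MASK = (1 << WIDTH) - 1


def shl(value, count):
    """Left shift truncated to WIDTH bits, like `shl` on an integer of that width."""
    return (value << count) & MASK


def shl_known_non_zero(known_one, max_count):
    """Rule from the commit message: (val.known.one << cnt.maxval) != 0.

    If some known-one bit survives the largest possible shift, it also
    survives every smaller shift, so the result cannot be zero.
    """
    return shl(known_one, max_count) != 0


def brute_force_non_zero(known_one, max_count):
    """Check every value containing the known-one bits and every count <= max_count."""
    for value in range(1 << WIDTH):
        if value & known_one != known_one:
            continue  # value does not have all the known-one bits set
        for count in range(max_count + 1):
            if shl(value, count) == 0:
                return False
    return True


# Whenever the rule fires, the exhaustive check must agree.
for known_one in range(1 << WIDTH):
    for max_count in range(WIDTH):
        if shl_known_non_zero(known_one, max_count):
            assert brute_force_non_zero(known_one, max_count)
```

With bit 2 known to be one (`known_one = 4`), the rule holds at this width for a maximum shift count of 5 and no longer applies at 6, where the known bit would be shifted out.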


12358 Commits