Correctly handle the case of splitting an alloca which backs contiguous
distinct variables, where a slice's size equals the size of a backed variable.
We need to ensure that we don't generate fragment expressions with fragments
of the same size as the variable, as this is a verifier error.
Prior to this patch a fragment expression would be created in this situation.
For example, when splitting an i64 alloca backing two adjacent 32-bit
variables into two 32-bit allocas, the new dbg.assign expressions would
contain (DW_OP_LLVM_fragment, 0, 32) and (DW_OP_LLVM_fragment, 32, 32) even
though those fragments cover each variable entirely.
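A rough sketch of the situation (names are illustrative):

  %both = alloca i64   ; backs two adjacent 32-bit variables
  ; after splitting:
  %lo = alloca i32     ; slice [0, 32)  covers the first variable entirely
  %hi = alloca i32     ; slice [32, 64) covers the second variable entirely
  ; the dbg.assign for each new alloca should now use an empty
  ; !DIExpression() rather than a same-sized DW_OP_LLVM_fragment.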
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D147696
Rewrite the code to avoid iterating over the users of constants and
global values.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147450
In getInstructionCost, if we know a zext/sext is going to be shrunk, we
should only change the destination type and leave the source type
unchanged. For example, we may change a zext from
  zext <16 x i8> %a to <16 x i32>
to
  zext <16 x i8> %a to <16 x i16>
However, we were previously calculating the cost of
  zext <16 x i16> %a to <16 x i16>
which is incorrect.
Differential Revision: https://reviews.llvm.org/D147152
This patch makes sure that we use getTopMostExitingLoop when determining
which loops to forget in unswitchNontrivialInvariants and
unswitchTrivialSwitch. It is needed at least for
unswitchNontrivialInvariants, as detected by the included test case.
Note that unswitchTrivialBranch already used getTopMostExitingLoop.
This was done in commit 4a9cde5a791cd49b96993e6. The commit
message in that commit says "If the patch makes sense, I will also
update those places to a similar approach ...", referring to these
functions mentioned above. As far as I can tell that never happened,
but this is an attempt to finally fix that.
Fixes https://github.com/llvm/llvm-project/issues/61080
Differential Revision: https://reviews.llvm.org/D147058
The patch generalizes the analysis of scalars. The main part is outlined
into a lambda, which can be used to find reused inserted scalars and emit a
shuffle for them instead of multiple insertelement instructions if the
permutation has already been found. I.e. some scalars are produced by a
permutation of previously vectorized nodes, and some are inserted
directly.
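For illustration (names invented), instead of rebuilding the vector
element by element:

  %v0 = insertelement <4 x float> poison, float %s0, i32 0
  %v1 = insertelement <4 x float> %v0, float %s1, i32 1
  ...

the lambda can recognize that the scalars come from an already-vectorized
node and emit a single permutation of it:

  %v = shufflevector <4 x float> %vec, <4 x float> poison, <4 x i32> <i32 2, i32 3, i32 0, i32 1>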
Reworked part of D110978
Differential Revision: https://reviews.llvm.org/D146564
(JFYI - This has been heavily reframed since original attempt at landing.)
This change updates the InductionDescriptor logic to allow matching a pointer IV with a non-constant stride, but also updates the LoopVectorizer to bail out on such descriptors by default. This preserves the default vectorizer behavior.
In review, it was pointed out that there are multiple unfortunate performance implications which need to be addressed before this can be enabled. Having a flag allows us to exercise the behavior, and write test cases for logic which is otherwise unreachable (or hard to reach).
This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues.
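A minimal example of such a recurrence, assuming %stride is loop-invariant
but not a compile-time constant:

  loop:
    %ptr = phi ptr [ %start, %entry ], [ %ptr.next, %loop ]
    ...
    %ptr.next = getelementptr i8, ptr %ptr, i64 %stride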
Differential Revision: https://reviews.llvm.org/D147336
Limit dot product lowering to column-major matrices for now. This
simplifies the code and the reasoning for upcoming planned improvements.
Support for row-major matrices can be added later as an extension.
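For reference, a dot product here is a 1xK times Kx1 multiply through the
matrix intrinsics, e.g.:

  %dot = call <1 x float> @llvm.matrix.multiply.v1f32.v4f32.v4f32(
             <4 x float> %row, <4 x float> %col, i32 1, i32 4, i32 1)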
/data/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:32:15: error: function 'decomposeSimpleLinearExpr' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static Value *decomposeSimpleLinearExpr(Value *Val, unsigned &Scale,
^
1 error generated.
canonicalizeLogicFirst reorders a logic op and a math op for suitable
constants; this commit makes the function pass through the nsw/nuw flags
on the Add.
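An illustrative example, with constants chosen so that the add cannot
carry into the masked-off bits:

  %a = add nuw i8 %x, 16
  %r = and i8 %a, -16
    -->
  %m = and i8 %x, -16
  %r = add nuw i8 %m, 16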
Differential Revision: https://reviews.llvm.org/D147568
For poison-generating (rather than immediate-UB) metadata, only copy it
from the dominating must-exec load if it is combined with !noundef.
This could be further extended by additionally intersecting the
metadata from all loads, which does not require !noundef.
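Sketch of the requirement (metadata numbering elided):

  %x = load i32, ptr %p, !range !0, !noundef !1  ; dominating, must-executed
  ...
  %y = load i32, ptr %p  ; may inherit !range only because the !noundef
                         ; above makes a violation immediate UB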
createFragmentExpression will fail if it determines that the expression cannot
be split over fragments. Handle this case in SROA. As with D147312, this
should be a rare occurrence, as the `dbg.assign` will usually reference the
`Value` being stored without modifying it with a `DIExpression`.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D147431
LLVM has the ability to vectorize using function variants that require a
mask by creating an all-true mask, and to vectorize a conditional call via
scalarization. This patch joins the two parts together and uses a masked
variant when a mask is required.
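As a sketch (the variant name and signature are made up), a conditional
scalar call such as

  %r = call float @foo(float %x)   ; executed only under some condition

can now be vectorized by calling a masked variant with the block mask,
instead of requiring an all-true mask or scalarization:

  %r.v = call <4 x float> @masked_foo(<4 x float> %x.v, <4 x i1> %mask)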
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D136251
LICM currently requests optimized use MSSA form. This is wasteful,
because LICM doesn't actually care about most uses, only those of
invariant pointers in loops. Everything else doesn't need to be
optimized.
LICM already uses the clobber walker in most places. This patch
adjusts one place that was using getDefiningAccess() to use it as
well, so we no longer have a dependence on pre-optimized uses.
This change is not NFC: the fallback to the defining access when there
are too many clobber calls may now return an unoptimized use. In
practice, I've not seen any problems with this though.
though. If desired, we could also increase licm-mssa-optimization-cap
to a higher value (increasing this from 100 to 200 has no impact on
average compile-time -- but also doesn't appear to have any impact
on LICM quality either).
This makes for a 0.9% geomean compile-time improvement on CTMark.
Differential Revision: https://reviews.llvm.org/D147437
Ironically, MSan copies uninitialized data off the stack into
VAArgTLSCopy in the callee-side handling of va_start. Clamp the copy
size to the actual length of the buffer, and zero-initialize the
remainder.
Differential Revision: https://reviews.llvm.org/D146858
Since memory does not have an intrinsic type, we do not need to require value type matching on stores in order to sink them. To facilitate that, this patch finds stores which are sinkable, but have conflicting types, and bitcasts the ValueOperand so they are easily sinkable into a PHINode. Rather than doing fancy analysis to optimally insert the bitcast, we always insert right before the relevant store in the diamond branch. The assumption is that later passes (e.g. GVN, SimplifyCFG) will clean up bitcasts as needed.
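A sketch of the transformation (names invented):

  if.then:
    store i32 %i, ptr %p
    br label %merge
  if.else:
    %f.i32 = bitcast float %f to i32   ; inserted right before the store
    store i32 %f.i32, ptr %p
    br label %merge
  merge:
    ; the stores now agree on the value type and can be sunk behind a phi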
Differential Revision: https://reviews.llvm.org/D147348
Given just how many arguments we pass to
preferPredicateOverEpilogue and considering this list may
grow over time I've decided to pass in a pointer to a new
TailFoldingInfo structure instead, similar to what we do
with IntrinsicCostAttributes, etc. In addition, many of the
arguments we pass in are actually available in the
LoopVectorizationLegality class so I've managed to
reduce the set of pointers that we need to pass in the
TailFoldingInfo struct.
Differential Revision: https://reviews.llvm.org/D146127
After D141386, violation of nonnull, range and align metadata
results in poison rather than immediate undefined behavior,
which means that these are now safe to retain when speculating.
We only need to remove UB-implying metadata like noundef.
This is done by adding a dropUBImplyingAttrsAndMetadata() helper, which
lists the metadata known to be safe to retain on speculation.
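For example (metadata numbering elided), when hoisting a load above its
controlling branch:

  %q = load ptr, ptr %pp, !nonnull !0, !noundef !1
    -->
  %q = load ptr, ptr %pp, !nonnull !0   ; poison-generating !nonnull kept,
                                        ; UB-implying !noundef dropped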
Differential Revision: https://reviews.llvm.org/D146629
There is no getNullValue in ConstantFP. Due to inheritance, we're calling
Constant::getNullValue which handles any type including FP.
Since we already know we want an FP constant we can use ConstantFP::getZero
which might be faster and is a more readable name for an FP zero.
As a result of -ftrivial-auto-var-init, clang generates instructions to
set alloca'd memory to a given pattern, right after the allocation site.
In some cases, this (somewhat costly) operation can be delayed, leading
to conditional execution.
This is not an uncommon situation: it happens ~500 times on the cPython
code base, and much more on the LLVM codebase. The benefit varies greatly
with the execution path, but it should not regress performance.
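Schematically (with 0xAA standing in for the chosen pattern):

  entry:
    %buf = alloca [64 x i8]
    call void @llvm.memset.p0.i64(ptr %buf, i8 -86, i64 64, i1 false)
    br i1 %cond, label %use, label %exit
    ; the memset can be sunk into %use, so the %exit path never pays for it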
This is a recommit of cca01008cc31a891d0ec70aff2201b25d05d8f1b with
MemorySSA update fixes.
Differential Revision: https://reviews.llvm.org/D137707
The counters for the repeated scalars are ordered in the natural order,
but the original scalars might be reordered during SLP graph reordering,
and that order can be lost. We need to use the scalars after the
reordering, not the original ones, to emit correct code for the
same-value counters.
Per LangRef:
> If a load instruction tagged with the !invariant.load metadata
> is executed, the memory location referenced by the load has to
> contain the same value at all points in the program where the
> memory location is dereferenceable; otherwise, the behavior is
> undefined.
As invariant.load violation is immediate undefined behavior, it
is sufficient for it to be present on the dominating load (for
the case where K does not move).
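For instance (metadata elided):

  %a = load i32, ptr %p, !invariant.load !0   ; dominating load
  %b = load i32, ptr %p                       ; CSE'd into %a

Keeping !invariant.load on the combined load is fine here because any
violation at %a is already immediate UB.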
patchReplacementInstruction() is used for CSE-style transforms.
Avoid the need to maintain two separate lists of known metadata IDs,
which can and do go out of sync.
As a result of -ftrivial-auto-var-init, clang generates instructions to
set alloca'd memory to a given pattern, right after the allocation site.
In some cases, this (somewhat costly) operation can be delayed, leading
to conditional execution.
This is not an uncommon situation: it happens ~500 times on the cPython
code base, and much more on the LLVM codebase. The benefit varies greatly
with the execution path, but it should not regress performance.
Differential Revision: https://reviews.llvm.org/D137707
If there is an optimisation opportunity but the function argument hasn't
been added to the constraint system through previous facts, we fail to
optimise it. It might be a good idea to start the constraint system with
all the function arguments already added to the system.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D144879
Even if there are no thread-safety concerns, we should not promote
(not guaranteed-to-execute) stores to globals without further
analysis: While the global may be writable, we may not have
provenance to perform the write. The @promote_global_noalias test
case illustrates a miscompile in the presence of a noalias pointer
to the global.
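A rough sketch of the problem (simplified; the real test is
@promote_global_noalias):

  @g = global i32 0
  define void @f(ptr noalias %p, i1 %c) {
    ; in a loop: a store to @g guarded by %c, where %c is never true in
    ; executions where the caller passed @g as %p. Promotion makes the
    ; store to @g unconditional, introducing a write that conflicts with
    ; the noalias accesses through %p.
  }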
Worth noting that load-only promotion may also not be well-defined
depending on the precise semantics (we don't specify whether a load
violating noalias is poison or UB; I believe the general inclination is
to make it poison, and only stores UB), but that's a more general issue.
This is inspired by https://github.com/llvm/llvm-project/issues/60860,
which is a related issue with TBAA metadata.
Differential Revision: https://reviews.llvm.org/D146233
When we replace poison with frozen poison, the user of the poison might
be a constant (for example, a vector constant). In that case the constant
would end up with a non-constant operand. Moreover, replacing poison and
GlobalValues everywhere in the module seems to be overkill, so the
solution is to make the replacement only in the instructions we visited
(those contributing to the hoisted condition). Additionally, if the user
of the poison is a constant, that constant itself needs a freeze; it does
not make sense to replace the poison inside it with a frozen version,
just freeze the whole constant instead.
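For example:

  %v = add <2 x i32> %x, <i32 1, i32 poison>
    -->
  %c.fr = freeze <2 x i32> <i32 1, i32 poison>
  %v = add <2 x i32> %x, %c.fr

Replacing the poison lane in place would have given the vector constant a
non-constant (frozen) operand.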
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147429
Extract a helper that does the clobber walk while taking into
account the cap. Slightly reflow things to check this first in
the store case, before we start walking over all accesses in the
loop.
Fixes an assertion failure (`... instruction with users.' failed): if
the externally used scalar is part of the tree and is replaced by an
extractelement instruction, we need to add the generated extractelement
instruction to the list of ExternallyUsedValues to avoid its deletion
during vectorization.
Move this handling to a centralized place and extend it to handle
saturating add/sub intrinsics.
I originally wanted to make this fully generic rather than
whitelist based, because this is legal and likely profitable for all
speculatable intrinsics. The caveat is that for vector selects,
the intrinsic can't perform cross-lane operations like a shuffle
or reduction, which we don't really expose as a generic property
right now. So for now I'm just extending the list.
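One shape of the fold, sketched for @llvm.sadd.sat (legal because the
intrinsic is speculatable and operates lane-wise):

  %s = select i1 %c, i32 %a, i32 %b
  %r = call i32 @llvm.sadd.sat.i32(i32 %s, i32 %x)
    -->
  %r.a = call i32 @llvm.sadd.sat.i32(i32 %a, i32 %x)
  %r.b = call i32 @llvm.sadd.sat.i32(i32 %b, i32 %x)
  %r   = select i1 %c, i32 %r.a, i32 %r.b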
Now that we canonicalize to min/max intrinsics, we no longer need
to guard against this here.
In fact, it seems like the issue from PR46271 was the final push
for introducing the intrinsics in the first place...