Instead of the abstract cost of the scalar reduction ops, try to use
the cost of the actual reduction instructions, where possible. Also,
remove the estimation of the vectorized GEP pointers for reduced loads,
since it is already handled in the tree.
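For illustration only (a rough sketch, not code from the patch; the
actual SLP call sites and cost-kind handling differ), the idea is to ask
TargetTransformInfo for the cost of a real reduction instruction rather
than approximating it from the scalar op cost:

  #include "llvm/Analysis/TargetTransformInfo.h"
  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/Instruction.h"
  #include <optional>
  using namespace llvm;

  // Sketch only: cost of an actual add reduction over VecTy, instead of
  // NumElts * (cost of a scalar add).
  static InstructionCost addReductionCost(const TargetTransformInfo &TTI,
                                          FixedVectorType *VecTy) {
    return TTI.getArithmeticReductionCost(Instruction::Add, VecTy,
                                          /*FMF=*/std::nullopt,
                                          TTI::TCK_RecipThroughput);
  }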
Differential Revision: https://reviews.llvm.org/D148036
GVN load widening was disabled in D24096. This removes various
support code that is no longer relevant.
The way this works nowadays is that we return PartialAlias with
an offset from BasicAA and this gets passed on as a clobber by
MDA. However, PartialAlias will only be returned if the load is
properly nested inside the other load.
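As a hand-written illustration of what "properly nested" means here (not
code from the patch): the narrow load reads bytes that lie entirely
within the bytes read by the wide load, at a known offset, which is what
lets BasicAA report PartialAlias with an offset.

  #include <cstdint>

  // Illustration only: the 1-byte load reads byte 2 of the 4 bytes already
  // read by the 4-byte load, so it is properly nested inside it.
  uint32_t nestedLoads(const uint32_t *P) {
    uint32_t Wide = *P;                                           // 4-byte load
    uint8_t Narrow = *(reinterpret_cast<const uint8_t *>(P) + 2); // 1-byte load at offset 2
    return Wide + Narrow;
  }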
This just removes the bulk of the code, but some additional
cleanup can be done here now that we don't need to distinguish
between load and store cases.
Perform dot-product lowering before instruction fusion to avoid a crash
in the newly added test. Also update lowerDotProduct to properly mark
the optimized matmul as fused.
This could also move the initialization of sret args, causing
actually-initialized parts of such return values to become
uninitialized. See the discussion on the code review.
> As a result of -ftrivial-auto-var-init, clang generates instructions to
> set alloca'd memory to a given pattern, right after the allocation site.
> In some cases, this (somewhat costly) operation could be delayed, leading
> to conditional execution in some cases.
>
> This is not an uncommon situation: it happens ~500 times on the cPython
> code base, and much more on the LLVM codebase. The benefit greatly
> varies with the execution path, but it should not regress performance.
>
> This is a recommit of cca01008cc31a891d0ec70aff2201b25d05d8f1b with
> MemorySSA update fixes.
>
> Differential Revision: https://reviews.llvm.org/D137707
This reverts commit 50b2a113db197a97f60ad2aace8b7382dc9b8c31
and follow-up commit ad9ad3735c4821ff4651fab7537a75b8f0bb60f8.
Added a ShuffleCostEstimator class and the first adjustExtracts member,
which is just a copy of the previous AdjustExtractCost lambda.
Differential Revision: https://reviews.llvm.org/D147787
Some dbg.assigns using poison become un-poisoned in SROA. The reason this
happens at all is that dbg.assigns linked to memory intrinsics use poison to
indicate they can't describe the stored value, but the value becomes available
after some optimisations. This needs reworking eventually, but for now we need
to ensure that when it does occur we don't create invalid expressions.
D147312 prevented this occurring when the dbg.assign uses DIArgLists, but that
wasn't a complete fix. We also need to ensure we avoid un-poisoning when the
existing expression uses more than one location operand (DW_OP_arg, n).
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D148020
- As the address space cast may not be valid on a specific target,
  `addrspacecast` is not handled when an `alloca` can be replaced with
  the source of memcpy/memmove. This patch addresses that by querying a
  target hook on whether that address space cast is valid. For example,
  on most GPU targets, the cast from a global pointer to a generic
  pointer is valid.
- If that cast is allowed (by querying `isValidAddrSpaceCast`), the
  replacement is enhanced to handle that `addrspacecast` as well, as
  sketched below.
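A minimal sketch of the gating, assuming a TTI hook of roughly this
shape (the hook name is taken from the text above; the exact signature
and call site are assumptions, not the actual patch):

  #include "llvm/Analysis/TargetTransformInfo.h"
  using namespace llvm;

  // Sketch only: permit the replacement either when no cast is needed or
  // when the target reports the address space cast as valid.
  static bool replacementCastAllowed(const TargetTransformInfo &TTI,
                                     unsigned SrcAS, unsigned DestAS) {
    return SrcAS == DestAS || TTI.isValidAddrSpaceCast(SrcAS, DestAS);
  }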
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D147025
As described in [0], this extends IRPGO to support //Temporal Profiling//.
When `-pgo-temporal-instrumentation` is used we add the
`llvm.instrprof.timestamp()` intrinsic to the entry of functions, which
in turn is lowered to a call to the compiler-rt function
`INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts`
section stores each function's timestamp. Then in `llvm-profdata merge`
we convert these function timestamps into a //trace// and add it to the
indexed profile.
Since these traces could significantly increase the profile size, we've
added `-max-temporal-profile-trace-length` and
`-temporal-profile-trace-reservoir-size` to limit the length of a trace
and the number of traces in a profile, respectively.
In a future diff we plan to use these traces to construct an optimized
function order to reduce the number of page faults during startup.
Special thanks to Julian Mestre for helping with reservoir sampling.
[0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
Reviewed By: snehasish
Differential Revision: https://reviews.llvm.org/D147287
We have now seen two miscompiles because of widening widenable
conditions at incorrect IR points and thereby changing a branch's loop
invariant condition to a loop-varying one (see PR60234 and PR61963).
This patch adds asserts in common guard utilities that we use for
widening to proactively catch these bugs in future.
Note that these asserts will not fire if we were to sink a widenable
condition from outside a loop into the loop (that's also incorrect for the
same reason as above).
Tested this without the fix for PR60234 (guard widening miscompile) and
confirmed the assert fires.
WARNING: Sometimes, the assert can fire if we failed to hoist the
invariant condition out of the loop. This is a pass-ordering issue or a
limitation in LICM, which would need investigation. See details in the
review.
Differential Revision: https://reviews.llvm.org/D147752
Update the planning code that constructs VPlans to allow VPlan building
to fail. This allows us to gradually shift some legality checks to VPlan
construction. The first candidate is checking if all users of
first-order recurrence phis can be sunk past the recipe computing the
previous value.
The new functionality will be used by D142886 which is approved and will
be landed shortly.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142885
Avoid divergence between different kinds of hoisting with reassociation.
Make them all collect the general NumHoisted stat as well as specific
stats for each particular transform.
In this case the source GEP might not be hoisted even though it
has invariant operands. For now just bail out, but we might need
additional checks for AllowSpeculation in these special-case
reassociation folds.
Reassociate gep (gep ptr, idx1), idx2 to gep (gep ptr, idx2), idx1
if this would make the inner GEP loop invariant and thus hoistable.
This is intended to replace an InstCombine fold that does this (in
04f61fb73d/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (L2006)).
The problem with the InstCombine fold is that LoopInfo is an optional
dependency, so it is not performed reliably.
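As a hand-written C++ analogue of the effect (illustration only; the
actual transform operates on GEP instructions inside LICM), swapping the
index order exposes a loop-invariant sub-address that can be hoisted:

  // Before: the inner address (p + i) depends on the loop-variant i, so
  // nothing can be hoisted; this corresponds to gep (gep p, i), off.
  void before(int *p, long off, long n) {
    for (long i = 0; i < n; ++i)
      *((p + i) + off) = 0;
  }

  // After reassociation: p + off is loop invariant and can be hoisted;
  // this corresponds to gep (gep p, off), i.
  void after(int *p, long off, long n) {
    int *base = p + off; // loop invariant, hoisted out of the loop
    for (long i = 0; i < n; ++i)
      *(base + i) = 0;
  }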
Differential Revision: https://reviews.llvm.org/D146813
Suggested as independent cleanup in D147567. Either VF or UF needs to
be > 1. Note that if the condition were false, the code below would use
a nullptr and crash.
Move the code that collects live-outs earlier and only generate
extracts for exit values if there are any live-outs that use them.
Depends on D147472.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147567
Loop predication's predicateLoopExit pass does two incorrect things:
- It sinks the widenable call into the loop, thereby converting an
  invariant condition to a variant one.
- It widens the widenable call at a branch, thereby converting the
  branch into a loop-varying one.
The latter is problematic when the branch may have been loop-invariant
and prior optimizations (such as indvars) may have relied on this fact
and updated the deopt state accordingly. Now, when we widen this with a
loop-varying condition, the deopt state is no longer correct.
Fixes https://github.com/llvm/llvm-project/issues/61963.
Differential Revision: https://reviews.llvm.org/D147662
Instead of iterating over all LCSSA phis in the exit block, collect all
LiveOut users of the FOR splice VPInstruction and only update those
users.
Building on top of D147471, this removes an access to the cost model
after VPlan execution.
Depends on D147471.
Reviewed By: Ayal, michaelmaitland
Differential Revision: https://reviews.llvm.org/D147472
It is not actually used for any computations. Its only purpose is to
check that the loop is finite and to find out the type of the computed
exit count. Refactor the code so that we only store this type.
Instead of clearing live outs when a scalar epilogue is required late,
don't add live outs during VPlan construction if a scalar epilogue is
required.
This enables more VPlan-based DCE (if the live out would be the only
user in the plan) and is a step towards removing an access to the cost
model in fixVectorizedLoop (which runs after VPlan execution).
Depends on D147468.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147471
This removes the need to convert the end of the range to the next
power-of-2 for the end iterator after 4bd3fda5124962 and was suggested
as a follow-up TODO in D147468.
Add an iterator to iterate over all VFs in VFRange. This simplifies some
existing code and allows using all_of, any_of and none_of on a VFRange.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147468
Use clone to keep the metadata; the issue was reported by aeubanks on
D141188.
Reviewed By: nikic, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D146702
Using the more robust log2 search allows us to fold more cases (same
logic as exists for idiv/irem).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D146347
dependencies.
Improved compile time by precomputing the mapping between gathered
scalars and their gather/buildvector nodes for later use in
isGatherShuffledEntry, to avoid recomputing this map each time the
function is called.
This patch adds a NeedsMaskForGaps field to VPInterleaveRecipe to record
whether a mask for gaps is needed. This removes a dependence on the cost
model in VPlan code-generation.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147467
Unfortunately, alive2 cannot prove the correctness because it times out
even for the half float type. However, it should be correct: if a and b
are not NaN, maximum and minimum simply return the two values (a and b),
and since a + b == b + a the result is the same. If a or b is NaN, then
maximum and minimum are both NaN, and NaN + NaN is NaN; a + b is also
NaN.
In terms of preserving fast-math flags, we cannot preserve ninf:
minimum(NaN, Infinity) == maximum(NaN, Infinity) == NaN, so
minimum(NaN, Infinity) +ninf maximum(NaN, Infinity) == NaN +ninf NaN == NaN,
while the transformation would change this to NaN +ninf Infinity == poison.
But if the fadd is also marked nnan, we can preserve ninf, because
NaN +ninf/nnan NaN is poison as well.
The same optimization is added for
maximum(a,b) * minimum(a,b) => a * b
and everything said above for fadd also holds for fmul.
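As a hand-written illustration (not code from the patch; maximumNum and
minimumNum are local helpers mimicking the NaN-propagating semantics of
llvm.maximum/llvm.minimum, unlike std::fmax/std::fmin, which ignore NaN
operands):

  #include <cassert>
  #include <cmath>
  #include <limits>

  // Local helpers mirroring llvm.maximum/llvm.minimum: NaN is propagated.
  static double maximumNum(double A, double B) {
    if (std::isnan(A) || std::isnan(B))
      return std::numeric_limits<double>::quiet_NaN();
    return A > B ? A : B;
  }
  static double minimumNum(double A, double B) {
    if (std::isnan(A) || std::isnan(B))
      return std::numeric_limits<double>::quiet_NaN();
    return A < B ? A : B;
  }

  int main() {
    double A = 3.5, B = -2.0;
    // Non-NaN operands: max and min pick each value exactly once, so the
    // sum (and analogously the product) equals A + B (resp. A * B).
    assert(maximumNum(A, B) + minimumNum(A, B) == A + B);
    assert(maximumNum(A, B) * minimumNum(A, B) == A * B);

    // NaN operand: both sides of the fold are NaN, so it stays consistent.
    double N = std::numeric_limits<double>::quiet_NaN();
    assert(std::isnan(maximumNum(N, B) + minimumNum(N, B)));
    assert(std::isnan(N + B));
    return 0;
  }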
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D147299
Revert commit due to failure on buildbot:
error: 'match_combine_or' may not intend to support class template argument deduction
This reverts commit b86a06ef284f2637bef89bf5bb20157a8b195568.
Make adjustExtracts/needToDelay lambdas members of
ShuffleInstructionBuilder so they can be overloaded later for the cost
model.
Differential Revision: https://reviews.llvm.org/D147730
The original loop is O(MxN) since `is_contained` iterates over
all incoming values. This change iterates only over the phis that use
the value as an incoming value, so it is now O(M).
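A rough sketch of the shape of the change (an illustrative helper, not
the actual code from the patch):

  #include "llvm/ADT/STLFunctionalExtras.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Sketch only: walk the uses of V directly, visiting just the phis that
  // actually have V as an incoming value, instead of scanning every phi's
  // incoming value list with is_contained.
  static void forEachPhiIncomingUse(Value *V,
                                    function_ref<void(PHINode &, unsigned)> Fn) {
    for (Use &U : V->uses())
      if (auto *PN = dyn_cast<PHINode>(U.getUser()))
        Fn(*PN, PHINode::getIncomingValueNumForOperand(U.getOperandNo()));
  }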
Differential Revision: https://reviews.llvm.org/D146999
Conditionally setting MaskForGaps is only needed for loads. This avoids
re-computing MaskForGaps for stores.
Suggested as independent cleanup in D147467.
If the value is used in the expression, the mask needs to be adjusted
before being applied. Also, fix the analysis of the phi nodes for
reused scalars.
Made the condition for erasing the gathered extractelements stricter:
remove one only if it has a single vectorized use, otherwise leave it
for instcombine/instsimplify analysis.