llvm-project

Author	SHA1	Message	Date
Sanjay Patel	4cdf30d9d3	[InstCombine] FP with reassoc FMF: (X * C) + X --> X * (MulC + 1.0) This fold already exists for scalars via FAddCombine (and that's why 2 of the tests are only changed cosmetically), but that code misses vectors and has largely been replaced by simpler folds over time, so this is another step towards removing it.	2022-01-17 10:38:05 -05:00
Florian Hahn	aa7f0e6a55	[DSE] Remove commented-out InvisibleToCallerBeforeRet. (NFC) This code is a leftover from earlier changes and should be removed.	2022-01-17 13:59:13 +00:00
Sanjay Patel	7037d110fa	[InstCombine] propagate IR flags from binop through select The tests with constant folding that produces poison could potentially remove the select entirely: https://alive2.llvm.org/ce/z/e-WUqF ...but this patch just removes the FMF-only limitation on propagation.	2022-01-17 08:42:48 -05:00
Florian Hahn	500fe60957	[VPlan] Drop unnecessary uses of getVPSingleValue (NFC).	2022-01-17 13:27:33 +00:00
Nikita Popov	12bee2c054	[GlobalOpt] Drop an incorrect check This was a last-minute addition to D117249, and of course I ended up inverting the condition in a way that caused an uninitialized memory read. I've dropped it entirely, as I don't think we actually care whether the size is zero or not here. The previous code wasn't checking this either.	2022-01-17 10:10:56 +01:00
Nikita Popov	499f1ca79f	[GlobalOpt] Use generic type when converting malloc to global The malloc to global transform currently determines the type of the global by looking at bitcasts of the malloc. This is limited (the transform fails if there are multiple different types) and incompatible with opaque pointers. My initial approach was to construct an appropriate struct type based on usage in loads/stores. What this patch does instead is to always create an [i8 x AllocSize] global, without trying to guess types at all. This does mean that other transforms that require a certain global type may break. I fixed two of these in D117034 and D117223, which I believe should be sufficient to avoid regressions. In particular, the global SRA change should end up splitting the global into naturally-typed sub-globals, at which point all other optimizations should work. Differential Revision: https://reviews.llvm.org/D117092	2022-01-17 09:55:33 +01:00
Nikita Popov	4796b4ae7b	[GlobalOpt] Make global SRA offset based Currently global SRA uses the GEP structure to determine how to split the global. This patch instead analyses the loads and stores that are performed on the global, and collects which types are used at which offset, and then splits the global according to those. This is both more general, and works fine with opaque pointers. This is also closer to how ordinary SROA is performed. Differential Revision: https://reviews.llvm.org/D117223	2022-01-17 09:28:36 +01:00
Nikita Popov	00b77d917c	[DSE] Remove alloc function check in canSkipDef() canSkipDef() currently skips inaccessiblememonly calls, but not if they are allocation functions. This check was added in D103009, but actually seems to be a leftover from a previous implementation in D101440. canSkipDef() is not used on the storeIsNoop() path, where the relevant transform ended up being implemented. Differential Revision: https://reviews.llvm.org/D117005	2022-01-17 09:23:51 +01:00
Florian Hahn	070d1034da	[LV] Restore metadata to disable runtime unrolling for epilogue loop. After d4a8fc3a87a1 LV stopped adding metadata to disable runtime unrolling to the vectorized epilogue loop. This was missed because 278aa65cc495 removed the relevant test coverage. This patch fixes that by adding the relevant metadata after vector loop generation.	2022-01-16 13:14:16 +00:00
Florian Hahn	62739204d4	[LV] Move AddRuntimeUnrollDisableMetaData so it can be used earlier (NFC) Move up the definition of AddRuntimeUnrollDisableMetaData, so it can be re-used earlier in the file in a follow-up patch.	2022-01-16 10:30:24 +00:00
Nikita Popov	c63a3175c2	[AttrBuilder] Remove ctor accepting AttributeList and Index Use the AttributeSet constructor instead. There's no good reason why AttrBuilder itself should extract the AttributeSet from the AttributeList. Moving this out of the AttrBuilder generally results in cleaner code.	2022-01-15 22:39:31 +01:00
Nikita Popov	d1675e4944	[AttrBuilder] Remove empty() / td_empty() methods The empty() method is a footgun: It only checks whether there are non-string attributes, which is not at all obvious from its name, and of dubious usefulness. td_empty() is entirely unused. Drop these methods in favor of hasAttributes(), which checks whether there are any attributes, regardless of whether these are string or enum attributes.	2022-01-15 17:57:18 +01:00
Florian Hahn	e00158ed5c	[LoopUtils] Use InstSimplifyFolder in addRuntimeChecks. Use the InstSimplifyFolder introduced earlier to perform initial simplification during runtime check construction.	2022-01-15 15:21:16 +00:00
Vitaly Buka	35d00fdc10	[msan] Reset shadow of byval before call If function is not sanitized we must reset shadow, not copy. Depends on D117285 Reviewed By: kda, eugenis Differential Revision: https://reviews.llvm.org/D117286	2022-01-14 22:35:43 -08:00
Quentin Colombet	a8ca4046e2	[LSR] Fix crash in Phi node with EHPad block This fixes a crash I observed in issue #48708 where the LSR pass tries to insert an instruction in a basic block with only a catchswitch statement in there. This happens because the Phi node being evaluated assumes the same value for different basic blocks. If the basic block associated with the incoming value of the operand being evaluated has an EHPad terminator LSR skips optimizing it. But if that incoming value can come from multiple different blocks there can be some incoming basic blocks which are terminated in an EHPad. If these are then rewritten in RewriteForPhi the ones containing an EHPad terminator will hit the "Insertion point must be a normal instruction" assert in AdjustInsertPositionForExpand. This fix makes CollectLoopInvariantFixupsAndFormulae also ignore cases where the same value has another incoming basic block with an EHPad, same as it already does in case the primary value has one. Patch by Lorenz Brun <lorenz@brun.one> Differential Revision: https://reviews.llvm.org/D98378	2022-01-14 18:53:18 -08:00
Vitaly Buka	0a46b6ec4e	[msan] Clear byval shadow in ignored functions If function has no sanitize_memory we still reset shadow for nested calls. The first return from getShadow() correctly returned shadow for argument, but it didn't reset shadow of byval pointee. Depends on D117277 Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D117278	2022-01-14 17:32:07 -08:00
Vitaly Buka	4959708502	[NFC][msan] Consolidate clean shadow handling Depends on D117276 Reviewed By: kda, eugenis Differential Revision: https://reviews.llvm.org/D117277	2022-01-14 17:06:39 -08:00
Vitaly Buka	18e4369e19	[NFC][msan] Don't setOrigin for byval pointer It's NFC because shadow of pointer is clean so origins will not be propagated anyway. Depends on D117275 Reviewed By: kda, eugenis Differential Revision: https://reviews.llvm.org/D117276	2022-01-14 16:42:26 -08:00
Heejin Ahn	c3a68c5d63	[SROA] Bail out on PHIs in catchswitch BBs In the process of rewriting `alloca`s and `phi`s that use them, the SROA pass can try to insert a non-PHI instruction by calling `getFirstInsertionPt()`, which is not possible in a catchswitch BB. This CL makes us bail out on these cases. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D117168	2022-01-14 14:55:07 -08:00
Congzhe Cao	fa6a2876c7	[LoopInterchange] Enable interchange with multiple inner loop indvars Currently loop interchange only supports loops with one inner loop induction variable. This patch adds support for transformation with more than one inner loop induction variables. The induction PHIs and induction increment instructions are moved/duplicated properly to the new outer header and the new outer latch, respectively. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D114917	2022-01-14 16:28:41 -05:00
Vitaly Buka	3552177229	[NFC][msan] Reorder branches in complex if Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D117274	2022-01-14 13:22:43 -08:00
Nadav Rotem	9551fc57b7	Fold ashr-exact into a icmp-ugt. This commit optimizes the code sequence: icmp-XXX (ashr-exact (X, C_1), C_2). Instcombine already implements this optimization for sgt, and this patch adds support to additional predicates. The transformation is legal for all predicates if the 'exact' flag is set, and to SGE, UGE, SLT, ULT when the exact flag is not present. This pattern is found in the std::vector bounds checks code of the at() method. Alive2 proof: https://alive2.llvm.org/ce/z/JT_WL8 Differential Revision: https://reviews.llvm.org/D117252	2022-01-14 12:58:44 -08:00
Jessica Paquette	acb8de565e	[JumpThreading] Change asserts for WantInteger into actual checks After e734e8286b4b521d829aaddb6d1cbbd264953625, it is possible to end up in a situation where an `indirectbr` is fed by a cast, which is in turn fed by an operation which only produces integers. `indirectbr` expects a block address, however these operations can't produce that. There were several asserts in `computeValueKnownInPredecessorsImpl` which check that we're not looking for a block address if we're walking through something which can never produce one. Since it's now possible to hit these asserts, this changes them into actual checks which return false if `Preference` is not `WantInteger`. This adds a testcase which verifies that we don't crash anymore in these situations. Differential Revision: https://reviews.llvm.org/D99814	2022-01-14 11:15:14 -08:00
Florian Hahn	42b34facfd	Recommit "[LV] Inline CreateSplatIV call for scalar VFs." This reverts the revert commit 073c27b5e5851f13d99d383e047309299b68827d. A reduced test case has been added in 5e4966cbae7ba5 and the code has been updated to handle the case where getInductionOpcode returns BinaryOpsEnd. In this case, the original code was always using Instruction::Add. Do the same in the patch. Note this commit may slightly change the value naming, because it now also assigns the 'induction' name in the floating point case.	2022-01-14 19:03:49 +00:00
Sanjay Patel	02455bea6b	[InstCombine] remove unnecessary use check on X >>exact == 0 fold The transform replaces one icmp with another, so we should not care if the shift has another use.	2022-01-14 12:52:16 -05:00
Florian Hahn	1ef9bfa013	[InstSimplify] Pass pointer and indices separately to SimplifyGEPInst. This doesn't require callers to put the pointer operand and the indices in a container like a vector when calling the function. This is not really an issue with the existing callers. But when using it from IRBuilder the inputs are available as separate pointer value and indices ArrayRef. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D117038	2022-01-14 09:59:52 +00:00
Caroline Concatto	8e5a5b619d	[InstCombine] Fold for masked scatters to a uniform address When masked scatter intrinsic does a uniform store to a destination address from a source vector, and in this case, the mask is all one value. This patch replaces the masked scatter with an extracted element of the last lane of the source vector and stores it in the destination vector. This patch also folds when the value in the masked scatter is a splat. In this case, the mask cannot be all zero, and it folds to a scalar store of the value in the destination pointer. Differential Revision: https://reviews.llvm.org/D115724	2022-01-14 09:44:34 +00:00
Bryce Wilson	28b6e2cb3d	[Attributor] [NFC] Use canonical variable name Differential Revision: https://reviews.llvm.org/D117241	2022-01-13 23:06:00 -08:00
Vitaly Buka	71a4fde397	[NFC][msan] Init few vars later	2022-01-13 22:00:37 -08:00
Vitaly Buka	36138d8252	[NFC][msan] Declare some getShadow vars later	2022-01-13 21:36:37 -08:00
James Y Knight	073c27b5e5	Revert "[LV] Inline CreateSplatIV call for scalar VFs (NFC)." Causes a crash with the following (creduce'd) test-case: clang -O3 '--target=aarch64-grtev4-linux-gnu' -xc - -c -o /dev/null <<EOF int e; int f; int g() { int h; int j = 0; while (&f - j > 0) { int k; k = j; if (e == j && *e) k = 5; h = k; j++; } return h; } EOF This reverts commit 7ce48be0fd83fb4fe3d0104f324bbbcfcc82983c.	2022-01-14 00:00:02 +00:00
Philip Reames	5d5d4d94f0	[Attributor] Generalize heap to stack to any allocator with relevant properties This completes removal of the isXLike queries, and depends on a whole series of earlier patches which have already landed. Differential Revision: https://reviews.llvm.org/D117242	2022-01-13 15:33:24 -08:00
Philip Reames	cf66f01ec1	[Attributor] Share code for abstract interpretation of allocation sizes with getObjectSize [NFC-ish] The basic idea is that we can parameterize the getObjectSize implementation with a callback which lets us replace the operand before analysis if desired. This is what Attributor is doing during its abstract interpretation, and allows us to have one copy of the code. Note this is not NFC for two reasons: * The existing attributor code is wrong. (Well, this is under-specified to be honest, but at least inconsistent.) The intermediate math needs to be done in the index type of the pointer space. Imagine e.g. i64 arguments in a 32 bit address space. * I did not preserve the behavior in getAPInt where we return 0 for a partially analyzed value. This looks simply wrong in the original code, and nothing test wise contradicts that. Differential Revision: https://reviews.llvm.org/D117241	2022-01-13 15:33:24 -08:00
Arthur Eubanks	9a0fe1b0fc	[Inline] Attempt to delete any discardable if unused functions Previously we limited ourselves to only internal/private functions. We can also delete linkonce_odr functions. Minor compile time wins: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=instructions Major memory wins on tramp3d: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=max-rss Relanding with fix for compile times D117236. Reviewed By: nikic, mtrofin Differential Revision: https://reviews.llvm.org/D115545	2022-01-13 14:48:38 -08:00
Arthur Eubanks	757e044dce	[Inliner] Don't removeDeadConstantUsers() when checking if a function is dead If a function has many uses, this can take a good chunk of compile times. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117236	2022-01-13 14:29:45 -08:00
Congzhe Cao	37e34b74e9	[LoopInterchange] Enable interchange with multiple outer loop indvars This patch enables loop interchange with multiple outer loop induction variables, and hence removes the limitation that only a single outer loop induction variable is supported. In fact, it turns out that the current pass already trivially supports multiple outer indvars, which is the result of a previous patch `https://reviews.llvm.org/D102743`. Therefore, this patch removed that limitation and provides test cases for multiple outer indvars. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D114916	2022-01-13 16:51:32 -05:00
Roman Lebedev	82c8aca934	[SimplifyCFG] Be more aggressive when sinking into block followed by unreachable I strongly believe we need some variant of this. The main problem is e.g. that the glibc's assert has 4 parameters, but the profitability check is only okay with one extra phi node, so D116692 doesn't even trigger on most of the expected cases. While that restriction probably makes sense in normal code, if we are about to run off of a cliff (into an `unreachable`), this successor block is unlikely so the cost to setup these PHI nodes should not be on the hotpath, and shouldn't matter performance-wise. Likewise, we don't sink if there are unconditional predecessors UNLESS we'd sink at least one non-speculatable instruction, which is a performance workaround, but if we are about to run into `unreachable`, it shouldn't matter. Note that we only allow the case where there are at most unconditional branches on the way to the unreachable block. Differential Revision: https://reviews.llvm.org/D117045	2022-01-13 23:30:31 +03:00
Florian Hahn	3f2fb767e3	[VPlan] Make IV operand explicit for VPWidenCanonicalIVRecipe (NFC). This makes the def-use relationship between VPCanonicalIVPHIRecipe and VPWidenCanonicalIVRecipe explicit. Needed for D117140.	2022-01-13 11:13:05 +00:00
Nikita Popov	1cbb456123	[GlobalOpt] Fix global to select transform under opaque pointers We need to check that the load/store type is also the same, as this is no longer implicitly checked through the pointer type.	2022-01-13 11:13:06 +01:00
Florian Hahn	7ce48be0fd	[LV] Inline CreateSplatIV call for scalar VFs (NFC). This is a NFC change split off from D116123, as suggested there. D116123 will remove the last user of CreateSplatIV.	2022-01-13 09:34:31 +00:00
James Y Knight	55fcbf0a84	Revert "[Inline] Attempt to delete any discardable if unused functions" Somehow this ends up causing an infinite loop in the inliner. This reverts commit d5be48c66d3e5e8be21805c3a33dc67a20e258be.	2022-01-13 03:06:47 +00:00
Philip Reames	9979299705	[Attributor] Simplify how we handle required alignment during heap-to-stack [NFC] The existing code duplicated the same concern in two places, and (weirdly) changed the inference of the allocation size based on whether we could meet the alignment requirement. Instead, just directly check the allocation requirement.	2022-01-12 17:34:17 -08:00
Philip Reames	d1f4c6a611	[Attributor] Generalize calloc handling in heap-to-stack for any init value [NFC] Rewrite the calloc specific handling in heap-to-stack to allow arbitrary init values. The basic problem being solved is that if an allocation is initialized to anything other than zero, this must be explicitly done for the formed alloca as well. This covers the calloc case today, but once a couple of earlier guards are removed in this code, downstream allocators with other init values could also be handled. Inspired by discussion on D116971	2022-01-12 16:58:39 -08:00
Philip Reames	8e76720cf2	[Attributor] Reuse object size evaluation code [NFC]	2022-01-12 16:58:39 -08:00
Philip Reames	db57065b36	[Attributor] Use getAllocAlignment where possible [NFC] Inspired by D116971.	2022-01-12 16:58:39 -08:00
Arthur Eubanks	fe827a93f6	[ModuleInliner] Properly delete dead functions Followup to D116964 where we only did this in the CGSCC inliner. Fixes leaks reported in D116964.	2022-01-12 09:57:43 -08:00
Arthur Eubanks	d5be48c66d	[Inline] Attempt to delete any discardable if unused functions Previously we limited ourselves to only internal/private functions. We can also delete linkonce_odr functions. Minor compile time wins: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=instructions Major memory wins on tramp3d: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=max-rss Reviewed By: nikic, mtrofin Differential Revision: https://reviews.llvm.org/D115545	2022-01-12 08:36:04 -08:00
Florian Hahn	d4a8fc3a87	[VPlan] Introduce and use BranchOnCount VPInstruction. This patch adds a new BranchOnCount VPInstruction opcode with 2 operands. It first compares its 2 operands (increment of canonical induction and vector trip count), followed by a branch to either the exit block or back to the vector header. It must be the last recipe in the exit block of the topmost vector loop region. This extracts parts from D113224 and was discussed in D113223. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116479	2022-01-12 13:42:13 +00:00
Rosie Sumpter	552eb372cb	[LoopVectorize] Pass a vector type to isLegalMaskedGather/Scatter This is required to query the legality more precisely in the LoopVectorizer. This adds another TTI function named 'forceScalarizeMaskedGather/Scatter' function to work around the hack introduced for MVE, where isLegalMaskedGather/Scatter would return an answer by second-guessing where the function was called from, based on the Type passed in (vector vs scalar). The new interface makes this explicit. It is also used by X86 to check for vector widths where gather/scatters aren't profitable (or don't exist) for certain subtargets. Differential Revision: https://reviews.llvm.org/D115329	2022-01-12 13:34:12 +00:00
Florian Hahn	e3275cfa94	[BuildLibCalls] Add nounwind,willreturn to memset_pattern{4,8,16}. Similar to memset, memset_pattern{4,8,16} all will return and do not unwind. Use fallthrough to include all attributes also set for memset. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D114904	2022-01-12 10:32:53 +00:00
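
Note on commit 9551fc57b7 (ashr-exact into icmp): the legality claim in that message can be checked by brute force on a small bit width. The sketch below is not LLVM code. It models 8-bit integers in Python and compares `icmp PRED (ashr X, C_1), C_2` against `icmp PRED X, (C_2 << C_1)`. The names `count_mismatches`, `ashr`, and `PREDICATES` are hypothetical helpers for this illustration, and the restriction to constants where `C_2 << C_1` loses no bits is an assumption of this sketch, not something the commit message states.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement value."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def ashr(v, k):
    """Arithmetic shift right of a BITS-wide bit pattern."""
    return (to_signed(v) >> k) & MASK


# icmp predicates, each taking two BITS-wide bit patterns.
PREDICATES = {
    "eq": lambda a, b: a == b,
    "ne": lambda a, b: a != b,
    "ugt": lambda a, b: a > b,
    "uge": lambda a, b: a >= b,
    "ult": lambda a, b: a < b,
    "ule": lambda a, b: a <= b,
    "sgt": lambda a, b: to_signed(a) > to_signed(b),
    "sge": lambda a, b: to_signed(a) >= to_signed(b),
    "slt": lambda a, b: to_signed(a) < to_signed(b),
    "sle": lambda a, b: to_signed(a) <= to_signed(b),
}


def count_mismatches(pred, require_exact):
    """Count (X, C_1, C_2) triples where the folded compare disagrees.

    With require_exact=True, only X values whose shifted-out bits are
    zero are considered (any other X would be poison under 'exact').
    """
    cmp = PREDICATES[pred]
    mismatches = 0
    for k in range(BITS):
        for c in range(1 << BITS):
            shifted = (c << k) & MASK
            # Assumption of this sketch: skip constants whose shift loses bits.
            if ashr(shifted, k) != c:
                continue
            for x in range(1 << BITS):
                if require_exact and x & ((1 << k) - 1):
                    continue
                if cmp(ashr(x, k), c) != cmp(x, shifted):
                    mismatches += 1
    return mismatches


if __name__ == "__main__":
    for name in PREDICATES:
        print(name,
              "exact:", count_mismatches(name, True),
              "non-exact:", count_mismatches(name, False))
```

A zero count means the two compares agree on every 8-bit input. This is only an exhaustive check at one width, not a proof for all widths. The Alive2 link in the commit message is the authoritative verification.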

29605 Commits