llvm-project

Author	SHA1	Message	Date
Roman Lebedev	6a563e2570	[NFC][SCEV][IndVars] Add more tests for exit count w/ `select` See https://github.com/llvm/llvm-project/issues/53020	2022-01-07 01:30:21 +03:00
Congzhe Cao	c251bfc3b9	[LoopInterchange] Remove a limitation in LoopInterchange legality There was a limitation in legality that in the original inner loop latch, no instruction was allowed between the induction variable increment and the branch instruction. This is because we used to split the inner latch at the induction variable increment instruction. Since now we have split at the inner latch branch instruction and have properly duplicated instructions over to the split block, we remove this limitation. Please refer to the test case updates to see how we now interchange loops where instructions exist between the induction variable increment and the branch instruction. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D115238	2022-01-06 15:56:32 -05:00
Alexey Bataev	d130df544d	[SLP]Improve reordering for the nodes being used in alternate vectorization. No need to take the order of the scalars being used as part of the alternate vectorization into account when trying to reorder the whole graph. Such elements are better reordered in the following phase because the subtree still ends up in a shuffle. Part of D116688, fixes the regression in D116690. Differential Revision: https://reviews.llvm.org/D116740	2022-01-06 11:18:57 -08:00
Alexey Bataev	7cb19fe493	[SLP]Initialize the lane with the given value instead of default 0. There is a bug in the reordering analysis stage. If the element with the given hash is not added to the map but has the same number of APOs and instructions with the same parent, but a different instruction opcode, it will be initialized with default values and then the counter is increased by 1. But the lane is not updated and defaults to 0 instead of the actual `Lane` value. As a result, the analysis is useless in many cases and defaults to lane 0 instead of the actual lane with the minimum number of APO operands. Differential Revision: https://reviews.llvm.org/D116690	2022-01-06 10:57:11 -08:00
Nikita Popov	918015c9ba	[EarlyCSE] Support opaque pointers Explicitly check the load/store value type, because this is no longer implicitly checked through the pointer type.	2022-01-06 17:08:50 +01:00
Alexey Bataev	bf5a688252	[SLP][NFC]Add a test for the extra shuffle after alternate node, NFC.	2022-01-06 06:34:58 -08:00
Nikita Popov	41a522779d	[LICM] Check for noalias call instead of alloc like fn When determining whether the memory is local to the function (and we can thus introduce spurious writes without thread-safety issues), check for a noalias call rather than the hardcoded list of memory allocation functions. Noalias calls are the more general way to determine allocation functions, as long as we're only interested in the property that the returned value is distinct from any other accessible memory. Differential Revision: https://reviews.llvm.org/D116728	2022-01-06 14:38:19 +01:00
Nikita Popov	f430c1eb64	[Tests] Add elementtype attribute to indirect inline asm operands (NFC) This updates LLVM tests for D116531 by adding elementtype attributes to operands that correspond to indirect asm constraints.	2022-01-06 14:23:51 +01:00
Florian Hahn	86d113a8b8	[SCEVExpand] Do not create redundant 'or false' for pred expansion. This patch updates SCEVExpander::expandUnionPredicate to not create redundant 'or false, x' instructions. While those are trivially foldable, they can be easily avoided and hinder code that checks the size/cost of the generated checks before further folds. I am planning to look into a few other similar improvements to code generated by SCEVExpander. I remember a while ago @lebedev.ri working on doing some trivial folds like that in IRBuilder itself, but there were concerns that such changes may subtly break existing code. Reviewed By: reames, lebedev.ri Differential Revision: https://reviews.llvm.org/D116696	2022-01-06 11:52:19 +00:00
Nikita Popov	0fa174398b	[LICM] Add test for noalias call (NFC) Add a test with a noalias call that is not a known allocation function.	2022-01-06 11:46:27 +01:00
Nikita Popov	c41aa41957	[ConstFold] Add missing check for inbounds gep If the gep is not inbounds, then the gep might compute a null value even if the base pointer is non-null.	2022-01-06 09:59:40 +01:00
Nikita Popov	37c9171764	[ConstantFold] Add test for invalid non-inbounds gep icmp fold The gep evaluated to null in this case, and as such is not ne null.	2022-01-06 09:59:40 +01:00
Congzhe Cao	8ade3d43a3	Revert "[LoopInterchange] Remove a limitation in LoopInterchange legality" This reverts commit 15702ff9ce28b3f4aafec13be561359d4c721595 while I investigate a ppc build bot failure at https://lab.llvm.org/buildbot#builders/36/builds/16051.	2022-01-05 23:34:36 -05:00
Congzhe Cao	15702ff9ce	[LoopInterchange] Remove a limitation in LoopInterchange legality There was a limitation in legality that in the original inner loop latch, no instruction was allowed between the induction variable increment and the branch instruction. This is because we used to split the inner latch at the induction variable increment instruction. Since now we have split at the inner latch branch instruction and have properly duplicated instructions over to the split block, we remove this limitation. Please refer to the test case updates to see how we now interchange loops where instructions exist between the induction variable increment and the branch instruction. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D115238	2022-01-05 22:37:54 -05:00
Philip Reames	cfcd7af8de	[instcombine] Add test coverage for (x >>u y) pred x [part 2]	2022-01-05 13:37:17 -08:00
Philip Reames	8cc52ca734	[instcombine] Add test coverage for (x >>u y) pred x	2022-01-05 13:34:05 -08:00
Philip Reames	4016d440fe	Precommit test for D116683	2022-01-05 11:57:15 -08:00
Philip Reames	1a97138a1c	Add test case from 356ada9	2022-01-05 11:19:16 -08:00
Philip Reames	58a0e449e1	[instcombine] Allow sinking of calls with known writes to uses If we have a call whose only side effect is a write to a location which is known to be dead, we can sink said call to the users of the call's result value. This is analogous to the recent changes to delete said calls if unused, but framed as a sinking transform instead. Differential Revision: https://reviews.llvm.org/D116200	2022-01-05 10:37:22 -08:00
Nico Weber	085f078307	Revert "Revert D109159 "[amdgpu] Enable selection of `s_cselect_b64`."" This reverts commit 859ebca744e634dcc89a2294ffa41574f947bd62. The change contained many unrelated changes and e.g. restored unit test failes for the old lld port.	2022-01-05 13:10:25 -05:00
Sanjay Patel	e2165e0968	[InstCombine] remove trunc user restriction for match of bswap This does not appear to cause any problems, and it fixes #50910 Extra tests with a trunc user were added with: 3a239379 ...but they don't match either way, so there's an opportunity to improve the matching further.	2022-01-05 13:04:11 -05:00
David Salinas	859ebca744	Revert D109159 "[amdgpu] Enable selection of `s_cselect_b64`." This reverts commit 640beb38e7710b939b3cfb3f4c54accc694b1d30. That commit caused performance degradation in Quicksilver test QS:sGPU and a functional test failure in (rocPRIM rocprim.device_segmented_radix_sort). Reverting until we have a better solution to s_cselect_b64 codegen cleanup Change-Id: Ibf8e397df94001f248fba609f072088a46abae08 Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D115960 Change-Id: Id169459ce4dfffa857d5645a0af50b0063ce1105	2022-01-05 17:57:32 +00:00
Sanjay Patel	3a2393795f	[InstCombine] add tests for bswap; NFC	2022-01-05 08:33:04 -05:00
Sanjay Patel	4a8c0aa094	[InstSimplify] add tests for udiv/urem with known bits; NFC	2022-01-05 08:33:04 -05:00
Nikita Popov	6e474d3308	[GlobalOpt][Evaluator] Fix off by one error in bounds check (PR53002) We should bail out if the index is >= the size, not > the size. Fixes https://github.com/llvm/llvm-project/issues/53002.	2022-01-05 14:06:02 +01:00
Sander de Smalen	95a93722db	[LV] Remove what seems like stale code in collectElementTypesForWidening. This was originally added in rG22174f5d5af1eb15b376c6d49e7925cbb7cca6be although that patch doesn't really mention any reasons for ignoring the pointer type in this calculation if the memory access isn't consecutive. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D115356	2022-01-05 12:20:59 +00:00
Nikita Popov	3dc1907d06	[ConstantFold] Use ConstantFoldLoadFromUniformValue() in more places In particular, this also preserves undef when loading from padding, rather than converting it to zero through a different codepath. This is the remaining part of D115924.	2022-01-05 12:47:50 +01:00
Nikita Popov	4e62d210c4	[ConstantFold] Add test for load of padding (NFC) This currently loads zero rather than undef.	2022-01-05 12:47:49 +01:00
Nikita Popov	99c6b12b92	[ConstantFolding] Unify handling of load from uniform value There are a number of places that specially handle loads from a uniform value where all the bits are the same (zero, one, undef, poison), because we a) don't care about the load offset in that case b) it bypasses casts that might not be legal generally but do work with uniform values. We had multiple implementations of this, with a different set of supported values each time. This replaces two usages with a more complete helper. Other usages will be replaced separately, because they have larger impact. This is part of D115924.	2022-01-05 12:30:46 +01:00
Nikita Popov	00686ab4af	[ConstantFold] Add additional load from uniform value tests (NFC)	2022-01-05 12:30:46 +01:00
Benjamin Kramer	5f0a349738	Revert "Revert "[InferAttrs] Add writeonly to all the math functions"" This reverts commit 29b6e967f3e99ac45340ea37a70262c70e4e7528. The bug it found in PartiallyInlineLibCalls was fixed in c8ffc73350dbb6044ca947bbead127b9b914cdf3.	2022-01-05 12:16:35 +01:00
Benjamin Kramer	c8ffc73350	[PartiallyInlineLibCalls] Don't crash when there's a writeonly attribute on the call readnone subsumes writeonly, so just swap out the attributes. The verifier doesn't allow us to have both on a call.	2022-01-05 12:16:26 +01:00
Florian Hahn	65c4d6191f	[VPlan] Add VPCanonicalIVPHIRecipe, partly retire createInductionVariable. At the moment, the primary induction variable for the vector loop is created as part of the skeleton creation. This is tied to creating the vector loop latch outside of VPlan. This prevents from modeling the whole vector loop in VPlan, which in turn is required to model preheader and exit blocks in VPlan as well. This patch introduces a new recipe VPCanonicalIVPHIRecipe to represent the primary IV in VPlan and CanonicalIVIncrement{NUW} opcodes for VPInstruction to model the increment. This allows us to partly retire createInductionVariable. At the moment, a bit of patching up is done after executing all blocks in the plan. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D113223	2022-01-05 10:46:06 +00:00
Martin Storsjö	29b6e967f3	Revert "[InferAttrs] Add writeonly to all the math functions" This reverts commit ea75be3d9df448b6abafaf752a8141764d93ca33 and 1eb5b6e85045d22720f177a02aaf7097930e4b4f. That commit caused crashes with compilation e.g. like this (not fixed by the follow-up commit): $ cat sqrt.c float a; b() { sqrt(a); } $ clang -target x86_64-linux-gnu -c -O2 sqrt.c Attributes 'readnone and writeonly' are incompatible! %sqrtf = tail call float @sqrtf(float %0) #1 in function b fatal error: error in backend: Broken function found, compilation aborted!	2022-01-05 11:12:19 +02:00
Nikita Popov	00e6869463	[MemCpyOpt] Look through pointer casts when checking capture The user scanning loop above looks through pointer casts, so we also need to strip pointer casts in the capture check. Previously the source was incorrectly considered not captured if a bitcast was passed to the call.	2022-01-05 09:50:33 +01:00
Nikita Popov	487a34ed9d	[MemCpyOpt] Make capture check during call slot optimization more precise Call slot optimization is currently supposed to be prevented if the call can capture the source pointer. Due to an implementation bug, this check currently doesn't trigger if a bitcast of the source pointer is passed instead. I'm somewhat afraid of the fallout of fixing this bug (due to heavy reliance on call slot optimization in rust), so I'd like to strengthen the capture reasoning a bit first. In particular, I believe that the capture is fine as long as a) the call itself cannot depend on the pointer identity, because neither dest has been captured before/at nor src before the call and b) there is no potential use of the captured pointer before the lifetime of the source alloca ends, either due to lifetime.end or a return from a function. At that point the potentially captured pointer becomes dangling. Differential Revision: https://reviews.llvm.org/D115615	2022-01-05 09:39:25 +01:00
Nikita Popov	c2e77c9122	[MemCpyOpt] Add additional call slot capture tests (NFC)	2022-01-05 09:33:04 +01:00
Nikita Popov	787f86e68c	[GlobalOpt][Evaluator] Don't create bitcast for same type (PR52994) isBitOrNoopPointerCastable() returns true if the types are the same, but it's not actually possible to create a bitcast for all such types. The assumption seems to be that the user will omit creating the cast in that case, as it is unnecessary. Fixes https://github.com/llvm/llvm-project/issues/52994.	2022-01-05 09:17:07 +01:00
Fangrui Song	1eb5b6e850	[InferAttrs] If readonly is already set, set readnone instead of writeonly D116426 may lead to an assertion failure `Attributes 'readonly and writeonly' are incompatible!` if the builtin function already has `readonly`.	2022-01-04 18:59:35 -08:00
Chuanqi Xu	c75cedc237	[Coroutines] Set presplit attribute in Clang and mlir This fixes bug49264. In short, a coroutine shouldn't be inlined before CoroSplit. The marker for a pre-split coroutine is created in the CoroEarly pass, which runs after the AlwaysInliner pass in the O0 pipeline, so the AlwaysInliner couldn't detect that it shouldn't inline a coroutine, hence the error. This patch sets the presplit attribute in clang and mlir, so the inliner always detects the attribute before splitting. Reviewed By: rjmccall, ezhulenev Differential Revision: https://reviews.llvm.org/D115790	2022-01-05 10:25:02 +08:00
Philip Reames	11a46b1749	precommit tests for a planned followon to D116200	2022-01-04 12:02:25 -08:00
Philip Reames	1be54bc764	precommit additional tests for D116200	2022-01-04 11:50:44 -08:00
Philip Reames	0b09313cd5	[funcattrs] Infer writeonly argument attribute [part 2] This builds on the code from D114963, and extends it to handle calls both direct and indirect. With the revised code structure (from series of previously landed NFCs), this is pretty straightforward. One thing to note is that we cannot infer writeonly for arguments which might be captured. If the pointer can be read back by the caller, and then read through, we have no way to track that. This is the same restriction we have for readonly, except that we get no mileage out of the "callee can be readonly" exception since a writeonly param on a readonly function is either a) readnone or b) UB. This means we can't actually infer much unless nocapture has already been inferred. Differential Revision: https://reviews.llvm.org/D115003	2022-01-04 09:07:54 -08:00
Benjamin Kramer	ea75be3d9d	[InferAttrs] Add writeonly to all the math functions All of these functions would be `readnone`, but can't be on platforms where they can set `errno`. A `writeonly` function with no pointer arguments can only write (but never read) global state. Writeonly theoretically allows these calls to be CSE'd (a writeonly call with the same arguments will always result in the same global stores) or hoisted out of loops, but that's not implemented currently. There are a few functions in this list that could be `readnone` instead of `writeonly`, if someone is interested. Differential Revision: https://reviews.llvm.org/D116426	2022-01-04 16:58:05 +01:00
Florian Hahn	d8276208be	[LAA] Remove overeager assertion for aggregate types. 0a00d64 turned an early exit here into an assertion, but the assertion can be triggered, as PR52920 shows. The later code is agnostic to the accessed type, so just drop the assert. The patch also adds tests for LAA directly and loop-load-elimination to show the behavior is sane.	2022-01-04 15:20:35 +00:00
Nikita Popov	6c031780aa	[ConstantFold] Remove another incorrect icmp of gep fold This folded (null + X) == g to false, but of course this is incorrect if X == g. Possibly this got confused with the null == g case, which is already handled elsewhere.	2022-01-04 16:08:09 +01:00
Nikita Popov	25448826dd	[InstSimplify] Update test to make miscompile more obvious (NFC) This is now testing (null + g3) != g3 and still coming up with "true" as the answer. The original case was a less obvious miscompile with index overflow involved.	2022-01-04 16:08:09 +01:00
Nikita Popov	d74212987b	[ConstantFold] Remove unnecessary bounded index restriction The fold for merging a GEP of GEP into a single GEP currently bails if doing so would result in notional overindexing. The justification given in the comment above this check is dangerously incorrect: GEPs with notional overindexing are perfectly fine, and if some code treats them incorrectly, then that code is broken, not the GEP. Such a GEP might legally appear in source IR, so only preventing its creation cannot be sufficient. (The constant folder also ends up canonicalizing the GEP to remove the notional overindexing, but that's neither here nor there.) This check dates back to `bd4fef4a89`, and as far as I can tell the original issue this was trying to patch around has since been resolved. Differential Revision: https://reviews.llvm.org/D116587	2022-01-04 15:23:09 +01:00
Nikita Popov	75db002725	[ConstantFold] Remove another incorrect icmp of GEP fold This fold is not correct, because indices might evaluate to zero even if they are not a literal zero integer. Additionally, this fold would be wrong (in the general case) for non-i8 types as well, due to index overflow. Drop this fold and instead let the target-dependent constant folder compute the actual offset and fold the comparison based on that.	2022-01-04 12:27:40 +01:00
Nikita Popov	aefab6f8d5	[InstSimplify] Use weak symbol in test to show miscompile (NFC) This fold is incorrect, because it assumes that all indices are non-zero. This happens to be true for the test as written, but doesn't hold if we use an extern weak global instead, for which ptrtoint might be zero. Add separate tests for the simple constant int case.	2022-01-04 12:27:40 +01:00


20587 Commits