llvm-project

Author	SHA1	Message	Date
Sanjay Patel	8d76fbb5f0	[VectorCombine] fix crashing on match of non-canonical fneg We can't assume that operand 0 is the negated operand because the matcher handles "fsub -0.0, X" (and also +0.0 with FMF). By capturing the extract within the match, we avoid the bug and make the transform more robust (can't assume that this pass will only see canonical IR).	2022-10-17 10:47:48 -04:00
Nikita Popov	779fd39684	Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify Relative to the previous attempt, this is rebased over the InstSimplify fix in ac74e7a7806480a000c9a3502405c3dedd8810de, which addresses the miscompile reported in PR58401. ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes https://github.com/llvm/llvm-project/issues/57448, which was my original motivation for the change. Differential Revision: https://reviews.llvm.org/D134954	2022-10-17 16:11:05 +02:00
Nikita Popov	291924a6f9	[InstCombine] Add test for PR58401 (NFC)	2022-10-17 15:36:54 +02:00
Florian Hahn	699396131f	Revert "Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify" This reverts commit 333246b48ea4a70842e78c977cc92d365720465f. It looks like this patch causes a mis-compile: https://github.com/llvm/llvm-project/issues/58401 Fixes #58401.	2022-10-17 12:56:28 +01:00
Nikita Popov	436fb27186	[BasicAA] Support loop phis in pointsToConstantMemory() When looking for underlying objects, if we encounter one that we have already seen, then we should skip it (as it has already been checked) rather than bail out. In particular, this adds support for the case where we have a loop use of a phi recurrence.	2022-10-17 12:34:55 +02:00
Nikita Popov	aa89f08afa	[BasicAA] Add tests for constant memory with loop phi (NFC)	2022-10-17 12:32:15 +02:00
Max Kazantsev	95935d3f6d	[Test] Add tests showing that instcombine does not deal with freeze(load !range)	2022-10-17 12:08:49 +07:00
Max Kazantsev	221411ea12	[Test][NFC] Regenerate test check using update_tests script	2022-10-17 12:07:46 +07:00
Chuanqi Xu	1cedc51ff5	[Coroutines] Don't merge readnone calls in presplit coroutines Another alternative to fix the thread identification problem in coroutines. We plan to fix this problem by unifying memory effecting attributes. See https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. But it may be a long-term project. And it is a pity that the coroutines can't resume in different threads for years. So this one is a temporary fix. It may cause unnecessary performance regression for coroutines. But correctness is more important. And this one is planned to be reverted after we are able to unify the memory effecting attributes actually. Reviewed By: jdoerfert, rjmccall Differential Revision: https://reviews.llvm.org/D135550	2022-10-17 10:22:43 +08:00
Florian Hahn	aec0c1009f	[ConstraintElim] Replace custom GEP index handling by using existing code Instead of duplicating the existing decomposition code for GEP indices just use the existing code by calling the existing decompose function on the index expression and multiply the result's coefficients by the scale of the index. This both reduces code duplication and generalizes the pattern we can handle.	2022-10-16 21:53:11 +01:00
Florian Hahn	a4635ec710	[ConstraintElim] Support `add nsw` for unsigned preds with positive ops. If both operands of an `add nsw` are known positive, it can be treated the same as `add nuw` and added to the unsigned system. https://alive2.llvm.org/ce/z/6gprff	2022-10-16 20:25:14 +01:00
Sanjay Patel	e5ee0b06d6	[InstCombine] try to determine "exact" for sdiv If the divisor is a power-of-2 or negative-power-of-2 and the dividend is known to have >= trailing zeros than the divisor, the division is exact: https://alive2.llvm.org/ce/z/UGBksM (general proof) https://alive2.llvm.org/ce/z/D4yPS- (examples based on regression tests) This isn't the most direct optimization (we could create ashr in these examples instead of relying on existing folds for exact divides), but it's possible that there's a more general constraint than just a pow2 divisor, so this might be extended in the future. This should solve issue #58348. Differential Revision: https://reviews.llvm.org/D135970	2022-10-16 10:59:56 -04:00
Sanjay Patel	78e3aeda3c	[InstCombine] add tests for sdiv with (neg)pow2 divisor; NFC	2022-10-16 10:59:56 -04:00
Florian Hahn	067b744dbb	[ConstraintElim] Add tests for add nsw with unsigned predicates.	2022-10-16 15:51:33 +01:00
Florian Hahn	7c1b80e35c	[ConstraintElim] Support unsigned decomposition of mul/shl nuw..const Support decomposition for `mul/shl nuw` with constant operand for unsigned queries. Those expressions should not wrap in the unsigned sense and can be added directly to the unsigned system.	2022-10-15 21:28:08 +01:00
Florian Hahn	f12684d36e	[ConstraintElim] Support signed decomposition of `add nsw`. Add support for decomposition of `add nsw` for signed queries. `add nsw` won't wrap and can be directly added to the signed system.	2022-10-15 18:34:03 +01:00
Zequan Wu	82035ec777	Revert "[PGO] Make emitted symbols hidden" This reverts commit ecac223b0e4b05a65cf918f90824380db6b9ce64. The commit causes instrprof-darwin-dead-strip.c to fail on mac.	2022-10-14 15:23:26 -07:00
Florian Hahn	16cf666bb7	[Loop] Move block and loop dispo invalidation to makeLoopInvariant. makeLoopInvariant may recursively move its operands to make them invariant, before moving the passed in instruction. Those recursively moved instructions are currently missed when invalidating block and loop dispositions. To address this, move the invalidation code to Loop::makeLoopInvariant. Fixes #58314. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D135909	2022-10-14 21:58:14 +01:00
Argyrios Kyrtzidis	d877e3fe71	[Transforms/ObjCARC] Fix non-deterministic output of `ObjCARCOptPass` `ProvenanceAnalysis::related()` was assuming that the order of parameters for `relatedCheck()` was not affecting the result but this was not the case when both parameters were `PHINode`s. Due to this assumption `ProvenanceAnalysis::related()` was ordering the parameters based on pointer value which resulted in non-deterministic behavior. To address this change `relatedPHI()` so that it gives the same result independent of the parameter order. rdar://100325456 Differential Revision: https://reviews.llvm.org/D135376	2022-10-14 12:26:58 -07:00
Craig Topper	44f0b13494	[RISCV] Correct RISCVTTIImpl::getRegUsageForType for vectors of pointers. getPrimitiveSizeInBits returns 0 for pointers, we need to query the size via DataLayout instead. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135976	2022-10-14 11:34:12 -07:00
chenglin.bi	a43c0974f0	[SimplifyCFG] Add tests for simplifycfg, switch to lookup table with i2 types; NFC	2022-10-15 02:25:27 +08:00
Florian Hahn	fb3e2bef4c	[ConstraintElim] Add test cases for shl and mul.	2022-10-14 16:59:13 +01:00
Matt Arsenault	d0750ec475	AtomicExpand: Avoid some operations if the atomic is overaligned Let some of the pointer bithacking fold away if we know the LSB are 0.	2022-10-13 23:31:00 -07:00
Alexandros Lamprineas	25162418c6	[NFC][FuncSpec] Add a test to show redundant function cloning. Happens when we find identical specializations. Differential Revision: https://reviews.llvm.org/D135459	2022-10-13 23:00:23 +01:00
Wolfgang Pieb	b43a1d1bd9	[PGO] Do not create block count annotations when all weights are 0, avoiding an assertion. A BB with a nonzero count, whose successor blocks all have 0 counts, could cause an assertion. Don't create any branch weights in this case. Reviewed By: xur Differential Revision: https://reviews.llvm.org/D134203	2022-10-13 14:57:42 -07:00
Sanjay Patel	d85505a932	[InstCombine] fold logical and/or to xor (A | B) & ~(A & B) --> A ^ B https://alive2.llvm.org/ce/z/qpFMns We already have the equivalent fold for real logic instructions, but this pattern may occur with selects too. This is part of solving issue #58313.	2022-10-13 16:12:20 -04:00
Sanjay Patel	b78306c9f7	[InstCombine] add tests for logical select xor folds; NFC issue #58313	2022-10-13 16:12:20 -04:00
Florian Hahn	572d5d374c	[ConstraintElim] Add support for GEPs with multiple indices. Lift restriction on GEPs with a single index by iterating over all indices and joining the {Coefficient, Variable} entries for all indices together.	2022-10-13 21:08:33 +01:00
Florian Hahn	52fdbbd86d	[ConstraintElim] Add nested GEP test with scalable vectors.	2022-10-13 20:58:11 +01:00
Alex Brachet	ecac223b0e	[PGO] Make emitted symbols hidden This was reverted because it was breaking when targeting Darwin which tried to export these symbols which are now hidden. It should be safe to just stop attempting to export these symbols in the clang driver, though Apple folks will need to change their TAPI allow list described in the commit where these symbols were originally exported `f538018562` Bug: https://github.com/llvm/llvm-project/issues/58265 Differential Revision: https://reviews.llvm.org/D135340	2022-10-13 19:47:15 +00:00
Nikita Popov	f386f7690d	[MemCpyOpt] Add additional tests with lifetime intrinsics (NFC)	2022-10-13 17:29:59 +02:00
Nikita Popov	19aa1aab2e	[MemCpyOpt] Don't run full pipeline in test (NFC) Just memcpyopt is enough for this test.	2022-10-13 17:03:44 +02:00
Florian Hahn	518bccfd6e	[LV] Add epilogue test with variable induction start value. Add additional test mentioned by @venkataramanan.kumar.llvm in D92132.	2022-10-13 15:56:27 +01:00
Alexey Bataev	c787986cdd	[SLP]Improve costs of vectorized loads/stores by analyzing GEPs. When generating masked gathers nodes, SLP vectorizer accounts for the cost of the GEPs for loads as part of the scalar-vector transformation cost estimation. But it does not do it for vectorized loads/stores, while it may remove some of the GEPs completely. Because of this in some cases masked gather operation can be much more profitable rather than regular vectorization (masked-gather cost + vector GEP - scalar loads + GEPs comparing to vectorized loads - scalar loads). Added the analysis of the removed scalar GEPs for vectorized load/store nodes for better cost estimation. Differential Revision: https://reviews.llvm.org/D135282	2022-10-13 07:20:41 -07:00
Philip Reames	fe755af3a9	Revert "Remove PlaceSafepoints pass" This reverts commit cb66e123c6bc82a793300b6fb3ecbed79c58f557. It was reported via https://reviews.llvm.org/rGcb66e123c6bc82a793300b6fb3ecbed79c58f557#1132969 that the Microsoft.NET compiler is still using this pass.	2022-10-13 07:17:25 -07:00
Matt Devereau	be0d427a14	[VectorCombine] Add insertelement-shufflevector VectorCombine tests This is a precommit which adds some tests to show the functionality of an upcoming VectorCombine optimization	2022-10-13 14:10:06 +00:00
Nikita Popov	86126dbc15	[FunctionAttrs] Regenerate test checks (NFC)	2022-10-13 11:24:07 +02:00
Florian Hahn	359bc5c541	[ConstraintElim] Bail out for GEPs when index size > 64 bits. Limit pointer decomposition to pointers with index sizes of at most 64 bits. int64_t is used for coefficients, so as long as the index size <= 64 bits we should be able to represent all pointer offsets. Pointer decomposition is limited to inbounds GEPs, so if an index computation would overflow the result is poison, so it doesn't matter that the coefficient overflows. This allows replacing MulOverflow with regular multiplications.	2022-10-13 10:19:30 +01:00
Bjorn Pettersson	3be72f4029	[test][SLPVectorizer] Use -passes syntax in RUN lines. NFC	2022-10-13 10:44:38 +02:00
Bjorn Pettersson	f15ed06a65	[test][IndVarSimplify] Use -passes syntax in RUN lines. NFC	2022-10-13 10:44:37 +02:00
Bjorn Pettersson	8f527e08a5	[test][AggressiveInstCombine] Use -passes syntax in RUN lines. NFC	2022-10-13 10:44:37 +02:00
Bjorn Pettersson	f497a00da9	[test][DSE] Use -passes=dse instead of -dse in lit tests. NFC	2022-10-13 10:44:37 +02:00
Nikita Popov	e74390cc96	[FunctionAttrs] Convert tests to use opaque pointers (NFC) Conversion performed using the script at: https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34	2022-10-13 10:38:11 +02:00
Nikita Popov	45e595880a	[FunctionAttrs] Regenerate test checks (NFC)	2022-10-13 10:35:38 +02:00
Nikita Popov	5b3776842f	[FunctionAttrs] Account for memory effects of inalloca/preallocated The code for inferring memory attributes on arguments claims that inalloca/preallocated arguments are always clobbered: `d71ad41080/llvm/lib/Transforms/IPO/FunctionAttrs.cpp (L640-L642)` However, we would still infer memory attributes for the whole function without taking this into account, so we could still end up inferring readnone for the function. This adds an argument clobber if there are any inalloca/preallocated arguments. Differential Revision: https://reviews.llvm.org/D135783	2022-10-13 10:20:17 +02:00
Florian Hahn	e143e52c22	[ConstraintElimination] Add tests with 128 bit pointers.	2022-10-12 19:49:29 +01:00
Benjamin Maxwell	14b9505be9	Add test to show missed optimization for masked load/stores This test shows instcombine failing to remove an alloca and memcpy for a constant array that is read with a masked load. This will be addressed in a subsequent commit.	2022-10-12 17:43:54 +00:00
Sanjay Patel	23fa3031ff	[InstCombine] add test for udiv with shl divisor; NFC This would solve an example from issue #58137 more generally, but it may require adding a canonicalization for shift + shift to shift + add.	2022-10-12 11:53:02 -04:00
Sanjay Patel	7b9482df3d	[InstCombine] fold sdiv with common shl amount in operands (X << Z) / (Y << Z) --> X / Y https://alive2.llvm.org/ce/z/CLKzqT This requires a surprising "nuw" constraint because we have to guard against immediate UB via signed-div overflow with -1 divisor. This extends 008a89037a49ca0d9 and is another transform derived from issue #58137.	2022-10-12 11:32:15 -04:00
Alexey Bataev	d71ad41080	[SLP]Fix insertpoint of the extractelement instructions to avoid reshuffle crash. Need to set the insertpoint for extractelement to point to the first instruction in the node to avoid possible crash during external uses combine process. Without it we may end up with the incorrect transformation. Differential Revision: https://reviews.llvm.org/D135591	2022-10-12 08:18:30 -07:00
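
Two of the InstCombine entries above (d85505a932 and e5ee0b06d6) state arithmetic facts that can be spot-checked by brute force over a small bit width. The following is a minimal sketch for illustration only, not code from LLVM: the 8-bit width and all helper names are assumptions, and it models plain two's-complement integers, so it says nothing about poison, fast-math flags, or the select form of the logical and/or. The linked Alive2 proofs remain the authoritative check.

```python
# Hypothetical brute-force check; BITS and every helper name here are
# illustrative assumptions, not anything taken from the LLVM sources.
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(bits):
    """Interpret a BITS-wide bit pattern as a two's-complement integer."""
    bits &= MASK
    return bits - (1 << BITS) if bits >> (BITS - 1) else bits


def trailing_zeros(bits):
    """Count trailing zero bits of a BITS-wide pattern (BITS for zero)."""
    bits &= MASK
    if bits == 0:
        return BITS
    return (bits & -bits).bit_length() - 1


def check_or_and_not_is_xor():
    """(A | B) & ~(A & B) == A ^ B for every pair of BITS-wide values."""
    return all(
        ((a | b) & ~(a & b) & MASK) == (a ^ b)
        for a in range(1 << BITS)
        for b in range(1 << BITS)
    )


def check_sdiv_pow2_exact():
    """If the divisor is +/- a power of 2 and the dividend has at least as
    many trailing zeros as the divisor, the signed division has no remainder."""
    for d_bits in range(1, 1 << BITS):
        d = to_signed(d_bits)
        magnitude = abs(d)
        if magnitude & (magnitude - 1):
            continue  # divisor is not a power of 2 or a negated power of 2
        for n_bits in range(1 << BITS):
            if trailing_zeros(n_bits) >= trailing_zeros(d_bits):
                if to_signed(n_bits) % d != 0:
                    return False
    return True


if __name__ == "__main__":
    print("xor identity holds:", check_or_and_not_is_xor())
    print("sdiv exactness holds:", check_sdiv_pow2_exact())
```

An exhaustive loop is used because at 8 bits the whole input space is only 65,536 pairs; it gives no guarantee for wider types.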

23332 Commits