23332 Commits

Author SHA1 Message Date
Sanjay Patel
8d76fbb5f0 [VectorCombine] fix crashing on match of non-canonical fneg
We can't assume that operand 0 is the negated operand because
the matcher handles "fsub -0.0, X" (and also +0.0 with FMF).

By capturing the extract within the match, we avoid the bug
and make the transform more robust (can't assume that this
pass will only see canonical IR).
2022-10-17 10:47:48 -04:00
Nikita Popov
779fd39684 Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
Relative to the previous attempt, this is rebased over the
InstSimplify fix in ac74e7a7806480a000c9a3502405c3dedd8810de,
which addresses the miscompile reported in PR58401.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operand constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
2022-10-17 16:11:05 +02:00
Nikita Popov
291924a6f9 [InstCombine] Add test for PR58401 (NFC) 2022-10-17 15:36:54 +02:00
Florian Hahn
699396131f Revert "Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify"
This reverts commit 333246b48ea4a70842e78c977cc92d365720465f.

It looks like this patch causes a mis-compile:
https://github.com/llvm/llvm-project/issues/58401

Fixes #58401.
2022-10-17 12:56:28 +01:00
Nikita Popov
436fb27186 [BasicAA] Support loop phis in pointsToConstantMemory()
When looking for underlying objects, if we encounter one that we
have already seen, then we should skip it (as it has already been
checked) rather than bail out. In particular, this adds support
for the case where we have a loop use of a phi recurrence.
2022-10-17 12:34:55 +02:00
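The skip-instead-of-bail-out behaviour described in 436fb27186 can be sketched as a worklist walk. This is a simplified model, not the BasicAA implementation: the `operands` map, the `is_constant` predicate and the value names are hypothetical stand-ins for phi/select operands and constant globals.

```python
def points_to_constant_memory(start, operands, is_constant):
    """Hypothetical model of the underlying-object walk.

    operands: dict mapping a phi/select-like value to the values it may forward.
    is_constant: predicate saying whether a value is known constant memory.
    """
    visited = set()
    worklist = [start]
    while worklist:
        value = worklist.pop()
        if value in visited:
            # Already checked: skip it. Bailing out here instead would
            # reject every loop phi that feeds back into itself.
            continue
        visited.add(value)
        if is_constant(value):
            continue
        forwarded = operands.get(value)
        if not forwarded:
            # Unknown object that is not constant memory.
            return False
        worklist.extend(forwarded)
    return True


# A phi recurrence: p = phi(g, p), where g is a constant global.
loop_phi = {"p": ["g", "p"]}
print(points_to_constant_memory("p", loop_phi, lambda v: v == "g"))
```

With the revisit treated as a skip, the recurrence above resolves to the constant global alone.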
Nikita Popov
aa89f08afa [BasicAA] Add tests for constant memory with loop phi (NFC) 2022-10-17 12:32:15 +02:00
Max Kazantsev
95935d3f6d [Test] Add tests showing that instcombine does not deal with freeze(load !range) 2022-10-17 12:08:49 +07:00
Max Kazantsev
221411ea12 [Test][NFC] Regenerate test check using update_tests script 2022-10-17 12:07:46 +07:00
Chuanqi Xu
1cedc51ff5 [Coroutines] Don't merge readnone calls in presplit coroutines
Another alternative to fix the thread identification problem in
coroutines.

We plan to fix this problem by unifying the memory effect attributes. See
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
But that may be a long-term project, and it is a pity that coroutines
have been unable to resume on different threads for years. So this is a
temporary fix. It may cause unnecessary performance regressions for
coroutines, but correctness is more important. This change is planned to
be reverted once we are actually able to unify the memory effect attributes.

Reviewed By: jdoerfert, rjmccall

Differential Revision: https://reviews.llvm.org/D135550
2022-10-17 10:22:43 +08:00
Florian Hahn
aec0c1009f [ConstraintElim] Replace custom GEP index handling by using existing code
Instead of duplicating the decomposition code for GEP indices, call the
existing decompose function on the index expression and multiply the
resulting coefficients by the scale of the index.

This both reduces code duplication and generalizes the pattern we can
handle.
2022-10-16 21:53:11 +01:00
Florian Hahn
a4635ec710 [ConstraintElim] Support add nsw for unsigned preds with positive ops.
If both operands of an `add nsw` are known positive, it can be treated
the same as `add nuw` and added to the unsigned system.

https://alive2.llvm.org/ce/z/6gprff
2022-10-16 20:25:14 +01:00
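The claim in a4635ec710 can be checked exhaustively at a small bit width. The sketch below assumes "known positive" means non-negative and uses 8-bit integers purely for illustration; the Alive2 link in the message remains the authoritative proof.

```python
def check_add_nsw_implies_nuw(bits=8):
    """For non-negative a, b: if a + b does not wrap signed, it does not wrap unsigned."""
    signed_max = (1 << (bits - 1)) - 1
    unsigned_max = (1 << bits) - 1
    checked = 0
    for a in range(signed_max + 1):
        for b in range(signed_max + 1):
            if a + b <= signed_max:          # the nsw flag holds
                assert a + b <= unsigned_max  # so nuw holds as well
                checked += 1
    return checked


print(check_add_nsw_implies_nuw())
```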
Sanjay Patel
e5ee0b06d6 [InstCombine] try to determine "exact" for sdiv
If the divisor is a power-of-2 or negative-power-of-2 and the dividend
is known to have at least as many trailing zeros as the divisor, the division is exact:
https://alive2.llvm.org/ce/z/UGBksM (general proof)
https://alive2.llvm.org/ce/z/D4yPS- (examples based on regression tests)

This isn't the most direct optimization (we could create ashr in these
examples instead of relying on existing folds for exact divides), but
it's possible that there's a more general constraint than just a pow2
divisor, so this might be extended in the future.

This should solve issue #58348.

Differential Revision: https://reviews.llvm.org/D135970
2022-10-16 10:59:56 -04:00
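The exactness condition from e5ee0b06d6 can be brute-forced at 8 bits. This is only an illustrative check with hypothetical helper names; it tests divisibility and does not model the `INT_MIN / -1` overflow case, which is undefined for `sdiv` regardless.

```python
def trailing_zeros(value, bits):
    """Trailing zero bits of value seen as a bits-wide pattern (bits if zero)."""
    value &= (1 << bits) - 1
    if value == 0:
        return bits
    return (value & -value).bit_length() - 1


def check_exact_sdiv(bits=8):
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    checked = 0
    for k in range(bits):
        # 2**k and -(2**k) both have exactly k trailing zeros.
        for divisor in (1 << k, -(1 << k)):
            if not lo <= divisor <= hi:
                continue
            for dividend in range(lo, hi + 1):
                if trailing_zeros(dividend, bits) >= k:
                    assert dividend % divisor == 0  # no remainder: exact
                    checked += 1
    return checked


print(check_exact_sdiv() > 0)
```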
Sanjay Patel
78e3aeda3c [InstCombine] add tests for sdiv with (neg)pow2 divisor; NFC 2022-10-16 10:59:56 -04:00
Florian Hahn
067b744dbb [ConstraintElim] Add tests for add nsw with unsigned predicates. 2022-10-16 15:51:33 +01:00
Florian Hahn
7c1b80e35c [ConstraintElim] Support unsigned decomposition of mul/shl nuw..const
Support decomposition for `mul/shl nuw` with constant operand for unsigned
queries. Those expressions should not wrap in the unsigned sense and can
be added directly to the unsigned system.
2022-10-15 21:28:08 +01:00
Florian Hahn
f12684d36e [ConstraintElim] Support signed decomposition of add nsw.
Add support for decomposition of `add nsw` for signed queries.
`add nsw` won't wrap and can be directly added to the signed
system.
2022-10-15 18:34:03 +01:00
Zequan Wu
82035ec777 Revert "[PGO] Make emitted symbols hidden"
This reverts commit ecac223b0e4b05a65cf918f90824380db6b9ce64.

The commit causes instrprof-darwin-dead-strip.c to fail on mac.
2022-10-14 15:23:26 -07:00
Florian Hahn
16cf666bb7 [Loop] Move block and loop dispo invalidation to makeLoopInvariant.
makeLoopInvariant may recursively move its operands to make them
invariant, before moving the passed in instruction. Those recursively
moved instructions are currently missed when invalidating block and loop
dispositions.

To address this, move the invalidation code to Loop::makeLoopInvariant.

Fixes #58314.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D135909
2022-10-14 21:58:14 +01:00
Argyrios Kyrtzidis
d877e3fe71 [Transforms/ObjCARC] Fix non-deterministic output of ObjCARCOptPass
`ProvenanceAnalysis::related()` assumed that the order of the parameters to `relatedCheck()` did not affect
the result, but this was not the case when both parameters were `PHINode`s.
Because of this assumption, `ProvenanceAnalysis::related()` ordered the parameters by pointer value, which resulted in
non-deterministic behavior.

To address this, change `relatedPHI()` so that it gives the same result independent of the parameter order.

rdar://100325456

Differential Revision: https://reviews.llvm.org/D135376
2022-10-14 12:26:58 -07:00
Craig Topper
44f0b13494 [RISCV] Correct RISCVTTIImpl::getRegUsageForType for vectors of pointers.
getPrimitiveSizeInBits returns 0 for pointers, we need to query
the size via DataLayout instead.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135976
2022-10-14 11:34:12 -07:00
chenglin.bi
a43c0974f0 [SimplifyCFG] Add tests for simplifycfg, switch to lookup table with i2 types; NFC 2022-10-15 02:25:27 +08:00
Florian Hahn
fb3e2bef4c [ConstraintElim] Add test cases for shl and mul. 2022-10-14 16:59:13 +01:00
Matt Arsenault
d0750ec475 AtomicExpand: Avoid some operations if the atomic is overaligned
Let some of the pointer bithacking fold away if we know the low bits are 0.
2022-10-13 23:31:00 -07:00
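A rough model of the pointer arithmetic that d0750ec475 lets fold away: when a sub-word atomic is widened to a full word, the pass needs an aligned address, a shift and a mask. The sketch assumes a 4-byte word and a little-endian layout; `mask_params` and its parameters are hypothetical names, not the AtomicExpand API.

```python
WORD_BYTES = 4  # assumed width of the widened atomic access


def mask_params(addr, value_bytes, known_align=1):
    """Return (aligned_addr, shift_bits, mask) for a little-endian target."""
    if known_align >= WORD_BYTES:
        # Low address bits are known to be zero: nothing to mask or shift.
        aligned_addr, shift = addr, 0
    else:
        aligned_addr = addr & ~(WORD_BYTES - 1)
        shift = (addr & (WORD_BYTES - 1)) * 8
    mask = ((1 << (value_bytes * 8)) - 1) << shift
    return aligned_addr, shift, mask


print(mask_params(0x1003, 1))                 # general path
print(mask_params(0x1000, 1, known_align=4))  # overaligned path
```

For any address that really is word-aligned, both paths return the same triple, which is why the general computation is redundant in the overaligned case.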
Alexandros Lamprineas
25162418c6 [NFC][FuncSpec] Add a test to show redundant function cloning.
Happens when we find identical specializations.

Differential Revision: https://reviews.llvm.org/D135459
2022-10-13 23:00:23 +01:00
Wolfgang Pieb
b43a1d1bd9 [PGO] Do not create block count annotations when all weights are 0,
avoiding an assertion.

A BB with a nonzero count, whose successor blocks all have 0 counts, could
cause an assertion. Don't create any branch weights in this case.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D134203
2022-10-13 14:57:42 -07:00
Sanjay Patel
d85505a932 [InstCombine] fold logical and/or to xor
(A | B) & ~(A & B) --> A ^ B

https://alive2.llvm.org/ce/z/qpFMns

We already have the equivalent fold for real
logic instructions, but this pattern may occur
with selects too.

This is part of solving issue #58313.
2022-10-13 16:12:20 -04:00
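The identity in d85505a932 can be checked over all boolean inputs. The sketch assumes the "logical" forms are the select-based and/or (`select a, b, false` and `select a, true, b`) and does not model poison, which is the reason those forms exist; the Alive2 link in the message covers that part.

```python
from itertools import product


def logical_and(a, b):
    return b if a else False   # models: select a, b, false


def logical_or(a, b):
    return True if a else b    # models: select a, true, b


def fold_source(a, b):
    # (A | B) & ~(A & B), written with the select-style helpers
    return logical_and(logical_or(a, b), not logical_and(a, b))


def fold_target(a, b):
    return a != b              # A ^ B on booleans


for a, b in product([False, True], repeat=2):
    assert fold_source(a, b) == fold_target(a, b)
print("fold holds for all boolean inputs")
```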
Sanjay Patel
b78306c9f7 [InstCombine] add tests for logical select xor folds; NFC
issue #58313
2022-10-13 16:12:20 -04:00
Florian Hahn
572d5d374c [ConstraintElim] Add support for GEPs with multiple indices.
Lift restriction on GEPs with a single index by iterating over all
indices and joining the {Coefficient, Variable} entries for all indices
together.
2022-10-13 21:08:33 +01:00
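Joining the {Coefficient, Variable} entries across indices, as 572d5d374c describes, might look roughly like the following. The expression encoding, `decompose`, `decompose_gep` and the per-index byte scales are all hypothetical and chosen only to illustrate the idea; the real pass works on LLVM values.

```python
def decompose(expr):
    """Split an index expression into (constant, [(coefficient, variable), ...]).

    expr is an int, a variable name, ("add", lhs, rhs) or ("mul", lhs, int).
    """
    if isinstance(expr, int):
        return expr, []
    if isinstance(expr, str):
        return 0, [(1, expr)]
    op, lhs, rhs = expr
    if op == "add":
        c1, t1 = decompose(lhs)
        c2, t2 = decompose(rhs)
        return c1 + c2, t1 + t2
    if op == "mul" and isinstance(rhs, int):
        c, terms = decompose(lhs)
        return c * rhs, [(k * rhs, v) for k, v in terms]
    raise ValueError(f"unsupported expression: {expr!r}")


def decompose_gep(indices):
    """indices: list of (index_expr, scale_in_bytes), one entry per GEP index."""
    constant, terms = 0, []
    for expr, scale in indices:
        c, t = decompose(expr)
        constant += c * scale                      # scale the constant part
        terms += [(k * scale, v) for k, v in t]    # scale and join the entries
    return constant, terms


# Indexing [4 x i32] elements: first index scaled by 16 bytes, second by 4.
print(decompose_gep([("i", 16), (("add", "j", 1), 4)]))
```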
Florian Hahn
52fdbbd86d [ConstraintElim] Add nested GEP test with scalable vectors. 2022-10-13 20:58:11 +01:00
Alex Brachet
ecac223b0e [PGO] Make emitted symbols hidden
This was reverted because it was breaking when targeting Darwin which
tried to export these symbols which are now hidden. It should be safe
to just stop attempting to export these symbols in the clang driver,
though Apple folks will need to change their TAPI allow list described
in the commit where these symbols were originally exported
f538018562

Bug: https://github.com/llvm/llvm-project/issues/58265

Differential Revision: https://reviews.llvm.org/D135340
2022-10-13 19:47:15 +00:00
Nikita Popov
f386f7690d [MemCpyOpt] Add additional tests with lifetime intrinsics (NFC) 2022-10-13 17:29:59 +02:00
Nikita Popov
19aa1aab2e [MemCpyOpt] Don't run full pipeline in test (NFC)
Just memcpyopt is enough for this test.
2022-10-13 17:03:44 +02:00
Florian Hahn
518bccfd6e [LV] Add epilogue test with variable induction start value.
Add additional test mentioned by @venkataramanan.kumar.llvm in
D92132.
2022-10-13 15:56:27 +01:00
Alexey Bataev
c787986cdd [SLP]Improve costs of vectorized loads/stores by analyzing GEPs.
When generating masked gather nodes, the SLP vectorizer accounts for the cost
of the GEPs for loads as part of the scalar-to-vector transformation cost
estimation. But it does not do so for vectorized loads/stores, even though
vectorization may remove some of the GEPs completely. Because of this, in
some cases a masked gather operation can appear much more profitable than
regular vectorization (masked-gather cost + vector GEP - scalar
loads + GEPs compared to vectorized loads - scalar loads).
Added analysis of the removed scalar GEPs for vectorized load/store nodes for better cost estimation.

Differential Revision: https://reviews.llvm.org/D135282
2022-10-13 07:20:41 -07:00
Philip Reames
fe755af3a9 Revert "Remove PlaceSafepoints pass"
This reverts commit cb66e123c6bc82a793300b6fb3ecbed79c58f557. It was reported via https://reviews.llvm.org/rGcb66e123c6bc82a793300b6fb3ecbed79c58f557#1132969 that the Microsoft .NET compiler is still using this pass.
2022-10-13 07:17:25 -07:00
Matt Devereau
be0d427a14 [VectorCombine] Add insertelement-shufflevector VectorCombine tests
This is a precommit which adds some tests to show the functionality of an
upcoming VectorCombine optimization
2022-10-13 14:10:06 +00:00
Nikita Popov
86126dbc15 [FunctionAttrs] Regenerate test checks (NFC) 2022-10-13 11:24:07 +02:00
Florian Hahn
359bc5c541 [ConstraintElim] Bail out for GEPs when index size > 64 bits.
Limit pointer decomposition to pointers with index sizes of at most 64
bits. int64_t is used for coefficients, so as long as the index size <=
64 bits we should be able to represent all pointer offsets.

Pointer decomposition is limited to inbounds GEPs, so if an index
computation would overflow, the result is poison and it doesn't matter
that the coefficient overflows.

This allows replacing MulOverflow with regular multiplications.
2022-10-13 10:19:30 +01:00
Bjorn Pettersson
3be72f4029 [test][SLPVectorizer] Use -passes syntax in RUN lines. NFC 2022-10-13 10:44:38 +02:00
Bjorn Pettersson
f15ed06a65 [test][IndVarSimplify] Use -passes syntax in RUN lines. NFC 2022-10-13 10:44:37 +02:00
Bjorn Pettersson
8f527e08a5 [test][AggressiveInstCombine] Use -passes syntax in RUN lines. NFC 2022-10-13 10:44:37 +02:00
Bjorn Pettersson
f497a00da9 [test][DSE] Use -passes=dse instead of -dse in lit tests. NFC 2022-10-13 10:44:37 +02:00
Nikita Popov
e74390cc96 [FunctionAttrs] Convert tests to use opaque pointers (NFC)
Conversion performed using the script at:
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
2022-10-13 10:38:11 +02:00
Nikita Popov
45e595880a [FunctionAttrs] Regenerate test checks (NFC) 2022-10-13 10:35:38 +02:00
Nikita Popov
5b3776842f [FunctionAttrs] Account for memory effects of inalloca/preallocated
The code for inferring memory attributes on arguments claims that
inalloca/preallocated arguments are always clobbered:
d71ad41080/llvm/lib/Transforms/IPO/FunctionAttrs.cpp (L640-L642)

However, we would still infer memory attributes for the whole
function without taking this into account, so we could still end
up inferring readnone for the function. This adds an argument
clobber if there are any inalloca/preallocated arguments.

Differential Revision: https://reviews.llvm.org/D135783
2022-10-13 10:20:17 +02:00
Florian Hahn
e143e52c22 [ConstraintElimination] Add tests with 128 bit pointers. 2022-10-12 19:49:29 +01:00
Benjamin Maxwell
14b9505be9 Add test to show missed optimization for masked load/stores
This test shows instcombine failing to remove an alloca and memcpy
for a constant array that is read with a masked load.

This will be addressed in a subsequent commit.
2022-10-12 17:43:54 +00:00
Sanjay Patel
23fa3031ff [InstCombine] add test for udiv with shl divisor; NFC
This would solve an example from issue #58137 more
generally, but it may require adding a canonicalization
for shift + shift to shift + add.
2022-10-12 11:53:02 -04:00
Sanjay Patel
7b9482df3d [InstCombine] fold sdiv with common shl amount in operands
(X << Z) / (Y << Z) --> X / Y

https://alive2.llvm.org/ce/z/CLKzqT

This requires a surprising "nuw" constraint because we have
to guard against immediate UB via signed-div overflow with
-1 divisor.

This extends 008a89037a49ca0d9 and is another transform
derived from issue #58137.
2022-10-12 11:32:15 -04:00
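The arithmetic side of 7b9482df3d can be brute-forced at 8 bits. The sketch assumes both shifts carry `nsw` and simply skips inputs where a shift would wrap (poison) or the source division is undefined, so it does not reproduce the poison reasoning behind the extra "nuw" constraint; the helper names are hypothetical.

```python
BITS = 8
INT_MIN, INT_MAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def sdiv(a, b):
    """Truncating signed division; None where sdiv would be undefined."""
    if b == 0 or (a == INT_MIN and b == -1):
        return None
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def shl_nsw(x, z):
    """x << z, or None if the result wraps in the signed sense (poison)."""
    r = x << z
    return r if INT_MIN <= r <= INT_MAX else None


def check_fold():
    checked = 0
    for z in range(BITS):
        for x in range(INT_MIN, INT_MAX + 1):
            for y in range(INT_MIN, INT_MAX + 1):
                xs, ys = shl_nsw(x, z), shl_nsw(y, z)
                if xs is None or ys is None:
                    continue  # poison input: not modeled here
                src = sdiv(xs, ys)
                if src is None:
                    continue  # source already undefined
                assert sdiv(x, y) == src  # (X << Z) / (Y << Z) == X / Y
                checked += 1
    return checked


print(check_fold() > 0)
# The hazard the message points at: with X = INT_MIN, Y = -1 and Z = 1 the
# dividend shift wraps, while the folded X / Y is the overflowing division.
print(shl_nsw(INT_MIN, 1), sdiv(INT_MIN, -1))
```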
Alexey Bataev
d71ad41080 [SLP]Fix insertpoint of the extractelement instructions to avoid reshuffle crash.
Need to set the insertpoint for extractelement to point to the first
instruction in the node to avoid a possible crash during the external uses
combine process. Without it we may end up with an incorrect
transformation.

Differential Revision: https://reviews.llvm.org/D135591
2022-10-12 08:18:30 -07:00