74 Commits

Author SHA1 Message Date
bipmis
38f3e44997 [AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.
This patch extends the load merge/widen in AggressiveInstCombine() to handle reverse load patterns.

Differential Revision: https://reviews.llvm.org/D135137
2022-10-19 11:22:58 +01:00
Bjorn Pettersson
8f527e08a5 [test][AggressiveInstCombine] Use -passes syntax in RUN lines. NFC 2022-10-13 10:44:37 +02:00
bipmis
8344dfab59 Add reverse load pattern tests 2022-10-04 10:39:41 +01:00
bipmis
3b49a9fcf6 [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern is indicative of the fact that the loads are being merged to a wider load and the only use of this pattern is with a wider load. In this case for a non-atomic/non-volatile loads reduce the pattern to a combined load which would improve the cost of inlining, unrolling, vectorization etc.

Fix the error reported on reverse load merge.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-28 17:32:47 +01:00
bipmis
48b8dee773 remove LE,BE labels inserted incorrectly 2022-09-28 17:07:26 +01:00
bipmis
1dd7e576d7 Add reverse load tests to test load combine patch 2022-09-28 16:51:23 +01:00
Dmitri Gribenko
954d3cd2c6 Revert "[AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load."
This reverts commit 3c70c8c1df66500f67f77596b1e76cf0a8447ee5.

After this commit, during the 3-stage bootstrap the second-stage Clang
crashes.
2022-09-23 19:21:09 +02:00
bipmis
3c70c8c1df [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern is indicative of the fact that the loads are being merged to a wider load and the only use of this pattern is with a wider load. In this case for a non-atomic/non-volatile loads reduce the pattern to a combined load which would improve the cost of inlining, unrolling, vectorization etc.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-23 10:19:50 +01:00
bipmis
dd48c0be55 Add Load merge tests to AggressiveInstCombine 2022-09-22 21:55:54 +01:00
Djordje Todorovic
f0f8b46863 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ
The bug reported on the [0] has been fixed.
The issue was we have not checked if the global variables that
represent cttz tables was constant.
There is a new negative test added in negative-lower-table-based-cttz.ll
that represents this.

[0] https://reviews.llvm.org/rGdf868edee561eb973edd85ec9df41c67aa0bff6b
2022-09-20 13:12:47 +02:00
Djordje Todorovic
b080d0bae8 Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"""
This reverts commit df868edee561eb973edd85ec9df41c67aa0bff6b, as it
introduces a bug found by Alive2 (more on the rGdf868edee561).
2022-09-12 08:23:07 +02:00
Djordje Todorovic
df868edee5 "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit 053841c5624ca7eacd108a26071d8a1cefe1bebd.

We faced a use-after-free after pushing the D113291, since the
foldSqrt() has a call to eraseFromParent(). The function
should be at the end of the main loop that folds the patterns.
This patch fixes that.
2022-09-09 10:29:39 +02:00
Djordje Todorovic
7aec9ddcfd Revert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit f87993915768772d113bfd524347ce4341b843cf.
2022-09-08 17:01:16 +02:00
Djordje Todorovic
f879939157 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ" 2022-09-08 16:36:46 +02:00
Richard Smith
053841c562 Revert "[AggressiveInstCombine] Lower Table Based CTTZ"
This reverts commit fec01ee3f5244bb9a04bc4310fc892c56c5b6bab.

According to asan, this patch introduces a heap use after free.
2022-09-02 16:19:09 -07:00
Djordje Todorovic
fec01ee3f5 [AggressiveInstCombine] Lower Table Based CTTZ
This patch introduces recognition of table-based ctz implementation
during the AggressiveInstCombine.

This fixes the [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=46434

Differential Revision: https://reviews.llvm.org/D113291
2022-09-02 17:26:55 +02:00
Sanjay Patel
e079bf6558 [AggressiveInstCombine] check sqrt operand to allow more libcall->intrinsic transforms
This should fix issue #56383 (at least when compiled with -O3 because this pass is only
run at -O3 currently).
2022-07-27 11:36:13 -04:00
Sanjay Patel
3b718de2d3 [AggressiveInstCombine] add tests for sqrt with known positive operand; NFC 2022-07-27 11:36:12 -04:00
Sanjay Patel
e3205b8765 [AggressiveInstCombine] convert sqrt libcalls with "nnan" to sqrt intrinsics
This is an alternate to D129155 that uses TTI.haveFastSqrt() to avoid a
potential miscompile for programs with reads of errno. Moving the transform
to AggressiveInstCombine provides access to TTI.

If a sqrt call has "nnan", that implies that the input argument is never
negative because sqrt of {negative number} --> NAN.
If the argument is never negative and the call can be lowered without a
libcall, then we can assume that errno accesses are unchanged after lowering,
so the call can be translated to the LLVM intrinsic (which is expected to
become inline code).

This affects codegen for targets like x86 that have sqrt instructions, but
still have to conservatively assume that a libcall may be needed to set
errno as shown in issue #52620 and issue #56383.

This patch won't solve those examples - we will need to extend this to use
CannotBeOrderedLessThanZero or similar, enhance that analysis for new
operators, and/or deal with llvm.assume too.

Differential Revision: https://reviews.llvm.org/D129167
2022-07-26 15:50:14 -04:00
Nikita Popov
7c802f985f [AggressiveInstCombine] Update tests to use opaque pointers (NFC)
Update performed using (without manual fixup):
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
2022-06-22 12:33:06 +02:00
David Green
4a5cb957a1 [AggressiveInstcombine] Conditionally fold saturated fptosi to llvm.fptosi.sat
This adds a fold for aggressive instcombine that converts
smin(smax(fptosi(x))) into a llvm.fptosi.sat, providing that the
saturation constants are correct and the cost of the llvm.fptosi.sat is
lower.

Unfortunately, a llvm.fptosi.sat cannot always be converted back to a
smin/smax/fptosi. The llvm.fptosi.sat intrinsic is more defined that the
original, which produces poison if the original fptosi was out of range.
The llvm.fptosi.sat will saturate any value, so needs to be expanded to
a fptosi(fpmin(fpmax(x))), which can be worse for codegeneration
depending on the target.

So this change thais conditional on the backend reporting that the
llvm.fptosi.sat is cheaper that the original smin+smax+fptost.  This is
a change to the way that AggressiveInstrcombine has worked in the past.
Instead of just being a canonicalization pass, that canonicalization can
be dependant on the target in certain specific cases.

Differential Revision: https://reviews.llvm.org/D125755
2022-06-10 09:36:09 +01:00
David Green
f8f50a4975 [AggressiveInstcombine] Add target tests for fptosi.sat fold. NFC 2022-06-09 21:47:05 +01:00
Nikita Popov
03aceab08b [ValueTracking] Enable -branch-on-poison-as-ub by default
Now that SimpleLoopUnswitch and other transforms no longer introduce
branch on poison, enable the -branch-on-poison-as-ub option by
default. The practical impact of this is mostly better flag
preservation in SCEV, and some freeze instructions no longer being
necessary.

Differential Revision: https://reviews.llvm.org/D125299
2022-06-01 10:46:06 +02:00
Anton Afanasyev
0dd8401371 [AggressiveInstCombine] Add phi nodes support to TruncInstCombine
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to expression graph.

Reviewed by: RKSimon, lebedev.ri

(recommit of fixed f84d732f, reverted by 8ad6d5e after sanitizer breakage)

Differential Revision: https://reviews.llvm.org/D109817
2022-02-25 07:57:35 +03:00
Anton Afanasyev
8ad6d5e465 Revert "[AggressiveInstCombine] Add phi nodes support to TruncInstCombine"
This reverts commit f84d732f8c1737940afab71824134f41f37a048b.
Breakage of "sanitizer-x86_64-linux-fast"
2022-02-23 15:56:11 +03:00
Anton Afanasyev
f84d732f8c [AggressiveInstCombine] Add phi nodes support to TruncInstCombine
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to expression graph.

Reviewed by: RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D109817
2022-02-23 14:01:55 +03:00
Anton Afanasyev
ea249489f5 [Test][AggressiveInstCombine] Add test for phi instruction 2022-02-23 12:50:50 +03:00
Bjorn Pettersson
3f8027fb67 [test] Update some test cases to use -passes when specifying the pipeline
This updates transform test cases for
  ADCE
  AddDiscriminators
  AggressiveInstCombine
  AlignmentFromAssumptions
  ArgumentPromotion
  BDCE
  CalledValuePropagation
  DCE
  Reg2Mem
  WholeProgramDevirt
to use the -passes syntax when specifying the pipeline.

Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is
a deprecated feature) the updated test cases already used the new
pass manager, but they were using the legacy syntax when specifying
the passes to run. This patch can be seen as a step toward deprecating
that interface.

This patch also removes some redundant RUN lines. Here I am
referring to test cases that had multiple RUN lines verifying both
the legacy "-passname" syntax and the new "-passes=passname" syntax.
Since we switched the default pass manager to "new PM" both RUN lines
have verified the new PM version of the pass (more or less wasting
time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER
is set to "off". It is assumed that it is enough to run these tests
with the new pass manager now.

Differential Revision: https://reviews.llvm.org/D108472
2021-09-29 21:51:08 +02:00
Anton Afanasyev
6a5f49a1ac [AggressiveInstCombine] Add {insert/extract}element to TruncInstCombine DAG
Alive2 for `{insert/extract}element`: https://alive2.llvm.org/ce/z/hwy_E-

Actually, no one file of test suite is touched by this change,
which means that is rare pattern not generated by frontend. But
it's worth being in place.

Differential Revision: https://reviews.llvm.org/D109236
2021-09-16 11:24:31 +03:00
Anton Afanasyev
8371a4c9d5 [Test][AggressiveInstCombine] Add test for truncation of vector instructions
Precommit test for D109236
2021-09-16 11:24:30 +03:00
Anton Afanasyev
54d8ebbbfd [AggressiveInstCombine] Add udiv and urem instrs to TruncInstCombine DAG
Add `udiv` and `urem` instructions to the DAG post-dominated by `trunc`,
allowing TruncInstCombine to reduce bitwidth of expressions containing these
instructions. It is sufficient to require that all truncated bits of both
operands are zeros: https://alive2.llvm.org/ce/z/yiithn
(`urem` case is identical).

Differential Revision: https://reviews.llvm.org/D109515
2021-09-10 20:29:08 +03:00
Anton Afanasyev
ea7b2c147f [Test][AggressiveInstCombine] Add test for udiv and urem
Precommit test for D109515
2021-09-10 20:29:08 +03:00
Anton Afanasyev
d1f9b21677 [AggressiveInstCombine] Add AssumptionCache to aggressive instcombine
Add support for @llvm.assume() to TruncInstCombine allowing
optimizations based on these intrinsics while computing known bits.
2021-09-07 16:45:00 +03:00
Anton Afanasyev
388b7a1502 [AggressiveInstCombine][Test] Add test for assumptions 2021-09-07 16:45:00 +03:00
Anton Afanasyev
bed587631f [AggressiveInstCombine] Add arithmetic shift right instr to TruncInstCombine DAG
Add `ashr` instruction to the DAG post-dominated by `trunc`, allowing
`TruncInstCombine` to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are sign bits (all zeros or ones) and
one sign bit is left untruncated: https://alive2.llvm.org/ce/z/Ajo2__

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108355
2021-08-24 10:41:16 +03:00
Anton Afanasyev
280a0b735f [Test][AggressiveInstCombine] Modify shift tests
Add `sext` for `ashr`, remove unrelated tests
2021-08-24 10:30:27 +03:00
Sanjay Patel
dd19f342fa [AggressiveInstCombine] guard against applying instruction flags with constant folding
This is a minimized version of a crash reported in:
D108201
2021-08-20 12:22:18 -04:00
Anton Afanasyev
2eefe4bd17 [Test][AggressiveInstCombine] Split shift tests to shl, lshr and ashr 2021-08-20 06:33:19 +03:00
Anton Afanasyev
85c503422d [Test][AggressiveInstCombine] Add test for arithmetic shift 2021-08-20 06:26:03 +03:00
Anton Afanasyev
cfb6dfcbd1 [AggressiveInstCombine] Add logical shift right instr to TruncInstCombine DAG
Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201
2021-08-18 22:20:58 +03:00
Anton Afanasyev
2498c3edcd [Test][AggressiveInstCombine] Add one more tests for shifts 2021-08-18 22:20:57 +03:00
Anton Afanasyev
0988488ed4 [Test][AggressiveInstCombine] Add one more test for shift truncation
Add test for which `OrigBitWidth != SrcBitWidth`
(https://reviews.llvm.org/D108091#2950131)
2021-08-18 09:29:49 +03:00
Anton Afanasyev
803270c0c6 [AggressiveInstCombine] Fix unsigned overflow
Fix issue reported here: https://reviews.llvm.org/D108091#2950930
2021-08-18 08:42:46 +03:00
Anton Afanasyev
1f3e35b6d1 [AggressiveInstCombine] Add shift left instruction to TruncInstCombine DAG
Add `shl` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing left shifts.

The only thing we need to check is that the target bitwidth must be wider
than the maximal shift amount: https://alive2.llvm.org/ce/z/AwArqu

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108091
2021-08-17 12:44:37 +03:00
Anton Afanasyev
8f8f9260a9 [Test][AggressiveInstCombine] Add test for shifts
Precommit test for D107766/D108091. Also move fixed test for PR50555
from SLPVectorizer/X86/ to PhaseOrdering/X86/ subdirectory.
2021-08-17 12:39:53 +03:00
Anton Afanasyev
c0a42d4491 [Test] Move test for PR50555 from InstCombine to AggressiveInstCombine 2021-08-12 14:42:02 +03:00
Simon Pilgrim
88c5b50060 [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts (REAPPLIED)
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Ensure we block poison in a funnel shift value - similar to rG0fe91ad463fea9d08cbcd640a62aa9ca2d8d05e0

Reapplied with fix for PR48068 - we weren't checking that the shift values could be hoisted from their basicblocks.

Differential Revision: https://reviews.llvm.org/D90625
2020-12-21 15:22:27 +00:00
Jun Ma
137674f882 [TruncInstCombine] Remove scalable vector restriction
Differential Revision: https://reviews.llvm.org/D92819
2020-12-10 18:00:19 +08:00
Martin Storsjö
36cf1e7d0e Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts"
This reverts commit 59b22e495c15d2830f41381a327f5d6bf49ff416.

That commit broke building for ARM and AArch64, reproducible like this:

$ cat apedec-reduced.c
a;
b(e) {
  int c;
  unsigned d = f();
  c = d >> 32 - e;
  return c;
}
g() {
  int h = i();
  if (a)
    h = h << a | b(a);
  return h;
}
$ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c
clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed.

Same thing for e.g. an armv7-linux-gnueabihf target.
2020-11-04 08:39:32 +02:00
Simon Pilgrim
59b22e495c [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Differential Revision: https://reviews.llvm.org/D90625
2020-11-03 10:49:49 +00:00