81 Commits

Author SHA1 Message Date
Nikita Popov
cdfb99b069 [AggressiveInstCombine] Convert tests to opaque pointers (NFC) 2022-12-23 09:47:48 +01:00
Paul Walker
f53234cbfd [AggressiveInstCombine] Fix invalid TypeSize conversion when combining loads.
Much of foldLoadsRecursive relies on knowing the size of loaded
data, which is not possible for scalable vector types.  However,
the logic of combining two small loads into one bigger load does
not apply for vector types so rather than converting the algorithm
to use TypeSize I've simply added an early exit for vectors.

Fixes #59510

Differential Revision: https://reviews.llvm.org/D140106
2022-12-17 15:34:27 +00:00
bipmis
e9393789a9 [AggressiveInstCombine] Handle the insert point of the merged load correctly.
This patch updates the insert point of the merged load in AggressiveInstCombine().
This fixes the reported test breakages by handling alias analysis correctly.

Differential Revision: https://reviews.llvm.org/D137201
2022-11-29 10:53:51 +00:00
bipmis
ee53abb070 Add more tests for Reverse Load and AA testing 2022-11-28 15:34:26 +00:00
bipmis
150fc73dda [AggressiveInstCombine] Avoid load merge/widen if stores are present b/w loads
This patch addresses the test cases in which the merged load has to be inserted at the right point. This happens when there is a store between the loads.

This patch disables the load merge in all cases where stores are present between the loads. It will eventually be replaced with a proper fix and test cases.
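
As an illustration of why a store between the loads matters, here is a minimal C++ sketch (not code from the patch). The function names are hypothetical. The second function models an incorrect rewrite in which the single wide load is placed at the position of the first narrow load, so it misses the intervening store.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Two narrow loads combined with shift/or, with a store between them.
// The second load must observe the stored value.
inline uint16_t loads_with_store_between(uint8_t *p) {
    assert(p != nullptr);
    const uint16_t lo = p[0];   // first narrow load
    p[1] = 0xAB;                // store that aliases the second load
    const uint16_t hi = p[1];   // second narrow load sees the store
    return static_cast<uint16_t>(lo | (hi << 8));
}

// Hypothetical incorrect rewrite: both bytes are read in one step at the
// position of the first load, i.e. before the store executes.
inline uint16_t merged_at_first_load(uint8_t *p) {
    assert(p != nullptr);
    uint8_t bytes[2];
    std::memcpy(bytes, p, sizeof bytes);  // merged load, too early
    p[1] = 0xAB;
    return static_cast<uint16_t>(bytes[0] | (bytes[1] << 8));
}
```

For the same input buffer the two functions return different values, which is the kind of miscompile this patch avoids by not merging when a store sits between the loads.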

Differential Revision: https://reviews.llvm.org/D137333
2022-11-03 14:32:07 +00:00
bipmis
3ee1882299 Add another test which breaks the load insert point 2022-11-03 12:28:24 +00:00
bipmis
cc7b03b01e Add tests which need right Insert Point for merged load 2022-11-01 21:49:44 +00:00
bipmis
38f3e44997 [AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.
This patch extends the load merge/widen in AggressiveInstCombine() to handle reverse load patterns.

Differential Revision: https://reviews.llvm.org/D135137
2022-10-19 11:22:58 +01:00
Bjorn Pettersson
8f527e08a5 [test][AggressiveInstCombine] Use -passes syntax in RUN lines. NFC 2022-10-13 10:44:37 +02:00
bipmis
8344dfab59 Add reverse load pattern tests 2022-10-04 10:39:41 +01:00
bipmis
3b49a9fcf6 [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern indicates that the loads are being merged into a wider load, and the only use of this pattern is as a wider load. In this case, for non-atomic, non-volatile loads, reduce the pattern to a combined load, which would improve the cost of inlining, unrolling, vectorization, etc.

Fix the error reported on reverse load merge.
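
The source-level shape of the pattern above can be sketched in C++ as follows. This is an illustration under assumptions, not the pass itself: the function names are hypothetical, and the equivalence between the byte-wise form and the single wide load holds only on a little-endian host, which the helper checks at run time.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Run-time endianness probe, to avoid depending on C++20 <bit>.
inline bool host_is_little_endian() {
    const uint16_t one = 1;
    uint8_t first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Four byte loads, zero-extended, shifted and or'ed together:
// the (ZExt(L1) << shift1) | (ZExt(L2) << shift2) | ... shape.
inline uint32_t load_le32_bytewise(const uint8_t *p) {
    assert(p != nullptr);
    return static_cast<uint32_t>(p[0])
         | (static_cast<uint32_t>(p[1]) << 8)
         | (static_cast<uint32_t>(p[2]) << 16)
         | (static_cast<uint32_t>(p[3]) << 24);
}

// One wide load of the same four bytes (native byte order).
inline uint32_t load_wide32(const uint8_t *p) {
    assert(p != nullptr);
    uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

On a little-endian host both functions return the same value for any four-byte buffer, so the byte-wise form can be replaced by the wide load.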

Differential Revision: https://reviews.llvm.org/D127392
2022-09-28 17:32:47 +01:00
bipmis
48b8dee773 remove LE,BE labels inserted incorrectly 2022-09-28 17:07:26 +01:00
bipmis
1dd7e576d7 Add reverse load tests to test load combine patch 2022-09-28 16:51:23 +01:00
Dmitri Gribenko
954d3cd2c6 Revert "[AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load."
This reverts commit 3c70c8c1df66500f67f77596b1e76cf0a8447ee5.

After this commit, during the 3-stage bootstrap the second-stage Clang
crashes.
2022-09-23 19:21:09 +02:00
bipmis
3c70c8c1df [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern indicates that the loads are being merged into a wider load, and the only use of this pattern is as a wider load. In this case, for non-atomic, non-volatile loads, reduce the pattern to a combined load, which would improve the cost of inlining, unrolling, vectorization, etc.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-23 10:19:50 +01:00
bipmis
dd48c0be55 Add Load merge tests to AggressiveInstCombine 2022-09-22 21:55:54 +01:00
Djordje Todorovic
f0f8b46863 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"
The bug reported in [0] has been fixed.
The issue was that we did not check whether the global variables that
represent cttz tables were constant.
A new negative test in negative-lower-table-based-cttz.ll
covers this case.

[0] https://reviews.llvm.org/rGdf868edee561eb973edd85ec9df41c67aa0bff6b
2022-09-20 13:12:47 +02:00
Djordje Todorovic
b080d0bae8 Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"""
This reverts commit df868edee561eb973edd85ec9df41c67aa0bff6b, as it
introduces a bug found by Alive2 (see rGdf868edee561 for details).
2022-09-12 08:23:07 +02:00
Djordje Todorovic
df868edee5 "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit 053841c5624ca7eacd108a26071d8a1cefe1bebd.

We hit a use-after-free after pushing D113291, because
foldSqrt() calls eraseFromParent(). The function should
therefore come at the end of the main loop that folds the patterns.
This patch fixes that.
2022-09-09 10:29:39 +02:00
Djordje Todorovic
7aec9ddcfd Revert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit f87993915768772d113bfd524347ce4341b843cf.
2022-09-08 17:01:16 +02:00
Djordje Todorovic
f879939157 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ" 2022-09-08 16:36:46 +02:00
Richard Smith
053841c562 Revert "[AggressiveInstCombine] Lower Table Based CTTZ"
This reverts commit fec01ee3f5244bb9a04bc4310fc892c56c5b6bab.

According to asan, this patch introduces a heap use after free.
2022-09-02 16:19:09 -07:00
Djordje Todorovic
fec01ee3f5 [AggressiveInstCombine] Lower Table Based CTTZ
This patch adds recognition of table-based ctz implementations
to AggressiveInstCombine.

This fixes [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=46434
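
For readers unfamiliar with the idiom, a table-based ctz typically looks like the C++ sketch below. It uses the widely known 32-bit de Bruijn multiplier and lookup table. The function names are hypothetical, and this shows the kind of source pattern being recognized, not the matching code in the pass.

```cpp
#include <cassert>
#include <cstdint>

// Constant lookup table indexed by the top 5 bits of the de Bruijn product.
static const uint8_t kDeBruijnCtz32[32] = {
    0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
    31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9};

// Table-based count of trailing zeros; the result is only meaningful
// for non-zero input.
inline unsigned ctz32_table(uint32_t x) {
    assert(x != 0);
    const uint32_t lowest = x & (0u - x);  // isolate the lowest set bit
    return kDeBruijnCtz32[(lowest * 0x077CB531u) >> 27];
}

// Straightforward reference implementation for comparison.
inline unsigned ctz32_loop(uint32_t x) {
    assert(x != 0);
    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}
```

Both functions agree for every non-zero input, which is why the table lookup can be replaced by a single cttz operation.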

Differential Revision: https://reviews.llvm.org/D113291
2022-09-02 17:26:55 +02:00
Sanjay Patel
e079bf6558 [AggressiveInstCombine] check sqrt operand to allow more libcall->intrinsic transforms
This should fix issue #56383 (at least when compiled with -O3 because this pass is only
run at -O3 currently).
2022-07-27 11:36:13 -04:00
Sanjay Patel
3b718de2d3 [AggressiveInstCombine] add tests for sqrt with known positive operand; NFC 2022-07-27 11:36:12 -04:00
Sanjay Patel
e3205b8765 [AggressiveInstCombine] convert sqrt libcalls with "nnan" to sqrt intrinsics
This is an alternative to D129155 that uses TTI.haveFastSqrt() to avoid a
potential miscompile for programs with reads of errno. Moving the transform
to AggressiveInstCombine provides access to TTI.

If a sqrt call has "nnan", that implies that the input argument is never
negative because sqrt of {negative number} --> NAN.
If the argument is never negative and the call can be lowered without a
libcall, then we can assume that errno accesses are unchanged after lowering,
so the call can be translated to the LLVM intrinsic (which is expected to
become inline code).

This affects codegen for targets like x86 that have sqrt instructions, but
still have to conservatively assume that a libcall may be needed to set
errno as shown in issue #52620 and issue #56383.

This patch won't solve those examples - we will need to extend this to use
CannotBeOrderedLessThanZero or similar, enhance that analysis for new
operators, and/or deal with llvm.assume too.

Differential Revision: https://reviews.llvm.org/D129167
2022-07-26 15:50:14 -04:00
Nikita Popov
7c802f985f [AggressiveInstCombine] Update tests to use opaque pointers (NFC)
Update performed using (without manual fixup):
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
2022-06-22 12:33:06 +02:00
David Green
4a5cb957a1 [AggressiveInstcombine] Conditionally fold saturated fptosi to llvm.fptosi.sat
This adds a fold for aggressive instcombine that converts
smin(smax(fptosi(x))) into an llvm.fptosi.sat, provided that the
saturation constants are correct and the cost of the llvm.fptosi.sat is
lower.

Unfortunately, an llvm.fptosi.sat cannot always be converted back to a
smin/smax/fptosi. The llvm.fptosi.sat intrinsic is more defined than the
original, which produces poison if the original fptosi was out of range.
The llvm.fptosi.sat will saturate any value, so it needs to be expanded to
a fptosi(fpmin(fpmax(x))), which can be worse for code generation
depending on the target.

So this change is conditional on the backend reporting that the
llvm.fptosi.sat is cheaper than the original smin+smax+fptosi.  This is
a change to the way that AggressiveInstCombine has worked in the past.
Instead of just being a canonicalization pass, that canonicalization can
be dependent on the target in certain specific cases.
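
The difference between the two forms can be sketched in C++ for an i8 result. This is a rough model with hypothetical function names, not the fold itself. The first function converts and then clamps, and is only defined while the input fits the wide integer type (out-of-range conversion is undefined behaviour in C++, loosely analogous to poison). The second clamps in floating point first, following the saturating semantics given in the LLVM LangRef: NaN maps to 0 and everything else is clamped.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// smin(smax(fptosi(x))) shape: convert to i32, then clamp to the i8 range.
// Precondition: x is not NaN and fits in int32_t.
inline int8_t clamp_after_convert(float x) {
    assert(x >= -2147483648.0f && x < 2147483648.0f);
    const int32_t wide = static_cast<int32_t>(x);
    return static_cast<int8_t>(
        std::min(std::max(wide, int32_t{-128}), int32_t{127}));
}

// Saturating conversion: defined for every input, including NaN and
// values far outside the i32 range.
inline int8_t fptosi_sat_i8(float x) {
    if (std::isnan(x)) return 0;
    if (x <= -128.0f) return -128;
    if (x >= 127.0f) return 127;
    return static_cast<int8_t>(x);  // in range: truncates toward zero
}
```

The two agree wherever the first is defined, but only the saturating version has a result for NaN or very large inputs, which is why the fold is one-directional.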

Differential Revision: https://reviews.llvm.org/D125755
2022-06-10 09:36:09 +01:00
David Green
f8f50a4975 [AggressiveInstcombine] Add target tests for fptosi.sat fold. NFC 2022-06-09 21:47:05 +01:00
Nikita Popov
03aceab08b [ValueTracking] Enable -branch-on-poison-as-ub by default
Now that SimpleLoopUnswitch and other transforms no longer introduce
branch on poison, enable the -branch-on-poison-as-ub option by
default. The practical impact of this is mostly better flag
preservation in SCEV, and some freeze instructions no longer being
necessary.

Differential Revision: https://reviews.llvm.org/D125299
2022-06-01 10:46:06 +02:00
Anton Afanasyev
0dd8401371 [AggressiveInstCombine] Add phi nodes support to TruncInstCombine
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to expression graph.

Reviewed by: RKSimon, lebedev.ri

(recommit of fixed f84d732f, reverted by 8ad6d5e after sanitizer breakage)

Differential Revision: https://reviews.llvm.org/D109817
2022-02-25 07:57:35 +03:00
Anton Afanasyev
8ad6d5e465 Revert "[AggressiveInstCombine] Add phi nodes support to TruncInstCombine"
This reverts commit f84d732f8c1737940afab71824134f41f37a048b.
Breakage of "sanitizer-x86_64-linux-fast"
2022-02-23 15:56:11 +03:00
Anton Afanasyev
f84d732f8c [AggressiveInstCombine] Add phi nodes support to TruncInstCombine
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to expression graph.

Reviewed by: RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D109817
2022-02-23 14:01:55 +03:00
Anton Afanasyev
ea249489f5 [Test][AggressiveInstCombine] Add test for phi instruction 2022-02-23 12:50:50 +03:00
Bjorn Pettersson
3f8027fb67 [test] Update some test cases to use -passes when specifying the pipeline
This updates transform test cases for
  ADCE
  AddDiscriminators
  AggressiveInstCombine
  AlignmentFromAssumptions
  ArgumentPromotion
  BDCE
  CalledValuePropagation
  DCE
  Reg2Mem
  WholeProgramDevirt
to use the -passes syntax when specifying the pipeline.

Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is
a deprecated feature) the updated test cases already used the new
pass manager, but they were using the legacy syntax when specifying
the passes to run. This patch can be seen as a step toward deprecating
that interface.

This patch also removes some redundant RUN lines. Here I am
referring to test cases that had multiple RUN lines verifying both
the legacy "-passname" syntax and the new "-passes=passname" syntax.
Since we switched the default pass manager to "new PM" both RUN lines
have verified the new PM version of the pass (more or less wasting
time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER
is set to "off". It is assumed that it is enough to run these tests
with the new pass manager now.

Differential Revision: https://reviews.llvm.org/D108472
2021-09-29 21:51:08 +02:00
Anton Afanasyev
6a5f49a1ac [AggressiveInstCombine] Add {insert/extract}element to TruncInstCombine DAG
Alive2 for `{insert/extract}element`: https://alive2.llvm.org/ce/z/hwy_E-

In fact, no file in the test suite is affected by this change,
which means this is a rare pattern not generated by the frontend. But
it's worth having in place.

Differential Revision: https://reviews.llvm.org/D109236
2021-09-16 11:24:31 +03:00
Anton Afanasyev
8371a4c9d5 [Test][AggressiveInstCombine] Add test for truncation of vector instructions
Precommit test for D109236
2021-09-16 11:24:30 +03:00
Anton Afanasyev
54d8ebbbfd [AggressiveInstCombine] Add udiv and urem instrs to TruncInstCombine DAG
Add `udiv` and `urem` instructions to the DAG post-dominated by `trunc`,
allowing TruncInstCombine to reduce bitwidth of expressions containing these
instructions. It is sufficient to require that all truncated bits of both
operands are zeros: https://alive2.llvm.org/ce/z/yiithn
(`urem` case is identical).
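
The condition can be checked on concrete values with a small C++ sketch, using a 32-bit source type and an 8-bit target type. The function names are hypothetical and this is only an illustration of the arithmetic, not of the pass.

```cpp
#include <cassert>
#include <cstdint>

// True when all bits that truncation to 8 bits would drop are zero.
inline bool truncated_bits_are_zero(uint32_t v) { return (v >> 8) == 0; }

// Original form: divide in the wide type, then truncate.
inline uint8_t udiv_then_trunc(uint32_t a, uint32_t b) {
    assert(b != 0);
    return static_cast<uint8_t>(a / b);
}

// Narrowed form: truncate both operands, then divide in the narrow type.
inline uint8_t trunc_then_udiv(uint32_t a, uint32_t b) {
    const uint8_t na = static_cast<uint8_t>(a);
    const uint8_t nb = static_cast<uint8_t>(b);
    assert(nb != 0);
    return static_cast<uint8_t>(na / nb);
}
```

With 200 and 7 both forms agree because neither operand has truncated bits set. With 0x100 and 2 the first operand violates the condition and the two forms differ.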

Differential Revision: https://reviews.llvm.org/D109515
2021-09-10 20:29:08 +03:00
Anton Afanasyev
ea7b2c147f [Test][AggressiveInstCombine] Add test for udiv and urem
Precommit test for D109515
2021-09-10 20:29:08 +03:00
Anton Afanasyev
d1f9b21677 [AggressiveInstCombine] Add AssumptionCache to aggressive instcombine
Add support for @llvm.assume() to TruncInstCombine allowing
optimizations based on these intrinsics while computing known bits.
2021-09-07 16:45:00 +03:00
Anton Afanasyev
388b7a1502 [AggressiveInstCombine][Test] Add test for assumptions 2021-09-07 16:45:00 +03:00
Anton Afanasyev
bed587631f [AggressiveInstCombine] Add arithmetic shift right instr to TruncInstCombine DAG
Add `ashr` instruction to the DAG post-dominated by `trunc`, allowing
`TruncInstCombine` to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are sign bits (all zeros or ones) and
one sign bit is left untruncated: https://alive2.llvm.org/ce/z/Ajo2__
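
The sign-bit condition can be illustrated with a C++ sketch for a 32-bit source and an 8-bit target. The function names are hypothetical. The sketch assumes that `>>` on negative signed values is an arithmetic shift and that narrowing conversions wrap, which C++20 guarantees and mainstream compilers implement for earlier standards.

```cpp
#include <cassert>
#include <cstdint>

// All truncated bits are copies of the sign bit, and one sign bit survives
// truncation, exactly when the value is representable as int8_t.
inline bool fits_in_i8(int32_t v) { return v >= -128 && v <= 127; }

// Original form: arithmetic shift in the wide type, then truncate.
inline int8_t ashr_then_trunc(int32_t x, unsigned s) {
    assert(s < 32);
    return static_cast<int8_t>(x >> s);
}

// Narrowed form: truncate, then shift by less than the target bitwidth.
inline int8_t trunc_then_ashr(int32_t x, unsigned s) {
    assert(s < 8);
    const int8_t narrow = static_cast<int8_t>(x);
    return static_cast<int8_t>(narrow >> s);
}
```

For -100 the condition holds and both forms give the same result. For 128 the truncated bits are all zero but no sign bit is left untruncated, so the narrowed value becomes negative and the results differ.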

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108355
2021-08-24 10:41:16 +03:00
Anton Afanasyev
280a0b735f [Test][AggressiveInstCombine] Modify shift tests
Add `sext` for `ashr`, remove unrelated tests
2021-08-24 10:30:27 +03:00
Sanjay Patel
dd19f342fa [AggressiveInstCombine] guard against applying instruction flags with constant folding
This is a minimized version of a crash reported in:
D108201
2021-08-20 12:22:18 -04:00
Anton Afanasyev
2eefe4bd17 [Test][AggressiveInstCombine] Split shift tests to shl, lshr and ashr 2021-08-20 06:33:19 +03:00
Anton Afanasyev
85c503422d [Test][AggressiveInstCombine] Add test for arithmetic shift 2021-08-20 06:26:03 +03:00
Anton Afanasyev
cfb6dfcbd1 [AggressiveInstCombine] Add logical shift right instr to TruncInstCombine DAG
Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201
2021-08-18 22:20:58 +03:00
Anton Afanasyev
2498c3edcd [Test][AggressiveInstCombine] Add one more test for shifts 2021-08-18 22:20:57 +03:00
Anton Afanasyev
0988488ed4 [Test][AggressiveInstCombine] Add one more test for shift truncation
Add test for which `OrigBitWidth != SrcBitWidth`
(https://reviews.llvm.org/D108091#2950131)
2021-08-18 09:29:49 +03:00
Anton Afanasyev
803270c0c6 [AggressiveInstCombine] Fix unsigned overflow
Fix issue reported here: https://reviews.llvm.org/D108091#2950930
2021-08-18 08:42:46 +03:00