51 Commits

Author SHA1 Message Date
Anton Afanasyev
0dd8401371 [AggressiveInstCombine] Add phi nodes support to TruncInstCombine
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to expression graph.

Reviewed by: RKSimon, lebedev.ri

(recommit of fixed f84d732f, reverted by 8ad6d5e after sanitizer breakage)

Differential Revision: https://reviews.llvm.org/D109817
2022-02-25 07:57:35 +03:00
Anton Afanasyev
8ad6d5e465 Revert "[AggressiveInstCombine] Add phi nodes support to TruncInstCombine"
This reverts commit f84d732f8c1737940afab71824134f41f37a048b.
Breakage of "sanitizer-x86_64-linux-fast"
2022-02-23 15:56:11 +03:00
Anton Afanasyev
f84d732f8c [AggressiveInstCombine] Add phi nodes support to TruncInstCombine
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to expression graph.

Reviewed by: RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D109817
2022-02-23 14:01:55 +03:00
Anton Afanasyev
ea249489f5 [Test][AggressiveInstCombine] Add test for phi instruction 2022-02-23 12:50:50 +03:00
Bjorn Pettersson
3f8027fb67 [test] Update some test cases to use -passes when specifying the pipeline
This updates transform test cases for
  ADCE
  AddDiscriminators
  AggressiveInstCombine
  AlignmentFromAssumptions
  ArgumentPromotion
  BDCE
  CalledValuePropagation
  DCE
  Reg2Mem
  WholeProgramDevirt
to use the -passes syntax when specifying the pipeline.

Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is
a deprecated feature) the updated test cases already used the new
pass manager, but they were using the legacy syntax when specifying
the passes to run. This patch can be seen as a step toward deprecating
that interface.

This patch also removes some redundant RUN lines. Here I am
referring to test cases that had multiple RUN lines verifying both
the legacy "-passname" syntax and the new "-passes=passname" syntax.
Since we switched the default pass manager to "new PM" both RUN lines
have verified the new PM version of the pass (more or less wasting
time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER
is set to "off". It is assumed that it is enough to run these tests
with the new pass manager now.

Differential Revision: https://reviews.llvm.org/D108472
2021-09-29 21:51:08 +02:00
Anton Afanasyev
6a5f49a1ac [AggressiveInstCombine] Add {insert/extract}element to TruncInstCombine DAG
Alive2 for `{insert/extract}element`: https://alive2.llvm.org/ce/z/hwy_E-

Actually, no one file of test suite is touched by this change,
which means that is rare pattern not generated by frontend. But
it's worth being in place.

Differential Revision: https://reviews.llvm.org/D109236
2021-09-16 11:24:31 +03:00
Anton Afanasyev
8371a4c9d5 [Test][AggressiveInstCombine] Add test for truncation of vector instructions
Precommit test for D109236
2021-09-16 11:24:30 +03:00
Anton Afanasyev
54d8ebbbfd [AggressiveInstCombine] Add udiv and urem instrs to TruncInstCombine DAG
Add `udiv` and `urem` instructions to the DAG post-dominated by `trunc`,
allowing TruncInstCombine to reduce bitwidth of expressions containing these
instructions. It is sufficient to require that all truncated bits of both
operands are zeros: https://alive2.llvm.org/ce/z/yiithn
(`urem` case is identical).

Differential Revision: https://reviews.llvm.org/D109515
2021-09-10 20:29:08 +03:00
Anton Afanasyev
ea7b2c147f [Test][AggressiveInstCombine] Add test for udiv and urem
Precommit test for D109515
2021-09-10 20:29:08 +03:00
Anton Afanasyev
d1f9b21677 [AggressiveInstCombine] Add AssumptionCache to aggressive instcombine
Add support for @llvm.assume() to TruncInstCombine allowing
optimizations based on these intrinsics while computing known bits.
2021-09-07 16:45:00 +03:00
Anton Afanasyev
388b7a1502 [AggressiveInstCombine][Test] Add test for assumptions 2021-09-07 16:45:00 +03:00
Anton Afanasyev
bed587631f [AggressiveInstCombine] Add arithmetic shift right instr to TruncInstCombine DAG
Add `ashr` instruction to the DAG post-dominated by `trunc`, allowing
`TruncInstCombine` to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are sign bits (all zeros or ones) and
one sign bit is left untruncated: https://alive2.llvm.org/ce/z/Ajo2__

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108355
2021-08-24 10:41:16 +03:00
Anton Afanasyev
280a0b735f [Test][AggressiveInstCombine] Modify shift tests
Add `sext` for `ashr`, remove unrelated tests
2021-08-24 10:30:27 +03:00
Sanjay Patel
dd19f342fa [AggressiveInstCombine] guard against applying instruction flags with constant folding
This is a minimized version of a crash reported in:
D108201
2021-08-20 12:22:18 -04:00
Anton Afanasyev
2eefe4bd17 [Test][AggressiveInstCombine] Split shift tests to shl, lshr and ashr 2021-08-20 06:33:19 +03:00
Anton Afanasyev
85c503422d [Test][AggressiveInstCombine] Add test for arithmetic shift 2021-08-20 06:26:03 +03:00
Anton Afanasyev
cfb6dfcbd1 [AggressiveInstCombine] Add logical shift right instr to TruncInstCombine DAG
Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201
2021-08-18 22:20:58 +03:00
Anton Afanasyev
2498c3edcd [Test][AggressiveInstCombine] Add one more tests for shifts 2021-08-18 22:20:57 +03:00
Anton Afanasyev
0988488ed4 [Test][AggressiveInstCombine] Add one more test for shift truncation
Add test for which `OrigBitWidth != SrcBitWidth`
(https://reviews.llvm.org/D108091#2950131)
2021-08-18 09:29:49 +03:00
Anton Afanasyev
803270c0c6 [AggressiveInstCombine] Fix unsigned overflow
Fix issue reported here: https://reviews.llvm.org/D108091#2950930
2021-08-18 08:42:46 +03:00
Anton Afanasyev
1f3e35b6d1 [AggressiveInstCombine] Add shift left instruction to TruncInstCombine DAG
Add `shl` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing left shifts.

The only thing we need to check is that the target bitwidth must be wider
than the maximal shift amount: https://alive2.llvm.org/ce/z/AwArqu

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108091
2021-08-17 12:44:37 +03:00
Anton Afanasyev
8f8f9260a9 [Test][AggressiveInstCombine] Add test for shifts
Precommit test for D107766/D108091. Also move fixed test for PR50555
from SLPVectorizer/X86/ to PhaseOrdering/X86/ subdirectory.
2021-08-17 12:39:53 +03:00
Anton Afanasyev
c0a42d4491 [Test] Move test for PR50555 from InstCombine to AggressiveInstCombine 2021-08-12 14:42:02 +03:00
Simon Pilgrim
88c5b50060 [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts (REAPPLIED)
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Ensure we block poison in a funnel shift value - similar to rG0fe91ad463fea9d08cbcd640a62aa9ca2d8d05e0

Reapplied with fix for PR48068 - we weren't checking that the shift values could be hoisted from their basicblocks.

Differential Revision: https://reviews.llvm.org/D90625
2020-12-21 15:22:27 +00:00
Jun Ma
137674f882 [TruncInstCombine] Remove scalable vector restriction
Differential Revision: https://reviews.llvm.org/D92819
2020-12-10 18:00:19 +08:00
Martin Storsjö
36cf1e7d0e Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts"
This reverts commit 59b22e495c15d2830f41381a327f5d6bf49ff416.

That commit broke building for ARM and AArch64, reproducible like this:

$ cat apedec-reduced.c
a;
b(e) {
  int c;
  unsigned d = f();
  c = d >> 32 - e;
  return c;
}
g() {
  int h = i();
  if (a)
    h = h << a | b(a);
  return h;
}
$ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c
clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed.

Same thing for e.g. an armv7-linux-gnueabihf target.
2020-11-04 08:39:32 +02:00
Simon Pilgrim
59b22e495c [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Differential Revision: https://reviews.llvm.org/D90625
2020-11-03 10:49:49 +00:00
Simon Pilgrim
004bacb9aa [AggressiveInstCombine] Add funnel shift tests
Based off existing rotate test coverage
2020-11-02 17:26:36 +00:00
Simon Pilgrim
9c92daaa87 [AggressiveInstCombine] Regenerate rotate tests 2020-11-02 17:26:35 +00:00
Simon Pilgrim
fadd152317 [AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector support
Replace m_ConstantInt with m_APInt to support uniform vectors (with no undef elements)

Adding non-undef support would involve some refactoring of the MaskOps struct but this might still be worth it.
2020-10-15 11:02:35 +01:00
Simon Pilgrim
196bee9648 [AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector tests 2020-10-15 10:48:24 +01:00
Ayman Musa
cd515a6538 [AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine for ICmp instruction. 2020-02-12 15:09:38 +02:00
Ayman Musa
3bda9059b8 [AggressiveInstCombine] Add support for select instruction.
Differential Revision: https://reviews.llvm.org/D72837
2020-02-12 13:59:34 +02:00
Ayman Musa
10c7b7708b [AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine for SELECT. 2020-02-09 12:07:25 +02:00
Chen Zheng
c6c6f717af [NFC] run specific pass instead of whole -O3 pipeline for popcount recoginzation testcase.
llvm-svn: 374514
2019-10-11 05:30:18 +00:00
Chen Zheng
c17c5864ff [InstCombine] recognize popcount.
This patch recognizes popcount intrinsic according to algorithm from website
  http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

Differential Revision: https://reviews.llvm.org/D68189

llvm-svn: 374512
2019-10-11 05:13:56 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Sanjay Patel
200885e654 [AggressiveInstCombine] convert rotate with guard branch into funnel shift (PR34924)
Now, that we have funnel shift intrinsics, it should be safe to convert this form of rotate to it. 
In the worst case (a target that doesn't have rotate instructions), we will expand this into a 
branch-less sequence of ALU ops (neg/and/and/lshr/shl/or) in the backend, so it's still very 
likely to be a perf improvement over the original code.

The motivating source code pattern for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=34924

Background:
I looked at several different options before deciding where to try this - instcombine, simplifycfg, 
CGP - because it doesn't fit cleanly anywhere AFAIK.

The backend (CGP, SDAG, GlobalIsel?) is too late for what we're trying to accomplish. We want to 
have the IR converted before we reach things like vectorization because the reduced code can make a 
loop much simpler to transform.

Technically, this could be included in instcombine, but it's a large pattern match that includes 
control-flow, so it just felt wrong to stuff into there (although I have a draft of that patch). 
Similarly, this could be part of simplifycfg, but all of this pattern matching is a stretch.

So we're left with our relatively new dumping ground for homeless transforms: aggressive-instcombine. 
This only runs at -O3, but that seems like a reasonable limitation given that source code has many 
options to avoid this pattern (including the recently added clang intrinsics for rotates).

I'm including a PhaseOrdering test because we require the teamwork of 3 passes (aggressive-instcombine, 
instcombine, simplifycfg) to get this into the minimal IR form that we want. That test shows a bug
with the new pass manager that's independent of this change (but it will be masked if we canonicalize
harder to funnel shift intrinsics in instcombine).

Differential Revision: https://reviews.llvm.org/D55604

llvm-svn: 349396
2018-12-17 21:14:51 +00:00
Sanjay Patel
beb7bb6192 [AggressiveInstCombine] add test for rotate insertion point; NFC
As noted in D55604 - we need a test to make sure that the new intrinsic
is inserted into a valid position.

llvm-svn: 349347
2018-12-17 12:36:35 +00:00
Sanjay Patel
d8ccc0e3e4 [AggressiveInstCombine] add tests for rotates with branch; NFC
llvm-svn: 348933
2018-12-12 15:28:21 +00:00
Sanjay Patel
bf55e6dee1 [AggressiveInstCombine] avoid crashing on unsimplified code (PR37446)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37446
...raises another question: why do we run aggressive-instcombine before 
regular instcombine?

llvm-svn: 332243
2018-05-14 13:43:32 +00:00
Sanjay Patel
ac3951a735 [AggressiveInstCombine] convert a chain of 'and-shift' bits into masked compare
This is a follow-up to D45986. As suggested there, we should match the "all-bits-set" 
pattern in addition to "any-bits-set".

This was a little more complicated than I thought it would be initially because the 
"and 1" instruction can be anywhere in the chain. Hopefully, the code comments make 
that logic understandable, but if you see a way to simplify or improve that, it's 
most appreciated.

This transforms patterns that emerge from bitfield tests as seen in PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098

I think it would also help reduce the large test from:
D46336
D46595 
but we need something to reassociate that case to the forms we're expecting here first.

Differential Revision: https://reviews.llvm.org/D46649

llvm-svn: 331937
2018-05-09 23:08:15 +00:00
Sanjay Patel
d2025a2e31 [AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare
and (or (lshr X, C), ...), 1 --> (X & C') != 0

I initially thought about implementing the minimal pattern in instcombine as mentioned here:
https://bugs.llvm.org/show_bug.cgi?id=37098#c6

...but we need to do better to catch the more general sequence from the motivating test 
(more than 2 bits in the compare). And a test-suite run with statistics showed that this 
pattern only happened 2 times currently. It would potentially happen more often if 
reassociation worked better (D45842), but it's probably still not too frequent?

This is small enough that I didn't see a need to create a whole new class/file within 
AggressiveInstCombine. There are likely other relatively small matchers like what was 
discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). 
We could potentially also consolidate matchers for ctpop, bswap, etc under here.

Differential Revision: https://reviews.llvm.org/D45986

llvm-svn: 331311
2018-05-01 21:02:09 +00:00
Sanjay Patel
f70671582d [AggressiveInstCombine] add more bitfield test patterns; NFC
Add another baseline for D45986 and a pattern that won't be
matched with that patch.

llvm-svn: 331309
2018-05-01 20:55:03 +00:00
Sanjay Patel
fa8f5ad9f3 [AggressiveInstCombine] add tests for PR37098; NFC
I'm not sure if this is where we should try to fold these
patterns inspired by:
https://bugs.llvm.org/show_bug.cgi?id=37098
...if this isn't the right place, we can move the tests.

llvm-svn: 330642
2018-04-23 20:20:32 +00:00
Amjad Aboud
b86b771c02 [AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly.
This covers the case where TruncInst leaf node is a constant expression.
See PR36121 for more details.

Differential Revision: https://reviews.llvm.org/D42622

llvm-svn: 323926
2018-01-31 22:39:05 +00:00
Amjad Aboud
d895bff5f2 [AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks.
Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis.
See PR36134 for more details.

Differential Revision: https://reviews.llvm.org/D42683

llvm-svn: 323862
2018-01-31 10:41:31 +00:00
Amjad Aboud
f1f57a3137 Another try to commit 323321 (aggressive instruction combine).
llvm-svn: 323416
2018-01-25 12:06:32 +00:00
Amjad Aboud
d53504e379 Reverted 323321.
llvm-svn: 323326
2018-01-24 14:48:49 +00:00