40 Commits

Author SHA1 Message Date
Anton Afanasyev
bed587631f [AggressiveInstCombine] Add arithmetic shift right instr to TruncInstCombine DAG
Add `ashr` instruction to the DAG post-dominated by `trunc`, allowing
`TruncInstCombine` to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are sign bits (all zeros or ones) and
one sign bit is left untruncated: https://alive2.llvm.org/ce/z/Ajo2__

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108355
2021-08-24 10:41:16 +03:00
Anton Afanasyev
280a0b735f [Test][AggressiveInstCombine] Modify shift tests
Add `sext` for `ashr`, remove unrelated tests
2021-08-24 10:30:27 +03:00
Sanjay Patel
dd19f342fa [AggressiveInstCombine] guard against applying instruction flags with constant folding
This is a minimized version of a crash reported in:
D108201
2021-08-20 12:22:18 -04:00
Anton Afanasyev
2eefe4bd17 [Test][AggressiveInstCombine] Split shift tests to shl, lshr and ashr 2021-08-20 06:33:19 +03:00
Anton Afanasyev
85c503422d [Test][AggressiveInstCombine] Add test for arithmetic shift 2021-08-20 06:26:03 +03:00
Anton Afanasyev
cfb6dfcbd1 [AggressiveInstCombine] Add logical shift right instr to TruncInstCombine DAG
Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201
2021-08-18 22:20:58 +03:00
Anton Afanasyev
2498c3edcd [Test][AggressiveInstCombine] Add one more tests for shifts 2021-08-18 22:20:57 +03:00
Anton Afanasyev
0988488ed4 [Test][AggressiveInstCombine] Add one more test for shift truncation
Add test for which `OrigBitWidth != SrcBitWidth`
(https://reviews.llvm.org/D108091#2950131)
2021-08-18 09:29:49 +03:00
Anton Afanasyev
803270c0c6 [AggressiveInstCombine] Fix unsigned overflow
Fix issue reported here: https://reviews.llvm.org/D108091#2950930
2021-08-18 08:42:46 +03:00
Anton Afanasyev
1f3e35b6d1 [AggressiveInstCombine] Add shift left instruction to TruncInstCombine DAG
Add `shl` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing left shifts.

The only thing we need to check is that the target bitwidth must be wider
than the maximal shift amount: https://alive2.llvm.org/ce/z/AwArqu

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108091
2021-08-17 12:44:37 +03:00
Anton Afanasyev
8f8f9260a9 [Test][AggressiveInstCombine] Add test for shifts
Precommit test for D107766/D108091. Also move fixed test for PR50555
from SLPVectorizer/X86/ to PhaseOrdering/X86/ subdirectory.
2021-08-17 12:39:53 +03:00
Anton Afanasyev
c0a42d4491 [Test] Move test for PR50555 from InstCombine to AggressiveInstCombine 2021-08-12 14:42:02 +03:00
Simon Pilgrim
88c5b50060 [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts (REAPPLIED)
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Ensure we block poison in a funnel shift value - similar to rG0fe91ad463fea9d08cbcd640a62aa9ca2d8d05e0

Reapplied with fix for PR48068 - we weren't checking that the shift values could be hoisted from their basicblocks.

Differential Revision: https://reviews.llvm.org/D90625
2020-12-21 15:22:27 +00:00
Jun Ma
137674f882 [TruncInstCombine] Remove scalable vector restriction
Differential Revision: https://reviews.llvm.org/D92819
2020-12-10 18:00:19 +08:00
Martin Storsjö
36cf1e7d0e Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts"
This reverts commit 59b22e495c15d2830f41381a327f5d6bf49ff416.

That commit broke building for ARM and AArch64, reproducible like this:

$ cat apedec-reduced.c
a;
b(e) {
  int c;
  unsigned d = f();
  c = d >> 32 - e;
  return c;
}
g() {
  int h = i();
  if (a)
    h = h << a | b(a);
  return h;
}
$ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c
clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed.

Same thing for e.g. an armv7-linux-gnueabihf target.
2020-11-04 08:39:32 +02:00
Simon Pilgrim
59b22e495c [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Differential Revision: https://reviews.llvm.org/D90625
2020-11-03 10:49:49 +00:00
Simon Pilgrim
004bacb9aa [AggressiveInstCombine] Add funnel shift tests
Based off existing rotate test coverage
2020-11-02 17:26:36 +00:00
Simon Pilgrim
9c92daaa87 [AggressiveInstCombine] Regenerate rotate tests 2020-11-02 17:26:35 +00:00
Simon Pilgrim
fadd152317 [AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector support
Replace m_ConstantInt with m_APInt to support uniform vectors (with no undef elements)

Adding non-undef support would involve some refactoring of the MaskOps struct but this might still be worth it.
2020-10-15 11:02:35 +01:00
Simon Pilgrim
196bee9648 [AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector tests 2020-10-15 10:48:24 +01:00
Ayman Musa
cd515a6538 [AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine for ICmp instruction. 2020-02-12 15:09:38 +02:00
Ayman Musa
3bda9059b8 [AggressiveInstCombine] Add support for select instruction.
Differential Revision: https://reviews.llvm.org/D72837
2020-02-12 13:59:34 +02:00
Ayman Musa
10c7b7708b [AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine for SELECT. 2020-02-09 12:07:25 +02:00
Chen Zheng
c6c6f717af [NFC] run specific pass instead of whole -O3 pipeline for popcount recoginzation testcase.
llvm-svn: 374514
2019-10-11 05:30:18 +00:00
Chen Zheng
c17c5864ff [InstCombine] recognize popcount.
This patch recognizes popcount intrinsic according to algorithm from website
  http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

Differential Revision: https://reviews.llvm.org/D68189

llvm-svn: 374512
2019-10-11 05:13:56 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Sanjay Patel
200885e654 [AggressiveInstCombine] convert rotate with guard branch into funnel shift (PR34924)
Now, that we have funnel shift intrinsics, it should be safe to convert this form of rotate to it. 
In the worst case (a target that doesn't have rotate instructions), we will expand this into a 
branch-less sequence of ALU ops (neg/and/and/lshr/shl/or) in the backend, so it's still very 
likely to be a perf improvement over the original code.

The motivating source code pattern for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=34924

Background:
I looked at several different options before deciding where to try this - instcombine, simplifycfg, 
CGP - because it doesn't fit cleanly anywhere AFAIK.

The backend (CGP, SDAG, GlobalIsel?) is too late for what we're trying to accomplish. We want to 
have the IR converted before we reach things like vectorization because the reduced code can make a 
loop much simpler to transform.

Technically, this could be included in instcombine, but it's a large pattern match that includes 
control-flow, so it just felt wrong to stuff into there (although I have a draft of that patch). 
Similarly, this could be part of simplifycfg, but all of this pattern matching is a stretch.

So we're left with our relatively new dumping ground for homeless transforms: aggressive-instcombine. 
This only runs at -O3, but that seems like a reasonable limitation given that source code has many 
options to avoid this pattern (including the recently added clang intrinsics for rotates).

I'm including a PhaseOrdering test because we require the teamwork of 3 passes (aggressive-instcombine, 
instcombine, simplifycfg) to get this into the minimal IR form that we want. That test shows a bug
with the new pass manager that's independent of this change (but it will be masked if we canonicalize
harder to funnel shift intrinsics in instcombine).

Differential Revision: https://reviews.llvm.org/D55604

llvm-svn: 349396
2018-12-17 21:14:51 +00:00
Sanjay Patel
beb7bb6192 [AggressiveInstCombine] add test for rotate insertion point; NFC
As noted in D55604 - we need a test to make sure that the new intrinsic
is inserted into a valid position.

llvm-svn: 349347
2018-12-17 12:36:35 +00:00
Sanjay Patel
d8ccc0e3e4 [AggressiveInstCombine] add tests for rotates with branch; NFC
llvm-svn: 348933
2018-12-12 15:28:21 +00:00
Sanjay Patel
bf55e6dee1 [AggressiveInstCombine] avoid crashing on unsimplified code (PR37446)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37446
...raises another question: why do we run aggressive-instcombine before 
regular instcombine?

llvm-svn: 332243
2018-05-14 13:43:32 +00:00
Sanjay Patel
ac3951a735 [AggressiveInstCombine] convert a chain of 'and-shift' bits into masked compare
This is a follow-up to D45986. As suggested there, we should match the "all-bits-set" 
pattern in addition to "any-bits-set".

This was a little more complicated than I thought it would be initially because the 
"and 1" instruction can be anywhere in the chain. Hopefully, the code comments make 
that logic understandable, but if you see a way to simplify or improve that, it's 
most appreciated.

This transforms patterns that emerge from bitfield tests as seen in PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098

I think it would also help reduce the large test from:
D46336
D46595 
but we need something to reassociate that case to the forms we're expecting here first.

Differential Revision: https://reviews.llvm.org/D46649

llvm-svn: 331937
2018-05-09 23:08:15 +00:00
Sanjay Patel
d2025a2e31 [AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare
and (or (lshr X, C), ...), 1 --> (X & C') != 0

I initially thought about implementing the minimal pattern in instcombine as mentioned here:
https://bugs.llvm.org/show_bug.cgi?id=37098#c6

...but we need to do better to catch the more general sequence from the motivating test 
(more than 2 bits in the compare). And a test-suite run with statistics showed that this 
pattern only happened 2 times currently. It would potentially happen more often if 
reassociation worked better (D45842), but it's probably still not too frequent?

This is small enough that I didn't see a need to create a whole new class/file within 
AggressiveInstCombine. There are likely other relatively small matchers like what was 
discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). 
We could potentially also consolidate matchers for ctpop, bswap, etc under here.

Differential Revision: https://reviews.llvm.org/D45986

llvm-svn: 331311
2018-05-01 21:02:09 +00:00
Sanjay Patel
f70671582d [AggressiveInstCombine] add more bitfield test patterns; NFC
Add another baseline for D45986 and a pattern that won't be
matched with that patch.

llvm-svn: 331309
2018-05-01 20:55:03 +00:00
Sanjay Patel
fa8f5ad9f3 [AggressiveInstCombine] add tests for PR37098; NFC
I'm not sure if this is where we should try to fold these
patterns inspired by:
https://bugs.llvm.org/show_bug.cgi?id=37098
...if this isn't the right place, we can move the tests.

llvm-svn: 330642
2018-04-23 20:20:32 +00:00
Amjad Aboud
b86b771c02 [AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly.
This covers the case where TruncInst leaf node is a constant expression.
See PR36121 for more details.

Differential Revision: https://reviews.llvm.org/D42622

llvm-svn: 323926
2018-01-31 22:39:05 +00:00
Amjad Aboud
d895bff5f2 [AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks.
Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis.
See PR36134 for more details.

Differential Revision: https://reviews.llvm.org/D42683

llvm-svn: 323862
2018-01-31 10:41:31 +00:00
Amjad Aboud
f1f57a3137 Another try to commit 323321 (aggressive instruction combine).
llvm-svn: 323416
2018-01-25 12:06:32 +00:00
Amjad Aboud
d53504e379 Reverted 323321.
llvm-svn: 323326
2018-01-24 14:48:49 +00:00
Amjad Aboud
e4453233d7 [InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine).
Combine expression patterns to form expressions with fewer, simple instructions.
This pass does not modify the CFG.

For example, this pass reduce width of expressions post-dominated by TruncInst
into smaller width when applicable.

It differs from instcombine pass in that it contains pattern optimization that
requires higher complexity than the O(1), thus, it should run fewer times than
instcombine pass.

Differential Revision: https://reviews.llvm.org/D38313

llvm-svn: 323321
2018-01-24 12:42:42 +00:00