10 Commits

Author SHA1 Message Date
Roman Lebedev
08c2f4eb7a
[CVP] When expanding urem, always freeze the nominator
As per the post-commit feedback - that was not the correct precondition
to avoid it here. I think we should generally start changing mentality
about `freeze`, the fact that we have been conditioned to be afraid of it
(or of anything in LLVM in general) is the key problem here.
2022-12-31 05:00:43 +03:00
Roman Lebedev
66efb98632
[CVP] Expand bound urems
This kind of thing happens really frequently in LLVM's very own
shuffle combining methods, and it is even considered bad practice
to use `%` there, instead of using this expansion directly.
Though, many of the cases there have variable divisors,
so this won't help everything.

Simple case: https://alive2.llvm.org/ce/z/PjvYf-
There's alternative expansion via `umin`:
https://alive2.llvm.org/ce/z/hWCVPb

BUT while we can transform the first expansion
into the `umin` one (e.g. for SCEV):
https://alive2.llvm.org/ce/z/iNxKmJ
... we can't go in the opposite direction.

Also, the non-`umin` expansion seems somewhat more codegen-friendly:
https://godbolt.org/z/qzjx5bqWK
https://godbolt.org/z/a7bj1axbx

There's second variant of precondition:
https://alive2.llvm.org/ce/z/zE6cbM
but there the numerator must be non-undef / must be frozen.
2022-12-30 19:40:46 +03:00
Roman Lebedev
3d852d1e74
[NFC][PhaseOrdering] Re-autogenerate check lines in one test 2022-12-30 19:40:46 +03:00
Matt Arsenault
1c55cc600e PhaseOrdering: Convert tests to opaque pointers
Required manually running update_test_checks:
  AArch64/hoisting-sinking-required-for-vectorization.ll
  AArch64/peel-multiple-unreachable-exits-for-vectorization.ll
  ARM/arm_mult_q15.ll
  X86/hoist-load-of-baseptr.ll
  X86/spurious-peeling.ll
2022-11-27 21:26:41 -05:00
Sanjay Patel
cc88445a91 [InstCombine] canonicalize 'icmp (trunc X), C' to 'icmp (X & Mask), C'
I looked at canonicalizing in the other direction, but that causes
many potential regressions and infinite loops because we already
(possibly wrongly) canonicalize "trunc X to i1" into an and+icmp.

This has a data layout restriction to avoid creating illegal
mask instructions, but we could remove that if we can show
that the backend can undo this when needed.

The motivating example from issue #56119 is modeled by the
PhaseOrdering test.
2022-06-30 15:51:39 -04:00
Sanjay Patel
dbe4bb7d12 [PhaseOrdering] add test to show missing folds from PR56119; NFC
issue #56119
2022-06-30 15:51:39 -04:00
Sanjay Patel
f31d39c42c [InstCombine] remove cast-of-signbit to shift transform
The transform was wrong in 3 ways:

1. It created an extra instruction when the source and dest types don't match.
2. It did not account for an extra use of the icmp, so could create 2 extra insts.
3. It favored bit hacks over icmp (icmp generally has better analysis).

This fixes #54692 (modeled by the PhaseOrdering tests).

This is a minimal step to fix the bug, but we should likely invert
this and the sibling transform for the "is negative" pattern too.

The backend should be able to invert this back to a shift if that
leads to better codegen.

This is a reduced try of 3794cc0e9964 - that was reverted because
it could cause infinite loops by conflicting with the related
transforms in this block that create shifts.
2022-05-17 11:10:28 -04:00
Sanjay Patel
07d549bce9 Revert "[InstCombine] invert canonicalization for cast of signbit test"
This reverts commit 3794cc0e996481e10307b67c8436aa44e0d65d22.
This change is suspected of causing bots to hang at stage 2
compiles, so reverting to confirm and investigate.
2022-05-16 17:47:02 -04:00
Sanjay Patel
3794cc0e99 [InstCombine] invert canonicalization for cast of signbit test
The existing transform was wrong in 3 ways:
1. It created an extra instruction when the source and dest types don't match.
2. It did not account for an extra use of the icmp, so could create 2 extra insts.
3. It favored bit hacks over icmp (icmp generally has better analysis).

This fixes #54692 (modeled by the PhaseOrdering tests).

This is a minimal step to fix the bug, but we should likely invert
the sibling transform for the "is negative" pattern too.

The backend should be able to invert this back to a shift if that
leads to better codegen.
2022-05-16 12:55:52 -04:00
Sanjay Patel
325896d823 [PhaseOrdering] add tests for cmp + boolean/bitwise logic; NFC
The tests (see C++ source in #54692) have multiple potential
optimizations/canonicalizations, but we should be consistent
since they are logically identical.
2022-05-16 10:35:10 -04:00