1055 Commits

Author SHA1 Message Date
luxufan
05ef449600 [SimplifyCFG] Handle MD_noundef when hoisting common codes
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D144939
2023-03-03 19:02:14 +08:00
Yaxun (Sam) Liu
fbec8f04ab [SimplifyCFG] Do not hoist/sink convergent function calls
Currently SimplifyCFG hoists/sinks common instructions in then/else basic blocks
when certain options are enabled, which is the case for the default clang optimization
pipelines for -O3. It tries to hoist/sink convergent function calls in divergent
control flow, which causes incorrect ISA to be generated for GPUs, e.g.
https://github.com/ROCm-Developer-Tools/HIP/issues/3172

This patch fixes that by conservatively disabling hoisting/sinking of common
convergent function calls in then/else blocks.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D144756
2023-02-28 12:41:56 -05:00
DianQK
f890f010f6
[SimplifyCFG] Improve the precision of PtrValueMayBeModified
The result value of `getelementptr inbounds (TY, null, not zero)` is a poison value. We can think of it as undefined behavior.

> Please let me know if there is anything I don't understand correctly.

Reviewed By: nikic, xbolva00

Differential Revision: https://reviews.llvm.org/D144563
2023-02-25 19:42:59 +08:00
Daniel Woodworth
a33f018b89 [Local][SimplifyCFG][GVN] Handle !nontemporal in combineMetadata
SimplifyCFG currently drops !nontemporal metadata when sinking
common instructions. With this change, SimplifyCFG and similar
transforms will preserve !nontemporal metadata as long as it is
set on both original instructions.
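
The rule stated above (keep the metadata only if both instructions carry it) can be illustrated with a small standalone C++ model. This is a sketch under assumptions, not the code in combineMetadata: `MemInst`, `combineNonTemporal` and `exampleUsage` are hypothetical names, and the metadata is reduced to a single boolean.

```cpp
#include <cassert>

// Hypothetical model: the only property tracked is whether the instruction
// carries !nontemporal metadata.
struct MemInst {
  bool HasNonTemporal;
};

// Models the rule from the message: the merged instruction keeps
// !nontemporal only if both original instructions had it.
MemInst combineNonTemporal(const MemInst &A, const MemInst &B) {
  return MemInst{A.HasNonTemporal && B.HasNonTemporal};
}

// Usage example: only the first pair keeps the metadata.
void exampleUsage() {
  assert(combineNonTemporal({true}, {true}).HasNonTemporal);
  assert(!combineNonTemporal({true}, {false}).HasNonTemporal);
  assert(!combineNonTemporal({false}, {false}).HasNonTemporal);
}
```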

Differential Revision: https://reviews.llvm.org/D144298
2023-02-22 14:47:00 +01:00
DianQK
b6a0be8ce3
[SimplifyCFG] Check if the return instruction causes undefined behavior
This should fix https://github.com/rust-lang/rust/issues/107681.

Returning undef for a noundef return value is undefined behavior.

Example:

```
define noundef i32 @test_ret_noundef(i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br label %bb2
bb2:
  %r = phi i32 [ undef, %entry ], [ 1, %bb1 ]
  ret i32 %r
}
```

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D144319
2023-02-21 21:42:13 +08:00
DianQK
1235ed9133
Revert "[SimplifyCFG] Check if the return instruction causes undefined behavior"
This reverts commit b6eed9a82e0ce530d94a194c88615d6c272e1854.
2023-02-19 21:08:29 +08:00
DianQK
b6eed9a82e
[SimplifyCFG] Check if the return instruction causes undefined behavior
This should fix https://github.com/rust-lang/rust/issues/107681.

Returning undef for a noundef return value is undefined behavior.

Example:

```
define noundef i32 @test_ret_noundef(i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br label %bb2
bb2:
  %r = phi i32 [ undef, %entry ], [ 1, %bb1 ]
  ret i32 %r
}
```

Differential Revision: https://reviews.llvm.org/D144319
2023-02-19 19:42:40 +08:00
Vitaly Buka
c23f29d6f0 Revert "[SimplifyCFG] Check if the return instruction causes undefined behavior"
Breaks bots
https://lab.llvm.org/buildbot/#/builders/236/builds/2349
https://lab.llvm.org/buildbot/#/builders/74/builds/17361
https://lab.llvm.org/buildbot/#/builders/168/builds/11972

This reverts commit 7be55b007698f6b6398cbbea69c327b5a971938a.
2023-02-18 12:21:10 -08:00
DianQK
7be55b0076
[SimplifyCFG] Check if the return instruction causes undefined behavior
This should fix https://github.com/rust-lang/rust/issues/107681.

Returning undef for a noundef return value is undefined behavior.

Example:

```
define noundef i32 @test_ret_noundef(i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br label %bb2
bb2:
  %r = phi i32 [ undef, %entry ], [ 1, %bb1 ]
  ret i32 %r
}
```

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D144319
2023-02-18 23:31:57 +08:00
Nick Desaulniers
094190c2f5 [llvm][CallBrPrepare] add llvm.callbr.landingpad intrinsic
Insert a new intrinsic call after splitting critical edges, and verify
it. Later commits will update the SSA values to use this new value along
indirect branches rather than the callbr's value, and have SelectionDAG
consume this new value.

Part 2b of
https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8.

Reviewed By: efriedma, jyknight

Differential Revision: https://reviews.llvm.org/D139883
2023-02-16 17:58:33 -08:00
Roman Lebedev
333cdd4125
[SimplifyCFG] Reapply: when eliminating unreachable landing pads, mark calls as nounwind
This time the change is in its least intrusive form: only the return
type in the prototype of `removeUnwindEdge()` is changed, since only a single
specific caller needs that knowledge.

We really can't recover that knowledge, and `nounwind` knowledge
(and not just the lack of an unwind edge, aka `call` instead of `invoke`)
is part of the reasoning in e.g. `mayHaveSideEffects()`.

Note that this is call-site-specific knowledge:
just because some callsite had an `unreachable`
unwind edge does not mean that all will.
2023-01-13 21:04:17 +03:00
Roman Lebedev
fbcefff9d0
Revert "[SimplifyCFG] When eliminating unreachable landing pads, mark calls as nounwind"
The bool is in the wrong place and might get implicitly converted from
the previous second argument - a pointer. Thinking about it more,
it's not really the best place for that functionality anyway,
since only a single caller needs it.

This reverts commit 3c5b1f2d94d021005ce3769a4402d4a4ae843989.
2023-01-13 01:18:56 +03:00
Roman Lebedev
3c5b1f2d94
[SimplifyCFG] When eliminating unreachable landing pads, mark calls as nounwind
We really can't recover that knowledge, and `nounwind` knowledge
(and not just the lack of an unwind edge, aka `call` instead of `invoke`)
is part of the reasoning in e.g. `mayHaveSideEffects()`.

Note that this is call-site-specific knowledge:
just because some callsite had an `unreachable`
unwind edge does not mean that all will.
2023-01-13 00:41:58 +03:00
Roman Lebedev
a5c23d5584
[NFC][SimplifyCFG] Autogenerate checklines in some tests that eliminate unwind edges 2023-01-13 00:41:58 +03:00
Alex Richardson
1b440155c1 Make switch-to-lookup-large-types.ll more reliable
When larger integer types are natively supported, SimplifyCFG will use an
inline constant instead of a global variable for this transform. I noticed
this while trying to automatically infer the datalayout from the target
triple in opt if it is not explicitly specified. Since the x86_64
datalayout includes "n8:16:32:64", this test started failing.

While touching this file, also change i128 to i64 in the first test, since
this was the intended behaviour in the original commit.

Reviewed By: spatel, fhahn

Differential Revision: https://reviews.llvm.org/D141055
2023-01-06 13:35:43 +00:00
Owen Anderson
733740b189 Fix a phase-ordering problem in SimplifyCFG.
Switch simplification could sometimes fail to notice when an
intermediate case removal caused the switch condition to become
constant. This would cause the switch to be simplified into a
conditional branch rather than a direct branch.

Most of the time this didn't matter, except that occasionally
downstream parts of SimplifyCFG expect tautological branches to
already have been eliminated. The missed handling in switch
simplification would cause an assertion failure in the downstream
code.

Triggering the assertion failure is fairly sensitive to the exact
order of various simplifications.

Fixes https://github.com/llvm/llvm-project/issues/59768

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D140831
2023-01-04 16:47:13 -07:00
Nikita Popov
e87aa92066 [SimplifyCFG] Convert some tests to opaque pointers (NFC) 2023-01-04 16:27:49 +01:00
Nikita Popov
9e0f7655f2 [SimplifyCFG] Add test for branch on undef/poison (NFC) 2023-01-03 14:52:48 +01:00
Nikita Popov
f492db7eec [SimplifyCFG] Avoid branch on undef UB in test (NFC) 2023-01-03 12:38:25 +01:00
Roman Lebedev
3a8e009f97
Revert "Reland "[SimplifyCFG] FoldBranchToCommonDest(): deal with mismatched IV's in PHI's in common successor block""
One of these two changes is exposing (or causing) some more miscompiles.
A reproducer is in progress, so reverting until resolved.

This reverts commit 428f36401b1b695fd501ebfdc8773bed8ced8d4e.
2022-12-20 18:36:42 +03:00
Roman Lebedev
4def99e642
[InstCombine] Try to fold not into cmp iff other users of cmp are freely invertible
There are still some such patterns that require collaboration
between folds to handle, which we don't currently do.
2022-12-19 00:24:28 +03:00
Roman Lebedev
428f36401b
Reland "[SimplifyCFG] FoldBranchToCommonDest(): deal with mismatched IV's in PHI's in common successor block"
This reverts commit 37b8f09a4b61bf9bf9d0b9017d790c8b82be2e17,
and returns commit 1bd0b82e508d049efdb07f4f8a342f35818df341.
The miscompile was in InstCombine, and it has been addressed.

This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify()`:
https://godbolt.org/z/388zqdE37

Just because the PHI in the common successor happens to have different
incoming values for these two blocks, doesn't mean we have to give up.
It's quite easy to deal with this, we just need to produce a select:
https://alive2.llvm.org/ce/z/000srb
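
The idea of replacing the mismatched PHI inputs with a select can be modelled in plain C++. This is only a sketch under assumptions, not the pass itself: `beforeFold`, `afterFold` and `exampleUsage` are hypothetical names, return values stand in for the PHI in the common successor, and only the case where both blocks branch to the common successor on a true condition is covered.

```cpp
#include <cassert>

// Before the fold: two blocks branch to a common successor whose PHI
// receives X from the first block and Y from the second.
int beforeFold(bool A, bool B, int X, int Y, int Z) {
  if (A)
    return X; // first block -> common successor, PHI = X
  if (B)
    return Y; // second block -> common successor, PHI = Y
  return Z;   // second block -> its other successor
}

// After the fold: a single branch on (A || B), with a select choosing
// which incoming value the PHI would have seen.
int afterFold(bool A, bool B, int X, int Y, int Z) {
  int Sel = A ? X : Y; // the select
  if (A || B)
    return Sel;
  return Z;
}

// Usage example: both shapes agree for every combination of conditions.
void exampleUsage() {
  for (int A = 0; A < 2; ++A)
    for (int B = 0; B < 2; ++B)
      assert(beforeFold(A, B, 10, 20, 30) == afterFold(A, B, 10, 20, 30));
}
```

The select is extra work that the unfolded form did not need, which is why it counts towards the cost of the fold.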

Now, the cost model for this transform is rather overly strict,
so this will basically never fire. We tally all the selects needed
(over all preds) towards NumBonusInsts.

Differential Revision: https://reviews.llvm.org/D139275
2022-12-17 05:18:54 +03:00
Roman Lebedev
bece10c0fd
[NFC][InstCombine] Add miscompile reproducer from https://reviews.llvm.org/D139275#4001580
SimplifyCFG change is correct and not at fault here.
The actual miscompile appears to be happening in InstCombine.

```
$ /builddirs/llvm-project/build-Clang15/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -load-pass-plugin /repositories/alive2/build-Clang-release/tv/tv.so -passes='tv,instcombine,tv' -o /dev/null /repositories/llvm-project/llvm/test/Transforms/InstCombine/D139275_c4001580.ll

----------------------------------------
define float @D139275_c4001580(float %arg) {
%0:
  %i = fcmp ugt float %arg, 0.000000
  %i1 = fcmp ult float %arg, 1.000000
  %i2 = and i1 %i, %i1
  %i3 = fcmp uge float %arg, 0.100000
  %i4 = xor i1 %i, %i2
  %i5 = select i1 %i4, float 0.100000, float 0.000000
  %i6 = and i1 %i3, %i2
  %i7 = fadd float %arg, -0.100000
  %i8 = select i1 %i6, float %i7, float %i5
  ret float %i8
}
=>
define float @D139275_c4001580(float %arg) {
%0:
  %i = fcmp ugt float %arg, 0.000000
  %i1 = fcmp ult float %arg, 1.000000
  %i2 = and i1 %i, %i1
  %i3 = fcmp uge float %arg, 0.100000
  %i7 = fadd float %arg, -0.100000
  %i5 = select i1 %i3, float %i7, float 0.100000
  %i8 = select i1 %i2, float %i5, float 0.000000
  ret float %i8
}
Transformation doesn't verify! (unsound)
ERROR: Value mismatch

Example:
float %arg = #x3dcbb820 (0.099472284317?)

Source:
i1 %i = #x1 (1)
i1 %i1 = #x1 (1)
i1 %i2 = #x1 (1)
i1 %i3 = #x0 (0)
i1 %i4 = #x0 (0)
float %i5 = #x00000000 (+0.0)
i1 %i6 = #x0 (0)
float %i7 = #xba0a5680 (-0.000527717173?)
float %i8 = #x00000000 (+0.0)

Target:
i1 %i = #x1 (1)
i1 %i1 = #x1 (1)
i1 %i2 = #x1 (1)
i1 %i3 = #x0 (0)
float %i7 = #xba0a5680 (-0.000527717173?)
float %i5 = #x3dcccccd (0.100000001490?)
float %i8 = #x3dcccccd (0.100000001490?)
Source value: #x00000000 (+0.0)
Target value: #x3dcccccd (0.100000001490?)

Pass: (anonymous namespace)::TVPass
Command line: '/builddirs/llvm-project/build-Clang15/bin/opt' '-load' '/repositories/alive2/build-Clang-release/tv/tv.so' '-load-pass-plugin' '/repositories/alive2/build-Clang-release/tv/tv.so' '-passes=tv,instcombine,tv' '-o' '/dev/null' '/repositories/llvm-project/llvm/test/Transforms/InstCombine/D139275_c4001580.ll'

Alive2: Transform doesn't verify!

```
2022-12-16 20:28:39 +03:00
Alexander Kornienko
37b8f09a4b Revert "[SimplifyCFG] FoldBranchToCommonDest(): deal with mismatched IV's in PHI's in common successor block"
This reverts commit 1bd0b82e508d049efdb07f4f8a342f35818df341, since it leads to
miscompiles. See https://reviews.llvm.org/D139275#3993229 and
https://reviews.llvm.org/D139275#4001580.
2022-12-16 17:23:35 +01:00
Nikita Popov
8979ae4276 [SimplifyCFG] Convert tests to opaque pointers (NFC) 2022-12-14 15:14:12 +01:00
Roman Lebedev
1bd0b82e50
[SimplifyCFG] FoldBranchToCommonDest(): deal with mismatched IV's in PHI's in common successor block
This tries to approach the problem noted by @arsenm:
terrible codegen for `__builtin_fpclassify()`:
https://godbolt.org/z/388zqdE37

Just because the PHI in the common successor happens to have different
incoming values for these two blocks, doesn't mean we have to give up.
It's quite easy to deal with this, we just need to produce a select:
https://alive2.llvm.org/ce/z/000srb

Now, the cost model for this transform is rather overly strict,
so this will basically never fire. We tally all the selects needed
(over all preds) towards NumBonusInsts.

Differential Revision: https://reviews.llvm.org/D139275
2022-12-12 18:20:03 +03:00
Dmitry Makogon
b134119137 [SimplifyCFG] Prohibit hoisting of llvm.deoptimize calls
This prohibits hoisting identical llvm.deoptimize calls
from 'then' and 'else' blocks of a conditional branch.
This fixes a crash that happened because we didn't hoist
the return instructions together with the llvm.deoptimize calls,
so the verifier would crash.

Differential Revision: https://reviews.llvm.org/D139437
2022-12-09 17:44:32 +07:00
Dmitry Makogon
a21e0ec724 [Test] Reduce deopt bundle in test with hoisted llvm.deoptimize call 2022-12-09 17:44:31 +07:00
Bjorn Pettersson
3528e63d89 [test] Remove duplicate RUN lines in Transform tests 2022-12-08 11:47:16 +01:00
Roman Lebedev
e8b923f1aa
[NFC] Port all SimplifyCFG tests to -passes= syntax 2022-12-08 02:38:51 +03:00
Roman Lebedev
ea7ad8b365
[NFC][SimplifyCFG] Add more fold-branch-to-common-dest tests 2022-12-07 03:32:42 +03:00
Dmitry Makogon
b70807b340 [Test] Add test exposing crash in SimplifyCFG when hoisting llvm.deoptimize 2022-12-06 23:17:02 +07:00
Roman Lebedev
571abdefd1
[NFC][SimplifyCFG] Add few more fold-branch-to-common-dest tests 2022-12-06 04:39:03 +03:00
Roman Lebedev
54649724df
[NFC][SimplifyCFG] Add one more fold-branch-to-common-dest test 2022-12-06 03:31:21 +03:00
Roman Lebedev
d1d1293569
[NFC] Port all runlines for SimplifyCFG pass tests to -passes syntax 2022-12-05 21:12:20 +03:00
Roman Lebedev
295ba49330
[NFC][SimplifyCFG] Add some tests with PHI's for fold-branch-to-common-dest xform 2022-12-04 20:58:55 +03:00
Roman Lebedev
b79921a4a8
[NFC] Re-autogenerate checklines in a few tests being affected 2022-12-04 20:58:55 +03:00
Matt Arsenault
cb0d2887ab Utils: Fix deleting calls to null in non-0 address spaces 2022-11-23 08:49:44 -05:00
Nikita Popov
304f1d59ca [IR] Switch everything to use memory attribute
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.

The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.

High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.

Differential Revision: https://reviews.llvm.org/D135780
2022-11-04 10:21:38 +01:00
Nikita Popov
01ec0ff2dc [SimplifyCFG] Allow speculating block containing assume()
SpeculativelyExecuteBB(), which converts a branch + phi structure
into a select, currently bails out if the block contains an assume
(because it is not speculatable).

Adjust the fold to ignore ephemeral values (i.e. assumes and values
only used in assumes) for cost modelling purposes, and drop them
when performing the fold.
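
The two adjustments above (skip ephemeral values when costing, drop them when folding) can be sketched in standalone C++. This is an illustration under assumptions rather than the code in SpeculativelyExecuteBB(): `SpecInst`, `speculationCost`, `dropEphemeral` and `exampleUsage` are hypothetical names, and the per-instruction costs are made up.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of an instruction in the block to be speculated.
struct SpecInst {
  unsigned Cost;  // assumed per-instruction cost
  bool Ephemeral; // true for assumes and values only used by assumes
};

// Cost used to decide whether to speculate: ephemeral values are ignored.
unsigned speculationCost(const std::vector<SpecInst> &Block) {
  unsigned Total = 0;
  for (const SpecInst &I : Block)
    if (!I.Ephemeral)
      Total += I.Cost;
  return Total;
}

// Instructions that survive the fold: ephemeral values are dropped.
std::vector<SpecInst> dropEphemeral(const std::vector<SpecInst> &Block) {
  std::vector<SpecInst> Kept;
  for (const SpecInst &I : Block)
    if (!I.Ephemeral)
      Kept.push_back(I);
  return Kept;
}

// Usage example: one real instruction plus an assume and its operand.
void exampleUsage() {
  std::vector<SpecInst> Block = {{1, false}, {1, true}, {1, true}};
  assert(speculationCost(Block) == 1);
  assert(dropEphemeral(Block).size() == 1);
}
```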

Theoretically, we could try to preserve the assume information by
generating an assume(br_cond || assume_cond) style assume, but this
is very unlikely to be useful (because we don't do anything
useful with assumes of this form) and it would make things
substantially more complicated once we take operand bundle assumes
into account (which don't really support a || operation).
I'd prefer not to do that without good motivation.

Differential Revision: https://reviews.llvm.org/D137339
2022-11-04 09:26:35 +01:00
Nikita Popov
d42cfc4be1 [SimplifyCFG] Add tests for block speculation with assumes (NFC) 2022-11-03 15:46:55 +01:00
Yaxun (Sam) Liu
9d5adc7e49 Revert "reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost"
This reverts commit bd7949bcd86633bd4203b2ba6f891aea00fce4d1.

Revert this patch since reviewers have different opinions regarding
the approach in post-commit review.

Will open an RFC for further discussion.

Differential Revision: https://reviews.llvm.org/D132408
2022-10-25 12:15:39 -04:00
Yaxun (Sam) Liu
bd7949bcd8 reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost
Fixed compile time increase due to always constructing LocalCostTracker.
Now only construct LocalCostTracker when needed.
2022-10-24 15:43:53 -04:00
chenglin.bi
a43c0974f0 [SimplifyCFG] Add tests for SimplifyCFG, switch to lookup table with i2 types; NFC 2022-10-15 02:25:27 +08:00
Arthur Eubanks
e23aee7175 [test] Update some legacy PM tests 2022-09-30 11:31:02 -07:00
Mingming Liu
ac28efa6c1 [SimplifyCFG][TransformUtils] Do not simplify away a trivial basic block if both this block and at least one of its predecessors are loop latches.
- Before this patch, loop metadata (if it exists) overrides the metadata of each predecessor; if the predecessor block already has loop metadata, the original loop metadata is not preserved, which could cause missed loop transformations (see 'test2' in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll).

To illustrate how inner-loop metadata might be dropped before this patch:

CFG Before

      entry
        |
        v
 ---> while.cond   ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md1)
 |       |  |______|
 |       v
 |    while.cond.exit (md2)
 |       |
 |_______|

CFG After

       entry
         |
         v
 ---> while.cond.rewrite  ------------->  while.end
 |       |
 |       v
 |   while.body
 |       |
 |       v
 |    for.body <---- (md2)
 |_______|  |______|

Basically, when 'while.cond.exit' is folded into 'while.cond', 'md2' overrides 'md1' and 'md1' is dropped from the CFG.

Differential Revision: https://reviews.llvm.org/D134152
2022-09-28 10:48:14 -07:00
Mingming Liu
34db7c64df [NFC] Use opaqueptr in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll
Use opaqueptr for test case
llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll.

- Adjust variable numbering accordingly, since bitcasts between different pointer
  types are not necessary.

Differential Revision: https://reviews.llvm.org/D134159
2022-09-19 09:01:11 -07:00
Nikita Popov
dd61726d5b Revert "[SimplifyCFG] accumulate bonus insts cost"
This reverts commit e5581df60a35fffb0c69589777e4e126c849405f.

This causes major compile-time regressions, about 2-3% end-to-end
on CTMark.
2022-09-19 14:46:43 +02:00
Mingming Liu
7392b45162 [NFC][SimplifyCFG]Precommit test case to show inner-loop metadata may not be preserved
- There is an outer while-loop and an inner for-loop in the test case.
  The inner loop has `llvm.loop.unroll.enable` metadata that is not
  preserved. This happens around [1], when the loop metadata of the outer loop
  overrides the inner-loop metadata directly, without checking whether the inner loop
  itself has loop metadata.

 [1] ab755e6562/llvm/lib/Transforms/Utils/Local.cpp (L1146)

Differential Revision: https://reviews.llvm.org/D134014
2022-09-18 22:48:09 -07:00
Yaxun (Sam) Liu
e5581df60a [SimplifyCFG] accumulate bonus insts cost
SimplifyCFG folds

bool foo() {
  if (cond1) return false;
  if (cond2) return false;
  return true;
}

as

bool foo() {
  if (cond1 | cond2) return false;
  return true;
}

The instructions computing 'cond2' are called 'bonus insts' in branch folding: they introduce overhead
because the original CFG could exit early, whereas the folded CFG always executes
them. SimplifyCFG calculates the cost of the 'bonus insts' of folding a BB into
its predecessor BB, which shares the destination. If the cost is below bonus-inst-threshold,
SimplifyCFG will fold that BB into its predecessor and cond2 will always be executed.
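
The overhead described above can be made visible with a small standalone C++ model. This is a sketch under assumptions: `evalCond2`, `fooEarlyExit`, `fooFolded`, `Cond2Evaluations` and `exampleUsage` are hypothetical names, and a counter stands in for the cost of the bonus instructions.

```cpp
#include <cassert>

// Counts evaluations of the second condition (the "bonus" work).
int Cond2Evaluations = 0;

bool evalCond2(bool V) {
  ++Cond2Evaluations;
  return V;
}

// Original shape: the second condition is skipped when the first holds.
bool fooEarlyExit(bool C1, bool C2) {
  if (C1)
    return false;
  if (evalCond2(C2))
    return false;
  return true;
}

// Folded shape: the second condition is always evaluated.
bool fooFolded(bool C1, bool C2) {
  bool B2 = evalCond2(C2);
  if (C1 | B2)
    return false;
  return true;
}

// Usage example: same result, but the folded form does the extra work.
void exampleUsage() {
  Cond2Evaluations = 0;
  bool R1 = fooEarlyExit(true, false);
  assert(!R1);
  assert(Cond2Evaluations == 0); // early exit skipped cond2
  bool R2 = fooFolded(true, false);
  assert(!R2);
  assert(Cond2Evaluations == 1); // folded form evaluated cond2
}
```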

When SimplifyCFG calculates the cost of 'bonus insts', it only considers the 'bonus insts'
in the current BB being considered for folding. This causes issues for unrolled loops
which share destinations, e.g.

bool foo(int *a) {
  for (int i = 0; i < 32; i++)
    if (a[i] > 0) return false;
  return true;
}

After unrolling, it becomes

bool foo(int *a) {
  if (a[0] > 0) return false;
  if (a[1] > 0) return false;
  //...
  if (a[31] > 0) return false;
  return true;
}

SimplifyCFG will merge each BB with its predecessor BB,
and end up with 32 'bonus insts' which are always executed, which
is much slower than the original CFG.

The root cause is that SimplifyCFG does not consider the
accumulated cost of 'bonus insts' which are folded from
different BB's.

This patch fixes that by introducing a ValueMap to track
costs of 'bonus insts' coming from different BB's into
the same BB, and cuts off if the accumulated cost
exceeds a threshold.
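
The accumulation described above can be sketched in standalone C++. This is only an illustration, not the code from the patch: `BonusCostTracker`, `tryFold`, `accumulated`, `exampleUsage`, the integer block ids and the threshold value are all hypothetical, and a `std::unordered_map` stands in for the ValueMap mentioned in the message.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for the per-block cost tracking described above.
// Blocks are identified by plain integers here.
class BonusCostTracker {
public:
  explicit BonusCostTracker(unsigned Threshold) : Threshold(Threshold) {}

  // Returns true (and records the cost) if folding bonus instructions of
  // cost `Cost` into block `PredId` keeps the accumulated cost within the
  // threshold. Otherwise the accumulated cost is left unchanged.
  bool tryFold(int PredId, unsigned Cost) {
    unsigned &Accumulated = Costs[PredId]; // starts at 0 for a new block
    if (Accumulated + Cost > Threshold)
      return false;
    Accumulated += Cost;
    return true;
  }

  // Accumulated bonus cost already folded into block `PredId`.
  unsigned accumulated(int PredId) const {
    auto It = Costs.find(PredId);
    return It == Costs.end() ? 0 : It->second;
  }

private:
  unsigned Threshold;
  std::unordered_map<int, unsigned> Costs;
};

// Usage example: with a threshold of 2, the third fold into block 0 is
// rejected, while an unrelated block 1 is unaffected.
void exampleUsage() {
  BonusCostTracker Tracker(2);
  bool First = Tracker.tryFold(0, 1);
  bool Second = Tracker.tryFold(0, 1);
  bool Third = Tracker.tryFold(0, 1);
  bool Other = Tracker.tryFold(1, 1);
  assert(First && Second && !Third && Other);
  assert(Tracker.accumulated(0) == 2);
}
```

Keeping the running total per destination block is what stops a chain of individually cheap folds, as in the unrolled loop, from piling up unbounded always-executed work.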

Reviewed by: Artem Belevich, Florian Hahn, Nikita Popov, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D132408
2022-09-18 20:21:14 -04:00