132 Commits

Author SHA1 Message Date
Johannes Doerfert
a81fff8afd Reapply "[Intrinsics] Add nocallback to the default intrinsic attributes"
This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and
reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with additional test
changes.
2022-03-25 09:36:50 -05:00
Nikita Popov
f00cd27646 [Verifier] Verify llvm.access.group metadata
According to LangRef, an access scope must have zero operands and
be distinct. The access group may either be a single access scope
or a list of access scopes.

LoopInfo may assert if this is not the case.
2022-03-14 16:16:36 +01:00
Roman Lebedev
c8ba2b67a0
[SimplifyCFG] 'merge compatible invokes': fully support indirect invokes
As long as *all* the invokes in the set are indirect,
we can merge them, but don't merge direct invokes into the set,
even though it would be legal to do.
2022-02-08 21:29:38 +03:00
Roman Lebedev
414b47645d
[SimplifyCFG] 'merge compatible invokes': don't create trivial PHI's with all-identical incoming values 2022-02-08 21:29:38 +03:00
Roman Lebedev
e2aed0b047
[NFC][SimplifyCFG] 'merge compatible invokes': tests for indirect invokes. 2022-02-08 21:29:38 +03:00
Roman Lebedev
42ca7cc889
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ uses
If the original invokes had uses, the uses must have been in PHI's,
but that immediately results in the incoming values being incompatible.
But we'll replace uses of the original invokes with the use of the
merged invoke, so as long as the incoming values become compatible
after that, we can merge.
2022-02-08 17:49:38 +03:00
Roman Lebedev
9986d60224
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ PHIs but no uses
As long as the incoming values for all the invokes in the set
are identical, we can merge the invokes.
2022-02-08 17:49:38 +03:00
Roman Lebedev
8411560fd0
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ no uses, no PHI's
Even if the invokes have normal destination, iff it's the same block,
we can merge them. For now, require that there are no PHI nodes,
and the returned values of invokes aren't used.
2022-02-08 17:49:38 +03:00
Roman Lebedev
1d5a3f70dc
[NFC][SimplifyCFG] 'merge compatible invokes': more tests for various edge-cases 2022-02-08 17:49:38 +03:00
Roman Lebedev
f5353c10af
[NFC][SimplifyCFG] 'merge compatible invokes': tests for non-unreachable normal destination 2022-02-05 02:15:07 +03:00
Roman Lebedev
55cd727c9a
[SimplifyCFG] 'merge compatible invokes': allow PHI nodes in landing pads
... iff the incoming values for the invokes-to-be-merged
are compatible (identical).
2022-02-04 20:26:44 +03:00
Roman Lebedev
332d70cd45
[NFC][SimplifyCFG] 'merge compatible invokes': tests w/ PHI's in landingpad 2022-02-04 20:26:44 +03:00
Roman Lebedev
36df803dfd
[SimplifyCFG] Merge compatible invokes of a landingpad
While nowadays SimplifyCFG knows how to hoist code from then-else blocks,
sink code from unconditional predecessors, and even promote the latter
by tail-merging `ret`/`resume` function terminators, that isn't everything.

While i (& others) have been trying to deal with merging/sinking `unreachable`,
apparently perhaps the more impactful remaining problem is merging the `throw`
calls.

If we start at the `landingpad`, all the predecessors are unwind edges of `invoke`s,
and in some cases some of the `invoke`s are mergeable.
```
/// This is a weird mix of hoisting and sinking. Visually, it goes from:
///          [...]        [...]
///            |            |
///        [invoke0]    [invoke1]
///           / \          / \
///     [cont0] [landingpad] [cont1]
/// to:
///      [...] [...]
///          \ /
///       [invoke]
///          / \
///     [cont] [landingpad]
```

This simplifies the IR/CFG, at the cost of debug info and extra PHI nodes.

Note that we don't require for *all* the `invokes` of the `landingpad`
to be mergeable, they can form more than a single set, we gracefully handle that.

For now, i completely disallowed normal destination, PHI nodes and indirect invokes
but that can be supported.

Out of all the CTMark projects, only 7zip is C++, so there isn't much impact:
https://llvm-compile-time-tracker.com/compare.php?from=ba8eb31bd9542828f6424e15a3014f80f14522c8&to=722fc871c84f14157d45c2159bc9c8c7e2825785&stat=size-total
... but there it currently causes size-total decrease.

Differential Revision: https://reviews.llvm.org/D117805
2022-02-04 17:04:21 +03:00
Roman Lebedev
6afbf8354b
[NFC][SimplifyCFG] 'merge compatible invokes': test with PHI nodes in unreachable normal destinations 2022-02-04 16:52:09 +03:00
Roman Lebedev
ee4ba9f3a1
Revert "[SimplifyCFG] Start redesigning FoldTwoEntryPHINode()."
Unfortunately, it seems we really do need to take the long route;
start from the "merge" block, find (all the) "dispatch" blocks,
and deal with each "dispatch" block separately, instead of simply
starting from each "dispatch" block like it would logically make sense,
otherwise we run into a number of other missing folds around
`switch` formation, missing sinking/hoisting and phase ordering.

This reverts commit 85628ce75b3084dc0f185a320152baf85b59aba7.
This reverts commit c5fff9095342a792bf4b9a077fe3c3a83c4e566c.
This reverts commit 34a98e1046e3aa55e5f26ab20a15e96b4034d25a.
This reverts commit 1e353f092288309d74d380367aa50bbd383780ed.
2022-02-03 12:32:50 +03:00
Roman Lebedev
1e353f0922
[SimplifyCFG] Start redesigning FoldTwoEntryPHINode().
The current `FoldTwoEntryPHINode()` is not quite designed correctly.
It starts from the merge point, and then tries to detect
the 'divergence' point.

Because of that, it is limited to the simple two-predecessor case,
where the PHI completely goes away. but that is rather pessimistic,
and it doesn't make much sense from the costmodel side of things.

For example if there is some other unrelated predecessor of
the merge point,  we could split the merge point so that
the then/else blocks first branch to an empty block
and then to the merge point, and then we'd be able to speculate
the then/else code.

But if we'd instead simply start at the divergence point,
and look for the merge point, then we'll just natively support this case.

There's also the fact that `SpeculativelyExecuteBB()` already does
just that, but only if there is a single block to speculate,
and with a much more restrictive cost model.
But that also means we have code duplication.

Now, sadly, while this is as much NFCI as possible,
there is just no way to cleanly migrate to
the proper implementation. The results *are* going to be different
somewhat because of various phase ordering effects and SimplifyCFG
block iteration strategy.
2022-02-02 17:53:56 +03:00
Roman Lebedev
73cb542930
[NFC][SimplifyCFG] Autogenerate checklines in a few tests being affected by upcoming change 2022-02-02 17:53:56 +03:00
Roman Lebedev
1455eddcf7
[NFC][SimplifyCFG] Add some tests for invoke merging 2022-01-20 20:37:29 +03:00
pvellien
4e1c207726 [SimplifyCFG] Fix assertion failure when reusing table switch comparison
After D116332, some icmps no longer fold with the target-independent
constant folder. The SimplifyCFG code assumed that the comparison
would always fold, which is not guaranteed. Explicitly check that the
result is either true or false.

Differential Revision: https://reviews.llvm.org/D117184
2022-01-18 09:30:54 +01:00
Roman Lebedev
82c8aca934
[SimplifyCFG] Be more aggressive when sinking into block followed by unreachable
I strongly believe we need some variant of this.

The main problem is e.g. that the glibc's assert has 4 parameters,
but the profitability check is only okay with one extra phi node,
so D116692 doesn't even trigger on most of the expected cases.

While that restriction probably makes sense in normal code, if we
are about to run off of a cliff (into an `unreachable`), this
successor block is unlikely so the cost to setup these PHI nodes
should not be on the hotpath, and shouldn't matter performance-wise.

Likewise, we don't sink if there are unconditional predecessors
UNLESS we'd sink at least one non-speculatable instruction,
which is a performance workaround, but if we are about to run into
`unreachable`, it shouldn't matter.

Note that we only allow the case where there are at
most unconditiona branches on the way to the unreachable block.

Differential Revision: https://reviews.llvm.org/D117045
2022-01-13 23:30:31 +03:00
Roman Lebedev
7b7a49a9fb
[NFC][SimplifyCFG] Add some more tests for sinking into 'unreachable' block 2022-01-11 22:35:20 +03:00
Owen Anderson
68079ef0eb Teach SimplifyCFG to fold switches into lookup tables in more cases.
In particular, it couldn't handle cases where lookup table constant
expressions involved bitcasts. This does not seem to come up
frequently in C++, but comes up reasonably often in Rust via
`#[derive(Debug)]`.

Originally reported by pcwalton.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D109565
2021-09-15 22:07:08 +00:00
Roman Lebedev
1901c98dd8
[SimplifyCFG] SwitchToLookupTable(): don't increase ret count
The very next SimplifyCFG pass invocation will tail-merge these two ret's
anyways, there is not much point in creating more work for ourselves.
2021-07-26 23:29:55 +03:00
Roman Lebedev
d7378259aa
[SimplifyCFG] SimplifyCondBranchToTwoReturns(): really only deal with different ret blocks
This function is called when some predecessor of an empty return block
ends with a conditional branch, with both successors being empty ret blocks.

Now, because of the way SimplifyCFG works, it might happen to simplify
one of the blocks in a way that makes a conditional branch
into an unconditional one, since it's destinations are now identical,
but it might not have actually simplified said conditional branch
into an unconditional one yet.

So, we have to check that ourselves first,
especially now that SimplifyCFG aggressively tail-merges
all ret and resume blocks.

Even if it was an unconditional branch already,
`SimplifyCFGOpt::simplifyReturn()` doesn't call `FoldReturnIntoUncondBranch()`
by default.
2021-07-23 00:36:59 +03:00
Roman Lebedev
3e6c383dc6
[SimplifyCFG] Rerun PHI deduplication after common code sinkinkg (PR51092)
`SinkCommonCodeFromPredecessors()` doesn't itself ensure that duplicate PHI nodes aren't created.
I suppose, we could teach it to do that on-the-fly (& account for the already-existing PHI nodes,
& adjust costmodel), the diff will be bigger than this.

The alternative is to schedule a new EarlyCSE pass invocation somewhere later in the pipeline.
Clearly, we don't have any EarlyCSE runs in module optimization passline, so this pattern isn't cleaned up...
That would perhaps better, but it will again have some compile time impact.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D106010
2021-07-15 16:34:34 +03:00
Philip Reames
e75a2dfe20 [tests] Stablize tests for possible change in deref semantics
There's a potential change in dereferenceability attribute semantics in the nearish future.  See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context.

This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics.  Note that for many of these cases, O3 would infer exactly these attributes on the test IR.

This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory.  There's a couple other tests which need more one-off attention, they'll be handled in another change.
2021-07-14 13:05:43 -07:00
Bjorn Pettersson
472462c472 [NewPM] Consistently use 'simplifycfg' rather than 'simplify-cfg'
There was an alias between 'simplifycfg' and 'simplify-cfg' in the
PassRegistry. That was the original reason for this patch, which
effectively removes the alias.

This patch also replaces all occurrances of 'simplify-cfg'
by 'simplifycfg'. Reason for choosing that form for the name is
that it matches the DEBUG_TYPE for the pass, and the legacy PM name
and also how it is spelled out in other passes such as
'loop-simplifycfg', and in other options such as
'simplifycfg-merge-cond-stores'.

I for some reason the name should be changed to 'simplify-cfg' in
the future, then I think such a renaming should be more widely done
and not only impacting the PassRegistry.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D105627
2021-07-09 09:47:03 +02:00
serge-sans-paille
4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee04c75b1f1332b4fd1ac4e8ef6c3c247.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille
bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, no attributes is quivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00
Roman Lebedev
cc63203908
[SimplifyCFG] Common code sinking: fix application of profitability check
The profitability check is: we don't want to create more than a single PHI
per instruction sunk. We need to create the PHI unless we'll sink
all of it's would-be incoming values.

But there is a caveat there.
This profitability check doesn't converge on the first iteration!
If we first decide that we want to sink 10 instructions,
but then determine that 5'th one is unprofitable to sink,
that may result in us not sinking some instructions that
resulted in determining that some other instruction
we've determined to be profitable to sink becoming unprofitable.

So we need to iterate until we converge, as in determine
that all leftover instructions are profitable to sink.

But, the direct approach of just re-iterating seems dumb,
because in the worst case we'd find that the last instruction
is unprofitable, which would result in revisiting instructions
many many times.

Instead, i think we can get away with just two passes - forward and backward.
However then it isn't obvious what is the most performant way to update
InstructionsToSink.
2021-04-29 21:11:40 +03:00
Roman Lebedev
1886aad9d0
[SimplifyCFG] Common code sinking: relax restriction on non-uncond predecessors
While we have a known profitability issue for sinking in presence of
non-unconditional predecessors, there isn't any known issues
for having multiple such non-unconditional predecessors,
so said restriction appears to be artificial. Lift it.
2021-04-29 01:01:01 +03:00
Roman Lebedev
410d03aabf
[NFC][SimplifyCFG] Add test for sinking common code with multuple cond predecessors 2021-04-29 01:01:00 +03:00
Roman Lebedev
a8e273f2ed
[NFC][SimplifyCFG] Add test showing that profitability check for sinking is broken
Essentially, we can't promise that the instruction is sinkable without
introducing PHI's until we know that it is profitable to sink.
2021-04-29 01:01:00 +03:00
Roman Lebedev
d16d820c2e
[SimplifyCFG] Try 2: sink all-indirect indirect calls
Note that we don't want to turn a partially-direct call
into an indirect one, that will break ICP amongst other things.
2021-04-28 19:08:54 +03:00
Roman Lebedev
38dd222b4a
[NFC][SimplifyCFG] Add common code sinking test with direct and indirect callees
This is the pattern ICP produces.
We shouldn't fold this back into an indirect call.
2021-04-28 19:08:54 +03:00
Roman Lebedev
262c679d32
Revert "[SimplifyCFG] Sinking indirect calls - they're already indirect anyways"
Seems to break indirect call promotion, LTO/Resolution/X86/load-sample-prof-icp.ll fails.

This reverts commit e57cf128b30a88c6dd42e8ef40deeedd0d7f116d.
2021-04-28 17:46:59 +03:00
Roman Lebedev
e57cf128b3
[SimplifyCFG] Sinking indirect calls - they're already indirect anyways 2021-04-28 17:36:23 +03:00
Roman Lebedev
677a0dee64
[NFC][SimplifyCFG] Add test for sinking indirect calls 2021-04-28 17:36:23 +03:00
Roman Lebedev
a95a5dc5ab
[NFC][SimplifyCFG] Move sink-common-code.ll into X86
There are post-commit notest for e4c61d5 that suggest
the test is failing on certain bots. It looks like
the code there isn't being moved, which suggests
cost-model involvement, which suggests that we need to
hardcode the target triple.

Hopefully this helps?
2021-04-28 14:10:25 +03:00
Roman Lebedev
e4c61d5f83
[NFC][SimplifyCFG] Autogenerate check lines in many test files
These are potentially being affected by an upcoming patch.
2021-04-27 22:05:42 +03:00
Philip Reames
854de7c4d0 [tests] Refresh a bunch of autogen test to adjust for format changes 2021-03-22 10:41:39 -07:00
Roman Lebedev
0895b836d7
[SimplifyCFG] FoldBranchToCommonDest(): don't deal with unconditional branches
The case where BB ends with an unconditional branch,
and has a single predecessor w/ conditional branch
to BB and a single successor of BB is exactly the pattern
SpeculativelyExecuteBB() transform deals with.
(and in this case they both allow speculating only a single instruction)

Well, or FoldTwoEntryPHINode(), if the final block
has only those two predecessors.

Here, in FoldBranchToCommonDest(), only a weird subset of that
transform is supported, and it's glued on the side in a weird way.
  In particular, it took me a bit to understand that the Cond
isn't actually a branch condition in that case, but just the value
we allow to speculate (otherwise it reads as a miscompile to me).
  Additionally, this only supports for the speculated instruction
to be an ICmp.

So let's just unclutter FoldBranchToCommonDest(), and leave
this transform up to SpeculativelyExecuteBB(). As far as i can tell,
this shouldn't really impact optimization potential, but if it does,
improving SpeculativelyExecuteBB() will be more beneficial anyways.

Notably, this only affects a single test,
but EarlyCSE should have run beforehand in the pipeline,
and then FoldTwoEntryPHINode() would have caught it.

This reverts commit rL158392 / commit d33f4efbfdef6ffccf212ab3e40a7673589085fd.
2021-01-22 17:22:49 +03:00
Arthur Eubanks
6699029b67 [NewPM][opt] Run the "default" AA pipeline by default
We tend to assume that the AA pipeline is by default the default AA
pipeline and it's confusing when it's empty instead.

PR48779

Initially reverted due to BasicAA running analyses in an unspecified
order (multiple function calls as parameters), fixed by fetching
analyses before the call to construct BasicAA.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D95117
2021-01-21 21:08:54 -08:00
Arthur Eubanks
ba9b4ea4ee Revert "[NewPM][opt] Run the "default" AA pipeline by default"
This reverts commit be611431cd1f5c826a55b531db92a63e84323866.

Other/new-pm-lto-defaults.ll failing
2021-01-21 20:16:34 -08:00
Arthur Eubanks
be611431cd [NewPM][opt] Run the "default" AA pipeline by default
We tend to assume that the AA pipeline is by default the default AA
pipeline and it's confusing when it's empty instead.

PR48779

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D95117
2021-01-21 19:46:38 -08:00
Roman Lebedev
c1b825d4b8
[SimplifyCFG] Teach FoldValueComparisonIntoPredecessors() to preserve DomTree, part 1 2021-01-01 03:25:22 +03:00
Roman Lebedev
7f221c9196
[SimplifyCFG] Teach SwitchToLookupTable() to preserve DomTree 2020-12-30 23:58:41 +03:00
Roman Lebedev
d4c0abb4a3
[SimplifyCFG] Teach FoldCondBranchOnPHI() to preserve DomTree 2020-12-30 00:48:11 +03:00
Roman Lebedev
18c407bf4c
[SimplifyCFG] Teach HoistThenElseCodeToIf() to preserve DomTree 2020-12-30 00:48:10 +03:00
Roman Lebedev
b7d00e29b7
[SimplifyCFG] Teach simplifySingleResume() to preserve DomTree 2020-12-20 00:18:34 +03:00