776 Commits

Author SHA1 Message Date
Simon Moll
b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Fangrui Song
557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Nikita Popov
1721ff1dfd [GVN] Enable enable-split-backedge-in-load-pre option by default
This option was added in D89854. It prevents GVN from performing
load PRE in a loop, if doing so would require critical edge
splitting on the backedge. From the review:

> I know that GVN Load PRE negatively impacts peeling,
> loop predication, so the passes expecting that latch has
> a conditional branch.

In the PhaseOrdering test in this patch, splitting the backedge
negatively affects vectorization: After critical edge splitting,
the loop gets rotated, effectively peeling off the first loop
iteration. The effect is that the first element is handled
separately, then the bulk of the elements use a vectorized
reduction (but using unaligned, off-by-one memory accesses) and
then a tail of 15 elements is handled separately again.

It's probably worth noting that the loop load PRE from D99926 is
not affected by this change (as it does not need backedge
splitting). This is about normal load PRE that happens to occur
inside a loop.

Differential Revision: https://reviews.llvm.org/D126382
2022-05-30 09:55:58 +02:00
Owen Anderson
939a43461b Revert "Replace the custom linked list in LeaderTableEntry with TinyPtrVector."
This reverts commit 1e9114984490b83d4665f12a11f84c83f50ca8f0.

Pending further discussion.
2022-05-26 09:50:36 -07:00
Owen Anderson
1e91149844 Replace the custom linked list in LeaderTableEntry with TinyPtrVector.
The purpose of the custom linked list was to optimize for the case
of a single-element list. It turns out that TinyPtrVector handles
the same basic scenario even better, reducing the size of
LeaderTableEntry by 33%, and requiring only log2(N) allocations
as the size of the list grows. The only downside is that we have
to store the Value's and BasicBlock's in separate vectors, which
is slightly awkward in a few cases. Fortunately that ends up being
entirely encapsulated inside helper functions.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D125205
2022-05-25 23:52:44 -07:00
Nikita Popov
b9dc565147 [GVN] Encode GEPs in offset representation
When using opaque pointers, convert GEPs into offset representation
of the form P + V1 * Scale1 + V2 * Scale2 + ... + ConstantOffset.
This allows us to recognize equivalent address calculations even if
the GEPs don't use the same source element type.

This fixes an opaque pointer codegen regression seen in rustc.

Differential Revision: https://reviews.llvm.org/D124527
2022-04-28 09:32:05 +02:00
Martin Storsjö
46776f7556 Fix warnings about variables that are set but only used in debug mode
Add void casts to mark the variables used, next to the places where
they are used in assert or `LLVM_DEBUG()` expressions.

Differential Revision: https://reviews.llvm.org/D123117
2022-04-06 10:01:46 +03:00
Nikita Popov
cf18ec445d [GVN] Check load type in select PRE
This is no longer implicitly guaranteed with opaque pointers.
2022-03-14 12:46:54 +01:00
serge-sans-paille
59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Nikita Popov
2a1b1f1b1b [GVN] Store source element type for GEP expressions
To avoid incorrectly merging GEPs with different source types
under opaque pointers.

To avoid increasing the Expression structure size, this reuses the
existing type member. The code does not rely on this to be the
expression result type, it's only used as a disambiguator.
2022-02-11 13:03:30 +01:00
Florian Hahn
1c9f15426f
[GVN] Replace PointerIntPair with separate pointer & kind fields (NFC).
After adding another value kind in 8a12cae862af, Value * pointers do not
have enough available empty bits to store the kind (e.g. on ARM)

To address this, the patch replaces the PointerIntPair with separate
value and kind fields.
2022-02-02 09:44:15 +00:00
Florian Hahn
8a12cae862
[GVN] Support load of pointer-select to value-select conversion.
This patch extends the available-value logic to detect loads
of pointer-selects that can be replaced by a value select.

For example, consider the code below:

  loop:
    %sel.phi = phi i32* [ %start, %ph ], [ %sel, %ph ]
    %l = load %ptr
    %l.sel = load %sel.phi
    %sel = select cond, %ptr, %sel.phi
    ...

  exit:
    %res = load %sel
    use(%res)

The load of the pointer phi can be replaced by a load of the start value
outside the loop and a new phi/select chain based on the loaded values,
as illustrated below

    %l.start = load %start
  loop:
    sel.phi.prom = phi i32 [ %l.start, %ph ], [ %sel.prom, %ph ]
    %l = load %ptr
    %sel.prom = select cond, %l, %sel.phi.prom
    ...
  exit:
    use(%sel.prom)

This is a first step towards alllowing vectorizing loops using common libc++
library functions, like std::min_element (https://clang.godbolt.org/z/6czGzzqbs)

    #include <vector>
    #include <algorithm>

    int foo(const std::vector<int> &V) {
        return *std::min_element(V.begin(), V.end());
    }

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D118143
2022-02-02 09:23:09 +00:00
Philip Reames
6b0ff0969d Extract utility function for checking initial value of allocation [NFC, try 2]
This is a reoccuring pattern, we can consolidate three copies into one.  The main motivation is to reduce usages of isMallocLike.

The original commit (which was quickly reverted) didn't account for the allocation function could be an invoke, test coverage for that case added in this commit.
2022-01-07 08:44:08 -08:00
Philip Reames
c6a0c1585a Revert "Extract utility function for checking initial value of allocation [NFC]"
This reverts commit 9ce30fe86f58df45b2c5daa601802593601c471d.  Appears to be causing a problem on a buildbot, revert while investigating.

https://green.lab.llvm.org/green//job/clang-stage1-RA/26818/consoleFull#-1502953973d489585b-5106-414a-ac11-3ff90657619c
2022-01-06 19:05:51 -08:00
Philip Reames
9ce30fe86f Extract utility function for checking initial value of allocation [NFC]
This is a reoccuring pattern, we can consolidate three copies into one.  The main motivation is to reduce usages of isMallocLike.
2022-01-06 18:02:14 -08:00
Nuno Lopes
84b285d6eb [GVN] Set phi entries of unreachable predecessors to poison instead of undef
This matches NewGVN's behavior.
2021-12-30 14:47:24 +00:00
Kazu Hirata
7505b7045f [llvm] Use GetElementPtrInst::indices (NFC) 2021-11-13 21:43:28 -08:00
Arthur Eubanks
1d8750c3da [NFC] Rename GVN -> GVNPass and SROA -> SROAPass
To be more consistent with other pass struct names.

There are still more passes that don't end with "Pass", but these are the important ones.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D112935
2021-11-09 10:35:58 -08:00
Kazu Hirata
4f0225f6d2 [Transforms] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-10-01 09:57:40 -07:00
Nikita Popov
0fc624f029 [IR] Return AAMDNodes from Instruction::getMetadata() (NFC)
getMetadata() currently uses a weird API where it populates a
structure passed to it, and optionally merges into it. Instead,
we can return the AAMDNodes and provide a separate merge() API.
This makes usages more compact.

Differential Revision: https://reviews.llvm.org/D109852
2021-09-16 21:06:57 +02:00
Markus Lavin
1ac209ed76 [NPM] Added -print-pipeline-passes print params for a few passes.
Added '-print-pipeline-passes' printing of parameters for those passes
declared with *_WITH_PARAMS macro in PassRegistry.def.

Note that it only prints the parameters declared inside *_WITH_PARAMS as
in a few cases there appear to be additional parameters not parsable.

The following passes are now covered (i.e. all of those with *_WITH_PARAMS in
PassRegistry.def).

LoopExtractorPass - loop-extract
HWAddressSanitizerPass - hwsan
EarlyCSEPass - early-cse
EntryExitInstrumenterPass - ee-instrument
LowerMatrixIntrinsicsPass - lower-matrix-intrinsics
LoopUnrollPass - loop-unroll
AddressSanitizerPass - asan
MemorySanitizerPass - msan
SimplifyCFGPass - simplifycfg
LoopVectorizePass - loop-vectorize
MergedLoadStoreMotionPass - mldst-motion
GVN - gvn
StackLifetimePrinterPass - print<stack-lifetime>
SimpleLoopUnswitchPass - simple-loop-unswitch

Differential Revision: https://reviews.llvm.org/D109310
2021-09-15 08:34:04 +02:00
Kazu Hirata
8e86c0e4f4 [Scalar] Use make_early_inc_range (NFC) 2021-09-12 08:17:18 -07:00
Jingu Kang
b52171629f [GVN] Execute performLoopLoadPRE ahead of PerformLoadPRE
Differential Revision: https://reviews.llvm.org/D108204
2021-08-24 09:50:27 +01:00
Nikita Popov
2b70b68efb [GVN] Don't short-circuit load PRE
4ad41902e8c7481ccf3cdf6e618dfcd1e1fc10fc changed this code to
propagate Changed if scalar GEP PRE is performed. However, as
implemented this would skip the load PRE entirely if GEP indices
were PREd. Make sure load PRE runs even if Changed is already
true.

This likely has no functional effect as load PRE would then
occur on a later GVN iteration.
2021-08-22 21:12:58 +02:00
Simon Pilgrim
56541d1377 GVN.cpp - remove unused <vector> include. NFCI. 2021-06-13 14:06:32 +01:00
Daniil Fukalov
0b34acdab7 [NFC] Fix 'Load' name masking.
Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D103456
2021-06-02 11:09:53 +03:00
Arthur Eubanks
6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Adam Nemet
ab1f6ffa56 [GVN] Improve analysis for missed optimization remark
This change tries to handle multiple dominating users of the pointer operand
by choosing the most immediately dominating one, if possible.  While making
this change I also found that the previous implementation had a missing break
statement, making all loads with an odd number of dominating users emit an
OtherAccess value, so that has also been fixed.

Patch by Henrik G Olsson!

Differential Revision: https://reviews.llvm.org/D79097
2021-05-17 21:51:15 -07:00
dfukalov
fdae3fc8b3 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-05-14 11:17:14 +03:00
Jordan Rupprecht
fec2945998 Revert "[GVN] Clobber partially aliased loads."
This reverts commit 6c570442318e2d3b8b13e95c2f2f588d71491acb.

It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543:

```
$ cat repro.ll
; ModuleID = 'repro.ll'
source_filename = "repro.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.widget = type { i32 }
%struct.baz = type { i32, %struct.snork }
%struct.snork = type { %struct.spam }
%struct.spam = type { i32, i32 }

@global = external local_unnamed_addr global %struct.widget, align 4
@global.1 = external local_unnamed_addr global i8, align 1
@global.2 = external local_unnamed_addr global i32, align 4

define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 {
bb:
  %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1
  %tmp1 = bitcast %struct.snork* %tmp to i64*
  %tmp2 = load i64, i64* %tmp1, align 4
  %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1
  %tmp4 = icmp ugt i64 %tmp2, 4294967295
  br label %bb5

bb5:                                              ; preds = %bb14, %bb
  %tmp6 = load i32, i32* %tmp3, align 4
  %tmp7 = icmp ne i32 %tmp6, 0
  %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false
  %tmp9 = zext i1 %tmp8 to i8
  store i8 %tmp9, i8* @global.1, align 1
  %tmp10 = load i32, i32* @global.2, align 4
  switch i32 %tmp10, label %bb11 [
    i32 1, label %bb12
    i32 2, label %bb12
  ]

bb11:                                             ; preds = %bb5
  br label %bb14

bb12:                                             ; preds = %bb5, %bb5
  %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4
  br label %bb14

bb14:                                             ; preds = %bb12, %bb11
  br label %bb5
}
$ opt -O2 repro.ll -disable-output
opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value *llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst *, unsigned int, llvm::Type *, llvm::Instruction *, const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output
...
```
2021-05-11 16:08:53 -07:00
dfukalov
6c57044231 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-04-24 14:14:20 +03:00
Max Kazantsev
8fe62b7af1 [GVN] Introduce loop load PRE
This patch allows PRE of the following type of loads:

```
preheader:
  br label %loop

loop:
  br i1 ..., label %merge, label %clobber

clobber:
  call foo() // Clobbers %p
  br label %merge

merge:
  ...
  br i1 ..., label %loop, label %exit

```

Into
```
preheader:
  %x0 = load %p
  br label %loop

loop:
  %x.pre = phi(x0, x2)
  br i1 ..., label %merge, label %clobber

clobber:
  call foo() // Clobbers %p
  %x1 = load %p
  br label %merge

merge:
  x2 = phi(x.pre, x1)
  ...
  br i1 ..., label %loop, label %exit

```

So instead of loading from %p on every iteration, we load only when the actual clobber happens.
The typical pattern which it is trying to address is: hot loop, with all code inlined and
provably having no side effects, and some side-effecting calls on cold path.

The worst overhead from it is, if we always take clobber block, we make 1 more load
overall (in preheader). It only matters if loop has very few iteration. If clobber block is not taken
at least once, the transform is neutral or profitable.

There are several improvements prospect open up:
- We can sometimes be smarter in loop-exiting blocks via split of critical edges;
- If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that
  we don't know if their sum is colder than the header.

Differential Revision: https://reviews.llvm.org/D99926
Reviewed By: reames
2021-04-22 12:50:38 +07:00
Max Kazantsev
baf17e2cc9 [NFC] Move statictic increment out of helper 2021-04-09 16:32:35 +07:00
Max Kazantsev
275f3a2540 [GVN][NFC] Factor out load elimination logic via PRE for reuse 2021-04-09 16:12:25 +07:00
Arthur Eubanks
c5d1ccbcdf [GVN] Properly invalidate ICF cache when we simplify a value
This fixes a "Cached first special instruction is wrong!" assert.

The assert fires because replacing a value with another can cause an
instruction to no longer be "special" to ICF. In this case,
devirtualization happened, turning an indirect call to a
call to a willreturn function which is no longer special.

Reviewed By: nikic, rnk

Differential Revision: https://reviews.llvm.org/D99977
2021-04-08 14:01:57 -07:00
Philip Reames
9ef6aa020b Plumb AssumeInst through operand bundle apis [nfc]
Follow up to a6d2a8d6f5.  This covers all the public interfaces of the bundle related code.  I tried to cleanup the internals where the changes were obvious, but there's definitely more room for improvement.
2021-04-06 12:53:53 -07:00
Arthur Eubanks
4e83e59eb8 [GVN] Add missing ICF update
performScalarPREInsertion() inserts instructions into blocks that we
need to tell ImplicitControlFlowTracking about, otherwise the ICF cache
may be invalid.

Fixes PR49193.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99909
2021-04-06 10:13:42 -07:00
Philip Reames
21d4839948 Move GCRelocateInst and GCResultInst to IntrinsicInst.h [nfc]
These two are part of the IntrinsicInst class hierarchy and it helps to cut down on some redundant includes.
2021-04-06 08:33:15 -07:00
Philip Reames
52ecd94cfb Remove last remnants of PR49607 migration [NFC]
The key change (4f5e92c) to switch gc.result and gc.relocate to being readnone landed nearly two weeks ago, and we haven't seen any fallout.  Time to remove the code added to make reverting easy.
2021-04-06 07:56:55 -07:00
Max Kazantsev
a1d83776bf [NFC] Undo some erroneous renamings
Some vars renamed by mistake during auto-replacements. Undoing them.
2021-04-01 13:10:10 +07:00
Max Kazantsev
630818a850 [NFC] Disambiguate LI in GVN
Name GVN uses name 'LI' for two different unrelated things:
LoadInst and LoopInfo. This patch relates the variables with
former meaning into 'Load' to disambiguate the code.
2021-04-01 12:40:35 +07:00
KAWASHIMA Takahiro
5fac7c6046 [GVN] Propagate llvm.access.group metadata of loads
Before this change, the `llvm.access.group` metadata was dropped
when moving a load instruction in GVN. This prevents vectorizing
a C/C++ loop with `#pragma clang loop vectorize(assume_safety)`.
This change propagates the metadata as well as other metadata if
it is safe (the move-destination basic block and source basic
block belong to the same loop).

Differential Revision: https://reviews.llvm.org/D93503
2021-04-01 10:00:48 +09:00
Philip Reames
6972e39d47 [gvn] CSE gc.relocates based on meaning, not spelling (try 2)
This was (partially) reverted in cfe8f8e0 because the conversion from readonly to readnone in Intrinsics.td exposed a couple of problems.  This change has been reworked to not need that change (via some explicit checks in client code).  This is being done to address the original optimization issue and simplify the testing of the readonly changes.  I'm working on that piece under 49607.

Original commit message follows:

The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.)

We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.

Differential Revision: https://reviews.llvm.org/D97974
2021-03-16 10:59:31 -07:00
Nikita Popov
3fedaf2a52 [GVN] Don't explicitly materialize undefs from dead blocks
When materializing an available load value, do not explicitly
materialize the undef values from dead blocks. Doing so will
will force creation of a phi with an undef operand, even if there
is a dominating definition. The phi will be folded away on
subsequent GVN iterations, but by then we may have already
poisoned MDA cache slots.

Simply don't register these values in the first place, and let
SSAUpdater do its thing.
2021-03-06 23:46:24 +01:00
Philip Reames
5db2735af9 [gvn] Handle simply phi equivalence cases
GVN basically doesn't handle phi nodes at all. This is for a reason - we can't value number their inputs since the predecessor blocks have probably not been visited yet.

However, it also creates a significant pass ordering problem. As it stands, instcombine and simplifycfg ends up implementing CSE of phi nodes. This means that for any series of CSE opportunities intermixed with phi nodes, we end up having to alternate instcombine/simplifycfg and gvn to make progress.

This patch handles the simplest case by simply preprocessing the phi instructions in a block, and CSEing them if they are syntactically identical. This turns out to be powerful enough to handle many cases in a single invocation of GVN since blocks which use the cse'd phi results are visited after the block containing the phi. If there's a CSE opportunity in one the phi predecessors required to recognize the phi CSE opportunity, that will require a second iteration on the function. (Still within a single run of gvn though.)

Compile time wise, this could go either way. On one hand, we're potentially causing GVN to iterate over the function more. On the other, we're cutting down on iterations between two passes and potentially shrinking the IR aggressively. So, a bit unclear what to expect.

Note that this does still rely on instcombine to canonicalize block order of the phis, but that's a one time transformation independent of the values incoming to the phi.

Differential Revision: https://reviews.llvm.org/D98080
2021-03-06 09:31:12 -08:00
Philip Reames
51b13a7ea0 [gvn] CSE gc.relocates based on meaning, not spelling
The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.)

We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.

Differential Revision: https://reviews.llvm.org/D97974

(Note: Part of the reviewed change was split and landed as f352463a)
2021-03-05 10:16:12 -08:00
Kazu Hirata
5fc9e30985 [Scalar] Use range-based for loops (NFC) 2021-02-25 19:54:38 -08:00
ksyx
4125cabce1 [GVN] Fix a typo in comment
NFC.

Differential Revision: https://reviews.llvm.org/D97200

Reviewed By: fhahn
2021-02-23 10:39:34 +08:00
Kazu Hirata
302313a264 [Transforms] Use range-based for loops (NFC) 2021-02-08 22:33:53 -08:00
Kazu Hirata
fb74e1e78a [Transforms/Scalar] Use range-based for loops (NFC) 2021-02-04 21:18:05 -08:00