652 Commits

Author SHA1 Message Date
Nikita Popov
d18a2dc5c9 [GVN] Name instructions in test (NFC) 2023-01-06 17:28:18 +01:00
Nikita Popov
23abf93138 [GVN] Convert some tests to opaque pointers (NFC) 2022-12-23 10:00:59 +01:00
Guozhi Wei
4c13af22b4 [TEST] Pre-commit test for GVN PRE load
This is a test case for D139582.

In this test case, %v4 and %v5 can be moved to predecessors, %v3 can be changed to a PHI instruction.

Differential Revision: https://reviews.llvm.org/D140234
2022-12-20 18:43:31 +00:00
Joshua Cranmer
e6b02214c6 [IR] Add a target extension type to LLVM.
Target-extension types represent types that need to be preserved through
optimization, but otherwise are not introspectable by target-independent
optimizations. This patch doesn't add any uses of these types by an existing
backend, it only provides basic infrastructure such that these types would work
correctly.

Reviewed By: nikic, barannikov88

Differential Revision: https://reviews.llvm.org/D135202
2022-12-20 11:02:11 -05:00
Nikita Popov
99b95bde43 [GVN] Regenerate test checks (NFC) 2022-12-09 15:30:58 +01:00
Bjorn Pettersson
3528e63d89 [test] Remove duplicate RUN lines in Transform tests 2022-12-08 11:47:16 +01:00
Roman Lebedev
c67f0701bb [NFC] Port all GVN tests to -passes= syntax 2022-12-08 02:38:43 +03:00
Roman Lebedev
0aeedf581c [NFC] Port all GVN tests to -passes= syntax 2022-12-07 22:22:08 +03:00
Bjorn Pettersson
0676acb6fd [test] Switch to use -passes syntax in a bunch of test cases
Should cover most of the tests for GVN, GVNHoist, GVNSink, GlobalOpt,
GlobalSplit, InstCombine, Reassociate, SROA and TailCallElim that
had not been updated earlier.
2022-11-29 13:29:02 +01:00
Alex Gatea
7d0648cb6c [GVN] Patch for invalid GVN replacement
If PRE is performed as part of the main GVN pass (to PRE GEP
operands before processing loads), and it is performed across a
backedge, we will end up adding the new instruction to the leader
table of a block that has not yet been processed. When that block is
processed, GVN will incorrectly assume that the value is already
available, even though it is only available at the end of the
block.

Avoid this by not performing PRE across backedges.
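
A minimal sketch of one way to recognise a backedge, assuming blocks are numbered in reverse post-order (RPO) from the entry block. The `CFG` alias and the helpers `rpoNumbers` and `isBackedge` are hypothetical names for this illustration and are not taken from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical CFG model: successors[b] lists the successors of block b.
using CFG = std::vector<std::vector<int>>;

// Depth-first walk that records blocks in post-order.
static void postorder(const CFG &cfg, int block, std::vector<bool> &seen,
                      std::vector<int> &order) {
    seen[block] = true;
    for (int succ : cfg[block])
        if (!seen[succ])
            postorder(cfg, succ, seen, order);
    order.push_back(block);
}

// Assign each reachable block its position in reverse post-order.
// Unreachable blocks keep the number -1.
std::vector<int> rpoNumbers(const CFG &cfg, int entry) {
    std::vector<bool> seen(cfg.size(), false);
    std::vector<int> order;
    postorder(cfg, entry, seen, order);
    std::reverse(order.begin(), order.end());
    std::vector<int> number(cfg.size(), -1);
    for (int i = 0; i < static_cast<int>(order.size()); ++i)
        number[order[i]] = i;
    return number;
}

// An edge pred -> block whose source does not come earlier in RPO than its
// target is a retreating edge, i.e. a backedge in a reducible CFG.
bool isBackedge(const std::vector<int> &rpo, int pred, int block) {
    return rpo[pred] >= rpo[block];
}
```

In this model, a PRE step that would insert an instruction into `pred` on behalf of `block` would be skipped whenever `isBackedge(rpo, pred, block)` holds, because `block` is visited before `pred`.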

Fixes https://github.com/llvm/llvm-project/issues/58418.

Differential Revision: https://reviews.llvm.org/D136095
2022-11-04 14:28:17 +01:00
Nikita Popov
b246ca79fa [GVN] Regenerate test checks (NFC) 2022-10-28 17:29:29 +02:00
Arthur Eubanks
f3a928e233 [opt] Don't translate legacy -analysis flag to require<analysis>
Tests relying on this should explicitly use -passes='require<analysis>,foo'.
2022-10-07 14:54:34 -07:00
Sebastian Peryt
99c9b37d11 [NFC][1/n] Remove -enable-new-pm=0 flags from lit tests
This is the first patch in a series intended for removing flag
-enable-new-pm=0 from lit tests. This is part of a bigger
effort of completely removing legacy code related to legacy
pass manager in favor of currently default new pass manager.

In this patch the flag has been removed only from tests where no significant
change has been required because checks have been duplicated for
both PMs.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D134150
2022-09-19 09:57:37 -07:00
Craig Topper
50a699e362 [IR][VP] Remove IntrArgMemOnly from vp.gather/scatter.
IntrArgMemOnly is only valid for intrinsics that use a scalar
pointer argument. These intrinsics use a vector of pointer.

Alias analysis will try to find a scalar pointer argument and
will return incorrect alias results when it doesn't find one.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133898
2022-09-14 15:00:07 -07:00
Craig Topper
6384044df4 [GVN][VP] Add test case for incorrect removal of a vp.gather. NFC
Pre-commit for D133898

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133899
2022-09-14 15:00:07 -07:00
zhongyunde
8a15695be2 [AA] Improve the BasicAA analysis capability
According to https://discourse.llvm.org/t/memoryssa-does-the-accessedbetween-support-scalable-vector-pointer/65052,
scalable vector support in BasicAA is currently quite limited.
It can be improved for a constant-offset GEP when the scalable index is zero, e.g.:
  getelementptr <vscale x 4 x i32>, ptr %p, i64 0, i64 %i

Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D133567
2022-09-12 19:41:17 +08:00
Nikita Popov
4f046bc8e0 [PHITranslateAddr] Require dominance when searching for translated address (PR57025)
This is a fix for PR57025 and an alternative to D131776. The problem
in the phi-translation-to-wrong-context.ll test case is that phi
translation of %gep.j into if2 picks %gep.i as the result. While this
instruction has the correct pointer address, it occurs in a context
where %i != 0. As such, we get a NoAlias result for the store in
if2, even though they do alias for %i == 0 (which is legal in the
original context of the pointer).

PHITranslateValue already has a MustDominate option, which can be
used to restrict PHI translation results to values that dominate the
translated-into block. However, this is more aggressive than what we
need and would significantly regress GVN results. In particular, if
we have a pointer value that does not require any translation, then
it is fine to continue using that value in the predecessor, because
the context is still correct for the original query. We only run into
problems if PHITranslateSubExpr() picks a completely random
instruction in a context that may have preconditions that do not hold.

Fix this by always performing the dominance checks in
PHITranslateSubExpr(), without enabling the more general MustDominate
requirement.

Fixes https://github.com/llvm/llvm-project/issues/57025. This also
fixes the test case for https://github.com/llvm/llvm-project/issues/30999,
but I'm not sure whether that's just the particular test case,
or a general solution to the problem.

Differential Revision: https://reviews.llvm.org/D132935
2022-09-01 16:26:42 +02:00
Bjorn Pettersson
3aab9d2bb7 [GVN] Pre-commit test case showing miscompile in github issue #57025
This commit adds a reproducer for
  https://github.com/llvm/llvm-project/issues/57025
showing a miscompile in GVN.

Not sure how likely this kind of fault would be in a normal pipeline,
considering that the input IR has some dead code in it. On the other
hand, GVN itself sometimes creates dead basic blocks when splitting
critical edges. Anyway, the fault was found when doing fuzz testing
using random pass pipelines.

Differential Revision: https://reviews.llvm.org/D131775
2022-09-01 14:43:24 +02:00
Nikita Popov
b10e508c19 [GVN] Add another test for phi translation miscompile (NFC) 2022-08-31 09:14:53 +02:00
Nikita Popov
07bfbce988 [GVN] Regenerate test checks (NFC) 2022-08-30 12:06:37 +02:00
Augie Fackler
12c0bf8ba9 tests: add attributes that would normally come from inferattrs
As my goal is to remove at least _some_ functions from the static list
in MemoryBuiltins.cpp, these tests either need to run inferattrs or
statically declare these attributes to keep passing. A couple of tests
had alternate cases which are no longer meaningful, e.g.
`malloc-load-removal.ll`.

Differential Revision: https://reviews.llvm.org/D123087
2022-07-25 17:29:00 -04:00
Peter Waller
f8919d2f7e [NFC][GVN] Put phi-translation of 'add' behind a switch
The code in this `#if 0` block appears to be a net benefit. Put it
behind a switch defaulting to off to support experimentation and as a
request for comment.

The codegen impact of enabling this that I'm currently pursuing is that
it allows PRE to take place more frequently, particularly in loops with
second order recurrences.

Preliminary experimental data:

Across LNT on AArch64, 54 benchmarks are sped up by >1%, and 42 are
regressed by >1%, the geomean (exec_time_enabled / exec_time_disabled)
of these 96 "1% or greater significance" benchmarks is 0.991. For the
full set of 770 benchmarks it's 0.998.
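
For reference, a small sketch of how such a geomean of execution-time ratios can be computed. The function name `geomeanRatio` and its inputs are hypothetical; this is not the LNT tooling, only the arithmetic behind the figures quoted above:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Geometric mean of the per-benchmark ratios enabled[i] / disabled[i].
// The sum is taken in log space so the running product cannot overflow.
double geomeanRatio(const std::vector<double> &enabled,
                    const std::vector<double> &disabled) {
    assert(!enabled.empty() && enabled.size() == disabled.size());
    double logSum = 0.0;
    for (std::size_t i = 0; i < enabled.size(); ++i) {
        assert(enabled[i] > 0.0 && disabled[i] > 0.0);
        logSum += std::log(enabled[i] / disabled[i]);
    }
    return std::exp(logSum / static_cast<double>(enabled.size()));
}
```

A result below 1.0 means the enabled configuration is faster on geometric average. One benchmark that halves its time and another that doubles it cancel out to exactly 1.0.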

There are two benchmarks which experience a >30% speedup, and the worst
slowdown is ~12%, and for every benchmark with a slowdown there is a
benchmark which is sped up by a greater factor.

Differential Revision: https://reviews.llvm.org/D130241
2022-07-25 07:59:47 +00:00
Nikita Popov
2a721374ae [IR] Don't use blockaddresses as callbr arguments
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:

    ; Before:
    %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
    to label %asm.fallthrough [label %foo]
    ; After:
    %res = callbr i8* asm "", "=r,r,!i"(i8* %x)
    to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:

* Allow unrolling/peeling/rotation of callbr, or any other
  clone-based optimizations
  (https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
  (https://github.com/llvm/llvm-project/issues/45248)

This is just the IR representation change though; I will follow up
with patches to remove limitations in various transformation passes
that are no longer needed.

Differential Revision: https://reviews.llvm.org/D129288
2022-07-15 10:18:17 +02:00
Nikita Popov
34a5c2bcf2 [BasicBlockUtils] Allow critical edge splitting with callbr terminators
After D129205, we support SplitBlockPredecessors() for predecessors
with callbr terminators. This means that it is now also safe to
invoke critical edge splitting for an edge coming from a callbr
terminator. Remove checks in various passes that were protecting
against that.

Differential Revision: https://reviews.llvm.org/D129256
2022-07-08 09:20:44 +02:00
Vir Narula
89a99ec900 [GVN] Bug fix to reportMayClobberedLoad remark
Bug fix to avoid an assertion failure when generating remarks in GVN.

The intention of the assert is correct, but it ignores the edge case of the instructions being equivalent.

Reduced input that causes crash when remarks are turned on:
```
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx12.0.0"

define ptr @ReplaceWithTidy(ptr %zz_hold) {
cond.end480.us:
  %0 = load ptr, ptr null, align 8
  store ptr %0, ptr %0, align 8
  store ptr null, ptr %zz_hold, align 8
  %1 = load ptr, ptr %0, align 8
  store ptr %1, ptr null, align 8
  ret ptr null
}
```

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D129235
2022-07-06 17:42:05 -07:00
Nikita Popov
10c531cd5b [SCCP] Simplify CFG in SCCP as well
Currently, we only remove dead blocks and non-feasible edges in
IPSCCP, but not in SCCP. I'm not aware of any strong reason for
that difference, so this patch updates SCCP to perform the CFG
cleanup as well.

Compile-time impact seems to be pretty minimal, in the 0.05%
geomean range on CTMark.

For the test case from https://reviews.llvm.org/D126962#3611579
the result after -sccp now looks like this:

    define void @test(i1 %c) {
    entry:
      br i1 %c, label %unreachable, label %next
    next:
      unreachable
    unreachable:
      call void @bar()
      unreachable
    }

-jump-threading does nothing on this, but -simplifycfg will produce
the optimal result.

Differential Revision: https://reviews.llvm.org/D128796
2022-06-30 09:25:03 +02:00
Florian Hahn
78c6b1488f [CaptureTracking] Increase limit and use it for all visited uses.
Currently the MaxUsesToExplore limit only applies to the number of users
per value, not the total number of users to explore.

The current limit of 20 pessimizes IR with opaque pointers in some
cases. Without opaque pointers, we have deeper pointer def-use chains in
general due to extra bitcasts and geps for structs with index 0.

With opaque pointers the def-use chain is not as deep but wider, due to
bitcasts & 0-geps missing.

To improve the situation for opaque pointers, this patch does 2 things:

 1. Apply the limit to the total number of uses visited. From the
    wording in the description of the option it seems like this may be
    the original intention. With the current implementation we could
    still end up walking a lot of uses.
 2. Increase the limit to 100. This is quite arbitrary, but enables
    a good number of additional optimizations.
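
The difference between the two interpretations of the limit can be sketched as follows. This is a simplified model, not the CaptureTracking code: the `UseGraph` type and both walk functions are hypothetical, and the model follows every user, whereas the real analysis only follows certain kinds of uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical def-use graph: users[v] lists the values that use value v.
using UseGraph = std::vector<std::vector<int>>;

struct WalkResult {
    bool gaveUp;              // true if the limit was hit
    std::size_t visitedUses;  // uses inspected before finishing or giving up
};

// Limit applied per value: give up only if a single value has too many users.
WalkResult walkPerValueLimit(const UseGraph &users, int root,
                             std::size_t maxUses) {
    std::vector<bool> seen(users.size(), false);
    std::vector<int> worklist{root};
    seen[root] = true;
    std::size_t visited = 0;
    while (!worklist.empty()) {
        int v = worklist.back();
        worklist.pop_back();
        if (users[v].size() > maxUses)
            return {true, visited};
        for (int u : users[v]) {
            ++visited;
            if (!seen[u]) {
                seen[u] = true;
                worklist.push_back(u);
            }
        }
    }
    return {false, visited};
}

// Limit applied to the whole walk: give up once maxUses uses were inspected.
WalkResult walkTotalLimit(const UseGraph &users, int root,
                          std::size_t maxUses) {
    std::vector<bool> seen(users.size(), false);
    std::vector<int> worklist{root};
    seen[root] = true;
    std::size_t visited = 0;
    while (!worklist.empty()) {
        int v = worklist.back();
        worklist.pop_back();
        for (int u : users[v]) {
            if (visited == maxUses)
                return {true, visited};
            ++visited;
            if (!seen[u]) {
                seen[u] = true;
                worklist.push_back(u);
            }
        }
    }
    return {false, visited};
}
```

With a per-value limit, a narrow but deep use tree is walked in full no matter how many uses it has in total. With a total limit, the amount of work per query is bounded by the limit itself.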

Those adjustments have a noticeable compile-time impact though. In part
that is likely due to additional transformations (and conversely
the current baseline misses optimizations after switching to opaque
pointers).

This recovers some regressions that showed up after enabling opaque
pointers.

Limit=100:

* NewPM-O3: +0.21%
* NewPM-ReleaseThinLTO: +0.87%
* NewPM-ReleaseLTO-g: +0.46%

https://llvm-compile-time-tracker.com/compare.php?from=2e50ecb2ef4e1da1aeab05bcf66380068e680991&to=7e6fbe519d958d09f32f01d5d44a622f551e2031&stat=instructions

Limit=60:

* NewPM-O3: +0.14%
* NewPM-ReleaseThinLTO: +0.41%
* NewPM-ReleaseLTO-g: +0.21%

https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=520563fdc146319aae90d06f88d87f2e9e1247b7&stat=instructions

Limit=40:
* NewPM-O3: +0.11%
* NewPM-ReleaseThinLTO: +0.12%
* NewPM-ReleaseLTO-g: +0.09%

https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=c9182576e9fe3f1c84a71479665aef91a416318c&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126236
2022-06-02 21:43:58 +01:00
Florian Hahn
44c86e5cdc [GVN] Add test for capture tracking use limit.
Test for capture-tracking-max-uses-to-explore, adjusted in D126236.
2022-06-02 20:15:26 +01:00
Nikita Popov
1721ff1dfd [GVN] Enable enable-split-backedge-in-load-pre option by default
This option was added in D89854. It prevents GVN from performing
load PRE in a loop, if doing so would require critical edge
splitting on the backedge. From the review:

> I know that GVN Load PRE negatively impacts peeling,
> loop predication, so the passes expecting that latch has
> a conditional branch.

In the PhaseOrdering test in this patch, splitting the backedge
negatively affects vectorization: After critical edge splitting,
the loop gets rotated, effectively peeling off the first loop
iteration. The effect is that the first element is handled
separately, then the bulk of the elements use a vectorized
reduction (but using unaligned, off-by-one memory accesses) and
then a tail of 15 elements is handled separately again.

It's probably worth noting that the loop load PRE from D99926 is
not affected by this change (as it does not need backedge
splitting). This is about normal load PRE that happens to occur
inside a loop.

Differential Revision: https://reviews.llvm.org/D126382
2022-05-30 09:55:58 +02:00
Philip Reames
6381d4845b [tests] Add test coverage for issue causing revert f7988d0
As theorized, it does look like opnew is not getting inferred inaccessiblememonly.
2022-05-18 08:17:57 -07:00
Philip Reames
f7988d08a8 Revert "[BasicAA] Remove unneeded special case for malloc/calloc"
This reverts commit 9b1e00738c5ddba681e17e5cb7c260d9afc4c3a7.

Nikic reported in commit thread that I had forgotten history here, and that a) we'd tried this before, and b) had to revert due to an unexpected codegen impact.  Current measurements confirm the same issue still exists.
2022-05-18 07:35:27 -07:00
Philip Reames
9b1e00738c [BasicAA] Remove unneeded special case for malloc/calloc
This code pre-exists the generic handling for inaccessiblememonly.  If we remove it and update one test with inaccessiblememonly, nothing else changes.  Note that simply running O1 on that test would annotate malloc with the missing inaccessiblememonly.
2022-05-17 20:45:14 -07:00
Florian Hahn
411b9b8153 [GVN] Add test case for memdep invalidation bug.
Test case for #30999.
2022-05-11 20:46:48 +01:00
Nikita Popov
b9dc565147 [GVN] Encode GEPs in offset representation
When using opaque pointers, convert GEPs into offset representation
of the form P + V1 * Scale1 + V2 * Scale2 + ... + ConstantOffset.
This allows us to recognize equivalent address calculations even if
the GEPs don't use the same source element type.
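
A rough model of the idea, assuming single-index GEPs over scalar element types. The `OffsetExpr` struct, the `storeSize` table and the `gepVar`/`gepConst` helpers are hypothetical and only mirror the P + V1 * Scale1 + ... + ConstantOffset form described above; they are not GVN's actual data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Address in offset form: base + sum(variable * scale) + constantOffset,
// with all scales and offsets measured in bytes.
struct OffsetExpr {
    std::string base;                        // P
    std::map<std::string, int64_t> scaled;   // V_i -> Scale_i
    int64_t constantOffset = 0;

    bool operator==(const OffsetExpr &other) const {
        return base == other.base && scaled == other.scaled &&
               constantOffset == other.constantOffset;
    }
};

// Byte sizes for a few scalar types (assumed values for this sketch).
int64_t storeSize(const std::string &type) {
    static const std::map<std::string, int64_t> sizes = {
        {"i8", 1}, {"i16", 2}, {"i32", 4}, {"i64", 8},
        {"float", 4}, {"double", 8}};
    return sizes.at(type);
}

// Models: getelementptr <elemType>, ptr %base, i64 %index
OffsetExpr gepVar(const std::string &base, const std::string &elemType,
                  const std::string &index) {
    OffsetExpr expr;
    expr.base = base;
    expr.scaled[index] = storeSize(elemType);
    return expr;
}

// Models: getelementptr <elemType>, ptr %base, i64 <index>
OffsetExpr gepConst(const std::string &base, const std::string &elemType,
                    int64_t index) {
    OffsetExpr expr;
    expr.base = base;
    expr.constantOffset = index * storeSize(elemType);
    return expr;
}
```

Because the source element type only contributes its size, `gepVar("p", "i32", "i")` and `gepVar("p", "float", "i")` compare equal in this model, as do `gepConst("p", "i8", 8)` and `gepConst("p", "i64", 1)`.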

This fixes an opaque pointer codegen regression seen in rustc.

Differential Revision: https://reviews.llvm.org/D124527
2022-04-28 09:32:05 +02:00
Nikita Popov
4fcbd0eb4a [GVN] Add more tests for opaque pointer GEPs (NFC)
Some of these are equivalent when considering an offset encoding.
2022-04-27 15:41:55 +02:00
Simon Pilgrim
6e078f9804 [GVN][NewGVN] Regenerate no_speculative_loads_with_asan.ll tests
As discussed on D124284 - ensure we are actually checking the codegen, not just a label + return
2022-04-27 10:45:39 +01:00
Artur Pilipenko
4fbde1ef40 Fix MemorySSAUpdater::insertDef for dead code
Fix for https://github.com/llvm/llvm-project/issues/51257.

Differential Revision: https://reviews.llvm.org/D122601
2022-03-31 16:32:35 -07:00
Nikita Popov
cf18ec445d [GVN] Check load type in select PRE
This is no longer implicitly guaranteed with opaque pointers.
2022-03-14 12:46:54 +01:00
Nikita Popov
e9c0720010 [PHITransAddr] Check GEP source element type
It's not the same GEP if the source element type is different.
2022-02-11 16:22:48 +01:00
Nikita Popov
2a1b1f1b1b [GVN] Store source element type for GEP expressions
To avoid incorrectly merging GEPs with different source types
under opaque pointers.

To avoid increasing the Expression structure size, this reuses the
existing type member. The code does not rely on this to be the
expression result type, it's only used as a disambiguator.
2022-02-11 13:03:30 +01:00
Nikita Popov
46f9e45ef0 [Statepoint] Update gc.statepoint calls in tests with elementtype (NFC)
This updates tests for the LangRef change in D117890.
2022-02-04 14:15:41 +01:00
Florian Hahn
8a12cae862 [GVN] Support load of pointer-select to value-select conversion.
This patch extends the available-value logic to detect loads
of pointer-selects that can be replaced by a value select.

For example, consider the code below:

  loop:
    %sel.phi = phi i32* [ %start, %ph ], [ %sel, %ph ]
    %l = load %ptr
    %l.sel = load %sel.phi
    %sel = select cond, %ptr, %sel.phi
    ...

  exit:
    %res = load %sel
    use(%res)

The load of the pointer phi can be replaced by a load of the start value
outside the loop and a new phi/select chain based on the loaded values,
as illustrated below

    %l.start = load %start
  loop:
    %sel.phi.prom = phi i32 [ %l.start, %ph ], [ %sel.prom, %ph ]
    %l = load %ptr
    %sel.prom = select cond, %l, %sel.phi.prom
    ...
  exit:
    use(%sel.prom)

This is a first step towards allowing vectorization of loops that use common libc++
library functions, like std::min_element (https://clang.godbolt.org/z/6czGzzqbs)

    #include <vector>
    #include <algorithm>

    int foo(const std::vector<int> &V) {
        return *std::min_element(V.begin(), V.end());
    }
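
As a source-level analogy only (GVN works on IR, and the function names here are hypothetical), the two loop shapes look roughly like this. The first keeps a pointer to the current minimum and reloads through it; the second carries the loaded value instead, which is the form the transformation aims for. Both assume a non-empty vector.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Pointer-select form: the loop-carried state is a pointer, so every
// comparison has to load through it again.
int minViaPointerSelect(const std::vector<int> &values) {
    const int *best = values.data();
    const int *end = values.data() + values.size();
    for (const int *p = values.data(); p != end; ++p)
        best = (*p < *best) ? p : best;  // select between two pointers
    return *best;
}

// Value-select form: the loop-carried state is the loaded value itself,
// so no load through a selected pointer remains.
int minViaValueSelect(const std::vector<int> &values) {
    int bestValue = values.front();
    for (int x : values)
        bestValue = (x < bestValue) ? x : bestValue;  // select between values
    return bestValue;
}
```

The two functions return the same result here because nothing writes to the vector during the loop, which is the kind of condition the available-value logic has to establish before making the replacement.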

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D118143
2022-02-02 09:23:09 +00:00
Florian Hahn
b1fb613924 [GVN] Add additional tests after 216d1a729.
Further extend test coverage added in 216d1a729
2022-02-01 21:02:41 +00:00
Florian Hahn
216d1a729c [GVN] Add tests for D118143 not requiring loops. 2022-02-01 20:24:19 +00:00
Florian Hahn
9dd5fffd30 [GVN] Add tests with redundant load of pointer select.
Additional test cases for D118144.
2022-01-28 20:15:32 +00:00
Florian Hahn
56659c80d0 [GVN] Add additional tests for PRE with pointer selects.
Additional tests for D118143.
2022-01-28 19:11:36 +00:00
Florian Hahn
b0956a9acf [GVN] Add tests for loop load PRE through select. 2022-01-25 14:49:17 +00:00
Bryce Wilson
dd13744bfb Revert "[BasicAliasAnalysis] Remove isMallocOrCallocLikeFn"
This reverts commit 1f2cfc4fdc1eefb2c5f562c77a5fe7e916bbf670.
2022-01-14 14:42:53 -08:00
Bryce Wilson
1f2cfc4fdc [BasicAliasAnalysis] Remove isMallocOrCallocLikeFn
Allocation functions should be marked with onlyAccessesInaccessibleMemory (when that is correct for the given function) which is checked elsewhere so this check is no longer needed.

Differential Revision: https://reviews.llvm.org/D117180
2022-01-14 12:22:01 -08:00
Nick Desaulniers
79ebc3b0dd [llvm][test] rewrite callbr to use i rather than X constraint NFC
In D115311, we're looking to modify clang to emit i constraints rather
than X constraints for callbr's indirect destinations. Prior to doing
so, update all of the existing tests in llvm/ to match.

Reviewed By: void, jyknight

Differential Revision: https://reviews.llvm.org/D115410
2022-01-11 11:31:08 -08:00