70 Commits

Author SHA1 Message Date
Jeffrey Byrnes
7180c23cf6
[SeparateConstOffsetFromGEP] Reland: Reorder trivial GEP chains to separate constants (#81671)
Actually update tests w.r.t
9e5a77f252
and reland https://github.com/llvm/llvm-project/pull/73056
2024-02-13 17:10:23 -08:00
Philip Reames
99c5a66c62 Revert "[SeparateConstOffsetFromGEP] Reorder trivial GEP chains to separate constants (#73056)" and follow ups
"ninja check-llvm" is failing on tip of tree.

This reverts commit ec0aa1646e9953d1a8d0d15dc381d3250c854572.
This reverts commit 1b65742f8c71f576381fe85d5e34579b24f2d874.
2024-02-13 13:29:23 -08:00
Jeffrey Byrnes
ec0aa1646e [SeparateConstOffsetFromGEP] Fix test after 1b65742f8c71f576381fe85d5e34579b24f2d874
Change-Id: I7ced7774c80997d21969ab7886fc30c0c1e1cc81
2024-02-13 11:48:03 -08:00
Jeffrey Byrnes
1b65742f8c
[SeparateConstOffsetFromGEP] Reorder trivial GEP chains to separate constants (#73056)
In this case, a trivial GEP chain has the form:

```
%ptr = getelementptr sameType, %base, constant
%val = getelementptr sameType, %ptr, %variable
```

That is, a one-index GEP consumes another (of the same basis and result
type) one-index GEP, where the inner GEP uses a constant index and the
outer GEP uses a variable index. For chains of this type, it is trivial
to reorder them (by simply swapping the indexes). The result of doing so
is better AddrMode matching for users of the ultimate ptr produced by
GEP chain.

Future patches can extend this to support non-trivial GEP chains (e.g.
those with different basis types and/or multiple indices).
2024-02-13 11:22:49 -08:00
Krzysztof Drewniak
63fe80fb18
[SeperateConstOffsetFromGEP] Handle or disjoint flags (#76997)
This commit extends separate-const-offset-from-gep to look at the
newly-added `disjoint` flag on `or` instructions so as to preserve
additional opportunities for optimization.

The tests were pre-committed in #76972.
2024-01-26 09:56:06 -06:00
Nikita Popov
9e5a77f252 [SeparateConstOffsetFromGEP] Always emit i8 gep
Always emit canonical i8 GEPs, don't try to preserve the original
element type. As this is a backend pass, trying to preserve the
type is not useful.
2024-01-10 11:57:28 +01:00
Nikita Popov
c2654befca [SeparateConstOFfsetFromGEP] Regenerate test checks (NFC) 2024-01-10 11:43:50 +01:00
Krzysztof Drewniak
cd3942059e
[SeperateConstOffsetFromGEP] Pre-commit tests for or disjoint handling (#76972)
1. Adds tests for the existing interpretation of `or` as `add` in
SeperateConstOffsetFromGEP.
2. Pre-commits a test for `or disjoint`.
2024-01-04 14:08:30 -06:00
Alex Richardson
e39f6c1844 [opt] Infer DataLayout from triple if not specified
There are many tests that specify a target triple/CPU flags but no
DataLayout which can lead to IR being generated that has unusual
behaviour. This commit attempts to use the default DataLayout based
on the relevant flags if there is no explicit override on the command
line or in the IR file.

One thing that is not currently possible to differentiate from a missing
datalayout `target datalayout = ""` in the IR file since the current
APIs don't allow detecting this case. If it is considered useful to
support this case (instead of passing "-data-layout=" on the command
line), I can change IR parsers to track whether they have seen such a
directive and change the callback type.

Differential Revision: https://reviews.llvm.org/D141060
2023-10-26 12:07:37 -07:00
Alex Richardson
83c4227ab7 Auto-generate test checks for tests affected by D141060
These files had manual CHECK lines which make the diff from D141060
very difficult to review.
2023-10-04 10:51:35 -07:00
Paul Walker
c7d65e4466 [IR] Enable load/store/alloca for arrays of scalable vectors.
Differential Revision: https://reviews.llvm.org/D158517
2023-09-14 13:49:01 +00:00
Matt Arsenault
f2596b754c SeparateConstOffsetFromGEP: Don't use SCEV
This was only using the SCEV expressions as a map key, which we can do
just as well with the value pointers. This also allows it to handle
vectors.
2023-06-26 13:58:06 -04:00
Matt Arsenault
6882d9adb4 SeparateConstOffsetForGEP: Fill out some missing test coverage
Try to test several untested paths.
 - Test the extension source type check
 - Test the programUndefinedIfPoison check
 - Test the add/sub with commuted operands
 - Test with vectors
 - Test multiple uses
 - Try to break operand map mismatches
 - Add some preparatory tests for zext+nuw support.
2023-06-26 13:58:06 -04:00
Matt Arsenault
e9ab7ff73a SeparateConstOffsetFromGEP: Copy a test to AMDGPU 2023-06-26 13:58:06 -04:00
Matt Arsenault
fe0750b979 SeparateConstOffsetFromGEP: Reorder run lines
Testing codegen in test/Transforms is questionable to begin with, but
it's more reasonable to see failures on the IR half before ISA checks.
2023-06-26 13:58:06 -04:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Tom Stellard
2e3cabe172 [SeparateConstOffsetFromGEP] Fix bug handling negative offsets
Fix bug constants and sub instructions

When finding constants in a chain starting with the RHS operator of
sub instructions, we were negating the constant before zero extending
it, which is incorrect.

Unfortunately, I was unable to find a simple way to implement this
transformation correctly, so for now I just disabled this optimization
for constants that feed into the RHS of a sub.

Resolves #62379

Transformation from alive2.llvm.org:

    define i16 @src(i8 %a, i8 %b, i8 %c) {
    entry:
    %0 = sub nuw nsw i8 %c, %a
    %1 = sub nuw nsw i8 %b, %0
    %2 = zext i8 %1 to i16
    ret i16 %2
    }

    Before/Bad:

    define i16 @tgt(i8 %a, i8 %b, i8 %c) {
    entry:
    %0 = zext i8 %a to i16
    %1 = zext i8 %b to i16
    %c_neg = sub i8 0, %c
    %c_zext = zext i8 %c_neg to i16
    %2 = sub i16 0, %0
    %3 = sub i16 %1, %2
    %4 = add i16 %3, %c_zext
    ret i16 %4
    }

    Correct:

    define i16 @tgt(i8 %a, i8 %b, i8 %c) {
    entry:
    %0 = zext i8 %a to i16
    %1 = zext i8 %b to i16
    %c_zext = zext i8 %c to i16
    %c_neg = sub i16 0, %c_zext
    %2 = sub i16 0, %0
    %3 = sub i16 %1, %2
    %4 = add i16 %3, %c_neg
    ret i16 %4
    }

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D149507
2023-05-04 18:45:49 -07:00
Krzysztof Drewniak
916425b2d1 [llvm] Use pointer index type for more GEP offsets (pre-codegen)
Many uses of getIntPtrType() were using that type to calculate the
neened type for GEP offset arguments. However, some time ago,
DataLayout was extended to support pointers where the size of the
pointer is not equal to the size of the values used to index it.

Much code was already migrated to, for example, use getIndexSizeInBits
instead of getPtrSizeInBits, but some rewrites still used
getIntPtrType() to get the type for GEP offsets.

This commit changes uses of getIntPtrType() to getIndexType() where
they are involved in a GEP-related calculation.

In at least one case (bounds check insertion) this resolves a compiler
crash that the new test added here would previously trigger.

This commit does not impact
- C library-related rewriting (memcpy()), which are operating under
the assumption that intptr_t == size_t. While all the mechanisms for
breaking this assumption now exist, doing so is outside the scope of
this commit.
- Code generation and below. Note that the use of getIntPtrType() in
CodeGenPrepare will be changed in a future commit.
- Usage of getIntPtrType() in any backend

Depends on D143435

Reviewed By: arichardson

Differential Revision: https://reviews.llvm.org/D143437
2023-03-28 16:41:02 +00:00
Liren Peng
06f06644ef [SeparateConstOffsetFromGEP] Fix: b - a matched a - b during reuniteExts
During the SeparateConstOffsetFromGEP pass, a - b and b - a will be
considered equivalent in some instances.

An example- the IR contains:

  BB1:
      %add = add %a, 511
      br label %BB2
  BB2:
      %sub2 = sub %b,  %a
      br label %BB3
  BB3:
      %sub1 = sub %add, %b
      %gep = getelementptr float, ptr %p, %sub1

Step 1 in the SeparateConstOffsetFromGEP pass, after split constant index:

  BB1:
      %add = add %a, 511
      br label %BB2
  BB2:
      %sub2 = sub %b,  %a
      br label %BB3
  BB3:
      %sub.t = sub %a, %b
      %gep.base = getelementptr float, ptr %p, %sub.t
      %gep = getelementptr float, ptr %gep.base, 511

Step 2, after reuniteExts:

  BB1:
      br label %BB2
  BB2:
      %sub2 = sub %b,  %a
      br label %BB3
  BB3:
      %gep.base = getelementptr float, ptr %p, %sub2
      %gep = getelementptr float, ptr %gep.base, 511

Obviously, reuniteExts treated a - b and b - a as equivalent.
This patch fixes that.

Reviewed By: nikic, spatel

Differential Revision: https://reviews.llvm.org/D143542
2023-02-15 02:33:31 +00:00
Liren Peng
a52432f633 [NFC][SeparateConstOffsetFromGEP] Added flag lower-gep
We need such a flag to check whether the transformation is correct if
LowerGEP was enabled.

Reviewed By: nikic, arsenm, spatel

Differential Revision: https://reviews.llvm.org/D143980
2023-02-15 02:04:30 +00:00
Krzysztof Drewniak
2d279c0d95 [llvm] Add tests for upcoming fixes to pointer/index type confusion.
Various parts of the codebase are using getIntPtrType() and its
relatives when getting the type of the offset argument to GEP. Most
such code has been updated to use the pointer index type field from
the data layout, but there is code that still assumes these two types
are the same in certain optimizaiton passes.

This commit adds regression tests to capture the old behavior.

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D143435
2023-02-07 16:06:58 +00:00
Paul Walker
1dee7f9571 [SeparateConstOffsetFromGEP] Remove TypeSize error when collecting constant indices.
Differential Revision: https://reviews.llvm.org/D140229
2022-12-19 14:08:13 +00:00
Bjorn Pettersson
3528e63d89 [test] Remove duplicate RUN lines in Transform tests 2022-12-08 11:47:16 +01:00
Roman Lebedev
0dd180a5e3
[NFC] Port all SeparateConstOffsetFromGEP tests to -passes= syntax 2022-12-08 02:38:50 +03:00
Matt Arsenault
4cbab1e5ff SeparateConstOffsetFromGEP: Update tests to use opaque pointers
NVPTX/split-gep.ll needed a check for a bitcast replaced.
2022-11-27 20:53:52 -05:00
Matt Arsenault
d1c0092163 SeparateConstOffsetFromGEP: Fix creating pointless bitcasts
This was directly creating new BitCastInsts, so under opaque pointers,
would end up producing bitcast from ptr to ptr.
2022-11-27 20:53:48 -05:00
Matt Arsenault
e8d4550813 SeparateConstOffsetFromGEP: Add baseline test for opaque pointers
This currently emits a pointless bitcast.
2022-11-27 20:53:43 -05:00
Matt Arsenault
926bba1424 SeparateConstOffsetFromGEP: Switch tests to use -passes 2022-11-27 10:09:58 -05:00
David Green
201b7858f6 [AArch64] Disable aarch64-enable-gep-opt
This option was enabled in D128582, and whilst it seems to be a net
improvement in many cases, at least a couple of issues have been
reported from D135957 and from the CSE added to the backend causing more
instructions in executed blocks. Revert for the time being, until we can
improve the precision.
2022-11-19 21:25:18 +00:00
Shubham Narlawar
f55dbfbd9d [AArch64] Move SeparateConstOffsetFromGEPPass before LSR and enable EnableGEPOpt by default.
GEP's across basic blocks were not getting splitted due to EnableGEPOpt
which was turned off by default. Hence, EarlyCSE missed the opportunity
to eliminate common part of GEP's. This can be achieved by simply
turning GEP pass on.
 - This patch moves SeparateConstOffsetFromGEPPass() just before LSR.
 - It enables EnableGEPOpt by default.

Resolves - https://github.com/llvm/llvm-project/issues/50528

Added an unit test.

Differential Revision: https://reviews.llvm.org/D128582
2022-07-22 15:20:53 +01:00
Elena Lepilkina
4e1090cfe9 [test][RISCV] Precommit test for SeparateConstOffsetFromGEP (NFC)
Precommit test for D127727
2022-06-15 16:03:30 +03:00
Jay Foad
f510045d82 [CodeGen] Remove unneeded regex escaping in FileCheck patterns. NFC.
Take advantage of D117117 to simplify all {{\[}} to [ and {{\]}} to ].

Differential Revision: https://reviews.llvm.org/D117298
2022-02-18 16:10:56 +00:00
Matt Arsenault
88146230e1 SeparateConstOffsetFromGEP: Fix stack overflow in unreachable code
ConstantOffsetExtractor::Find was infinitely recursing on the add
referencing itself.
2021-09-14 19:49:38 -04:00
Arthur Eubanks
5561b48b70 [test] Make global in split-gep-and-gvn.ll not constant
An upcoming change will cause loads from a constant zeroinitializer
global to be constant folded, breaking this test.
2021-04-19 11:03:19 -07:00
Arthur Eubanks
1cbf8e89b5 [NewPM] Port -separate-const-offset-from-gep
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91095
2020-11-09 17:42:36 -08:00
Eli Friedman
4532a50899 Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.

The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.

Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.

This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.

Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.

Differential Revision: https://reviews.llvm.org/D78403
2020-05-14 13:03:50 -07:00
Jonathan Roelofs
1148f004fa Fix PR45371: SeparateConstOffsetFromGEP clean up bookkeeping
find() was altering the UserChain, even in cases where it subsequently
discovered that the resulting constant was a 0. This confuses
rebuildWithoutConstOffset() when it attempts to walk the chain later, since it
is expected that the chain itself be a path down the use-def edges of an
expression.
2020-04-01 12:38:15 -06:00
Drew Wock
0bcfafc5e7 [SeparateConstOffsetFromGEP] Fix: sext(a) + sext(b) -> sext(a + b) matches add and sub instructions with one another
During the SeparateConstOffsetFromGEP pass, signed extensions are distributed
to the values that feed into them and then later recombined. The recombination
stage is somewhat problematic- it doesn't differ add and sub instructions
from another when matching the sext(a) +/- sext(b) -> sext(a +/- b) pattern
in some instances.

An example- the IR contains:
%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extA + extB

The problematic optimization will transform that into:

%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extend subuAuB ; Obviously not semantically equivalent to the IR input.

This patch fixes that.

Patch by Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D65967
2020-01-17 12:22:52 -05:00
Fangrui Song
ac14f7b10c [lit] Delete empty lines at the end of lit.local.cfg NFC
llvm-svn: 363538
2019-06-17 09:51:07 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Artem Belevich
c2cd5d5ce0 [Split GEP] handle trunc() in separate-const-offset-from-gep pass.
Let separate-const-offset-from-gep pass handle trunc() when it calculates
constant offset relative to base. The pass itself may insert trunc()
instructions when it canonicalises array indices to pointer-size integers
and needs to handle trunc() in order to evaluate the offset.

Differential Revision: https://reviews.llvm.org/D46732

llvm-svn: 332142
2018-05-11 21:13:19 +00:00
Yaxun Liu
0124b5484c [AMDGPU] Change constant addr space to 4
Differential Revision: https://reviews.llvm.org/D43170

llvm-svn: 325030
2018-02-13 18:00:25 +00:00
Marek Olsak
8f2df9d26c [SeparateConstOffsetFromGEP] Fix up addrspace in the AMDGPU test
llvm-svn: 323913
2018-01-31 20:49:19 +00:00
Marek Olsak
8e7d149a31 [SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs
Summary:
!amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things
happen.

Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D42744

llvm-svn: 323907
2018-01-31 20:17:52 +00:00
Matt Arsenault
3dbeefa978 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Justin Lebar
cd564c6b46 [NVPTX] Enable the load-store vectorizer on nvptx.
Reviewers: tra

Subscribers: jholewinski, arsenm, asbirlea

Differential Revision: https://reviews.llvm.org/D22592

llvm-svn: 276196
2016-07-20 22:11:36 +00:00
Jingyue Wu
2b353a9522 [ReassociateGEP] Update tests to allow missing "inbounds" on certain GEPs.
With r275532 fixing miscompilation of GVN, "inbounds" on certain GEPs in these
tests cannot be preserved any more. Left a TODO in the tests for future
reference.

llvm-svn: 275596
2016-07-15 18:47:17 +00:00
David Majnemer
959a6623b5 XFAIL two SeparateConstOffsetFromGEP tests
They appear to have relied on bugs hidden in copyIRFlags/andIRFlags.

This has been filed as PR28564.

llvm-svn: 275533
2016-07-15 05:37:22 +00:00
Philip Reames
146307eb52 [ValueTracking] Remove dead code from an old experiment
This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.  

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.  

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.  

llvm-svn: 262646
2016-03-03 19:44:06 +00:00