254 Commits

Author SHA1 Message Date
Nikita Popov
eecb99c5f6 [Tests] Add disjoint flag to some tests (NFC)
These tests rely on SCEV recognizing an "or" with no common
bits as an "add". Add the disjoint flag to relevant or instructions
in preparation for switching SCEV to use the flag instead of the
ValueTracking query. The IR with disjoint flag matches what
InstCombine would produce.
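
The equivalence these tests depend on can be checked directly. A minimal sketch in plain C++ (not LLVM code; the helper names `haveNoCommonBits` and `orMatchesAdd` are made up for illustration): when two operands share no set bits, no bit position produces a carry, so `a | b` and `a + b` give the same value.

```cpp
#include <cstdint>

// Hypothetical helper: true when the operands share no set bits,
// which is the property the "disjoint" flag asserts on an "or".
constexpr bool haveNoCommonBits(std::uint32_t a, std::uint32_t b) {
  return (a & b) == 0;
}

// Hypothetical helper: true when "or" and "add" agree for these operands.
constexpr bool orMatchesAdd(std::uint32_t a, std::uint32_t b) {
  return (a | b) == (a + b);
}

// Disjoint operands: 0xF0 and 0x0F have no bit in common, so or == add.
static_assert(haveNoCommonBits(0xF0u, 0x0Fu), "operands are disjoint");
static_assert(orMatchesAdd(0xF0u, 0x0Fu), "or behaves like add");

// Overlapping operands: 6 and 3 share bit 1, so 6 | 3 == 7 but 6 + 3 == 9.
static_assert(!haveNoCommonBits(6u, 3u), "operands overlap");
static_assert(!orMatchesAdd(6u, 3u), "or differs from add");
```

For instance, `(x << 4) | 3` can be treated as `(x << 4) + 3` because the low four bits of the shifted value are always zero.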
2023-12-05 14:09:36 +01:00
Jeremy Morse
d2d9dc8eb4
[DebugInfo][RemoveDIs] Make debugify pass convert to/from RemoveDIs mode (#73251)
Debugify is extremely useful as a testing and debugging tool, and a good
number of LLVM-IR transform tests use it. We need it to support "new"
non-instruction debug-info to get test coverage, but it's not important
enough to completely convert right now (and it'd be a large
undertaking). Thus: convert to/from dbg.value/DPValue mode on entry and
exit of the pass, which gives us the functionality without any further
work. The cost is compile-time, but again this is only happening during
tests.

Tested by: the large set of debugify tests enabled here. Note the
InstCombine test (cast-mul-select.ll) that hasn't been fully enabled:
this is because there's a debug-info sinking piece of code there that
hasn't been instrumented.
2023-11-29 13:19:50 +00:00
Philip Reames
f8742b8d6a
[SCEV] Teach SCEVExpander to use zext nneg when possible (#70815)
zext nneg was recently added to the IR in #67982. Teaching SCEVExpander
to emit nneg when possible is valuable since SCEV may have proved
non-trivial facts about loop bounds which would otherwise be lost when
materializing the value.
2023-10-31 09:33:07 -07:00
Philip Reames
6485978120 Refresh a couple of auto-gen tests [nfc]
Reducing spurious diff in an upcoming review.
2023-10-31 07:46:01 -07:00
Nikita Popov
97f1db2fdd [LoopIdiom] Use tryZExtValue() instead of getZExtValue()
To avoid an assertion for large BECounts.

I also suspect that this code is missing an overflow check.

Fixes https://github.com/llvm/llvm-project/issues/70008.
2023-10-24 11:05:42 +02:00
Jeremy Morse
1ce1732f82 [DebugInfo] Use getStableDebugLoc to pick IRBuilder DebugLocs
When IRBuilder is given an insertion position and there is debug-info, it
sets the DebugLoc of newly inserted instructions to the DebugLoc of the
insertion position. Unfortunately, that means if you insert in front of a
debug intrinsic, your "real" instructions get potentially-misleading
source locations from the debug intrinsics. Worse, if you compile -gmlt to
get source locations but no variable locations, you'll get different source
locations to a normal -g build, which is silly.

Rectify this with the getStableDebugLoc method, which skips over any debug
intrinsics to find the next "real" instruction. This is the source location
that you would get if you compile with -gmlt, and it remains stable in the
presence of debug intrinsics. The changed tests show a few locations where
this has been happening, for example selecting line-zero locations for
instrumentation on a perfectly valid call site.

Differential Revision: https://reviews.llvm.org/D159485
2023-09-11 19:00:44 +01:00
Nikita Popov
69bd66b3ce [Tests] Remove some and/or constant expressions in tests (NFC)
In preparation for their removal in D158081.
2023-08-21 12:05:32 +02:00
Nikita Popov
174300a283 [LoopIdiom] Regenerate test checks (NFC) 2023-07-21 10:12:05 +02:00
William S. Moses
3eb6fefb97 [LoopIdiom] Preserve alias information for memset_pattern
TBAA/NoAlias/AliasScope and other information is currently preserved
when upgrading to a memcpy/memset. However, it is missing when upgrading to
the macOS memset_pattern function. This patch adds the same alias information
preservation to memset_pattern.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D152934
2023-06-14 16:14:53 -04:00
luxufan
e9ddb584e8 [LoopIdiom] Freeze BitPos if !isGuaranteedNotToBeUndefOrPoison
Fixes: https://github.com/llvm/llvm-project/issues/62873

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151690
2023-06-07 14:50:22 +08:00
Nikita Popov
d5c56c5162 [SCEVExpander] Remember phi nodes inserted by LCSSA construction
SCEVExpander keeps track of all instructions it inserted. However,
it currently misses some phi nodes created during LCSSA construction.
Fix this by collecting these into another argument.

This also removes the IRBuilder argument, which was added for
essentially the same purpose, but only handles the root LCSSA nodes,
not those inserted by SSAUpdater.

This was reported as a regression on D149344, but the reduced test
case also reproduces without it.

Differential Revision: https://reviews.llvm.org/D150681
2023-05-25 09:34:19 +02:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
OCHyams
72776850ed Revert "[DebugInfo] Print empty MDTuples wrapped in MetadataAsValue inline"
This reverts commit 1e6fe677f8aa98518e05218affa16e468819f5ed (D140900).

Buildbot: https://lab.llvm.org/buildbot/#/builders/196/builds/29937
2023-04-25 14:37:25 +01:00
OCHyams
1e6fe677f8 [DebugInfo] Print empty MDTuples wrapped in MetadataAsValue inline
This improves the readability of debugging intrinsics. Instead of:

    call void @llvm.dbg.value(metadata !2, ...)
    !2 = !{}

We will see:

    call void @llvm.dbg.value(metadata !{}, ...)
    !2 = !{}

Note that we still get a numbered metadata entry for the node even if it's not
used elsewhere. This is to avoid adding more context to the print functions.

This is already legal IR - LLVM can parse and understand it - so there is no
need to update the parser.

The next patches in this stack will make such empty metadata operands more
common and semantically important.

Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D140900
2023-04-25 14:13:47 +01:00
Craig Topper
8bba57b1f1 [LoopIdiomRecognize] Remove NUW flag from SCEV in getTripCount.
Based on the conversation in D147355.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148170
2023-04-13 11:58:10 -07:00
Tim Northover
150595ab4b LoopIdiom: avoid patterned memset if constant is not relocatable.
The pattern we're using for the memset_pattern* call gets put into a static
global variable initializer, which means it has to be representable with
relocations on the target. Most `ConstantExpr` instances do not satisfy this
constraint, so avoid all of them for now.
2023-01-12 18:53:07 +00:00
Nikita Popov
7a752e8108 [LoopIdiom] Convert tests to opaque pointers (NFC)
The differences here are due to SCEVExpander producing GEPs with
explicit offset calculation, a known difference with opaque pointers.
2023-01-06 11:36:37 +01:00
Nikita Popov
89f1876b61 [LoopIdiom] Name instructions in test (NFC) 2023-01-06 11:07:57 +01:00
Nikita Popov
055fb7795a [Transforms] Convert some tests to opaque pointers (NFC)
These are all tests where conversion worked automatically, and
required no manual fixup.
2023-01-05 12:43:45 +01:00
Roman Lebedev
45fcdaf6b6
[NFC] Port all LoopIdiom tests to -passes= syntax 2022-12-08 02:38:46 +03:00
Roman Lebedev
48c6b2729e
[NFC] Port all LoopIdiom tests to -passes= syntax 2022-12-07 23:15:16 +03:00
Arthur Eubanks
f3a928e233 [opt] Don't translate legacy -analysis flag to require<analysis>
Tests relying on this should explicitly use -passes='require<analysis>,foo'.
2022-10-07 14:54:34 -07:00
Simon Pilgrim
37dc4373aa [LoopIdiom] Add non-LZCNT target test coverage 2022-09-19 18:13:11 +01:00
Simon Pilgrim
6b4d409f69 [CostModel][X86] Add CostKinds handling for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF instructions
This was achieved with the 'cost-tables vs llvm-mca' script D103695
2022-09-19 17:37:58 +01:00
Simon Pilgrim
95c2c9c5c5 [LoopIdiom][X86] Add non-LZCNT test coverage to 'rshift until zero' idiom tests 2022-09-16 17:23:54 +01:00
Eli Friedman
abdf0da800 [LoopIdiom] Fix bailout for aliasing in memcpy transform.
Commit dd5991cc modified the aliasing checks here to allow transforming
a memcpy where the source and destination point into the same object.
However, the change accidentally made the code skip the alias check for
other operations in the loop.

Instead of completely skipping the alias check, just skip the check for
whether the memcpy aliases itself.

Differential Revision: https://reviews.llvm.org/D126486
2022-05-31 17:24:23 -07:00
Dávid Bolvanský
260679b000 [NFCI] Regenerate LoopIdiomRecognize test checks 2022-04-04 00:21:26 +02:00
Stephen Long
e02f4976ac [LoopIdiom] Merge TBAA of adjacent stores when creating memset
Factor in the TBAA of adjacent stores instead of just the head store
when merging stores into a memset. We were seeing GVN remove a load that
had a TBAA that matched the 2nd store because GVN determined it didn't
match the TBAA of the memset. The memset had the TBAA of only the first
store.

For example, loading the field pi_ of shared_count after a memset that
creates an array of shared_ptr:

template<class T>
class shared_ptr {
  T *p;
  shared_count refcount;
};

class shared_count {
  sp_counted_base *pi_;
};

Differential Revision: https://reviews.llvm.org/D122205
2022-03-30 16:54:49 -07:00
Nikita Popov
d9715a7266 [SCEV] Don't try to reuse expressions with offset
SCEV's ExprValueMap currently tracks not only which IR Values
correspond to a given SCEV expression, but additionally stores that
it may be expanded in the form X+Offset. In theory, this allows
reusing existing IR Values in more cases.

In practice, this doesn't seem to be particularly useful (the test
changes are rather underwhelming) and adds a good bit of complexity.
Per https://github.com/llvm/llvm-project/issues/53905, we have an
invalidation issue with these offset expressions.

Differential Revision: https://reviews.llvm.org/D120311
2022-02-25 09:16:48 +01:00
William S. Moses
8cb9c73609 [LoopIdiom] Keep TBAA when creating memcpy/memmove
When upgrading a loop of load/store to a memcpy, the existing pass does not keep existing aliasing information. This patch allows existing aliasing information to be kept.

Reviewed By: jeroen.dobbelaere

Differential Revision: https://reviews.llvm.org/D108221
2022-01-31 16:28:13 -05:00
Florian Hahn
782c0dd1a1
[IRBuilder] Migrate and-folding to value-based FoldAnd.
Similar to the migration of or-folding to FoldOr, there are a few cases
where the fold in IRBuilder::CreateAnd triggered directly. Those have
been updated.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117431
2022-01-20 10:22:21 +00:00
Alex Bradbury
33d008b169 [RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental
Agreed policy is that RISC-V extensions that have not yet been ratified
should be marked as experimental, and enabling them requires the use of
the -menable-experimental-extensions flag when using clang alongside the
version number. These extensions have now been ratified, so this is no
longer necessary, and the target feature names can be renamed to no
longer be prefixed with "experimental-".

Differential Revision: https://reviews.llvm.org/D117131
2022-01-12 19:33:44 +00:00
eopXD
bc17d32a5f [LoopIdiom] Let LIR fold memset pointer / stride SCEV regarding loop guards
Expressions guarded at loop entry can be folded prior to comparison. This patch
follows D107353 and makes LIR able to deal with nested for-loops.

Reviewed By: qianzhen, bmahjour

Differential Revision: https://reviews.llvm.org/D108112
2021-12-13 09:36:58 -08:00
Roman Lebedev
b291597112
Revert rest of IRBuilderBase's short-circuiting folds
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.

Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.

This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88.
This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b.
This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2.
This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.
2021-10-28 02:15:14 +03:00
Roman Lebedev
42712698fd
Revert "[IR] IRBuilderBase::CreateAdd(): short-circuit x + 0 --> x"
Clang OpenMP codegen tests are failing.

This reverts commit 288f1f8abe5835180a0021f142043ee261ab3846.
This reverts commit cb90e5356ac1594e95fed8e208d6e0e9b6a87db1.
2021-10-27 22:21:37 +03:00
Roman Lebedev
cb90e5356a
[IR] IRBuilderBase::CreateAdd(): short-circuit x + 0 --> x
There's precedent for that in `CreateOr()`/`CreateAnd()`.

The motivation here is to avoid bloating the run-time check's IR
in `SCEVExpander::generateOverflowCheck()`.

Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 21:34:38 +03:00
Roman Lebedev
749581d21f
[IR] IRBuilderBase::CreateAnd(): fix short-circuiting for constant on LHS
Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 18:01:06 +03:00
eopXD
76db6d8080 [NFC][LoopIdiom] Add more test cases for runtime-determined memset size
This patch adds missing test cases for D107353.
- Fix wrong descriptions in the 64-bit mode test case
- Add a test case for 32-bit mode

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D108507
2021-10-21 00:05:18 -07:00
Clement Courbet
6aaf1e7ea9 [LoopIdiom] Fix store size SCEV type.
We were using the type of the loop back edge count to represent the
store size. This failed for small loop counts (e.g. in the added test,
the loop count was an i2).

Use the index type instead.
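
The commit message does not spell out how the narrow type failed. As a rough illustration only, the sketch below (plain C++ with hypothetical helper names, not the pass's code) computes a total byte count in an arithmetic width of `bits` bits. With a 2-bit width, a backedge-taken count of 3 and a 4-byte store wrap to 0, while a 64-bit width (standing in for the index type, which is an assumption about the target) gives 16.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the low `bits` bits of a value,
// mimicking arithmetic in an integer type of that width.
constexpr std::uint64_t truncateToBits(std::uint64_t value, unsigned bits) {
  return bits >= 64 ? value : (value & ((std::uint64_t{1} << bits) - 1));
}

// Hypothetical helper: total bytes written by a loop that performs one
// store of `storeSize` bytes per iteration, with every intermediate
// value held in a `bits`-wide type.
constexpr std::uint64_t totalStoreBytes(std::uint64_t backedgeCount,
                                        std::uint64_t storeSize,
                                        unsigned bits) {
  // Trip count is the backedge-taken count plus one.
  const std::uint64_t tripCount = truncateToBits(backedgeCount + 1, bits);
  const std::uint64_t size = truncateToBits(storeSize, bits);
  return truncateToBits(tripCount * size, bits);
}

// In a 2-bit type both the trip count (4) and the store size (4) wrap to 0.
static_assert(totalStoreBytes(3, 4, 2) == 0, "narrow type loses the size");
// In a 64-bit type the same inputs give the expected 16 bytes.
static_assert(totalStoreBytes(3, 4, 64) == 16, "wide type keeps the size");
```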

Fixes PR52104.

Differential Revision: https://reviews.llvm.org/D111401
2021-10-11 09:39:06 +02:00
Dawid Jurczak
dd5991cc6f [LoopIdiom] Transform loop containing memcpy to memmove
The purpose of this patch is to teach the Loop Idiom Recognize pass to recognize simple memmove patterns
in a similar way to GCC: https://godbolt.org/z/dKjGvTGff
It is a follow-up to the following change: https://reviews.llvm.org/D104464

Differential Revision: https://reviews.llvm.org/D107075
2021-10-08 09:56:01 +02:00
Craig Topper
f2ad8c9dc6 [RISCV] Remove experimental-b extension that includes all Zb* extensions
At this point it looks like a B extension will never exist. Instead
Zba, Zbb, Zbc, and Zbs are individual extensions being ratified
together as a package. It is unknown at this time when or if the other
Zb* extensions will be ratified.

This patch removes references to the B extension. I've updated and
split tests accordingly.

This has been split from D110669 to make review a little easier.

Differential Revision: https://reviews.llvm.org/D111338
2021-10-07 20:47:17 -07:00
Philip Reames
f39978b84f [SCEV] Correctly propagate nowrap flags across scopes when folding invariant add through addrec
This fixes a violation of the wrap flag rules introduced in c4048d8f. This is an alternate fix to D106852.

The basic problem being fixed is that we infer a set of flags which is valid at some inner scope S1 (usually by correctly propagating them from IR), and then (incorrectly) extend them to a SCEV in scope S2 where S1 != S2. This is not in general safe per the wrap flags semantics recently defined.

In this patch, I include a simple inference step to handle the case where we can prove that S2 is the preheader of the loop S1, and that entry into S2 implies execution of S1. See the code for a more detailed explanation.

One worry I have with this patch is that I might be over-fitting what shows up in tests - and thus hiding negative impact we'd see in the real world. My best defense is that the rule used here very closely follows the one used to propagate the flags from IR to the inner add to start with, and thus if one is reasonable, so probably is the other. Curious what others think about that piece.

The test diffs are roughly as expected. Mostly analysis only, with two transform changes. Oddly, the result looks better in the loop-idiom test, and I don't understand the PPC output enough to tell. Nothing terrible looking though. (For context, without the scope inference peephole, the test delta includes a couple of vectorization tests. Again, not super concerning, but slightly more so.)

Differential Revision: https://reviews.llvm.org/D109845
2021-10-03 15:19:33 -07:00
Jon Roelofs
4b19e7dfae [LoopIdiomRecognize][Remarks] Track loop-strided store to/from blocks
Differential revision: https://reviews.llvm.org/D109929
2021-09-16 15:46:26 -07:00
Philip Reames
d4e03bccd4 regen an autogened test which is stale 2021-09-14 18:42:23 -07:00
Dawid Jurczak
bdcf04246c [LoopIdiom] Don't transform loop into memmove when load from body has more than one use
This change fixes an issue found by Markus: https://reviews.llvm.org/rG11338e998df1
Before this patch, the following code was transformed to memmove:

for (int i = 15; i >= 1; i--) {
  p[i] = p[i-1];
  sum += p[i-1];
}

However, the load from p[i-1] is used not only by the store to p[i] but also by the sum computation.
Therefore we cannot emit a memmove in the loop header.
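
To see why the second use of the load matters, here is a small self-contained C++ sketch. All names are hypothetical, the element-wise shift stands in for `memmove(p + 1, p, 15 * sizeof(int))`, and running it ahead of a leftover summing loop is an assumption about how the bad transform would play out. The leftover loop then reads elements that have already been shifted, so the sum changes.

```cpp
// Hypothetical state: the array from the loop above plus the running sum.
struct State {
  int p[16];
  int sum;
};

// Input with p[i] == i and sum == 0.
constexpr State makeInput() {
  State s{};
  for (int i = 0; i < 16; ++i) s.p[i] = i;
  return s;
}

// The loop exactly as written in the commit message.
constexpr State originalLoop(State s) {
  for (int i = 15; i >= 1; i--) {
    s.p[i] = s.p[i - 1];
    s.sum += s.p[i - 1];
  }
  return s;
}

// Same effect as memmove(p + 1, p, 15 * sizeof(int)): copy backwards so
// that no source element is overwritten before it is read.
constexpr State shiftUpByOne(State s) {
  for (int i = 15; i >= 1; --i) s.p[i] = s.p[i - 1];
  return s;
}

// Assumed shape of the bad transform: shift first, then sum the
// already-shifted array with what is left of the loop.
constexpr State shiftThenSum(State s) {
  s = shiftUpByOne(s);
  for (int i = 15; i >= 1; i--) s.sum += s.p[i - 1];
  return s;
}

// The array contents agree, but the sums do not.
static_assert(originalLoop(makeInput()).p[15] == 14, "array shifted by one");
static_assert(shiftThenSum(makeInput()).p[15] == 14, "array shifted by one");
static_assert(originalLoop(makeInput()).sum == 105, "sum of 0..14");
static_assert(shiftThenSum(makeInput()).sum == 91, "sum over shifted data");
```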

Differential Revision: https://reviews.llvm.org/D107964
2021-08-25 14:22:40 +02:00
Dawid Jurczak
2e8534beb2 [NFC][LoopIdiom] Add reproducer of wrong memmove transformation
This is a precommit test for D107964.

Differential Revision: https://reviews.llvm.org/D108537
2021-08-24 12:00:24 +02:00
eopXD
012173680f [LoopIdiom] let the pass deal with runtime memset size
The current LIR does not deal with runtime-determined memset sizes. This patch
uses SCEV to check whether the PointerStrideSCEV and the MemsetSizeSCEV are equal.
Before the comparison, the pass tries to fold the expression that is already
protected by the loop guard.

Test files `memset-runtime.ll` and `memset-runtime-debug.ll` added.

This patch deals with the proper loop idiom. A follow-up patch will deal with SCEVs
that are unequal after folding with the loop guards.

Reviewed By: lebedev.ri, Whitney

Differential Revision: https://reviews.llvm.org/D107353
2021-08-14 19:22:06 +08:00
Dawid Jurczak
11338e998d [LoopIdiom] Transform memmove-like loop into memmove (PR46179)
The purpose of this patch is to teach the Loop Idiom Recognize pass to recognize simple memmove patterns
in a similar way to GCC: https://godbolt.org/z/fh95e83od
LoopIdiomRecognize already has machinery for memset and memcpy recognition; this patch tries to extend the existing capabilities with minimal effort.

Differential Revision: https://reviews.llvm.org/D104464
2021-07-22 13:05:43 +02:00
Jon Roelofs
37b6e03c18 [Intrinsics] Make MemCpyInlineInst a MemCpyInst
This opens up more optimization opportunities in passes that already handle MemCpyInsts.

Differential revision: https://reviews.llvm.org/D105247
2021-07-02 10:25:24 -07:00
Eli Friedman
8f3d16905d [ScalarEvolution] Ensure backedge-taken counts are not pointers.
A backedge-taken count doesn't refer to memory; returning a pointer type
is nonsense. So make sure we always return an integer.

The obvious way to do this would be to just convert the operands of the
icmp to integers, but that doesn't quite work out at the moment:
isLoopEntryGuardedByCond currently gets confused by ptrtoint operations.
So we perform the ptrtoint conversion late for lt/gt operations.

The test changes are mostly innocuous. The most interesting changes are
more complex SCEV expressions of the form "((-1 * (ptrtoint i8* %ptr to
i64)) + %ptr)". This is expected: we can't fold this to zero because we
need to preserve the pointer base.

The call to isLoopEntryGuardedByCond in howFarToZero is less precise
because of ptrtoint operations; this shows up in the function
pr46786_c26_char in ptrtoint.ll. Fixing it here would require more
complex refactoring.  It should eventually be fixed by future
improvements to isImpliedCond.

See https://bugs.llvm.org/show_bug.cgi?id=46786 for context.

Differential Revision: https://reviews.llvm.org/D103656
2021-06-21 16:24:16 -07:00