11562 Commits

Author SHA1 Message Date
Nikita Popov
a5c3b5748c [MemCpyOpt] Work around PR54682
As discussed on https://github.com/llvm/llvm-project/issues/54682,
MemorySSA currently has a bug when computing the clobber of calls
that access loop-varying locations. I think a "proper" fix for this
on the MemorySSA side might be non-trivial, but we can easily work
around this in MemCpyOpt:

Currently, MemCpyOpt uses a location-less getClobberingMemoryAccess()
call to find a clobber on either the src or dest location, and then
refines it for the src and dest clobber. This was intended as an
optimization, as the location-less API is cached, while the
location-affected APIs are not.

However, I don't think this really makes a difference in practice,
because I don't think anything will use the cached clobbers on
those calls later anyway. On CTMark, this patch seems to be very
mildly positive actually.

So I think this is a reasonable way to avoid the problem for now,
though MemorySSA should also get a fix.

Differential Revision: https://reviews.llvm.org/D122911
2022-04-04 10:19:51 +02:00
Nikita Popov
c0cc98251a [Float2Int] Make sure dependent ranges are calculated first (PR54669)
The range calculation in walkForwards() assumes that the ranges of
the operands have already been calculated. With the used visit
order, this is not necessarily the case when there are multiple
roots. (There is nothing guaranteeing that instructions are visited
in topological order.)

Fix this by queuing instructions for reprocessing if the operand
ranges haven't been calculated yet.

Fixes https://github.com/llvm/llvm-project/issues/54669.

Differential Revision: https://reviews.llvm.org/D122817
2022-04-04 10:18:39 +02:00
Philip Reames
7c51669c21 [memcpyopt] Restructure store(load src, dest) form of callslotopt for compile time
The search for the clobbering call is fairly expensive if uses are not optimized at construction.  Defer the clobber walk to the point in the implementation we need it; there are a bunch of bailouts before that point.  (e.g. If the source pointer is not an alloca, we can't do callslotopt.)

On a test case which involves a bunch of copies from argument pointers, this switches memcpyopt from > 1/2 second to < 10ms.
2022-04-03 20:16:20 -07:00
Xiang1 Zhang
f830392be7 Correct spelling error in TLS-Load-Hoist 2022-04-04 08:27:54 +08:00
Florian Hahn
5bedc1f093
[ConstraintElimination] Move logic to build worklist to helper (NFC).
This refactor makes it easier to extend the logic to collect information
from blocks in the future, without even further increasing the size of
eliminateConstriants.
2022-04-02 16:55:05 +01:00
Xiang1 Zhang
a56f264958 Refine tls-load-hoista llvm option
Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D122890
2022-04-01 19:03:58 +08:00
Jorge Gorbe Moya
fc7573f29c Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 46774df307159444d65083c2fd82f8574f0ab1d9.
2022-03-31 14:54:41 -07:00
Paul Kirth
46774df307 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-31 17:38:21 +00:00
Nikita Popov
33ac23e7cf [Float2Int] Avoid unnecessary lamdbas (NFC)
Instead of first creating a lambda for calculating the range,
then collecting the ranges for the operands, and then calling the
lambda on those ranges, we can first calculate the operand ranges
and then calculate the result directly in the switch.
2022-03-31 16:13:13 +02:00
Nikita Popov
f66975555f [Float2Int] Extract calcRange() method (NFC)
This avoids the awkward "Abort" flag, because we can simply
early-return instead.
2022-03-31 16:13:13 +02:00
Aditya Kumar
368681f803 [GVNHoist] drop debug location according to the debug info guide
According to the LLVM debug info update guide: https://llvm.org/docs/HowToUpdateDebugInfo.html,
"Hoisting identical instructions which appear in several successor
blocks into a predecessor block. In this case there is no single
merged instruction. The rule for dropping locations applies".

Thanks to Yuanbo Li for reporting this.

Reviewed By: dblaikie

Reviewers: sebpop, tejohnson, dblaikie

Differential Revision: https://reviews.llvm.org/D122730
2022-03-30 20:17:53 -07:00
Stephen Long
e02f4976ac [LoopIdiom] Merge TBAA of adjacent stores when creating memset
Factor in the TBAA of adjacent stores instead of just the head store
when merging stores into a memset. We were seeing GVN remove a load that
had a TBAA that matched the 2nd store because GVN determined it didn't
match the TBAA of the memset. The memset had the TBAA of only the first
store.

i.e. Loading the field pi_ of shared_count after memset to create an
array of shared_ptr

template<class T>
class shared_ptr {
  T *p;
  shared_count refcount;
};

class shared_count {
  sp_counted_base *pi_;
};

Differential Revision: https://reviews.llvm.org/D122205
2022-03-30 16:54:49 -07:00
Chang-Sun Lin Jr
c28ce745cf Value-number GVNHoist loads by result type as well as pointer address.
Avoids merge errors when opaque pointers are loaded into different types.

Reviewed by: jcranmer-intel, hiraditya
Differential Revision: https://reviews.llvm.org/D122521
2022-03-30 11:33:49 -07:00
Florian Hahn
3dbb5eb2cd
[ConstraintElimination] Move ConstraintInfo after ConstraintTy. (NFC)
Code movement to it slightly easier to use ConstraintTy & co in
ConstraintInfo directly, for follow-up patches.
2022-03-29 09:59:03 +01:00
Serguei Katkov
6444a65514 [LSR] Fixup canonicalization formula and its checker.
According to definition of canonical form, it is a canonical
if scale reg does not contain addrec for loop L then none of bases
should contain addrec for this loop.

The critical word here is "contains".

Current checker of canonical form checks not "containing" property
but "is". So it does not check whether it contains but whether it is.

Fix the checker and canonicalizing utility to follow definition.

Without this fix in the test attached the base formula looking as
reg((-1 * {0,+,8}<nuw><nsw><%bb2>)<nsw>) + 1*reg((8 * (%arg /u 8))<nuw>)
is considered as conanocial while base contains an addrec.
And modified formula we want to insert
reg({0,+,8}<nuw><nsw><%bb2>) + 1*reg((-8 * (%arg /u 8)))
is considered as not canonical.

Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D122457
2022-03-29 14:05:04 +07:00
Paul Kirth
90cb325abd Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 2add3fbd976d7b80a3a7fc14ef0deb9b1ca6beee.
2022-03-29 06:20:30 +00:00
Philip Reames
33deaa13b8 [memcpyopt] Common code into performCallSlotOptzn [NFC]
We have the same code repeated in both callers, sink it into callee.

The motivation here isn't just code style, we can also defer the relatively expensive aliasing checks until the cheap structural preconditions have been validated.  (e.g. Don't bother aliasing if src is not an alloca.)  This helps compile time significantly.
2022-03-28 20:10:13 -07:00
Paul Kirth
2add3fbd97 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-28 23:30:04 +00:00
Alina Sbirlea
f7381a795a Revert 29fada4a3d3db309f11f7fa7a0c61cd4021e9947
Seeing a test failure with asan in Halide generated code, reverting
while I investigate.

Differential Revision: https://reviews.llvm.org/D121987
2022-03-28 16:17:41 -07:00
Florian Hahn
8c3281db49
[ConstraintElimination] Use AddOverflow for offset summation.
Fixes an incorrect transformation due to values overflowing
https://alive2.llvm.org/ce/z/uizoea
2022-03-25 18:08:24 +00:00
Djordje Todorovic
9dbc687a5e NFC: [LICM] Update some stale comments
After removing the MaybePromotable, some comments
became stale. This improves them.

Differential Revision: https://reviews.llvm.org/D122319
2022-03-24 14:37:20 +01:00
Nikita Popov
29fada4a3d [EarlyCSE] Don't eagerly optimize MemoryUses
EarlyCSE currently optimizes all MemoryUses upfront. However,
EarlyCSE only actually queries the clobbering memory access for
a subset of uses, namely those where a CSE candidate has already
been identified. Delaying use optimization to the clobber query
improves compile-time in practice.

This change is not NFC because EarlyCSE has a limit on the number
of clobber queries (EarlyCSEMssaOptCap), in which case it falls
back to the defining access. The defining access for uses will now
no longer coincide with the optimized access.

If there are performance regressions from this change, we should
be able to address them by raising this limit.

Differential Revision: https://reviews.llvm.org/D121987
2022-03-23 16:47:35 +01:00
Nikita Popov
afb526b3f4 [LICM] Handle store of pointer to itself (PR54495)
Rather than iterating over users and comparing operands, iterate
over uses and check operand number. Otherwise, we'll end up
promoting a store twice if it has two equal operands.

This can only happen with opaque pointers, as otherwise both
operands differ by a level of indirection, so a bitcast would have
to be involved.

Fixes https://github.com/llvm/llvm-project/issues/54495.
2022-03-22 14:00:07 +01:00
Philip Reames
ee7324b898 Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc] 2022-03-21 10:01:40 -07:00
psamolysov-intel
2ed030ba88 [InferAddressSpaces][NFC] Small code improvements for the InferAddressSpaces pass
There is a bunch of code improvements in the patch: marking as const everything what can be
const and fixing some typos in comments.

Also the patch removes the shadowing parameter TTI from the rewriteWithNewAddressSpaces
method, the TTI parameter is not required because the same field is in the class.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D121671
2022-03-21 11:03:12 -05:00
Kazu Hirata
bce1bf0ee2 [Transform] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC) 2022-03-20 10:41:22 -07:00
Nick Desaulniers
e1bae23f6f [SCCP] do not clean up dead blocks that have their address taken
[SCCP] do not clean up dead blocks that have their address taken

Fixes a crash observed in IPSCCP.

Because the SCCPSolver has already internalized BlockAddresses as
Constants or ConstantExprs, we don't want to try to update their Values
in the ValueLatticeElement. Instead, continue to propagate these
BlockAddress Constants, continue converting BasicBlocks to unreachable,
but don't delete the "dead" BasicBlocks which happen to have their
address taken.  Leave replacing the BlockAddresses to another pass.

Fixes: https://github.com/llvm/llvm-project/issues/54238
Fixes: https://github.com/llvm/llvm-project/issues/54251

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D121744
2022-03-18 11:02:15 -07:00
Florian Hahn
5ab421fb4e
[LICM] Add allowspeculation pass options.
This adds a new option to control AllowSpeculation added in D119965 when
using `-passes=...`.

This allows reproducing #54023 using opt.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D121944
2022-03-18 16:51:57 +00:00
Nikita Popov
ab2284a643 [LowerConstantIntrinsics] Make TLI a required dependency
The way the pass is actually used in the optimization pipeline,
TLI will be available, but this is not the case when running just
-lower-constant-intrinsics in tests, which ends up being quite
confusing.

Require TLI unconditionally, as we usually do.
2022-03-18 14:59:18 +01:00
Nikita Popov
f96428e16d [MemorySSA] Don't optimize uses during construction
This changes MemorySSA to be constructed in unoptimized form.
MemorySSA::ensureOptimizedUses() can be called to optimize all
uses (once). This should be done by passes where having optimized
uses is beneficial, either because we're going to query all uses
anyway, or because we're doing def-use walks.

This should help reduce the compile-time impact of MemorySSA for
some use cases (the reason why I started looking into this is
D117926), which can avoid optimizing all uses upfront, and instead
only optimize those that are actually queried.

Actually, we have an existing use-case for this, which is EarlyCSE.
Disabling eager use optimization there gives a significant
compile-time improvement, because EarlyCSE will generally only query
clobbers for a subset of all uses (this change is not included in
this patch).

Differential Revision: https://reviews.llvm.org/D121381
2022-03-18 09:56:16 +01:00
Florian Hahn
4a699ae9c6
[LoopSimplifyCFG] Check predecessors of exits before marking them dead.
LoopSimplifyCFG may process loops that are not in
loop-simplify/canonical form. For loops not in canonical form, exit
blocks may be reachable from non-loop blocks and we cannot consider them
as dead if they only are not reachable from the loop itself.

Unfortunately the smallest test I could come up with requires running
multiple passes:
    -passes='loop-mssa(loop-instsimplify,loop-simplifycfg,simple-loop-unswitch)'

The reason is that loops are canonicalized at the beginning of loop
pipelines, so a later transform has to break canonical form in a way
that breaks LoopSimplifyCFG's dead-exit analysis.

Alternatively we could try to require all loop passes to maintain
canonical form. That in turn would also require additional verification.

Fixes #54023, #49931.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D121925
2022-03-18 08:54:44 +00:00
Paul Kirth
964398ccb1 Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"""
This reverts commit 6cf560d69a222bff4af4e1d092437fd77f0f981c.
2022-03-18 00:21:33 +00:00
Paul Kirth
6cf560d69a Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""
I mistakenly reverted my commit, so I'm relanding it.

This reverts commit 10866a1df4a82cdc54187330c509a2d46235455d.
2022-03-18 00:04:22 +00:00
Paul Kirth
10866a1df4 Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit e7749d4713a5ec886011ceb0fc821c6723061724.
2022-03-17 23:54:26 +00:00
Paul Kirth
e7749d4713 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Differential Revision: https://reviews.llvm.org/D115907
2022-03-17 23:46:23 +00:00
Florian Hahn
470a975c84
[ConstraintElimination] Add missing dominance check.
When dealing with an unconditional branch, the condition can only added
if BB properly dominates the successor.
2022-03-16 20:01:24 +00:00
Nikita Popov
d7cf7ec05d [SROA] Handle over-large loads during presplitting
When a load extends past the extent of the alloca, SROA will
restrict the slice size to extend to the end of the alloca only.
However, presplitting was asserting that the load size and the
slice size match exactly, which does not hold in this case.
Relax the assertion to only require that the load size is greater
or equal than the slice size.
2022-03-16 15:41:11 +01:00
Florian Hahn
f473d4aa80
[ConstraintElimination] Support BBs with single successor in CanAdd.
If BB has a single successor, conditions can be added safely.
2022-03-16 14:13:52 +00:00
Nikita Popov
cf18ec445d [GVN] Check load type in select PRE
This is no longer implicitly guaranteed with opaque pointers.
2022-03-14 12:46:54 +01:00
Benoit Jacob
9879c555f2 Expose ScalarizerPass options to C++ (not just commandline)
Context: I needed this for https://github.com/google/iree/pull/8474 .
I found that TSan instrumentation expects vector sizes to be <= 16,
and in my project (IREE) we have tests with higher vector sizes.
That left some test functions uninstrumented, resulting in crashes as
instrumented code called into them.

Differential Revision: https://reviews.llvm.org/D121182
2022-03-14 12:00:35 +01:00
serge-sans-paille
ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Benjamin Kramer
dbc32e2aa7 [LoopUnswitch] Use SmallPtrSet instead of std::set. NFCI. 2022-03-11 19:14:34 +01:00
Xiang1 Zhang
c31014322c TLS loads opimization (hoist)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D120000
2022-03-10 09:29:06 +08:00
Congzhe Cao
abc8ca65c3 [LoopInterchange] Detect output dependency of a store instruction with itself
This patch is motivated by pr48057 where an output dependency is not detected
since loop interchange did not check a store instruction with itself.
Fixed that deficiency.

Reviewed By: bmahjour, Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D118102
2022-03-09 15:50:27 -05:00
Florian Hahn
f98125abb2
Revert "[PassManager] Add pretty stack entries before P->run() call."
This reverts commit 128745cc2681c284bc6d0150a319673a6d6e8424.

This increased compile-time unnecessarily. Revert this change and follow
ups 2c7afadb4789 & add0c5856d5f.

http://llvm-compile-time-tracker.com/compare.php?from=338dfcd60f843082bb589b287d890dbd9394eb82&to=128745cc2681c284bc6d0150a319673a6d6e8424&stat=instructions
2022-03-09 18:46:32 +00:00
Florian Hahn
128745cc26
[PassManager] Add pretty stack entries before P->run() call.
This patch adds PrettyStackEntries before running passes. The entries
include the pass name and the IR unit the pass runs on.

The information is used the print additional information when a pass
crashes, including the name and a reference to the IR unit on which it
crashed. This is similar to the behavior of the legacy pass manager.

The improved stack trace now includes:

Stack dump:
0.	Program arguments: bin/opt -loop-vectorize -force-vector-width=4 crash.ll
1.	Running pass 'ModuleToFunctionPassAdaptor' on module 'crash.ll'
2.	Running pass 'LoopVectorizePass' on function '@a'

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D120993
2022-03-09 13:01:09 +00:00
Florian Hahn
e10b0ea371
[ConstraintElimination] Remove over-eager assertion.
After moving the CanAdd check in c60cdb44f7ecb4b02ed and using it for
the assume cases as well, the passed in block may not have  a branch
instruction as terminator. This can trigger the assertion. Given the new
use case, it doesn't add value any longer and can be removed.

Fixes https://github.com/llvm/llvm-project/issues/54281
2022-03-08 22:02:08 +00:00
Florian Hahn
4bbee17ecb
[ConstraintElimination] Use ZExtValue for unsigned decomposition.
When decomposing constraints for unsigned conditions, we can use
negative values by zero-extending them, as long as they are less than
the maximum constraint value.

Fixes https://github.com/llvm/llvm-project/issues/54224
2022-03-07 13:34:01 +00:00
Florian Hahn
c60cdb44f7
[ConstraintElimination] Only add cond from assume to succs if valid.
Add missing CanAdd check before adding a condition from an assume
to the successor blocks. When adding information from assume to
successor blocks we need to perform the same CanAdd as we do for adding
a condition from a branch.

Fixes https://github.com/llvm/llvm-project/issues/54217
2022-03-07 12:01:15 +00:00
Florian Hahn
542c335159
[ConstraintElimination] Remove dead variables when dropping constraints.
This patch extends ConstraintElimination to also remove dead variables
when removing a constraint. When a constraint is removed because it is
out of scope, all new variables added for this constraint can also be
removed.

This keeps the total size of the systems much smaller, because it
reduces the number of variables drastically.

It also fixes a bug where variables where removed incorrectly.

Fixes https://github.com/llvm/llvm-project/issues/54228
2022-03-07 09:04:07 +00:00