625 Commits

Author SHA1 Message Date
Max Kazantsev
f91aaf1b0c Return "[LICM] Support logical AND/OR when hoisting min/max"
Underlying bug (creation of umin for pointers) is now fixed.

Differential Revision: https://reviews.llvm.org/D145771
2023-03-13 14:34:43 +07:00
Max Kazantsev
fc128e126b [LICM] Do not hoist min/max for pointer types
umin and similar intrinsics are not defined for them.
2023-03-13 14:12:21 +07:00
Vitaly Buka
f902ead7cb Revert "[LICM] Support logical AND/OR when hoisting min/max"
Breaks https://lab.llvm.org/buildbot/#/builders/37/builds/20720

This reverts commit 9e83d13c9f77e300ebb7b94a1400de3c2d47b3d5.
2023-03-10 09:52:57 -08:00
Nikita Popov
a7322a2171 [LICM] Delay fetching of preheader (NFC)
Only fetch preheader once we want to actually hoist. It turns out
that calculating the preheader is expensive enough to affect
overall compile-time if you do it for every single instruction.

Addresses the compile-time regression from D143726.
2023-03-10 16:16:48 +01:00
Max Kazantsev
9e83d13c9f [LICM] Support logical AND/OR when hoisting min/max
We can handle logical AND/OR in the same way as arithmetic AND/OR, it only
takes us freezing `RHS2` for which we may introduce a new use which didn't
exist before dynamically.

Differential Revision: https://reviews.llvm.org/D145771
Reviewed By: nikic
2023-03-10 18:07:15 +07:00
Max Kazantsev
6b03ce374e [LICM] Simplify (X < A && X < B) into (X < MIN(A, B)) if MIN(A, B) is loop-invariant
We don't do this transform in InstCombine in general case for arbitrary values, because cost of
AND and 2 ICMP's isn't higher than of MIN and ICMP. However, LICM also has a notion
about the loop structure. This transform becomes profitable if `A` and `B` are loop-invariant and
`X` is not: by doing this, we can compute min outside the loop.

Differential Revision: https://reviews.llvm.org/D143726
Reviewed By: nikic
2023-03-10 17:36:52 +07:00
Max Kazantsev
ff687c47b3 [LICM][NFC] Don't preserve DT and loop analyzes separately
This is already implied by getLoopPassPreservedAnalyses.

Differential Revision: https://reviews.llvm.org/D144860
Reviewed By: nikic, skatkov
2023-02-28 17:02:51 +07:00
William S. Moses
58eac856cc [LICM] Ensure LICM can hoist invariant.group
Invariant.group's are not sufficiently handled by LICM. Specifically,
if a given invariant.group loaded pointer is not overwritten between
the start of a loop, and its use in the load, it can be hoisted.
The invariant.group (on an already invariant pointer operand) ensures
the result is the same. If it is not overwritten between the start
of the loop and the load, it is therefore legal to hoist.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D144053
2023-02-26 12:41:41 -05:00
Liren Peng
529ee9750b [NFC] Use single quotes for single char output during printPipline
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D144365
2023-02-22 02:35:13 +00:00
Nikita Popov
ea6e8d4f59 [LICM] Don't generate crash dialog for missing MSSA
This is a user error, so we should not be asking them to report
an issue.
2023-01-23 11:59:32 +01:00
Guillaume Chatelet
8fd5558b29 [NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:49:38 +00:00
Nikita Popov
88419a30a0 [LICM] Allow load-only scalar promotion in the presence of aliasing loads
During scalar promotion, if there are additional potentially-aliasing
loads outside the promoted set, we can still perform a load-only
promotion. As the stores are retained, any potentially-aliasing
loads will still read the correct value.

This increases the number of load promotions in llvm-test-suite by
a factor of two:

                                |  Old |  New
    licm.NumPromotionCandidates | 4448 | 6038
    licm.NumLoadPromoted        |  479 | 1069
    licm.NumLoadStorePromoted   | 1459 | 1459

Unfortunately, this does have some impact on compile-time:
http://llvm-compile-time-tracker.com/compare.php?from=57f7f0d6cf0706a88e1ecb74f3d3e8891cceabfa&to=72b811738148aab399966a0435f13b695da1c1c8&stat=instructions
In part this is because we now have less early bailouts from
promotion, but also due to second order effects (e.g. for one case
I looked at we spend more time in SLP now).

Differential Revision: https://reviews.llvm.org/D133192
2022-12-20 10:02:46 +01:00
Vasileios Porpodas
32b38d248f [NFC] Rename Instruction::insertAt() to Instruction::insertInto(), to be consistent with BasicBlock::insertInto()
Differential Revision: https://reviews.llvm.org/D140085
2022-12-15 12:27:45 -08:00
Vasileios Porpodas
06911ba6ea [NFC] Cleanup: Replaces BB->getInstList().insert() with I->insertAt().
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.

Differential Revision: https://reviews.llvm.org/D138877
2022-12-12 13:33:05 -08:00
Nikita Popov
ed76074173 [LICM] Remove custom isInstInList() implementation (PR59324)
We already collect all instructions that need to be promoted. The
custom isInstInList() implementation could provide incorrect
results if a new use of the original pointer was introduced as
part of promotion. This probably cannot happen with normal code,
because of the pointer capture, but it can happen with a null
pointer.

Fixes https://github.com/llvm/llvm-project/issues/59324.
2022-12-07 09:51:35 +01:00
Nikita Popov
e95ca5bb05 [AST] Make AliasSetTracker work on BatchAA
D138014 restricted AST to work on immutable IR. This means it is
also safe to use a single BatchAA instance for the entire AST
lifetime, instead of only batching parts of individual queries.

The primary motivation for this is not compile-time, but rather
having a central place to control cross-iteration AA, which will
be used by D137958.

Differential Revision: https://reviews.llvm.org/D137955
2022-12-05 08:12:26 +01:00
Kazu Hirata
7524db4d44 [llvm] Remove unused forward declarations (NFC) 2022-11-20 09:59:36 -08:00
OCHyams
2da67e8053 [Assignment Tracking][18/*] Account for assignment tracking in LICM
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Merge DIAssignID attachments on stores that are merged and sunk out of
loops. The store may be sunk into multiple exit blocks, and in this case all
the copies of the store get the same DIAssignID.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133313
2022-11-15 12:24:16 +00:00
Patrick Walton
01859da84b [AliasAnalysis] Introduce getModRefInfoMask() as a generalization of pointsToConstantMemory().
The pointsToConstantMemory() method returns true only if the memory pointed to
by the memory location is globally invariant. However, the LLVM memory model
also has the semantic notion of *locally-invariant*: memory that is known to be
invariant for the life of the SSA value representing that pointer. The most
common example of this is a pointer argument that is marked readonly noalias,
which the Rust compiler frequently emits.

It'd be desirable for LLVM to treat locally-invariant memory the same way as
globally-invariant memory when it's safe to do so. This patch implements that,
by introducing the concept of a *ModRefInfo mask*. A ModRefInfo mask is a bound
on the Mod/Ref behavior of an instruction that writes to a memory location,
based on the knowledge that the memory is globally-constant memory (in which
case the mask is NoModRef) or locally-constant memory (in which case the mask
is Ref). ModRefInfo values for an instruction can be combined with the
ModRefInfo mask by simply using the & operator. Where appropriate, this patch
has modified uses of pointsToConstantMemory() to instead examine the mask.

The most notable optimization change I noticed with this patch is that now
redundant loads from readonly noalias pointers can be eliminated across calls,
even when the pointer is captured. Internally, before this patch,
AliasAnalysis was assigning Ref to reads from constant memory; now AA can
assign NoModRef, which is a tighter bound.

Differential Revision: https://reviews.llvm.org/D136659
2022-10-31 13:03:41 -07:00
Nikita Popov
747f27d97d [AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)
Follow up on D135962, renaming the method name to match the new
type name.
2022-10-19 11:03:54 +02:00
Nikita Popov
1a9d9823c5 [AA] Rename uses of FunctionModRefBehavior (NFC)
Followup to D135962 to rename remaining uses of
FunctionModRefBehavior to MemoryEffects. Does not touch API names
yet, but also updates variables names FMRB/MRB to ME, to match the
new type name.
2022-10-19 10:54:47 +02:00
Shubham Narlawar
b920407cf5 [LICM] Disable thread-safety checks in single-thread model
If the single-thread model is used, or the
-licm-force-thread-model-single flag is specified, skip checks
related to thread-safety. This means that store promotion for
conditionally executed stores only requires proof of
dereferenceability and writability, but not of thread-safety. For
example, this enables promotion of stores to (non-constant) globals,
as well as captured allocas.

Fixes https://github.com/llvm/llvm-project/issues/50537.

Differential Revision: https://reviews.llvm.org/D130466
2022-10-10 16:51:16 +02:00
Nikita Popov
d40dcb0b8d [LICM] Collect more scalar promotion stats (NFC)
Collect more statistics for scalar promotion. In particular,
keep track of how many promotion candidates there were, and
whether it is a load or a load/store promotion.
2022-09-30 16:07:52 +02:00
Matt Arsenault
0d1f040749 LICM: Pass through AssumptionCache 2022-09-21 09:16:17 -04:00
Matt Arsenault
ce44357216 Analysis: Add AssumptionCache to isSafeToSpeculativelyExecute
Does not update any of the uses.
2022-09-19 19:25:22 -04:00
Matt Arsenault
0d8ffcc532 Analysis: Add AssumptionCache argument to isDereferenceableAndAlignedPointer
This does not try to pass it through from the end users.
2022-09-19 18:57:33 -04:00
Max Kazantsev
818b1ab84e [SCEV][NFC] Remove unused parameter from forgetLoopDispositions
Let's be honest about it, we don't drop loop dispositions for
particular loops. Remove the parameter that misleadingly makes
it apparent that we do.
2022-09-19 14:06:42 +07:00
Nikita Popov
b1cd393f9e [AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI)
Currently, FunctionModRefBehavior tracks whether the function reads
or writes memory (ModRefInfo) and which locations it can access
(argmem, inaccessiblemem and other). This patch changes it to track
ModRef information per-location instead.

To give two examples of why this is useful:

* D117095 highlights a weakness of ModRef modelling in the presence
  of operand bundles. For a memcpy call with deopt operand bundle,
  we want to say that it can read any memory, but only write argument
  memory. This would allow them to be treated like any other calls.
  However, we currently can't express this and have to say that it
  can read or write any memory.
* D127383 would ideally be modelled as a separate threadid location,
  where threadid Refs outside pre-split coroutines can be ignored
  (like other accesses to constant memory). The current representation
  does not allow modelling this precisely.

The patch as implemented is intended to be NFC, but there are some
obvious opportunities for improvements and simplification. To fully
capitalize on this we would also want to change the way we represent
memory attributes on functions, but that's a larger change, and I
think it makes sense to separate out the FunctionModRefBehavior
refactoring.

Differential Revision: https://reviews.llvm.org/D130896
2022-09-14 16:34:41 +02:00
Nikita Popov
a9f312c7f4 [AST] Use BatchAA in aliasesUnknownInst() (NFCI) 2022-09-09 15:54:48 +02:00
Nikita Popov
4ab77d1677 [LICM] Allow promotion with non-load/store users
If there are non-load/store users of the promoted pointer, we
currently abort promotion. However, having such users isn't really
relevant to the transform. We already separately check that a)
there are no instructions that modref the promoted pointer and
b) that a pointer capture disables store promotion.

In the affected @test_captured_in_loop test case we have a readnone
capture of the promoted pointer, which means that load promotion
can be performed (while store promotion cannot).

Differential Revision: https://reviews.llvm.org/D133485
2022-09-09 13:09:59 +02:00
Nikita Popov
388b684354 [LICM] Separate check for writability and thread-safety (NFCI)
This used a single check to make sure that the object is both
writable and thread-local. Separate them out to make the
deficiencies in the current code more obvious.
2022-09-05 09:43:17 +02:00
Nikita Popov
639d912282 [LICM] Allow load-only scalar promotion in the presence of unwinding
Currently, we bail out of scalar promotion if the loop may unwind
and the memory may be visible on unwind. This is because we can't
insert stores of the promoted value on unwind edges.

However, nowadays scalar promotion also has support for only
promoting loads, while leaving stores in place. This kind of
promotion is safe even in the presence of unwinding.

Differential Revision: https://reviews.llvm.org/D133111
2022-09-02 09:27:13 +02:00
Nikita Popov
f5c178b6a4 [LICM] Remove unnecessary condition (NFC) 2022-09-01 15:42:35 +02:00
Nikita Popov
315aef667e [LICM] Fix thread safety checks for promotion of byval args
This code was relying on a very subtle contract: The expectation
was that for non-allocas, the unwind safety check would already
perform a capture check, so we don't need to perform it later.
This held true when this unwind safety was only handled for allocas
and noalias calls, but became incorrect when byval support was
added.

To avoid this kind of issue, just remove the dependency between the
unwind and thread-safety checks entirely. At worst, this means we
perform a redundant capture check. If this should turn out to be
problematic for compile-time, we can cache that query in a more
explicit way.
2022-09-01 15:33:46 +02:00
Nikita Popov
3f8b1d0f15 [LICM] Add some debug output to scalar promotion (NFC) 2022-09-01 14:46:30 +02:00
Arthur Eubanks
04f3c20989 [NFC][LICM] Stop passing around unused BFI
Uses of this were removed in 1a25d0bfbb6b587caa03bacd121b67086a774598.
2022-08-31 19:15:34 -07:00
Kazu Hirata
6b1bc80188 [Scalar] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-20 21:18:25 -07:00
Simon Pilgrim
fdec50182d [CostModel] Replace getUserCost with getInstructionCost
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.

Original Patch by @samparker (Sam Parker)

Differential Revision: https://reviews.llvm.org/D79483
2022-08-18 11:55:23 +01:00
Simon Pilgrim
e48892ee42 [Transforms] LICM.cpp - pull out repeated getUserCost call
Pulled out of D79483
2022-08-18 10:43:29 +01:00
Kazu Hirata
0e37ef0186 [Transforms] Fix comment typos (NFC) 2022-08-07 23:55:24 -07:00
Nikita Popov
40a4078e14 [BasicBlockUtils] Allow splitting predecessors with callbr terminators
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.

The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.

I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.

Differential Revision: https://reviews.llvm.org/D129205
2022-07-07 09:13:25 +02:00
Nuno Lopes
6ef9a2ad01 [LICM] Use poison to replace unreachable values instead of undef [NFC] 2022-06-26 14:56:35 +01:00
Adrian Tong
4e555a3df4 Fix a misspell. NFC 2022-06-22 21:23:21 +00:00
Florian Hahn
0776c48f9b
Recommit "[LICM] Only create load in ph when promoting load or store doesn't exec."
This reverts the revert commit ad95255b9215a.

The updated version also creates a load when the store may not execute.
In those cases, we still need to introduce a load in a function where
there may not have been one before, so this doesn't completely resolve
issue #51248.

Original message:

    When only a store is sunk, there is no need to create a load in the
    pre-header, as the result of the load will never get used.

    The dead load can can introduce UB, if the function is marked as
    writeonly.

    Reviewed By: nikic

    Differential Revision: https://reviews.llvm.org/D123473
2022-05-29 21:57:14 +01:00
Florian Hahn
ad95255b92
Revert "[LICM] Only create load in pre-header when promoting load."
This reverts commit 4bf3b7dc929c8288e9e5631978ef060d9140b251.

This might be causing another buildbot failure.
2022-04-13 20:24:28 +02:00
Florian Hahn
4bf3b7dc92
Recommit "[LICM] Only create load in pre-header when promoting load."
This reverts the revert commit 1ddc719680c21f3.

This version of the patch sets the initial available value to poison,
which resolves an issue with the SSAUpdater breaking LCSSA form.
2022-04-13 17:20:39 +02:00
Florian Hahn
1ddc719680
Revert "[LICM] Only create load in pre-header when promoting load."
This reverts commit 42229b96bf94ec896d5c62fa643d83ba96e86eea.

This appears to cause crashes on multiple bots.
2022-04-11 17:37:23 +02:00
Florian Hahn
42229b96bf
[LICM] Only create load in pre-header when promoting load.
When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.

The dead load can can introduce UB, if the function is marked as
writeonly.

Fixes #51248.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123473
2022-04-11 16:45:18 +02:00
Nikita Popov
c8c6362560 [LICM] Pass MemorySSAUpdater by referene (NFC)
Make it clearer that this is a required dependency.
2022-04-08 10:08:57 +02:00
Nikita Popov
5cefe7d9f5 [LoopSink] Require MemorySSA
This makes MemorySSA in LoopSink required, and removes the AST-based
implementation, as well as the related support code in LICM.

Differential Revision: https://reviews.llvm.org/D123288
2022-04-08 09:49:44 +02:00