Make the fold use the information present in the condition to deduce constants, e.g.:
```
%c = icmp eq i8 %x, 10
%s = select i1 %c, i8 3, i8 2
%r = mul i8 %x, %s
```
If we fold the `mul` into the select, then on the true arm we can substitute `10` for `%x` in the `mul` and constant-fold it.
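A sketch of the result, assuming the fold fires (value names are illustrative): the true arm becomes the constant `mul i8 10, 3` = `30`, while the false arm keeps the multiply.
```
%c = icmp eq i8 %x, 10
%m = mul i8 %x, 2
%r = select i1 %c, i8 30, i8 %m
```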
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D146349
Remove C APIs for interacting with PassRegistry and pass
initialization. These are legacy PM concepts, and are no longer
relevant for the new pass manager.
Calls to these initialization functions can simply be dropped.
Differential Revision: https://reviews.llvm.org/D145043
No need to include CallGraphSCCPass.h from the IPO/Inliner.
Also removed the include of LegacyPassManager.h in a couple of files
that do not really depend on that header file.
Differential Revision: https://reviews.llvm.org/D148083
- Because the address space cast may not be valid on a specific target,
`addrspacecast` was not handled when an `alloca` could be replaced
with the source of memcpy/memmove. This patch addresses that by
querying a target hook on whether that address space cast is valid.
For example, on most GPU targets, the cast from a global pointer to a
generic pointer is valid.
- If that cast is allowed (by querying `isValidAddrSpaceCast`), the
replacement is enhanced to handle that `addrspacecast` as well, as
sketched below.
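A minimal sketch of the enhanced replacement, assuming a target where the global-to-generic cast is valid (names and sizes are illustrative):
```
%a = alloca [32 x i8]
call void @llvm.memcpy.p0.p1.i64(ptr %a, ptr addrspace(1) %src, i64 32, i1 false)
; ... read-only users of %a ...
==>
%cast = addrspacecast ptr addrspace(1) %src to ptr
; ... the users now read through %cast instead of %a ...
```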
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D147025
Move this handling to a centralized place and extend it to handle
saturating add/sub intrinsics.
I originally wanted to make this fully generic rather than
whitelist based, because this is legal and likely profitable for all
speculatable intrinsics. The caveat is that for vector selects,
the intrinsic can't perform cross-lane operations like a shuffle
or reduction, which we don't really expose as a generic property
right now. So for now I'm just extending the list.
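For illustration, here is the kind of fold this enables for a saturating add over a select with constant arms (constants are made up; uadd.sat(10, 100) = 110 and uadd.sat(20, 100) = 120, neither saturating in i8):
```
%s = select i1 %c, i8 10, i8 20
%r = call i8 @llvm.uadd.sat.i8(i8 %s, i8 100)
==>
%r = select i1 %c, i8 110, i8 120
```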
Now that we canonicalize to min/max intrinsics, we no longer need
to guard against this here.
In fact, it seems like the issue from PR46271 was the final push
for introducing the intrinsics in the first place...
Combine a binary operator of two phi nodes if, for each shared
incoming block, at least one of phi0's and phi1's incoming values
is a specific constant that allows the binary operator to be
simplified.
For example:
```
%phi0 = phi i32 [0, %bb0], [%i, %bb1]
%phi1 = phi i32 [%j, %bb0], [0, %bb1]
%add = add i32 %phi0, %phi1
==>
%add = phi i32 [%j, %bb0], [%i, %bb1]
```
Fixes: https://github.com/llvm/llvm-project/issues/61137
Differential Revision: https://reviews.llvm.org/D145223
This is the follow-up to D144199 and the suggestion in D144045.
We make the use of loop info explicit via an InstCombine pass
parameter rather than acquiring it semi-arbitrarily via caching.
The only InstCombine transform that uses LoopInfo currently is a
GEP fold in visitGEPOfGEP(), so that shows up as a failure in the
dedicated test for the fold as well as several LoopVectorizer tests
that run extra passes.
I don't see any pass manager regression tests that actually check
for pass options, but this is intended to be NFC for the pass
pipeline behavior - we only try to use loop info where it would
have been used before via caching.
Differential Revision: https://reviews.llvm.org/D144274
Legacy passes are only supported for codegen, and I don't believe
it's possible to write backends using the C API, so we should drop
all of those. Reduces the number of places that need to be modified
when removing legacy passes.
Differential Revision: https://reviews.llvm.org/D144970
The reported compile-time regression has been addressed in
47f9109dff80a1abbe2705ee71dc0882b1d62274.
Additionally, this contains a change to immediately fold a zext
with a constant operand, even if it is used in a trunc. I'm not sure
whether this is relevant for anything, but I noticed it as a behavioral
discrepancy while investigating this issue.
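For illustration (a made-up snippet), the zext below is now folded to `i32 42` right away, even though its only use is a trunc:
```
%z = zext i8 42 to i32
%t = trunc i32 %z to i16
```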
-----
InstCombine currently performs a constant folding attempt as part
of the main InstCombine loop, before visiting the instruction.
However, each visit method will also attempt to simplify the
instruction, which will in turn constant fold it. (Additionally,
we also constant fold instructions before the main InstCombine loop
and use a constant folding IR builder, so this is doubly redundant.)
There is one place where InstCombine visit methods currently don't
call into simplification, and that's casts. To be conservative,
I've added an explicit constant folding call there (though it has
no impact on tests).
This makes for a mild compile-time improvement and in particular
mitigates the compile-time regression from enabling load
simplification in be88b5814d9efce131dbc0c8e288907e2e6c89be.
Differential Revision: https://reviews.llvm.org/D144369
This increases compile time with ubsan on ARM from 3 to 14 minutes for a single file.
I uploaded a reproducer to D144369.
We also see random timeouts on internal x86_64 builds.
Both bisected to this commit.
This reverts commit 45a0b812fa13ec255cae91f974540a4d805a8d79.
When assignment tracking is enabled without this patch, a significant amount of
additional compiler run time comes from the RemoveRedundantDbgInstrs call in
InstCombine. This patch reduces compiler run time by choosing better places to
call RemoveRedundantDbgInstrs.
In non-assignment-tracking builds, RemoveRedundantDbgInstrs is called by
InstCombine if LowerDbgDeclare makes a change (i.e. it is _sometimes_
called). In assignment tracking builds LowerDbgDeclare doesn't do anything. We
still need to clean up redundant intrinsics to avoid a large performance hit
due to the number of instructions, so the current approach is to have
InstCombine _always_ call RemoveRedundantDbgInstrs.
Instrumenting the compiler to run RemoveRedundantDbgInstrs after every pass,
dumping the numbers, and building CTMark/tramp3d-v4 indicates that SROA and
LoopVectorize give us a bigger bang (number of instructions removed) for the
buck (number of times the pass is run).
The compile time tracker reports that this patch reduces the number of
instructions retired building CTMark projects by an average of 1.1%.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D144483
With this patch, freeze of undef/poison will no longer be folded into a
constant if it is used as a vector operand of a shufflevector.
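A sketch of the pattern that is now left alone (types are illustrative); the poison operand of the shufflevector is no longer replaced by an arbitrary constant when folding the freeze:
```
%s = shufflevector <2 x i8> %x, <2 x i8> poison, <2 x i32> <i32 0, i32 2>
%f = freeze <2 x i8> %s
```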
Differential Revision: https://reviews.llvm.org/D143593
InstCombine currently performs a constant folding attempt as part
of the main InstCombine loop, before visiting the instruction.
However, each visit method will also attempt to simplify the
instruction, which will in turn constant fold it. (Additionally,
we also constant fold instructions before the main InstCombine loop
and use a constant folding IR builder, so this is doubly redundant.)
There is one place where InstCombine visit methods currently don't
call into simplification, and that's casts. To be conservative,
I've added an explicit constant folding call there (though it has
no impact on tests).
This makes for a mild compile-time improvement and in particular
mitigates the compile-time regression from enabling load
simplification in be88b5814d9efce131dbc0c8e288907e2e6c89be.
Differential Revision: https://reviews.llvm.org/D144369
This is a cleanup/modernization patch requested in D144045 to make loop
analysis a proper optional parameter to the pass rather than a
semi-arbitrary value inherited via the pass pipeline.
It's a bit more complicated than the recent patch I started copying from
(D143980) because InstCombine already has an option for MaxIterations
(added with D71145).
I debated just deleting that option, but it was used by a pair of existing
tests, so I put it into a struct (code largely copied from SimplifyCFG's
implementation) to make the code more flexible for future options
enhancements.
I didn't alter the pass manager invocations of InstCombine in this patch
because the patch was already getting big, but that will be a small
follow-up as noted with the TODO comment.
Differential Revision: https://reviews.llvm.org/D144199
Computing EH-related information has so far only been relevant for analysis passes. Lifting it to IR will allow the IR Verifier to calculate EH funclet coloring and validate funclet operand bundles in a follow-up step.
Reviewed By: rnk, compnerd
Differential Revision: https://reviews.llvm.org/D138122
Remove the LLVM flag -experimental-assignment-tracking. Assignment tracking is
still enabled from Clang with the command line option -Xclang
-fexperimental-assignment-tracking, which tells Clang to ask LLVM to run the
pass declare-to-assign. That pass converts conventional debug intrinsics to
assignment tracking metadata. With this patch it now also sets a module flag
debug-info-assignment-tracking with the value `i1 true` (using the flag conflict
rule `Max`, since enabling assignment tracking on IR that contains only
conventional debug intrinsics should cause no issues).
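In IR the flag looks roughly like this (the `Max` conflict behavior is encoded as `i32 7`):
```
!llvm.module.flags = !{!0}
!0 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
```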
Update the docs and tests too.
Reviewed By: CarlosAlbertoEnciso
Differential Revision: https://reviews.llvm.org/D142027
In canCreateUndefOrPoison(), take not only poison-generating flags,
but also poison-generating metadata into account. The helpers are
written generically, but I believe the only case that can actually
matter is !range on calls -- !nonnull and !align are only valid on
loads, and those can create undef/poison anyway.
Unfortunately, this negatively impacts logical to bitwise and/or
conversion: For ctpop/ctlz/cttz we always attach !range metadata,
which will now block the transform, because it might introduce
poison. It would be possible to recover this regression by supporting
a ConsiderFlagsAndMetadata=false mode in impliesPoison() and clearing
flags/metadata on visited instructions.
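For illustration, this is the shape of the !range metadata that InstCombine attaches to a ctpop call (the result of `ctpop.i8` lies in [0, 8], hence the half-open range below); such a call is now considered able to create poison:
```
%p = call i8 @llvm.ctpop.i8(i8 %x), !range !0
!0 = !{i8 0, i8 9}
```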
Fixes https://github.com/llvm/llvm-project/issues/59888.
Differential Revision: https://reviews.llvm.org/D142115
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. `ArrayRef(some_uint8_pointer, 0)` needs to be changed into `ArrayRef(some_uint8_pointer, (size_t)0)` to avoid an ambiguous call with `ArrayRef((uint8_t*), (uint8_t*))`.
2. `CVSymbol sym(makeArrayRef(symStorage));` needed to be rewritten as `CVSymbol sym{ArrayRef(symStorage)};`, otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase.
3. ADL doesn't seem to work the same for deduction guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of `makeArrayRef(ArrayRef<T> &)` that acts as a no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
This mirrors a similar shufflevector transformation so the same
effect is obtained for scalable vectors. The transformation is
only performed when it can be proven that the number of resulting
reversals is not increased. By bubbling the reversals from operand
to result this should typically be the case, and ideally it leads to
back-to-back reversals that can be eliminated entirely.
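A sketch of the intended rewrite for a binary op, using the (then experimental) vector.reverse intrinsic; value names are illustrative:
```
%a.rev = call <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
%b.rev = call <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %b)
%add = add <vscale x 4 x i32> %a.rev, %b.rev
==>
%add.unrev = add <vscale x 4 x i32> %a, %b
%add = call <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %add.unrev)
```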
Differential Revision: https://reviews.llvm.org/D139342
The important bit here is that we gracefully handle other uses,
iff they can be adapted to inversion.
I'll note that the previous logic was actively bad: it increased the
instruction count, since it didn't actually ensure that the
inversions happened.
Without this change, this function could return nullptr if no further
transformation took place, making the InstCombine pass incorrectly report
that no IR changes were made and preventing analysis invalidation.
Differential Revision: https://reviews.llvm.org/D140004
This code compares getPointerTypeSizeInBits and getTypeSizeInBits.
getPointerTypeSizeInBits contains a call to getScalarType while
getTypeSizeInBits does not. This makes the code incorrect for vectors.
For scalable vectors this caused a warning about a scalable TypeSize
being converted to unsigned.
Switch to DL.getTypeSizeInBits for the pointers too. This should
work since inttoptr/ptrtoint can't change the number of elements.
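A vector instance of the affected fold (illustrative; with 64-bit pointers the round-trip folds to `%x`, and the size check now compares scalar sizes on both sides):
```
%p = inttoptr <vscale x 2 x i64> %x to <vscale x 2 x ptr>
%r = ptrtoint <vscale x 2 x ptr> %p to <vscale x 2 x i64>
```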
This was suggested by @nikic in D139911.
Fixes PR59480.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D139962
These are used for special arguments to intrinsics, and it makes no
sense to consider them for poison. This fixes a failure to push freeze
through llvm.fptrunc.round.
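A sketch of the now-enabled freeze push (assuming this intrinsic mangling; the rounding-mode string is one of the LangRef-documented values):
```
%r = call half @llvm.fptrunc.round.f16.f32(float %x, metadata !"round.downward")
%f = freeze half %r
==>
%x.fr = freeze float %x
%f = call half @llvm.fptrunc.round.f16.f32(float %x.fr, metadata !"round.downward")
```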
92106641ae297c24877085e0357e8095aa7b43c9 made
isGuaranteedNotToBeUndefOrPoison return false for metadata arguments,
which doesn't entirely make sense. An alternate patch could switch
that to true and try to avoid adding some pointless noundefs on
metadata arguments (I tried that; the attributor breaks one case with a
llvm.dbg.value in it).
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
An extractelt with a constant index that extracts an element from the
two vector operands of a select can be directly folded into a select.
extractelt (select %x, %vec1, %vec2), %const ->
select %x, %vec1[%const], %vec2[%const]
Note: the implementation currently only works for constant vector operands.
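A concrete (illustrative) instance with constant vector operands:
```
%s = select i1 %c, <2 x i8> <i8 1, i8 2>, <2 x i8> <i8 3, i8 4>
%e = extractelement <2 x i8> %s, i64 0
==>
%e = select i1 %c, i8 1, i8 3
```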
Reviewed By: foad, spatel
Differential Revision: https://reviews.llvm.org/D137934
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
This reduces peak memory overhead by 15% when building CTMark's tramp3d-v4 with
-O2 -g with assignment tracking enabled.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133321
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
Most of the updates here are just to ensure DIAssignID attachments are
maintained and propagated correctly.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133307