This allows developing and distributing inlining heuristics
out of tree. Together with the inline advisor plugins, it
allows for fine-grained control of the inliner.
The PluginInlineOrderAnalysis class serves as the entry point
for dynamic advisors. Plugins must register instances of this
class to provide their own InlineOrder.
I'm checking in this patch on behalf of ibricchi
<ibricchi@student.ethz.ch>.
Differential Revision: https://reviews.llvm.org/D140637
Reapply with a fix for phi handling: For phis, we need to insert
into the incoming block, not above the phi. This is especially
tricky if there are multiple incoming values from the same
predecessor, because these must all use the same value.
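
The phi rule above can be illustrated with a minimal standalone sketch. It
does not use the LLVM API: `PhiModel` and `expandConstantInPhi` are
hypothetical names, and blocks and values are modelled as plain strings. The
only point shown is that one expansion is created per predecessor and reused
for every edge coming from that predecessor.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a phi node: a list of (predecessor, value) pairs.
struct PhiModel {
  std::vector<std::pair<std::string, std::string>> Incoming;
};

// Replace each incoming use of Const with a value "materialized" in the
// corresponding predecessor block. Exactly one expansion is created per
// predecessor, so duplicate edges from the same block share the same value.
// Returns the number of expansions created.
inline int expandConstantInPhi(PhiModel &Phi, const std::string &Const) {
  std::map<std::string, std::string> PerPred; // predecessor -> expanded value
  for (auto &In : Phi.Incoming) {
    if (In.second != Const)
      continue; // unrelated incoming value, leave untouched
    auto It = PerPred.find(In.first);
    if (It == PerPred.end())
      It = PerPred.emplace(In.first, Const + ".expanded." + In.first).first;
    In.second = It->second;
  }
  return static_cast<int>(PerPred.size());
}
```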
-----
LowerTypeTests replaces weak declarations with icmp+select
constant expressions. As this is not a relocatable expression,
it additionally promotes initializers using it to global ctors.
As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
I would like to remove the select constant expression, of which LTT
is now the last user. This is a bit tricky, because we now need to
replace a constant with an instruction, which might require
converting intermediate constant expression users to instructions as
well.
We do this using the convertUsersOfConstantsToInstructions() helper.
However, it needs to be slightly extended to also support expansion
of ConstantAggregates. These are important in this context, because
the promotion of initializers to global ctors will produce stores
of such aggregates.
Differential Revision: https://reviews.llvm.org/D145247
PassManagerBuilder is dead, long live PassBuilder!
bugpoint's -O# are now useless (and probably have been for a while given the number of passes we've removed from PassManagerBuilder). Perhaps they'll be revived if bugpoint ever works with the new PM.
Reviewed By: nikic, MaskRay
Differential Revision: https://reviews.llvm.org/D145835
DFAJumpThreading
JumpThreading
LibCallsShrink
LoopVectorize
SLPVectorizer
DeadStoreElimination
AggressiveDCE
CorrelatedValuePropagation
IndVarSimplify
These are part of the optimization pipeline, of which the legacy version is deprecated and being removed.
This change improves FS discriminators in the following ways:
(1) use call-stack debug information to generate
discriminators: the same (src/line) DILs can now have the same
discriminator value if they come from different call-stacks.
This effectively increases the usable discriminator values
for each round of FS discriminator pass.
(2) don't generate the FS discriminator for meta instructions
(i.e. instructions not emitted). This reduces the number of
discriminator conflicts (for the case where we run out of
discriminator bits for that pass).
(3) use the less expensive xxHash64 hashing.
These improvements should bring better performance for FSAFDO
and they should be used by default. But this change creates
incompatible FS discriminators. For the iterative profile users,
they might see a performance drop in the first release with
this change (due to the fact that the profiles have the old
discriminators and the compiler uses the new discriminator).
We have measured that this is not more than 1.5% on several
benchmarks. Note the degradation should be gone in the second
release and one should expect a performance gain over the binary
without this change.
One possible solution to the iterative profile issue would be
separating discriminators for profile-use and the ones emitted to
the binary. This would require a mechanism to allow two sets of
discriminators to be maintained and then phasing out the first
approach. This is too much churn in the compiler and the
performance implications do not seem to be worth the effort.
Instead, we put the changes under an option so iterative profile
users can do a gradual rollout of this change. We will change the
option's default value to true in a later patch and eventually
purge this option from the code base.
Differential Revision: https://reviews.llvm.org/D145171
Since https://reviews.llvm.org/D141386, !range violations return
poison instead of causing immediate undefined behavior. As such,
it is fine for IPSCCP to infer !range even if the value might be
poison. (The value cannot be undef as this would promote undef to
poison, but this is already checked separately.)
This basically undoes the late change done to D83952, restoring
it to its original version (which is now valid).
Differential Revision: https://reviews.llvm.org/D144467
The legacy PM is only supported for codegen, and PassManagerBuilder
is exclusively about the middle-end optimization pipeline. Drop it.
Differential Revision: https://reviews.llvm.org/D145387
Remove the null pointer check on Callee since it is guaranteed to pass by the check
at the top of the loop which continues if Callee is null. While this change is somewhat
trivial, for what it's worth this check triggers Coverity warnings because it implies that
Callee might be null at this point even though it is dereferenced in the preceding code.
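
As a rough illustration of why the second check is redundant, here is a
minimal standalone sketch. `CalleeModel` and `sumCalleeCosts` are
hypothetical names and do not correspond to the code touched by this patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a callee; not an LLVM type.
struct CalleeModel {
  int InlineCost;
};

// Sums the costs of all non-null callees.
inline int sumCalleeCosts(const std::vector<const CalleeModel *> &Callees) {
  int Total = 0;
  for (const CalleeModel *Callee : Callees) {
    if (!Callee)
      continue; // the only null check needed: null entries are skipped here
    // Callee is dereferenced below, so it is known to be non-null from this
    // point on. A later `if (Callee && ...)` would always take the non-null
    // path, and only suggests to static analyzers that Callee may be null.
    Total += Callee->InlineCost;
  }
  return Total;
}
```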
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D145463
If all stores only store the initializer value of a global, consider it
as not stored in the heuristic. GlobalOpt will remove such stores later
on.
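
A minimal sketch of the check described above, under simplifying
assumptions: stored values and the initializer are modelled as plain
integers, and `GlobalModel` and `isEffectivelyStored` are hypothetical names
rather than GlobalOpt code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a global: its initializer and every value that
// is stored to it anywhere in the module.
struct GlobalModel {
  int Initializer;
  std::vector<int> StoredValues;
};

// A global counts as "stored" for the heuristic only if at least one store
// writes something other than the initializer value.
inline bool isEffectivelyStored(const GlobalModel &G) {
  return std::any_of(G.StoredValues.begin(), G.StoredValues.end(),
                     [&G](int V) { return V != G.Initializer; });
}
```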
Depends on D129857.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144476
LowerTypeTests replaces weak declarations with icmp+select
constant expressions. As this is not a relocatable expression,
it additionally promotes initializers using it to global ctors.
As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
I would like to remove the select constant expression, of which LTT
is now the last user. This is a bit tricky, because we now need to
replace a constant with an instruction, which might require
converting intermediate constant expression users to instructions as
well.
We do this using the convertUsersOfConstantsToInstructions() helper.
However, it needs to be slightly extended to also support expansion
of ConstantAggregates. These are important in this context, because
the promotion of initializers to global ctors will produce stores
of such aggregates.
Differential Revision: https://reviews.llvm.org/D145247
The previous search does not take @llvm.dbg.* intrinsics and
debug type information into account, whereas DebugInfoFinder
does.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D145239
Pointer bitcasts no longer occur with opaque pointers -- and in
this case not handling them allows us to drop the code for
promoting constant expressions to instructions as well.
Extend CleanupPointerRootUsers to iterate over a worklist, add users of
constant expressions to the worklist to enable additional cleanups.
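
The worklist traversal can be sketched in isolation as follows. This is a
simplified model rather than the actual CleanupPointerRootUsers code:
`UserModel` and `collectInstructionUsers` are hypothetical names, and users
are referred to by index into a vector.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a value in a use graph.
struct UserModel {
  bool IsConstantExpr;    // true if this user is a constant expression
  std::vector<int> Users; // indices of the values that use this one
};

// Collects every non-constant-expression user reachable from Root, looking
// through constant expressions by pushing their users onto the worklist.
inline std::set<int> collectInstructionUsers(const std::vector<UserModel> &Nodes,
                                             int Root) {
  std::set<int> Result;
  std::set<int> Visited;
  std::vector<int> Worklist(Nodes[Root].Users.begin(), Nodes[Root].Users.end());
  while (!Worklist.empty()) {
    int N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // already processed
    if (Nodes[N].IsConstantExpr) {
      // Do not handle the constant expression itself; visit its users instead.
      Worklist.insert(Worklist.end(), Nodes[N].Users.begin(),
                      Nodes[N].Users.end());
      continue;
    }
    Result.insert(N);
  }
  return Result;
}
```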
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144468
Legacy passes are only supported for codegen, and I don't believe
it's possible to write backends using the C API, so we should drop
all of those. Reduces the number of places that need to be modified
when removing legacy passes.
Differential Revision: https://reviews.llvm.org/D144970
When limiting the number of parts we split a global into, ignore
any parts that are either only loaded or only stored, because we
expect these to be optimized away after SRA.
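
A minimal sketch of the counting rule, assuming each part simply records
whether it is loaded and whether it is stored. `PartModel`,
`withinSplitLimit` and the `MaxParts` parameter are hypothetical and are not
taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one part of a global that SRA would split off.
struct PartModel {
  bool IsLoaded;
  bool IsStored;
};

// Only parts that are both loaded and stored count towards the limit. Parts
// that are only loaded or only stored are expected to be optimized away
// after SRA, so they are ignored here.
inline bool withinSplitLimit(const std::vector<PartModel> &Parts,
                             unsigned MaxParts) {
  auto Counted = std::count_if(
      Parts.begin(), Parts.end(),
      [](const PartModel &P) { return P.IsLoaded && P.IsStored; });
  return static_cast<unsigned>(Counted) <= MaxParts;
}
```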
Differential Revision: https://reviews.llvm.org/D129857
This reverts commit a9a1950115d7db95c7439128b14af2cefe8f796d.
The legacy PM uses in Polly have been removed, so recommit the patch.
Original message:
This is part of the optimization pipeline, of which the legacy pass manager version is deprecated.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D144201
Conflicting module flags lead to a proper error for regular LTO but a crash
(report_fatal_error) for ThinLTO. Switch to createStringError to fix the crash
and match regular LTO.
Prior to this patch, WPD was not acting on relative-vtables in C++. This
involves teaching WPD about these things:
- llvm.load.relative which is how relative-vtables are indexed (instead of GEP)
- dso_local_equivalent which is used in the vtable itself when taking the
offset between a virtual function and vtable
- Update llvm/test/ThinLTO/X86/devirt.ll to use opaque pointers and add
equivalent tests for RV
Differential Revision: https://reviews.llvm.org/D134320
Follow-on to D144209 to support single implementation devirtualization
for Regular LTO when the vtable holds a function alias.
For now I have prevented other optimizations performed in regular LTO
that need to analyze the contents of the function target when the vtable
holds an alias, as I'm not sure they are always correct to perform in
that case.
Differential Revision: https://reviews.llvm.org/D144270
When replacing return values with undef, we should also drop the
noundef attribute (and other UB implying attributes).
Differential Revision: https://reviews.llvm.org/D144461
[Originally committed as f6ddf7781471b71243fa3c3ae7c93073f95c7dff;
reverted in bbef38352fbade9e014ec97d5991da5dee306da7 due to test
breakage; now relanded with the Arm tests conditioned on
`arm-registered-target`]
The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).
Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.
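
The "any function" rule can be sketched as below. This is a standalone
model, not the pass itself: `FunctionModel`, its `HasWideBranch` flag
(standing in for the per-function TTI query), `JumpTableEncoding` and
`selectThumbJumpTableEncoding` are all hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a function and the answer its TTI would give.
struct FunctionModel {
  bool HasWideBranch; // true if the function's target supports B.W
};

enum class JumpTableEncoding { WideBranch, Armv6MCompatible };

// If any function in the module targets an architecture with B.W, assume the
// jump table may use B.W as well; otherwise fall back to the Armv6-M
// compatible format.
inline JumpTableEncoding
selectThumbJumpTableEncoding(const std::vector<FunctionModel> &Functions) {
  bool AnyWide =
      std::any_of(Functions.begin(), Functions.end(),
                  [](const FunctionModel &F) { return F.HasWideBranch; });
  return AnyWide ? JumpTableEncoding::WideBranch
                 : JumpTableEncoding::Armv6MCompatible;
}
```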
The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.
Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D143576
This reverts commit 5356fefc19df3fbf32d180b1b10e6226e8743541.
It looks like Polly still relies on the legacy SCCP pass. Bring it back
until the best way forward is determined.
This is part of the optimization pipeline, of which the legacy pass manager version is deprecated.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D144201
We were not summarizing a function alias in the vtable, leading to
incorrect WPD in some cases, and missing WPD in others.
Specifically, we would end up ignoring function aliases as they aren't
summarized, so we could incorrectly devirtualize if there was a single
other non-alias function in a compatible vtable. And if there was only
one implementation, but it was an alias, we would not be able to
identify and perform the single implementation devirtualization.
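
Both failure modes can be seen in a small standalone model. It does not use
the summary data structures: `VTableSlotModel`, `resolveTarget` and
`findSingleImplementation` are hypothetical names, and functions are
identified by name only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the entry a compatible vtable holds at one slot.
struct VTableSlotModel {
  std::string Name;
  std::string Aliasee; // empty when the entry is a plain function
};

// Looks through an alias to the function it refers to.
inline std::string resolveTarget(const VTableSlotModel &S) {
  return S.Aliasee.empty() ? S.Name : S.Aliasee;
}

// Returns the unique implementation across all compatible vtables, or an
// empty string if there are none or more than one. Aliases are resolved
// rather than ignored, so they take part in the uniqueness check.
inline std::string
findSingleImplementation(const std::vector<VTableSlotModel> &Slots) {
  std::string Impl;
  for (const VTableSlotModel &S : Slots) {
    std::string Target = resolveTarget(S);
    if (Impl.empty())
      Impl = Target;
    else if (Impl != Target)
      return "";
  }
  return Impl;
}
```

With aliases ignored, the first case in the message would wrongly report the
lone non-alias function as the single implementation, and the second case
would report nothing at all.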
Handling the alias summary correctly also required fixing the handling
in mustBeUnreachableFunction, so that it is not incorrectly ignored.
Regular LTO is conservatively correct because it will skip
devirtualizing when any pointer within a vtable is not a function.
However, it needs additional work to be able to take advantage of
function alias within the vtable that is in fact the only
implementation. For that reason, the Regular LTO testing in the second
test case is currently disabled, and will be enabled along with a follow
on enhancement fix for Regular LTO WPD.
Differential Revision: https://reviews.llvm.org/D144209
When we simplify loads we need to adjust types (esp. null-values)
properly to avoid inconsistencies down the line. Add a cast and an
error message.
Fixes: https://github.com/llvm/llvm-project/issues/60788
This reverts commit f6ddf7781471b71243fa3c3ae7c93073f95c7dff.
Eight buildbots reported that the two test files changed by that
commit had started failing. The buildbots in question all had in
common that they build with a very restricted `LLVM_TARGETS_TO_BUILD`,
such as only X86 or AArch64 or Hexagon. I didn't notice this before
commit because my own build has the full default set of targets, and
in that circumstance, the tests pass.
I assume the problem has something to do with the attempt to query
TargetTransformInfo: if you can't make a valid TTI for the target
triple then you can't ask it what kind of inline assembler you should
be emitting, and so `opt` without the Arm backend can't get the Arm
cases of these tests right.
I don't have time to fix this until next week, so I'll revert the
change for now to keep the buildbots happy.
The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).
Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.
The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.
Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D143576
This patch adds several missing GlobalList modifier functions, like
removeGlobalVariable(), eraseGlobalVariable() and insertGlobalVariable().
There is no longer a need to access the list directly, so it also makes
getGlobalList() private.
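
The encapsulation idea can be sketched with a standalone model. The types
and signatures below are hypothetical stand-ins (`ModuleModel`,
`GlobalVarModel`, ownership expressed through `std::unique_ptr`) and are not
the actual `llvm::Module` interface; only the method names are borrowed from
this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a global variable.
struct GlobalVarModel {
  std::string Name;
};

class ModuleModel {
  // The list itself is private; callers go through the modifiers below.
  std::list<std::unique_ptr<GlobalVarModel>> Globals;

public:
  // Appends GV to the module and returns a non-owning pointer to it.
  GlobalVarModel *insertGlobalVariable(std::unique_ptr<GlobalVarModel> GV) {
    Globals.push_back(std::move(GV));
    return Globals.back().get();
  }

  // Unlinks GV without destroying it; ownership passes to the caller.
  std::unique_ptr<GlobalVarModel> removeGlobalVariable(GlobalVarModel *GV) {
    for (auto It = Globals.begin(); It != Globals.end(); ++It) {
      if (It->get() == GV) {
        std::unique_ptr<GlobalVarModel> Owned = std::move(*It);
        Globals.erase(It);
        return Owned;
      }
    }
    return nullptr; // GV was not in this module
  }

  // Unlinks GV and destroys it.
  void eraseGlobalVariable(GlobalVarModel *GV) { removeGlobalVariable(GV); }

  std::size_t globalCount() const { return Globals.size(); }
};
```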
Differential Revision: https://reviews.llvm.org/D144027