We currently generate code like this on x86 for a jump table with 5 elements,
assuming the call target is in rbx:
lea global_addr(%rip), %rax # initialize temporary rax with base address
mov %rbx, %rcx # initialize another temporary rcx for index (rbx will be used for the call, so it is still live)
sub %rax, %rcx # compute `address - base`
ror $0x3, %rcx # compute `(address - base) ror 3` i.e. index
cmp $0x4, %rcx # check index <= 4
ja .Ltrap
[...]
.Ltrap:
ud1
A more efficient instruction sequence, that only needs one temporary
register and one fewer instruction, is possible by subtracting the
address we are testing from the fixed address instead of vice versa:
lea (global_addr + 4*8)(%rip), %rax # initialize temporary rax with address of last element
sub %rbx, %rax # compute `last element - address`
ror $0x3, %rax # compute `(last element - address) ror 3` i.e. 4 - index
cmp $0x4, %rax # check 4 - index <= 4 (same as above)
ja .Ltrap
[...]
.Ltrap:
ud1
Change LowerTypeTests to generate that sequence. As a consequence, the
order of bits in the bitsets is reversed. Because it doesn't matter how we
do the subtraction on other architectures (to the best of my knowledge),
do so unconditionally.
Reviewers: fmayer, vitalybuka
Reviewed By: fmayer
Pull Request: https://github.com/llvm/llvm-project/pull/142887
Over in 6a45fce, this flag (experimental-debuginfo-iterators) was
switched to do nothing, to flush out anything that depended on the
debug-intrinsics way of doing things. It's been a month and nothing's
super-broken, so we'll start to rip things out.
This commit deletes MergeFunc's debuginfo-iterators test: in d2942a86d7
it's documented that that test is specifically because of differences
between intrinsic/non-intrinsic data structures, and we're deleting the
possibility of that difference.
With the previous approach of emitting one inline asm call for all jump
table entries we would encounter SelectionDAG's limit on the number
of operands per node (65536) when the number of jump table entries
exceeded that number. Fix the problem by switching to one inline asm
per jump table entry so that each DAG node only needs one operand.
Reviewers: fmayer, vitalybuka
Reviewed By: fmayer
Pull Request: https://github.com/llvm/llvm-project/pull/136265
createCast in MergeFunctions did not consider ArrayTypes, which results
in the creation of a bitcast between ArrayTypes in the thunk function,
leading to an assertion failure in the provided test case.
The version of createCast in GlobalMergeFunctions does handle
ArrayTypes, so this common code has been factored out into the
IRBuilder.
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.
Nowadays, non-intrinsic format is the default and has been on for more
than a year, there's no need for this flag to exist.
(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
MergeFunc uses the original function F as Thunk and creates a new function NewF for the original function F. Preserve F's comdat info on NewF instead of the thunk.
This fixes a crash when trying to lower comdat on the thunk during codegen.
Update writeThunkOrAlias to only create an alias or thunk if it is
actually needed. Drop discardable linkone_odr functions if they are not
used before.
PR: https://github.com/llvm/llvm-project/pull/128865
Avoid creating new calls to linkonce_odr/weak_odr functions when
merging 2 functions, as this may introduce an infinite call
cycle.
Consider 2 functions below, both present in 2 modules.
Module X
--
define linkonce_odr void @"A"() {
call void @"foo"()
}
define linkonce_odr void @"B"() {
call void @"foo"()
}
---
Module Y
---
global @"g" = @"B"
define linkonce_odr void @"A"() {
%l = load @"g"
call void %l()
}
define linkonce_odr void @"B"() {
call void @"foo"()
}
---
@"A" and @"B" in both modules are semantically equivalent
Module X after function merging:
---
define linkonce_odr void @"A"() {
call void @"foo"()
}
define linkonce_odr void @"B"() {
call void @"A"()
}
---
Module Y is unchanged.
Then the linker picks @"A" from module Y and @"B" from module X. Now there's an infinite call cycle
PR: https://github.com/llvm/llvm-project/pull/125050
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
make it easier to use old IR files and somewhat reduce the test churn in
this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
attribute. The representation in the LLVM IR dialect should be updated
separately.
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.
Currently, inrange is specified as follows:
```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```
The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:
```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```
This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:
```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```
The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.
This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
When MergeFunctions creates new thunk functions, it needs to copy over
the debug info format kind from the original function, otherwise we'll
mix debug info formats and run into assertions. This was exposed by a
downstream change that runs MergeFunctions before inlining, which caused
assertions when inlining attempted to inline thunks created by merging,
and the added test covers both scenarios where merging creates thunks.
The MergeFunctions pass has a "preserve some debug-info" mode that tries
to preserve parameter values. This patch generalises its decision-making
so that it applies to both debug-info stored in intrinsics, and
debug-info stored in DPValue objects. For the most part this involves
using a generic lambda and applying it to each type of object.
(Normally we avoid debug-info affecting the code generated, but this is
hidden behind a command line switch, so won't usually be encountered by
users).
Note that this diff is messy, but that's because I'm hoisting some code
into lambdas. The actual decision making processes here are identical.
Functions using different constant expressions were incorrectly
merged, because a lot of state was missing from the comparison,
including the opcode, the comparison predicate, the GEP element
type, as well as the inbounds, inrange and nowrap poison flags.
Prior to this patch, differing metadata operands to two otherwise
identical instructions was not enough to consider the instructions
different in the eyes of the function comparator. This breaks LLVM
virtual function elimination, among other features.
In this patch, we handle the case where two associated operands are
MDStrings of different value. This patch does not differentiate more
complex metadata operands.
---------
Co-authored-by: Nuri Amari <nuriamari@fb.com>
When MergeFuncs creates a thunk, it does not modify the function in
place, but creates a new one altogether. If type metadata is not
properly forwarded to this new function, LowerTypeTests will be unable
to put this thunk into the dispatch table.
The fix here is to just forward the type metadata to the newly created
functions.
I noticed that when we determine the size of the function to figure out
if its profitable, we include debug instructions which can end up making
larger functions than necessary.
This diff fixes an issue in the MergeFunctions pass where two different
Control Flow Integrity (CFI) metadata checks were incorrectly considered
identical. These merges would lead to runtime violations down the line
as two separate objects contained a single destructor which itself
contained checks for only one of the objects.
Here I update the comparison logic to take into account the metadata at
llvm.type.test checks. Now, only truly identical checks will be
considered for merging, thus preserving the integrity of each check.
Previous discussion: https://reviews.llvm.org/D154119
This is the consolidation of D151644 and D151943 moved from
InstCombine to FunctionAttrs. This is based on discussion in the above
patches as well as D152081 (Attributor). This patch was written in a
way so it can have an immediate impact in currently active passes
(FunctionAttrs), but should be easy to port elsewhere (Attributor or
Inliner) if that makes more sense later on.
Some function attributes imply the attribute for all/some instructions
in the function. These attributes can be safely propagated to
callsites within the function that are missing the attribute. This can
be useful when 1) analyzing individual instructions in a function
and 2) if the original caller is later inlined, as if the attributes are
not propagated, they will be lost.
This patch implements propagation in a new class/file
`InferCallsiteAttrs` which can hypothetically be included elsewhere.
At the moment this patch infers the following:
Function Attributes:
- mustprogress
- nofree
- willreturn
- All memory attributes (readnone, readonly, writeonly, argmem,
etc...)
- The memory attributes are only propagated IFF the set of
pointers available to the callsite is the same as the set
available outside the caller (i.e no local memory arguments
from alloca or local malloc like functions).
Argument Attributes:
- noundef
- nonnull
- nofree
- readnone
- readonly
- writeonly
- nocapture
- nocapture is only propagated IFF the set of pointers
available to the callsite is the same as the set available
outside the caller and its guranteed that between the
callsite and function return, the state of any capture
pointers will not change (so the nocaptured gurantee of the
caller has been met by the instruction preceding the
callsite and will not changed).
Argument are only propagated to callsite arguments that are also function
arguments, but not derived values.
Return Attributes:
- noundef
- nonnull
Return attributes are only propagated if the callsite's return value
is used as the caller's return and execution is guranteed to pass from
callsite to return.
The compile time hit of this for -O3 and -O3+thinLTO is ~[.02, .37]%
regression. Proper LTO, however, has more significant regressions (up
to 3.92%):
https://llvm-compile-time-tracker.com/compare.php?from=94407e1bba9807193afde61c56b6125c0fc0b1d1&to=79feb6e78b818e33ec69abdc58c5f713d691554f&stat=instructions:u
Differential Revision: https://reviews.llvm.org/D152226
Many uses of getIntPtrType() were using that type to calculate the
neened type for GEP offset arguments. However, some time ago,
DataLayout was extended to support pointers where the size of the
pointer is not equal to the size of the values used to index it.
Much code was already migrated to, for example, use getIndexSizeInBits
instead of getPtrSizeInBits, but some rewrites still used
getIntPtrType() to get the type for GEP offsets.
This commit changes uses of getIntPtrType() to getIndexType() where
they are involved in a GEP-related calculation.
In at least one case (bounds check insertion) this resolves a compiler
crash that the new test added here would previously trigger.
This commit does not impact
- C library-related rewriting (memcpy()), which are operating under
the assumption that intptr_t == size_t. While all the mechanisms for
breaking this assumption now exist, doing so is outside the scope of
this commit.
- Code generation and below. Note that the use of getIntPtrType() in
CodeGenPrepare will be changed in a future commit.
- Usage of getIntPtrType() in any backend
Depends on D143435
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D143437
We should not call mdconst::extract, unless we know that the
metadata in question is ConstantAsMetadata.
For now we consider all other metadata as equal. The noalias test
shows that this is not correct, but at least it doesn't crash
anymore.
This reverts commit 8aa9ab336889ae2eb8e4188036faeb151379ab7b.
Reverting due to compile-time regressions as pointed out in
https://reviews.llvm.org/D145051#4166656
E.g.
"In particular tramp3d-v4 with debuginfo regressed by 15%."
We now limit ADCE to only remove debug intrinsics if it does something else
that would invalidate cached analyses anyway.
As we've seen in
https://github.com/llvm/llvm-project/issues/58285
throwing away cached analysis info when only debug instructions are removed
can lead to different code when debug info is present or not present.
Differential Revision: https://reviews.llvm.org/D145051
Some calls to GEPOperator::accumulateConstantOffset(APInt) passed the
pointer bitwidth as the width of the APInt, while the function asserts
that the width of its argument is equal to the index width of the GEP
pointer input. These values are almost always the same, so mixing up
which call to use doesn't usually cause issues. However, when dealing
with data layouts where these values are different, the passes tested
here can crash.
This will be fixed in D143437 .
Differential Revision: https://reviews.llvm.org/D144673