This fixes a crash when `splitAndWriteThinLTOBitcode()` hits a
declaration with type metadata. For example, such declarations can be
generated by the `EliminateAvailableExternally` pass.
Here's a high level summary of the changes in this patch. For more
information on rational, see the RFC.
(https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774).
- Add config parameter to LTO backend, specifying which LTO mode is
desired when using unified LTO.
- Add unified LTO flag to the summary index for efficiency. Unified
LTO modules can be detected without parsing the module.
- Make sure that the ModuleID is generated by incorporating more types
of symbols.
Differential Revision: https://reviews.llvm.org/D123803
This adds a type_checked_load_relative intrinsic whose semantics it is to
load a relative function pointer.
A relative function pointer is a pointer to a 32bit value that when
added to its address yields the address of the function.
Differential Revision: https://reviews.llvm.org/D143204
ifuncs can't take part of the whole-program devirtualization so no need them to be copied to the merged module.
The corresponding resolver function also kept out which caused the crash.
Fixes#60962#57870
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D144982
Currently, FunctionModRefBehavior tracks whether the function reads
or writes memory (ModRefInfo) and which locations it can access
(argmem, inaccessiblemem and other). This patch changes it to track
ModRef information per-location instead.
To give two examples of why this is useful:
* D117095 highlights a weakness of ModRef modelling in the presence
of operand bundles. For a memcpy call with deopt operand bundle,
we want to say that it can read any memory, but only write argument
memory. This would allow them to be treated like any other calls.
However, we currently can't express this and have to say that it
can read or write any memory.
* D127383 would ideally be modelled as a separate threadid location,
where threadid Refs outside pre-split coroutines can be ignored
(like other accesses to constant memory). The current representation
does not allow modelling this precisely.
The patch as implemented is intended to be NFC, but there are some
obvious opportunities for improvements and simplification. To fully
capitalize on this we would also want to change the way we represent
memory attributes on functions, but that's a larger change, and I
think it makes sense to separate out the FunctionModRefBehavior
refactoring.
Differential Revision: https://reviews.llvm.org/D130896
Using the legacy PM for the optimization pipeline is deprecated and in
the process of being removed. This is a small step in that direction.
For an example of migrating to the new PM:
853b57fe80
Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`.
Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`.
To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D128955
Update FunctionAttrs to use FunctionModRefBehavior instead
MemoryAccessKind.
This allows for adding support for inferring argmemonly and others,
see D121415.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D121460
Inline assembly refererences to static functions with ThinLTO+CFI were
fixed in D104058 by creating aliases for promoted functions. Creating
the aliases unconditionally resulted in an unexpected size increase in
a Chrome helper binary:
https://bugs.chromium.org/p/chromium/issues/detail?id=1261715
This is caused by the compiler being unable to drop unused code now
referenced by the alias in module-level inline assembly. This change
adds a .set_conditional assembly extension, which emits an assignment
only if the target symbol is also emitted, avoiding phantom references
to functions that could have otherwise been dropped.
This is an alternative to the solution proposed in D112761.
Reviewed By: pcc, nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D113613
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them.
Relands 700d07f8ce6f2879610fd6b6968b05c6f17bb915 with -msvc targets
fixed.
Link: https://github.com/ClangBuiltLinux/linux/issues/1354
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D104058
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them. This version uses module inline assembly
to avoid issues with LowerTypeTestsModule.
Relands commmit 8e3b5cb39eef462943ed7556469604ce25c07a1d with arch
specific tests fixed.
Link: https://github.com/ClangBuiltLinux/linux/issues/1354
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D104058
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them. This version uses module inline assembly
to avoid issues with LowerTypeTestsModule.
Link: https://github.com/ClangBuiltLinux/linux/issues/1354
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D104058
This casues compiler crash: Assertion `materialized_use_empty() && "Uses remain when a value is destroyed!"'
This reverts commit e3d24b45b8f808ec66213e134c4ceda5202fbe31.
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them.
This relands commit 4474958d3a97dede2caa0920f7c4a4dc7aac57d3
with a fix to a use-of-uninitialized-value error that tripped
MemorySanitizer.
Link: https://github.com/ClangBuiltLinux/linux/issues/1354
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D104058
The unnamedaddr property of a function is lost when using
`-fwhole-program-vtables` and thinlto which causes size increase under linker's
safe icf mode.
The size increase of chrome on Linux when switching from all icf to safe icf
drops from 5 MB to 3 MB after this change, and from 6 MB to 4 MB on Windows.
There is a repro:
```
# a.h
struct A {
virtual int f();
virtual int g();
};
# a.cpp
#include "a.h"
int A::f() { return 10; }
int A::g() { return 10; }
# main.cpp
#include "a.h"
int g(A* a) {
return a->f();
}
int main(int argv, char** args) {
A a;
return g(&a);
}
$ clang++ -O2 -ffunction-sections -flto=thin -fwhole-program-vtables -fsplit-lto-unit -c main.cpp -o main.o && clang++ -Wl,--icf=safe -fuse-ld=lld -flto=thin main.o -o a.out && llvm-readobj -t a.out | grep -A 1 -e _ZN1A1fEv -e _ZN1A1gEv
Name: _ZN1A1fEv (480)
Value: 0x201830
--
Name: _ZN1A1gEv (490)
Value: 0x201840
```
Differential Revision: https://reviews.llvm.org/D100498
Iterating on `SmallPtrSet<GlobalValue *, 8>` with more than 8 elements
is not deterministic. Use a SmallVector instead because `Used` is guaranteed to contain unique elements.
While here, decrease inline element counts from 8 to 4. The number of
`llvm.used`/`llvm.compiler.used` elements is usually 0 or 1. For full
LTO/hybrid LTO, the number may be large, so we need to be careful.
According to tejohnson's analysis https://reviews.llvm.org/D97128#2582399 , 4 is
good for a large project with WholeProgramDevirt, when available_externally
vtables are placed in the llvm.compiler.used set.
Differential Revision: https://reviews.llvm.org/D97128
Refines the fix in 3c4c205060c9398da705eb71b63ddd8a04999de9 to only
put globals whose defs were cloned into the split regular LTO module
on the cloned llvm*.used globals. This avoids an issue where one of the
attached values was a local that was promoted in the original module
after the module was cloned. We only need to have the values defined in
the new module on those globals.
Fixes PR49251.
Differential Revision: https://reviews.llvm.org/D97013
Adds a lld test for a case that the handling added for dynamically
exported symbols in 1487747e990ce9f8851f3d92c3006a74134d7518 already
fixes. Because isExportDynamic returns true when the symbol is
SharedKind with default visibility, it will treat as dynamically
exported and block devirtualization when the definition of a vtable
comes from a shared library. This is desireable as it is dangerous to
devirtualize in that case, since there could be hidden overrides in the
shared library. Typically that happens when the shared library header
contains available externally definitions, which applications can
override. An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.
The regular LTO case in the new test already worked, but there are
2 fixes in this patch needed for the index-only case and the hybrid
LTO case. For the index-only case, WPD should not simply ignore
available externally vtables. A follow on fix will be made to clang to
emit type metadata for those vtables, which the new test is modeling.
For the hybrid case, we need to ensure when the module is split that any
llvm.*used globals are cloned to the regular LTO split module so
available externally vtable definitions are not prematurely deleted.
Another follow on fix will add the equivalent gold test, which requires
a small fix to the plugin to treat symbols in dynamic libraries the same
way lld already is.
Differential Revision: https://reviews.llvm.org/D96721
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.
Differential Revision: https://reviews.llvm.org/D77994
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
The default behavior of Clang's indirect function call checker will replace
the address of each CFI-checked function in the output file's symbol table
with the address of a jump table entry which will pass CFI checks. We refer
to this as making the jump table `canonical`. This property allows code that
was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address
of a function, but it comes with a couple of caveats that are especially
relevant for users of cross-DSO CFI:
- There is a performance and code size overhead associated with each
exported function, because each such function must have an associated
jump table entry, which must be emitted even in the common case where the
function is never address-taken anywhere in the program, and must be used
even for direct calls between DSOs, in addition to the PLT overhead.
- There is no good way to take a CFI-valid address of a function written in
assembly or a language not supported by Clang. The reason is that the code
generator would need to insert a jump table in order to form a CFI-valid
address for assembly functions, but there is no way in general for the
code generator to determine the language of the function. This may be
possible with LTO in the intra-DSO case, but in the cross-DSO case the only
information available is the function declaration. One possible solution
is to add a C wrapper for each assembly function, but these wrappers can
present a significant maintenance burden for heavy users of assembly in
addition to adding runtime overhead.
For these reasons, we provide the option of making the jump table non-canonical
with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump
table is made non-canonical, symbol table entries point directly to the
function body. Any instances of a function's address being taken in C will
be replaced with a jump table address.
This scheme does have its own caveats, however. It does end up breaking
function address equality more aggressively than the default behavior,
especially in cross-DSO mode which normally preserves function address
equality entirely.
Furthermore, it is occasionally necessary for code not compiled with
``-fsanitize=cfi-icall`` to take a function address that is valid
for CFI. For example, this is necessary when a function's address
is taken by assembly code and then called by CFI-checking C code. The
``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make
the jump table entry of a specific function canonical so that the external
code will end up taking a address for the function that will pass CFI checks.
Fixes PR41972.
Differential Revision: https://reviews.llvm.org/D65629
llvm-svn: 368495
Globals that are associated with globals with type metadata need to appear
in the merged module because they will reference the global's section directly.
Differential Revision: https://reviews.llvm.org/D65312
llvm-svn: 367242
Summary:
If LTOUnit splitting is disabled, the module summary analysis computes
the summary information necessary to perform single implementation
devirtualization during the thin link with the index and no IR. The
information collected from the regular LTO IR in the current hybrid WPD
algorithm is summarized, including:
1) For vtable definitions, record the function pointers and their offset
within the vtable initializer (subsumes the information collected from
IR by tryFindVirtualCallTargets).
2) A record for each type metadata summarizing the vtable definitions
decorated with that metadata (subsumes the TypeIdentiferMap collected
from IR).
Also added are the necessary bitcode records, and the corresponding
assembly support.
The follow-on index-based WPD patch is D55153.
Depends on D53890.
Reviewers: pcc
Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
Differential Revision: https://reviews.llvm.org/D54815
llvm-svn: 364960
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
If LTOUnit splitting is disabled, the module summary analysis computes
the summary information necessary to perform single implementation
devirtualization during the thin link with the index and no IR. The
information collected from the regular LTO IR in the current hybrid WPD
algorithm is summarized, including:
1) For vtable definitions, record the function pointers and their offset
within the vtable initializer (subsumes the information collected from
IR by tryFindVirtualCallTargets).
2) A record for each type metadata summarizing the vtable definitions
decorated with that metadata (subsumes the TypeIdentiferMap collected
from IR).
Also added are the necessary bitcode records, and the corresponding
assembly support.
The index-based WPD will be sent as a follow-on.
Depends on D53890.
Reviewers: pcc
Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
Differential Revision: https://reviews.llvm.org/D54815
llvm-svn: 351453