80 Commits

Author SHA1 Message Date
Mingming Liu
47cc9db797
[WPD]Regard unreachable function as a possible devirtualizable target (#115668)
https://reviews.llvm.org/D115492 skips unreachable functions and
potentially allows more static de-virtualizations. The motivation is to
ignore virtual deleting destructor of abstract class (e.g.,
`Base::~Base()` in https://gcc.godbolt.org/z/dWMsdT9Kz).
* Note WPD already handles most pure virtual functions (like `Base::x()`
in the godbolt example above), which becomes a `__cxa_pure_virtual` in
the vtable slot.

This PR proposes to undo the change, because it turns out there are
other unreachable functions that a general program wants to run and fail
intentionally, with `LOG(FATAL)` or `CHECK` [1] for example. While many
real-world applications are encouraged to check-fail sparingly, they are
allowed to do so on critical errors (e.g., misconfiguration or bug is
detected during server startup).
* Implementation-wise, this PR keeps the one-bit 'unreachable' state in
bitcode and updates WPD analysis.
 
https://gcc.godbolt.org/z/T1aMhczYr is a minimum reproducible example
extracted from unit test. `Base::func` is a one-liner of `LOG(FATAL) <<
"message"`, and lowered to one basic block ending with `unreachable`. A
real-world program is _allowed_ to invoke Base::func to terminate the
program as a way to report errors (in server initialization stage for
example), even if errors on the serving path should be handled more
gracefully.

[1] https://abseil.io/docs/cpp/guides/logging#CHECK and
https://abseil.io/docs/cpp/guides/logging#configuration-and-flags
2024-11-13 11:28:36 -08:00
Mingming Liu
60944177b8
[WPD][ThinLTO]Add cutoff option for WPD (#113383)
This option applies for _import_ WPD (i.e., when `DevirtModule` pass
de-virtualizes according to an imported summary, in ThinLTO backend
pipeline). It's meant for debugging (e.g., bisection).
2024-10-23 23:47:27 -07:00
Leonard Chan
58c5f50f4c Reapply "[llvm] Teach GlobalDCE about dso_local_equivalent"
Also reapply "[llvm] Teach whole program devirtualization about
relative vtables"

This reverts commit 1c604a9780fcfe92a99d539913553f0835b81de3 and
474f5efebed24547e76d022f0c5ffcc9db97ce6f.
2024-04-15 11:40:23 -07:00
Leonard Chan
c0b77e0a4a Revert "Reapply "[llvm] Teach whole program devirtualization about relative vtables""
This reverts commit 09c3bfe9b3eb47a2af0c10531b25f90cfb5fa9f4.
2024-04-15 11:14:50 -07:00
Leonard Chan
09c3bfe9b3 Reapply "[llvm] Teach whole program devirtualization about relative vtables"
This reverts commit 474f5efebed24547e76d022f0c5ffcc9db97ce6f.
2024-04-15 11:01:04 -07:00
Mingming Liu
dda73336ad
[ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597)
The motivating use case is to support import the function declaration
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2]
2. In order for FunctionImporter to import a declaration of a function,
both function summary and alias summary need to carry the def / decl
state. Specifically, all existing summary fields doesn't differ across
import modules, but the def / decl state of is decided by
`<ImportModule, Function>`.

This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.

In the subsequent changes
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and passing on the information to bitcode writer.
2. Bitcode writer will look up the def/decl state and sets the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.

- The next change is https://github.com/llvm/llvm-project/pull/87600

[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
2024-04-10 19:46:01 -07:00
Arnold Schwaighofer
98eb8abff6 Add a type_checked_load_relative to support relative function pointer tables
This adds a type_checked_load_relative intrinsic whose semantics it is to
load a relative function pointer.

A relative function pointer is a pointer to a 32bit value that when
added to its address yields the address of the function.

Differential Revision: https://reviews.llvm.org/D143204
2023-06-29 08:33:45 -07:00
Arnold Schwaighofer
200a1ccea0 WholeProgramDevirt: Fix call target propagation for ptrauth architectures
We can't have a call with a constant target with a ptrauth bundle. Remove the
ptrauth bundle operand in such a case

rdar://105696396

Differential Revision: https://reviews.llvm.org/D144581
2023-06-29 08:02:58 -07:00
Teresa Johnson
200cc952a2 [LTO][GlobalDCE] Use pass parameter instead of module flag for LTO phase
D63932 added a module flag to indicate that we are executing the regular
LTO post merge pipeline, so that GlobalDCE could perform more aggressive
optimization for Dead Virtual Function Elimination. This caused issues
trying to reuse bitcode that had already been through the LTO pipeline
(see context in D139816).

Instead support this by passing down a parameter flag to the
GlobalDCEPass constructor, which is the more usual way for indicating
this information.

Most test changes are to remove incidental uses of this flag. Of the 2
real uses, llvm/test/LTO/ARM/lto-linking-metadata.ll is now obsolete and
removed in this patch, and the virtual-functions-visibility-post-lto.ll
test is updated to use the regular LTO default pipeline where this
parameter is set to true.

Differential Revision: https://reviews.llvm.org/D153655
2023-06-23 17:05:07 -07:00
Leonard Chan
474f5efebe Revert "[llvm] Teach whole program devirtualization about relative vtables"
This reverts commit db288184765c0b4010060ebea1f6de3ac1f66445.

Reverting since it broke our lto builders reported by fxbug.dev/123807.
2023-03-26 01:53:13 +00:00
Leonard Chan
53a9175951 [llvm] Handle duplicate call bases when applying branch funneling
It's possible to segfault in `DevirtModule::applyICallBranchFunnel` when
attempting to call `getCaller` on a call base that was erased in a prior
iteration. This can occur when attempting to find devirtualizable calls
via `findDevirtualizableCallsForTypeTest` if the vtable passed to
llvm.type.test is a global and not a local. The function works by taking
the first argument of the llvm.type.test call (which is a vtable),
iterating through all uses of it, and adding any relevant all uses that
are calls associated with that intrinsic call to a vector. For most
cases where the vtable is actually a *local*, this wouldn't be an issue.
Take for example:

```
define i32 @fn(ptr %obj) #0 {
  %vtable = load ptr, ptr %obj
  %p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid2")
  call void @llvm.assume(i1 %p)
  %fptr = load ptr, ptr %vtable
  %result = call i32 %fptr(ptr %obj, i32 1)
  ret i32 %result
}
```

`findDevirtualizableCallsForTypeTest` will check the call base ` %result
= call i32 %fptr(ptr %obj, i32 1)`, find that it is associated with a
virtualizable call from `%vtable`, find all loads for `%vtable`, and add
any instances those load results are called into a vector. Now consider
the case where instead `%vtable` was the global itself rather than a
local:

```
define i32 @fn(ptr %obj) #0 {
  %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2")
  call void @llvm.assume(i1 %p)
  %fptr = load ptr, ptr @vtable
  %result = call i32 %fptr(ptr %obj, i32 1)
  ret i32 %result
}
```

`findDevirtualizableCallsForTypeTest` should work normally and add one
unique call instance to a vector. However, if there are multiple
instances where this same global is used for llvm.type.test, like with:

```
define i32 @fn(ptr %obj) #0 {
  %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2")
  call void @llvm.assume(i1 %p)
  %fptr = load ptr, ptr @vtable
  %result = call i32 %fptr(ptr %obj, i32 1)
  ret i32 %result
}

define i32 @fn2(ptr %obj) #0 {
  %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2")
  call void @llvm.assume(i1 %p)
  %fptr = load ptr, ptr @vtable
  %result = call i32 %fptr(ptr %obj, i32 1)
  ret i32 %result
}
```

Then each call base `%result = call i32 %fptr(ptr %obj, i32 1)` will be
added to the vector twice. This is because for either call base `%result
= call i32 %fptr(ptr %obj, i32 1) `, we determine it is associated with
a virtualizable call from `@vtable`, and then we iterate through all the
uses of `@vtable`, which is used across multiple functions. So when
scanning the first `%result = call i32 %fptr(ptr %obj, i32 1)`, then
both call bases will be added to the vector, but when scanning the
second one, both call bases are added again, resulting in duplicate call
bases in the CSInfo.CallSites vector.

Note this is actually accounted for in every other instance WPD iterates
over CallSites. What everything else does is actually add the call base
to the `OptimizedCalls` set and just check if it's already in the set.
We can't reuse that particular set since it serves a different purpose
marking which calls where devirtualized which `applyICallBranchFunnel`
explicitly says it doesn't. For this fix, we can just account for
duplicates with a map and do the actual replacements afterwards by
iterating over the map.

Differential Revision: https://reviews.llvm.org/D146267
2023-03-23 21:44:59 +00:00
Leonard Chan
db28818476 [llvm] Teach whole program devirtualization about relative vtables
Prior to this patch, WPD was not acting on relative-vtables in C++. This
involves teaching WPD about these things:

- llvm.load.relative which is how relative-vtables are indexed (instead of GEP)
- dso_local_equivalent which is used in the vtable itself when taking the
  offset between a virtual function and vtable
- Update llvm/test/ThinLTO/X86/devirt.ll to use opaque pointers and add
  equivalent tests for RV

Differential Revision: https://reviews.llvm.org/D134320
2023-02-23 22:18:43 +00:00
Leonard Chan
0ecf796592 [llvm] Update WPD tests to use opaque pointers
Differential Revision: https://reviews.llvm.org/D135611
2022-10-11 22:30:24 +00:00
Nikita Popov
6e504d637d [ValueTracking] Handle constant exprs in isKnownNonZero()
Handle constant expressions by falling through to the general
operator-based code. In particular, this adds support for bitcast
and GEP expressions.
2022-10-04 11:58:07 +02:00
Nuno Lopes
53dc0f1078 [NFC] Switch a few uses of undef to poison as placeholders for unreachble code 2022-07-03 14:34:03 +01:00
Teresa Johnson
ced9a795fd [WPD] Add statistics
Add statistics to count overall devirtualized targets as well as the
various types of devirtualizations applied at callsites.

Differential Revision: https://reviews.llvm.org/D123152
2022-04-05 18:48:23 -07:00
minglotus-6
9c49f8d705 [LTO][WPD] Ignore unreachable function by analyzing IR.
In regular LTO, analyze IR and discard unreachable functions when finding virtual call targets.

Differential Revision: https://reviews.llvm.org/D116056
2021-12-21 18:13:03 +00:00
Bjorn Pettersson
3f8027fb67 [test] Update some test cases to use -passes when specifying the pipeline
This updates transform test cases for
  ADCE
  AddDiscriminators
  AggressiveInstCombine
  AlignmentFromAssumptions
  ArgumentPromotion
  BDCE
  CalledValuePropagation
  DCE
  Reg2Mem
  WholeProgramDevirt
to use the -passes syntax when specifying the pipeline.

Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is
a deprecated feature) the updated test cases already used the new
pass manager, but they were using the legacy syntax when specifying
the passes to run. This patch can be seen as a step toward deprecating
that interface.

This patch also removes some redundant RUN lines. Here I am
referring to test cases that had multiple RUN lines verifying both
the legacy "-passname" syntax and the new "-passes=passname" syntax.
Since we switched the default pass manager to "new PM" both RUN lines
have verified the new PM version of the pass (more or less wasting
time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER
is set to "off". It is assumed that it is enough to run these tests
with the new pass manager now.

Differential Revision: https://reviews.llvm.org/D108472
2021-09-29 21:51:08 +02:00
Nikita Popov
f8aaec19e6 [OpaquePtr] Support forward references in textual IR
Currently, LLParser will create a Function/GlobalVariable forward
reference based on the desired pointer type and then modify it when
it is declared. With opaque pointers, we generally do not know the
correct type to use until we see the declaration.

Solve this by creating the forward reference with a dummy type, and
then performing a RAUW with the correct Function/GlobalVariable when
it is declared. The approach is adopted from
b5b55963f6.

This results in a change to the use list order, which is why we see
test changes on some module passes that are not stable under use list
reordering.

Differential Revision: https://reviews.llvm.org/D104950
2021-06-29 20:10:31 +02:00
Arthur Eubanks
7110510eca [WPD] Don't optimize calls more than once
WPD currently assumes that there is a one to one correspondence between
type test assume sequences and virtual calls. However, with
-fstrict-vtable-pointers this may not be true. This ends up causing
crashes when we try to optimize a virtual call more than once (
applyUniformRetValOpt()/applyUniqueRetValOpt()/applyVirtualConstProp()/applySingleImplDevirt()).

applySingleImplDevirt() actually didn't previous crash because it would
replace the devirtualized call with the same direct call. Adding an
assert that the call is indirect causes the corresponding test to crash
with the rest of the patch.

This makes Chrome successfully build with -fstrict-vtable-pointers + WPD.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D104798
2021-06-24 13:28:09 -07:00
Hans Wennborg
c6e5c4654b Don't use $ as suffix for symbol names in ThinLTOBitcodeWriter and other places
Using $ breaks demangling of the symbols. For example,

$ c++filt _Z3foov\$123
_Z3foov$123

This causes problems for developers who would like to see nice stack traces
etc., but also for automatic crash tracking systems which try to organize
crashes based on the stack traces.

Instead, use the period as suffix separator, since Itanium demanglers normally
ignore such suffixes:

$ c++filt _Z3foov.123
foo() [clone .123]

This is already done in some places; try to do it everywhere.

Differential revision: https://reviews.llvm.org/D97484
2021-03-29 13:03:52 +02:00
Fangrui Song
54fb3ca96e [ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags
Imported functions and variable get the visibility from the module supplying the
definition.  However, non-imported definitions do not get the visibility from
(ELF) the most constraining visibility among all modules (Mach-O) the visibility
of the prevailing definition.

This patch

* adds visibility bits to GlobalValueSummary::GVFlags
* computes the result visibility and propagates it to all definitions

Protected/hidden can imply dso_local which can enable some optimizations (this
is stronger than GVFlags::DSOLocal because the implied dso_local can be
leveraged for ELF -shared while default visibility dso_local has to be cleared
for ELF -shared).

Note: we don't have summaries for declarations, so for ELF if a declaration has
the most constraining visibility, the result visibility may not be that one.

Differential Revision: https://reviews.llvm.org/D92900
2021-01-27 10:43:51 -08:00
Arthur Eubanks
460dda071e [WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass
The legacy pass's default constructor sets UseCommandLine = true and
goes down a separate testing route. Match that in the NPM pass.

This fixes all tests in llvm/test/Transforms/WholeProgramDevirt under NPM.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D88588
2020-09-30 17:27:37 -07:00
Teresa Johnson
6014c46c80 Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This restores commit 80d0a137a5aba6998fadb764f1e11cb901aae233, and the
follow on fix in 873c0d0786dcf22f4af39f65df824917f70f2170, with a new
fix for test failures after a 2-stage clang bootstrap, and a more robust
fix for the Chromium build failure that an earlier version partially
fixed. See also discussion on D75201.

Reviewers: evgeny777

Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73242
2020-07-14 12:16:57 -07:00
Bob Haarman
cc5c58889e [WPD] Avoid noalias assumptions in unique return value optimization
Summary:
Changes the type of the @__typeid_.*_unique_member imports we generate
for unique return value optimization from i8 to [0 x i8]. This
prevents assuming that these imports do not alias, such as when
two unique return values occur in the same vtable.

Fixes PR45393.

Reviewers: tejohnson, pcc

Reviewed By: pcc

Subscribers: aganea, hiraditya, rnk, george.burgess.iv, dblaikie, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77421
2020-04-16 14:49:51 -07:00
evgeny
6d2032e259 [WPD] Provide a way to prevent functions from being devirtualized
Differential revision: https://reviews.llvm.org/D75617
2020-03-09 14:05:15 +03:00
Teresa Johnson
80bf137fa1 Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP""
This reverts commit 80d0a137a5aba6998fadb764f1e11cb901aae233, and the
follow on fix in 873c0d0786dcf22f4af39f65df824917f70f2170. It is
causing test failures after a multi-stage clang bootstrap. See
discussion on D73242 and D75201.
2020-03-02 14:02:13 -08:00
Teresa Johnson
80d0a137a5 Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This restores commit 748bb5a0f1964d20dfb3891b0948ab6c66236c70, along
with a fix for a Chromium test suite build issue (and a new test for
that case).

Differential Revision: https://reviews.llvm.org/D73242
2020-02-11 10:48:05 -08:00
Teresa Johnson
25aa2eef99 Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This reverts commit 748bb5a0f1964d20dfb3891b0948ab6c66236c70.

Due to Chromium CFI+ThinLTO test crashes reported on patch.
2020-02-05 19:27:32 -08:00
Teresa Johnson
748bb5a0f1 [WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP
Summary:
Currently type test assume sequences inserted for devirtualization are
removed during WPD. This patch delays their removal until later in the
optimization pipeline. This is an enabler for upcoming enhancements to
indirect call promotion, for example streamlined promotion guard
sequences that compare against vtable address instead of the target
function, when there are small number of possible vtables (either
determined via WPD or by in-progress type profiling). We need the type
tests to correlate the callsites with the address point offset needed in
the compare sequence, and optionally to associated type summary info
computed during WPD.

This depends on work in D71913 to enable invocation of LowerTypeTests to
drop type test assume sequences, which will now be invoked following ICP
in the ThinLTO post-LTO link pipelines, and also after the existing
export phase LowerTypeTests invocation in regular LTO (which is already
after ICP). We cannot simply move the existing import phase
LowerTypeTests pass later in the ThinLTO post link pipelines, as the
comment in PassBuilder.cpp notes (it must run early because when
performing CFI other passes may disturb the sequences it looks for).

This necessitated adding a new type test resolution "Unknown" that we
can use on the type test assume sequences previously removed by WPD,
that we now want LTT to ignore.

Depends on D71913.

Reviewers: pcc, evgeny777

Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73242
2020-02-05 08:59:48 -08:00
Teresa Johnson
2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d37cf9ad88b5021b33ecdbaf2e18911c (D71913), along
with bot fix 19c76989bb505c3117730c47df85fd3800ea2767.

The bot failure should be fixed by D73418, committed as
af954e441a5170a75687699d91d85e0692929d43.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Fangrui Song
daabc9a028 [WholeProgramDevirt][test] Fix test after D73094 2020-01-24 00:46:18 -08:00
Evgeny Leviant
8973fae195 [WPD] Allow load/save bitcoded index when running opt -wholeprogramdevirt
Differential revision: https://reviews.llvm.org/D73094
2020-01-24 00:31:39 -08:00
Teresa Johnson
90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d37cf9ad88b5021b33ecdbaf2e18911c.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Teresa Johnson
59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Tim Northover
a009a60a91 IR: print value numbers for unnamed function arguments
For consistency with normal instructions and clarity when reading IR,
it's best to print the %0, %1, ... names of function arguments in
definitions.

Also modifies the parser to accept IR in that form for obvious reasons.

llvm-svn: 367755
2019-08-03 14:28:34 +00:00
Peter Collingbourne
ef5cfc2dae WholeProgramDevirt: Teach the pass to respect the global's alignment.
The bytes inserted before an overaligned global need to be padded according
to the alignment set on the original global in order for the initializer
to meet the global's alignment requirements. The previous implementation
that padded to the pointer width happened to be correct for vtables on most
platforms but may do the wrong thing if the vtable has a larger alignment.

This issue is visible with a prototype implementation of HWASAN for globals,
which will overalign all globals including vtables to 16 bytes.

There is also no padding requirement for the bytes inserted after the global
because they are never read from nor are they significant for alignment
purposes, so stop inserting padding there.

Differential Revision: https://reviews.llvm.org/D65031

llvm-svn: 366725
2019-07-22 18:50:45 +00:00
Teresa Johnson
37b80122bd [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).

Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.

Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59709

llvm-svn: 360466
2019-05-10 20:08:24 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Teresa Johnson
7fb39dfa7c [ThinLTO] Efficiency fix for writing type id records in per-module indexes
Summary:
In D49565/r337503, the type id record writing was fixed so that only
referenced type ids were emitted into each per-module index for ThinLTO
distributed builds. However, this still left an efficiency issue: each
per-module index checked all type ids for membership in the referenced
set, yielding O(M*N) performance (M indexes and N type ids).

Change the TypeIdMap in the summary to be indexed by GUID, to facilitate
correlating with type identifier GUIDs referenced in the function
summary TypeIdInfo structures. This allowed simplifying other
places where a map from type id GUID to type id map entry was previously
being used to aid this correlation.

Also fix AsmWriter code to handle the rare case of type id GUID
collision.

For a large internal application, this reduced the thin link time by
almost 15%.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51330

llvm-svn: 343021
2018-09-25 20:14:40 +00:00
Eugene Leviant
2b70d616f0 [WholeProgramDevirt] Don't process declarations when building type id map
Differential revision: https://reviews.llvm.org/D52175

llvm-svn: 342836
2018-09-23 13:27:47 +00:00
Vitaly Buka
66f53d71f7 Runtime flag to control branch funnel threshold
Reviewers: pcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45193

llvm-svn: 329459
2018-04-06 21:32:36 +00:00
Vitaly Buka
4296ea72ff Don't inline @llvm.icall.branch.funnel
Summary: @llvm.icall.branch.funnel is musttail with variable number of
arguments. After inlining current backend can't separate call targets from call
arguments.

Reviewers: pcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45116

llvm-svn: 329235
2018-04-04 21:46:27 +00:00
Peter Collingbourne
2974856ad4 Use branch funnels for virtual calls when retpoline mitigation is enabled.
The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the
branch predictor, and as a result it can lead to a measurable loss of
performance. We can reduce the performance impact of retpolined virtual
calls by replacing them with a special construct known as a branch
funnel, which is an instruction sequence that implements virtual calls
to a set of known targets using a binary tree of direct branches. This
allows the processor to speculately execute valid implementations of the
virtual function without allowing for speculative execution of of calls
to arbitrary addresses.

This patch extends the whole-program devirtualization pass to replace
certain virtual calls with calls to branch funnels, which are
represented using a new llvm.icall.jumptable intrinsic. It also extends
the LowerTypeTests pass to recognize the new intrinsic, generate code
for the branch funnels (x86_64 only for now) and lay out virtual tables
as required for each branch funnel.

The implementation supports full LTO as well as ThinLTO, and extends the
ThinLTO summary format used for whole-program devirtualization to
support branch funnels.

For more details see RFC:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html

Differential Revision: https://reviews.llvm.org/D42453

llvm-svn: 327163
2018-03-09 19:11:44 +00:00
Rafael Espindola
9fbc040599 Make GlobalValues with non-default visibilility dso_local.
This is similar to r322317, but for visibility. It is not as neat
because we have to special case extern_weak.

The idea is the same as the previous change, make the transition to
explicit dso_local easier for the frontends. With this they only have
to add dso_local to symbols where we need some external information to
decide if it is dso_local (like it being part of an ELF executable).

llvm-svn: 322806
2018-01-18 02:08:23 +00:00
Rafael Espindola
e4b0231c63 Make internal/private GVs implicitly dso_local.
While updating clang tests for having clang set dso_local I noticed
that:

- There are *a lot* of tests to update.
- Many of the updates are redundant.

They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.

llvm-svn: 322317
2018-01-11 22:15:05 +00:00
Sean Fertile
4595a915f6 [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.

Originally commited as r317374, but reverted in r317395 to update some missed
tests.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317408
2017-11-04 17:04:39 +00:00
Sean Fertile
39770ca0a1 Revert "[LTO][ThinLTO] Use the linker resolutions to mark global values ..."
Changes more tests then expected on one of the build bots.
reverting to investigate.

This reverts https://llvm.org/svn/llvm-project/llvm/trunk@317374

llvm-svn: 317395
2017-11-04 01:54:20 +00:00
Sean Fertile
36528c2a9b [LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local.
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.

Differential Revision: https://reviews.llvm.org/D35702

llvm-svn: 317374
2017-11-03 21:45:55 +00:00