llvm-project

Author	SHA1	Message	Date
Mircea Trofin	f2d827c444	[profcheck] Require `unknown` metadata have an origin parameter (#157594 ) Rather than passes using `!prof = !{!”unknown”}`for cases where don’t have enough information to emit profile values, this patch captures the pass (or some other information) that can help diagnostics - i.e. `!{!”unknown”, !”some-pass-name”}`. For example, suppose we emitted a `select` with the unknown metadata, and, later, end up needing to lower that to a conditional branch. If we observe (via sample profiling, for example) that the branch is biased and would have benefitted from a valid profile, the extra information can help speed up debugging. We can also (in a subsequent pass) generate optimization remarks about such lowered selects, with a similar aim - identify patterns lowering to `select` that may be worth some extra investment in extracting a more precise profile.	2025-09-10 15:34:35 -07:00
Mircea Trofin	c55708c65c	[WPD] set the branch funnel function entry count (#155657 ) We can compute the entry count of branch funnel functions, and potentially avoid them being deemed cold (also, keeping profile information coherent is always good for performance) Issue #147390	2025-09-08 16:41:33 -07:00
Tianle Liu	6001a8bb94	[WholeProgramDevirt] Add check for AvailableExternal and give up icall.branch.funnel (#143468 ) When a customer class inherits from a libc++ class, and is built with "-flto -fwhole-program-vtables -static-libstdc++ \ -Wl,-plugin-opt=-whole-program-visibility", the libc++ class's vtable is available_externally, meanwhile the customer class vtable is private. And both of them are !vcall_visibility == Linkage Unit. In this case, icall.branch.funnel might be generated. But the icall.branch.funnel would cause crash in LowerTypeTests because available_externally Global_Object's GlobalTypeMember would not be saved and finally leads to a NULL GlobalTypeMember which causes a crash. Even saving the available_externally GO's GlobalTypeMember so that it is not NULL to avoid the crash in LowerTypeTests, it still will crash in SelectionDAGBuilder or Verifier, because operands linkage type consistency check of icall.branch.funnel can not pass. So any one of available externally vtable would stop to generate icall.branch.funnel. This patch fixes FullLTO mode and split-LTO-unit ThinLTO mode.	2025-06-20 08:01:32 +08:00
PiJoules	f32f048719	[llvm] Use ABI instead of preferred alignment for const prop checks (#142500 ) We'd hit an assertion checking proper alignment for an i8 when building chromium because we used the prefered alignment (which is 4 bytes) instead of the ABI alignment (which is 1 byte). The ABI alignment should be used because that's the actual alignment needed to load a constant from the vtable. This also updates the two `virtual-const-prop-small-alignment-*` to explicitly give ABI alignments for i64s.	2025-06-04 10:57:59 -07:00
PiJoules	2661e995ce	[llvm] Ensure propagated constants in the vtable are aligned (#136630 ) It's possible for virtual constant propagation in whole program devirtualization to create unaligned loads. We originally saw this with 4-byte aligned relative vtables where we could store 8-byte values before/after the vtable. But since the vtable is 4-byte aligned and we unconditionally do an 8-byte load, we can't guarantee that the stored constant will always be aligned to 8 bytes. We can also see this with normal vtables whenever a 1-byte char is stored in the vtable because the offset calculation for the GEP doesn't take into account the original vtable alignment. This patch introduces two changes to virtual constant propagation: 1. Do not propagate constants whose preferred alignment is larger than the vtable alignment. This is required because if the constants are stored in the vtable, we can only guarantee the constant will be stored at an address at most aligned to the vtable's alignment. 2. Round up the offset used in the GEP before the load to ensure it's at an address suitably aligned such that we can load from it. This patch updates tests to reflect this alignment change and adds some cases for relative vtables.	2025-05-15 11:52:25 -07:00
PiJoules	649b7994fb	[NFC][WPD] Add constant propagation tests checking relative vtables (#138989 ) This is a patch with precommitted tests to make https://github.com/llvm/llvm-project/pull/136630 easier to review. The `virtual-const-prop-small-alignment-*` tests check the output when the loaded int alignment is less than the vtable alignment. This also changes some constants to make it easier to differentiate between propagated values in vtables.	2025-05-13 11:50:11 -07:00
PiJoules	0f8c075f7c	[llvm] Match llvm.type.checked.load.relative semantics to llvm.load.r… (#129583 ) …elative The semantics of `llvm.type.checked.load.relative` seem to be a little different from that of `llvm.load.relative`. It looks like the semantics for `llvm.type.checked.load.relative` is `ptr + offset + (ptr + offset)` whereas the semantics for `llvm.load.relative` is `ptr + (ptr + offset)`. That is, the offset for the former is added to the offset address whereas the later has the offset added to the original pointer. It really feels like the checked intrinsic was meant to match the semantics of the non-checked intrinsic, but I think for all cases the checked intrinsic is used (swift being the only use I know of), the calculation just happens to be the same because swift always uses an offset of zero. Likewise, all llvm tests for this intrinsic happen to use an offset of zero. Relative vtables in clang happens to be the first time where we're using this intrinsic and using it with non-zero values. This updates the semantics of the checked intrinsic to match the non-checked one. Effectively this shouldn't change any codegen by any users of this since all current users seem to use a zero offset. This PR also updates some tests with non-zero offsets.	2025-03-13 14:50:41 -07:00
Mingming Liu	47cc9db797	[WPD]Regard unreachable function as a possible devirtualizable target (#115668 ) https://reviews.llvm.org/D115492 skips unreachable functions and potentially allows more static de-virtualizations. The motivation is to ignore virtual deleting destructor of abstract class (e.g., `Base::~Base()` in https://gcc.godbolt.org/z/dWMsdT9Kz). * Note WPD already handles most pure virtual functions (like `Base::x()` in the godbolt example above), which becomes a `__cxa_pure_virtual` in the vtable slot. This PR proposes to undo the change, because it turns out there are other unreachable functions that a general program wants to run and fail intentionally, with `LOG(FATAL)` or `CHECK` [1] for example. While many real-world applications are encouraged to check-fail sparingly, they are allowed to do so on critical errors (e.g., misconfiguration or bug is detected during server startup). * Implementation-wise, this PR keeps the one-bit 'unreachable' state in bitcode and updates WPD analysis. https://gcc.godbolt.org/z/T1aMhczYr is a minimum reproducible example extracted from unit test. `Base::func` is a one-liner of `LOG(FATAL) << "message"`, and lowered to one basic block ending with `unreachable`. A real-world program is _allowed_ to invoke Base::func to terminate the program as a way to report errors (in server initialization stage for example), even if errors on the serving path should be handled more gracefully. [1] https://abseil.io/docs/cpp/guides/logging#CHECK and https://abseil.io/docs/cpp/guides/logging#configuration-and-flags	2024-11-13 11:28:36 -08:00
Mingming Liu	60944177b8	[WPD][ThinLTO]Add cutoff option for WPD (#113383 ) This option applies for _import_ WPD (i.e., when `DevirtModule` pass de-virtualizes according to an imported summary, in ThinLTO backend pipeline). It's meant for debugging (e.g., bisection).	2024-10-23 23:47:27 -07:00
Leonard Chan	58c5f50f4c	Reapply "[llvm] Teach GlobalDCE about dso_local_equivalent" Also reapply "[llvm] Teach whole program devirtualization about relative vtables" This reverts commit 1c604a9780fcfe92a99d539913553f0835b81de3 and 474f5efebed24547e76d022f0c5ffcc9db97ce6f.	2024-04-15 11:40:23 -07:00
Leonard Chan	c0b77e0a4a	Revert "Reapply "[llvm] Teach whole program devirtualization about relative vtables"" This reverts commit 09c3bfe9b3eb47a2af0c10531b25f90cfb5fa9f4.	2024-04-15 11:14:50 -07:00
Leonard Chan	09c3bfe9b3	Reapply "[llvm] Teach whole program devirtualization about relative vtables" This reverts commit 474f5efebed24547e76d022f0c5ffcc9db97ce6f.	2024-04-15 11:01:04 -07:00
Mingming Liu	dda73336ad	[ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597 ) The motivating use case is to support import the function declaration across modules to construct call graph edges for indirect calls [1] when importing the function definition costs too much compile time (e.g., the function is too large has no `noinline` attribute). 1. Currently, when the compiled IR module doesn't have a function definition but its postlink combined summary contains the function summary or a global alias summary with this function as aliasee, the function definition will be imported from source module by IRMover. The implementation is in FunctionImporter::importFunctions [2] 2. In order for FunctionImporter to import a declaration of a function, both function summary and alias summary need to carry the def / decl state. Specifically, all existing summary fields doesn't differ across import modules, but the def / decl state of is decided by `<ImportModule, Function>`. This change encodes the def/decl state in `GlobalValueSummary::GVFlags`. In the subsequent changes 1. The indexing step `computeImportForModule` [3] will compute the set of definitions and the set of declarations for each module, and passing on the information to bitcode writer. 2. Bitcode writer will look up the def/decl state and sets the state when it writes out the flag value. This is demonstrated in https://github.com/llvm/llvm-project/pull/87600 3. Function importer will read the def/decl state when reading the combined summary to figure out two sets of global values, and IRMover will be updated to import the declaration (aka linkGlobalValuePrototype [4]) into the destination module. - The next change is https://github.com/llvm/llvm-project/pull/87600 [1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5 [2] `3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)` [3] `3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)` [4] `3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)`	2024-04-10 19:46:01 -07:00
Arnold Schwaighofer	98eb8abff6	Add a type_checked_load_relative to support relative function pointer tables This adds a type_checked_load_relative intrinsic whose semantics it is to load a relative function pointer. A relative function pointer is a pointer to a 32bit value that when added to its address yields the address of the function. Differential Revision: https://reviews.llvm.org/D143204	2023-06-29 08:33:45 -07:00
Arnold Schwaighofer	200a1ccea0	WholeProgramDevirt: Fix call target propagation for ptrauth architectures We can't have a call with a constant target with a ptrauth bundle. Remove the ptrauth bundle operand in such a case rdar://105696396 Differential Revision: https://reviews.llvm.org/D144581	2023-06-29 08:02:58 -07:00
Teresa Johnson	200cc952a2	[LTO][GlobalDCE] Use pass parameter instead of module flag for LTO phase D63932 added a module flag to indicate that we are executing the regular LTO post merge pipeline, so that GlobalDCE could perform more aggressive optimization for Dead Virtual Function Elimination. This caused issues trying to reuse bitcode that had already been through the LTO pipeline (see context in D139816). Instead support this by passing down a parameter flag to the GlobalDCEPass constructor, which is the more usual way for indicating this information. Most test changes are to remove incidental uses of this flag. Of the 2 real uses, llvm/test/LTO/ARM/lto-linking-metadata.ll is now obsolete and removed in this patch, and the virtual-functions-visibility-post-lto.ll test is updated to use the regular LTO default pipeline where this parameter is set to true. Differential Revision: https://reviews.llvm.org/D153655	2023-06-23 17:05:07 -07:00
Leonard Chan	474f5efebe	Revert "[llvm] Teach whole program devirtualization about relative vtables" This reverts commit db288184765c0b4010060ebea1f6de3ac1f66445. Reverting since it broke our lto builders reported by fxbug.dev/123807.	2023-03-26 01:53:13 +00:00
Leonard Chan	53a9175951	[llvm] Handle duplicate call bases when applying branch funneling It's possible to segfault in `DevirtModule::applyICallBranchFunnel` when attempting to call `getCaller` on a call base that was erased in a prior iteration. This can occur when attempting to find devirtualizable calls via `findDevirtualizableCallsForTypeTest` if the vtable passed to llvm.type.test is a global and not a local. The function works by taking the first argument of the llvm.type.test call (which is a vtable), iterating through all uses of it, and adding any relevant all uses that are calls associated with that intrinsic call to a vector. For most cases where the vtable is actually a local, this wouldn't be an issue. Take for example: ``` define i32 @fn(ptr %obj) #0 { %vtable = load ptr, ptr %obj %p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr %vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } ``` `findDevirtualizableCallsForTypeTest` will check the call base ` %result = call i32 %fptr(ptr %obj, i32 1)`, find that it is associated with a virtualizable call from `%vtable`, find all loads for `%vtable`, and add any instances those load results are called into a vector. Now consider the case where instead `%vtable` was the global itself rather than a local: ``` define i32 @fn(ptr %obj) #0 { %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr @vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } ``` `findDevirtualizableCallsForTypeTest` should work normally and add one unique call instance to a vector. However, if there are multiple instances where this same global is used for llvm.type.test, like with: ``` define i32 @fn(ptr %obj) #0 { %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr @vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } define i32 @fn2(ptr %obj) #0 { %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr @vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } ``` Then each call base `%result = call i32 %fptr(ptr %obj, i32 1)` will be added to the vector twice. This is because for either call base `%result = call i32 %fptr(ptr %obj, i32 1) `, we determine it is associated with a virtualizable call from `@vtable`, and then we iterate through all the uses of `@vtable`, which is used across multiple functions. So when scanning the first `%result = call i32 %fptr(ptr %obj, i32 1)`, then both call bases will be added to the vector, but when scanning the second one, both call bases are added again, resulting in duplicate call bases in the CSInfo.CallSites vector. Note this is actually accounted for in every other instance WPD iterates over CallSites. What everything else does is actually add the call base to the `OptimizedCalls` set and just check if it's already in the set. We can't reuse that particular set since it serves a different purpose marking which calls where devirtualized which `applyICallBranchFunnel` explicitly says it doesn't. For this fix, we can just account for duplicates with a map and do the actual replacements afterwards by iterating over the map. Differential Revision: https://reviews.llvm.org/D146267	2023-03-23 21:44:59 +00:00
Leonard Chan	db28818476	[llvm] Teach whole program devirtualization about relative vtables Prior to this patch, WPD was not acting on relative-vtables in C++. This involves teaching WPD about these things: - llvm.load.relative which is how relative-vtables are indexed (instead of GEP) - dso_local_equivalent which is used in the vtable itself when taking the offset between a virtual function and vtable - Update llvm/test/ThinLTO/X86/devirt.ll to use opaque pointers and add equivalent tests for RV Differential Revision: https://reviews.llvm.org/D134320	2023-02-23 22:18:43 +00:00
Leonard Chan	0ecf796592	[llvm] Update WPD tests to use opaque pointers Differential Revision: https://reviews.llvm.org/D135611	2022-10-11 22:30:24 +00:00
Nikita Popov	6e504d637d	[ValueTracking] Handle constant exprs in isKnownNonZero() Handle constant expressions by falling through to the general operator-based code. In particular, this adds support for bitcast and GEP expressions.	2022-10-04 11:58:07 +02:00
Nuno Lopes	53dc0f1078	[NFC] Switch a few uses of undef to poison as placeholders for unreachble code	2022-07-03 14:34:03 +01:00
Teresa Johnson	ced9a795fd	[WPD] Add statistics Add statistics to count overall devirtualized targets as well as the various types of devirtualizations applied at callsites. Differential Revision: https://reviews.llvm.org/D123152	2022-04-05 18:48:23 -07:00
minglotus-6	9c49f8d705	[LTO][WPD] Ignore unreachable function by analyzing IR. In regular LTO, analyze IR and discard unreachable functions when finding virtual call targets. Differential Revision: https://reviews.llvm.org/D116056	2021-12-21 18:13:03 +00:00
Bjorn Pettersson	3f8027fb67	[test] Update some test cases to use -passes when specifying the pipeline This updates transform test cases for ADCE AddDiscriminators AggressiveInstCombine AlignmentFromAssumptions ArgumentPromotion BDCE CalledValuePropagation DCE Reg2Mem WholeProgramDevirt to use the -passes syntax when specifying the pipeline. Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is a deprecated feature) the updated test cases already used the new pass manager, but they were using the legacy syntax when specifying the passes to run. This patch can be seen as a step toward deprecating that interface. This patch also removes some redundant RUN lines. Here I am referring to test cases that had multiple RUN lines verifying both the legacy "-passname" syntax and the new "-passes=passname" syntax. Since we switched the default pass manager to "new PM" both RUN lines have verified the new PM version of the pass (more or less wasting time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER is set to "off". It is assumed that it is enough to run these tests with the new pass manager now. Differential Revision: https://reviews.llvm.org/D108472	2021-09-29 21:51:08 +02:00
Nikita Popov	f8aaec19e6	[OpaquePtr] Support forward references in textual IR Currently, LLParser will create a Function/GlobalVariable forward reference based on the desired pointer type and then modify it when it is declared. With opaque pointers, we generally do not know the correct type to use until we see the declaration. Solve this by creating the forward reference with a dummy type, and then performing a RAUW with the correct Function/GlobalVariable when it is declared. The approach is adopted from `b5b55963f6`. This results in a change to the use list order, which is why we see test changes on some module passes that are not stable under use list reordering. Differential Revision: https://reviews.llvm.org/D104950	2021-06-29 20:10:31 +02:00
Arthur Eubanks	7110510eca	[WPD] Don't optimize calls more than once WPD currently assumes that there is a one to one correspondence between type test assume sequences and virtual calls. However, with -fstrict-vtable-pointers this may not be true. This ends up causing crashes when we try to optimize a virtual call more than once ( applyUniformRetValOpt()/applyUniqueRetValOpt()/applyVirtualConstProp()/applySingleImplDevirt()). applySingleImplDevirt() actually didn't previous crash because it would replace the devirtualized call with the same direct call. Adding an assert that the call is indirect causes the corresponding test to crash with the rest of the patch. This makes Chrome successfully build with -fstrict-vtable-pointers + WPD. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D104798	2021-06-24 13:28:09 -07:00
Hans Wennborg	c6e5c4654b	Don't use $ as suffix for symbol names in ThinLTOBitcodeWriter and other places Using $ breaks demangling of the symbols. For example, $ c++filt _Z3foov\$123 _Z3foov$123 This causes problems for developers who would like to see nice stack traces etc., but also for automatic crash tracking systems which try to organize crashes based on the stack traces. Instead, use the period as suffix separator, since Itanium demanglers normally ignore such suffixes: $ c++filt _Z3foov.123 foo() [clone .123] This is already done in some places; try to do it everywhere. Differential revision: https://reviews.llvm.org/D97484	2021-03-29 13:03:52 +02:00
Fangrui Song	54fb3ca96e	[ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags Imported functions and variable get the visibility from the module supplying the definition. However, non-imported definitions do not get the visibility from (ELF) the most constraining visibility among all modules (Mach-O) the visibility of the prevailing definition. This patch * adds visibility bits to GlobalValueSummary::GVFlags * computes the result visibility and propagates it to all definitions Protected/hidden can imply dso_local which can enable some optimizations (this is stronger than GVFlags::DSOLocal because the implied dso_local can be leveraged for ELF -shared while default visibility dso_local has to be cleared for ELF -shared). Note: we don't have summaries for declarations, so for ELF if a declaration has the most constraining visibility, the result visibility may not be that one. Differential Revision: https://reviews.llvm.org/D92900	2021-01-27 10:43:51 -08:00
Arthur Eubanks	460dda071e	[WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass The legacy pass's default constructor sets UseCommandLine = true and goes down a separate testing route. Match that in the NPM pass. This fixes all tests in llvm/test/Transforms/WholeProgramDevirt under NPM. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D88588	2020-09-30 17:27:37 -07:00
Teresa Johnson	6014c46c80	Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP" This restores commit 80d0a137a5aba6998fadb764f1e11cb901aae233, and the follow on fix in 873c0d0786dcf22f4af39f65df824917f70f2170, with a new fix for test failures after a 2-stage clang bootstrap, and a more robust fix for the Chromium build failure that an earlier version partially fixed. See also discussion on D75201. Reviewers: evgeny777 Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73242	2020-07-14 12:16:57 -07:00
Bob Haarman	cc5c58889e	[WPD] Avoid noalias assumptions in unique return value optimization Summary: Changes the type of the @__typeid_.*_unique_member imports we generate for unique return value optimization from i8 to [0 x i8]. This prevents assuming that these imports do not alias, such as when two unique return values occur in the same vtable. Fixes PR45393. Reviewers: tejohnson, pcc Reviewed By: pcc Subscribers: aganea, hiraditya, rnk, george.burgess.iv, dblaikie, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77421	2020-04-16 14:49:51 -07:00
evgeny	6d2032e259	[WPD] Provide a way to prevent functions from being devirtualized Differential revision: https://reviews.llvm.org/D75617	2020-03-09 14:05:15 +03:00
Teresa Johnson	80bf137fa1	Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"" This reverts commit 80d0a137a5aba6998fadb764f1e11cb901aae233, and the follow on fix in 873c0d0786dcf22f4af39f65df824917f70f2170. It is causing test failures after a multi-stage clang bootstrap. See discussion on D73242 and D75201.	2020-03-02 14:02:13 -08:00
Teresa Johnson	80d0a137a5	Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP" This restores commit 748bb5a0f1964d20dfb3891b0948ab6c66236c70, along with a fix for a Chromium test suite build issue (and a new test for that case). Differential Revision: https://reviews.llvm.org/D73242	2020-02-11 10:48:05 -08:00
Teresa Johnson	25aa2eef99	Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP" This reverts commit 748bb5a0f1964d20dfb3891b0948ab6c66236c70. Due to Chromium CFI+ThinLTO test crashes reported on patch.	2020-02-05 19:27:32 -08:00
Teresa Johnson	748bb5a0f1	[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP Summary: Currently type test assume sequences inserted for devirtualization are removed during WPD. This patch delays their removal until later in the optimization pipeline. This is an enabler for upcoming enhancements to indirect call promotion, for example streamlined promotion guard sequences that compare against vtable address instead of the target function, when there are small number of possible vtables (either determined via WPD or by in-progress type profiling). We need the type tests to correlate the callsites with the address point offset needed in the compare sequence, and optionally to associated type summary info computed during WPD. This depends on work in D71913 to enable invocation of LowerTypeTests to drop type test assume sequences, which will now be invoked following ICP in the ThinLTO post-LTO link pipelines, and also after the existing export phase LowerTypeTests invocation in regular LTO (which is already after ICP). We cannot simply move the existing import phase LowerTypeTests pass later in the ThinLTO post link pipelines, as the comment in PassBuilder.cpp notes (it must run early because when performing CFI other passes may disturb the sequences it looks for). This necessitated adding a new type test resolution "Unknown" that we can use on the type test assume sequences previously removed by WPD, that we now want LTT to ignore. Depends on D71913. Reviewers: pcc, evgeny777 Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73242	2020-02-05 08:59:48 -08:00
Teresa Johnson	2f63d549f1	Restore "[LTO/WPD] Enable aggressive WPD under LTO option" This restores 59733525d37cf9ad88b5021b33ecdbaf2e18911c (D71913), along with bot fix 19c76989bb505c3117730c47df85fd3800ea2767. The bot failure should be fixed by D73418, committed as af954e441a5170a75687699d91d85e0692929d43. I also added a fix for non-x86 bot failures by requiring x86 in new test lld/test/ELF/lto/devirt_vcall_vis_public.ll.	2020-01-27 07:55:05 -08:00
Fangrui Song	daabc9a028	[WholeProgramDevirt][test] Fix test after D73094	2020-01-24 00:46:18 -08:00
Evgeny Leviant	8973fae195	[WPD] Allow load/save bitcoded index when running opt -wholeprogramdevirt Differential revision: https://reviews.llvm.org/D73094	2020-01-24 00:31:39 -08:00
Teresa Johnson	90e630a95e	Revert "[LTO/WPD] Enable aggressive WPD under LTO option" This reverts commit 59733525d37cf9ad88b5021b33ecdbaf2e18911c. There is a windows sanitizer bot failure in one of the cfi tests that I will need some time to figure out: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio	2020-01-23 17:29:24 -08:00
Teresa Johnson	59733525d3	[LTO/WPD] Enable aggressive WPD under LTO option Summary: Third part in series to support Safe Whole Program Devirtualization Enablement, see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html This patch adds type test metadata under -fwhole-program-vtables, even for classes without hidden visibility. It then changes WPD to skip devirtualization for a virtual function call when any of the compatible vtables has public vcall visibility. Additionally, internal LLVM options as well as lld and gold-plugin options are added which enable upgrading all public vcall visibility to linkage unit (hidden) visibility during LTO. This enables the more aggressive WPD to kick in based on LTO time knowledge of the visibility guarantees. Support was added to all flavors of LTO WPD (regular, hybrid and index-only), and to both the new and old LTO APIs. Unfortunately it was not simple to split the first and second parts of this part of the change (the unconditional emission of type tests and the upgrading of the vcall visiblity) as I needed a way to upgrade the public visibility on legacy WPD llvm assembly tests that don't include linkage unit vcall visibility specifiers, to avoid a lot of test churn. I also added a mechanism to LowerTypeTests that allows dropping type test assume sequences we now aggressively insert when we invoke distributed ThinLTO backends with null indexes, which is used in testing mode, and which doesn't invoke the normal ThinLTO backend pipeline. Depends on D71907 and D71911. Reviewers: pcc, evgeny777, steven_wu, espindola Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71913	2020-01-23 16:09:44 -08:00
Tim Northover	a009a60a91	IR: print value numbers for unnamed function arguments For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755	2019-08-03 14:28:34 +00:00
Peter Collingbourne	ef5cfc2dae	WholeProgramDevirt: Teach the pass to respect the global's alignment. The bytes inserted before an overaligned global need to be padded according to the alignment set on the original global in order for the initializer to meet the global's alignment requirements. The previous implementation that padded to the pointer width happened to be correct for vtables on most platforms but may do the wrong thing if the vtable has a larger alignment. This issue is visible with a prototype implementation of HWASAN for globals, which will overalign all globals including vtables to 16 bytes. There is also no padding requirement for the bytes inserted after the global because they are never read from nor are they significant for alignment purposes, so stop inserting padding there. Differential Revision: https://reviews.llvm.org/D65031 llvm-svn: 366725	2019-07-22 18:50:45 +00:00
Teresa Johnson	37b80122bd	[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible Summary: We hit undefined references building with ThinLTO when one source file contained explicit instantiations of a template method (weak_odr) but there were also implicit instantiations in another file (linkonce_odr), and the latter was the prevailing copy. In this case the symbol was marked hidden when the prevailing linkonce_odr copy was promoted to weak_odr. It led to unsats when the resulting shared library was linked with other code that contained a reference (expecting to be resolved due to the explicit instantiation). Add a CanAutoHide flag to the GV summary to allow the thin link to identify when all copies are eligible for auto-hiding (because they were all originally linkonce_odr global unnamed addr), and only do the auto-hide in that case. Most of the changes here are due to plumbing the new flag through the bitcode and llvm assembly, and resulting test changes. I augmented the existing auto-hide test to check for this situation. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi Tags: #llvm Differential Revision: https://reviews.llvm.org/D59709 llvm-svn: 360466	2019-05-10 20:08:24 +00:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Teresa Johnson	7fb39dfa7c	[ThinLTO] Efficiency fix for writing type id records in per-module indexes Summary: In D49565/r337503, the type id record writing was fixed so that only referenced type ids were emitted into each per-module index for ThinLTO distributed builds. However, this still left an efficiency issue: each per-module index checked all type ids for membership in the referenced set, yielding O(M*N) performance (M indexes and N type ids). Change the TypeIdMap in the summary to be indexed by GUID, to facilitate correlating with type identifier GUIDs referenced in the function summary TypeIdInfo structures. This allowed simplifying other places where a map from type id GUID to type id map entry was previously being used to aid this correlation. Also fix AsmWriter code to handle the rare case of type id GUID collision. For a large internal application, this reduced the thin link time by almost 15%. Reviewers: pcc, vitalybuka Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51330 llvm-svn: 343021	2018-09-25 20:14:40 +00:00
Eugene Leviant	2b70d616f0	[WholeProgramDevirt] Don't process declarations when building type id map Differential revision: https://reviews.llvm.org/D52175 llvm-svn: 342836	2018-09-23 13:27:47 +00:00
Vitaly Buka	66f53d71f7	Runtime flag to control branch funnel threshold Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45193 llvm-svn: 329459	2018-04-06 21:32:36 +00:00

1 2

87 Commits