llvm-project

Author	SHA1	Message	Date
Mingming Liu	60944177b8	[WPD][ThinLTO]Add cutoff option for WPD (#113383 ) This option applies for _import_ WPD (i.e., when `DevirtModule` pass de-virtualizes according to an imported summary, in ThinLTO backend pipeline). It's meant for debugging (e.g., bisection).	2024-10-23 23:47:27 -07:00
Rahul Joshi	6924fc0326	[LLVM] Add `Intrinsic::getDeclarationIfExists` (#112428 ) Add `Intrinsic::getDeclarationIfExists` to lookup an existing declaration of an intrinsic in a `Module`.	2024-10-16 07:21:10 -07:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Jay Foad	e03f427196	[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133 ) It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.	2024-09-19 16:16:38 +01:00
Youngsuk Kim	2051736f7b	[llvm][Transforms] Avoid 'raw_string_ostream::str' (NFC) Since `raw_string_ostream` doesn't own the string buffer, it is desirable (in terms of memory safety) for users to directly reference the string buffer rather than use `raw_string_ostream::str()`. Work towards TODO comment to remove `raw_string_ostream::str()`.	2024-06-30 09:03:29 -05:00
Nikita Popov	9df71d7673	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919 ) Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.	2024-06-28 08:36:49 +02:00
Nikita Popov	fba84ecc15	[WPD] Directly create geteleementptr inbounds (NFCI) We know that this GEP is inbounds, so make it explicit. NFCI because constant expression construction already infers this.	2024-05-29 15:43:11 +02:00
Vitaly Buka	c60aa430dc	[NFCI][sanitizers][metadata] Exctract create{Unlikely,Likely}BranchWeights (#89464 ) We have a lot of repeated code with random constants. Particular values are not important, the one just needs to be bigger then another. UR_NONTAKEN_WEIGHT is selected as it's the most common one.	2024-04-19 17:03:23 -07:00
Jeremy Morse	2fe81edef6	[NFC][RemoveDIs] Insert instruction using iterators in Transforms/ As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. Noteworthy changes: * FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators, * I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes. * A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions), * SeparateConstOffsetFromGEP has it's insertion-position field changed. Noteworthy because it's not a purely localised spelling change. All this should be NFC.	2024-03-05 15:12:22 +00:00
Mingming Liu	8ea858b967	[CallPromotionUtil] See through function alias when devirtualizing a virtual call on an alloca. (#80736 ) - Extract utility function from `DevirtModule::tryFindVirtualCallTargets` [1], which sees through an alias to a function. Call this utility function in the WPD callsite. - For type profiling work, this helper function will be used by indirect-call-promotion pass to find the function pointer at a specified vtable offset (an example in [2]) [1] `b99163fe8f/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp (L1069-L1082)` [2] `77a0ef12de/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp (L347)`	2024-02-06 09:22:34 -08:00
Nikita Popov	6c2fbc3a68	[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582 ) This abstracts over the common pattern of creating a gep with i8 element type.	2024-01-12 14:21:21 +01:00
Teresa Johnson	88fbc4d3df	[ThinLTO] Add tail call flag to call edges in summary (#74043 ) This adds support for a HasTailCall flag on function call edges in the ThinLTO summary. It is intended for use in aiding discovery of missing frames from tail calls in profiled call stacks for MemProf of profiled binaries that did not disable tail call elimination. A follow on change will add the use of this new flag during MemProf context disambiguation. The new flag is encoded in the bitcode along with either the hotness flag from the profile, or the relative block frequency under the -write-relbf-to-summary flag when there is no profile data. Because we now will always have some additional call edge information, I have removed the non-profile function summary record format, and we simply encode the tail call flag along with a hotness type of none when there is no profile information or relative block frequency. The change of record format and name caused most of the test case changes. I have added explicit testing of generation of the new tail call flag into the bitcode and IR assembly format as part of the changes to llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also added round trip testing through assembly and bitcode to llvm/test/Assembler/thinlto-summary.ll.	2023-12-06 08:41:44 -08:00
Simon Pilgrim	3ca4fe80d4	[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC. startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)	2023-11-06 16:50:18 +00:00
Nikita Popov	c4c0ac10f1	[IPO] Remove unnecessary bitcasts (NFC)	2023-11-06 16:49:45 +01:00
Kazu Hirata	9c5a5a421d	[llvm] Stop including llvm/ADT/iterator_range.h (NFC) Identified with misc-include-cleaner.	2023-10-22 15:41:18 -07:00
modimo	272bd6f9cc	[WPD][LLD] Add option to validate RTTI is enabled on all native types and prevent devirtualization on types with native RTTI Discussion about this approach: https://discourse.llvm.org/t/rfc-safer-whole-program-class-hierarchy-analysis/65144/18 When enabling WPD in an environment where native binaries are present, types we want to optimize can be derived from inside these native files and devirtualizing them can lead to correctness issues. RTTI can be used as a way to determine all such types in native files and exclude them from WPD providing a safe checked way to enable WPD. The approach is: 1. In the linker, identify if RTTI is available for all native types. If not, under `--lto-validate-all-vtables-have-type-infos` `--lto-whole-program-visibility` is automatically disabled. This is done by examining all .symtab symbols in object files and .dynsym symbols in DSOs for vtable (_ZTV) and typeinfo (_ZTI) symbols and ensuring there's always a match for every vtable symbol. 2. During thinlink, if `--lto-validate-all-vtables-have-type-infos` is set and RTTI is available for all native types, identify all typename (_ZTS) symbols via their corresponding typeinfo (_ZTI) symbols that are used natively or outside of our summary and exclude them from WPD. Testing: ninja check-all large Meta service that uses boost, glog and libstdc++.so runs successfully with WPD via --lto-whole-program-visibility. Previously, native types in boost caused incorrect devirtualization that led to crashes. Reviewed By: MaskRay, tejohnson Differential Revision: https://reviews.llvm.org/D155659	2023-09-18 15:51:49 -07:00
Fangrui Song	b4d4146db3	[WholeProgramDevirt] Use llvm:: qualifier to implement declared functions. NFC	2023-09-17 19:31:42 -07:00
Kazu Hirata	6da470d7f8	[llvm] Use range-based for loops (NFC)	2023-09-02 09:32:45 -07:00
Bjorn Pettersson	4ce7c4a92a	[llvm] Drop some typed pointer handling/bitcasts Differential Revision: https://reviews.llvm.org/D157016	2023-08-03 22:54:33 +02:00
Bjorn Pettersson	fd05c34b18	Stop using legacy helpers indicating typed pointer types. NFC Since we no longer support typed LLVM IR pointer types, the code can be simplified into for example using PointerType::get directly instead of using Type::getInt8PtrTy and Type::getInt32PtrTy etc. Differential Revision: https://reviews.llvm.org/D156733	2023-08-02 12:08:37 +02:00
Arnold Schwaighofer	98eb8abff6	Add a type_checked_load_relative to support relative function pointer tables This adds a type_checked_load_relative intrinsic whose semantics it is to load a relative function pointer. A relative function pointer is a pointer to a 32bit value that when added to its address yields the address of the function. Differential Revision: https://reviews.llvm.org/D143204	2023-06-29 08:33:45 -07:00
Arnold Schwaighofer	200a1ccea0	WholeProgramDevirt: Fix call target propagation for ptrauth architectures We can't have a call with a constant target with a ptrauth bundle. Remove the ptrauth bundle operand in such a case rdar://105696396 Differential Revision: https://reviews.llvm.org/D144581	2023-06-29 08:02:58 -07:00
Youngsuk Kim	243f0566dc	[llvm] Replace uses of Type::getPointerTo (NFC) Partial progress towards removing in-tree uses of `Type::getPointerTo`, before we can deprecate the API. If the API is used solely to support an unnecessary bitcast, get rid of the bitcast as well. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153933	2023-06-28 09:21:34 -04:00
Bjorn Pettersson	a20f7efbc5	Remove several no longer needed includes. NFCI Mostly removing includes of InitializePasses.h and Pass.h in passes that no longer has support for the legacy PM.	2023-04-17 13:54:19 +02:00
Leonard Chan	474f5efebe	Revert "[llvm] Teach whole program devirtualization about relative vtables" This reverts commit db288184765c0b4010060ebea1f6de3ac1f66445. Reverting since it broke our lto builders reported by fxbug.dev/123807.	2023-03-26 01:53:13 +00:00
Leonard Chan	53a9175951	[llvm] Handle duplicate call bases when applying branch funneling It's possible to segfault in `DevirtModule::applyICallBranchFunnel` when attempting to call `getCaller` on a call base that was erased in a prior iteration. This can occur when attempting to find devirtualizable calls via `findDevirtualizableCallsForTypeTest` if the vtable passed to llvm.type.test is a global and not a local. The function works by taking the first argument of the llvm.type.test call (which is a vtable), iterating through all uses of it, and adding any relevant all uses that are calls associated with that intrinsic call to a vector. For most cases where the vtable is actually a local, this wouldn't be an issue. Take for example: ``` define i32 @fn(ptr %obj) #0 { %vtable = load ptr, ptr %obj %p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr %vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } ``` `findDevirtualizableCallsForTypeTest` will check the call base ` %result = call i32 %fptr(ptr %obj, i32 1)`, find that it is associated with a virtualizable call from `%vtable`, find all loads for `%vtable`, and add any instances those load results are called into a vector. Now consider the case where instead `%vtable` was the global itself rather than a local: ``` define i32 @fn(ptr %obj) #0 { %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr @vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } ``` `findDevirtualizableCallsForTypeTest` should work normally and add one unique call instance to a vector. However, if there are multiple instances where this same global is used for llvm.type.test, like with: ``` define i32 @fn(ptr %obj) #0 { %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr @vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } define i32 @fn2(ptr %obj) #0 { %p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2") call void @llvm.assume(i1 %p) %fptr = load ptr, ptr @vtable %result = call i32 %fptr(ptr %obj, i32 1) ret i32 %result } ``` Then each call base `%result = call i32 %fptr(ptr %obj, i32 1)` will be added to the vector twice. This is because for either call base `%result = call i32 %fptr(ptr %obj, i32 1) `, we determine it is associated with a virtualizable call from `@vtable`, and then we iterate through all the uses of `@vtable`, which is used across multiple functions. So when scanning the first `%result = call i32 %fptr(ptr %obj, i32 1)`, then both call bases will be added to the vector, but when scanning the second one, both call bases are added again, resulting in duplicate call bases in the CSInfo.CallSites vector. Note this is actually accounted for in every other instance WPD iterates over CallSites. What everything else does is actually add the call base to the `OptimizedCalls` set and just check if it's already in the set. We can't reuse that particular set since it serves a different purpose marking which calls where devirtualized which `applyICallBranchFunnel` explicitly says it doesn't. For this fix, we can just account for duplicates with a map and do the actual replacements afterwards by iterating over the map. Differential Revision: https://reviews.llvm.org/D146267	2023-03-23 21:44:59 +00:00
Arthur Eubanks	6a3fdcdd38	[WPD] Fix PreservedAnalyses value after runForTesting()	2023-03-15 11:43:24 -07:00
Kazu Hirata	c8f9555c4d	[Transforms] Use *{Set,Map}::contains (NFC)	2023-03-14 00:24:30 -07:00
Leonard Chan	db28818476	[llvm] Teach whole program devirtualization about relative vtables Prior to this patch, WPD was not acting on relative-vtables in C++. This involves teaching WPD about these things: - llvm.load.relative which is how relative-vtables are indexed (instead of GEP) - dso_local_equivalent which is used in the vtable itself when taking the offset between a virtual function and vtable - Update llvm/test/ThinLTO/X86/devirt.ll to use opaque pointers and add equivalent tests for RV Differential Revision: https://reviews.llvm.org/D134320	2023-02-23 22:18:43 +00:00
Teresa Johnson	c1b3e88844	[LTO/WPD] Allow devirtualization to function alias in vtable Follow on to D144209 to support single implementation devirtualization for Regular LTO when the vtable holds a function alias. For now I have prevented other optimizations performed in regular LTO that need to analyze the contents of the function target when the vtable holds an alias, as I'm not sure they are always correct to perform in that case. Differential Revision: https://reviews.llvm.org/D144270	2023-02-23 14:04:05 -08:00
Teresa Johnson	8045ba8948	[ThinLTO/WPD] Handle function alias in vtable correctly We were not summarizing a function alias in the vtable, leading to incorrect WPD in some cases, and missing WPD in others. Specifically, we would end up ignoring function aliases as they aren't summarized, so we could incorrectly devirtualize if there was a single other non-alias function in a compatible vtable. And if there was only one implementation, but it was an alias, we would not be able to identify and perform the single implementation devirtualization. Handling the alias summary correctly also required fixing the handling in mustBeUnreachableFunction, so that it is not incorrectly ignored. Regular LTO is conservatively correct because it will skip devirtualizing when any pointer within a vtable is not a function. However, it needs additional work to be able to take advantage of function alias within the vtable that is in fact the only implementation. For that reason, the Regular LTO testing in the second test case is currently disabled, and will be enabled along with a follow on enhancement fix for Regular LTO WPD. Differential Revision: https://reviews.llvm.org/D144209	2023-02-16 18:20:12 -08:00
Vasileios Porpodas	823186b14d	Recommit: [NFC][IR] Make Module::getGlobalList() private This reverts commit cb5f239363a3c94db5425c105fcd45e77d2a16a9.	2023-02-14 15:12:51 -08:00
Vasileios Porpodas	cb5f239363	Revert "[NFC][IR] Make Module::getGlobalList() private" This reverts commit ed3e3ee9e30dfbffd2170a770a49b36a7f444916.	2023-02-14 14:29:42 -08:00
Vasileios Porpodas	ed3e3ee9e3	[NFC][IR] Make Module::getGlobalList() private This patch adds several missing GlobalList modifier functions, like removeGlobalVariable(), eraseGlobalVariable() and insertGlobalVariable(). There is no longer need to access the list directly so it also makes getGlobalList() private. Differential Revision: https://reviews.llvm.org/D144027	2023-02-14 14:25:10 -08:00
Archibald Elliott	62c7f035b4	[NFC][TargetParser] Remove llvm/ADT/Triple.h I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.	2023-02-07 12:39:46 +00:00
Kazu Hirata	55e2cd1609	Use llvm::count{lr}_{zero,one} (NFC)	2023-01-28 12:41:20 -08:00
Kazu Hirata	83d56fb17a	Drop the ZeroBehavior parameter from countLeadingZeros and the like (NFC) This patch drops the ZeroBehavior parameter from bit counting functions like countLeadingZeros. ZeroBehavior specifies the behavior when the input to count{Leading,Trailing}Zeros is zero and when the input to count{Leading,Trailing}Ones is all ones. ZeroBehavior was first introduced on May 24, 2013 in commit eb91eac9fb866ab1243366d2e238b9961895612d. While that patch did not state the intention, I would guess ZeroBehavior was for performance reasons. The x86 machines around that time required a conditional branch to implement countLeadingZero<uint32_t> that returns the 32 on zero: test edi, edi je .LBB0_2 bsr eax, edi xor eax, 31 .LBB1_2: mov eax, 32 That is, we can remove the conditional branch if we don't care about the behavior on zero. IIUC, Intel's Haswell architecture, launched on June 4, 2013, introduced several bit manipulation instructions, including lzcnt and tzcnt, which eliminated the need for the conditional branch. I think it's time to retire ZeroBehavior as its utility is very limited. If you care about compilation speed, you should build LLVM with an appropriate -march= to take advantage of lzcnt and tzcnt. Even if not, modern host compilers should be able to optimize away quite a few conditional branches because the input is often known to be nonzero from dominating conditional branches. Differential Revision: https://reviews.llvm.org/D141798	2023-01-18 19:58:44 -08:00
Vasileios Porpodas	adfb23c607	[NFC] Cleanup: Remove Function::getBasicBlockList() when not required. This is part of a series of patches that aim at making Function::getBasicBlockList() private. Differential Revision: https://reviews.llvm.org/D139910	2022-12-13 11:46:29 -08:00
Kazu Hirata	343de6856e	[Transforms] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 21:11:37 -08:00
Teresa Johnson	c2cf93c1a9	[WPD/LTT] Lower type test feeding assumes via phi correctly This fixes https://github.com/llvm/llvm-project/issues/57616. Type test lowering in ThinLTO modules relies on having type id summaries set up for the referenced types, which provide the type test resolution. If there is no summary, the type tests are lowered to false. At the very least, a default type id summary gives the type tests a resolution of Unknown, which is handled correctly (ignored by the first invocation of LTT, and lowered to true by the second). WPD sets up the type id summaries (with a default type test resolution) as it is processing the type tests, but only does this for the patterns handled by WPD, which is a type test directly feeding an assume. In the case of type tests feeding an assume via a phi, the type id summary was not being set up, leading to the type tests being lowered to false incorrectly. Fix this by adding the default type id summary entries for all type ids used on globals during index-only WPD. This is not an issue for hybrid (split-lto-unit) LTO, as in that case the type test resolution is determined and set up during LTT, since the type definitions are in the regular LTO split module, and exported via the summary to the ThinLTO split module. Differential Revision: https://reviews.llvm.org/D134012	2022-09-16 13:50:01 -07:00
Nikita Popov	b1cd393f9e	[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI) Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can access (argmem, inaccessiblemem and other). This patch changes it to track ModRef information per-location instead. To give two examples of why this is useful: * D117095 highlights a weakness of ModRef modelling in the presence of operand bundles. For a memcpy call with deopt operand bundle, we want to say that it can read any memory, but only write argument memory. This would allow them to be treated like any other calls. However, we currently can't express this and have to say that it can read or write any memory. * D127383 would ideally be modelled as a separate threadid location, where threadid Refs outside pre-split coroutines can be ignored (like other accesses to constant memory). The current representation does not allow modelling this precisely. The patch as implemented is intended to be NFC, but there are some obvious opportunities for improvements and simplification. To fully capitalize on this we would also want to change the way we represent memory attributes on functions, but that's a larger change, and I think it makes sense to separate out the FunctionModRefBehavior refactoring. Differential Revision: https://reviews.llvm.org/D130896	2022-09-14 16:34:41 +02:00
Kazu Hirata	56ea4f9bd3	[Transforms] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-08-27 21:21:02 -07:00
Kazu Hirata	50724716cd	[Transforms] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-08-14 12:51:58 -07:00
Arthur Eubanks	2eade1dba4	[WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`. Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`. To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D128955	2022-07-26 08:01:08 -07:00
Fangrui Song	0e3447bf8a	[LegacyPM] Remove WholeProgramDevirt Unused after LTO removal from legacy optimization passline.	2022-07-17 23:14:53 -07:00
Nuno Lopes	53dc0f1078	[NFC] Switch a few uses of undef to poison as placeholders for unreachble code	2022-07-03 14:34:03 +01:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Fangrui Song	d0d1c416cb	Remove unneeded cl::ZeroOrMore for cl::list options	2022-06-04 23:51:13 -07:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00

1 2 3 4 5

205 Commits