llvm-project

Author	SHA1	Message	Date
Fangrui Song	a24418375a	[CodeLayout] cache-directed sort: limit max chain size (#69039) When linking an executable with a slightly larger executable, ld.lld --call-graph-profile-sort=cdsort can be very slow (see #68638). ``` 4.6% 20.7Mi .text.hot 3.5% 15.9Mi .text 3.4% 15.2Mi .text.unknown ``` Add cl option `cdsort-max-chain-size`, which is similar to `ext-tsp-max-chain-size`, and set it to 128, to improve performance. In `ld.lld @response.txt --threads=4 --call-graph-profile-sort=cdsort --time-trace` builds, the "Total Sort sections" time is measured as follows: * -mllvm -cdsort-max-chain-size=64: 1.321813 * -mllvm -cdsort-max-chain-size=128: 2.030425 * -mllvm -cdsort-max-chain-size=256: 2.927684 * -mllvm -cdsort-max-chain-size=512: 5.493106 * unlimited: 9 minutes The rest takes 6.8s.	2023-10-22 16:50:03 -07:00
Dominik Adamski	eee8dd9088	[CodeExtractor] Allow to use 0 addr space for aggregate arg (#66998) The user of CodeExtractor should be able to specify that the aggregate argument should be passed as a pointer in zero address space. CodeExtractor is used to generate outlined functions required by OpenMP runtime. The arguments of the outlined functions for OpenMP GPU code are in 0 address space. 0 address space does not need to be the default address space for a GPU device. That's why there is a need to allow the user of CodeExtractor to specify that the allocated aggregate parameter is passed as a pointer in zero address space.	2023-10-18 20:12:31 +02:00
Matthias Braun	5181156b37	Use BlockFrequency type in more places (NFC) (#68266) The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it more consistently in various APIs and disable implicit conversion to make usage more consistent and explicit. - Use `BlockFrequency Freq` parameter for `setBlockFreq`, `getProfileCountFromFreq` and `setBlockFreqAndScale` functions. - Return `BlockFrequency` in `getEntryFreq()` functions. - While at it, change some `const BlockFrequency& Freq` parameters to plain `BlockFrequency Freq`. - Mark `BlockFrequency(uint64_t)` constructor as explicit. - Add missing `BlockFrequency::operator!=`. - Remove `uint64_t BlockFrequency::getMaxFrequency()`. - Add `BlockFrequency BlockFrequency::max()` function.	2023-10-05 11:40:17 -07:00
Kazu Hirata	3b34c117db	[llvm] Remove unused using decls (NFC) Identified with misc-unused-using-decls.	2023-10-03 23:21:50 -07:00
Hans Wennborg	eee1f7cef8	Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" This caused asserts: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed. See comment on the code review for reproducer. > RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 > > Similar to imported declarations, the patch tracks function-local types in > DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with > the aforementioned metadata change and provided a support of function-local > types scoped within a lexical block. > > The patch assumes that DICompileUnit's 'enums field' no longer tracks local > types and DwarfDebug would assert if any locally-scoped types get placed there. > > Reviewed By: jmmartinez > > Differential Revision: https://reviews.llvm.org/D144006 This reverts commit f8aab289b5549086062588fba627b0e4d3a5ab15.	2023-09-29 14:23:31 +02:00
Fangrui Song	e705b37a77	[CodeLayout] Add unittest for cache-directed sort The function reordering algorithm added by https://reviews.llvm.org/D152834 and used by BOLT (https://reviews.llvm.org/D153039) is untested. Add some tests at the appropriate layer. Depends on D159526 Differential Revision: https://reviews.llvm.org/D159527	2023-09-27 10:52:12 -07:00
Vladislav Dzhidzhoev	f8aab289b5	[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7) RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 Similar to imported declarations, the patch tracks function-local types in DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with the aforementioned metadata change and provided a support of function-local types scoped within a lexical block. The patch assumes that DICompileUnit's 'enums field' no longer tracks local types and DwarfDebug would assert if any locally-scoped types get placed there. Reviewed By: jmmartinez Differential Revision: https://reviews.llvm.org/D144006	2023-09-26 23:07:29 +04:00
Florian Hahn	4b0df112da	[VPlan] Fix invalid IR in unit test input, run verifier. Some tests were passing invalid IR to the VPlan construction logic. Fix the invalid IR and run the verifier on the input to avoid issues in the future.	2023-09-22 21:12:09 +01:00
Nikita Popov	2d8d622c73	[SCEV] Require that addrec operands dominate the loop SCEVExpander currently has special handling for the case where the start or the step of an addrec do not dominate the loop header, which is not used by any lit test. Initially I thought that this is entirely dead code, because addrec operands are required to be loop invariant. However, SCEV currently allows creating an addrec with operands that are loop invariant but defined after the loop. This doesn't seem like a useful case to allow, and we don't appear to be using this outside a single easy to adjust unit test.	2023-09-22 09:02:54 +02:00
Bjorn Pettersson	4d5906e0bf	[llvm][unittests] Remove unneeded header includes	2023-09-12 18:47:44 +02:00
Mel Chen	26aed5b9a8	[VPlan][LoopUtils] Remove unused parameter TTI This patch removes the member TTI from VPReductionRecipe, as the generation of reduction operations no longer requires TTI. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D158148	2023-09-04 05:30:37 -07:00
Mel Chen	463e7cb892	[LV][VPlan] Refactor VPReductionRecipe to use reference for member RdxDesc This commit refactors the implementation of VPReductionRecipe to use reference instead of pointer for member RdxDesc. Because the member RdxDesc in VPReductionRecipe should not be a nullptr, using a reference will provide clearer semantics. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D158058	2023-08-16 19:37:49 -07:00
Bjorn Pettersson	e53b28c833	[llvm] Drop some bitcasts and references related to typed pointers Differential Revision: https://reviews.llvm.org/D157551	2023-08-10 15:07:07 +02:00
Alexandros Lamprineas	d1b376fd7b	[FuncSpec] Rework the discardment logic for unprofitable specializations. Currently we make an arbitrary comparison between codesize and latency in order to decide whether to keep a specialization or not. Sometimes the latency savings are biased in favor of loops because of imprecise block frequencies, therefore this metric contains a lot of noise. This patch tries to address the problem as follows: * Reject specializations whose codesize savings are less than X% of the original function size. * Reject specializations whose latency savings are less than Y% of the original function size. * Reject specializations whose inlining bonus is less than Z% of the original function size. I am not saying this is super precise, but at least X, Y and Z are configurable, allowing us to tweak the cost model. Moreover, it lets us prioritize codesize over latency, which is a less noisy metric. I am also increasing the minimum size a function should have to be considered a candidate for specialization. Initially the cost of a function was calculated as CodeMetrics::NumInsts * InlineConstants::getInstrCost() which later in D150464 was altered into CodeMetrics::NumInsts since the metric is supposed to model TargetTransformInfo::TCK_CodeSize. However, we omitted adjusting MinFunctionSize in that commit. Differential Revision: https://reviews.llvm.org/D157123	2023-08-09 10:28:46 +01:00
Florian Hahn	93c5bae00e	[VPlan] Use printOperands for VPInstruction. Use the printOperands for printing VPInstruction's operands to be more in line with other recipes and ensure consistent printing after D15719. Also removes some stray spaces in print output.	2023-08-08 11:31:21 +01:00
Alexandros Lamprineas	c2d19002ae	[FuncSpec] Estimate dead blocks more accurately. Currently we only consider basic blocks with a unique predecessor when estimating the size of dead code. However, we could expand this to consider blocks with a back-edge, or blocks preceded by dead blocks. Differential Revision: https://reviews.llvm.org/D156903	2023-08-07 11:04:23 +01:00
Alexandros Lamprineas	5bfefff1c4	Reland [FuncSpec] Split the specialization bonus into CodeSize and Latency. Currently we use a combined metric TargetTransformInfo::TCK_SizeAndLatency when estimating the specialization bonus. This is suboptimal, and in some cases erroneous. For example we shouldn't be weighting the codesize decrease attributed to constant propagation by the block frequency of the dead code. Instead only the latency savings should be weighted by block frequency. The total codesize savings from all the specialization arguments should be deducted from the specialization cost. Differential Revision: https://reviews.llvm.org/D155103	2023-08-02 12:41:13 +01:00
Bjorn Pettersson	fd05c34b18	Stop using legacy helpers indicating typed pointer types. NFC Since we no longer support typed LLVM IR pointer types, the code can be simplified into for example using PointerType::get directly instead of using Type::getInt8PtrTy and Type::getInt32PtrTy etc. Differential Revision: https://reviews.llvm.org/D156733	2023-08-02 12:08:37 +02:00
Alexandros Lamprineas	893d3a61c0	Reland [FuncSpec] Add Phi nodes to the InstCostVisitor. This patch allows constant folding of PHIs when estimating the user bonus. Phi nodes are a special case since some of their inputs may remain unresolved until all the specialization arguments have been processed by the InstCostVisitor. Therefore, we keep a list of dead basic blocks and then lazily visit the Phi nodes once the user bonus has been computed for all the specialization arguments. Differential Revision: https://reviews.llvm.org/D154852	2023-07-31 08:25:48 +01:00
Douglas Yung	32683b231e	Revert "[FuncSpec] Add Phi nodes to the InstCostVisitor." This reverts commit 96ff464dd3aac255adc52787a1e28487a9cd4c35. The test in this change was failing on many buildbots: https://lab.llvm.org/buildbot/#/builders/164/builds/41292 https://lab.llvm.org/buildbot/#/builders/258/builds/4491 https://lab.llvm.org/buildbot/#/builders/192/builds/3566 https://lab.llvm.org/buildbot/#/builders/123/builds/20411 https://lab.llvm.org/buildbot/#/builders/58/builds/42553 https://lab.llvm.org/buildbot/#/builders/247/builds/7037 https://lab.llvm.org/buildbot/#/builders/139/builds/46259 https://lab.llvm.org/buildbot/#/builders/216/builds/24650 https://lab.llvm.org/buildbot/#/builders/234/builds/12571 https://lab.llvm.org/buildbot/#/builders/232/builds/12574 https://lab.llvm.org/buildbot/#/builders/235/builds/975	2023-07-27 13:47:52 -07:00
Alexandros Lamprineas	96ff464dd3	[FuncSpec] Add Phi nodes to the InstCostVisitor. This patch allows constant folding of PHIs when estimating the user bonus. Phi nodes are a special case since some of their inputs may remain unresolved until all the specialization arguments have been processed by the InstCostVisitor. Therefore, we keep a list of dead basic blocks and then lazily visit the Phi nodes once the user bonus has been computed for all the specialization arguments. In addition to the last revision this one fixes the bug reported on Phabricator. Differential Revision: https://reviews.llvm.org/D154852	2023-07-27 19:24:11 +01:00
Alexandros Lamprineas	2e00eba232	[FuncSpec][NFC] Remove SSA copy intrinsics in the unittests. Those are added by the SCCP Solver before invoking the Specializer. They need to be removed otherwise the destructor of PredicateInfo complains. Differential Revision: https://reviews.llvm.org/D156365	2023-07-27 08:37:33 +01:00
Alexandros Lamprineas	c52ab9ea2f	Revert "[FuncSpec] Add Phi nodes to the InstCostVisitor." Reverting due to the crash reported in D154852. Also reverting the subsequent commit as collateral damage: "[FuncSpec] Split the specialization bonus into CodeSize and Latency."	2023-07-26 12:33:41 +01:00
Alexandros Lamprineas	20c8f58c11	[FuncSpec] Split the specialization bonus into CodeSize and Latency. Currently we use a combined metric TargetTransformInfo::TCK_SizeAndLatency when estimating the specialization bonus. This is suboptimal, and in some cases erroneous. For example we shouldn't be weighting the codesize decrease attributed to constant propagation by the block frequency of the dead code. Instead only the latency savings should be weighted by block frequency. The total codesize savings from all the specialization arguments should be deducted from the specialization cost. Differential Revision: https://reviews.llvm.org/D155103	2023-07-26 12:03:46 +01:00
Alexandros Lamprineas	03f1d09fe4	[FuncSpec] Add Phi nodes to the InstCostVisitor. This patch allows constant folding of PHIs when estimating the user bonus. Phi nodes are a special case since some of their inputs may remain unresolved until all the specialization arguments have been processed by the InstCostVisitor. Therefore, we keep a list of dead basic blocks and then lazily visit the Phi nodes once the user bonus has been computed for all the specialization arguments. Differential Revision: https://reviews.llvm.org/D154852	2023-07-25 11:00:20 +01:00
Alexandros Lamprineas	1d0476cb4d	[FuncSpec] Prefer DataLayout-aware constant folding of GEPs. As shown in D154820, the DataLayout-independent constant folding interface is not good enough for handling GEPs. Instead we should be using the DataLayout-aware constant folding interface. Since there isn't a method to specifically handle GEPs we can use the one which folds generic instruction operands. Differential Revision: https://reviews.llvm.org/D154821	2023-07-11 13:24:26 +01:00
Alexandros Lamprineas	cae00b2a9b	[FuncSpec][NFC] Improve the unittest coverage for constant folding of GEPs. The InstCostVisitor is currently using the DataLayout-independent constant folding interface. This is a workaround since we can't directly call ConstantExpr::getGetElementPtr due to deprecation. This patch shows that the constant folding interface we are using is not good enough. Differential Revision: https://reviews.llvm.org/D154820	2023-07-11 13:24:12 +01:00
Johannes Doerfert	e9fc399db3	[Attributor][NFCI] Use pointers to pass around AAs This will make it easier to create less trivial AAs in the future as we can simply return `nullptr` rather than an AA in an invalid state.	2023-06-23 17:21:20 -07:00
Alexandros Lamprineas	5400257ded	[FuncSpec] Add Freeze and CallBase to the InstCostVisitor. Allows constant folding of such instructions when estimating user bonus. Differential Revision: https://reviews.llvm.org/D153036	2023-06-19 10:53:08 +01:00
Alexandros Lamprineas	f11d8c88dd	[FuncSpec][NFC] Improve the unittest coverage. The specialization bonus is zero in some unittests because the basic blocks containing the users of the constant arguments are executed less frequently than the entry block. Sinking them into loops solves that. Differential Revision: https://reviews.llvm.org/D153230	2023-06-19 09:43:26 +01:00
Arthur Eubanks	3e39cfe5b4	Revert "Revert "InstSimplify: Require instruction be parented"" This reverts commit 0c03f48480f69b854f86d31235425b5cb71ac921. Going to fix forward size regression instead due to more dependent patches needing to be reverted otherwise.	2023-06-16 13:53:31 -07:00
Arthur Eubanks	0c03f48480	Revert "InstSimplify: Require instruction be parented" This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b. Causes large binary size regressions, see comments on https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b.	2023-06-16 11:24:29 -07:00
Alan Zhao	d6b4f6786b	Revert "Revert "InstSimplify: Require instruction be parented"" This reverts commit 00264eac4d0938ae8a0826da38e4777be269124c. Reason: caused a bunch of bots to break	2023-06-16 10:58:54 -07:00
Alan Zhao	00264eac4d	Revert "InstSimplify: Require instruction be parented" This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b. Reason: causes a regression in the inliner (see https://crbug.com/1454531 and https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b#1217141)	2023-06-16 10:36:49 -07:00
Alexandros Lamprineas	4d13896d8a	Reland "[FuncSpec] Improve the accuracy of the cost model" Instead of blindly traversing the use-def chain of constant arguments, compute known constants along the way. Stop as soon as a user cannot be replaced by a constant. Keep it light-weight by handling some basic instruction types. Differential Revision: https://reviews.llvm.org/D150464	2023-06-08 17:44:48 +01:00
Jim Lin	d4c5b45293	[NFC] Remove unneeded semicolon after function definition	2023-06-07 09:29:49 +08:00
Matt Arsenault	1536e299e6	InstSimplify: Require instruction be parented Unlike every other analysis and transform, simplifyInstruction permitted operating on instructions which are not inserted into a function. This created an edge case no other code needs to really worry about, and limited transforms in cases that can make use of the context function. Only the inliner and a handful of other utilities were making use of this, so just fix up these edge cases. Results in some IR ordering differences since cloned blocks are inserted eagerly now. Plus some additional simplifications trigger (e.g. some add 0s now folded out that previously didn't).	2023-06-02 18:14:28 -04:00
Nikita Popov	96a14f388b	Revert "[FuncSpec] Replace LoopInfo with BlockFrequencyInfo" As reported on https://reviews.llvm.org/D150375#4367861 and following, this change causes PDT invalidation issues. Revert it and dependent commits. This reverts commit 0524534d5220da5ecb2cd424a46520184d2be366. This reverts commit ced90d1ff64a89a13479a37a3b17a411a3259f9f. This reverts commit 9f992cc9350a7f7072a6dbf018ea07142ea7a7ed. This reverts commit 1b1232047e83b69561fd64b9547cb0a0d374473a.	2023-05-30 14:49:03 +02:00
Alexandros Lamprineas	0524534d52	[FuncSpec] Enable specialization of literal constants. To do so we have to tweak the cost model such that specialization does not trigger excessively. Differential Revision: https://reviews.llvm.org/D150649	2023-05-25 09:55:46 +01:00
Alexandros Lamprineas	ced90d1ff6	[FuncSpec] Improve the accuracy of the cost model. Instead of blindly traversing the use-def chain of constant arguments, compute known constants along the way. Stop as soon as a user cannot be replaced by a constant. Keep it light-weight by handling some basic instruction types. Differential Revision: https://reviews.llvm.org/D150464	2023-05-24 11:40:12 +01:00
Matt Arsenault	7d7c82ad84	Convert unit test to opaque pointers	2023-05-24 08:49:03 +01:00
Florian Hahn	701f7230cd	[VPlan] Use VPRecipeWithIRFlags for VPReplicateRecipe, retire poison map Update VPReplicateRecipe to use VPRecipeWithIRFlags for IR flag handling. Retire separate MayGeneratePoisonRecipes map. Depends on D149082. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150027	2023-05-15 11:49:20 +01:00
Florian Hahn	c2bef381fa	[VPlan] Remove setEntry to avoid leaks when replacing entry. Update the HCFG builder to directly connect the created CFG to the existing Plan's entry. This allows removing `setEntry`, which can cause leaks when the existing entry is replaced. Should fix https://lab.llvm.org/buildbot/#/builders/5/builds/33455/steps/13/logs/stdio	2023-05-04 19:12:02 +01:00
Florian Hahn	b85a402dd8	[VPlan] Introduce new entry block to VPlan for early SCEV expansion. This patch adds a new preheader block to the VPlan to place SCEV expansions like the trip count. This preheader block is disconnected at the moment, as the bypass blocks of the skeleton are not yet modeled in VPlan. The preheader block is executed before skeleton creation, so the SCEV expansion results can be used during skeleton creation. At the moment, the trip count expression and induction steps are expanded in the new preheader. The remainder of SCEV expansions will be moved gradually in the future. D147965 will update skeleton creation to use the steps expanded in the preheader to fix #58811. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147964	2023-05-04 14:00:13 +01:00
Florian Hahn	6303fa369c	[VPlan] Remove DeadInsts arg from VPInstructionsToVPRecipes (NFC) The argument isn't used. VPlan-based dead recipe removal can be used instead.	2023-05-01 15:03:29 +01:00
Florian Hahn	2c9d21a2a3	[VPlan] Turn Plan entry node into VPBasicBlock (NFCI). The entry to the plan is the preheader of the vector loop and guaranteed to be a VPBasicBlock. Make sure this is the case by adjusting the type. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149005	2023-04-28 12:29:06 +01:00
OCHyams	3feea34d77	[DebugInfo] Do not delete debug intrinsics with empty metadata operands A ValueAsMetadata may be replaced with nullptr for several reasons including deleting (certain) values and value remapping a use-before-def. In the case of a MetadataAsValue user, handleChangedOperand intercepts and replaces the metadata with an empty tuple (!{}). At the moment, an empty metadata operand in a debug intrinsic signals that it can be deleted. Given that we end up with empty metadata operands in circumstances where the Value has been "lost", the current behaviour can lead to incorrect variable locations. Instead, we should treat empty metadata as meaning "there is no location for the variable" (the same as we currently treat undef operands). This patch removes the deletion logic from wouldInstructionBeTriviallyDead. Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D140901	2023-04-26 09:58:31 +01:00
Bing1 Yu	c2f29f24c2	[ValueMapper] allow mapping ConstantTargetNone to its layout type zeroinitializer is allowed for spirv TargetExtType. This PR allows ValueMapper to map TargetExtType ConstantTargetNone to '0' constant of its layout type. Reviewed By: jcranmer-intel Differential Revision: https://reviews.llvm.org/D148774	2023-04-25 15:48:28 +08:00
OCHyams	ea60ffc6d1	[NFC] Return unique dbg intrinsics from findDbgValues and findDbgUsers The out-param vector from findDbgValues and findDbgUsers should not include duplicates, which is possible if the debug intrinsic uses the value multiple times. This filter is already in place for multiple uses in a `DIArgList`; extend it to cover dbg.assigns too because a Value may be used in both the address and value components. Additionally, refactor the duplicated functionality between findDbgValues and findDbgUsers into a new function findDbgIntrinsics. Reviewed By: jmorse, StephenTozer Differential Revision: https://reviews.llvm.org/D148788	2023-04-20 14:18:46 +01:00
Florian Hahn	ff0ec4f42e	Recommit "[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI)." This reverts the revert commit 8c2276f89887d0a27298a1bbbd2181fa54bbb509. The updated patch re-orders the getDefiningRecipe check in getVPValue to avoid a use-after-free. Original commit message: Before this patch, a VPlan contained 2 mappings for Value -> VPValue: 1) Value2VPValue and 2) VPExternalDefs. This duplication is unnecessary and there are already cases where external defs are added to Value2VPValue. This patch replaces all uses of VPExternalDefs with Value2VPValue. It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue) and addVPValue (to addExternalVPValue). At the moment, this is NFC, but will enable additional simplifications in D147783. Depends on D147891. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147892	2023-04-18 10:29:31 +01:00

679 Commits