In this PR, the static-data-splitter pass identifies local-linkage global
variables in the {`.rodata`, `.data.rel.ro`, `.bss`, `.data`} sections by
analyzing machine instruction operands, and aggregates their accesses
from code across functions.
A follow-up item is to analyze global variable initializers and count
accesses from data.
* This limitation is demonstrated by `bss2` and `data3` in
`llvm/test/CodeGen/X86/global-variable-partition.ll`.
Some stats of static-data-splitter with this patch:
**section**|**bss**|**rodata**|**data**
:-----:|:-----:|:-----:|:-----:
hot-prefixed section coverage|99.75%|97.71%|91.30%
unlikely-prefixed section size percentage|67.94%|39.37%|63.10%
1. The coverage is defined as `#perf-samples in the hot-prefixed <data>
section / #perf-samples in all <data>.* sections`, for each `<data>` section.
* The perf command samples
`MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2` events at a high
frequency (`perf -c 2251`) for 30 seconds. The profiled binary is built
as non-PIE, so `.data.rel.ro` coverage data is not available.
2. The unlikely-prefixed `<data>` section size percentage is defined as
`unlikely <data> section size / the total size of <data>.* sections`, for
each `<data>` section.
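For illustration, with made-up numbers: if `.bss.hot` receives 9,975 of the
10,000 samples that land in `.bss.*` sections, the `.bss` coverage is 99.75%;
if the unlikely-prefixed `.bss` section takes up 6,794 of 10,000 total
`.bss.*` bytes, its size percentage is 67.94%.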
This is meant as preparation for PR #130988, "[AMDGPU] Implement IR
expansion for frem instruction", which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given that change, and quite reasonable even without it.
After we fall back from GlobalISel to SDAG, the verifier gets called; it
calls `getReservedRegs`, which uses `SIMachineFunctionInfo::usesAGPRs`,
which caches the result in `UsesAGPRs`. Because we have just fallen back,
the function is empty and the result incorrectly gets cached as false.
This patch makes sure we don't try to run the verifier while the function
is empty.
https://github.com/llvm/llvm-project/pull/122183 adds a codegen pass to
infer a machine jump table entry's hotness from the MBB hotness. This is
a follow-up PR to produce `.hot` and/or `.unlikely` section prefixes for
jump tables' (read-only) data sections in the relocatable `.o` files.
When this patch is enabled, the linker will see {`.rodata`, `.rodata.hot`,
`.rodata.unlikely`} among the input sections. It can map `.rodata.hot` and
`.rodata` in the input sections to `.rodata.hot` in the executable, and
map `.rodata.unlikely` into `.rodata`, either with a pending extension to
`--keep-text-section-prefix` like
059e7cbb66,
or with a linker script.
1. To partition hot and cold jump tables, the AsmPrinter pass slices a function's jump table indices into two groups, one for hot and the other for cold jump tables (see the sketch after this list). It then emits hot jump tables into a `.hot`-prefixed data section and cold ones into a `.unlikely`-prefixed data section, retaining the relative order of `LJT<N>` labels within each group.
2. [ELF only] To have data sections with _dynamic_ names (e.g., `.rodata.hot[.func]`), we implement
a `TargetLoweringObjectFile::getSectionForJumpTable` method that accepts a `MachineJumpTableEntry` parameter, and update `selectELFSectionForGlobal` to generate the `.hot` or `.unlikely` prefix based on the
MJTE's hotness.
- The dynamic JT section name doesn't depend on `-ffunction-section=true` or `-funique-section-names=true`, even though it leverages a similar underlying mechanism to `-ffunction-section` for creating an MCSection with an on-demand name.
3. The new code path is off by default.
- Typically, `TargetOptions` conveys clang or LLVM tools' options to code generation passes. Following that pattern, this patch adds an `EnableStaticDataPartitioning` bit in `TargetOptions` and makes it
readable through `TargetMachine`.
- To enable the new code path in tools like `llc`, a `partition-static-data-sections` option is introduced in
`CodeGen/CommandFlags.h/cpp`.
- A subsequent patch
([draft](8f36a13743)) will add a clang option to enable the new code path.
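Below is a minimal standalone sketch of the hot/cold slicing in step 1. The types and the helper are illustrative only, not the actual AsmPrinter code:

```c++
#include <utility>
#include <vector>

enum class Hotness { Hot, Cold };

// Slice a function's jump table indices into a hot group and a cold group,
// retaining the relative order of indices (and hence LJT<N> labels) within
// each group. Each group is then emitted into its own `.hot`- or
// `.unlikely`-prefixed data section.
std::pair<std::vector<int>, std::vector<int>>
sliceJumpTables(const std::vector<Hotness> &JTHotness) {
  std::vector<int> HotJTs, ColdJTs;
  for (int Idx = 0; Idx != (int)JTHotness.size(); ++Idx)
    (JTHotness[Idx] == Hotness::Hot ? HotJTs : ColdJTs).push_back(Idx);
  return {HotJTs, ColdJTs};
}
```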
---------
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
We currently have an issue where bf16 patterns can be used to match fp16
types, as GISel does not know the difference between the two. This
patch explicitly disables them to make sure that they are never used.
The opposite can also happen, where fp16 patterns are used for
operators that should be bf16, so this also changes any operations with
bf16 types to fall back to SDAG.
The pass setup for GISel has been slightly adjusted to make sure that a
verify pass does not get added between AMD-SDAG and SIFixSGPRCopiesPass,
which otherwise can cause verifier issues when falling back.
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744
proposes to partition static data sections.
This patch introduces a codegen pass that produces jump table hotness
in the in-memory state (machine jump table info and entries).
Target lowering and the asm printer consume this state and produce the
`.hot` section suffix. The follow-up PR
https://github.com/llvm/llvm-project/pull/122215 implements those
changes.
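As a rough standalone model of the hotness inference (illustrative types; the real pass reads machine block frequency/profile data, and whether "any hot referencing block" is the exact policy is an assumption of this sketch):

```c++
#include <vector>

// A jump table is marked hot if any machine basic block that branches
// through it is hot.
struct JumpTableRef {
  int JTIndex;       // which jump table the block references
  bool FromHotBlock; // hotness of the referencing block
};

std::vector<bool> inferJumpTableHotness(int NumJumpTables,
                                        const std::vector<JumpTableRef> &Refs) {
  std::vector<bool> IsHot(NumJumpTables, false);
  for (const JumpTableRef &R : Refs)
    if (R.FromHotBlock)
      IsHot[R.JTIndex] = true; // recorded on the in-memory jump table entry
  return IsHot;
}
```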
---------
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
This PR allows mixing `-basic-block-sections` with
`-enable-machine-function-splitter`. The strategy is to let
`-basic-block-sections` take precedence for functions with profiles.
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasize its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
no longer need to static-cast to `LLVMTargetMachine` every time they need
a CodeGen-specific feature of the `TargetMachine` (see the sketch after
this list).
5. More importantly, does not change any requirements regarding library
linking.
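A standalone model of the resulting shape (names and hooks are illustrative, not the actual LLVM declarations):

```c++
#include <memory>

struct PassConfig {};

// TargetMachine carries the CodeGen-facing hooks directly. Targets that do
// not use the independent code generator keep the defaults and return
// nullptr/false.
struct TargetMachine {
  virtual ~TargetMachine() = default;
  virtual std::unique_ptr<PassConfig> createPassConfig() { return nullptr; }
};

// The renamed CodeGenCommonTMImpl role: a set of shared implementations of
// TargetMachine for targets that use CodeGen directly.
struct CodeGenCommonTMImpl : TargetMachine {
  std::unique_ptr<PassConfig> createPassConfig() override {
    return std::make_unique<PassConfig>();
  }
};

// Users program against TargetMachine alone; no static_cast to a
// CodeGen-specific subclass is needed.
std::unique_ptr<PassConfig> buildPipeline(TargetMachine &TM) {
  return TM.createPassConfig();
}
```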
cc @arsenm @aeubanks
This implements a global function merging pass. Unlike traditional
function merging passes that use IR comparators, this pass employs a
structurally stable hash to identify similar functions while ignoring
certain constant operands. These ignored constants are tracked and
encoded into a stable function summary. When merging, instead of
explicitly folding similar functions and their call sites, we form a
merging instance by supplying different parameters via thunks. The
actual size reduction occurs when identically created merging instances
are folded by the linker.
Currently, this pass is wired in as a pre-codegen pass, enabled by the
`-enable-global-merge-func` flag.
In the local merging mode, the analysis and merging steps occur
sequentially within a module (a standalone illustration follows the
list below):
- `analyze`: Collects stable function hashes and tracks locations of
ignored constant operands.
- `finalize`: Identifies merge candidates with matching hashes and
computes the set of parameters that point to different constants.
- `merge`: Uses the stable function map to optimistically create a
merged function.
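As a source-level analogy of the idea (illustrative, not the pass's actual output): two functions that differ only in one constant become a single parameterized body plus thunks, and identically created bodies can later be folded by the linker.

```c++
// Merged body: the differing (ignored) constant becomes an extra parameter.
static int mergedBody(int X, int C) { return X * C + 1; }

// Thunks supply the tracked constants at the original call boundaries.
int f1(int X) { return mergedBody(X, 2); }
int f2(int X) { return mergedBody(X, 3); }
```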
We can enable a global merging mode similar to the global function
outliner
(https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/),
which will perform the above steps separately.
- `-codegen-data-generate`: During the first round of code generation,
we analyze local merging instances and publish their summaries.
- Offline using `llvm-cgdata`, or at link time, we can finalize all these
merging summaries, which are combined to determine parameters.
- `-codegen-data-use`: During the second round of code generation, we
optimistically create merging instances within each module, and finally,
the linker folds identically created merging instances.
Depends on #112664
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4.
Based on the discussions in #112772, this pass is not needed after the
introduction of `llvm.threadlocal.address` intrinsic.
Fixes https://github.com/llvm/llvm-project/issues/112771.
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for `this` pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend-lifetimes flag is intended to
eventually be enabled by `-Og`, as discussed in the RFC
here:
https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850
This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or `this`) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).
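As a source-level illustration of the intended effect (hypothetical example; the intrinsic is emitted by the frontend, not written by users):

```c++
int observe(int Config) {
  int Scratch = Config * 41; // last real use is on the next line
  int Result = Scratch + 1;
  // Without extended lifetimes, Scratch may now be dead and invisible to a
  // debugger. With them, the frontend conceptually appends at end of scope:
  //   (void)llvm.fake.use(Scratch);
  return Result;
}
```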
Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
This transformation doesn't actually use any of the internal state of
LSR and recomputes all information from SCEV. Splitting it out makes
it easier to test.
Note that long term I would like to write a version of this transform
which *is* integrated with LSR's solver, but if that happens, we'll
just delete the extra pass.
Integration-wise, I switched from using TTI to using a pass configuration
variable. This seems slightly more idiomatic, and it means we don't run
the extra logic on any target other than RISC-V.
\#92331 tried to enable `ObjCARCContractPass` by default, but it caused a
regression on O0 builds and was reverted.
This patch tries to bring that back by:
1. reverting the
[revert](1579e9ca9c).
2. calling `createObjCARCContractPass` only on optimized builds.
Tests are updated to reflect the changes. Specifically, all `O0` tests
should not include `ObjCARCContractPass`.
Signed-off-by: Peter Rong <PeterRong@meta.com>
Currently, the LowerConstantIntrinsics pass does an RPO traversal of
every function... only to find that many functions don't have constant
intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is
already a pre-isel intrinsic lowering pass, which iterates over
intrinsic declarations and lowers all users. Call
lowerConstantIntrinsics from this pass to avoid the extra iteration over
the entire IR and the RPO traversal.
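A simplified sketch of the cheaper iteration strategy (not the actual implementation; the real code also folds the calls to their constant results and erases them):

```c++
#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"
using namespace llvm;

static void lowerConstantIntrinsicUsers(Module &M) {
  // Walk only the two intrinsic declarations and their users, instead of an
  // RPO traversal over every function in the module.
  for (Function &F : M.functions()) {
    Intrinsic::ID ID = F.getIntrinsicID();
    if (ID != Intrinsic::is_constant && ID != Intrinsic::objectsize)
      continue;
    for (User *U : make_early_inc_range(F.users()))
      if (auto *Call = dyn_cast<CallInst>(U)) {
        // ... replace this call with its constant result and erase it.
        (void)Call;
      }
  }
}
```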
This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2.
This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7.
Causes major compile-time regressions for unoptimized builds.
Prior to this patch, when using -fthinlto-index= the ObjCARCContractPass isn't run prior to CodeGen, and instruction selection fails on IR containing ARC intrinsics. This patch is motivated by that use case.
The pass was previously added in the various places codegen is performed. This patch adds the pass to the default codegen pipeline, makes sure it bails immediately if no ARC intrinsics are found, and removes the ad-hoc scheduling of the pass.
Co-authored-by: Nuri Amari <nuriamari@fb.com>
This removes, at least when a vector library is available, a failure
case for scalable vectors. Doing so means we can confidently cost vector
FREM instructions without making an assumption that later passes will
transform the IR before it gets to the code generator.
NOTE: Whilst only FREM has been implemented, the same mechanism
can be used for the other libm-related ISD nodes.
When using greedy register allocation, there are times when
early-clobber values are ignored and assigned the same register. This
is illegal behaviour for these instructions. To get around this, using
pseudo instructions for early-clobber registers gives them a definition
and allows Greedy to assign them a different register. This then
meets the ARM Architecture Reference Manual and matches the defined
behaviour.
This patch takes the existing RISC-V patch and makes it target
independent, then adds support for the ARM architecture. Doing this will
ensure early-clobber constraints are followed when using the ARM
architecture. Making the pass target independent also opens up the
possibility of adding support for other architectures in the future.
The `DemoteCatchSwitchPHIOnly` option in the `WinEHPrepare` pass was added in
99d60e0dab,
because Wasm EH uses `WinEHPrepare` but doesn't need to demote all
PHIs. PHIs in `catchswitch` BBs have to be removed (= demoted) because
`catchswitch`es are removed in ISel and `catchswitch` BBs are removed as
well, so they can't have other instructions.
But because Wasm EH doesn't use funclets, PHIs in `catchpad` or
`cleanuppad` BBs don't need to be demoted. That was the reason the
`DemoteCatchSwitchPHIOnly` option was added: in order not to demote more
instructions than necessary.
The problem is that it should have been set to `true` for Wasm EH. (Its
default value is `false`, for WinEH.) I mistakenly set it to `false`
and wasn't aware of this for more than 5 years. This was not the end
of the world; it just means we've been demoting more instructions than
we should, possibly hurting code size. In practice I think it would've
had hardly any effect on real performance, given that PHIs in `catchpad`
or `cleanuppad` BBs occur infrequently and many people run other
optimizers like Binaryen anyway.
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.
First let us review the current encoding which emits the address of each
function and its number of basic blocks, followed by basic block entries
for each basic block.
|  |  |
|--|--|
| Address of the function | Function Address |
| Number of basic blocks in this function | NumBlocks |
| BB entry 1 | |
| BB entry 2 | |
| ... | |
| BB entry #NumBlocks | |
To make this work for basic block sections, we treat each basic block
section similarly to a function, except that basic block sections of the
same function must be encapsulated in the same structure so we can map
all of them to their single function.
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section as before: the base address of the section, its number of
blocks, and BB entries for its basic blocks. The first section in the BB
address map is always the function entry section.
|  |  |
|--|--|
| Number of sections for this function | NumBBRanges |
| Section 1 begin address | BaseAddress[1] |
| Number of basic blocks in section 1 | NumBlocks[1] |
| BB entries for Section 1 | |
| ... | |
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges | NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges | |
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
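A sketch of the resulting per-function record as C++ types (field names and widths are illustrative, not the emitter's actual declarations; the on-disk fields are variable-length encoded):

```c++
#include <cstdint>
#include <vector>

struct BBEntry {
  uint64_t Offset;   // now relative to the containing BB section's begin symbol
  uint64_t Size;     // size of the basic block
  uint64_t Metadata; // block attributes
};

struct BBRangeEntry {
  uint64_t BaseAddress;        // begin address of this BB section
  std::vector<BBEntry> Blocks; // NumBlocks entries follow the base address
};

struct FunctionEntry {
  // NumBBRanges is emitted first; Ranges[0] is always the function entry
  // section.
  std::vector<BBRangeEntry> Ranges;
};
```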
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new `TargetOptions` field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compilation
unit, or for none (depending on `TargetOptions::BBAddrMap`).
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.
We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handling into their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
- New tests added:
- Two tests, basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll, to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
`-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.
Add new pass manager support to `llc`. Users can use
`--passes=pass1,pass2...` to run MIR passes, and use `--enable-new-pm`
to run the default codegen pipeline.
This patch is taken from [D83612](https://reviews.llvm.org/D83612), the
original author is @yuanfang-chen.
---------
Co-authored-by: Yuanfang Chen <455423+yuanfang-chen@users.noreply.github.com>
Unfortunately the legacy pass system can't recognize `no-op-module` and
`no-op-function`, so they cause test failures in `CodeGenTests`. Add a
workaround in the function `PassInfo *getPassInfo(StringRef PassName)`
in `TargetPassConfig.cpp`.
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #75380
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)"
Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #64560
Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
These options are used by `TargetPassConfig` to build CodeGen pass
pipeline, add them to `CGPassBuilderOption` so `CodeGenPassBuilder` can
use them. Currently not all options are added, but it is enough to build
a prototype of `CodeGenPassBuilder`. Part of #69879.
This pass is broken, and it looks like no one has used it for the last 15+ years.
```c++
bool Printer::runOnFunction(Function &F) {
  if (F.hasGC()) // Inverted: bails out precisely for functions that *have* GC,
    return false; // so getFunctionInfo below is reached only when !F.hasGC().
GCFunctionInfo *FD = &getAnalysis<GCModuleInfo>().getFunctionInfo(F);
```
```c++
GCFunctionInfo &GCModuleInfo::getFunctionInfo(const Function &F) {
assert(!F.isDeclaration() && "Can only get GCFunctionInfo for a definition!");
assert(F.hasGC()); // Equivalent to `assert(false);` when called by `Printer::runOnFunction`
```
See also #74972.
28b9126879
introduced the path cloning format in the basic-block-sections profile.
This PR validates and applies path clonings.
A path cloning is valid if all of these conditions hold:
1. All bb ids in the path are mapped to existing blocks.
2. Every two consecutive bb ids in the path have a successor relationship
in the CFG.
3. The path does not include a block with indirect branches, except
possibly as the last block.
Applying a path cloning involves cloning all blocks in the path (except
the first one) and setting up their branches.
Once all clonings are applied, the cluster information is used to guide
block layout in the modified function.
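A standalone sketch of the validity check for one cloning path (illustrative types; the real pass works on machine basic blocks):

```c++
#include <map>
#include <set>
#include <vector>

struct BlockInfo {
  std::set<int> Successors;
  bool HasIndirectBranch;
};

bool isValidCloningPath(const std::vector<int> &Path,
                        const std::map<int, BlockInfo> &CFG) {
  for (size_t I = 0; I < Path.size(); ++I) {
    auto It = CFG.find(Path[I]);
    if (It == CFG.end())
      return false; // (1) every bb id must map to an existing block
    if (I + 1 == Path.size())
      break; // (3) an indirect branch is allowed only in the last block
    if (It->second.HasIndirectBranch)
      return false;
    if (!It->second.Successors.count(Path[I + 1]))
      return false; // (2) consecutive ids must be CFG successors
  }
  return true;
}
```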
The indirect lowering hinders the outliner's ability to see that
sequences are in fact common, since the sequence similarity is rendered
opaque by the callee being in a register. The size savings from making
the calls indirect seem to be dwarfed by the outliner's savings from
de-duplication.
rdar://115178034
rdar://115459865
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
Propeller and pseudo-probes map profiles back to Machine IR via basic block addresses that are stored in metadata sections.
Empty basic blocks (basic blocks without real code) obfuscate the profile mapping because their addresses collide with those of their next basic blocks.
For instance, the fall-through block of an empty block should always be adjacent to it; otherwise, a completely unnecessary jump would be added.
This patch adds a MachineFunction pass named `GCEmptyBasicBlocks`, which attempts to garbage-collect the empty blocks before the `BasicBlockSections` pass.
This pass removes each empty basic block after redirecting its incoming edges to its fall-through block.
The garbage-collection is not complete. We keep the empty block in 4 cases:
1. The empty block is an exception handling pad.
2. The empty block has its address taken.
3. The empty block is the last block of the function and it has
predecessors.
4. The empty block is the only block of the function.
The first three cases are extremely rare in normal code (no cases for the clang binary). Removing the blocks under the first two cases requires modifying exception handling structures and operands of non-terminator instructions -- which is doable but not worth the additional complexity in the pass.
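A sketch of the four keep-conditions (simplified; the actual pass also rewrites predecessor edges and branch operands before erasing a removable block):

```c++
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunction.h"
using namespace llvm;

static bool mustKeepEmptyBlock(const MachineBasicBlock &MBB,
                               const MachineFunction &MF) {
  if (MBB.isEHPad())             // 1. exception handling pad
    return true;
  if (MBB.hasAddressTaken())     // 2. address taken
    return true;
  if (&MBB == &MF.back() && !MBB.pred_empty())
    return true;                 // 3. last block with predecessors
  if (MF.size() == 1)            // 4. only block of the function
    return true;
  return false;
}
```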
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D107534