llvm-project

Author	SHA1	Message	Date
Gabriel Baraldi	5e0a06b34d	Move ExpandMemCmp and MergeIcmp to the middle end (#77370 ) Moving these into the middle-end pipeline will allow for additional optimization of the expansion result, such as CSE of redundant loads (c.f. https://godbolt.org/z/bEna4Md9r). For now, we conservatively place the passes at the end of the middle-end pipeline, so we mostly don't benefit from additional optimizations yet. The pipeline position will be moved in a future change. This builds on work done by legrosbuffle in https://reviews.llvm.org/D60318. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 09:57:00 +02:00
Marina Taylor	55322f2d43	[ObjCARC] Run ObjCARCContract before PreISelIntrinsicLowering (#184149 ) 74e4694 moved ObjCARCContract from running before the codegen pipeline into addISelPrepare(), which runs after PreISelIntrinsicLowering. This broke ObjCARCContract's retainRV-to-claimRV optimization because ObjCARCContract identifies ARC calls via intrinsics, not their lowered counterparts. This patch restores the pre-74e4694 ordering by moving ObjCARCContract to addISelPasses. The IntrinsicInst.cpp change looks extraneous but is required here: ObjCARCContract may now rewrite the bundle operand from retainRV to claimRV. When PreISelIntrinsicLowering then encounters this new intrinsic use, lowerObjCCall asserts mayLowerToFunctionCall. Assisted-by: claude rdar://137997453	2026-03-27 15:37:47 +00:00
Bill Wendling	9a0d65cdfd	[NFC][CodeGen] Rename CallBrPrepare pass to InlineAsmPrepare (#181547 ) This is an NFC change to make room for a more generalized "prepare" pass for inline assembly beyond CallBrInsts. In particular, changing how we generate code for inline assembly with "rm" constraints.	2026-02-17 15:37:35 -08:00
Nikita Popov	c4721872af	Revert "[Clang][inlineasm] Add special support for "rm" output constraints (#92040 )" This change landed without approval. This reverts commit 45e666a8531c1148bdb170b9a120f99e1500c427. This reverts commit a636dd4c37f12594275de2fe180ca35bc04d76ea.	2026-02-14 15:59:04 +01:00
Bill Wendling	45e666a853	[Clang][inlineasm] Add special support for "rm" output constraints (#92040 ) Clang isn't able to support multiple constraints on inputs and outputs, like "rm". Instead, it picks the "safest" one to use, i.e. the memory constraint for "rm". This leads to obviously horrible code: asm __volatile__ ("pushf\n\t" "popq %0" : "=rm" (x)); is compiled to: pushf popq -8(%rsp) movq -8(%rsp), %rax It gets worse when inlined into other functions, because it may introduce a stack where none is needed. With this change, Clang now generates IR for the more optimistic choice ("r"). All but the fast register allocator are able to fold registers if it turns out that register pressure is too high. This leaves the fast register allocator. The fast register allocator, as the name suggests, is built for execution speed, not code quality. Thus, we add special processing to convert the "optimistic" IR into the "conservative" choice (again at the IR level), which we know it can handle. We focus on "rm" for the initial commit, but that can be expanded in the future for other constraints where Clang generates ++ungood code (like "g"). Fixes: https://github.com/llvm/llvm-project/issues/20571	2026-02-14 05:02:24 -08:00
Aiden Grossman	9eb797ee35	[CodeGen] Remove -gc-empty-basic-blocks alias flag (#178716 ) This was deprecated earlier and switched to an alias so we would have some time to update things through our internal compiler release process. That has completed, so we can now remove the flag upstream.	2026-01-29 18:45:36 +00:00
Matt Arsenault	539412914a	GlobalISel: Use LibcallLoweringInfo analysis in legalizer (#170328 ) This is mostly boilerplate to move various freestanding utility functions into LegalizerHelper. LibcallLoweringInfo is currently optional, mostly because threading it through assorted other uses of LegalizerHelper is more difficult. I had a lot of trouble getting this to work in the legacy pass manager with setRequiresCodeGenSCCOrder, and am not happy with the result. A sub-pass manager is introduced and this is invalidated, so we're re-computing this unnecessarily.	2026-01-16 14:42:10 +01:00
Rahul Joshi	f4d4caad94	[LLVM][CodeGen] Rename `gc-empty-basic-blocks` to `enable-gc-empty-basic-blocks` (#176018 ) Rename the `gc-empty-basic-blocks` command line option to `enable-gc-empty-basic-blocks` in preparation of adding calls to initializing the pass in `initializeCodeGen` and also make the flag more consistent with other existing flags to enable or disable passes. Keep `gc-empty-basic-blocks` as an alias to allow all users to migrate to the new option.	2026-01-15 08:00:07 -08:00
Teja Alaghari	7cc013af9c	[CodeGen][NPM] Add support for -print-regusage in New Pass Manager (#169761 ) Support `-print-regusage` flag in NPM for printing register usage information	2026-01-14 11:00:02 +05:30
Rahman Lavaee	ba6c5f8e4e	Insert symbols for prefetch targets read from basic blocks section profile. (#168439 ) This is the first PR to enable the prefetch optimization via Propeller based on our [RFC](https://discourse.llvm.org/t/rfc-code-prefetch-insertion/88668/22). It enables emitting special symbols prefixed with `__llvm_prefetch_target` to point to the prefetch targets as specified via directives in the profile. A prefetch target is uniquely identified by its function name, basic block ID, and the subblock index (used when the target is after a call instruction). A new pass is added which sets a field in basic blocks which have prefetch targets. The next PR will add the prefetch insertion logic into the same pass.	2026-01-05 15:36:59 -08:00
paperchalice	f59e2b20ea	[CodeGen] Port gc-empty-basic-blocks to new pass manager (#137585 ) Co-authored-by: Aiden Grossman <aidengrossman@google.com>	2025-12-27 16:29:06 -08:00
Frederik Harwath	5c05824d2b	[CodeGen] Rename expand-fp to expand-ir-insts (#172681 ) The pass now contains a non-fp expansion and should be used for any similar expansions regardless of the types involved. Hence a generic name seems apt. Rename the source files, pass, and adjust the pass description. Move all tests for the expansions that have previously been merged into the pass to a single directory.	2025-12-18 11:15:04 +00:00
Frederik Harwath	71760f324f	[CodeGen] Merge ExpandLargeDivRem into ExpandFp (#172680 ) Both passes expand instructions at the IR level. They use the same kind of instruction visitation logic and contain significant code duplication e.g. for scalarization.	2025-12-18 09:22:47 +01:00
wdx727	5911754a30	Adding Matching and Inference Functionality to Propeller-PR4: Implement matching and inference and create clusters (#167622 ) This PR re-submits the previously reverted PR(https://github.com/llvm/llvm-project/pull/165868) and fixes the return type mismatch error. Co-authored-by: lifengxiang1025 <lifengxiang@kuaishou.com> Co-authored-by: zcfh <wuminghui03@kuaishou.com>	2025-12-05 14:59:27 +08:00
Kazu Hirata	98d49d51c0	[CodeGen] Remove a redundant declaration (NFC) (#168285 ) EnableFSDiscriminator is declared in DebugInfoMetadata.h. Identified with readability-redundant-declaration.	2025-11-16 14:06:18 -08:00
spupyrev	1eaff1924d	Revert "Adding Matching and Inference Functionality to Propeller-PR4: Implement matching and inference and create clusters" (#167559 ) Reverts llvm/llvm-project#165868 due to buildbot failures Co-authored-by: spupyrev <spupyrev@users.noreply.github.com>	2025-11-11 18:42:58 +00:00
wdx727	1a88d04089	Adding Matching and Inference Functionality to Propeller-PR4: Implement matching and inference and create clusters (#165868 ) Adding Matching and Inference Functionality to Propeller. For detailed information, please refer to the following RFC: https://discourse.llvm.org/t/rfc-adding-matching-and-inference-functionality-to-propeller/86238. This is the fourth PR, which is used to implement matching and inference and create the clusters. The associated PRs are: PR1: https://github.com/llvm/llvm-project/pull/160706 PR2: https://github.com/llvm/llvm-project/pull/162963 PR3: https://github.com/llvm/llvm-project/pull/164223 co-authors: lifengxiang1025 [lifengxiang@kuaishou.com](mailto:lifengxiang@kuaishou.com); zcfh [wuminghui03@kuaishou.com](mailto:wuminghui03@kuaishou.com) Co-authored-by: lifengxiang1025 <lifengxiang@kuaishou.com> Co-authored-by: zcfh <wuminghui03@kuaishou.com>	2025-11-11 09:17:24 -08:00
wdx727	d8d80b659a	Adding Matching and Inference Functionality to Propeller-PR2 (#162963 ) Adding Matching and Inference Functionality to Propeller. For detailed information, please refer to the following RFC: https://discourse.llvm.org/t/rfc-adding-matching-and-inference-functionality-to-propeller/86238. This is the second PR, which includes the calculation of basic block hashes and their emission to the ELF file. It is associated with the previous PR at https://github.com/llvm/llvm-project/pull/160706. co-authors: lifengxiang1025 [lifengxiang@kuaishou.com](mailto:lifengxiang@kuaishou.com); zcfh [wuminghui03@kuaishou.com](mailto:wuminghui03@kuaishou.com) Co-authored-by: lifengxiang1025 <lifengxiang@kuaishou.com> Co-authored-by: zcfh <wuminghui03@kuaishou.com> Co-authored-by: Rahman Lavaee <rahmanl@google.com>	2025-10-23 09:38:12 -07:00
Ellis Hoag	ada9da7164	[MachineOutliner] Add profile guided outlining (#154437 )	2025-09-09 10:06:46 -07:00
Frederik Harwath	47793f9a73	[AMDGPU] Implement IR expansion for frem instruction (#130988 ) This patch implements a correctly rounded expansion of the frem instruction in LLVM IR. This is useful for target architectures for which such an expansion is too involved to be implement in ISel Lowering. The expansion is based on the code from the AMD device libs and has been tested successfully against the OpenCL conformance tests on amdgpu. The expansion is implemented in the preexisting "expand-fp" pass. It replaces the expansion of "frem" in ISel for the amdgpu target; it is enabled for targets which do not directly support "frem" and for which no matching "fmod" LibCall is available. --------- Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>	2025-09-03 16:27:15 +02:00
sivadeilra	0a3c5c42a1	Add support for Windows Secure Hot-Patching (redo) (#145565 ) (This is a re-do of #138972, which had a minor warning in `Clang.cpp`.) This PR adds some of the support needed for Windows hot-patching. Windows implements a form of hot-patching. This allows patches to be applied to Windows apps, drivers, and the kernel, without rebooting or restarting any of these components. Hot-patching is a complex technology and requires coordination between the OS, compilers, linkers, and additional tools. This PR adds support to Clang and LLVM for part of the hot-patching process. It enables LLVM to generate the required code changes and to generate CodeView symbols which identify hot-patched functions. The PR provides new command-line arguments to Clang which allow developers to identify the list of functions that need to be hot-patched. This PR also allows LLVM to directly receive the list of functions to be modified, so that language front-ends which have not yet been modified (such as Rust) can still make use of hot-patching. This PR: * Adds a `MarkedForWindowsHotPatching` LLVM function attribute. This attribute indicates that a function should be _hot-patched_. This generates a new CodeView symbol, `S_HOTPATCHFUNC`, which identifies any function that has been hot-patched. This attribute also causes accesses to global variables to be indirected through a `_ref_` global variable. This allows hot-patched functions to access the correct version of a global variable; the hot-patched code needs to access the variable in the _original_ image, not the patch image. Adds a `AllowDirectAccessInHotPatchFunction` LLVM attribute. This attribute may be placed on global variable declarations. It indicates that the variable may be safely accessed without the `_ref_` indirection. Adds two Clang command-line parameters: `-fms-hotpatch-functions-file` and `-fms-hotpatch-functions-list`. The `-file` flag may point to a text file, which contains a list of functions to be hot-patched (one function name per line). The `-list` flag simply directly identifies functions to be patched, using a comma-separated list. These two command-line parameters may also be combined; the final set of functions to be hot-patched is the union of the two sets. * Adds similar LLVM command-line parameters: `--ms-hotpatch-functions-file` and `--ms-hotpatch-functions-list`. * Adds integration tests for both LLVM and Clang. * Adds support for dumping the new `S_HOTPATCHFUNC` CodeView symbol. Although the flags are redundant between Clang and LLVM, this allows additional languages (such as Rust) to take advantage of hot-patching support before they have been modified to generate the required attributes. Credit to @dpaoliello, who wrote the original form of this patch.	2025-06-24 14:56:55 -07:00
Qinkun Bao	4b4782bc86	Revert "Add support for Windows Secure Hot-Patching" (#145553 ) Reverts llvm/llvm-project#138972	2025-06-24 13:11:52 -04:00
sivadeilra	26d318e4a9	Add support for Windows Secure Hot-Patching (#138972 ) This PR adds some of the support needed for Windows hot-patching. Windows implements a form of hot-patching. This allows patches to be applied to Windows apps, drivers, and the kernel, without rebooting or restarting any of these components. Hot-patching is a complex technology and requires coordination between the OS, compilers, linkers, and additional tools. This PR adds support to Clang and LLVM for part of the hot-patching process. It enables LLVM to generate the required code changes and to generate CodeView symbols which identify hot-patched functions. The PR provides new command-line arguments to Clang which allow developers to identify the list of functions that need to be hot-patched. This PR also allows LLVM to directly receive the list of functions to be modified, so that language front-ends which have not yet been modified (such as Rust) can still make use of hot-patching. This PR: * Adds a `MarkedForWindowsHotPatching` LLVM function attribute. This attribute indicates that a function should be _hot-patched_. This generates a new CodeView symbol, `S_HOTPATCHFUNC`, which identifies any function that has been hot-patched. This attribute also causes accesses to global variables to be indirected through a `_ref_` global variable. This allows hot-patched functions to access the correct version of a global variable; the hot-patched code needs to access the variable in the _original_ image, not the patch image. Adds a `AllowDirectAccessInHotPatchFunction` LLVM attribute. This attribute may be placed on global variable declarations. It indicates that the variable may be safely accessed without the `_ref_` indirection. Adds two Clang command-line parameters: `-fms-hotpatch-functions-file` and `-fms-hotpatch-functions-list`. The `-file` flag may point to a text file, which contains a list of functions to be hot-patched (one function name per line). The `-list` flag simply directly identifies functions to be patched, using a comma-separated list. These two command-line parameters may also be combined; the final set of functions to be hot-patched is the union of the two sets. * Adds similar LLVM command-line parameters: `--ms-hotpatch-functions-file` and `--ms-hotpatch-functions-list`. * Adds integration tests for both LLVM and Clang. * Adds support for dumping the new `S_HOTPATCHFUNC` CodeView symbol. Although the flags are redundant between Clang and LLVM, this allows additional languages (such as Rust) to take advantage of hot-patching support before they have been modified to generate the required attributes. Credit to @dpaoliello, who wrote the original form of this patch.	2025-06-24 09:22:38 -07:00
Matt Arsenault	36b710a7e5	CodeGen: Convert some assorted errors to use reportFatalUsageError (#142031 ) The test coverage is lacking for many of these errors.	2025-05-30 08:06:53 +02:00
Rahul Joshi	99e4b3927c	[LLVM] Cleanup pass initialization for Analysis passes (#135858 ) - Do not call pass initialization from pass constructors. - Instead, pass initialization should happen in the `initializeAnalysis` function. - https://github.com/llvm/llvm-project/issues/111767	2025-04-21 12:36:34 -07:00
Mingming Liu	0d19efa9d5	[NFC]In codegen pipeline, turn static-data-splitter pass on/off with its own option (#134752 ) Per discussion in https://github.com/llvm/llvm-project/pull/129781#discussion_r2017489088, we'd like to refactor out the requirement of MFS.	2025-04-07 22:22:08 -07:00
Mingming Liu	c8a70f4c6e	[CodeGen][StaticDataPartitioning]Place local-linkage global variables in hot or unlikely prefixed sections based on profile information (#125756 ) In this PR, static-data-splitter pass finds out the local-linkage global variables in {`.rodata`, `.data.rel.ro`, `bss`, `.data`} sections by analyzing machine instruction operands, and aggregates their accesses from code across functions. A follow-up item is to analyze global variable initializers and count for access from data. * This limitation is demonstrated by `bss2` and `data3` in `llvm/test/CodeGen/X86/global-variable-partition.ll`. Some stats of static-data-splitter with this patch: section\|bss\|rodata\|data :-----:\|:-----:\|:-----:\|:-----: hot-prefixed section coverage\|99.75%\|97.71%\|91.30% unlikely-prefixed section size percentage\|67.94%\|39.37%\|63.10% 1. The coverage is defined as `#perf-sample-in-hot-prefixed <data> section / #perf-sample in <data.> section` for each <data> section. The perf command samples `MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2` events at a high frequency (`perf -c 2251`) for 30 seconds. The profiled binary is built as non-PIE so `data.rel.ro` coverage data is not available. 2. The unlikely-prefixed `<data>` section size percentage is defined as `unlikely <data> section size / the sum size of <data>.* sections` for each `<data>` section	2025-03-28 16:31:46 -07:00
Frederik Harwath	6962cf1700	Rename ExpandLargeFpConvertPass to ExpandFpPass (#131128 ) This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR expansion for frem instruction" which implements the expansion of another instruction in this pass. The more general name seems more appropriate given this change and quite reasonable even without it.	2025-03-14 13:11:45 +01:00
Akshat Oke	77f44a9642	[CodeGen][NewPM] Port MachineSink to NPM (#115434 ) Targets can set the EnableSinkAndFold option in CGPassBuilderOptions for the NPM pipeline in buildCodeGenPipeline(... &Opts, ...)	2025-03-03 15:49:37 +05:30
David Green	66e0498daf	[GlobalISel] Do not run verifier after ResetMachineFunctionPass (#124799 ) After we fall back from GlobalISel to SDAG, the verifier gets called, which calls getReservedRegs which uses SIMachineFunctionInfo::usesAGPRs which caches the result of UsesAGPRs. Because we have just fallen-back the function is empty and it incorrectly gets cached to false. This patch makes sure we don't try to run the verifier whilst the function is empty.	2025-01-29 12:48:11 +00:00
Mingming Liu	3feb724496	[AsmPrinter][ELF] Support profile-guided section prefix for jump tables' (read-only) data sections (#122215 ) https://github.com/llvm/llvm-project/pull/122183 adds a codegen pass to infer machine jump table entry's hotness from the MBB hotness. This is a follow-up PR to produce `.hot` and or `.unlikely` section prefix for jump table's (read-only) data sections in the relocatable `.o` files. When this patch is enabled, linker will see {`.rodata`, `.rodata.hot`, `.rodata.unlikely`} in input sections. It can map `.rodata.hot` and `.rodata` in the input sections to `.rodata.hot` in the executable, and map `.rodata.unlikely` into `.rodata` with a pending extension to `--keep-text-section-prefix` like `059e7cbb66`, or with a linker script. 1. To partition hot and jump tables, the AsmPrinter pass slices a function's jump table indices into two groups, one for hot and the other for cold jump tables. It then emits hot jump tables into a `.hot`-prefixed data section and cold ones into a `.unlikely`-prefixed data section, retaining the relative order of `LJT<N>` labels within each group. 2. [ELF only] To have data sections with _dynamic_ names (e.g., `.rodata.hot[.func]`), we implement `TargetLoweringObjectFile::getSectionForJumpTable` method that accepts a `MachineJumpTableEntry` parameter, and update `selectELFSectionForGlobal` to generate `.hot` or `.unlikely` based on MJTE's hotness. - The dynamic JT section name doesn't depend on `-ffunction-section=true` or `-funique-section-names=true`, even though it leverages the similar underlying mechanism to have a MCSection with on-demand name as `-ffunction-section` does. 3. The new code path is off by default. - Typically, `TargetOptions` conveys clang or LLVM tools' options to code generation passes. To follow the pattern, add option `EnableStaticDataPartitioning` bit in `TargetOptions` and make it readable through `TargetMachine`. - To enable the new code path in tools like `llc`, `partition-static-data-sections` option is introduced in `CodeGen/CommandFlags.h/cpp`. - A subsequent patch ([draft](`8f36a13743`)) will add a clang option to enable the new code path. --------- Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>	2025-01-28 22:49:28 -08:00
David Green	5a81a559d6	[GISel] Explicitly disable BF16 tablegen patterns. (#124113 ) We currently have an issue where bf16 patters can be used to match fp16 types, as GISel does not know about the difference between the two. This patch explicitly disables them to make sure that they are never used. The opposite can also happen too, where fp16 patterns are used for operators that should be bf16. So this also changes any operations with bf16 types to now cause a fallback to SDAG. The pass setup for GISel has been slightly adjusted to make sure that a verify pass does not get added between AMD-SDAG and SIFixSGPRCopiesPass, which otherwise can cause verifier issues when falling back.	2025-01-27 22:21:12 +00:00
Mingming Liu	de209fa11b	[CodeGen] Introduce Static Data Splitter pass (#122183 ) https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition static data sections. This patch introduces a codegen pass. This patch produces jump table hotness in the in-memory states (machine jump table info and entries). Target-lowering and asm-printer consume the states and produce `.hot` section suffix. The follow up PR https://github.com/llvm/llvm-project/pull/122215 implements such changes. --------- Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>	2025-01-22 21:06:46 -08:00
Fangrui Song	9efa7d7af3	Remove -print-lsr-output in favor of --stop-after=loop-reduce Pull Request: https://github.com/llvm/llvm-project/pull/121305	2024-12-29 18:58:30 -08:00
Rahman Lavaee	68f7b075c0	[BasicBlockSections] Allow mixing of -basic-block-sections with MFS. (#117076 ) This PR allows mixing `-basic-block-sections` with `-enable-machine-function-splitter`. The strategy is to let `-basic-block-sections` take precedence over functions with profiles.	2024-11-22 22:23:29 -08:00
Akshat Oke	3f9d02aae8	[CodeGen][NewPM] Port PeepholeOptimizer to NPM (#116326 ) With this, all machine SSA optimization passes are available in the new codegen pipeline.	2024-11-18 11:02:01 +05:30
Matin Raayai	bb3f5e1fed	Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234 ) Following discussions in #110443, and the following earlier discussions in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html, https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine` interface classes. More specifically: 1. Makes `TargetMachine` the only class implemented under `TargetMachine.h` in the `Target` library. 2. `TargetMachine` contains target-specific interface functions that relate to IR/CodeGen/MC constructs, whereas before (at least on paper) it was supposed to have only IR/MC constructs. Any Target that doesn't want to use the independent code generator simply does not implement them, and returns either `false` or `nullptr`. 3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming aims to make the purpose of `LLVMTargetMachine` clearer. Its interface was moved under the CodeGen library, to further emphasis its usage in Targets that use CodeGen directly. 4. Makes `TargetMachine` the only interface used across LLVM and its projects. With these changes, `CodeGenCommonTMImpl` is simply a set of shared function implementations of `TargetMachine`, and CodeGen users don't need to static cast to `LLVMTargetMachine` every time they need a CodeGen-specific feature of the `TargetMachine`. 5. More importantly, does not change any requirements regarding library linking. cc @arsenm @aeubanks	2024-11-14 13:30:05 -08:00
Kyungwoo Lee	d23c5c2d65	[CGData] Global Merge Functions (#112671 ) This implements a global function merging pass. Unlike traditional function merging passes that use IR comparators, this pass employs a structurally stable hash to identify similar functions while ignoring certain constant operands. These ignored constants are tracked and encoded into a stable function summary. When merging, instead of explicitly folding similar functions and their call sites, we form a merging instance by supplying different parameters via thunks. The actual size reduction occurs when identically created merging instances are folded by the linker. Currently, this pass is wired to a pre-codegen pass, enabled by the `-enable-global-merge-func` flag. In a local merging mode, the analysis and merging steps occur sequentially within a module: - `analyze`: Collects stable function hashes and tracks locations of ignored constant operands. - `finalize`: Identifies merge candidates with matching hashes and computes the set of parameters that point to different constants. - `merge`: Uses the stable function map to optimistically create a merged function. We can enable a global merging mode similar to the global function outliner (https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/), which will perform the above steps separately. - `-codegen-data-generate`: During the first round of code generation, we analyze local merging instances and publish their summaries. - Offline using `llvm-cgdata` or at link-time, we can finalize all these merging summaries that are combined to determine parameters. - `-codegen-data-use`: During the second round of code generation, we optimistically create merging instances within each module, and finally, the linker folds identically created merging instances. Depends on #112664 This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.	2024-11-13 17:34:07 -08:00
abhishek-kaushik22	d2aff182d3	Revert "TLS loads opimization (hoist)" (#114740 ) This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4. Based on the discussions in #112772, this pass is not needed after the introduction of `llvm.threadlocal.address` intrinsic. Fixes https://github.com/llvm/llvm-project/issues/112771.	2024-11-07 10:10:28 +01:00
Akshat Oke	44d0e9522a	[CodeGen][NewPM] Port TailDuplicate pass to NPM (#113293 )	2024-10-30 11:48:40 +05:30
Akshat Oke	c4c60c0db9	[CodeGen][NewPM] Port OptimizePHIs to NPM (#113433 )	2024-10-23 16:55:21 +05:30
Christudasan Devadasan	488d3924dd	[CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (#108508 )	2024-10-16 13:22:57 +05:30
Akshat Oke	cd6c2b80be	[NewPM][CodeGen] Port StackColoring to NPM (#111812 )	2024-10-14 19:23:34 +05:30
Christudasan Devadasan	6c143a86cd	[CodeGen][NewPM] Port MachineCSE pass to new pass manager. (#106605 )	2024-09-04 18:54:07 +05:30
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149 ) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Philip Reames	27a62ec72a	[LSR] Split the -lsr-term-fold transformation into it's own pass (#104234 ) This transformation doesn't actually use any of the internal state of LSR and recomputes all information from SCEV. Splitting it out makes it easier to test. Note that long term I would like to write a version of this transform which is integrated with LSR's solver, but if that happens, we'll just delete the extra pass. Integration wise, I switched from using TTI to using a pass configuration variable. This seems slightly more idiomatic, and means we don't run the extra logic on any target other than RISCV.	2024-08-17 18:34:23 -07:00
Jay Foad	b4edfc1920	[LTO] Run ObjCARCContractPass according to the callgraph (#103034 ) This matches other IR codegen passes and avoids a Dominator Tree Construction in AMDGPU O2/O3 builds.	2024-08-13 12:57:13 +01:00
Peter Rong	74e4694b8c	[LTO] enable `ObjCARCContractPass` only on optimized build (#101114 ) \#92331 tried to make `ObjCARCContractPass` by default, but it caused a regression on O0 builds and was reverted. This patch trys to bring that back by: 1. reverts the [revert](`1579e9ca9c`). 2. `createObjCARCContractPass` only on optimized builds. Tests are updated to refelect the changes. Specifically, all `O0` tests should not include `ObjCARCContractPass` Signed-off-by: Peter Rong <PeterRong@meta.com>	2024-08-09 13:04:25 -07:00
Alexis Engelke	fa92d51f9e	[VP] Merge ExpandVP pass into PreISelIntrinsicLowering (#101652 ) Similar to #97727; avoid an extra pass over the entire IR by performing the lowering as part of the pre-isel-intrinsic-lowering pass.	2024-08-06 09:27:59 +02:00
Alexis Engelke	b5fc083dc3	[CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering (#97727 ) Currently, the LowerConstantIntrinsics pass does an RPO traversal of every function... only to find that many functions don't have constant intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is already a pre-isel intrinsic lowering pass, which iterates over intrinsic declarations and lowers all users. Call lowerConstantIntrinsics from this pass to avoid the extra iteration over the entire IR and the RPO traversal.	2024-08-01 17:44:32 +02:00

1 2 3 4 5 ...

336 Commits