270 Commits

Author SHA1 Message Date
Gabriel Baraldi
5e0a06b34d
Move ExpandMemCmp and MergeIcmp to the middle end (#77370)
Moving these into the middle-end pipeline will allow for additional
optimization of the expansion result, such as CSE of redundant loads
(cf. https://godbolt.org/z/bEna4Md9r). For now, we conservatively place
the passes at the end of the middle-end pipeline, so we mostly don't
benefit from additional optimizations yet. The pipeline position will be
moved in a future change.
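
As a rough source-level illustration of the idea (not the pass's actual output, which is IR), an 8-byte equality `memcmp` can be pictured as two wide loads and one integer compare. The function names below are hypothetical and exist only for this sketch. When the same buffer is compared twice, the expanded form exposes a repeated load of that buffer, which is one plausible kind of redundancy that later CSE could remove.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The library-call form: equality of two 8-byte buffers.
bool equal8_memcmp(const void *a, const void *b) {
  return std::memcmp(a, b, 8) == 0;
}

// Hand-written picture of an expanded form (an assumption for illustration):
// one 64-bit load per operand followed by a single integer compare.
bool equal8_expanded(const void *a, const void *b) {
  std::uint64_t x, y;
  std::memcpy(&x, a, sizeof x); // load 8 bytes from a
  std::memcpy(&y, b, sizeof y); // load 8 bytes from b
  return x == y;
}

// Two comparisons against the same key. Once both are expanded inline,
// the 8-byte load of `key` appears twice and is a candidate for CSE.
int classify(const void *key, const void *first, const void *second) {
  if (equal8_expanded(key, first))
    return 1;
  if (equal8_expanded(key, second))
    return 2;
  return 0;
}
```

Calling `classify` with a key that matches only the second buffer returns 2; the interesting part is that an optimizer running after the expansion can see both loads of `key`.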

This builds on work done by legrosbuffle in
https://reviews.llvm.org/D60318.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 09:57:00 +02:00
Gergo Stomfai
e8a03bb043
[CodeGen] Port UnpackMachineBundles to new pass manager (#184918) 2026-03-17 09:01:37 -07:00
Bill Wendling
9a0d65cdfd
[NFC][CodeGen] Rename CallBrPrepare pass to InlineAsmPrepare (#181547)
This is an NFC change to make room for a more generalized "prepare" pass
for inline assembly beyond CallBrInsts. In particular, changing how we
generate code for inline assembly with "rm" constraints.
2026-02-17 15:37:35 -08:00
Nikita Popov
c4721872af
Revert "[Clang][inlineasm] Add special support for "rm" output constraints (#92040)"
This change landed without approval.

This reverts commit 45e666a8531c1148bdb170b9a120f99e1500c427.
This reverts commit a636dd4c37f12594275de2fe180ca35bc04d76ea.
2026-02-14 15:59:04 +01:00
Bill Wendling
45e666a853
[Clang][inlineasm] Add special support for "rm" output constraints (#92040)
Clang isn't able to support multiple constraints on inputs and outputs,
like "rm". Instead, it picks the "safest" one to use, i.e. the memory
constraint for "rm". This leads to obviously horrible code:

  asm __volatile__ ("pushf\n\t"
                    "popq %0"
                    : "=rm" (x));

is compiled to:

        pushf
        popq    -8(%rsp)
        movq    -8(%rsp), %rax

It gets worse when inlined into other functions, because it may
introduce a stack where none is needed.

With this change, Clang now generates IR for the more optimistic choice
("r"). All but the fast register allocator are able to fold registers if
it turns out that register pressure is too high.

This leaves the fast register allocator. The fast register allocator, as
the name suggests, is built for execution speed, not code quality. Thus,
we add special processing to convert the "optimistic" IR into the
"conservative" choice (again at the IR level), which we know it can
handle.

We focus on "rm" for the initial commit, but that can be expanded in the
future for other constraints where Clang generates ++ungood code (like
"g").

Fixes: https://github.com/llvm/llvm-project/issues/20571
2026-02-14 05:02:24 -08:00
Rahul Joshi
b12e3122c8
[NFC][Core][CodeGen] Remove pass initialization from pass constructors (#180153) 2026-02-06 09:05:47 -08:00
Anshul Nigham
85545d4c84
[NewPM] Port MachineDominanceFrontierAnalysis (#177709) 2026-02-01 22:02:45 -08:00
Rahul Joshi
78e22cbcb3
[LLVM] Remove pass initialization from pass constructor (#178729)
Remove pass initialization from pass constructor for
GCEmptyBasicBlocksLegacy pass.
2026-01-29 13:38:17 -08:00
Rahul Joshi
26f962465e
[LLVM][CodeGen] Remove pass initialization calls from pass constructors (#173061)
- Remove pass initialization calls from pass constructors.
- For some passes, add the initialization to `initializeCodeGen` or
`initializeGlobalISel`.
- Remove redundant initializations from llc and X86 target for some
passes.
2026-01-21 08:44:51 -08:00
Frederik Harwath
5c05824d2b
[CodeGen] Rename expand-fp to expand-ir-insts (#172681)
The pass now contains a non-fp expansion and should
be used for any similar expansions regardless of the
types involved. Hence a generic name seems apt.

Rename the source files, pass, and adjust the pass
description. Move all tests for the expansions
that have previously been merged into the pass
to a single directory.
2025-12-18 11:15:04 +00:00
Frederik Harwath
71760f324f
[CodeGen] Merge ExpandLargeDivRem into ExpandFp (#172680)
Both passes expand instructions at the IR level.
They use the same kind of instruction visitation
logic and contain significant code duplication e.g.
for scalarization.
2025-12-18 09:22:47 +01:00
Matt Arsenault
04c81a9973
CodeGen: Add LibcallLoweringInfo analysis pass (#168622)
The libcall lowering decisions should be program dependent,
depending on the current module's RuntimeLibcallInfo. We need
another related analysis derived from that plus the current
function's subtarget to provide concrete lowering decisions.

This takes on a somewhat unusual form. It's a Module analysis,
with a lookup keyed on the subtarget. This is a separate module
analysis from RuntimeLibraryAnalysis to avoid that depending on
codegen. It's not a function pass to avoid depending on any
particular function, to avoid repeated subtarget map lookups in
most of the use passes, and to avoid any recomputation in the
common case of one subtarget (and keeps it reusable across
repeated compilations).
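
The shape described above, one module-level object that hands out per-subtarget results and computes each only once, can be sketched as follows. All types and names here (`Subtarget`, `LoweringInfo`, `ModuleLibcallLowering`) are hypothetical stand-ins invented for this sketch and are not the LLVM classes; the choice of `__divsi3` as a routine name is likewise only illustrative.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins, not LLVM types.
struct Subtarget {
  std::string cpu;
  bool hasHardwareDivide;
};

struct LoweringInfo {
  std::string divideImpl; // which routine (if any) implements division
};

// Module-level result: a lookup keyed on the subtarget, filled lazily.
class ModuleLibcallLowering {
  std::map<const Subtarget *, LoweringInfo> cache;
  int computations = 0;

public:
  const LoweringInfo &get(const Subtarget &st) {
    auto it = cache.find(&st);
    if (it == cache.end()) {
      ++computations; // computed once per distinct subtarget
      LoweringInfo info{st.hasHardwareDivide ? "inline" : "__divsi3"};
      it = cache.emplace(&st, info).first;
    }
    return it->second;
  }

  int computeCount() const { return computations; }
};
```

In the common case of a single subtarget, every function in the module hits the same cache entry, so the decision is computed once no matter how many functions ask for it.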

This also switches ExpandFp and PreISelIntrinsicLowering as
a sample function and module pass. Note this is not yet wired
up to SelectionDAG, which is still using the LibcallLoweringInfo
constructed inside of TargetLowering.
2025-12-03 22:00:12 +01:00
S. VenkataKeerthy
3c77b49797
[MIR2Vec] Add embedder for machine instructions (#162161)
Implement MIR2Vec embedder for generating vector representations of Machine IR instructions, basic blocks, and functions. This patch introduces changes necessary to *embed* machine opcodes. Machine operands will be handled incrementally in upcoming patches.
2025-10-21 10:14:27 -07:00
S. VenkataKeerthy
879f8616ef
[IR2Vec] Initial infrastructure for MIR2Vec (#161463)
This PR introduces the initial infrastructure and vocabulary necessary for generating embeddings for MIR (discussed briefly in the earlier IR2Vec RFC - https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings). The MIR2Vec embeddings are useful in driving target-specific optimizations that work on MIR, such as register allocation.

(Tracking issue - #141817)
2025-10-07 13:45:20 -07:00
Mikhail Gudim
562146499c
[CodeGen][NewPM] Port ReachingDefAnalysis to new pass manager. (#159572)
In this commit:
  (1) Added new pass manager support for `ReachingDefAnalysis`.
  (2) Added printer pass.
  (3) Made the old pass manager use `ReachingDefInfoWrapperPass`.
2025-09-19 09:38:34 -04:00
Jay Foad
d449d3dc13
[CodeGen] Remove FinalizeMachineBundles pass (#149806)
Replace its only use in the AMDGPU R600 backend with a call to
finalizeBundles.
2025-07-23 11:36:49 +01:00
Vikram Hegde
4aa85cc313
[CodeGen][NPM] Port ProcessImplicitDefs to NPM (#148110)
Same as https://github.com/llvm/llvm-project/pull/138829

Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-16 13:23:27 +05:30
Vikram Hegde
fcd4a2fe7a
[CodeGen][NewPM] Port "PostRAMachineSink" pass to NPM (#129690) 2025-07-10 13:10:46 +05:30
Akshat Oke
b33d95fb8a
[CodeGen][NPM] Port InitUndef to NPM (#138495) 2025-07-09 15:31:31 +05:30
Akshat Oke
e91cbd4f29
[CodeGen][NPM] Port VirtRegRewriter to NPM (#130564) 2025-04-30 14:10:46 +05:30
Vikram Hegde
53a8b89003
[CodeGen][NewPM] Port "ShrinkWrap" pass to NPM (#129880) 2025-04-30 13:11:17 +05:30
Vikram Hegde
86d8e8d9a6
[CodeGen][NewPM] Port "PrologEpilogInserter" to NPM (#130550) 2025-04-29 13:13:45 +05:30
Akshat Oke
31ddaef8d1
[CodeGen][NPM] Port UnreachableMachineBlockElim to NPM (#136127) 2025-04-18 15:06:30 +05:30
Akshat Oke
a09fd9c653
[CodeGen][NPM] Port MachineBlockPlacementStats to NPM (#129853) 2025-04-17 15:24:46 +05:30
Akshat Oke
a388395b86
[CodeGen][NPM] Port StackFrameLayoutAnalysisPass to NPM (#130070) 2025-04-15 12:37:19 +05:30
Akshat Oke
f133eae70c
[CodeGen][NPM] Port MachineSanitizerBinaryMetadata to NPM (#130069)
Didn't find a test for this (but there are tests for the `Function`
version of this pass)
2025-04-14 20:52:26 +05:30
Akshat Oke
e29f986838
[CodeGen][NPM] Port RemoveLoadsIntoFakeUses to NPM (#130068) 2025-04-14 12:58:03 +05:30
Akshat Oke
b283ff7eb1
[CodeGen][NPM] Port BranchRelaxation to NPM (#130067)
This completes the PreEmitPasses.
2025-04-14 10:19:42 +05:30
Akshat Oke
2f6b06b264
[CodeGen][NPM] Port PostRAHazardRecognizer to NPM (#130066) 2025-04-09 16:36:22 +05:30
Akshat Oke
4a68702455
[CodeGen][NPM] Port XRayInstrumentation to NPM (#129865) 2025-04-01 15:38:49 +05:30
Mingming Liu
c8a70f4c6e
[CodeGen][StaticDataPartitioning]Place local-linkage global variables in hot or unlikely prefixed sections based on profile information (#125756)
In this PR, static-data-splitter pass finds out the local-linkage global
variables in {`.rodata`, `.data.rel.ro`, `.bss`, `.data`} sections by
analyzing machine instruction operands, and aggregates their accesses
from code across functions.

A follow-up item is to analyze global variable initializers and account
for accesses from data.
* This limitation is demonstrated by `bss2` and `data3` in
`llvm/test/CodeGen/X86/global-variable-partition.ll`.

Some stats of static-data-splitter with this patch:

**section**|**bss**|**rodata**|**data**
:-----:|:-----:|:-----:|:-----:
hot-prefixed section coverage|99.75%|97.71%|91.30%
unlikely-prefixed section size percentage|67.94%|39.37%|63.10%

1. The coverage is defined as `#perf-sample-in-hot-prefixed <data>
section / #perf-sample in <data.*> section` for each <data> section.
* The perf command samples
`MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2` events at a high
frequency (`perf -c 2251`) for 30 seconds. The profiled binary is built
as non-PIE so `data.rel.ro` coverage data is not available.
2. The unlikely-prefixed `<data>` section size percentage is defined as
`unlikely <data> section size / the sum size of <data>.* sections` for
each `<data>` section
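
For readers who want the two definitions above as arithmetic, here is a minimal sketch. The function names are hypothetical, and the inputs are whatever sample counts and section sizes a measurement produces; no figures from the table are reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Coverage (definition 1): samples landing in the hot-prefixed <data>
// section as a percentage of samples in all <data>.* sections.
double hotCoverage(std::uint64_t hotSamples, std::uint64_t allSamples) {
  return allSamples == 0 ? 0.0 : 100.0 * hotSamples / allSamples;
}

// Size percentage (definition 2): size of the unlikely-prefixed <data>
// section as a percentage of the summed size of all <data>.* sections.
double unlikelySizePercentage(std::uint64_t unlikelyBytes,
                              std::uint64_t totalBytes) {
  return totalBytes == 0 ? 0.0 : 100.0 * unlikelyBytes / totalBytes;
}
```

Both helpers return 0 for an empty denominator, which is a choice made for this sketch rather than something the commit specifies.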
2025-03-28 16:31:46 -07:00
Akshat Oke
174110bf3c
[CodeGen][NPM] Port LiveDebugValues to NPM (#131563) 2025-03-24 11:34:45 +05:30
Akshat Oke
687c9d359e
[CodeGen][NPM] Port FEntryInserter to NPM (#129857) 2025-03-17 10:35:53 +05:30
Frederik Harwath
6962cf1700
Rename ExpandLargeFpConvertPass to ExpandFpPass (#131128)
This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction" which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given this change and quite reasonable even without it.
2025-03-14 13:11:45 +01:00
Akshat Oke
87916f8c32
[CodeGen][NPM] Port MachineBlockPlacement to NPM (#129828) 2025-03-14 10:31:53 +05:30
Akshat Oke
5952972c91
[CodeGen][NPM] Port BranchFolder to NPM (#128858)
EnableTailMerge is false by default and is handled by the pass builder.
Passes are independent of target pipeline options.

This completes the generic `MachineLateOptimization` passes for the NPM
pipeline.
2025-03-13 13:41:28 +05:30
Akshat Oke
9f617161aa
[CodeGen][NPM] Port PatchableFunction to NPM (#129866) 2025-03-12 15:11:11 +05:30
Akshat Oke
57a90883ca
[CodeGen][NPM] Port DetectDeadLanes to NPM (#130567) 2025-03-12 11:22:02 +05:30
Vikram Hegde
e0eb4edad6
[CodeGen][NewPM] Port "FixupStatepointCallerSaved" pass to NPM (#129541) 2025-03-04 15:47:43 +05:30
Akshat Oke
af4ec59f8d
[CodeGen][NPM] Port ExpandPostRAPseudos to NPM (#129509) 2025-03-04 11:49:09 +05:30
Vikram Hegde
6abe148bac
[CodeGen][NewPM] Port "RemoveRedundantDebugValues" to NPM (#129005) 2025-03-03 19:57:50 +07:00
Akshat Oke
77f44a9642
[CodeGen][NewPM] Port MachineSink to NPM (#115434)
Targets can set the EnableSinkAndFold option in CGPassBuilderOptions for
the NPM pipeline in buildCodeGenPipeline(... &Opts, ...)
2025-03-03 15:49:37 +05:30
Akshat Oke
69c8312c0a
[CodeGen][NewPM] Port MachineCycleInfo to NPM (#114745) 2025-03-03 11:26:17 +05:30
Akshat Oke
fe13cb985c
[CodeGen][NewPM] Port RegAllocGreedy to NPM (#119540)
Leaving out NPM command line support for the next patch.
2025-02-26 12:11:22 +05:30
Akshat Oke
229dcf9d34
[CodeGen][NPM] Port MachineLateInstrsCleanup to NPM (#128160)
There are no standalone tests for this pass for backends implementing
the NPM yet.
2025-02-24 14:31:37 +05:30
Akshat Oke
7b60e03d73
Reland "CodeGen][NewPM] Port MachineScheduler to NPM. (#125703)" (#126684)
`RegisterClassInfo` was supposed to be kept alive between pass runs, but
it was not, which led to recomputations that increased compile time.

Now the Impl class is a member of the legacy and new passes so that it
is not reconstructed on every pass run.
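
The difference between rebuilding the implementation object on every run and holding it as a member can be sketched like this. Every name below (`ExpensiveInfo`, `SchedulerImpl`, `RebuildingPass`, `PersistentPass`) is hypothetical and stands in for the real classes only loosely; the counter exists purely to make the recomputation visible.

```cpp
#include <cassert>

// Hypothetical stand-in for state that is costly to compute.
struct ExpensiveInfo {
  int timesComputed = 0;
  bool valid = false;
  void ensureComputed() {
    if (!valid) {
      ++timesComputed; // the expensive work would happen here
      valid = true;
    }
  }
};

// Shared implementation used by both pass wrappers.
struct SchedulerImpl {
  ExpensiveInfo info;
  void run() { info.ensureComputed(); }
};

// Constructs a fresh Impl per run, so the cached state is lost each time.
struct RebuildingPass {
  int recomputations = 0;
  void run() {
    SchedulerImpl impl;
    impl.run();
    recomputations += impl.info.timesComputed;
  }
};

// Holds the Impl as a member, so the state survives across runs.
struct PersistentPass {
  SchedulerImpl impl;
  void run() { impl.run(); }
};
```

Running each wrapper three times computes the state three times in `RebuildingPass` and once in `PersistentPass`.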

---------

Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>
2025-02-12 18:54:39 +05:30
Akshat Oke
564b9b7f4d
Revert "CodeGen][NewPM] Port MachineScheduler to NPM. (#125703)" (#126268)
This reverts commit 5aa4979c47255770cac7b557f3e4a980d0131d69 while I
investigate what's causing the compile-time regression.
2025-02-08 15:36:48 +05:30
Christudasan Devadasan
d86e379fd2
[CodeGen][NewPM] Port StackSlotColoring to NPM. (#125876) 2025-02-05 23:18:16 +05:30
Akshat Oke
f77f777f35
[CodeGen][NewPM] Port RenameIndependentSubregs to NPM (#125192) 2025-02-05 17:54:57 +05:30
Christudasan Devadasan
44f638f88e
[CodeGen][NewPM] Port PostRAScheduler to NPM. (#125798) 2025-02-05 12:45:59 +05:30