llvm-project

Author	SHA1	Message	Date
Diana Picus	24405f070f	[AMDGPU] Add intrinsic exposing s_alloc_vgpr (#163951 ) Make it possible to use `s_alloc_vgpr` at the IR level. This is a huge footgun and use for anything other than compiler internal purposes is heavily discouraged. The calling code must make sure that it does not allocate fewer VGPRs than necessary - the intrinsic is NOT a request to the backend to limit the number of VGPRs it uses (in essence it's not so different from what we do with the dynamic VGPR flags of the `amdgcn.cs.chain` intrinsic, it just makes it possible to use this functionality in other scenarios).	2026-02-10 09:28:31 +01:00
Pierre van Houtryve	b79ba02479	[AMDGPU][GFX12.5] Reimplement monitor load as an atomic operation (#177343 ) Load monitor operations make more sense as atomic operations, as non-atomic operations cannot be used for inter-thread communication w/o additional synchronization. The previous built-in made it work because one could just override the CPol bits, but that bypasses the memory model and forces the user to learn about ISA bits encoding. Making load monitor an atomic operation has a couple of advantages. First, the memory model foundation for it is stronger. We just lean on the existing rules for atomic operations. Second, the CPol bits are abstracted away from the user, which avoids leaking ISA details into the API. This patch also adds supporting memory model and intrinsics documentation to AMDGPUUsage. Solves SWDEV-516398.	2026-02-09 09:57:27 +01:00
paperchalice	c53acf0443	[SelectionDAGBuilder] Remove NoNaNsFPMath uses (#169904 ) Replaced by checking fast-math flags or value tracking results.	2026-02-09 09:48:07 +08:00
Nicolai Hähnle	3e1e86ef1f	[AMDGPU] Return two MMOs for load-to-lds and store-from-lds intrinsics (#175845 ) Accurately represent both the load and the store part of those intrinsics. The test changes seem to be mostly fairly insignificant changes caused by subtly different scheduler behavior.	2026-02-04 12:29:49 -08:00
Diana Picus	9022f47ca4	[AMDGPU] Implement llvm.sponentry (#176357 ) In some of our use cases, the GPU runtime stores some data at the top of the stack. It figures out where it's safe to store it by using the PAL metadata generated by the backend, which includes the total stack size. However, the metadata does not include the space reserved at the bottom of the stack for the trap handler when CWSR is enabled in dynamic VGPR mode. This space is reserved dynamically based on whether or not the code is running on the compute queue. Therefore, the runtime needs a way to take that into account. Add support for `llvm.sponentry`, which should return the base of the stack, skipping over any reserved areas. This allows us to keep this computation in one place rather than duplicate it between the backend and the runtime. The implementation for functions that set up their own stack uses a pseudo that is expanded to the same code sequence as that used in the prolog to set up the stack in the first place. In callable functions, we generate a fixed stack object and use that instead, similar to the Arm/AArch64 approach. This wastes some stack space but that's not a problem for now because we're not planning to use this in callable functions yet.	2026-02-03 15:02:07 +01:00
Nicolai Hähnle	6f0b873f1c	[CodeGen] Refactor targets to override the new getTgtMemIntrinsic overload (NFC) (#175844 ) This is a fairly mechanical change. Instead of returning true/false, we either keep the Infos vector empty or push one entry.	2026-02-02 17:40:02 -08:00
Aaditya	4ded7e0733	[AMDGPU] Add wave reduce intrinsics for double types - 2 (#170812 ) Supported Ops: `add`, `sub`	2026-01-30 18:13:25 +05:30
Aaditya	4238693e09	[AMDGPU] Add wave reduce intrinsics for double types - 1 (#170811 ) Supported Ops: `min`, `max`	2026-01-30 10:12:44 +01:00
Carl Ritson	447f1e43bb	[AMDGPU] Implement llvm.fptosi.sat and llvm.fptoui.sat (#174726 ) Certain graphics APIs explicitly want the semantics of saturated conversions, particularly w.r.t. edge cases like NaN. The underlying hardware instructions (v_cvt_*) provide the expected behaviour so llvm.fptosi.sat and llvm.fptoui.sat can be implemented directly. Limitations: - conversion to i64 is not handled (default expansion is used) - v_cvt_u16_f16 and v_cvt_i16_f16 are not utilized (future work) - scalar float is untested/unoptimized (future work)	2026-01-30 17:07:40 +09:00
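The saturating semantics mentioned in the entry above can be illustrated with a small scalar model. This is only a sketch of the behaviour the LLVM language reference documents for `llvm.fptosi.sat` on a float to i32 conversion (NaN maps to zero, out-of-range values clamp). The function name `fptosiSat32` is hypothetical, and the code says nothing about how the AMDGPU backend actually selects `v_cvt_*`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Scalar model of llvm.fptosi.sat for float -> i32.
// Hypothetical helper name; not the backend lowering.
inline int32_t fptosiSat32(float X) {
  if (std::isnan(X))
    return 0; // NaN saturates to zero
  if (X <= -2147483648.0f) // at or below -2^31
    return std::numeric_limits<int32_t>::min();
  if (X >= 2147483648.0f) // at or above 2^31
    return std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(X); // in range: truncate toward zero
}
```

A plain `static_cast` on an out-of-range or NaN input is undefined behaviour in C++, which is why the range checks come first.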
Kewen Meng	120b482375	Revert "[AMDGPU] Replace AMDGPUISD::FFBH_I32 with ISD::CTLS" (#178837 ) Revert to unblock buildbot: https://lab.llvm.org/buildbot/#/builders/206/builds/12769	2026-01-29 21:19:15 -08:00
Dmitry Sidorov	65925b0405	[AMDGPU] Replace AMDGPUISD::FFBH_I32 with ISD::CTLS (#178420 ) The CDNA4 ISA describes V_FFBH_I32 as: "Count the number of leading bits that are the same as the sign bit of a vector input and store the result into a vector register. Store -1 if all input bits are the same." This matches CTLS semantics. Addresses: https://github.com/llvm/llvm-project/issues/177635	2026-01-30 01:36:28 +01:00
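The ISA wording quoted in the entry above can be read as the following scalar model. This is a sketch of a literal reading of that sentence only: the count here includes the sign bit itself, which is an assumption, and the code makes no claim about the exact definition of `ISD::CTLS`. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Literal reading of the quoted V_FFBH_I32 description: count leading bits
// equal to the sign bit, or return -1 when every bit is the same.
// Assumption: the sign bit itself is included in the count.
constexpr int32_t countLeadingSignMatches(int32_t X) {
  const uint32_t U = static_cast<uint32_t>(X);
  const uint32_t Sign = U >> 31;
  int32_t Count = 0;
  for (int Bit = 31; Bit >= 0; --Bit) {
    if (((U >> Bit) & 1u) != Sign)
      return Count; // first bit that differs from the sign bit
    ++Count;
  }
  return -1; // all 32 bits match the sign bit (input is 0 or -1)
}
```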
Carl Ritson	12c13e0009	[AMDGPU][GFX1250] Implement offset handling in s.buffer.load (#178389 ) Divergent path of s.buffer.load must handle 32b offset extension behaviour on GFX1250. Tests in llvm.amdgcn.s.buffer.load.ll are rewritten to avoid using export instructions not available on GFX1250.	2026-01-29 18:00:48 +09:00
macurtis-amd	5d018e93fe	AMDGPU: Perform zero/any extend combine into permute (#177370 ) Increases opportunities to generate permutes. Motivated by sub-optimal code generation in a CK kernel.	2026-01-28 10:47:22 -06:00
Mariusz Sikora	3c0f5045e1	[AMDGPU] Add FeatureGFX13 and SMEM encoding for gfx13 (#177567 ) For now, the list of features is based on gfx12 and gfx1250. --------- Co-authored-by: Jay Foad <jay.foad@amd.com>	2026-01-26 14:16:36 +01:00
Shilei Tian	786a20710d	[NFCI][AMDGPU] Use `GET_SUBTARGETINFO_MACRO` in `GCNSubtarget.h` and `R600Subtarget.h` (#177402 ) We can finally get rid of the manually defined boolean variables, like other targets. Even though most of them are now defined by macros, we still need to add the entries.	2026-01-25 09:38:42 -05:00
Matt Arsenault	98b55bcdec	AMDGPU: Move f16 legality configuration to SITargetLowering (#177629 ) f16 is never legal for R600 so this should not be in the common base class.	2026-01-23 18:36:26 +00:00
Sam Elliott	7184229fea	[NFC][MI] Tidy Up RegState enum use (2/2) (#177090 ) This change makes `RegState` into an enum class with bitwise operators. It also: - Updates declarations of flag variables/arguments/returns from `unsigned` to `RegState`. - Updates empty RegState initializers from 0 to `{}`. If this is causing problems in downstream code: - Adopt the `RegState getXXXRegState(bool)` functions instead of using a ternary operator such as `bool ? RegState::XXX : 0`. - Adopt the `bool hasRegState(RegState, RegState)` function instead of using a bitwise check of the flags.	2026-01-23 00:19:03 -08:00
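For downstream code affected by the `RegState` entry above, the migration pattern might look roughly like the sketch below. Everything lives in a `sketch` namespace because the enumerator values are hypothetical, and the exact behaviour of `hasRegState` (here: all bits of the second argument are set in the first) is an assumption rather than something the commit message states.

```cpp
#include <cassert>

namespace sketch {
// Hypothetical flag values; LLVM's real RegState enumerators differ.
enum class RegState : unsigned {
  Define = 1u << 0,
  Kill = 1u << 1,
  Dead = 1u << 2,
};

// A scoped enum has no implicit bitwise operators, so they are spelled out.
constexpr RegState operator|(RegState A, RegState B) {
  return static_cast<RegState>(static_cast<unsigned>(A) |
                               static_cast<unsigned>(B));
}
constexpr RegState operator&(RegState A, RegState B) {
  return static_cast<RegState>(static_cast<unsigned>(A) &
                               static_cast<unsigned>(B));
}

// Assumed semantics: true when every bit of Test is set in Flags.
constexpr bool hasRegState(RegState Flags, RegState Test) {
  return (Flags & Test) == Test;
}

// Stands in for `IsKill ? RegState::Kill : 0`, which no longer type-checks
// once RegState is a scoped enum.
constexpr RegState getKillRegState(bool IsKill) {
  return IsKill ? RegState::Kill : RegState{};
}
} // namespace sketch
```

The point of the helper functions is that both arms of the conditional have type `RegState`, so callers never need a raw `0`.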
Matt Arsenault	3c40eadfca	AMDGPU: Avoid introducing illegal fminnum_ieee/fmaxnum_ieee (#177418 ) Avoid introducing fminnum_ieee/fmaxnum_ieee on f16 if f16 is not legal. This avoids regressing minimum/maximum cases in a future commit.	2026-01-22 21:48:51 +01:00
Jameson Nash	d10b2b566a	[NFCI] replace getValueType with new getGlobalSize query (#177186 ) Returns uint64_t to simplify callers. The goal is to eventually replace getValueType with this query, which should return the known minimum reference-able size, as provided (instead of a Type) during create. Additionally, the common isSized query would be replaced with an isExactKnownSize query to test whether that size is an exact definition.	2026-01-22 13:55:53 -05:00
Matt Arsenault	056e5a32c8	AMDGPU: Change ABI of 16-bit scalar values for gfx6/gfx7 (#175795 ) Keep bf16/f16 values encoded as the low half of a 32-bit register, instead of promoting to float. This avoids unwanted FP effects from the fpext/fptrunc which should not be implied by just passing an argument. This also fixes ABI divergence between SelectionDAG and GlobalISel. I've wanted to make this change for ages, and failed the last few times. The main complication was the hack to return shader integer types in SGPRs, which now needs to inspect the underlying IR type.	2026-01-22 18:34:06 +00:00
Shilei Tian	4b1cfc5d7c	[NFCI][AMDGPU] Final touch before moving to `GET_SUBTARGETINFO_MACRO` (#177401 )	2026-01-22 17:33:17 +00:00
Matt Arsenault	a97f5ec95f	AMDGPU: Change ABI of 16-bit element vectors on gfx6/7 (#175781 ) Fix the ABI on old subtargets to match new subtargets, packing 16-bit element subvectors into 32-bit registers. Previously this would be scalarized and promoted to i32/float. Note this only changes the vector cases. Scalar i16/half are still promoted to i32/float for now. I've unsuccessfully tried to make that switch in the past, so leave that for later. This will help with removal of softPromoteHalfType.	2026-01-22 17:24:29 +01:00
Shilei Tian	02d34a76f7	[NFCI][AMDGPU] Remove more redundant code from `GCNSubtarget.h` (#177297 ) We are getting pretty close to using `GET_SUBTARGETINFO_MACRO` in the header with this cleanup.	2026-01-22 09:07:15 -05:00
Shilei Tian	1843a7fe9f	[NFCI][AMDGPU] Use X-macro to reduce boilerplate in `GCNSubtarget.h` (#176844 ) `GCNSubtarget.h` contained a large amount of repetitive code following the pattern `bool HasXXX = false;` for member declarations and `bool hasXXX() const { return HasXXX; }` for getters. This boilerplate made the file unnecessarily long and harder to maintain. This patch introduces an X-macro pattern `GCN_SUBTARGET_HAS_FEATURE` that consolidates 135 simple subtarget features into a single list. The macro is expanded twice: once in the protected section to generate member variable declarations, and once in the public section to generate the corresponding getter methods. This reduces the file by approximately 600 lines while preserving the exact same API and functionality. Features with complex getter logic or inconsistent naming conventions are left as manual implementations for future improvement. Ideally, these could be generated by TableGen using `GET_SUBTARGETINFO_MACRO`, similar to the X86 backend. However, `AMDGPU.td` has several issues that prevent direct adoption: duplicate field names (e.g., `DumpCode` is set by both `FeatureDumpCode` and `FeatureDumpCodeLower`), and inconsistent naming conventions where many features don't have the `Has` prefix (e.g., `FlatAddressSpace`, `GFX10Insts`, `FP64`). Fixing these issues would require renaming fields in `AMDGPU.td` and updating all references, which is left for future work.	2026-01-21 15:29:09 -05:00
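The X-macro pattern described in the entry above can be sketched as follows. The class, macro, and feature names (`SketchSubtarget`, `SKETCH_SUBTARGET_HAS_FEATURE`, `FeatureA`, `FeatureB`) are hypothetical stand-ins, not the contents of `GCNSubtarget.h`. The third expansion, which generates `enable...` methods, exists only so the sketch can flip a flag and is not part of the pattern the commit describes.

```cpp
#include <cassert>

// Hypothetical feature list; each X(...) entry names one boolean feature.
#define SKETCH_SUBTARGET_HAS_FEATURE(X) \
  X(FeatureA) \
  X(FeatureB)

class SketchSubtarget {
protected:
  // First expansion: one member per feature, e.g. `bool HasFeatureA = false;`.
#define SKETCH_DECL_MEMBER(Name) bool Has##Name = false;
  SKETCH_SUBTARGET_HAS_FEATURE(SKETCH_DECL_MEMBER)
#undef SKETCH_DECL_MEMBER

public:
  // Second expansion: one getter per feature, e.g. `hasFeatureA()`.
#define SKETCH_DECL_GETTER(Name) \
  bool has##Name() const { return Has##Name; }
  SKETCH_SUBTARGET_HAS_FEATURE(SKETCH_DECL_GETTER)
#undef SKETCH_DECL_GETTER

  // Sketch-only expansion so the flags can be set from outside.
#define SKETCH_DECL_ENABLE(Name) \
  SketchSubtarget &enable##Name() { \
    Has##Name = true; \
    return *this; \
  }
  SKETCH_SUBTARGET_HAS_FEATURE(SKETCH_DECL_ENABLE)
#undef SKETCH_DECL_ENABLE
};
```

Adding a feature then means adding one `X(...)` line to the list, and the member and getter stay in sync by construction.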
Matt Arsenault	9bd0db7ad5	AMDGPU: Handle FP in integer in argument lowering (#175835 ) This avoids an assertion when softPromoteHalfType is enabled.	2026-01-20 20:20:52 +00:00
Brox Chen	dd83ead9a5	[AMDGPU][True16] extractEltcheap check 16bit in true16 mode (#171762 )	2026-01-20 09:45:05 -05:00
Frederik Harwath	5fec9fb3cf	[AMDGPU] Enable ISD::{FSIN,FCOS} custom lowering to work on v2f16 (#176382 ) Currently ISD::FSIN and ISD::FCOS of type MVT::v2f16 are legalized by first expanding and then using a custom lowering on the resulting f16 instructions. This ordering prevents using packed math variants of the instructions introduced by the legalization (e.g. the multiplication) and makes it difficult to deal with the resulting IR in peephole optimizations (e.g. si-peephole-sdwa). Change the legalization action for ISD::FSIN and ISD::FCOS of type MVT::v2f16 to Custom and change the custom trig lowering to deal with vectors.	2026-01-20 07:35:54 +01:00
Hongyu Chen	007f1af30e	[AMDGPU] Use APInt in performSetCCCombine (#176564 ) Fixes #176559.	2026-01-20 09:14:25 +08:00
Akshay Deodhar	3860147a7f	[NFC][TargetLowering] Make shouldExpandAtomicRMWInIR and shouldExpandAtomicCmpXchgInIR take a const Instruction pointer (#176073 ) Splits out change from https://github.com/llvm/llvm-project/pull/176015 Changes shouldExpandAtomicRMWInIR to take a constant argument: This is to allow some other TargetLowering constant-argument functions to call it. This change touches several backends. An alternative solution exists, but to me, this seems the "right" way.	2026-01-15 14:22:57 -08:00
Frederik Harwath	4e00719777	[AMDGPU] Remove unnecessary AddPromotedToType use from SIIselLowering (NFC) (#175994 )	2026-01-14 19:38:25 +01:00
Matt Arsenault	2e0e4f6cb3	AMDGPU: Directly use v2bf16 as register type for bf16 vectors. (#175761 ) Previously we were casting v2bf16 to i32, unlike the f16 case. Simplify this by using the natural vector type. This is probably a leftover from before v2bf16 was treated as legal. This is preparation for fixing a miscompile in globalisel.	2026-01-13 17:48:38 +01:00
Shilei Tian	5a63367b15	Reapply "[AMDGPU] Rework the clamp support for WMMA instructions" (#174674 ) (#174697 ) This reverts commit 0b2f3cfb72a76fa90f3ec2a234caabe0d0712590.	2026-01-07 06:12:19 +00:00
dyung	0b2f3cfb72	Revert "[AMDGPU] Rework the clamp support for WMMA instructions" (#174674 ) Reverts llvm/llvm-project#174310 This change is causing 2 cross-project-test failures on https://lab.llvm.org/buildbot/#/builders/174/builds/29695	2026-01-07 01:18:23 +00:00
Shilei Tian	ccca3b8c67	[AMDGPU] Rework the clamp support for WMMA instructions (#174310 ) Fixes #166989.	2026-01-06 15:46:40 -05:00
saxlungs	c262893f4b	Reland "[AMDGPU] Add new llvm.amdgcn.wave.shuffle intrinsic (#167372 )" (#174614 ) This change adds a new intrinsic for AMDGPU that implements a wave shuffle, allowing arbitrary swizzling between lanes using an index. In the initial version of this commit, there was an issue in one of the tests added that returned a signal, causing testing to fail when combined with another recent change to 'not'. For context on the initial commit see #167372 --------- Signed-off-by: Domenic Nutile <domenic.nutile@gmail.com> Co-authored-by: Jay Foad <jay.foad@gmail.com>	2026-01-06 15:02:08 -05:00
Joe Nash	4bca00d56b	Revert "[AMDGPU] Add new llvm.amdgcn.wave.shuffle intrinsic" (#174501 ) Reverts llvm/llvm-project#167372	2026-01-05 17:52:28 -05:00
saxlungs	b9fbc19017	[AMDGPU] Add new llvm.amdgcn.wave.shuffle intrinsic (#167372 ) This intrinsic will be useful for implementing the OpGroupNonUniformShuffle operation in the SPIR-V reference --------- Signed-off-by: Domenic Nutile <domenic.nutile@gmail.com> Co-authored-by: Jay Foad <jay.foad@gmail.com>	2026-01-05 17:15:58 -05:00
Matt Arsenault	9ad39dd116	AMDGPU: Avoid crashing on statepoint-like pseudoinstructions (#170657 ) At the moment the MIR tests are somewhat redundant. The waitcnt one is needed to ensure we actually have a load, given we are currently just emitting an error on ExternalSymbol. The asm printer one is more redundant for the moment, since it's stressed by the IR test. However I am planning to change the error path for the IR test, so it will soon not be redundant.	2025-12-29 19:08:08 +01:00
Islam Imad	7ceecfad40	[CodeGen] Fix EVT::changeVectorElementType assertion on simple-to-extended fallback (#173413 ) Fixes #171608	2025-12-28 18:51:18 +00:00
Jay Foad	35c2dbd481	[AMDGPU] Remove trivially true predicates from GCNSubtarget. NFC. (#172830 )	2025-12-18 11:05:34 +00:00
Matt Arsenault	68aea8e202	AMDGPU: Avoid introducing unnecessary fabs in fast fdiv lowering (#172553 ) If the sign bit of the denominator is known 0, do not emit the fabs. Also, extend this to handle min/max with fabs inputs. I originally tried to do this as the general combine on fabs, but it proved to be too much trouble at this time. This is mostly complexity introduced by expanding the various min/maxes into canonicalizes, and then not being able to assume the sign bit of canonicalize (fabs x) without nnan. This defends against future code size regressions in the atan2 and atan2pi library functions.	2025-12-17 00:22:12 +01:00
Juan Manuel Martinez Caamaño	c13bf9eb26	Reapply "[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST (#170323 ) (#171838 ) A buildbot failed for the original patch. https://github.com/llvm/llvm-project/pull/171835 addresses the issue raised by the buildbot. After the fix is merged, the original patch is reapplied without any change.	2025-12-15 09:05:00 +01:00
Matt Arsenault	2af693bbec	AMDGPU: Fix selection failure on bf16 inverse sqrt (#172044 ) On !hasBF16TransInsts targets, an illegal rsq would form and fail to select.	2025-12-12 18:10:08 +01:00
Juan Manuel Martinez Caamaño	c02978867e	Revert "[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST (#170323 ) (#171787 ) ``` Step 7 (test-check-all) failure: Test just built components: check-all completed (failure) ****************** TEST 'LLVM :: CodeGen/AMDGPU/insert_vector_dynelt.ll' FAILED ****************** Exit Code: 1 Command Output (stdout): -- # RUN: at line 2 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll \| /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll # executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji # executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll # RUN: at line 3 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll \| /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck --check-prefixes=GCN-O0 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll # executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji # .---command stderr------------ # \| # \| # After Instruction Selection # \| # Machine code for function insert_dyn_i32_6: IsSSA, TracksLiveness # \| Function Live Ins: 
$sgpr16 in %8, $sgpr17 in %9, $sgpr18 in %10, $sgpr19 in %11, $sgpr20 in %12, $sgpr21 in %13, $vgpr0 in %14, $vgpr1 in %15 # \| # \| bb.0 (%ir-block.0): # \| successors: %bb.1(0x80000000); %bb.1(100.00%) # \| liveins: $sgpr16, $sgpr17, $sgpr18, $sgpr19, $sgpr20, $sgpr21, $vgpr0, $vgpr1 # \| %15:vgpr_32 = COPY $vgpr1 # \| %14:vgpr_32 = COPY $vgpr0 # \| %13:sgpr_32 = COPY $sgpr21 # \| %12:sgpr_32 = COPY $sgpr20 # \| %11:sgpr_32 = COPY $sgpr19 # \| %10:sgpr_32 = COPY $sgpr18 # \| %9:sgpr_32 = COPY $sgpr17 # \| %8:sgpr_32 = COPY $sgpr16 # \| %17:sgpr_192 = REG_SEQUENCE %8:sgpr_32, %subreg.sub0, %9:sgpr_32, %subreg.sub1, %10:sgpr_32, %subreg.sub2, %11:sgpr_32, %subreg.sub3, %12:sgpr_32, %subreg.sub4, %13:sgpr_32, %subreg.sub5 # \| %16:sgpr_192 = COPY %17:sgpr_192 # \| %19:vreg_192 = COPY %17:sgpr_192 # \| %28:sreg_64_xexec = IMPLICIT_DEF # \| %27:sreg_64_xexec = S_MOV_B64 $exec # \| # \| bb.1: # \| ; predecessors: %bb.1, %bb.0 # \| successors: %bb.1(0x40000000), %bb.3(0x40000000); %bb.1(50.00%), %bb.3(50.00%) # \| # \| %26:vreg_192 = PHI %19:vreg_192, %bb.0, %18:vreg_192, %bb.1 # \| %29:sreg_64 = PHI %28:sreg_64_xexec, %bb.0, %30:sreg_64, %bb.1 # \| %31:sreg_32_xm0 = V_READFIRSTLANE_B32 %14:vgpr_32, implicit $exec # \| %32:sreg_64 = V_CMP_EQ_U32_e64 %31:sreg_32_xm0, %14:vgpr_32, implicit $exec # \| %30:sreg_64 = S_AND_SAVEEXEC_B64 killed %32:sreg_64, implicit-def $exec, implicit-def $scc, implicit $exec # \| $m0 = COPY killed %31:sreg_32_xm0 # \| %18:vreg_192 = V_INDIRECT_REG_WRITE_MOVREL_B32_V8 %26:vreg_192(tied-def 0), %15:vgpr_32, 3, implicit $m0, implicit $exec # \| $exec = S_XOR_B64_term $exec, %30:sreg_64, implicit-def $scc # \| S_CBRANCH_EXECNZ %bb.1, implicit $exec # \| # \| bb.3: ``` This reverts commit 15df9e701f1f1194a25e6123612cc735ad392ae4.	2025-12-11 10:08:20 +00:00
Juan Manuel Martinez Caamaño	15df9e701f	[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST (#170323 ) Before this patch, `insertelement/extractelement` with dynamic indices would fail to select with `-O0` for vector 32-bit element types with sizes 3, 5, 6 and 7, which did not map to a `SI_INDIRECT_SRC/DST` pattern. Other "weird" sizes bigger than 8 (like 13) are properly handled already. To solve this issue we add the missing patterns for the problematic sizes. Solves SWDEV-568862	2025-12-11 09:17:43 +01:00
Jay Foad	6ae0b9f586	[AMDGPU] Implement codegen for GFX11+ V_CVT_PK_[IU]16_F32 (#168719 )	2025-12-10 22:26:59 +00:00
Mirko Brkušanin	5759a3a779	[AMDGPU] Add s_wakeup_barrier instruction for gfx1250 (#170501 )	2025-12-10 09:45:13 +01:00
anjenner	27651133e2	AMDGPU: Drop and upgrade llvm.amdgcn.atomic.csub/cond.sub to atomicrmw (#105553 ) These both perform conditional subtraction, returning the minuend and zero respectively, if the difference is negative.	2025-12-09 23:13:33 +00:00
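The two flavours of conditional subtraction mentioned in the entry above can be modelled in scalar code. This sketch assumes the relevant `atomicrmw` operations are `usub_cond` and `usub_sat` as documented in the LLVM language reference. The commit message does not spell out which intrinsic maps to which operation, so no mapping is claimed here, and the helper names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Value stored by `atomicrmw usub_cond`: keep the old value (the minuend)
// when the unsigned subtraction would go below zero.
constexpr uint32_t usubCond(uint32_t Old, uint32_t Val) {
  return Old >= Val ? Old - Val : Old;
}

// Value stored by `atomicrmw usub_sat`: clamp to zero instead.
constexpr uint32_t usubSat(uint32_t Old, uint32_t Val) {
  return Old >= Val ? Old - Val : 0u;
}

// Hypothetical helper: applies Op atomically with a compare-exchange loop
// and, like atomicrmw, returns the value that was in memory beforehand.
template <typename OpT>
uint32_t atomicApply(std::atomic<uint32_t> &Mem, uint32_t Val, OpT Op) {
  uint32_t Old = Mem.load();
  // On failure compare_exchange_weak refreshes Old, so Op is recomputed.
  while (!Mem.compare_exchange_weak(Old, Op(Old, Val))) {
  }
  return Old;
}
```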
Shilei Tian	3ccd67295b	[AMDGPU] Fix a crash when a bool variable is used in inline asm (#171004 ) Fixes SWDEV-570184.	2025-12-08 14:44:21 -05:00
Dark Steve	cc19f420b9	[AMDGPU][NPM] Port AMDGPUArgumentUsageInfo to NPM (#170886 ) Port AMDGPUArgumentUsageInfo analysis to the NPM to fix suboptimal code generation when NPM is enabled by default. Previously, DAG.getPass() returns nullptr when using NPM, causing the argument usage info to be unavailable during ISel. This resulted in fallback to FixedABIFunctionInfo which assumes all implicit arguments are needed, generating unnecessary register setup code for entry functions. Fixes LLVM::CodeGen/AMDGPU/cc-entry.ll Changes: - Split AMDGPUArgumentUsageInfo into a data class and NPM analysis wrapper - Update SIISelLowering to use DAG.getMFAM() for NPM path - Add RequireAnalysisPass in addPreISel() to ensure analysis availability This follows the same pattern used for PhysicalRegisterUsageInfo.	2025-12-08 20:38:00 +05:30


1744 Commits