llvm-project

Author	SHA1	Message	Date
Yeongu Choe	df461c164c	[CIR][CodeGen] Implement __builtin_fpclassify (#187977 ) I implemented CIR version of __builtin_fpclassify function.	2026-04-06 14:41:55 -07:00
Amir Ayupov	a8cf1a0352	[BOLT] Allow empty buildid in pre-aggregated profile addresses (#190675 ) Allow `parseString()` to return an empty `StringRef` when the delimiter appears at position 0. This enables parsing pre-aggregated profile addresses with an omitted buildid but preserved colon (`:addr` format), where the empty buildid corresponds to the main binary. Previously, `parseString()` rejected zero-length fields by treating `StringEnd == 0` the same as `StringRef::npos` (delimiter not found). These are distinct situations: `npos` means no delimiter exists, while `0` means the field before the delimiter is empty. The fix removes the `StringEnd == 0` sub-condition so only the missing-delimiter case errors. The existing test for buildid-prefixed addresses is extended to also verify that `:addr` input produces identical output to the plain-address and non-empty-buildid variants. Test Plan: Added empty-buildid input file and extended `pre-aggregated-perf-buildid.test` to run perf2bolt with `:addr` format and diff the fdata output against the existing buildid-prefixed result.	2026-04-06 14:41:21 -07:00
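The distinction between "delimiter not found" and "delimiter at position 0" can be sketched in standalone C++. This is an illustration only, not BOLT's `parseString()`: the function name `parseField`, its signature, and the sample addresses are hypothetical, and `std::string_view` stands in for `StringRef`.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the field parser described in the commit above.
// Returns the text before `delim`, or std::nullopt when no delimiter exists.
std::optional<std::string_view> parseField(std::string_view input, char delim) {
  const std::size_t end = input.find(delim);
  // Only a missing delimiter is an error. end == 0 is a valid, empty field.
  if (end == std::string_view::npos)
    return std::nullopt;
  return input.substr(0, end);
}

// Usage checks for the three address shapes mentioned in the commit message.
inline void parseFieldUsage() {
  assert(parseField(":4005f0", ':')->empty());             // omitted buildid
  assert(*parseField("abc123:4005f0", ':') == "abc123");   // explicit buildid
  assert(!parseField("4005f0", ':').has_value());          // no delimiter
}
```

Under this sketch, an empty field and a missing delimiter are reported differently, so a caller could map the empty buildid to the main binary.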
Steven Wu	79e669f000	[CAS] Revert an unintentional change in #190634 (#190686 ) Revert an unintentional change in #190634 that introduced an implicit signed-to-unsigned cast.	2026-04-06 21:37:15 +00:00
Ehsan Amiri	8a11fe97a2	[DA] Require `nsw` for AddRecs involved in GCD test (#186892 ) Similar to other tests, we are adding a check that the AddRecs used in the GCD test are `nsw`. In this case, all recursively identified `AddRec`s are also checked. Note that there is already a similar check in `getConstantCoefficient` for expressions processed in that function.	2026-04-06 17:33:16 -04:00
Sergei Barannikov	62ce560f68	[lldb] Remove some unreachable code (NFC) (#190529 ) The `isRISCV()` check always returns false because we only get here if `min_op_byte_size` and `max_op_byte_size` are equal, which is not true for RISC-V. Also, replace the `if (!got_op)` check with an `else`. The check is equivalent to `if (min_op_byte_size != max_op_byte_size)`, and the `if` above checks for the opposite condition.	2026-04-07 00:32:17 +03:00
Shilei Tian	ef715849d7	[NFC][AMDGPU] Add some debug prints to SIMemoryLegalizer (#190658 )	2026-04-06 17:17:33 -04:00
Jared Hoberock	7087ece044	[MLIR][ExecutionEngine] Tolerate CUDA_ERROR_DEINITIALIZED in mgpuModuleUnload (#190563 ) `mgpuModuleUnload` may be called from a global destructor (registered by `SelectObjectAttr`'s `appendToGlobalDtors`) after the CUDA primary context has already been destroyed during program shutdown. In this case, `cuModuleUnload` returns `CUDA_ERROR_DEINITIALIZED`, which is benign since the module's resources are already freed with the context. ## Reproduction Any program that uses `gpu.launch_func` and is AOT-compiled (via `mlir-translate --mlir-to-llvmir \| llc \| cc -lmlir_cuda_runtime`) will print `'cuModuleUnload(module)' failed with '<unknown>'` on exit. This is because `SelectObjectAttr` registers the module unload as a global destructor, which runs after the CUDA primary context is released. This script reproduces the error message from `mgpuModuleUnload` on my system: ``` #!/bin/bash set -e LLVM_BUILD=${LLVM_BUILD:-$HOME/dev/git/llvm-project-22/build} cat > /tmp/repro.mlir << 'MLIR' func.func @main() { %c1 = arith.constant 1 : index gpu.launch blocks(%bx, %by, %bz) in (%gx = %c1, %gy = %c1, %gz = %c1) threads(%tx, %ty, %tz) in (%bsx = %c1, %bsy = %c1, %bsz = %c1) { gpu.terminator } return } MLIR $LLVM_BUILD/bin/mlir-opt /tmp/repro.mlir \ -gpu-lower-to-nvvm-pipeline="cubin-format=fatbin" \ \| $LLVM_BUILD/bin/mlir-translate --mlir-to-llvmir -o /tmp/repro.ll $LLVM_BUILD/bin/llc -relocation-model=pic -filetype=obj /tmp/repro.ll -o /tmp/repro.o cc /tmp/repro.o \ -L$LLVM_BUILD/lib -Wl,-rpath,$LLVM_BUILD/lib \ -lmlir_cuda_runtime -lmlir_runner_utils -o /tmp/repro echo "Running:" /tmp/repro 2>&1 echo "Exit code: $?" 
``` ## Context This matches how other projects handle the same shutdown ordering issue: - Clang CUDA (D48613) switched module cleanup from `__attribute__((destructor))` to `atexit()` - GCC libgomp checks context validity before `cuModuleUnload` - Apache TVM silently ignores `CUDA_ERROR_DEINITIALIZED` on module unload Fixes #170833	2026-04-06 21:11:58 +00:00
Joe Nash	af95b0a615	[AMDGPU] Remove implicit super-reg defs on mov64 pseudos (#190379 ) The mov64 pseudo is split into two 32 bit movs, but those 32 bit movs had the full 64-bit register still implicitly defined. VOPD formation is affected, so we can emit more of them.	2026-04-06 21:11:06 +00:00
Jianhui Li	9bddf47198	[MLIR][XeGPU] Extend Wg-to-Sg Distribution of Multi-Reduction Op for round-robin layout (#189988 ) This PR enhances the multi-reduction op pattern of the wg-to-sg distribution pass: 1. allows each sg to have multiple distributions of sg_data tiles. 2. expands the slm buffer size. 3. constructs the layout based on the partially reduced vector and uses layout.computeDistributedCoords() to compute coordinates. The layout is constructed so that the store is cooperative and the load overlaps with neighbouring threads. 4. performs save and load.	2026-04-06 14:07:50 -07:00
Anshul Nigham	97d50c1490	[NewPM] Adds a port for AArch64PreLegalizerCombiner (#190567 ) Standard porting (note that TargetPassConfig dependency was [removed earlier](`e27e7e4339`)). --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2026-04-06 14:01:37 -07:00
Andrew	ee51de9836	[llvm-cov] add ability to show non executed test vectors for mc/dc coverage (#187517 ) - Added `-show-mcdc-non-executed-vectors` option - Non-executed test vectors are now tracked - When the option is present, they are written to the UI	2026-04-06 15:59:14 -05:00
Zile Xiong	d917027334	[llvm-cov] Guard against empty CountedRegions in findMainViewFileID (#189270 ) When processing coverage generated from branch coverage mode, some functions can reach findMainViewFileID with an empty CountedRegions list. In that case the current logic still proceeds to infer the main view file, even though there is no regular counted region available to do so. Return std::nullopt early when CountedRegions is empty. This was observed when reproducing issue #189169 with: cargo llvm-cov --lib --branch The issue appears related to branch-only coverage information being recorded separately in CountedBranchRegions, while findMainViewFileID currently only consults CountedRegions. This patch is a defensive fix for the empty-region case; further investigation may still be needed to determine whether branch regions should participate in main view file selection. Co-authored-by: Zile Xiong <xiongzile99@gmail.com>	2026-04-06 15:58:31 -05:00
Chinmay Deshpande	9033e872fd	[AMDGPU][GISel] RegBankLegalize rules for update_dpp (#190662 )	2026-04-06 13:52:10 -07:00
Alexis Engelke	89665812f5	[Analysis][NFC] Use block numbers in BlockFrequencyInfo (#190669 ) Block pointers are only stored while constructing the analysis, so the value handle to catch erased blocks is no longer needed when using stable block numbers.	2026-04-06 20:47:34 +00:00
Valentin Clement (バレンタインクレメン)	92b595b9b4	[flang][cuda] Take associate into account for host array diagnostic (#190673 )	2026-04-06 20:43:52 +00:00
Congzhe	fbe6d79465	[LoopFusion] Fix out-of-date LoopInfo being used during fusion (#189452 ) This is a fix for [187902](https://github.com/llvm/llvm-project/issues/187902), where `LoopInfo` is not in a valid state at the beginning of `ScalarEvolution::createSCEVIter`. The reason for the bug is that `mergeLatch()` is called at a place where control flow and dominator trees have been updated but `LoopInfo` has not completed the update yet. `mergeLatch()` calls into `ScalarEvolution`, which uses `LoopInfo`, and an out-of-date `LoopInfo` can result in a crash or unpredictable results. This patch moves `mergeLatch()` to the place where `LoopInfo` has completed its update and hence is in a valid state.	2026-04-06 16:35:28 -04:00
Steven Wu	1a0ca1019d	[CAS] Harden validate() against on-disk corruption (#190634 ) Fixes found by fuzzer: OnDiskTrieRawHashMap: - Bounds-check data slot offsets in TrieVerifier::visitSlot() before calling getRecord(), preventing asData() assertion on out-of-bounds trie entries. - Validate subtrie headers (NumBits, bounds) before constructing SubtrieHandle, preventing SEGV in getSlots() from corrupt NumBits. - Validate arena bump pointer alignment, catching misaligned BumpPtr that would crash store() with an alignment assertion. - Fix comma operator bug in getOrCreateRoot() where the compare_exchange_strong result was discarded, causing asSubtrie() assertion when RootTrieOffset was corrupted to zero. OnDiskGraphDB: - Reject invalid (zero) ref offsets in validate callback, preventing asData() assertion when corrupt data pool refs are resolved via recoverFromFileOffset(). - Validate DataRecordHandle layout flags before calling getTotalSize(), preventing llvm_unreachable on corrupt NumRefsFlags/DataSizeFlags. - Validate data pool bump pointer alignment, catching misaligned BumpPtr that would crash store() in DataRecordHandle::constructImpl(). - Check data record refs offset alignment before calling getRefs(), preventing PointerUnion assertion from misaligned refs pointer. MappedFileRegionArena: - Convert assertions in initializeHeader() to errors so corrupted arena headers return an error on CAS open instead of crashing. Assisted-By: Claude	2026-04-06 13:33:22 -07:00
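The "comma operator bug" class mentioned in the commit above can be sketched as follows. This is a generic illustration of how a comma expression discards the result of `compare_exchange_strong`, not the actual `getOrCreateRoot()` code: the function names, the `std::uint64_t` offsets, and the use of zero as the "unset" sentinel are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Buggy shape (hypothetical): the comma operator evaluates the exchange,
// throws away its bool result, and always yields `fresh`, even when another
// root offset was already installed.
std::uint64_t rootBuggy(std::atomic<std::uint64_t> &root, std::uint64_t fresh) {
  assert(fresh != 0 && "zero is the 'unset' sentinel in this sketch");
  std::uint64_t expected = 0;
  return (root.compare_exchange_strong(expected, fresh), fresh);
}

// Fixed shape (hypothetical): the result is checked. On failure,
// compare_exchange_strong stores the current value into `expected`,
// so the already-installed offset is returned instead.
std::uint64_t rootFixed(std::atomic<std::uint64_t> &root, std::uint64_t fresh) {
  assert(fresh != 0 && "zero is the 'unset' sentinel in this sketch");
  std::uint64_t expected = 0;
  if (root.compare_exchange_strong(expected, fresh))
    return fresh;
  return expected;
}
```

With a root that already holds a nonzero offset, the buggy variant reports the caller's `fresh` value while the stored root is unchanged, whereas the fixed variant reports the stored value.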
Arthur Eubanks	70d3dcaa64	Revert "[Inliner] Put inline history into IR as !inline_history metadata" (#190666 ) Reverts llvm/llvm-project#190092 Crashes reported in https://github.com/llvm/llvm-project/pull/190092#issuecomment-4194546908	2026-04-06 20:31:54 +00:00
Steven Wu	40d3949162	[CAS] Add llvm-cas-fuzzer for ObjectStore::validate() (#190635 ) Add a fuzzer that creates an on-disk CAS database, stores objects, then corrupts the on-disk data files using fuzzer-provided bytes and calls validate(). The goal is that validate() should either succeed or return an error, never crash. The fuzzer supports 6 corruption modes: byte-level mutations, file truncation, appending garbage, zeroing ranges, standalone file corruption, and combined mutations with continued CAS operations. Assisted-By: Claude	2026-04-06 13:31:51 -07:00
Jonas Devlieghere	950f1de70b	[lldb] Fix UUID tombstone Key (#190551 ) This changes `DenseMapInfo<UUID>::getTombstoneKey()` to return a 1-byte `{0xFF}` sentinel instead of the empty, default-constructed `UUID()`. Returning the same key for the empty and tombstone values apparently violates the `DenseMap` invariant.	2026-04-06 13:25:34 -07:00
Brian Cain	2aa4100fa7	[compiler-rt] Add hexagon to libFuzzer supported architectures (#190297 ) LibFuzzer builds successfully for Hexagon Linux.	2026-04-06 14:49:43 -05:00
Chinmay Deshpande	40d5a7d69e	[AMDGPU][UniformityAnalysis] Mark set_inactive and set_inactive_chain_arg as SourceOfDivergence (#190640 ) `set_inactive` produces a result that varies per-lane based on the EXEC mask, even when both inputs are uniform.	2026-04-06 12:40:22 -07:00
Aadarsh Keshri	326593b4b4	[Support][Modules] Removed prepareForGetLock and its usages. Ensured parent directory exists when creating lock file. (#189888 ) Following #187372	2026-04-06 12:37:32 -07:00
Lucas Ramirez	5e1162eebc	[CodeGen] Move rollback capabilities outside of the rematerializer (#184341 ) The rematerializer implements support for rolling back rematerializations by modifying MIs that should normally be deleted in an attempt to make them "transparent" to other analyses. This involves: 1. setting their opcode to DBG_VALUE and 2. setting their read register operands to the sentinel register. This approach has several drawbacks. 1. It forces the rematerializer to support tracking these "dead MIs" (even if support is optional, these data-structures have to exist). 2. It is not actually clear whether this mechanism will interact well with all other analyses. This is an issue since the intent of the rematerializer is to be usable in as many contexts as possible. 3. In practice, it has shown itself to be relatively error-prone. This commit removes rollback support from the rematerializer and moves those capabilities to a rematerializer listener than can be instantiated on-demand and implements the same functionality on top of standard rematerializer operations. The rematerializer now actually deletes MIs that are no longer useful after rematerializations, and has support for re-creating them on-demand without requiring additional tracking on its part.	2026-04-06 19:23:19 +00:00
Nerixyz	a2c9146da1	[lldb][NativePDB] Handle `S_DEFRANGE_REGISTER_REL_INDIR` (#190336 ) Since #189401, LLVM and Clang generate `S_DEFRANGE_REGISTER_REL_INDIR` for indirect locations. This adds support in LLDB. The offset added after dereferencing is signed here - unlike in `S_REGREL32_INDIR` (at least that's the assumption). So I updated `MakeRegisterBasedIndirectLocationExpressionInternal` to handle the signedness. This is the reason the MSVC test was changed here. I didn't find a test case where LLVM emits the record with the `VFRAME` register. Other than that, the clang test is similar to the MSVC one except that the locations are slightly different.	2026-04-06 21:21:47 +02:00
Daniel Thornburgh	fecf609998	Reland "[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916 )" (#190642 ) This reverts commit 1ec7e86b3a779df2a0af3f37e58c8f5b3a398d7f after issue #190072 was fixed.	2026-04-06 19:20:45 +00:00
Henry Jiang	412d6941e3	[VFS] Guard against null key/value nodes when parsing YAML overlay (#190506 ) When a VFS overlay YAML file contains malformed content such as tabs, the YAML parser can produce KeyValueNode entries where `getKey` returns nullptr. The VFS overlay parser then passes the nullptr to `parseScalarString`, which then calls dyn_cast. Switch to `dyn_cast_if_present` for the above callsites and a few more.	2026-04-06 12:10:26 -07:00
Keith Smiley	04e2be73a6	[bazel] Fix TestingSupport layering_check (#190630 ) I'm not sure if this header is public API upstream but we are using it that way anyways.	2026-04-06 12:03:45 -07:00
Brian Cain	ab43cb8520	[Hexagon] Pass -pie to linker when PIE is the toolchain default (#189723 ) The Hexagon driver only checked for an explicit -pie flag when constructing the link command, ignoring the toolchain's PIE default. For linux-musl targets, isPIEDefault() returns true (via the Linux toolchain base class), so the compiler generates PIC/PIE code (-pic-level 2 -pic-is-pie) but the linker never received -pie. This mismatch caused LTO failures: without -pie the linker sets Reloc::Static for the LTO backend, which generates GP-relative (small-data) references that lld cannot resolve. Use hasFlag() to respect the toolchain default, and guard the -pie emission against -shared and -r (relocatable) modes.	2026-04-06 13:58:43 -05:00
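The flag logic described in the commit above can be sketched in standalone C++. This is not the Hexagon driver code: the helper names, the plain string argument list, and the `-no-pie` negative flag are assumptions for illustration. It mirrors only the general idea of a "has flag with default" query, where the last of the positive/negative flags wins and the toolchain default applies otherwise.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// Last occurrence of `pos` or `neg` wins; `def` applies when neither appears.
bool hasFlag(const Args &args, const std::string &pos, const std::string &neg,
             bool def) {
  bool result = def;
  for (const std::string &arg : args) {
    if (arg == pos)
      result = true;
    else if (arg == neg)
      result = false;
  }
  return result;
}

bool contains(const Args &args, const std::string &flag) {
  for (const std::string &arg : args)
    if (arg == flag)
      return true;
  return false;
}

// Hypothetical decision: pass -pie to the linker unless the link is shared
// or relocatable, honoring the toolchain's PIE default.
bool shouldPassPie(const Args &args, bool pieIsDefault) {
  if (contains(args, "-shared") || contains(args, "-r"))
    return false;
  return hasFlag(args, "-pie", "-no-pie", pieIsDefault);
}

// Usage checks: the default applies when no flag is given.
inline void shouldPassPieUsage() {
  assert(shouldPassPie({}, /*pieIsDefault=*/true));
  assert(!shouldPassPie({}, /*pieIsDefault=*/false));
}
```

In this sketch, a PIE-by-default target gets `-pie` with no explicit flag, which is the case the commit says was previously missed.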
Stanislav Mekhanoshin	de0a81091b	[AMDGPU] Update vop3-literal.s to use fake16 on gfx1250. NFC (#190243 ) 16-bit instructions there are in fake16 mode and shall also be compatible with older targets. The purpose of the test is to check literals, so fake16 or real16 is not important.	2026-04-06 11:50:15 -07:00
Alexis Engelke	a105f27f61	[Scheduler][NFC] Don't use set to track visited nodes (#190480 ) The visited set can grow rather large and we can use an unused field in SDNode to store the same information without the use of a hash set. This improves compile times: stage2-O3 -0.14%.	2026-04-06 18:37:26 +00:00
Kirill Stoimenov	cdbb1f5014	Revert "[InstCombine] Fix #163110 : Support peeling off matching shifts from icmp operands via canEvaluateShifted" (#190638 ) Reverts llvm/llvm-project#165975 Breaks Sanitizer bots: https://lab.llvm.org/buildbot/#/builders/52/builds/16329	2026-04-06 11:30:36 -07:00
vporpo	8d442bc5b5	[SandboxVec][LoadStoreVec] Add support for constants (#189769 ) Up until now the pass would only vectorize load-store pairs. This patch implements vectorization of constant-store pairs.	2026-04-06 11:25:20 -07:00
neonetizen	e11a31f4c7	[CIR][AArch64] Lower FP16 vduph lane intrinsics (#186955 ) From #185382 Lower `vduph_lane_f16` and `vduph_laneq_f16` to `cir::VecExtractOp` Tests moved from `v8.2a-neon-instrinsics-generic.c` to a new CIR-enabled test file. I tried following from notes made in #185852 (BF16)	2026-04-06 19:12:34 +01:00
SiliconA-Z	5c13d2f099	[ARM] Enable creation of ARMISD::CMN nodes (#163223 ) Map ARMISD::CMN to tCMN instead of armcmpz. Rename the cmn instructions to match this new reality. Please note that I do not have merge permissions.	2026-04-06 20:05:14 +02:00
Craig Topper	38034d42bd	[RISCV] Use EVT instead of MVT in compressShuffleOfShuffles. (#190636 ) For the test case I just grabbed a test that exercised this code path and made the VT non-simple. Fixes #190605.	2026-04-06 11:03:38 -07:00
Chinmay Deshpande	12e957fd7f	[AMDGPU][GISel] RegBankLegalize rules for amdgcn_inverse_ballot (#190629 )	2026-04-06 10:30:35 -07:00
Tomer Shafir	37801e9e99	[MCA] Enhance debug prints of processor resources (#190132 ) Previously, `computeProcResourceMasks()` would print resource masks on debug mode from multiple call sites, creating noise in the debug output. This patch aims to fix this and also print more info about the resources. It splits to 2 types of debug prints for resources: 1. No simulation - mask only 2. Simulation - mask + other info For 2, it shares printing on a single place in `ResourceManager` constructor, that should cover all the other simulation cases indirectly: 1. `llvm/lib/MCA/HardwareUnits/ResourceManager` - covered 2. `llvm/lib/MCA/InstrBuilder.c` - should be covered indirectly - only used by `llvm-mca` before simulation that constructs a `ResourceManager` 3. `llvm/tools/llvm-mca/Views/SummaryView.cpp` - after simulation that constructs a `ResourceManager` 4. `llvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp` - after simulation that constructs a `ResourceManager` It also adds `BufferSize` to the output, which should be useful to debug scheduling model + MCA integration. For 1, it inlines mask-only printing into 2 other callers: 1. `llvm/include/llvm/MCA/Stages/InstructionTables.h` 2. `llvm/tools/llvm-exegesis/lib/SchedClassResolution.cpp` as they only use the masks there. I think this is a reasonable duplication across distinguishably different users/tools. Now every pair of callers, even across groups (1 and 2), effectively print in a mutually exclusive way. The patch adds debug tests for the 3 new callers, in the corresponding root test directories, to drive further location of logically target-independent tests that just require some target at the root. I think this convention is more discoverable, and is pretty widely used in the project.	2026-04-06 20:27:18 +03:00
Arthur Eubanks	72d4ce9889	[Inliner] Put inline history into IR as !inline_history metadata (#190092 ) So that it's preserved across all inline invocations rather than just one inliner pass run. This prevents cases where devirtualization in the simplification pipeline uncovers inlining opportunities that should be discarded due to inline history, but we dropped the inline history between inliner pass runs, causing code size to blow up, sometimes exponentially. For compile time reasons, we want to limit this to only call sites that have the potential to inline through SCCs, potentially with the help of devirtualization. This means that the callee is in a non-trivial (Ref)SCC, or the call site was previously an indirect call, which can potentially be devirtualized to call any function. The CGSCCUpdater::InlinedInternalEdges logic still seems to be relevant even with this change, as monster_scc.ll blows up if I remove that code. http://llvm-compile-time-tracker.com/compare.php?from=e830d88e8ae5f44a97cc76136a0a4e83aa9157c0&to=ed535e732fc41b79ab8efda2417886cbd0812f7f&stat=instructions:u Fixes #186926.	2026-04-06 10:24:41 -07:00
vangthao95	eb065bf028	AMDGPU/GlobalISel: RegBankLegalize rules for G_EXTRACT_VECTOR_ELT (#189144 )	2026-04-06 10:22:11 -07:00
Andrzej Warzyński	38c53b3eb9	[clang][cir][nfc] Fix comments, add missing EOF (#190623 )	2026-04-06 18:06:57 +01:00
Craig Topper	b44d2c977c	[RISCV] Use a vector MemVT when converting store+extractelt into a vector store. (#190107 ) This is needed so that `allowsMemoryAccessForAlignment` checks for unaligned vector memory support instead of unaligned scalar memory support when called from `RISCVTargetLowering::expandUnalignedVPStore` While there remove incorrect setting of the truncating store flag on the vector instruction. And restrict the transform to simple stores since we don't have tests for volatile or atomic. Fixes #189037	2026-04-06 09:58:04 -07:00
Craig Topper	0d14772a91	[RISCV][P-ext] Add isel patterns for macc.h00/macc.w00. (#190444 ) The RV32 macc.h00 instructions take the lower half words from rs1 and rs2, compute the full word product by extending the inputs, and add to rd. The RV64 macc.w00 is similar but operates on words and produces a double word result. I've restricted this to the case where the multiply has a single use. We don't have a general macc that multiplies the full xlen bits of rs1 and rs2, so I'm allowing the input to be sext_inreg/and or have sufficient sign/zero bits according to ComputeNumSignBits/computeKnownBits. We should also add mul.h00/mul.w00 patterns, but those we should restrict to at least one input being sext_inreg/and and prefer regular mul when there are no sext_inreg/and.	2026-04-06 09:57:29 -07:00
Wooseok Lee	0bef4c7aab	[AMDGPU] Add v2i32 and/or patterns for VOP3 AND_OR and OR3 operations (#188375 ) Add ThreeOp_v2i32_Pats pattern class to support v2i32 vector operations for AND_OR_B32 and OR3_B32 instructions. The new patterns check the v2i32 and-or or or-or instruction sequence, extract individual 32-bit elements from v2i32 operands, and apply the and_or or or3 vop3 operations.	2026-04-06 16:54:21 +00:00
Domenic Nutile	5b33f85a08	[AMDGPU] Change isSingleLaneExecution to account for WWM enabling lanes even if there's only one workitem (#188316 ) This issue was discovered during some downstream work around Vulkan CTS tests, specifically `dEQP-VK.subgroups.arithmetic.compute.subgroupadd_float` --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2026-04-06 12:51:46 -04:00
forking-google-bazel-bot[bot]	e7ac60c56b	[Bazel] Fixes ce1a9fd (#190577 ) This fixes ce1a9fd76640929fe340c5c5d1bb493ea09ca9bc. Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>	2026-04-06 09:40:22 -07:00
Valentin Clement (バレンタインクレメン)	baa1e5008b	[flang][cuda] Do not consider kernel result as host variable (#190626 )	2026-04-06 16:39:38 +00:00
adams381	9265f9284c	[mlir][ABI] Add writable, dead_on_unwind, dead_on_return, nofpclass param attrs to LLVM dialect (#188374 ) The MLIR LLVM dialect is missing support for several parameter attributes that exist in LLVM IR: `writable`, `dead_on_unwind`, `dead_on_return`, and `nofpclass`. This adds them to the kind-to-name mapping in `AttrKindDetail.h` and the corresponding name accessors in `LLVMDialect.td`. The existing generic conversion infrastructure in `ModuleTranslation` and `ModuleImport` picks them up automatically — `writable` and `dead_on_unwind` round-trip as `UnitAttr`, while `dead_on_return` and `nofpclass` round-trip as `IntegerAttr`. CIR needs these to match classic codegen's ABI output (sret gets `writable dead_on_unwind`, indirect args get `dead_on_return`, fast-math FP args get `nofpclass`).	2026-04-06 11:26:11 -05:00
Henrich Lauko	348295ac05	[CIR] Use data size in emitAggregateCopy for overlapping copies (#186702 ) Add skip_tail_padding property to cir.copy to handle potentially-overlapping subobject copies directly, instead of falling back to cir.libc.memcpy. When set, the lowering uses the record's data size (excluding tail padding) for the memcpy length. This keeps typed semantics and promotability of cir.copy. Also fix CXXABILowering to preserve op properties when recreating operations, and expose RecordType::computeStructDataSize() for computing data size of padded record types.	2026-04-06 18:24:10 +02:00
Eric Feng	930ef7736e	[mlir][amdgpu] Add optional write mask to amdgpu.global_load_async_to_lds (#190498 )	2026-04-06 09:21:32 -07:00


575626 Commits