This patch contains a list of tests that are currently failing in the
LLVM_ENABLE_PROFCHECK=ON build, which enables passing them to lit
through the LIT_XFAIL environment variable. This is necessary for
getting a buildbot spun up to catch regressions while work is being done
to fix the existing issues.
We need to keep this in the LLVM tree so that tests can be removed from
the list at the same time the passes causing issues are fixed.
Issue #147390
LSan was recently refactored to call GetMaxUserVirtualAddress for
diagnostic purposes. This leads to failures in some of our downstream
tests that run only with LSan. This occurs because
GetMaxUserVirtualAddress depends on the shadow being set up via a call
to __sanitizer_shadow_bounds, but shadow bounds are never set for
standalone LSan because it doesn't use shadow. This updates the function
to invoke the same syscall that __sanitizer_shadow_bounds uses to get
the memory limit. Ideally this function would be called only once, since
we only need to get the bounds once.
More context in https://fxbug.dev/437346226.
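A minimal sketch of the idea, assuming Zircon's root-VMAR query is the
underlying syscall (the names and structure here are assumptions, not
the actual runtime code):

```cpp
#include <zircon/process.h>
#include <zircon/syscalls.h>

#include <cstdint>

// Query the root VMAR once and cache the result, since the bounds
// cannot change for the lifetime of the process.
uintptr_t GetMaxUserVirtualAddress() {
  static const uintptr_t kMax = [] {
    zx_info_vmar_t Info;
    _zx_object_get_info(_zx_vmar_root_self(), ZX_INFO_VMAR, &Info,
                        sizeof(Info), nullptr, nullptr);
    return static_cast<uintptr_t>(Info.base + Info.len - 1);
  }();
  return kMax;
}
```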
Operate directly on the existing Ops vector instead of copying to
a new vector. This is similar to what the autogenerated codegen
does for other intrinsics.
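For illustration, a hedged sketch of the pattern (variable names are
assumed; the actual intrinsic-lowering code differs):

```cpp
// Before: copy the operand list just to append to it.
//   SmallVector<SDValue, 8> NewOps(Ops.begin(), Ops.end());
//   NewOps.push_back(Glue);
//   return DAG.getNode(Opc, DL, VTs, NewOps);
// After: append to the existing vector in place.
Ops.push_back(Glue);
return DAG.getNode(Opc, DL, VTs, Ops);
```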
Currently only cases rooted at a full copy of an MFMA result are
handled.
Prepare to relax that by testing more intricate subregister usage.
Currently only full copies are handled; add some tests to help work
towards handling subregisters.
Previously it would just assert if the extract needed elements from
both halves. Extract the individual elements from both halves and
create a new vector, as the simplest implementation. This could
try to do better and create a partial extract or shuffle (or
maybe that's best left for the combiner to figure out later).
Fixes secondary issue noticed as part of #153808
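A sketch of that simplest strategy in SelectionDAG style (variable names
are assumed; the actual patch may live in a different framework):

```cpp
// Extract each requested element from whichever half holds it, then
// build the result vector from the pieces.
SmallVector<SDValue, 8> Elts;
for (unsigned I = 0; I != NumSubElts; ++I) {
  unsigned SrcIdx = BaseIdx + I;
  bool InLo = SrcIdx < LoNumElts;
  Elts.push_back(DAG.getNode(
      ISD::EXTRACT_VECTOR_ELT, DL, EltVT, InLo ? Lo : Hi,
      DAG.getVectorIdxConstant(InLo ? SrcIdx : SrcIdx - LoNumElts, DL)));
}
SDValue Result = DAG.getBuildVector(SubVT, DL, Elts);
```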
`VPEVLBasedIVPHIRecipe` lowers to a scalar VPInstruction phi, so it
occupies a single scalar register just like other phi recipes.
This patch fixes the register usage for `VPEVLBasedIVPHIRecipe` from
vector to scalar, which is closer to the generated vector IR.
https://godbolt.org/z/6Mzd6W6ha shows that there are no register spills
when choosing `<vscale x 16>`.
Note that this test is basically copied from AArch64.
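A minimal sketch of the accounting, assuming a hypothetical helper (not
the actual calculateRegisterUsage code):

```cpp
// Phis that lower to scalar phis should be tallied against the scalar
// register class, not the vector one.
bool usesScalarRegister(const VPRecipeBase &R) {
  // The EVL-based IV phi lowers to a scalar phi like other IV phis.
  return isa<VPEVLBasedIVPHIRecipe>(&R) || isa<VPCanonicalIVPHIRecipe>(&R);
}
```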
SimplifyBranchConditionForVFAndUF only recognized canonical IVs and a
few PHI recipes in the loop header. With more IV-step optimizations, the
canonical widen-canonical-iv can be replaced by a canonical
VPWidenIntOrFpInduction, which the pass did not handle, causing
regressions (missed simplifications).
This patch replaces the canonical VPWidenIntOrFpInduction with a
StepVector in the vector preheader, since the vector loop region only
executes once.
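A hedged sketch of the transform (the VPlan API details are assumed, not
the exact patch; the real StepVector opcode may also carry a result
type):

```cpp
// The vector region executes exactly once, so the canonical wide IV is
// just the step vector <0, 1, ..., VF-1> materialized in the preheader.
auto *StepVec = new VPInstruction(VPInstruction::StepVector, {}, DL);
Preheader->appendRecipe(StepVec);
WideIV->replaceAllUsesWith(StepVec);
```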
I'm trying to remove the redirection in SmallSet.h:

```cpp
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType *, N> : public SmallPtrSet<PointeeType *, N> {};
```

to make it clear that we are using SmallPtrSet. Only a handful of places
rely on this redirection.
This patch replaces SmallSet with SmallPtrSet where the element type is
a pointer.
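The replacement is mechanical; for example (illustrative names):

```cpp
// Before: relies on the partial specialization forwarding to SmallPtrSet.
SmallSet<BasicBlock *, 8> Visited;
// After: name the pointer-set type directly.
SmallPtrSet<BasicBlock *, 8> Visited;
```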
#153975 added a new test,
`test/CodeGen/AMDGPU/disable-preload-kernargs.ll`, that triggers an
assertion under `LLVM_ENABLE_EXPENSIVE_CHECKS` complaining that analyses
were not invalidated even though the pass made changes. This was caused
by the pass only invalidating the analyses when the number of explicit
arguments is greater than zero, while some functions may be removed even
when there isn't any explicit argument, hence the missed invalidation.
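A sketch of the fix under that reading (the pass and helper names here
are assumed):

```cpp
// Key the invalidation off whether the pass changed anything at all,
// not off the explicit-argument count.
PreservedAnalyses AMDGPUPreloadKernArgsPass::run(Module &M,
                                                 ModuleAnalysisManager &AM) {
  bool Changed = runImpl(M); // may remove functions with no explicit args
  return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
}
```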
If src2 and dst aren't the same register, to fold a copy
to AGPR into the instruction we also need to reassign src2
to an available AGPR. All the other uses of src2 also need
to be compatible with the AGPR replacement in order to avoid
inserting other copies somewhere else.
Perform this transform after verifying that all other uses are
compatible with AGPR and that an AGPR is available at all points (which
effectively means rewriting a full chain of MFMAs and loads/stores at
once).
Currently the privatization recipe of a scalar allocatable is as follows:
```
acc.private.recipe @privatization_ref_box_heap_i32 : !fir.ref<!fir.box<!fir.heap<i32>>> init {
^bb0(%arg0: !fir.ref<!fir.box<!fir.heap<i32>>>):
%0 = fir.alloca !fir.box<!fir.heap<i32>>
%1:2 = hlfir.declare %0 {uniq_name = "acc.private.init"} : (!fir.ref<!fir.box<!fir.heap<i32>>>) -> (!fir.ref<!fir.box<!fir.heap<i32>>>, !fir.ref<!fir.box<!fir.heap<i32>>>)
acc.yield %1#0 : !fir.ref<!fir.box<!fir.heap<i32>>>
}
```
This change adds the allocation for the scalar.
Summary:
This patch changes the Linux build to use wide reads in the memory
operations by default. These memory functions will now potentially read
outside of the bounds explicitly allowed by the current function. While
this is technically undefined behavior in the standard, plenty of C
library implementations do it. It will not cause a segmentation fault on
Linux as long as you do not cross a page boundary, and because we are
only *reading* memory it should not have atomic effects.
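As an illustration of the technique (a self-contained sketch, not the
llvm-libc implementation):

```cpp
#include <cstddef>
#include <cstdint>

// Word-at-a-time scan in the spirit described above. The aligned 8-byte
// loads may read past the terminator, but an aligned load never crosses
// a page boundary, since the word size divides the page size.
size_t wide_strlen(const char *S) {
  const char *P = S;
  // Advance byte-by-byte until 8-byte aligned.
  for (; reinterpret_cast<uintptr_t>(P) % 8 != 0; ++P)
    if (*P == '\0')
      return P - S;
  // Scan a word at a time; the bit trick flags any zero byte in V.
  for (const uint64_t *W = reinterpret_cast<const uint64_t *>(P);; ++W) {
    uint64_t V = *W;
    if ((V - 0x0101010101010101ULL) & ~V & 0x8080808080808080ULL) {
      const char *Q = reinterpret_cast<const char *>(W);
      while (*Q != '\0')
        ++Q;
      return Q - S;
    }
  }
}
```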
Operate directly on the existing Ops vector instead of copying to
a new vector. This is similar to what the autogenerated codegen
does for other intrinsics.
This reduced the clang binary size by ~96kb on my local Release+Asserts
build.
We used to abuse the Operands list to store an instruction encoding's
DecoderMethod. Let's store it in the InstructionEncoding class instead,
where it belongs.
`ASTReader::FinishedDeserializing()` calls
`adjustDeducedFunctionResultType(...)` [0], which in turn calls
`FunctionDecl::getMostRecentDecl()` [1]. In modules builds,
`getMostRecentDecl()` may reach out to the `ASTReader` and start
deserializing again. Starting deserialization starts `ReadTimer`;
however, `FinishedDeserializing()` doesn't call `stopTimer()` until
after its call to `adjustDeducedFunctionResultType(...)` [2]. As a
result, we hit an assert checking that we don't start an already started
timer [3]. To fix this, we simply don't start the timer if it's already
running.
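A minimal sketch of the fix, assuming llvm::Timer's running check is
used at the point where deserialization begins:

```cpp
// Only start the read timer if it is not already ticking; a nested
// deserialization would otherwise trip the "already started" assert.
if (ReadTimer && !ReadTimer->isRunning())
  ReadTimer->startTimer();
```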
Unfortunately I don't have a test case for this yet as modules builds
are notoriously difficult to reduce.
[0]:
4d2288d318/clang/lib/Serialization/ASTReader.cpp (L11053)
[1]:
4d2288d318/clang/lib/AST/ASTContext.cpp (L3804)
[2]:
4d2288d318/clang/lib/Serialization/ASTReader.cpp (L11065-L11066)
[3]:
4d2288d318/llvm/lib/Support/Timer.cpp (L150)
ActionCache is used to store a mapping from CASID to CASID. The current
implementation of the ActionCache can only associate a key/value from
the same hash context.
ActionCache has two operations: `put` to store the key/value mapping and
`get` to look it up. ActionCache uses the same TrieRawHashMap data
structure to store the mapping, where the CASID of the key is the hash
used to index the map.
While the CASIDs for key/value are often associated with an actual CAS
ObjectStore, the ActionCache does not guarantee the existence of such
objects in any ObjectStore.
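As a toy illustration of that shape (not LLVM's actual ActionCache; all
names here are stand-ins):

```cpp
#include <map>
#include <optional>
#include <string>

using CASID = std::string; // stand-in for the real hash-based ID

class ToyActionCache {
  std::map<CASID, CASID> Map; // keyed by the key CASID's hash
public:
  // Record the association; this does not ensure either ID's object
  // exists in any ObjectStore.
  void put(const CASID &Key, const CASID &Value) { Map[Key] = Value; }
  // Look up a previously recorded association, if any.
  std::optional<CASID> get(const CASID &Key) const {
    auto It = Map.find(Key);
    if (It == Map.end())
      return std::nullopt;
    return It->second;
  }
};
```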
When providing allocation and deallocation traces, the ASan compiler-rt
runtime already provides call addresses (`TracePCType::Calls`). On
Darwin, the system sanitizers library (libsanitizers) provides return
addresses. It also discards a few non-user frames at the top of the
stack, because these internal libmalloc/libsanitizers stack frames do
not provide any value when diagnosing memory errors.
Introduce and add handling for `TracePCType::ReturnsNoZerothFrame` to
cover this case and enable line-level testing of libsanitizers traces.
rdar://157596927
---
Commit 1 is a mechanical refactoring to introduce and adopt the
`TracePCType` enum, replacing the `pcs_are_call_addresses` bool. It
preserves the current behavior:
```
pcs_are_call_addresses:
false -> TracePCType::Returns (default)
true -> TracePCType::Calls
```
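A sketch of what that enum might look like (the exact definition is
assumed):

```cpp
enum class TracePCType {
  Returns,              // PCs are return addresses (default)
  Calls,                // PCs are call addresses (ASan runtime)
  ReturnsNoZerothFrame, // return addresses, top internal frames dropped
};
```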
Best reviewed commit by commit.
Prevent an assertion failure in the cstring checker when library
functions like memcpy are defined with non-default address spaces.
Adds a test for this case.
The previous implementation of getExactInverse used the following check
to identify powers of two:

```cpp
// Check that the number is a power of two by making sure that only the
// integer bit is set in the significand.
if (significandLSB() != semantics->precision - 1)
  return false;
```
This condition verifies that the only set bit in the significand is the
integer bit, which is correct for normal numbers. However, this logic is
not correct for subnormal values.
APFloat represents subnormal numbers by shifting the significand right
while holding the exponent at its minimum value. For a power of two in
the subnormal range, its single set bit will therefore be at a position
lower than precision - 1. The original check would consequently fail,
causing the function to determine that these numbers do not have an
exact multiplicative inverse.
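A concrete instance (a sketch; 2^-127 sits below the smallest normal,
2^-126, in IEEE single precision):

```cpp
#include "llvm/ADT/APFloat.h"
using namespace llvm;

// 2^-127 is subnormal in IEEE single: it is stored as significand
// 1 << 22 with the minimum exponent, so significandLSB() == 22 rather
// than precision - 1 == 23, and the check above rejects it even though
// its inverse, 2^127, is exactly representable.
APFloat X(APFloat::IEEEsingle(), "0x1p-127");
APFloat Inv(APFloat::IEEEsingle());
bool HasExactInverse = X.getExactInverse(&Inv); // false under this check
```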
The new logic calculated this correctly, but it seems that
test/CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll expected the old
behavior.
Seeing as how getExactInverse does not have tests or documentation, we
conservatively maintain (and document) this behavior.
This reverts commit 47e62e846beb267aad50eb9195dfd855e160483e.
Adds the `#llvm.target<triple = $TRIPLE, chip = $CHIP, features =
$FEATURES>` attribute, along with a `-llvm-target-to-data-layout` pass
to derive an MLIR data layout from the LLVM data layout string (using
the existing `DataLayoutImporter`). The attribute implements the
relevant DLTI interfaces to expose the `triple`, `chip` (AKA `cpu`), and
`features` on `#llvm.target`, as well as the full
`DataLayoutSpecInterface`. The pass combines the generated
`#dlti.dl_spec` with an existing `dl_spec` in case one is already
present, e.g. a `dl_spec` that specifies the size of the `index` type.
Adds a `TargetAttrInterface` which can be implemented by all attributes
representing LLVM targets.
Similar to the draft PR https://github.com/llvm/llvm-project/pull/78073.
RFC on which this PR is based:
https://discourse.llvm.org/t/mandatory-data-layout-in-the-llvm-dialect/85875
Static analysis flagged the `columns - 1` code; it was correct, but the
assumption was not obvious, so I documented the assumption with
assertions. While digging through related code I found that
getColumnNumber looks wrong at first inspection; adding parentheses
makes it clearer.
The goal is simply to reduce direct usage of getLength and setLength so
that if we end up moving memset.pattern (whose length is in elements)
there are fewer places to audit.
Given the test case:
```llvm
define fastcc i16 @testbtst(i16 %a) nounwind {
entry:
switch i16 %a, label %no [
i16 11, label %yes
i16 10, label %yes
i16 9, label %yes
i16 4, label %yes
i16 3, label %yes
i16 2, label %yes
]
yes:
ret i16 1
no:
ret i16 0
}
```
We currently get this result:
```asm
testbtst: ; @testbtst
; %bb.0: ; %entry
move.l %d0, %d1
and.l #65535, %d1
sub.l #11, %d1
bhi .LBB0_3
; %bb.1: ; %entry
and.l #65535, %d0
move.l #3612, %d1
btst %d0, %d1
bne .LBB0_3 ; <------- Erroneous condition
; %bb.2: ; %yes
moveq #1, %d0
rts
.LBB0_3: ; %no
moveq #0, %d0
rts
```
The cause of this is a line that explicitly reverses the `btst`
condition code. But on M68k, `btst` sets condition codes the same as
`and` with a bitmask, meaning `EQ` indicates failure (bit is zero) and
not success, so the condition does not need to be reversed.
In my testing, I've only been able to get switch statements to lower to
`btst`, so I wasn't able to explicitly test other options for lowering.
But (if possible to trigger) I believe they have the same logical error.
For example, in `LowerAndToBTST()`, a comment specifies that it's
lowering a case where the `and` result is compared against zero, which
means the corresponding `btst` condition should also not be reversed.
This patch simply flips the ternary expression in
`getBitTestCondition()` to match the ISD condition code with the same
M68k code, instead of the opposite.
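A hedged sketch of that flip (the surrounding code in
`getBitTestCondition()` is assumed, not quoted): since `btst` sets the
CCR like an `and` with a bitmask, an ISD::SETEQ comparison maps straight
to the M68k EQ condition.

```cpp
// Before: return CC == ISD::SETEQ ? M68k::COND_NE : M68k::COND_EQ;
// After: btst behaves like `and` with a mask, so keep the sense.
return CC == ISD::SETEQ ? M68k::COND_EQ : M68k::COND_NE;
```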