llvm-project

Author	SHA1	Message	Date
Amr Hesham	5581e34bd9	[CIR] Implement MemberExpr support for ComplexType (#154027 ) This change adds support for the MemberExpr ComplexType Issue: https://github.com/llvm/llvm-project/issues/141365	2025-08-19 10:32:22 +02:00
David Sherwood	13d8ba7dea	[LV][TTI] Calculate cost of extracting last index in a scalable vector (#144086 ) There are a couple of places in the loop vectoriser where we want to calculate the cost of extracting the last lane in a vector. However, we wrongly assume that asking for the cost of extracting lane (VF.getKnownMinValue() - 1) is an accurate representation of the cost of extracting the last lane. For SVE at least, this is non-trivial as it requires the use of whilelo and lastb instructions. To solve this problem I have added a new getReverseVectorInstrCost interface where the index is used in reverse from the end of the vector. Suppose a vector has a given ElementCount EC, the extracted/inserted lane would be EC - 1 - Index. For scalable vectors this index is unknown at compile time. I've added a AArch64 hook that better represents the cost, and also a RISCV hook that maintains compatibility with the behaviour prior to this PR. I've also taken the liberty of adding support in vplan for calculating the cost of VPInstruction::ExtractLastElement.	2025-08-19 09:31:37 +01:00
William Huynh	0c622d72fc	[libc] Add _Returns_twice to C++ code (#153602 ) Fixes issue with `<csetjmp>` which requires `_Returns_twice` but in C++ mode	2025-08-19 09:28:23 +01:00
Timm Baeder	da05208bfb	[clang][bytecode] Create temporary before discarding CXXConstructExpr (#154280 ) Fixes #154110	2025-08-19 10:27:26 +02:00
Nikita Popov	5753ee2434	[LICM] Avoid assertion failure on stale MemoryDef It can happen that the call is originally created as a MemoryDef, and then later transforms show it is actually read-only and could be a MemoryUse -- however, this is not guaranteed to be reflected in MSSA.	2025-08-19 10:25:45 +02:00
Luke Lau	cabf6433c6	[VPlan] EVL transform VPVectorEndPointerRecipe alongisde load/store recipes. NFC (#152542 ) This is the first step in untangling the variable step transform and header mask optimizations as described in #152541. Currently we replace all VF users globally in the plan, including VPVectorEndPointerRecipe. However this leaves reversed loads and stores in an incorrect state until they are adjusted in optimizeMaskToEVL. This moves the VPVectorEndPointerRecipe transform so that it is updated in lockstep with the actual load/store recipe. One thought that crossed my mind was that VPInterleaveRecipe could also use VPVectorEndPointerRecipe, in which case we would have also been computing the wrong address because we don't transform it to an EVL recipe which accounts for the reversed address.	2025-08-19 08:16:48 +00:00
Morris Hafner	b5e5794534	[CIR] Implement Statement Expressions (#153677 ) Depends on #153625 This patch adds support for statement expressions. It also changes emitCompoundStmt and emitCompoundStmtWithoutScope to accept an Address that the optional result is written to. This allows the creation of the alloca ahead of the creation of the scope which saves us from hoisting the alloca to its parent scope.	2025-08-19 10:11:15 +02:00
Simon Pilgrim	9adc4f9720	[X86] Enable MMX unpcklo/unpckhi intrinsics in constexpr (#154149 ) Matches behaviour in SSE/AVX/AVX512 intrinsics - was missed in #153028	2025-08-19 09:08:39 +01:00
Timm Baeder	ddcd3fdcfd	[clang][bytecode][NFC] use both-note in literals test (#154277 ) And harmonize the RUN lines.	2025-08-19 10:02:23 +02:00
Nikita Popov	e6d9542b77	[X86][Inline] Check correct function for target feature check (#152515 ) The check for ABI differences for inlined calls involves the caller, the callee and the nested callee. Before inlining, the ABI is determined by the target features of the callee. After inlining it is determined by the caller. The features of the nested callee should never actually matter.	2025-08-19 09:44:00 +02:00
Nikita Popov	86ac834df5	[RISCV] Use OrigTy from InputArg/OutputArg (NFCI) (#154095 ) The InputArg/OutputArg now contains the OrigTy, so directly use that instead of trying to recover it. CC_RISCV is now nearly a normal CC assignment function. However, it still differs by having an IsRet flag.	2025-08-19 09:28:24 +02:00
Nikita Popov	9d37e80d3c	[SystemZ] Remove custom CCState pre-analysis (#154091 ) The calling convention lowering now has access to OrigTy, so use that to detect short vectors.	2025-08-19 09:28:09 +02:00
Nikita Popov	a4f85515c2	[Hexagon] Remove custom vararg tracking (NFCI) (#154089 ) This information is now directly available, use the generic CCIfArgVarArg.	2025-08-19 09:27:11 +02:00
Nikita Popov	b2fae5b3c7	[Mips] Remove custom "original type" handling (#154082 ) Replace Mips custom logic for retaining information about original types in calling convention lowering by directly querying the OrigTy that is now available. There is one change in behavior here: If the return type is a struct containing fp128 plus additional members, the result is now different, as we no longer special case to a single fp128 member. I believe this is fine, because this is a fake ABI anyway: Such cases should actually use sret, and as such are a frontend responsibility, and Clang will indeed emit these as sret, not as a return value struct. So this only impacts manually written IR tests.	2025-08-19 09:26:38 +02:00
Morris Hafner	b44e47a68f	[CIR] Upstream __builtin_va_start and __builtin_va_end (#153819 ) Part of #153286	2025-08-19 09:16:11 +02:00
Timm Baeder	eb7a1d91b2	[clang][bytecode] Support pmul X86 builtins (#154275 )	2025-08-19 09:10:50 +02:00
Sergei Barannikov	cded128009	[TableGen][DecoderEmitter] Extract encoding parsing into a method (NFC) (#154271 ) Call it from the constructor so that we can make `run` method `const`. Turn a couple of related functions into methods as well.	2025-08-19 06:35:59 +00:00
Sudharsan Veeravalli	c00b04a7e0	[RISCV] Generate QC_INSB/QC_INSBI instructions from OR of AND Imm (#154023 ) Generate QC_INSB/QC_INSBI from `or (and X, MaskImm), OrImm` iff the value being inserted only sets known zero bits. This is based on a similar DAG to DAG transform done in `AArch64`.	2025-08-19 11:14:14 +05:30
Pierre van Houtryve	6f7c77fe90	[AMDGPU] Check noalias.addrspace in mayAccessScratchThroughFlat (#151319 ) PR #149247 made the MD accessible by the backend so we can now leverage it in the memory model. The first use case here is detecting if a flat op can access scratch memory. Benefits both the MemoryLegalizer and InsertWaitCnt.	2025-08-19 07:42:59 +02:00
Benjamin Maxwell	bb3066d42b	[LAA] Move scalable vector check into `getStrideFromAddRec()` (#154013 ) This moves the check closer to the `.getFixedValue()` call and fixes #153797 (which is a regression from #126971).	2025-08-19 06:40:07 +01:00
Kazu Hirata	18123cc91d	[llvm] Proofread Legalizer.rst (#154266 )	2025-08-18 22:40:00 -07:00
Kazu Hirata	5fdc7478a6	[AArch64] Replace SmallSet with SmallPtrSet (NFC) (#154264 ) This patch replaces SmallSet<T , N> with SmallPtrSet<T , N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType, N> : public SmallPtrSet<PointeeType, N> {}; We only have 30 instances that rely on this "redirection". Since the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types. I'm planning to remove the redirection eventually.	2025-08-18 22:39:53 -07:00
Kazu Hirata	4831d92005	[lld] Replace SmallSet with SmallPtrSet (NFC) (#154263 ) This patch replaces SmallSet<T , N> with SmallPtrSet<T , N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType, N> : public SmallPtrSet<PointeeType, N> {}; We only have 30 instances that rely on this "redirection". Since the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types. I'm planning to remove the redirection eventually.	2025-08-18 22:39:45 -07:00
Kazu Hirata	d82617d2e8	[ADT] Refactor SmallPtrSetImplBase::swap (NFC) (#154261 ) SmallPtrSetImplBase::swap needs to deal with four cases depending on whether LHS is small and whether RHS is small. Now, the code to swap small LHS and large RHS is symmetric with the code to swap large LHS and small RHS. This patch rearranges code so that we first take care of the case where both LHS and RHS are small. Then we compute references SmallSide and LargeSide and actually swap the two instances. This refactoing saves about 11 lines of code. Note that SmallDenseMap::swap also uses a similar trick.	2025-08-18 22:39:38 -07:00
Sergei Barannikov	6c3a0ab51a	[TableGen][DecoderEmitter] Shorten a few variable names (NFC) These "Numbered"-prefixed names were rather confusing than helpful.	2025-08-19 08:05:02 +03:00
YAMAMOTO Takashi	9247be8ca0	[lld][WebAssembly] Do not relocate ABSOLUTE symbols (#153763 ) Fixes https://github.com/llvm/llvm-project/issues/153759	2025-08-18 21:56:48 -07:00
Timm Baeder	25c137e43b	[clang][bytecode] Save a per-block dynamic allocation ID (#154094 ) This fixes an old todo item about wrong allocation counting and some diagnostic differences.	2025-08-19 06:52:21 +02:00
Sergei Barannikov	f84ce1e1d0	[TableGen][DecoderEmitter] Extract a couple of loop invariants (NFC)	2025-08-19 07:47:15 +03:00
Craig Topper	da19383ae7	[RISCV] Fold (X & -4096) == 0 -> (X >> 12) == 0 (#154233 ) This is a more general form of the recently added isel pattern (seteq (i64 (and GPR:$rs1, 0x8000000000000000)), 0) -> (XORI (i64 (SRLI GPR:$rs1, 63)), 1) We can use a shift right for any AND mask that is a negated power of 2. But for every other constant we need to use seqz instead of xori. I don't think there is a benefit to xori over seqz as neither are compressible. We already do this transform from target independent code when the setcc constant is a non-zero subset of the AND mask that is not a legal icmp immediate. I don't believe any of these patterns comparing MSBs to 0 are canonical according to InstCombine. The canonical form is (X < 4096). I'm curious if these appear during SelectionDAG and if so, how. My goal here was just to remove the special case isel patterns.	2025-08-18 21:24:35 -07:00
Craig Topper	2817873082	[RISCV] Fold (sext_inreg (setcc), i1) -> (sub 0, (setcc). (#154206 ) This helps the 3 vendor extensions that make sext_inreg i1 legal. I'm delaying this until after LegalizeDAG since we normally have sext_inreg i1 up until LegalizeDAG turns it into and+neg. I also delayed the recently added (sext_inreg (xor (setcc), -1), i1) combine. Though the xor isn't likely to appear before LegalizeDAG anyway.	2025-08-18 21:24:03 -07:00
Anutosh Bhat	00ffd8b8aa	[lldb] Fix error : unknown error while starting lldb's C/C++ repl (#153560 ) Fixes https://github.com/llvm/llvm-project/issues/153157 The proposed solution has been discussed here (https://github.com/llvm/llvm-project/issues/153157#issue-3313379242) This is what we would be seeing now ``` base) anutosh491@Anutoshs-MacBook-Air bin % ./lldb /Users/anutosh491/work/xeus-cpp/a.out (lldb) target create "/Users/anutosh491/work/xeus-cpp/a.out" Current executable set to '/Users/anutosh491/work/xeus-cpp/a.out' (arm64). (lldb) b main Breakpoint 1: where = a.out`main, address = 0x0000000100003f90 (lldb) r Process 71227 launched: '/Users/anutosh491/work/xeus-cpp/a.out' (arm64) Process 71227 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 frame #0: 0x0000000100003f90 a.out`main a.out`main: -> 0x100003f90 <+0>: sub sp, sp, #0x10 0x100003f94 <+4>: str wzr, [sp, #0xc] 0x100003f98 <+8>: str w0, [sp, #0x8] 0x100003f9c <+12>: str x1, [sp] (lldb) expression --repl -l c -- 1> 1 + 1 (int) $0 = 2 2> 2 + 2 (int) $1 = 4 ``` ``` base) anutosh491@Anutoshs-MacBook-Air bin % ./lldb /Users/anutosh491/work/xeus-cpp/a.out (lldb) target create "/Users/anutosh491/work/xeus-cpp/a.out" Current executable set to '/Users/anutosh491/work/xeus-cpp/a.out' (arm64). (lldb) b main Breakpoint 1: where = a.out`main, address = 0x0000000100003f90 (lldb) r Process 71355 launched: '/Users/anutosh491/work/xeus-cpp/a.out' (arm64) Process 71355 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 frame #0: 0x0000000100003f90 a.out`main a.out`main: -> 0x100003f90 <+0>: sub sp, sp, #0x10 0x100003f94 <+4>: str wzr, [sp, #0xc] 0x100003f98 <+8>: str w0, [sp, #0x8] 0x100003f9c <+12>: str x1, [sp] (lldb) expression --repl -l c -- 3 + 3 Warning: trailing input is ignored in --repl mode 1> 1 + 1 (int) $0 = 2 ```	2025-08-19 09:51:20 +05:30
Luke Lau	144736b07e	[VPlan] Don't fold live ins with both scalar and vector operands (#154067 ) If we end up with a extract_element VPInstruction where both operands are live-ins, we will try to fold the live-ins even though the first operand is a vector whilst the live-in is scalar. This fixes it by just returning the vector live-in instead of calling the folder, and removes the handling for insertelement where we aren't able to do the fold. From some quick testing we previously never hit this fold anyway, and were probably just missing test coverage. Fixes #154045	2025-08-19 04:10:53 +00:00
Md Asghar Ahmad Shahid	c24c23d9ab	[NFC][mlir][vector] Handle potential static cast assertion. (#152957 ) In FoldArithToVectorOuterProduct pattern, static cast to vector type causes assertion when a scalar type was encountered. It seems the author meant to have a dyn_cast instead. This NFC patch handles it by using dyn_cast.	2025-08-19 09:27:20 +05:30
Henrik G. Olsson	e1ff432eb6	Reland "[Utils] Add new --update-tests flag to llvm-lit" (#153821 ) This reverts commit `e495231238` to reland the --update-tests feature, originally landed in https://github.com/llvm/llvm-project/pull/108425.	2025-08-18 20:24:27 -07:00
Sergei Barannikov	c8c2218c00	[TableGen][DecoderEmitter] Synthesize decoder table name in emitTable (#154255 ) Previously, HW mode name was appended to decoder namespace name when enumerating encodings, and then emitTable appended the bit width to it to form the final table name. Let's do this all in one place. A nice side effect is that this allows us to avoid having to deal with std::string. The changes in the tests are caused by the different order of tables.	2025-08-19 06:19:54 +03:00
Shilei Tian	1f6e13a161	Revert "[AMDGPU][Attributor] Infer inreg attribute in `AMDGPUAttributor` (#146720 )" This reverts commit 84ab301554f8b8b16b94263a57b091b07e9204f2 because it breaks several AMDGPU test bots.	2025-08-18 22:59:52 -04:00
Sudharsan Veeravalli	8495018a85	[RISCV] Use sd_match in trySignedBitfieldInsertInMask (#154152 ) This keeps everything in APInt and makes it easier to understand and maintain.	2025-08-19 08:22:06 +05:30
slavek-kucera	5b55899781	[clangd] Clangd running with `--experimental-modules-support` crashes when the compilation database is unavailable (#153802 ) fixes llvm/llvm-project#132413	2025-08-19 10:19:13 +08:00
Shilei Tian	84ab301554	[AMDGPU][Attributor] Infer inreg attribute in `AMDGPUAttributor` (#146720 ) This patch introduces `AAAMDGPUUniformArgument` that can infer `inreg` function argument attribute. The idea is, for a function argument, if the corresponding call site arguments are always uniform, we can mark it as `inreg` thus pass it via SGPR. In addition, this AA is also able to propagate the inreg attribute if feasible.	2025-08-18 22:01:47 -04:00
Iris Shi	cc68e45343	[CIR] Implement codegen for inline assembly without input and output operands (#153546 ) - Part of #153267 https://github.com/llvm/clangir/blob/main/clang/lib/CIR/CodeGen/CIRAsm.cpp	2025-08-18 18:54:24 -07:00
ZhaoQi	be3fd6ae25	[LoongArch] Use section-relaxable check instead of relax feature from STI (#153792 ) In some cases, such as using `lto` or `llc`, relax feature is not available from this `SubtargetInfo` (`LoongArchAsmBackend` is instantiated too early), causing loss of relocations. This commit modifiy the condition to check whether the section which contains the two symbols is relaxable. If not relaxable, no need to record relocations.	2025-08-19 09:48:51 +08:00
Ye Tian	db843e5d09	[DAG] Add ISD::FP_TO_SINT_SAT/FP_TO_UINT_SAT handling to SelectionDAG::canCreateUndefOrPoison (#154244 ) Related to https://github.com/llvm/llvm-project/issues/153366	2025-08-19 10:45:11 +09:00
Jianjian Guan	1eb5b18a04	[mlir][emitc] Support dense as init value for ShapedType (#144826 )	2025-08-19 09:41:15 +08:00
Matt Arsenault	19ebfa6d0b	RuntimeLibcalls: Move exception call config to tablegen (#151948 ) Also starts pruning out these calls if the exception model is forced to none. I worked backwards from the logic in addPassesToHandleExceptions and the pass content. There appears to be some tolerance for mixing and matching exception modes inside of a single module. As far as I can tell _Unwind_CallPersonality is only relevant for wasm, so just add it there. As usual, the arm64ec case makes things difficult and is missing test coverage. The set of calls in list form is necessary to use foreach for the duplication, but in every other context a dag is more convenient. You cannot use foreach over a dag, and I haven't found a way to flatten a dag into a list. This removes the last manual setLibcallImpl call in generic code.	2025-08-19 10:35:59 +09:00
Matt Arsenault	fe67267d19	MSP430: Move __mspabi_mpyll calling conv config to tablegen (#153988 ) There are several libcall choices for MUL_I64 which depend on the subtarget, but this is the base case. The manual custom ISelLowering is still overriding the decision until we have a way to control lowering choices, but we can still get the calling convention set for now.	2025-08-19 10:25:10 +09:00
Helena Kotas	eb3d88423d	[HLSL] Global resource arrays element access (#152454 ) Adds support for accessing individual resources from fixed-size global resource arrays. Design proposal: https://github.com/llvm/wg-hlsl/blob/main/proposals/0028-resource-arrays.md Enables indexing into globally scoped, fixed-size resource arrays to retrieve individual resources. The initialization logic is primarily handled during codegen. When a global resource array is indexed, the codegen translates the `ArraySubscriptExpr` AST node into a constructor call for the corresponding resource record type and binding. To support this behavior, Sema needs to ensure that: - The constructor for the specific resource type is instantiated. - An implicit binding attribute is added to resource arrays that lack explicit bindings (#152452). Closes #145424	2025-08-18 18:20:46 -07:00
Mariusz Borsa	0c512f7897	[Sanitizers][Test] narrower constraint for XFAIL (#154245 ) It's a followup to https://github.com/llvm/llvm-project/pull/154189 , which broke test on android bot. Making sure XFAIL only happen on darwin bots rdar://158543555 Co-authored-by: Mariusz Borsa <m_borsa@apple.com>	2025-08-18 18:17:25 -07:00
ZhaoQi	3acb7093c2	[LoongArch][NFC] Add tests for fixing missed addsub relocs when enabling relax (#154108 )	2025-08-19 09:15:23 +08:00
Lang Hames	ee7a6a45bd	[ORC-RT] Initial check-in for a new, top-level ORC runtime project. (#113499 ) Includes CMake files and placeholder header, library, test tool, regression test and unit test. The aim for this project is to create a replacement for the existing ORC Runtime that currently resides in `llvm-project/compiler-rt/lib/orc`. The new project will provide a superset of the original features, and the old runtime will be removed once the new runtime is sufficiently developed. See discussion at https://discourse.llvm.org/t/rfc-move-orc-executor-support-into-top-level-project/81049	2025-08-19 10:56:18 +10:00
Mel Chen	1dac302ce7	[LV] Explicitly disallow interleaved access requiring gap mask for scalable VFs. nfc (#154122 ) Currently, VPInterleaveRecipe::execute does not support generating LLVM IR for interleaved accesses that require a gap mask for scalable VFs. It would be better to detect and prevent such groups from being vectorized as interleaved accesses in LoopVectorizationCostModel::interleavedAccessCanBeWidened, rather than relying on the TTI function getInterleavedMemoryOpCost to return an invalid cost.	2025-08-19 08:42:39 +08:00

1 2 3 4 5 ...

549094 Commits