llvm-project

Author	SHA1	Message	Date
Teja Alaghari	b4b5bfaf40	[CodeGen][NPM] Update MPDT similar to MDT after unreachable BB elimination (#172421 ) After unreachable machine basic blocks are removed, MPDT should also be updated with the latest block numbers alongside MDT.	2025-12-19 11:09:49 +05:30
Teja Alaghari	4e89e710d9	[CodeGenPrepare][NPM] Remove incorrect LoopAnalysis preservation in CodeGenPrepare (#172418 ) CodeGenPrepare modifies and restructures loops & control flow. So, it shouldn't preserve LoopAnalysis. The test `llvm/test/CodeGen/AMDGPU/cf-loop-on-constant.ll` shows CodeGenPrepare modifying loop structure, hence we cannot preserve LoopAnalysis.	2025-12-19 11:08:31 +05:30
KRM7	c9aea6248a	[RegisterCoalescer] Don't commute two-address instructions which only define a subregister (#169031 ) Currently, the register coalescer may try to commute an instruction like: ``` %0.sub_lo32:gpr64 = AND %0.sub_lo32:gpr64(tied-def 0), %1.sub_lo32:gpr64 USE %0:gpr64 ``` resulting in: ``` %1.sub_lo32:gpr64 = AND %1.sub_lo32:gpr64(tied-def 0), %0.sub_lo32:gpr64 USE %1:gpr64 ``` However, this is not correct if the instruction doesn't define the entire register, as the value of the upper 32-bits of the register used in `USE` will not be the same.	2025-12-18 23:24:44 +01:00
Gaëtan Bossu	ef58e6f6af	[SDAG] Widen TRUNCATE to intermediate type to avoid ISel failure (#172473 ) SelectionDAG offered no way to widen TRUNCATE for pathological types like <vscale x 1 x ...> as they do not allow scalarisation. One way forward is to widen to an intermediate type, which allows the element type to be promoted in a later run of legalisation.	2025-12-18 17:19:34 +00:00
guan jian	4e675a0c45	[SelectionDAG] Lowering usub.sat(a, 1) to a - (a != 0) (#170076 ) I recently observed that LLVM generates the following code: ``` addi a1, a0, -1 sltu a0, a0, a1 addi a0, a0, -1 and a0, a0, a1 ret ``` This could be optimized using the snez instruction instead.	2025-12-18 14:31:53 +00:00
Frederik Harwath	5c05824d2b	[CodeGen] Rename expand-fp to expand-ir-insts (#172681 ) The pass now contains a non-fp expansion and should be used for any similar expansions regardless of the types involved. Hence a generic name seems apt. Rename the source files, pass, and adjust the pass description. Move all tests for the expansions that have previously been merged into the pass to a single directory.	2025-12-18 11:15:04 +00:00
Frederik Harwath	71760f324f	[CodeGen] Merge ExpandLargeDivRem into ExpandFp (#172680 ) Both passes expand instructions at the IR level. They use the same kind of instruction visitation logic and contain significant code duplication e.g. for scalarization.	2025-12-18 09:22:47 +01:00
Kevin Per	0036c67445	[RISCV]: Implemented softening of `FCANONICALIZE` (#169234 ) The `ISD::FCANONICALIZE` is mapped to `llvm.minnum(x, x)`. Closes https://github.com/llvm/llvm-project/issues/169216	2025-12-17 16:38:18 -08:00
Rahman Lavaee	53005fd435	Use the Propeller CFG profile in the PGO analysis map if it is available. (#163252 ) This PR implements the emitting of the post-link CFG information in PGO analysis map, as explained in the [RFC](https://discourse.llvm.org/t/rfc-extending-the-pgo-analysis-map-with-propeller-cfg-frequencies/88617). This is enabled by a flag `pgo-analysis-map-emit-bb-sections-cfg`. This PR bumps the SHT_LLVM_BB_ADDR_MAP version to 5. Also includes some refactoring changes related to storing the CFG in the Basic block sections profile reader.	2025-12-17 14:19:18 -08:00
Valeriy Savchenko	e7892d702f	[DAGCombiner] Fix assertion failure in vector division lowering (#172321 )	2025-12-17 22:09:54 +00:00
Folkert de Vries	a587ccd87d	fix `llvm.fma.f16` double rounding issue when there is no native support (#171904 ) fixes https://github.com/llvm/llvm-project/issues/98389 As the issue describes, promoting `llvm.fma.f16` to `llvm.fma.f32` does not work, because there is not enough precision to handle the repeated rounding. `f64` does have sufficient space. So this PR explicitly promotes the 16-bit fma to a 64-bit fma. I could not find examples of a libcall being used for fma, but that's something that could be looked into separately to work around code size issues.	2025-12-17 22:03:01 +01:00
Pan Tao	b6bfa85686	[aarch64] Mix the frame pointer with the stack cookie when protecting the stack (#161114 ) This strengthens the guard and matches MSVC. Fixes #156573 .	2025-12-17 12:52:28 -08:00
natanelh-mobileye	fa78d6a5f1	[SDAG] Shrink (abd? (?ext x) (?ext y)) (#171865 ) Alive2 test: https://alive2.llvm.org/ce/z/maryYU Lit test before change: https://godbolt.org/z/nEKWdPbMv Fixes #171640	2025-12-17 16:30:52 +00:00
Nikita Popov	edb45d8ae4	[SDAG] Allow implicit trunc in BUILD_VECTOR legalization BUILD_VECTOR may have operands larger than the result element type, in which case it is specified to truncate. As such, allow implicit truncation.	2025-12-17 15:22:00 +01:00
Nathan Corbyn	b7a20c1cc4	[GlobalISel] Don't permit G_MIN/G_MAX of pointer vectors (#168872 ) - Use `LLT::changeElementType()` instead of `LLT::changeElementSize()` in `LegalizerHelper::lowerMinMax()` to avoid a crash in the case that the destination type is a pointer vector; - Reject `G_MIN`/`G_MAX` of pointers and pointer vectors in `MachineVerifier`; - Don't combine `G_SELECT`+`G_ICMP` pairs into `G_MIN`/`G_MAX` generic instructions when the operands are pointers / pointer vectors. Fixes #166556	2025-12-17 09:03:41 +00:00
Craig Topper	816c9d64a7	[TargetLowering] Use getNegative. NFC (#172526 ) This also fixes the type for the SUB to be ShVT instead of VT. I guess we only test this when ShVT == VT.	2025-12-16 16:45:18 -08:00
Matt Arsenault	68aea8e202	AMDGPU: Avoid introducing unnecessary fabs in fast fdiv lowering (#172553 ) If the sign bit of the denominator is known 0, do not emit the fabs. Also, extend this to handle min/max with fabs inputs. I originally tried to do this as the general combine on fabs, but it proved to be too much trouble at this time. This is mostly complexity introduced by expanding the various min/maxes into canonicalizes, and then not being able to assume the sign bit of canonicalize (fabs x) without nnan. This defends against future code size regressions in the atan2 and atan2pi library functions.	2025-12-17 00:22:12 +01:00
Matt Arsenault	eb1876c960	DAG: Fix arith_fence handling in SignBitIsZeroFP (#172537 )	2025-12-16 20:10:38 +00:00
Frederik Harwath	6ad41bcc49	[CodeGen] expand-fp: Change frem expansion criterion (#158285 ) The existing condition for checking whether or not to expand an frem instruction in expand-fp is not sufficiently precise. The expansion on targets other than AMDGPU - which is the only intended user right now - is only prevented due to the interaction with the MaxLegalFpConvertBitWidth check. Relying on this is conceptually wrong and limits the use of the pass for other targets and further expansions (e.g. merging with the similar ExpandLargeDivRem pass). Change the expansion criterion to always expand frem of a given type for targets that use "Expand" as the legalization action for the underlying scalar type and use this to exit the pass early for targets which do not require any expansions. This requires changing the frem legalization action for all targets which do not want frem to be expanded in this pass from "Expand" to "LibCall". --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-12-16 17:31:26 +01:00
Usman Nadeem	1ea201d73b	[WoA] Remove extra barriers after ARM LSE instructions with MSVC (#169596 ) `c9821abfc0` added extra fences after sequentially consistent stores for compatibility with MSVC's seq_cst loads (ldr+dmb). These extra fences should not be needed for ARM LSE instructions that have both acquire+release semantics, which results in a two way barrier, and should be enough for sequential consistency. Fixes https://github.com/llvm/llvm-project/issues/162345 Change-Id: I9148c73d0dcf3bf1b18a0915f96cac71ac1800f2	2025-12-15 17:19:40 -08:00
Daniel Paoliello	644fd3b665	[FastISel] Don't select a CallInst as a BasicBlock in the SelectionDAG fallback if it has bundled ops (#162895 ) This was discovered while looking at the codegen for x64 when Control Flow Guard is enabled. When using `SelectionDAG`, LLVM would generate the following sequence for a CF guarded indirect call: ``` leaq target_func(%rip), %rax rex64 jmpq __guard_dispatch_icall_fptr(%rip) # TAILCALL ``` However, when Fast ISel was used the following is generated: ``` leaq target_func(%rip), %rax movq __guard_dispatch_icall_fptr(%rip), %rcx rex64 jmpq %rcx # TAILCALL ``` This was happening despite Fast ISel aborting and falling back to `SelectionDAG`. The root cause for this code gen is that `SelectionDAGISel` has a special case when Fast ISel aborts when lowering a `CallInst` where it tries to lower the instruction as its own basic block, which for such a CF Guard call means that it is lowering an indirect call to `__guard_dispatch_icall_fptr` without observing that the function was being loaded into a pointer in the preceding (and bundled) instruction. The fix for this is to not use the special case when a `CallInst` has bundled instructions: it's better to allow the call and its bundled instructions to be lowered together by `SelectionDAG` instead.	2025-12-15 14:38:20 -08:00
Orlando Cazalet-Hyams	3e32735020	[DWARF] Add support for DW_GNU_call_target_clobbered (#172336 ) Fixes assertion trip introduced in #172167 See https://issues.chromium.org/issues/468825583#comment2	2025-12-15 18:24:21 +00:00
Fabrice de Gans	28e9954a44	llvm: Add missing `VirtualFileSystem.h` include (#171848 ) `vfs::FileSystem` is forward-declared in `SanitizerBinaryMetadata.h`. The corresponding header must be included in any source file that includes that header, or we risk issues when building with `LLVM_BUILD_LLVM_DYLIB` to build LLVM as a DLL on Windows. This effort is tracked in #109483.	2025-12-15 11:45:13 -05:00
Benjamin Maxwell	1847a4efae	[SDAG] Fix incorrect usage of VECREDUCE_ADD (#171459 ) The mask needs to be extended to `i32` before reducing or the reduction can be incorrectly optimized to a VECREDUCE_XOR.	2025-12-15 15:01:31 +00:00
Nikita Popov	3f82a8a784	[ExpandFp] Use getSignMask() (NFC) This was using getSigned() with an unsigned (not sign extended) argument. Using plain get() would be correct here. We can go one step further and use getSignMask() to avoid the issue entirely.	2025-12-15 15:44:03 +01:00
Simon Pilgrim	a68fde5780	[DAG] foldAddToAvg - optimize nested m_Reassociatable matchers (#171681 ) The use of nested m_Reassociatable matchers by #169644 can result in high compile times as the inner m_Reassociatable call is being repeated a lot while the outer call is trying to match. Place the inner m_ReassociatableAnd at the beginning of the pattern so it is not repeatedly matched in recursion.	2025-12-15 13:41:02 +00:00
Orlando Cazalet-Hyams	792704038a	[DebugInfo][DWARF] Use DW_AT_call_target_clobbered for exprs with volatile regs (#172167 ) Without this patch DW_AT_call_target is used for all indirect call address location expressions. The DWARF spec says: For indirect calls or jumps where the address is not computable without use of registers or memory locations that might be clobbered by the call the DW_AT_call_target_clobbered attribute is used instead of the DW_AT_call_target attribute. This patch implements that behaviour.	2025-12-15 12:54:18 +00:00
Nathan Corbyn	2f9bf3f292	[GlobalISel](NFC) Refactor construction of LLTs in `LegalizerHelper` (#170664 ) I spotted a number of places where we're duplicating logic provided by the `LLT` class inline in `LegalizerHelper`. This PR tidies up these spots.	2025-12-15 12:26:27 +00:00
Nikita Popov	ce1b04720a	[SelectOptimize] Respect optnone (#170858 ) Add the missing skipFunction() call so that optnone attributes and opt-bisect-limit are respected.	2025-12-15 09:21:02 +01:00
Mingjie Xu	681dbf9941	[WinEH] Use removeIncomingValueIf() in UpdatePHIOnClonedBlock() (NFC) (#171962 )	2025-12-14 09:41:13 +08:00
Craig Topper	0cdc1b6dd4	[SelectionDAG] Support integer types with multiple registers in ComputePHILiveOutRegInfo. (#172081 ) PHIs that are larger than a legal integer type are split into multiple virtual registers that are numbered sequentially. We can propagate the known bits for each of these registers individually. Big endian is not supported yet because the register order needs to be reversed. Fixes #171671	2025-12-13 13:24:41 -08:00
Matt Arsenault	b2d9356719	DAG: Make more use of the LibcallImpl overload of getExternalSymbol (#172171 ) Also add a new copy for TargetExternalSymbol that AArch64 needs.	2025-12-13 19:16:47 +00:00
Orlando Cazalet-Hyams	fa1dceb67f	[DebugInfo][DWARF] Allow memory locations in DW_AT_call_target expressions (#171183 ) Fixes #70949. Prior to PR #151378 memory locations were incorrect; that patch prevented the emission of the incorrect locations. This patch fixes the underlying issue.	2025-12-13 17:37:35 +00:00
Matt Arsenault	d8b03f282a	DAG: Use the LibcallImpl to get calling conv in ExpandDivRemLibCall (#172152 )	2025-12-13 11:41:24 +00:00
Seraphimt	0603d4af1d	Fix misprint in computeKnownFPClass in GISelValueTracking.cpp (#171566 ) Fix wrong value (from Instruction enum) in conditional and add test check. Related to https://github.com/llvm/llvm-project/issues/169959	2025-12-12 20:59:07 +01:00
KRM7	e0e5b6e1f7	[GISel][Inlineasm] Support inlineasm i/s constraint for symbols (#170094 )	2025-12-12 20:16:17 +01:00
Seraphimt	112a6126ef	Fixes non-functional changes found by static analyzer (#171197 ) As per @arsenm 's instructions, I've separated the non-functional changes from https://github.com/llvm/llvm-project/pull/169958. Afterwards I'll tackle the functional ones one by one. I hope I did everything right this time. Full descriptions in the article: https://pvs-studio.com/en/blog/posts/cpp/1318/ 3. Array overrun is possible. The PVS-Studio warning: V557 Array overrun is possible. The value of 'regIdx' index could reach 31. VEAsmParser.cpp 696 10. Excessive check. The PVS-Studio warning: V547 Expression 'IsLeaf' is always false. PPCInstrInfo.cpp 419 11. Doubling the same check. The PVS-Studio warning: V581 The conditional expressions of the 'if' statements situated alongside each other are identical. Check lines: 5820, 5823. PPCInstrInfo.cpp 5823 15. Excessive check. The PVS-Studio warning: V547 Expression 'i != e' is always true. MachineFunction.cpp 1444 17. Excessive assignment. The PVS-Studio warning: V1048 The 'FirstOp' variable was assigned the same value. MachineInstr.cpp 1995 18. Excessive check. The PVS-Studio warning: V547 Expression 'AllSame' is always true. SimplifyCFG.cpp 1914 19. Excessive check. The PVS-Studio warning: V547 Expression 'AbbrevDecl' is always true. LVDWARFReader.cpp 398	2025-12-12 20:03:02 +01:00
Nikita Popov	1d7bfb752f	[SafeStack] Use getSigned() for negative value	2025-12-12 11:15:44 +01:00
Sam Tebbs	19e1011df5	[SelectionDAG] Fix unsafe cases for loop.dependence.{war/raw}.mask (#168565 ) Both `LOOP_DEPENDENCE_WAR_MASK` and `LOOP_DEPENDENCE_RAW_MASK` are currently hard to split correctly, and there are a number of incorrect cases. The difficulty comes from how the intrinsics are defined. For example, take `LOOP_DEPENDENCE_WAR_MASK`. It is defined as the OR of: * `(ptrB - ptrA) <= 0` * `elementSize * lane < (ptrB - ptrA)` Now, if we want to split a loop dependence mask for the high half of the mask we want to compute: * `(ptrB - ptrA) <= 0` * `elementSize * (lane + LoVT.getElementCount()) < (ptrB - ptrA)` However, with the current opcode definitions, we can only modify ptrA or ptrB, which may change the result of the first case, which should be invariant to the lane. This patch resolves these cases by adding a "lane offset" to the ISD opcodes. The lane offset is always a constant. For scalable masks, it is implicitly multiplied by vscale. This makes splitting trivial as we increment the lane offset by `LoVT.getElementCount()` now. Note: In the AArch64 backend, we only support zero lane offsets (as other cases are tricky to lower to whilewr/rw). --------- Co-authored-by: Benjamin Maxwell <benjamin.maxwell@arm.com>	2025-12-12 08:44:33 +00:00
Nikita Popov	43a4442fac	[ExpandFp] Fix incorrect ConstantInt construction (#171861 ) Explicitly cast the value to (int) before negating, so it gets properly sign extended. Otherwise we end up with a large unsigned value instead of a negative value for large bit widths. This was found while working on https://github.com/llvm/llvm-project/pull/171456.	2025-12-12 08:54:44 +01:00
Jan Svoboda	8e999e3d78	[llvm][clang] Sandbox filesystem reads (#165350 ) This PR introduces a new mechanism for enforcing a sandbox around filesystem reads coming from the compiler. A fatal error is raised whenever the `llvm::sys::fs`, `llvm::MemoryBuffer::getFile*()` APIs get used directly instead of going through the "blessed" virtual interface of `llvm::vfs::FileSystem`.	2025-12-11 15:42:13 -08:00
Craig Topper	3e414b940a	[FunctionLoweringInfo] Use KnownBits::intersectWith. NFC (#171893 )	2025-12-11 13:21:02 -08:00
Craig Topper	98a8072a65	[FunctionLoweringInfo] Remove unnecessary check for isVectorTy when isIntegerTy is true. NFC (#171880 ) isIntegerTy is only true for scalars.	2025-12-11 13:20:41 -08:00
Alexis Engelke	6813f8f037	[IR] Don't store switch case values as operands SwitchInst case values must be ConstantInt, which have no use list. Therefore it is not necessary to store these as Use, instead store them more efficiently as a simple array of pointers after the uses, similar to how PHINode stores basic blocks. After this change, the successors of all terminators are stored consecutively in the operand list. This is preparatory work for improving the performance of successor access. Add new C API functions so that switch case values remain accessible from bindings for other languages. While this could also be achieved by merely changing the order of operands (i.e., first all successors, then all constants), doing so would increase the asymptotic runtime of addCase from O(1) to O(n) (i.e., adding n cases would be O(n^2)), because it would need to shift all constants by one slot. Having null/invalid operands is also a bad idea and would cause much more breakage. Pull Request: https://github.com/llvm/llvm-project/pull/170984	2025-12-11 18:38:39 +01:00
David Green	8a4cc440f2	[AArch64] Run optimizeTerminators earlier too. (#170907 ) Running optimizeTerminators prior to other optimizations like branch layout can lead to more folding and better codegen, but is not on its own able to capture all cases. There is benefit to running it in both places. This adds the existing code from #161508 into the AArch64RedundantCopyElimination pass, which sounds like a sensible enough place for it. This is a recommit with an extra fix for shrink-wrapping domtree use.	2025-12-11 15:33:15 +00:00
Ramkumar Ramachandra	85fafd5db0	[SCEVExp] Get DL from SE, strip constructor arg (NFC) (#171823 )	2025-12-11 14:26:47 +00:00
Matt Arsenault	a3aaa1a391	DAG: Use RuntimeLibcalls to legalize vector frem calls (#170719 ) This continues the replacement of TargetLibraryInfo uses in codegen with RuntimeLibcallsInfo started in 821d2825a4f782da3da3c03b8a002802bff4b95c. The series there handled all of the multiple result calls. This extends it to the other handled case, which happened to be frem. For some reason the Libcalls for these are prefixed with "REM_", for the instruction "frem", which maps to the libcall "fmod".	2025-12-11 13:33:27 +00:00
Nikita Popov	d33d80fae6	[FastISel] Don't force SDAG fallback for libcalls (#171782 ) The fast instruction selector should not force an SDAG fallback to potentially make use of optimized libcall implementations. Looking at `3e6fa462f3`, part of the motivation was to avoid libcalls in unoptimized builds for targets that don't have them, but I believe this should be handled by Clang directly emitting intrinsics instead of libcalls (which it already does). FastISel should not second guess this. Followup to https://github.com/llvm/llvm-project/pull/171288.	2025-12-11 14:14:06 +01:00
JaydeepChauhan14	9b6b52b534	[AsmPrinter][NFC] Reuse Target Triple variable (#171612 )	2025-12-11 12:28:59 +01:00
Shubham Sandeep Rastogi	16e6055273	Revert "[SelectionDAG] Salvage debuginfo when combining load and sext instrs. (#169779)" (#171745 ) This reverts commit 2b958b9ee24b8ea36dcc777b2d1bcfb66c4972b6. I might have broken the sanitizer-x86_64-linux bot /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_procmaps_linux.cpp clang++: /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:248: const T &llvm::ArrayRef<llvm::DbgValueLocEntry>::operator[](size_t) const [T = llvm::DbgValueLocEntry]: Assertion `Index < Length && "Invalid index!"' failed.	2025-12-10 16:49:59 -08:00
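
The `usub.sat(a, 1)` lowering in 4e675a0c45 above relies on the identity `usub.sat(a, 1) == a - (a != 0)` for unsigned `a`. The sketch below is not code from that commit; it is a small Python model of the arithmetic, and the helper names `usub_sat`, `usub_sat_one` and `identity_holds` are hypothetical, chosen only for illustration. It checks the identity exhaustively for a narrow bit width.

```python
def usub_sat(a: int, b: int, bits: int = 32) -> int:
    """Reference model: unsigned saturating subtract on `bits`-wide values."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    # Clamp at zero instead of wrapping around.
    return a - b if a >= b else 0


def usub_sat_one(a: int, bits: int = 32) -> int:
    """usub.sat(a, 1) rewritten as a - (a != 0)."""
    mask = (1 << bits) - 1
    a &= mask
    # (a != 0) is 1 for every non-zero input and 0 for zero,
    # so zero stays at zero and everything else is decremented.
    return (a - int(a != 0)) & mask


def identity_holds(bits: int = 8) -> bool:
    """Exhaustively compare both forms for every `bits`-wide input."""
    return all(
        usub_sat(a, 1, bits) == usub_sat_one(a, bits)
        for a in range(1 << bits)
    )
```

Calling `identity_holds(8)` compares all 256 eight-bit inputs. The commit message suggests that on RISC-V the `(a != 0)` term can come from the `snez` instruction, which is presumably what makes the rewritten form shorter than the original four-instruction sequence.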


38854 Commits