llvm-project

Author	SHA1	Message	Date
Daniel Thornburgh	bd4dd9ca02	Add a generic reloc_none test	2025-08-18 11:37:40 -07:00
Daniel Thornburgh	5a056672cb	Take symbol name by metadata arg rather than ptr to GlobalValue	2025-08-18 11:37:40 -07:00
Daniel Thornburgh	0e830ad120	fake.use -> reloc.none	2025-08-18 11:37:40 -07:00
Daniel Thornburgh	f3fdeff3e3	[IR] llvm.reloc.none intrinsic for no-op symbol references This intrinsic emits a BFD_RELOC_NONE relocation at the point of call, which allows optimizations and languages to explicitly pull in symbols from static libraries without there being any code or data that has an effectual relocation against such a symbol. See issue #146159 for context.	2025-08-18 11:37:40 -07:00
Krzysztof Parzyszek	8429f7faaa	[flang][OpenMP] Parsing support for DYN_GROUPPRIVATE (#153615 ) This does not perform semantic checks or lowering.	2025-08-18 13:35:02 -05:00
Steven Perron	0fb1057e40	[SPIRV] Filter disallowed extensions for env (#150051 ) Not all SPIR-V extensions are allowed in every environment. When we use the `-spirv-ext=all` option, the backend currently believes that all extensions can be used. This commit filters out the extensions on the command line to remove those that are not known to be allowed for the current environment. Alternatives considered: I considered modifying the SPIRVExtensionsParser::parse to use a different list of extensions for "all" depending on the target triple. However that does not work because the target triple is not available, and cannot be made available in a reasonable way. Fixes #147717 --------- Co-authored-by: Victor Lomuller <victor@codeplay.com>	2025-08-18 18:33:58 +00:00
Thurston Dang	ade755d62b	[msan] Add Instrumentation for Avx512 Instructions: pmaddw, pmaddubs (#153919 ) This applies the pmadd handler (recently improved in https://github.com/llvm/llvm-project/pull/153353) to the Avx512 equivalent of the pmaddw and pmaddubs intrinsics: <16 x i32> @llvm.x86.avx512.pmaddw.d.512(<32 x i16>, <32 x i16>) <32 x i16> @llvm.x86.avx512.pmaddubs.w.512(<64 x i8>, <64 x i8>)	2025-08-18 11:31:15 -07:00
Kyle Wang	064f02dac0	[VectorCombine] Preserve scoped alias metadata (#153714 ) Right now if a load op is scalarized, the `!alias.scope` and `!noalias` metadata are dropped. This PR keeps them if they exist.	2025-08-18 18:16:32 +00:00
Jordan Rupprecht	8d256733a0	[bazel] Port #151175 : VectorFromElementsLowering (#154169 )	2025-08-18 13:07:05 -05:00
Brox Chen	d49aab10bd	Revert "[AMDGPU][True16][CodeGen] use vgpr16 for zext patterns (#1538… (#154163 ) This reverts commit 7c53c6162bd43d952546a3ef7d019babd5244c29. This patch hit an issue in a HIP test. Reverting now; will reopen later.	2025-08-18 14:01:19 -04:00
Shaoce SUN	7e8ff2afa9	[RISCV][GISel] Optimize +0.0 to use fcvt.d.w for s64 on rv32 (#153978 ) Resolve the TODO: on RV32, when constructing the double-precision constant `+0.0` for `s64`, `BuildPairF64Pseudo` can be optimized to use the `fcvt.d.w` instruction to generate the result directly.	2025-08-18 17:52:24 +00:00
Justin Fargnoli	58de8f2c25	[Inliner] Add option (default off) to inline all calls regardless of the cost (#152365 ) Add a default off option to the inline cost calculation to always inline all viable calls regardless of the cost/benefit and cost/threshold calculations. For performance reasons, some users require that all calls be inlined. Rather than forcing them to adjust the inlining threshold to an arbitrarily high value, offer an option to inline all calls.	2025-08-18 17:48:49 +00:00
LauraElanorJones	350f4a3e3b	Decent to Descent (#154040 ) [lldb] Rename RecursiveDecentFormatter to RecursiveDescentFormatter (NFC)	2025-08-18 12:47:14 -05:00
Krzysztof Drewniak	7f27482a32	[AMDGPU][LowerBufferFatPointers] Fix lack of rewrite when loading/storing null (#154128 ) Fixes #154056. The fat buffer lowering pass was erroneously detecting that it did not need to run on functions that only load/store to the null constant (or other such constants). We thought this would be covered by specializing constants out to instructions, but that doesn't account for trivial constants like null. Therefore, we check the operands of instructions for buffer fat pointers in order to find such constants and ensure the pass runs. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2025-08-18 12:32:54 -05:00
Shafik Yaghmour	99829573cc	[Clang][Webassembly] Remove unreachable code in ParseTypeQualifierListOpt (#153729 ) Static analysis flagged this goto as unreachable and indeed it is, so removing it.	2025-08-18 10:27:37 -07:00
Aiden Grossman	6960bf556c	[Github] Drop llvm-project-tests All users of this have been cleaned up so we can now drop it fully. Reviewers: cmtice, tstellar Reviewed By: cmtice Pull Request: https://github.com/llvm/llvm-project/pull/153877	2025-08-18 10:20:31 -07:00
Panagiotis Karouzakis	c2e7fad446	[DemandedBits] Support non-constant shift amounts (#148880 ) This patch adds support for the shift operators to handle non-constant shift operands. ashr proof -->https://alive2.llvm.org/ce/z/EN-siK lshr proof --> https://alive2.llvm.org/ce/z/eeGzyB shl proof --> https://alive2.llvm.org/ce/z/dpvbkq	2025-08-19 01:11:16 +08:00
Yang Bai	4eb1a07d7d	[mlir][vector] Support multi-dimensional vectors in VectorFromElementsLowering (#151175 ) This patch introduces a new unrolling-based approach for lowering multi-dimensional `vector.from_elements` operations. Implementation Details: 1. New Transform Pattern: Added `UnrollFromElements` that unrolls an N-D (N>=2) from_elements op to a (N-1)-D from_elements op along the outermost dimension. 2. Utility Functions: Added `unrollVectorOp` to reuse the unroll algo of vector.gather for vector.from_elements. 3. Integration: Added the unrolling pattern to the convert-vector-to-llvm pass as a temporary transformation. 4. Use direct LLVM dialect operations instead of intermediate vector.insert operations for efficiency in `VectorFromElementsLowering`. Example: ```mlir // unroll %v = vector.from_elements %e0, %e1, %e2, %e3 : vector<2x2xf32> => %poison_2d = ub.poison : vector<2x2xf32> %vec_1d_0 = vector.from_elements %e0, %e1 : vector<2xf32> %vec_2d_0 = vector.insert %vec_1d_0, %poison_2d [0] : vector<2xf32> into vector<2x2xf32> %vec_1d_1 = vector.from_elements %e2, %e3 : vector<2xf32> %result = vector.insert %vec_1d_1, %vec_2d_0 [1] : vector<2xf32> into vector<2x2xf32> // convert-vector-to-llvm %v = vector.from_elements %e0, %e1, %e2, %e3 : vector<2x2xf32> => %poison_2d = ub.poison : vector<2x2xf32> %poison_2d_cast = builtin.unrealized_conversion_cast %poison_2d : vector<2x2xf32> to !llvm.array<2 x vector<2xf32>> %poison_1d_0 = llvm.mlir.poison : vector<2xf32> %c0_0 = llvm.mlir.constant(0 : i64) : i64 %vec_1d_0_0 = llvm.insertelement %e0, %poison_1d_0[%c0_0 : i64] : vector<2xf32> %c1_0 = llvm.mlir.constant(1 : i64) : i64 %vec_1d_0_1 = llvm.insertelement %e1, %vec_1d_0_0[%c1_0 : i64] : vector<2xf32> %vec_2d_0 = llvm.insertvalue %vec_1d_0_1, %poison_2d_cast[0] : !llvm.array<2 x vector<2xf32>> %poison_1d_1 = llvm.mlir.poison : vector<2xf32> %c0_1 = llvm.mlir.constant(0 : i64) : i64 %vec_1d_1_0 = llvm.insertelement %e2, %poison_1d_1[%c0_1 : i64] : vector<2xf32> %c1_1 = llvm.mlir.constant(1 : i64) : i64 %vec_1d_1_1 = llvm.insertelement %e3, %vec_1d_1_0[%c1_1 : i64] : vector<2xf32> %vec_2d_1 = llvm.insertvalue %vec_1d_1_1, %vec_2d_0[1] : !llvm.array<2 x vector<2xf32>> %result = builtin.unrealized_conversion_cast %vec_2d_1 : !llvm.array<2 x vector<2xf32>> to vector<2x2xf32> ``` --------- Co-authored-by: Nicolas Vasilache <Nico.Vasilache@amd.com> Co-authored-by: Yang Bai <yangb@nvidia.com> Co-authored-by: James Newling <james.newling@gmail.com> Co-authored-by: Diego Caballero <dieg0ca6aller0@gmail.com>	2025-08-18 10:09:12 -07:00
Tobias Stadler	8135b7c1ab	[LV] Emit all remarks for unvectorizable instructions (#153833 ) If ExtraAnalysis is requested, emit all remarks caused by unvectorizable instructions - instead of only the first. This is in line with how other places handle DoExtraAnalysis and it can be quite helpful to get info about all instructions in a loop that prevent vectorization.	2025-08-18 18:04:53 +01:00
Ramkumar Ramachandra	97f554249c	[VPlan] Preserve nusw in createInBoundsPtrAdd (#151549 ) Rename createInBoundsPtrAdd to createNoWrapPtrAdd, and preserve nusw as well as inbounds at the callsite.	2025-08-18 17:48:42 +01:00
Andreas Jonson	1b60236200	[SimplifyCFG] Avoid redundant calls in gather. (NFC) (#154133 ) Split out from https://github.com/llvm/llvm-project/pull/154007 as it showed compile time improvements. NFC, as there need to be at least two icmps that are part of the chain.	2025-08-18 18:45:52 +02:00
Nishant Patel	4a9d038acd	[MLIR][XeGPU] Distribute load_nd/store_nd/prefetch_nd with offsets from Wg to Sg (#153432 ) This PR adds pattern to distribute the load/store/prefetch nd ops with offsets from workgroup to subgroup IR. This PR is part of the transition to move offsets from create_nd to load/store/prefetch nd ops. Create_nd PR : #152351	2025-08-18 09:45:29 -07:00
LLVM GN Syncbot	d6e0922a5e	[gn build] Port 3ecfc0330d93	2025-08-18 16:02:02 +00:00
Damyan Pepper	cc49f3b3e1	[NFC][HLSL] Remove confusing enum aliases / duplicates (#153909 ) Remove: * DescriptorType enum - this almost exactly shadowed the ResourceClass enum * ClauseType aliased ResourceClass Although these were introduced to make the HLSL root signature handling code a bit cleaner, they were ultimately causing confusion as they appeared to be unique enums that needed to be converted between each other. Closes #153890	2025-08-18 08:58:33 -07:00
Yitzhak Mandelbaum	3ecfc0330d	[clang][dataflow] Add support for serialization and deserialization. (#152487 ) Adds support for compact serialization of Formulas, and a corresponding parse function. Extends Environment and AnalysisContext with necessary functions for serializing and deserializing all formula-related parts of the environment.	2025-08-18 11:55:12 -04:00
Jeremy Kun	c67d27dad0	[mlir][Presburger] NFC: return var index from IntegerRelation::addLocalFloorDiv (#153463 ) addLocalFloorDiv currently returns void and requires the caller to know that the newly added local variable is in a particular index. This commit returns the index of the newly added variable so that callers need not tie themselves to this implementation detail. I found one relevant callsite demonstrating this and updated it. I am using this API out of tree and wanted to make our out-of-tree code a bit more resilient to upstream changes.	2025-08-18 08:47:47 -07:00
Antonio Frighetto	33761df961	Revert "[SimpleLoopUnswitch] Record loops from unswitching non-trivial conditions" This reverts commit e9de32fd159d30cfd6fcc861b57b7e99ec2742ab due to multiple performance regressions observed across downstream Numba benchmarks (https://github.com/llvm/llvm-project/issues/138509#issuecomment-3193855772). While avoiding non-trivial unswitches on newly-cloned loops helps mitigate the pathological case reported in https://github.com/llvm/llvm-project/issues/138509, it may as well make the IR less friendly to vectorization / loop-canonicalization (in the test reported, previously no select with loop-carried dependence existed in the new specialized loops), leading the abovementioned approach to be reconsidered.	2025-08-18 17:40:08 +02:00
Aiden Grossman	17f5f5ba55	[X86] Avoid Register implicit int conversion PushedRegisters in this patch needs to be of type int64_t because it is grabbing registers from immediate operands of pseudo instructions. However, we then compare to an actual register type later, which relies on the implicit conversion within Register to int, which can result in build failures in some configurations.	2025-08-18 15:37:25 +00:00
黃國庭	0773854575	[DAG] Fold trunc(avg(x,y)) for avgceil/floor u/s nodes if they have sufficient leading zero/sign bits (#152273 ) avgceil version : https://alive2.llvm.org/ce/z/2CKrRh Fixes #147773 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-18 16:36:26 +01:00
Alex MacLean	d12f58ff11	[NVVM] Add various intrinsic attrs, cleanup and consolidate td (#153436 ) - llvm.nvvm.reflect - Use a PureIntrinsic for (adding speculatable), this will be replaced by a constant prior to lowering so speculation is fine. - llvm.nvvm.tex.* - Add [IntrNoCallback, IntrNoFree, IntrWillReturn] - llvm.nvvm.suld.* - Add [IntrNoCallback, IntrNoFree] and [IntrWillReturn] when not using "clamp" mode - llvm.nvvm.sust.* - Add [IntrNoCallback, IntrNoFree, IntrWriteMem] and [IntrWillReturn] when not using "clamp" mode - llvm.nvvm.[suq|txq|istypep].* - Use DefaultAttrsIntrinsic - llvm.nvvm.read.ptx.sreg.* - Add [IntrNoFree, IntrWillReturn] to non-constant reads as well.	2025-08-18 08:33:23 -07:00
Andres-Salamanca	916218ccbd	[CIR] Upstream GotoOp (#153701 ) This PR upstreams `GotoOp`. It moves some tests from the `goto` test file to the `label` test file, and adds verify logic to `FuncOp`. The gotosSolver, required for lowering, will be implemented in a future PR.	2025-08-18 10:25:40 -05:00
Craig Topper	60aa0d4bfc	[RISCV] Add P-ext MC support for pli.dh, pli.db, and plui.dh. (#153972 ) Refactor the pli.b/h/w and plui.h/w tablegen classes.	2025-08-18 08:23:14 -07:00
Jacques Pienaar	4bf33958da	[mlir] Update builders to use new form. (#154132 ) Mechanically applied using clang-tidy.	2025-08-18 15:19:34 +00:00
Jay Foad	f15c6ff6cb	[AMDGPU] Make use of SIInstrInfo::isWaitcnt. NFC. (#154087 )	2025-08-18 16:18:46 +01:00
Timm Baeder	6ce13ae1c2	[clang][bytecode] Always track item types in InterpStack (#151088 ) This has been a long-standing problem, but we didn't use to call the destructors of items on the stack unless we explicitly `pop()` or `discard()` them. When interpretation was interrupted midway-through (because something failed), we left `Pointer`s on the stack. Since all `Block`s track what `Pointer`s point to them (via a doubly-linked list in the `Pointer`), that meant we potentially leave deallocated pointers in that list. We used to work around this by removing the `Pointer` from the list before deallocating the block. However, we now want to track pointers to global blocks as well, which poses a problem since the blocks are never deallocated and thus those pointers are always left dangling. I've tried a few different approaches to fixing this but in the end I just gave up on the idea of never knowing what items are in the stack. We already have an `ItemTypes` vector that we use for debugging assertions. This patch simply enables this vector unconditionally and uses it in the abort case to properly `discard()` all elements from the stack. That's a little sad IMO but I don't know of another way of solving this problem. As expected, this is a slight hit to compile times: https://llvm-compile-time-tracker.com/compare.php?from=574d0a92060bf4808776b7a0239ffe91a092b15d&to=0317105f559093cfb909bfb01857a6b837991940&stat=instructions:u	2025-08-18 17:15:31 +02:00
AZero13	08a140add8	[AArch64] Fix build-bot assertion error in AArch64 (#154124 ) Fixes build bot assertion. I forgot to include logic that will be added in a future PR that handles -1 correctly. For now, let's just return nullptr like we used to.	2025-08-18 15:12:07 +00:00
William Tran-Viet	1c51886920	[libc++] Implement P3168R2: Give optional range support (#149441 ) Resolves #105430 - Implement all required pieces of P3168R2 - Leverage existing `wrap_iter` and `bounded_iter` classes to implement the `optional` regular and hardened iterator type, respectively - Update documentation to match	2025-08-18 18:04:45 +03:00
Tiger Ding	4ab14685a0	[AMDGPU] Narrow only on store to pow of 2 mem location (#150093 ) Lowering in GlobalISel for AMDGPU previously always narrows to i32 on truncating store regardless of mem size or scalar size, causing issues with types like i65 which is first extended to i128 then stored as i64 + i8 to i128 locations. Narrowing only on store to pow of 2 mem location ensures only narrowing to mem size near end of legalization. This LLVM defect was identified via the AMD Fuzzing project.	2025-08-19 00:04:27 +09:00
Brox Chen	7c53c6162b	[AMDGPU][True16][CodeGen] use vgpr16 for zext patterns (#153894 ) Update true16 mode with zext patterns using vgpr16 for 16bit data types. This stops isel from inserting invalid "vgpr32 = copy vgpr16"	2025-08-18 11:01:57 -04:00
David Green	03912a1de5	[GlobalISel] Translate scalar sequential vecreduce.fadd/fmul as fadd/fmul. (#153966 ) An llvm.vector.reduce.fadd(float, <1 x float>) will be translated to G_VECREDUCE_SEQ_FADD with two scalar operands, which is illegal according to the verifier. This makes sure we generate a fadd/fmul instead.	2025-08-18 14:59:44 +00:00
LLVM GN Syncbot	f4b5c24022	[gn build] Port e6e874ce8f05	2025-08-18 14:52:19 +00:00
LLVM GN Syncbot	ad064bc5c3	[gn build] Port a0f325bd41c9	2025-08-18 14:52:18 +00:00
erichkeane	ec227050e3	[OpenACC] Fix verify lines from 8fc80519cdb97c Like a big dummy, I completely skipped running this test locally and forgot it would need check lines. sigh, Looks like SOMEONE has a case of the Mondays! Anyway, this patch fixes it by adding the proper verify lines.	2025-08-18 07:49:38 -07:00
Craig Topper	98e8f01d18	[RISCV] Rename MIPS_PREFETCH->MIPS_PREF. NFC (#154062 ) This matches the instruction's assembler mnemonic.	2025-08-18 07:38:10 -07:00
erichkeane	8fc80519cd	[OpenACC] Fix crash on error recovery of variable in OpenACC mode As reported, OpenACC's variable declaration handling was assuming some semblance of legality in the example, so it didn't properly handle an error case. This patch fixes its assumptions so that we don't crash. Fixes #154008	2025-08-18 07:37:45 -07:00
Timm Baeder	8f0da9b8bd	[clang][bytecode] Disable EndLifetime op for array elements (#154119 ) This breaks a ton of libc++ tests otherwise, since calling std::destroy_at will currently end the lifetime of the entire array not just the given element. See https://github.com/llvm/llvm-project/issues/147528	2025-08-18 16:32:50 +02:00
David Green	8b52e5ac22	[AArch64] Update and cleanup irtranslator-reductions.ll. NFC	2025-08-18 15:30:23 +01:00
erichkeane	0dbcdf33b8	[OpenACC] Fix racing commit test failures for firstprivate lowering The original patch to implement basic lowering for firstprivate didn't have the Sema work to change the name of the variable being generated from openacc.private.init to openacc.firstprivate.init. I forgot about that when I merged the Sema changes this morning, so the tests now failed. This patch fixes those up. Additionally, as suggested on #153622 post-commit, it seems like a good idea to use a size of APInt that matches the size-type, so this changes us to use that instead.	2025-08-18 07:26:50 -07:00
Aaron Ballman	f5dc3021cd	[C] Fix failing assertion with designated inits (#154120 ) Incompatible pointer to integer conversion diagnostic checks would trigger an assertion when the designated initializer is for an array of unknown bounds. Fixes #154046	2025-08-18 14:22:31 +00:00
Connector Switch	b368e7f6a5	[flang] optimize `acosd` precision (#154118 ) Part of https://github.com/llvm/llvm-project/issues/150452.	2025-08-18 14:15:52 +00:00

548990 Commits