llvm-project

Author	SHA1	Message	Date
Michael Buch	c68b4d64dd	[lldb][ClangASTImporter][NFC] Create helper for CanImport Upstreams a `CanImport` helper for `clang::Decl`s.	2025-08-13 10:22:23 +01:00
Michael Buch	89681839e3	[lldb][ClangASTImporter][NFC] Factor out completion logic out of ClangASTImporterDelegate Upstreams two helpers that make this more readable.	2025-08-13 10:22:23 +01:00
Ahmed Bougacha	8c8f3286a7	[compiler-rt] Don't run arm64e builtins tests on darwin. (#153312 ) The compiler-rt build gradually learned to target arm64e. With that, we build builtins for arm64e, but running their tests usually isn't possible, because most versions of macOS so far restrict arm64e (on account of its unstable ABI). Starting with macOS 26, arm64e executables can be run, because the aligned linker automatically targets ptrauth ABI version 1. Without that, (at ABI version 0) these can't be executed. We can't rely on or require new linkers (and we elsewhere explicitly fall back to ld classic anyway), so in the meantime one way to execute these would be to explicitly ask for ABI version 1, which we generally try to avoid, and don't support in our llvm (which unconditionally targets ABI version 0). This is also an uncommon situation; sanitizer runtime tests aren't run on arm64e today, because we haven't listed arm64e as a supported arch yet. Everything other than builtins also tests for execution in cmake first; we should consider that, but it has its own problems. So we can simply disable arm64e from tests, by filtering it out as a valid darwin host arch, which accurately reflects reality. When we try to add arm64e sanitizer runtime build and test support, we'll want to change that, but that's a bigger problem than builtins.	2025-08-13 10:21:34 +01:00
Adam Siemieniuk	7d1b9cad87	[mlir][amx] Vector to AMX conversion pass (#151121 ) Adds a pass for Vector to AMX operation conversion. Initially, a direct rewrite for vector contraction in packed VNNI layout is supported. Operations are expected to already be in shapes which are AMX-compatible for the rewriting to occur.	2025-08-13 11:08:52 +02:00
Nikita Popov	240c454c4d	[CodeGen] Remove default ctors for InputArg and OutputArg (#153205 ) These make it easy to forget to initialize some members, like the newly added OrigTy. Force these to always go through the ctor instead.	2025-08-13 10:51:43 +02:00
David Spickett	b563b274b8	[lldb] Convert registers values into target endian for expressions (#148836 ) Relates to https://github.com/llvm/llvm-project/issues/135707 Where it was reported that reading the PC using "register read" had different results to an expression "$pc". This was happening because registers are treated in lldb as pure "values" that don't really have an endian. We have to store them somewhere on the host of course, so the endian becomes host endian. When you want to use a register as a value in an expression you're pretending that it's a variable in memory. In target memory. Therefore we must convert the register value to that endian before use. The test I have added is based on the one used for XML register flags. Where I fake an AArch64 little endian and an s390x big endian target. I set up the data in such a way the pc value should print the same for both, either with register read or an expression. I considered just adding a live process test that checks the two are the same but with no one doing cross endian testing, I doubt it would have ever caught this bug. Simulating this means most of the time, little endian hosts will test little to little and little to big. In the minority of cases with a big endian host, they'll check the reverse. Covering all the combinations.	2025-08-13 09:48:29 +01:00
David Spickett	dc41571cd8	[llvm][docs] Update CMake commands for cross compiling Arm builtins (#151544 ) This does a few things: * LLVM_CONFIG_PATH is deprecated, use LLVM_CMAKE_DIR instead. * Don't use $ before command examples. I would normally, but the key cmake commands didn't use it so I removed it from all commands. * Makes the commands shown full commands, so you don't have to piece them together. * Uses shell variables to cut down on repetition and make this easier to port to other targets. * Adds a few options to disable more compiler-rt things. * Use the built in cmake options for sysroot and toolchains. * Include test options in the first cmake command, so you don't have to re-do the whole thing after you read the testing section. * Removes the section about using BaremetalARM.cmake. The closest I got to getting that cache to work was: ``` SYSROOT=/home/david.spickett/arm-gnu-toolchain-14.3.rel1-x86_64-arm-none-eabi/arm-none-eabi/libc LLVM_TOOLCHAIN=/home/david.spickett/LLVM-20.1.8-Linux-X64/ cmake \ -G Ninja \ -DCMAKE_C_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \ -DBAREMETAL_ARMV6M_SYSROOT=${SYSROOT} \ -DBAREMETAL_ARMV7M_SYSROOT=${SYSROOT} \ -DBAREMETAL_ARMV7EM_SYSROOT=${SYSROOT} \ -DCMAKE_BUILD_TYPE=Release \ -DLLVM_ENABLE_RUNTIMES="compiler-rt" \ -C ../llvm-project/clang/cmake/caches/BaremetalARM.cmake \ -DCOMPILER_RT_BUILD_BUILTINS=ON \ -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \ -DCOMPILER_RT_BUILD_MEMPROF=OFF \ -DCOMPILER_RT_BUILD_PROFILE=OFF \ -DCOMPILER_RT_BUILD_CTX_PROFILE=OFF \ -DCOMPILER_RT_BUILD_SANITIZERS=OFF \ -DCOMPILER_RT_BUILD_XRAY=OFF \ -DCOMPILER_RT_BUILD_ORC=OFF \ -DCOMPILER_RT_BUILD_CRT=OFF \ ../llvm-project/runtimes ``` All this does is build the x86 builtins. I tried forcing the issue with: ``` -DBUILTIN_SUPPORTED_ARCH="armv7m;armv6m;armv7em" \ ``` But again, just x86. It's probably something deep in compiler-rt failing a compiler check for the Arm targets. Even if that's the case, fixing that means adding more options to the cmake command. 
I can't find evidence of a full command using this cache file since the commit that introduced it and that command no longer works. I think if you ever got this to work again the command would be as long and complex as the ones already shown in the document. I would also argue that some of the other caches, for example Fuchsia's, are much better examples of multi-target runtimes builds. If what's in this document isn't enough, folks should be learning from those files and about the runtimes build overall before attempting anything complex (though it does not take much to be "complex").	2025-08-13 09:47:43 +01:00
Diana Picus	420a5de1a4	[AMDGPU] Ignore inactive VGPRs in .vgpr_count (#149052 ) When using the `amdgcn.init.whole.wave` intrinsic, we add dummy VGPR arguments with the purpose of preserving their inactive lanes. The pattern may look something like this: ``` entry: call amdgcn.init.whole.wave branch to shader or tail shader: $vInactive = IMPLICIT_DEF ; Tells regalloc it's safe to use the active lanes actual code... tail: call amdgcn.cs.chain [...], implicit $vInactive ``` We should not report these VGPRs in the `.vgpr_count` metadata. This patch achieves that goal by ignoring meta instructions and calls. This should be safe since if those registers are actually used in any other context, they will be counted there. The same reasoning applies in the general case, so we don't explicitly check for the existence of `init.whole.wave`. This is a reworked version of #133242, which was reverted in #144039 and split into smaller bits.	2025-08-13 10:47:00 +02:00
Ryotaro Kasuga	bf6796fa8f	[DA] Extract duplicated logic from exactSIVtest and exactRDIVtest (NFC) (#152712 ) This patch refactors `exactSIVtest` and `exactRDIVtest` by consolidating duplicated logic into a single function. Same as #152688, the main goal is to improve code maintainability, since extra validation logic (as written in TODO comments) may be necessary.	2025-08-13 17:45:28 +09:00
Timm Baeder	56131e3959	[clang][bytecode] Diagnose incomplete types more consistently (#153368 ) To match the diagnostics of the current interpreter.	2025-08-13 10:40:21 +02:00
Nikolas Klauser	78636be4d6	[libc++] Move more tests into test/extensions (#152975 ) This should be the last set of tests moved to `test/extensions` for now.	2025-08-13 10:14:24 +02:00
Nikolas Klauser	3ca414b63a	[libc++] Move some standard tests from test/libcxx (#152982 ) This also removes some tests which were redundant, wrong, or never run. Specifically, - `libcxx/utilities/meta/stress_tests/*` were never run and are of questionable usefulness - `libcxx/utilities/template.bitset/includes.pass.cpp` is completely redundant and partially incorrect Also notably, `libcxx/language.support/support.c.headers/support.c.headers.other/math.lerp.verify.cpp` has been refactored to only test the standard mandate.	2025-08-13 10:13:46 +02:00
Simon Pilgrim	267f592ca0	[Headers][X86] Allow _mm_cmov_si128/_mm256_cmov_si256 intrinsics to be used in constexpr (#153236 )	2025-08-13 08:53:26 +01:00
Benjamin Maxwell	271688b87a	[AArch64][SME] Port all SME routines to RuntimeLibcalls (#152505 ) This updates everywhere we emit/check an SME routine to use RuntimeLibcalls to get the function name and calling convention. Note: RuntimeLibcallEmitter had some issues with emitting non-unique variable names for sets of libcalls, so I tweaked the output to avoid the need for variables.	2025-08-13 08:48:59 +01:00
Mel Chen	b9138bde35	[LV][EVL] More lit tests for interleaved access. nfc (#152959 ) Add test cases for reverse interleaved access and interleaved access with gap.	2025-08-13 15:43:39 +08:00
Jasmine Tang	d32793ca6e	Revert "[WebAssembly] Combine i128 to v16i8 for setcc & expand memcmp for 16 byte loads with simd128" (#153360 ) Reverts llvm/llvm-project#149461 The first test w/ memcmp in `test/neon/test_neon_wasm_simd.cpp` in the Emscripten test suite has failed. This PR applies a revert so I can take a closer look at it Test case link: https://github.com/emscripten-core/emscripten/blob/main/test/neon/test_neon_wasm_simd.cpp Compile option: `em++ test_neon_wasm_simd.cpp -O2 -mfpu=neon -msimd128 -o something.js` Original comment report: https://github.com/llvm/llvm-project/pull/149461#issuecomment-3181652746	2025-08-13 07:41:44 +00:00
Florian Hahn	48bfaa4c06	[VPlan] Replace VPBB for vector.ph during skeleton creation (NFC) Shift replacement of regular VPBB for vector.ph with the VPIRBB wrapping the created IR block directly to skeleton creation, to be consistent with how the scalar preheader is handled.	2025-08-13 08:30:18 +01:00
Aiden Grossman	dfe18b1a0e	[libcxx] Bump clang version to v22 (#153264 ) Clang tip of tree is now v22, so bump the versions based on that now that we have an updated container image. --------- Co-authored-by: Nikolas Klauser <nikolasklauser@berlin.de>	2025-08-13 09:26:42 +02:00
Abhishek Kaushik	2415e3b3bf	[NFC][MC][GOFF] Use `llvm_unreachable` for unreachable case (#152930 )	2025-08-13 12:56:12 +05:30
Aiden Grossman	7f4d201db4	[libcxx] Bump container image to 77cb098 (#153095 ) Switch to the next runner set to evaluate switching the container image to 77cb098.	2025-08-13 09:24:02 +02:00
Matt Arsenault	db126d8004	CodeGen: Make MachineFunction's subtarget member a reference (#153352 )	2025-08-13 16:22:32 +09:00
yanming	02ab6f358c	[flang][fir][NFC] unify flang's code style with the rest.	2025-08-13 15:11:06 +08:00
Sergei Barannikov	8f3254aa4a	[TableGen][DecoderEmitter] Returns insn_t / std::vector<Islands> by value (NFC) (#153354 ) The containers passed by reference are always empty on entry to the functions that fill them. Return them by value instead and let the compiler do the return value optimization.	2025-08-13 07:09:13 +00:00
Valentin Clement (バレンタインクレメン)	2ae4e95dda	[flang][cuda] Add bind name for __ddiv_XX interfaces (#153271 )	2025-08-12 23:30:43 -07:00
Valentin Clement (バレンタインクレメン)	60170f92a3	[flang][cuda] Add missing interface for __powf (#153294 ) `__powf` is defined in the CUDA Fortran programming guide but it's missing from our cudadevice module. Add the interface and bind name to `__nv_powf` https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#fortran-device-modules https://docs.nvidia.com/cuda/libdevice-users-guide/__nv_powf.html#__nv_powf	2025-08-12 23:08:41 -07:00
Ryotaro Kasuga	bce0f9d2bf	[DA] Extract duplicated logic from gcdMIVtest (NFCI) (#152688 ) This patch refactors `gcdMIVtest` by consolidating duplicated logic into a single function. The main goal of this change is to improve code maintainability rather than readability, especially since we may need to revise this logic for correctness (as noted in the added TODO comments). I hope this patch is NFC, but I've also added several new assertions, which may cause some previously passing cases to fail.	2025-08-13 15:07:50 +09:00
Valentin Clement (バレンタインクレメン)	09505b11e5	[flang][cuda] Add missing interface for __cosf (#153306 ) `__cosf` is mentioned to be supported here: https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#fortran-device-modules Add the missing interface with a bind c name linking it to `__nv_cosf`	2025-08-12 22:48:45 -07:00
Fangrui Song	04eb5e0cd4	test: Add REQUIRES: riscv	2025-08-12 22:39:57 -07:00
Fangrui Song	94655dc8ae	[ELF] -r: Synthesize R_RISCV_ALIGN at input section start (#151639 ) Clear `synthesizedAligns` to prevent stray relocations to an unrelated text section. Enhance the test to check llvm-readelf -r output. --- Without linker relaxation enabled for a particular relocatable file or section (e.g., using .option norelax), the assembler will not generate R_RISCV_ALIGN relocations for alignment directives. This becomes problematic in a two-stage linking process: ``` ld -r a.o b.o -o ab.o // b.o is norelax. Its alignment information is lost in ab.o. ld ab.o -o ab ``` When ab.o is linked into an executable, the preceding relaxed section (a.o's content) might shrink. Since there's no R_RISCV_ALIGN relocation in b.o for the linker to act upon, the `.word 0x3a393837` data in b.o may end up unaligned in the final executable. To address the issue, this patch inserts NOP bytes and synthesizes an R_RISCV_ALIGN relocation at the beginning of a text section when the alignment >= 4. For simplicity, when RVC is disabled, we synthesize an ALIGN relocation (addend: 2) for a 4-byte aligned section, allowing the linker to trim the excess 2 bytes. See also https://sourceware.org/bugzilla/show_bug.cgi?id=33236	2025-08-12 22:38:17 -07:00
Valentin Clement (バレンタインクレメン)	136c5586bd	[flang][cuda] Add bind name for __clz interface (#153268 )	2025-08-12 22:28:20 -07:00
Fangrui Song	98164d4706	Revert "[ELF] -r: Synthesize R_RISCV_ALIGN at input section start" (#151639 ) This reverts commit 6f53f1c8d2bdd13e30da7d1b85ed6a3ae4c4a856. `synthesizedAligns` is not cleared, leading to stray relocations for unrelated sections. Revert for now.	2025-08-12 22:18:15 -07:00
Fangrui Song	856290d1c1	Revert "Add `REQUIRES: riscv` to test added in 151639 to skip the test when riscv is not built. (#152858 )" This reverts commit d1827f040f6e056e62cf4158bdf90d0acdf3d287.	2025-08-12 22:18:14 -07:00
Matheus Izvekov	73feab502e	[clang] fix getTrivialTemplateArgumentLoc template template argument (#153344 ) This fixes a regression reported here https://github.com/llvm/llvm-project/pull/147835#issuecomment-3181811371, where getTrivialTemplateArgumentLoc can't see through template name sugar when producing a trivial TemplateArgumentLoc for template template arguments. Since this regression was never released, there are no release notes.	2025-08-13 02:09:08 -03:00
Valentin Clement (バレンタインクレメン)	587b6ce6b9	[flang][cuda] Add bind name for __mul24 and __umul24 (#153307 )	2025-08-12 22:02:11 -07:00
Jin Huang	91de0a2c43	[libc] Refactor libc code to improve readability. (#153308 ) The PR is going to improve the readability for the files under `llvm-project/libc/src/wchar` directory. --------- Co-authored-by: Jin Huang <jingold@google.com>	2025-08-12 21:41:21 -07:00
Thurston Dang	cf002847a4	Revert "[msan] Improve packed multiply-add instrumentation" (#153343 ) Reverts llvm/llvm-project#152941 Buildbot breakage: https://lab.llvm.org/buildbot/#/builders/66/builds/17843	2025-08-12 21:32:07 -07:00
Longsheng Mou	2edee0bc79	[mlir][gpu] Support outlining nested `gpu.launch` (#152696 ) This PR fixes a crash in `GpuKernelOutliningPass` that occurred when encountering a symbol that was not a `FlatSymbolRefAttr`, enabling outlining of nested `gpu.launch` operations. Fixes #149318.	2025-08-13 11:42:52 +08:00
Alexey Samsonov	04081caa09	[libc] Remove LIBC_ERRNO_MODE_SYSTEM mode. (#153077 ) Use LIBC_ERRNO_MODE_SYSTEM_INLINE instead as the default for the "public packaging" (i.e. release mode) of an overlay build. The Bazel build has already switched to use it by default in 5ccc734fa0355f971f8f515457a0bece33ab6642. This should be a safe change, as LIBC_ERRNO_MODE_SYSTEM_INLINE works as a drop-in (but simpler) LIBC_ERRNO_MODE_SYSTEM replacement. Remove the associated code paths and config settings. Fixes issue #143454.	2025-08-12 19:52:40 -07:00
Shoreshen	db96363c0a	[AMDGPU] Avoid put implicit_def into bundle that break reg's liveness (#142563 ) Cause: 1. An `implicit_def` inside a bundle does not count as a definition of the register in the machine instruction verifier. 2. Including `implicit_def` in the bundle therefore leaves the related register undefined, resulting in `Bad machine code: Using an undefined physical register` in the machine instruction verifier. Fixes https://github.com/llvm/llvm-project/issues/139102 --------- Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>	2025-08-13 10:41:44 +08:00
Matt Arsenault	d40d04f9d6	AArch64: Remove int128 compiler-rt calls from arm64ec renames (#153124 ) It might have been a bug that these were previously not included, but they don't appear to have ever been used: https://godbolt.org/z/zE6zs8xxa If these really exist, they probably should be included. Removes 4 unused entries from the set of libcall impls.	2025-08-13 11:41:32 +09:00
Luke Lau	9217b6ab2e	[VPlan] Enforce that there is only ever one header mask. NFC (#152489 ) We almost only ever have one header mask, except with the data tail folding style, i.e. with VPInstruction::ActiveLaneMask. All we need to do is to make sure to erase the old icmp-based header mask when replacing it.	2025-08-13 02:39:04 +00:00
Maksim Levental	2b842e5600	[mlir][python] fix PyThreadState_GetFrame again (#153333 ) add more APIs missing from 3.8 (fix rocm builder)	2025-08-12 21:29:23 -05:00
Thurston Dang	ba603b5e4d	[msan] Improve packed multiply-add instrumentation (#152941 ) The current instrumentation has false positives: if there is a single uninitialized bit in any of the operands, the entire output is poisoned. This does not take into account that multiplying an uninitialized value with zero results in an initialized zero value. This step allows elements that are zero to clear the corresponding shadow during the multiplication step. The horizontal add step and accumulation step (if any) are modeled using bitwise OR. Future work can apply this improved handler to the AVX512 equivalent intrinsics (x86_avx512_pmaddw_d_512, x86_avx512_pmaddubs_w_512) and AVX VNNI intrinsics.	2025-08-12 19:13:48 -07:00
Connector Switch	f4dd442395	[flang] Optimize `tanpi` precision (#153215 ) Part of #150452.	2025-08-13 10:07:17 +08:00
Connector Switch	12e0d524bc	[flang] Optimize `sinpi` precision (#153211 ) Part of #150452.	2025-08-13 10:06:29 +08:00
Connector Switch	d9074db137	[flang] Optimize `cospi` precision (#153208 ) Part of #150452.	2025-08-13 10:06:09 +08:00
Connector Switch	4537f0ee61	[flang] Optimize `atanpi` precision (#153207 ) Part of #150452.	2025-08-13 10:05:48 +08:00
Connector Switch	c664ce49e3	[flang] Optimize `asinpi` precision (#153203 ) Part of #150452.	2025-08-13 10:05:25 +08:00
Felipe de Azevedo Piovezan	a203546496	Revert "[lldb] Call FixUpPointer in WritePointerToMemory" This reverts commit 085a53cb89c4021da2e32d1757a1ee44668e8596. This patch is hitting a corner case tested by `TestScriptedProcessEmptyMemoryRegion.py`.	2025-08-12 18:51:00 -07:00
Jonas Devlieghere	84c5b9525e	[lldb] Use numeric_limits for all overflow checks in ObjectFileWasm (#153332 ) Use std::numeric_limits<uint32_t>::max() for all overflow checks in ObjectFileWasm and fix a few locations where I incorrectly used `>=` instead of `>`.	2025-08-13 01:49:03 +00:00
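
The `>=` versus `>` distinction in the last entry above can be illustrated with a minimal sketch. The helper names `FitsInU32` and `FitsInU32OffByOne` are hypothetical and are not taken from ObjectFileWasm; the sketch only assumes that a 64-bit value is being checked before it is narrowed to `uint32_t`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from ObjectFileWasm): true when a 64-bit value
// can be stored in a uint32_t without truncation.
constexpr bool FitsInU32(std::uint64_t value) {
  // Overflow occurs only when the value is strictly greater than the maximum.
  return !(value > std::numeric_limits<std::uint32_t>::max());
}

// The off-by-one variant: using `>=` also rejects the maximum itself,
// even though that value is representable.
constexpr bool FitsInU32OffByOne(std::uint64_t value) {
  return !(value >= std::numeric_limits<std::uint32_t>::max());
}

// Compile-time checks of the boundary cases.
static_assert(FitsInU32(0xFFFFFFFFull), "UINT32_MAX is representable");
static_assert(!FitsInU32(0x100000000ull), "UINT32_MAX + 1 overflows");
static_assert(!FitsInU32OffByOne(0xFFFFFFFFull),
              "`>=` wrongly rejects UINT32_MAX");
```

Comparing against `std::numeric_limits<uint32_t>::max()` rather than a literal keeps the bound tied to the destination type, so the check stays correct if the type is changed later.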

548363 Commits