548332 Commits

Author SHA1 Message Date
Fangrui Song
856290d1c1 Revert "Add REQUIRES: riscv to test added in 151639 to skip the test when riscv is not built. (#152858)"
This reverts commit d1827f040f6e056e62cf4158bdf90d0acdf3d287.
2025-08-12 22:18:14 -07:00
Matheus Izvekov
73feab502e
[clang] fix getTrivialTemplateArgumentLoc template template argument (#153344)
This fixes a regression reported here
https://github.com/llvm/llvm-project/pull/147835#issuecomment-3181811371,
where getTrivialTemplateArgumentLoc can't see through template name
sugar when producing a trivial TemplateArgumentLoc for template template
arguments.

Since this regression was never released, there are no release notes.
2025-08-13 02:09:08 -03:00
Valentin Clement (バレンタイン クレメン)
587b6ce6b9
[flang][cuda] Add bind name for __mul24 and __umul24 (#153307) 2025-08-12 22:02:11 -07:00
Jin Huang
91de0a2c43
[libc] Refactor libc code to improve readability. (#153308)
This PR improves the readability of the files under the
`llvm-project/libc/src/wchar` directory.

---------

Co-authored-by: Jin Huang <jingold@google.com>
2025-08-12 21:41:21 -07:00
Thurston Dang
cf002847a4
Revert "[msan] Improve packed multiply-add instrumentation" (#153343)
Reverts llvm/llvm-project#152941

Buildbot breakage:
https://lab.llvm.org/buildbot/#/builders/66/builds/17843
2025-08-12 21:32:07 -07:00
Longsheng Mou
2edee0bc79
[mlir][gpu] Support outlining nested gpu.launch (#152696)
This PR fixes a crash in `GpuKernelOutliningPass` that occurred when
encountering a symbol that was not a `FlatSymbolRefAttr`, enabling
outlining of nested `gpu.launch` operations. Fixes #149318.
2025-08-13 11:42:52 +08:00
Alexey Samsonov
04081caa09
[libc] Remove LIBC_ERRNO_MODE_SYSTEM mode. (#153077)
Use LIBC_ERRNO_MODE_SYSTEM_INLINE instead as the default for the "public
packaging" (i.e. release mode) of an overlay build. The Bazel build has
already switched to use it by default in
5ccc734fa0355f971f8f515457a0bece33ab6642. This should be a safe change,
as LIBC_ERRNO_MODE_SYSTEM_INLINE works as a drop-in (but simpler)
LIBC_ERRNO_MODE_SYSTEM replacement. Remove the associated code paths and
config settings.

Fixes issue #143454.
2025-08-12 19:52:40 -07:00
Shoreshen
db96363c0a
[AMDGPU] Avoid put implicit_def into bundle that break reg's liveness (#142563)
Cause:
1. An `implicit_def` inside a bundle does not count as a definition of the
register in the machine instruction verifier.
2. Including the `implicit_def` therefore leaves the related register
undefined, resulting in `Bad machine code: Using an undefined physical
register` in the machine instruction verifier.

Fixes https://github.com/llvm/llvm-project/issues/139102

---------

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2025-08-13 10:41:44 +08:00
Matt Arsenault
d40d04f9d6
AArch64: Remove int128 compiler-rt calls from arm64ec renames (#153124)
It might have been a bug that these were previously not included,
but they don't appear to have ever been used:
https://godbolt.org/z/zE6zs8xxa

If these really exist, they probably should be included. Removes 4
unused entries from the set of libcall impls.
2025-08-13 11:41:32 +09:00
Luke Lau
9217b6ab2e
[VPlan] Enforce that there is only ever one header mask. NFC (#152489)
We almost only ever have one header mask, except with the data tail
folding style, i.e. with VPInstruction::ActiveLaneMask.

All we need to do is to make sure to erase the old header icmp based
header mask when replacing it.
2025-08-13 02:39:04 +00:00
Maksim Levental
2b842e5600
[mlir][python] fix PyThreadState_GetFrame again (#153333)
add more APIs missing from 3.8 (fix rocm builder)
2025-08-12 21:29:23 -05:00
Thurston Dang
ba603b5e4d
[msan] Improve packed multiply-add instrumentation (#152941)
The current instrumentation has false positives: if there is a single uninitialized bit in any of the operands, the entire output is poisoned. This does not take into account that multiplying an uninitialized value with zero results in an initialized zero value.

This step allows elements that are zero to clear the corresponding shadow during the multiplication step. The horizontal add step and accumulation step (if any) are modeled using bitwise OR.

Future work can apply this improved handler to the AVX512 equivalent intrinsics (x86_avx512_pmaddw_d_512, x86_avx512_pmaddubs_w_512) and AVX VNNI intrinsics.
2025-08-12 19:13:48 -07:00
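The shadow propagation described in the msan packed multiply-add commit above can be modeled in a few lines. This is only a simplified sketch in Python, not the LLVM instrumentation itself: the function names, the use of 16-bit input elements, and the choice to poison a whole product when any input bit is uninitialized are assumptions made for illustration.

```python
from typing import List

FULL_POISON = 0xFFFFFFFF  # all bits of a 32-bit product marked uninitialized


def product_shadow(a: int, sa: int, b: int, sb: int) -> int:
    """Shadow of a * b; a shadow bit of 1 means 'uninitialized'.

    An operand that is fully initialized and equal to zero forces the
    product to an initialized zero, so it clears the shadow.
    """
    a_is_initialized_zero = sa == 0 and a == 0
    b_is_initialized_zero = sb == 0 and b == 0
    if a_is_initialized_zero or b_is_initialized_zero:
        return 0
    # Otherwise any uninitialized input bit poisons the whole product.
    return FULL_POISON if (sa | sb) else 0


def pmadd_shadow(a: List[int], sa: List[int],
                 b: List[int], sb: List[int]) -> List[int]:
    """Shadow of a packed multiply-add: adjacent products are summed.

    The horizontal add is modeled as a bitwise OR of the product shadows.
    """
    assert len(a) == len(sa) == len(b) == len(sb) and len(a) % 2 == 0
    products = [product_shadow(x, sx, y, sy)
                for x, sx, y, sy in zip(a, sa, b, sb)]
    return [products[i] | products[i + 1] for i in range(0, len(products), 2)]
```

In this model an uninitialized element multiplied by an initialized zero no longer poisons the output lane, which is the false positive the commit message describes.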
Connector Switch
f4dd442395
[flang] Optimize tanpi precision (#153215)
Part of #150452.
2025-08-13 10:07:17 +08:00
Connector Switch
12e0d524bc
[flang] Optimize sinpi precision (#153211)
Part of #150452.
2025-08-13 10:06:29 +08:00
Connector Switch
d9074db137
[flang] Optimize cospi precision (#153208)
Part of #150452.
2025-08-13 10:06:09 +08:00
Connector Switch
4537f0ee61
[flang] Optimize atanpi precision (#153207)
Part of #150452.
2025-08-13 10:05:48 +08:00
Connector Switch
c664ce49e3
[flang] Optimize asinpi precision (#153203)
Part of #150452.
2025-08-13 10:05:25 +08:00
Felipe de Azevedo Piovezan
a203546496 Revert "[lldb] Call FixUpPointer in WritePointerToMemory"
This reverts commit 085a53cb89c4021da2e32d1757a1ee44668e8596.

This patch is hitting a corner case tested by
`TestScriptedProcessEmptyMemoryRegion.py`.
2025-08-12 18:51:00 -07:00
Jonas Devlieghere
84c5b9525e
[lldb] Use numeric_limits for all overflow checks in ObjectFileWasm (#153332)
Use std::numeric_limits<uint32_t>::max() for all overflow checks in
ObjectFileWasm and fix a few locations where I incorrectly used `>=`
instead of `>`.
2025-08-13 01:49:03 +00:00
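The `>` versus `>=` distinction in the ObjectFileWasm commit above is easy to get wrong, so here is a minimal illustration in Python. The helper names are hypothetical and the code is unrelated to the actual lldb sources; it only shows why the comparison against the maximum must be strict.

```python
UINT32_MAX = 2**32 - 1  # same value as std::numeric_limits<uint32_t>::max()


def overflows_uint32(value: int) -> bool:
    """Correct check: only values strictly above the maximum overflow."""
    return value > UINT32_MAX


def overflows_uint32_off_by_one(value: int) -> bool:
    """Too strict: wrongly rejects the largest representable value."""
    return value >= UINT32_MAX
```

The two checks disagree on exactly one input, `UINT32_MAX` itself, which is representable and should be accepted.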
David Majnemer
acef1db3b2 [APFloat] Remove some overly optimistic assertions
An earlier draft of DoubleAPFloat::convertToSignExtendedInteger had
arranged for overflow to be handled in a different way.  However, these
assertions are now possible if Hi+Lo are out of range and Lo != 0.

A test has been added to defend against a regression.
2025-08-12 18:32:58 -07:00
Sirui Mu
331a5db9de
[CIR] Add initial support for atomic types (#152923) 2025-08-13 09:22:48 +08:00
Sirui Mu
7b8189aab8
[CIR] Add CIRGen for pseudo destructor calls (#153014) 2025-08-13 09:21:40 +08:00
Maksim Levental
9df846bf71
[mlir][python] fix PyThreadState_GetFrame (#153325)
`PyThreadState_GetFrame` wasn't added until 3.9 (fixes currently failing
rocm builder)
2025-08-13 01:16:04 +00:00
Alex MacLean
9e6b29137b
[NVPTX] miscellaneous minor cleanup (NFC) (#152329) 2025-08-12 18:15:01 -07:00
Jonas Devlieghere
c681149ea4
Revert "[lldb] Use the Python limited API with SWIG 4.2 or later" (#153327)
Reverts llvm/llvm-project#153119 because with
`LLDB_USE_LIBEDIT_READLINE_COMPAT_MODULE`, we're using
`PyImport_Inittab` which isn't part of the stable API.
2025-08-13 01:13:37 +00:00
John Harrison
350f6abb83
[lldb] Adjusting the base MCP protocol types per the spec. (#153297)
* This adjusts the `Request`/`Response` types to have an `id` that is
either a string or a number.
* Merges 'Error' into 'Response' to have a single response type that
represents both errors and results.
* Adjusts the `Error.data` field to be any JSON value.
* Adds `operator==` support to the base protocol types and simplifies
the tests.
2025-08-12 17:56:52 -07:00
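The type changes listed in the MCP protocol commit above can be pictured with a small Python sketch. It is not the lldb C++ code: the class layout and the `code` and `message` field names are assumptions borrowed from JSON-RPC 2.0, and dataclass equality stands in for the added `operator==`.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union

# A request or response id may be either a string or a number.
Id = Union[str, int]


@dataclass
class Error:
    code: int
    message: str
    data: Any = None  # any JSON value: dict, list, str, number, bool, None


@dataclass
class Request:
    id: Id
    method: str
    params: Any = None


@dataclass
class Response:
    """Single response type carrying either a result or an error."""
    id: Id
    result: Any = None
    error: Optional[Error] = None

    def is_error(self) -> bool:
        return self.error is not None
```

Because dataclasses generate `__eq__`, tests can compare whole messages directly instead of checking field by field.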
Jonas Devlieghere
c14ca4520f
[lldb] Use the Python limited API with SWIG 4.2 or later (#153119)
Use the Python limited API when building with SWIG 4.2 or later.
2025-08-12 19:51:43 -05:00
LLVM GN Syncbot
8c27d8881b [gn build] Port 2e9944a03e6b 2025-08-13 00:27:25 +00:00
David Majnemer
f6d143fd1f [APFloat] Properly implement frexp(DoubleAPFloat)
The prior implementation did not consider that the Lo component may
underflow when it undergoes scaling.  This means that we need to
carefully handle things like binade crossings or how to handle
roundTowardZero when Hi and Lo have different signs.

Particularly annoying is roundTiesToAway when Hi and Lo have different
signs.  It basically requires us to implement roundTiesTowardZero.
2025-08-12 17:03:27 -07:00
David Majnemer
e722ef4956 Reapply "[APFloat] Properly implement DoubleAPFloat::convertToSignExtendedInteger"
This reverts commit 8b44945a9231d4d7be0858a1c5d9c13d397bc512.

The compilation failure under !NDEBUG has been fixed.
2025-08-12 17:01:49 -07:00
Mircea Trofin
374cbfd327
[licm] clone MD_prof when hoisting conditional branch (#152232)
The profiling-related metadata for the hoisted conditional branch should be copied from the original branch, not from the current terminator of the block it's hoisted to.

The patch adds a way to disable the fix just so we can do an ablation test, after which the flag will be removed. The same flag will be reused for other similar fixes.

(This was identified through `profcheck` (see Issue #147390), and this PR addresses most of the test failures (when running under profcheck) under `Transforms/LICM`.)
2025-08-13 02:01:00 +02:00
Thurston Dang
e8608960b1 [asan] Disable fakestack_alignment.ll test for Android
This test, introduced in
457b14c327,
breaks the Android build bot (https://lab.llvm.org/buildbot/#/builders/186/builds/11522).

ASan on Android has been deprecated in favor of HWASan
(https://source.android.com/docs/security/test/asan), so disable this
test.
2025-08-12 23:51:59 +00:00
Nico Weber
d25eddd77c [gn] port a02444fb69e6 (OutOfProcessInterpreterTests.cpp revert)
This reverts 130ddbb01917c3be97.
2025-08-12 19:51:40 -04:00
Slava Zakharin
b8e4232bd2
[flang] Cast fir.select[_rank] selector to i64. (#153239)
Properly cast the selector to `i64` regardless of its integer type.
We used to generate llvm.trunc always.

We have to use `i64` as long as the case values may exceed INT_MAX.

Fixes #153050.
2025-08-12 16:43:44 -07:00
Steven Wu
6032ff6c81
[CAS] Fix a bug in CAS storeFromOpenFileImpl (#153315)
Fix a bug in upstreamed CAS implementation due to copy/paste error. The
missing coverage will be covered by future upstreamed tests.
2025-08-12 23:25:14 +00:00
Sam Elliott
7317e3c9dd [NFC][RISCV] Correct signed/unsigned in Comment 2025-08-12 16:17:22 -07:00
Daniel Paoliello
fc2146ef31
[win][arm64ec] Handle Arm64EC for Clang CodeGen tests that current XFAIL AArch64 Windows (#153255)
* `c-strings.c` - add an `XFAIL` for Arm64EC and add a comment to
explain the failure.
* `volatile-1.c` - add a regex for alignment during loads.
2025-08-12 16:12:49 -07:00
Daniel Paoliello
2a82e23146
Fix handling of dontcall attributes for arches that lower calls via fastSelectInstruction (#153302)
Recently my change to avoid duplicate `dontcall` attribute errors
(#152810) caused the Clang `Frontend/backend-attribute-error-warning.c`
test to fail on Arm32:
<https://lab.llvm.org/buildbot/#/builders/154/builds/20134>

The root cause is that, if the default `IFastSel` path bails, then
targets are given the opportunity to lower instructions via
`fastSelectInstruction`. That's the path taken by Arm32 and since its
implementation of `selectCall` didn't call `diagnoseDontCall` no error
was emitted.

I've checked the other implementations of `fastSelectInstruction` and
the only other one that lowers call instructions is WebAssembly, so I've
fixed that too.
2025-08-12 16:12:22 -07:00
Chris B
6e59d1da08
Revert "[DirectX][objdump] Add support for printing signatures" (#153313)
Reverts llvm/llvm-project#152531
2025-08-12 17:33:56 -05:00
Stanislav Mekhanoshin
d0ee82040c
[AMDGPU] Add s_barrier_init|join|leave instructions (#153296) 2025-08-12 15:07:07 -07:00
Adam Yang
8710571aba
[AMDGPU] Fixed llvm-debuginfo-analyzer for AMDGPU. (#145125)
Constructing Target triple with `ObjectFile::makeTriple` instead of just
with `Arch` and leaving the rest unknown. Also creating the subtarget
with the `CPU`. AMDGPU needs the full triple and `CPU` to disassemble
correctly.

To run a full test, also fixed a failure in `SIPreAllocateWWMRegs` with
the `$noreg` operand in `DBG_VALUE`.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-08-12 22:04:52 +00:00
Chris B
9526d3b0b9
[DirectX][objdump] Add support for printing signatures (#152531)
This adds support for printing the signature sections as part of the
`-p` flag for printing private headers.

The formatting aims to roughly match the formatting used by DXC's
`/dumpbin` flag.

Resolves #152380.
2025-08-12 17:00:14 -05:00
Maksim Levental
a40f47c972
[mlir][python] automatic location inference (#151246)
This PR implements "automatic" location inference in the bindings. The
way it works is it walks the frame stack collecting source locations
(Python captures these in the frame itself). It is inspired by JAX's
[implementation](523ddcfbca/jax/_src/interpreters/mlir.py (L462))
but moves the frame stack traversal into the bindings for better
performance.

The system supports registering "included" and "excluded" filenames;
frames originating from functions in included filenames **will not** be
filtered and frames originating from functions in excluded filenames
**will** be filtered (in that order). This allows excluding all the
generated `*_ops_gen.py` files.

The system is also "toggleable" and off by default to save people who
have their own systems (such as JAX) from the added cost.

Note, the system stores the entire stacktrace (subject to
`locTracebackFramesLimit`) in the `Location` using specifically a
`CallSiteLoc`. This can be useful for profiling tools (flamegraphs
etc.).

Shoutout to the folks at JAX for coming up with a good system.

---------

Co-authored-by: Jacques Pienaar <jpienaar@google.com>
2025-08-12 16:59:59 -05:00
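The frame walk and filename filtering described in the location inference commit above can be sketched in pure Python. This is an illustration only: the real traversal lives in the bindings, and the function name, parameters, and substring matching used here are assumptions rather than the MLIR Python API.

```python
import sys
from typing import List, Sequence, Tuple

Location = Tuple[str, int, str]  # (filename, line number, function name)


def collect_locations(included: Sequence[str] = (),
                      excluded: Sequence[str] = (),
                      frames_limit: int = 10,
                      skip: int = 1) -> List[Location]:
    """Walk the Python frame stack outward from the caller.

    Filters are applied in order: a frame whose filename matches an
    included entry is always kept; otherwise it is dropped if it matches
    an excluded entry; otherwise it is kept.
    """
    locations: List[Location] = []
    frame = sys._getframe(skip)  # skip=1 starts at the caller's frame
    while frame is not None and len(locations) < frames_limit:
        filename = frame.f_code.co_filename
        if any(name in filename for name in included):
            keep = True
        elif any(name in filename for name in excluded):
            keep = False
        else:
            keep = True
        if keep:
            locations.append((filename, frame.f_lineno, frame.f_code.co_name))
        frame = frame.f_back
    return locations
```

The resulting list, innermost frame first, is the kind of data a nested call-site location could be built from; `frames_limit` plays the role the commit assigns to `locTracebackFramesLimit`.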
Nick Sarnie
da3182a288
[LLVM][docs] Update full list of options for LLVM_TARGETS_TO_BUILD (#153299)
We added `SPIRV` as a non-experimental backend in
cda81b1ec9.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-08-12 21:48:29 +00:00
Farzon Lotfi
1ca8ad29db
[SPIRV] Create a new OpSelect selector and fix register types. (#152311)
fixes #135572

Two issues are responsible. First, register types are copied from older
registers instead of being derived from the SPIR-V types.

Second, because of the way OpSelect is defined in SPIRVInstrInfo.td, we
always default to integer for TernOpTyped. There seem to be multiple
matches in the getMatchTable, so when executeMatchTable runs we aren't
getting the right OpSelect.

Correcting the tablegen wasn't easy, so this change instead creates an
emitter for Select that evaluates the register types. This passes the original
llvm/test/CodeGen/SPIRV/instructions/select.ll tests and the new float
ones I'm adding in issue-135572-emit-float-opselect.ll
2025-08-12 17:43:30 -04:00
Nick Sarnie
116c318225
[Clang][Driver] Don't pass -mllvm to the linker for non-LLVM offloading toolchains (#153272)
When determining what arguments to pass to `clang-linker-wrapper` as
device linker args, don't forward `-mllvm` args if the offloading
toolchain doesn't have native LLVM support.

I saw this when working with SPIR-V, we were trying to pass `-mllvm` to
`spirv-link`.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-08-12 21:41:00 +00:00
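The argument filtering described in the Clang driver commit above amounts to dropping each `-mllvm <value>` pair when the offloading toolchain lacks native LLVM support. The following Python sketch shows the idea; the function and parameter names are hypothetical, and the real driver logic is C++ and more involved.

```python
from typing import List


def device_linker_args(args: List[str],
                       has_native_llvm_support: bool) -> List[str]:
    """Return the arguments to forward as device linker args.

    `-mllvm` takes its value as the next argument, so the pair is kept
    or dropped together.
    """
    forwarded: List[str] = []
    i = 0
    while i < len(args):
        if args[i] == "-mllvm":
            if has_native_llvm_support:
                forwarded.extend(args[i:i + 2])
            i += 2  # consume the flag and its value
            continue
        forwarded.append(args[i])
        i += 1
    return forwarded
```

With a linker such as `spirv-link`, which does not understand LLVM options, the `-mllvm` pairs are dropped while every other argument passes through unchanged.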
Stanislav Mekhanoshin
af67e0f94f
[AMDGPU] Remove obsolete comments from VOP1Instructions.td. NFC. (#153249) 2025-08-12 14:29:21 -07:00
Philip Reames
49b17a0c1c
[MIR] Further cleanup on multiple save/restore point support [nfc] (#153250)
Remove the type alias now that the std::variant aspect is gone, directly
using std::vector in the few places that need it is more idiomatic.

Move a routine from a core header to single user.
2025-08-12 14:16:41 -07:00
Trevor Gross
919021b0df
[Arm64EC] Add support for half (#152843)
`f16` is passed and returned in vector registers on both x86 and AArch64,
the same calling convention as `f32`, so it is a straightforward type to
support. The calling convention support already exists, added as part of
a6065f0fa55a ("Arm64EC entry/exit thunks, consolidated. (#79067)").
Thus, add mangling and remove the error in order to make `half` work.

MSVC does not yet support `_Float16`, so for now this will remain an
LLVM-only extension.

Fixes the `f16` portion of
https://github.com/llvm/llvm-project/issues/94434
2025-08-12 14:15:52 -07:00
Florian Hahn
8cdab07aaa
Reapply "[VPlan] Remove trivial dead VPPhi cycles."
This reverts commit 1c7c8e3ad39957285524ff116d9a6aec0d9b62f9.

Recommit with a fix for the verifier error caused for EVL recipes.

Extra test coverage added in 6f939da60e.
2025-08-12 22:09:30 +01:00