548354 Commits

Author SHA1 Message Date
Aaditya
460ec42cc1 Code Formatting 2025-08-21 14:59:38 +05:30
Aaditya
15b3c6682f Directly checking for S_XOR_B32 2025-08-21 14:59:38 +05:30
Aaditya
9f5dfe1fad Running Clang Format 2025-08-21 14:59:37 +05:30
Aaditya
47bb973176 Removing break before else 2025-08-21 14:59:37 +05:30
Aaditya
0819895763 Removing Redundant Instructions 2025-08-21 14:59:37 +05:30
Aaditya
12c1daf0ce [AMDGPU] Extending wave reduction intrinsics for i64 types - 3
Supporting Arithmetic Operations: `and`, `or`, `xor`
2025-08-21 14:59:37 +05:30
Aaditya
e5007647e5 Adding helper function for expanding arithmetic ops. 2025-08-21 14:58:43 +05:30
Aaditya
163ae0d91e Checking for targets with native 64-bit add/sub support 2025-08-20 14:55:18 +05:30
Aaditya
991f9b6ddf Marking dead scc 2025-08-20 13:57:44 +05:30
Aaditya
6579973bcd Renaming Variables 2025-08-20 13:57:43 +05:30
Aaditya
79b9c33304 [AMDGPU] Extending wave reduction intrinsics for i64 types - 2
Supporting Arithmetic Operations: `add`, `sub`
2025-08-20 13:57:43 +05:30
Aaditya
cae47329ff Using S_MOV_B64_IMM_PSEUDO instead of dealing with legality concerns. 2025-08-20 13:57:34 +05:30
Aaditya
9362371fdc Addressing Review Comments 2025-08-13 18:31:43 +05:30
Aaditya
4277c1370b [AMDGPU] Extending wave reduction intrinsics for i64 types - 1
Supporting Min/Max Operations: `min`, `max`, `umin`, `umax`
2025-08-13 12:33:51 +05:30
Valentin Clement (バレンタイン クレメン)
2ae4e95dda
[flang][cuda] Add bind name for __ddiv_XX interfaces (#153271) 2025-08-12 23:30:43 -07:00
Valentin Clement (バレンタイン クレメン)
60170f92a3
[flang][cuda] Add missing interface for __powf (#153294)
`__powf` is defined in the CUDA Fortran programming guide but it's
missing from our cudadevice module. Add the interface and bind name to
`__nv_powf`


https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#fortran-device-modules


https://docs.nvidia.com/cuda/libdevice-users-guide/__nv_powf.html#__nv_powf
2025-08-12 23:08:41 -07:00
Ryotaro Kasuga
bce0f9d2bf
[DA] Extract duplicated logic from gcdMIVtest (NFCI) (#152688)
This patch refactors `gcdMIVtest` by consolidating duplicated logic into
a single function. The main goal of this change is to improve code
maintainability rather than readability, especially since we may need to
revise this logic for correctness (as noted in the added TODO comments).

I hope this patch is NFC, but I've also added several new assertions,
which may cause some previously passing cases to fail.
2025-08-13 15:07:50 +09:00
Valentin Clement (バレンタイン クレメン)
09505b11e5
[flang][cuda] Add missing interface for __cosf (#153306)
`__cosf` is listed as supported here:
https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#fortran-device-modules

Add the missing interface with a bind c name linking it to `__nv_cosf`
2025-08-12 22:48:45 -07:00
Fangrui Song
04eb5e0cd4 test: Add REQUIRES: riscv 2025-08-12 22:39:57 -07:00
Fangrui Song
94655dc8ae [ELF] -r: Synthesize R_RISCV_ALIGN at input section start" (#151639)
Clear `synthesizedAligns` to prevent stray relocations to an unrelated
text section. Enhance the test to check llvm-readelf -r output.

---

Without linker relaxation enabled for a particular relocatable file or
section (e.g., using .option norelax), the assembler will not generate
R_RISCV_ALIGN relocations for alignment directives. This becomes
problematic in a two-stage linking process:

```
ld -r a.o b.o -o ab.o
// b.o is norelax. Its alignment information is lost in ab.o.
ld ab.o -o ab
```

When ab.o is linked into an executable, the preceding relaxed section
(a.o's content) might shrink. Since there's no R_RISCV_ALIGN relocation
in b.o for the linker to act upon, the `.word 0x3a393837` data in b.o
may end up unaligned in the final executable.

To address the issue, this patch inserts NOP bytes and synthesizes an
R_RISCV_ALIGN relocation at the beginning of a text section when the
alignment >= 4.

For simplicity, when RVC is disabled, we synthesize an ALIGN relocation
(addend: 2) for a 4-byte aligned section, allowing the linker to trim
the excess 2 bytes.

See also https://sourceware.org/bugzilla/show_bug.cgi?id=33236
2025-08-12 22:38:17 -07:00
Valentin Clement (バレンタイン クレメン)
136c5586bd
[flang][cuda] Add bind name for __clz interface (#153268) 2025-08-12 22:28:20 -07:00
Fangrui Song
98164d4706 Revert "[ELF] -r: Synthesize R_RISCV_ALIGN at input section start" (#151639)
This reverts commit 6f53f1c8d2bdd13e30da7d1b85ed6a3ae4c4a856.

`synthesizedAligns` is not cleared, leading to stray relocations for
unrelated sections. Revert for now.
2025-08-12 22:18:15 -07:00
Fangrui Song
856290d1c1 Revert "Add REQUIRES: riscv to test added in 151639 to skip the test when riscv is not built. (#152858)"
This reverts commit d1827f040f6e056e62cf4158bdf90d0acdf3d287.
2025-08-12 22:18:14 -07:00
Matheus Izvekov
73feab502e
[clang] fix getTrivialTemplateArgumentLoc template template argument (#153344)
This fixes a regression reported here
https://github.com/llvm/llvm-project/pull/147835#issuecomment-3181811371,
where getTrivialTemplateArgumentLoc can't see through template name
sugar when producing a trivial TemplateArgumentLoc for template template
arguments.

Since this regression was never released, there are no release notes.
2025-08-13 02:09:08 -03:00
Valentin Clement (バレンタイン クレメン)
587b6ce6b9
[flang][cuda] Add bind name for __mul24 and __umul24 (#153307) 2025-08-12 22:02:11 -07:00
Jin Huang
91de0a2c43
[libc] Refactor libc code to improve readability. (#153308)
This PR improves the readability of the files under the
`llvm-project/libc/src/wchar` directory.

---------

Co-authored-by: Jin Huang <jingold@google.com>
2025-08-12 21:41:21 -07:00
Thurston Dang
cf002847a4
Revert "[msan] Improve packed multiply-add instrumentation" (#153343)
Reverts llvm/llvm-project#152941

Buildbot breakage:
https://lab.llvm.org/buildbot/#/builders/66/builds/17843
2025-08-12 21:32:07 -07:00
Longsheng Mou
2edee0bc79
[mlir][gpu] Support outlining nested gpu.launch (#152696)
This PR fixes a crash in `GpuKernelOutliningPass` that occurred when
encountering a symbol that was not a `FlatSymbolRefAttr`, enabling
outlining of nested `gpu.launch` operations. Fixes #149318.
2025-08-13 11:42:52 +08:00
Alexey Samsonov
04081caa09
[libc] Remove LIBC_ERRNO_MODE_SYSTEM mode. (#153077)
Use LIBC_ERRNO_MODE_SYSTEM_INLINE instead as the default for the "public
packaging" (i.e. release mode) of an overlay build. The Bazel build has
already switched to use it by default in
5ccc734fa0355f971f8f515457a0bece33ab6642. This should be a safe change,
as LIBC_ERRNO_MODE_SYSTEM_INLINE works as a drop-in (but simpler)
LIBC_ERRNO_MODE_SYSTEM replacement. Remove the associated code paths and
config settings.

Fixes issue #143454.
2025-08-12 19:52:40 -07:00
Shoreshen
db96363c0a
[AMDGPU] Avoid putting implicit_def into a bundle where it breaks the reg's liveness (#142563)
Cause:
1. An `implicit_def` inside a bundle does not count as a definition of
the register in the machine verifier.
2. Including the `implicit_def` therefore leaves the related register
undefined, resulting in `Bad machine code: Using an undefined physical
register` in the machine verifier.

Fixes https://github.com/llvm/llvm-project/issues/139102

---------

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
2025-08-13 10:41:44 +08:00
Matt Arsenault
d40d04f9d6
AArch64: Remove int128 compiler-rt calls from arm64ec renames (#153124)
It might have been a bug that these were previously not included,
but they don't appear to have ever been used:
https://godbolt.org/z/zE6zs8xxa

If these really exist, they probably should be included. Removes 4
unused entries from the set of libcall impls.
2025-08-13 11:41:32 +09:00
Luke Lau
9217b6ab2e
[VPlan] Enforce that there is only ever one header mask. NFC (#152489)
We almost only ever have one header mask, except with the data tail
folding style, i.e. with VPInstruction::ActiveLaneMask.

All we need to do is make sure to erase the old icmp-based header mask
when replacing it.
2025-08-13 02:39:04 +00:00
Maksim Levental
2b842e5600
[mlir][python] fix PyThreadState_GetFrame again (#153333)
add more APIs missing from 3.8 (fix rocm builder)
2025-08-12 21:29:23 -05:00
Thurston Dang
ba603b5e4d
[msan] Improve packed multiply-add instrumentation (#152941)
The current instrumentation has false positives: if there is a single uninitialized bit in any of the operands, the entire output is poisoned. This does not take into account that multiplying an uninitialized value with zero results in an initialized zero value.

This step allows elements that are zero to clear the corresponding shadow during the multiplication step. The horizontal add step and accumulation step (if any) are modeled using bitwise OR.

Future work can apply this improved handler to the equivalent AVX512 intrinsics (x86_avx512_pmaddw_d_512, x86_avx512_pmaddubs_w_512) and AVX VNNI intrinsics.
2025-08-12 19:13:48 -07:00
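The shadow rule described in ba603b5e4d can be sketched for one output lane of a packed multiply-add (two 16-bit products summed into a 32-bit result). This is a minimal illustration and not the MemorySanitizer implementation: `Lane`, `productShadow`, and `maddPairShadow` are hypothetical names, and two details are assumptions, namely that an element must be fully initialized as well as zero to clear the shadow, and that a poisoned product poisons the whole 32-bit lane.

```cpp
#include <cstdint>

// Shadow convention assumed here: a set shadow bit means "uninitialized".
// Lane is a hypothetical stand-in for one 16-bit element and its shadow.
struct Lane {
  std::int16_t value;
  std::uint16_t shadow;
};

// Shadow of one 16x16 product, widened to the 32-bit result lane.
// Assumption: an operand that is fully initialized and equal to zero makes
// the product a known zero, so it clears the shadow of the other operand.
constexpr std::uint32_t productShadow(Lane a, Lane b) {
  const bool aIsInitializedZero = a.shadow == 0 && a.value == 0;
  const bool bIsInitializedZero = b.shadow == 0 && b.value == 0;
  if (aIsInitializedZero || bIsInitializedZero)
    return 0;
  // Otherwise any uninitialized bit poisons the whole product.
  return (a.shadow | b.shadow) != 0 ? 0xFFFFFFFFu : 0u;
}

// The horizontal add of the two products is modeled with bitwise OR.
constexpr std::uint32_t maddPairShadow(Lane a0, Lane b0, Lane a1, Lane b1) {
  return productShadow(a0, b0) | productShadow(a1, b1);
}

// An initialized zero hides the uninitialized factor next to it.
static_assert(maddPairShadow(Lane{0, 0}, Lane{5, 0xFFFF},
                             Lane{3, 0}, Lane{4, 0}) == 0, "");
// A single uninitialized bit still poisons a non-zero product.
static_assert(maddPairShadow(Lane{2, 0}, Lane{5, 0x0001},
                             Lane{3, 0}, Lane{4, 0}) == 0xFFFFFFFFu, "");
```

Under the previous rule described in the message, the first case would also have been reported as fully poisoned.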
Connector Switch
f4dd442395
[flang] Optimize tanpi precision (#153215)
Part of #150452.
2025-08-13 10:07:17 +08:00
Connector Switch
12e0d524bc
[flang] Optimize sinpi precision (#153211)
Part of #150452.
2025-08-13 10:06:29 +08:00
Connector Switch
d9074db137
[flang] Optimize cospi precision (#153208)
Part of #150452.
2025-08-13 10:06:09 +08:00
Connector Switch
4537f0ee61
[flang] Optimize atanpi precision (#153207)
Part of #150452.
2025-08-13 10:05:48 +08:00
Connector Switch
c664ce49e3
[flang] Optimize asinpi precision (#153203)
Part of #150452.
2025-08-13 10:05:25 +08:00
Felipe de Azevedo Piovezan
a203546496 Revert "[lldb] Call FixUpPointer in WritePointerToMemory"
This reverts commit 085a53cb89c4021da2e32d1757a1ee44668e8596.

This patch is hitting a corner case tested by
`TestScriptedProcessEmptyMemoryRegion.py`.
2025-08-12 18:51:00 -07:00
Jonas Devlieghere
84c5b9525e
[lldb] Use numeric_limits for all overflow checks in ObjectFileWasm (#153332)
Use std::numeric_limits<uint32_t>::max() for all overflow checks in
ObjectFileWasm and fix a few locations where I incorrectly used `>=`
instead of `>`.
2025-08-13 01:49:03 +00:00
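The `>` versus `>=` distinction mentioned in 84c5b9525e can be illustrated with a small range check. This is a sketch under assumptions, not the ObjectFileWasm code: `narrowToU32` is a hypothetical helper that narrows a 64-bit value to `uint32_t`. Using `>=` would wrongly reject the largest representable value.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns the value as uint32_t, or std::nullopt if it
// does not fit. The comparison must be `>`, because the maximum itself is
// still representable.
constexpr std::optional<std::uint32_t> narrowToU32(std::uint64_t value) {
  if (value > std::numeric_limits<std::uint32_t>::max())
    return std::nullopt;
  return static_cast<std::uint32_t>(value);
}

// The maximum fits; one past the maximum does not.
static_assert(narrowToU32(0xFFFFFFFFull).has_value(), "");
static_assert(!narrowToU32(0x100000000ull).has_value(), "");
```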
David Majnemer
acef1db3b2 [APFloat] Remove some overly optimistic assertions
An earlier draft of DoubleAPFloat::convertToSignExtendedInteger had
arranged for overflow to be handled in a different way. However, these
assertions can now fire if Hi+Lo is out of range and Lo != 0.

A test has been added to defend against a regression.
2025-08-12 18:32:58 -07:00
Sirui Mu
331a5db9de
[CIR] Add initial support for atomic types (#152923) 2025-08-13 09:22:48 +08:00
Sirui Mu
7b8189aab8
[CIR] Add CIRGen for pseudo destructor calls (#153014) 2025-08-13 09:21:40 +08:00
Maksim Levental
9df846bf71
[mlir][python] fix PyThreadState_GetFrame (#153325)
`PyThreadState_GetFrame` wasn't added until 3.9 (fixes currently failing
rocm builder)
2025-08-13 01:16:04 +00:00
Alex MacLean
9e6b29137b
[NVPTX] miscellaneous minor cleanup (NFC) (#152329) 2025-08-12 18:15:01 -07:00
Jonas Devlieghere
c681149ea4
Revert "[lldb] Use the Python limited API with SWIG 4.2 or later" (#153327)
Reverts llvm/llvm-project#153119 because with
`LLDB_USE_LIBEDIT_READLINE_COMPAT_MODULE`, we're using
`PyImport_Inittab` which isn't part of the stable API.
2025-08-13 01:13:37 +00:00
John Harrison
350f6abb83
[lldb] Adjusting the base MCP protocol types per the spec. (#153297)
* This adjusts the `Request`/`Response` types to have an `id` that is
either a string or a number.
* Merges `Error` into `Response` to have a single response type that
represents both errors and results.
* Adjusts the `Error.data` field to be any JSON value.
* Adds `operator==` support to the base protocol types and simplifies
the tests.
2025-08-12 17:56:52 -07:00
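The type shape described in 350f6abb83 (an `id` that is a string or a number, and a single response type carrying either an error or a result) can be sketched with `std::variant`. The definitions below are hypothetical stand-ins and not the lldb protocol types: the result payload is simplified to a `std::string` where the message describes a JSON value, and the `Error.data` field is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in: an id is either a number or a string.
using Id = std::variant<std::int64_t, std::string>;

struct Error {
  std::int64_t code = 0;
  std::string message;
};
inline bool operator==(const Error &a, const Error &b) {
  return a.code == b.code && a.message == b.message;
}

// One response type for both outcomes. The std::string alternative is a
// simplification standing in for an arbitrary JSON result.
struct Response {
  Id id;
  std::variant<Error, std::string> result;
};
inline bool operator==(const Response &a, const Response &b) {
  return a.id == b.id && a.result == b.result;
}

inline bool isError(const Response &r) {
  return std::holds_alternative<Error>(r.result);
}

// Usage: equality lets a test compare whole responses directly.
inline void example() {
  Response ok{Id{std::int64_t{1}}, std::string("pong")};
  Response err{Id{std::string("req-2")}, Error{-32601, "method not found"}};
  assert(!isError(ok));
  assert(isError(err));
  assert(ok == ok);
  assert(!(ok == err));
}
```

With `operator==` defined on the types, a test can build the expected `Response` and compare it in one assertion instead of checking field by field.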
Jonas Devlieghere
c14ca4520f
[lldb] Use the Python limited API with SWIG 4.2 or later (#153119)
Use the Python limited API when building with SWIG 4.2 or later.
2025-08-12 19:51:43 -05:00
LLVM GN Syncbot
8c27d8881b [gn build] Port 2e9944a03e6b 2025-08-13 00:27:25 +00:00