563170 Commits

Author SHA1 Message Date
Krzysztof Parzyszek
603904fa19
[flang][OpenMP] Make OmpDependenceKind be a common enum, NFC (#172871)
In OpenMP 6.0 a subset of the dependence types is also used in the
`depinfo-modifier` on INIT clause. Make the enum be a common type to
avoid defining separate enum types with mostly identical members.

Use the name `OmpDependenceKind` because the other obvious candidate,
OmpDependenceType, used to be a modifier name in older OpenMP specs.
2025-12-18 10:23:57 -06:00
vangthao95
55089733b6
[AMDGPU][GlobalISel] Add readanylane combines for merge-like instructions (#172546)

When a merge-like instruction has all readanylane sources and the result
is copied to VGPRs, eliminate the readanylanes by either using the
original unmerge source directly or building a new merge with the VGPR
sources.
2025-12-18 08:04:06 -08:00
Sudharsan Veeravalli
3bf0a8d6e1
[RISCV] Add Xqci feature flag (#172608)
This patch adds an experimental Xqci feature flag that covers all the
sub-extensions in the Qualcomm uC Extension.
2025-12-18 21:32:49 +05:30
Simon Pilgrim
50bda7296b
[X86] combineConcatVectorOps - add handling for SITOFP vector ops (#172866) 2025-12-18 15:44:16 +00:00
Alex Duran
ae739a240c
[OFFLOAD] Recognize level_zero backend in liboffload (#172818)
The code to recognize the level_zero plugin as a liboffload backend was
split from #158900. This PR adds the support back.

---------

Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18 15:31:36 +00:00
Alex Duran
5559918321
[OFFLOAD][L0] Improve symbol device lookup (#172820)
When looking up the device address of a symbol, we also need to check
whether it is a function symbol if it is not found as a global symbol on
the device.

---------

Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-18 15:31:20 +00:00
Alex Duran
3ac0ff2f36
[OFFLOAD][L0] Fix usages of getDebugLevel in L0 plugin (#172815)
Support for getDebugLevel was removed as part of the new debug macros
(#165416). This PR updates such usages to use the new ODBG_* macros.

---------

Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18 15:30:59 +00:00
Nick Sarnie
d5bc6b191f
[clang][Driver][SPIRV] Add better error when SPIR-V tools is not found (#171704)
Today if SPIR-V Tools is not found, you get the below error:

```
clang: error: unable to execute command: posix_spawn failed: No such file or directory
clang: error: spirv-as command failed with exit code 1 (use -v to see invocation)
```
which is not exactly user-friendly.

Explain what software package is missing and give suggestions on getting
it.

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2025-12-18 15:21:41 +00:00
Connector Switch
12e5bd1e30
[libc] Add IN6_IS_ADDR_V4COMPAT (#172646)
This patch adds the `IN6_IS_ADDR_V4COMPAT` macro, which checks whether
an address is IPv4-compatible.
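
As a rough illustration of what such a check involves, here is a minimal sketch. It is not LLVM libc's implementation: `Ipv6Bytes` and `isV4Compat` are hypothetical names, and the sketch assumes the common convention that `::` and `::1` are not treated as IPv4-compatible.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical representation: the 16 address bytes in network order.
using Ipv6Bytes = std::array<std::uint8_t, 16>;

// Assumed semantics: the first 96 bits are zero and the trailing 32 bits,
// read as a big-endian integer, are greater than 1 (so :: and ::1 are
// rejected).
constexpr bool isV4Compat(const Ipv6Bytes &a) {
  for (int i = 0; i < 12; ++i)
    if (a[i] != 0)
      return false;
  const std::uint32_t tail = (std::uint32_t(a[12]) << 24) |
                             (std::uint32_t(a[13]) << 16) |
                             (std::uint32_t(a[14]) << 8) |
                             std::uint32_t(a[15]);
  return tail > 1;
}
```

Under these assumptions, `::192.0.2.1` is accepted while `::` and `::1` are rejected.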
2025-12-18 23:13:54 +08:00
Connector Switch
e071e43589
[libc] Add IN6_IS_ADDR_V4MAPPED (#172645)
This patch adds the `IN6_IS_ADDR_V4MAPPED` macro, which checks whether
an address is an IPv4-mapped address.
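
For readers unfamiliar with the layout, an IPv4-mapped address has the form `::ffff:a.b.c.d`. The sketch below shows that test on raw bytes. `MappedBytes` and `isV4Mapped` are hypothetical names used only for illustration, and the patch's actual macro may be written differently.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical representation: the 16 address bytes in network order.
using MappedBytes = std::array<std::uint8_t, 16>;

// The first 80 bits are zero, the next 16 bits are all ones, and the last
// 32 bits carry the IPv4 address (any value).
constexpr bool isV4Mapped(const MappedBytes &a) {
  for (int i = 0; i < 10; ++i)
    if (a[i] != 0)
      return false;
  return a[10] == 0xff && a[11] == 0xff;
}
```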
2025-12-18 23:12:49 +08:00
Connector Switch
12d4889a0a
[libc] Add IN6_IS_ADDR_MC* (#172643)
This patch adds the `IN6_IS_ADDR_MC*` macros, which check whether an
address is a multicast node-local, link-local, site-local,
organization-local, or global address.
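
The five scopes map onto the low four bits of the second address byte, as defined in RFC 4291. The sketch below illustrates that mapping. All names in it are hypothetical, and it is not a copy of the macros added by the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical representation: the 16 address bytes in network order.
using McBytes = std::array<std::uint8_t, 16>;

// Multicast addresses start with the byte 0xff.
constexpr bool isMulticast(const McBytes &a) { return a[0] == 0xff; }

// The scope is the low nibble of the second byte (RFC 4291 values).
constexpr bool hasMulticastScope(const McBytes &a, std::uint8_t scope) {
  return isMulticast(a) && (a[1] & 0x0f) == scope;
}

constexpr bool isMcNodeLocal(const McBytes &a) { return hasMulticastScope(a, 0x1); }
constexpr bool isMcLinkLocal(const McBytes &a) { return hasMulticastScope(a, 0x2); }
constexpr bool isMcSiteLocal(const McBytes &a) { return hasMulticastScope(a, 0x5); }
constexpr bool isMcOrgLocal(const McBytes &a) { return hasMulticastScope(a, 0x8); }
constexpr bool isMcGlobal(const McBytes &a) { return hasMulticastScope(a, 0xe); }
```

For instance, `ff02::1` (all nodes on the link) satisfies only the link-local predicate.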
2025-12-18 23:11:35 +08:00
Craig Topper
a256c03206
[RISCV] Rename -enable-p-ext-codegen to -riscv-enable-p-ext-simd-codegen. (#172790)
Make it clear this only applies to SIMD code and that it belongs to
RISC-V.
2025-12-18 07:11:16 -08:00
Connector Switch
3f79d13aff
[libc] Add IN6_IS_ADDR_MULTICAST (#172498)
This patch adds the `IN6_IS_ADDR_MULTICAST` macro, which checks whether
an address is a multicast address.
2025-12-18 23:10:33 +08:00
Connector Switch
ecd46d350a
[libc] Add IN6_IS_ADDR_LOOPBACK (#172312)
This patch adds the `IN6_IS_ADDR_LOOPBACK` macro, which checks whether
an address is the loopback address.
2025-12-18 23:09:35 +08:00
Connector Switch
f904c2ad59
[libc] Add IN6_IS_ADDR_UNSPECIFIED (#172311)
This patch adds the `IN6_IS_ADDR_UNSPECIFIED` macro, which checks
whether an address is the unspecified address.
2025-12-18 23:06:54 +08:00
Jan Patrick Lehr
195c1c0dc0
[OpenMP][Offload] Fix test after #172382 (#172865)
The test added in #172382 requires a debug build.
2025-12-18 16:05:59 +01:00
Min-Yih Hsu
e742015f43
[RISCV] Assign separate latencies for vector COPYs in SpacemitX60 scheduling model (#172556)
Currently, we assign the same scheduling info to COPY regardless of
whether it's a scalar or vector one. But this might cause a vector COPY
from physical registers to be scheduled too close to its consumer,
prolonging the physical register live range and running out of registers
during RA, as seen in #167008.

This patch addresses this issue by creating schedule variants for COPY
instructions of vector register classes so that they can have the same
latency as simple vector arithmetic (WriteVIALUV). It is worth noting
that we _only_ need latency in this case -- keeping processor resources
in (vector) COPYs still causes the aforementioned register shortage
issue, because these COPYs might then be blocked by structural hazards
and, again, get sunk further down than we want.
2025-12-18 07:04:42 -08:00
Baranov Victor
b0ef56de79
[clang-tidy][NFC] Remove redundant braces with clang-format 'RemoveBracesLLVM' (2/N) (#172751) 2025-12-18 18:04:02 +03:00
Mel Chen
f196b1d66f
[VPlan] Extract reverse operation for reverse accesses (#146525)
This patch introduces VPInstruction::Reverse and extracts the reverse
operations of loaded/stored values from reverse memory accesses. This
extraction facilitates future support for permutation elimination within
VPlan.
2025-12-18 14:57:48 +00:00
macurtis-amd
e741cd88a1
AMDGPU/PromoteAlloca: Fix handling of users of multiple allocas (#172771)
With recent refactoring, LDS promotion worklists for all allocas are
populated upfront. In some cases, this results in a User in multiple
lists. Then as each list is processed, a User might get deleted via
removeFromParent, potentially leaving a dangling pointer in a subsequent
worklist.

Currently this only occurs for memcpy and memmove. Prior to refactoring,
these were handled by DeferredInstr, and were processed after the last
use of the then singular worklist.

This change moves processing of DeferredInstr to after all worklists
have been processed.
2025-12-18 08:41:21 -06:00
guan jian
4e675a0c45
[SelectionDAG] Lowering usub.sat(a, 1) to a - (a != 0) (#170076)
I recently observed that LLVM generates the following code:
```
	addi	a1, a0, -1
	sltu	a0, a0, a1
	addi	a0, a0, -1
	and	a0, a0, a1
	ret
```
This could be optimized using the snez instruction instead.
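
The identity behind the title can be checked in plain C++. The sketch below is only an illustration at the source level, not the SelectionDAG code. `usubSat` and `usubSatOne` are hypothetical helper names, with `usubSat` standing in for the semantics of `usub.sat`.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of unsigned saturating subtraction: clamp at zero.
constexpr std::uint32_t usubSat(std::uint32_t a, std::uint32_t b) {
  return a > b ? a - b : 0;
}

// The rewritten form for b == 1: subtract 1 only when a is non-zero.
// (a != 0) is the value a "set if not equal to zero" instruction produces.
constexpr std::uint32_t usubSatOne(std::uint32_t a) {
  return a - static_cast<std::uint32_t>(a != 0);
}

// Compile-time spot checks of the equivalence at the boundary values.
static_assert(usubSatOne(0) == usubSat(0, 1), "saturates at zero");
static_assert(usubSatOne(1) == usubSat(1, 1), "1 - 1 == 0");
static_assert(usubSatOne(UINT32_MAX) == usubSat(UINT32_MAX, 1), "no wrap");
```

The two forms agree for every input: when `a` is zero nothing is subtracted, and otherwise exactly one is.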
2025-12-18 14:31:53 +00:00
Simon Pilgrim
345d763986
[X86] Add tests showing failure to concat matching SITOFP/UITOFP vector ops (#172852)
Tests have to perform an additional FADD to prevent
combineConcatVectorOfCasts from performing the fold - we're trying to
show when this fails to occur during a combineConcatVectorOps recursion.

Interestingly, due to uitofp expansion, AVX1/2 often manages to
concat where AVX512 can't.
2025-12-18 14:28:12 +00:00
Phoebe Wang
d6c2cd69cb
[X86][APX] Check APXSave before enabling APX features (#172834)
According to APX spec 3.1.4.2, APX instructions can normally execute
only when XCR0[APX_F]=1, where APX_F=19.
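
The bit test itself is simple. The sketch below assumes the XCR0 value has already been read (for example via `xgetbv`) and is passed in as a plain integer. `kApxFBit` and `apxStateEnabled` are hypothetical names, not the ones used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Bit position of APX_F in XCR0, as quoted in the commit message.
constexpr unsigned kApxFBit = 19;

// Returns true when the OS has enabled APX state, i.e. XCR0[APX_F] == 1.
// The caller is assumed to supply the XCR0 value it read from the CPU.
constexpr bool apxStateEnabled(std::uint64_t xcr0) {
  return ((xcr0 >> kApxFBit) & 1u) != 0;
}
```

A feature-detection routine would combine this with the CPUID feature bit, so that APX is reported only when both the hardware and the OS support it.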
2025-12-18 22:22:20 +08:00
Benjamin Maxwell
492ca62e2c
[AArch64][SVE] Generalize extract_elt => plast fold to i32 indices (#172692)
This occurs after type legalization, so the index type can be i32 or
i64. This patch simplifies the matching and checks for the optional zero
extend.

Also, a few tests from when this fold was added had broken due to
incorrectly adding `nuw` to the `add <eltCount>, #-1`, which this patch
corrects.
2025-12-18 14:15:20 +00:00
Erich Keane
1940010e15
[OpenMP][CIR] Implement 'parallel's 'proc_bind' clause lowering (#172501)
This patch implements the first clause for OMP, the 'proc_bind'
clause. This clause takes one of a handful of values and just passes it
on to the OMP dialect.

The 'default' value for this isn't present in the OMP dialect, however
the classic-codegen doesn't generate the library call when this value is
passed, so this is effectively a 'no-op'.
2025-12-18 06:08:13 -08:00
Krzysztof Parzyszek
755f298ddc
[flang][OpenMP] Implement COMBINER clause (#172036)
This adds parsing and lowering of the COMBINER clause. It utilizes the
existing lowering code for combiner-expression to lower the COMBINER
clause as well.
2025-12-18 08:04:28 -06:00
Krzysztof Parzyszek
1deee91bf5
[flang][OpenMP] Move some class definitions into right place, NFC (#172736)
They were accidentally committed out of alphabetical order.
2025-12-18 07:46:13 -06:00
Abdelrehim, Ahmed Yaser Farouk
8bd5ba7af7
[Clang] Allow AVX/AVX2 lane permute operations in constexpr (#172149)
Resolves #169312 
Enables the usage of the following X86 intrinsics in `constexpr`:
```c
_mm256_permute2f128_pd	    _mm256_permute2f128_ps
_mm256_permute2f128_si256    _mm256_permute2x128_si256
```
2025-12-18 13:41:05 +00:00
LLVM GN Syncbot
6c52b0821f [gn build] Port 50ae726bb349 2025-12-18 13:33:55 +00:00
Andres-Salamanca
4ba8352693
[CIR] Partially upstream coroutine co_return support (#171755)
This PR partially upstreams support for the `co_return` keyword. It
still needs to address the case where a `co_return` returns a value from
a `co_await`.
Additionally, this change focuses on `emitBodyAndFallthrough`, where,
depending on whether the function falls through or not, it will emit the
user-written `co_await`. Another thing to note is the difference from
classic CodeGen: previously it checked whether it could fall through by
using `GetInsertBlock()` to verify that the block existed. In our case,
when a `co_return` is emitted, we call `setCoreturn()` to indicate that
the coroutine contains a `co_return`.
2025-12-18 08:30:52 -05:00
Xing Xue
50ae726bb3
[libc++][AIX] Move to new locale APIs (#172068)
This patch moves to the new locale base APIs for AIX.

Co-authored-by: Nikolas Klauser <nikolasklauser@berlin.de>
2025-12-18 08:27:10 -05:00
lonely eagle
4a9342392d
[mlir] Use SymbolOpInterface to implement operateOnSymbol in test-symbol-uses pass (#172675)
Fix https://github.com/llvm/llvm-project/issues/172603 by using
SymbolOpInterface to implement operateOnSymbol.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-12-18 20:58:30 +08:00
Stefan Gränitz
bdc822d2cf
[lli] Honor --jit-linker-kind=rtdyld on platforms that default to jitlink (#167860)
So far, the setting enforced only jitlink but not rtdyld. We get better
test coverage now that we honor both cases. We drop EPC-based execution
on the way, because with ORC lli always executes in-process.
2025-12-18 13:53:31 +01:00
Simon Pilgrim
cd7c511cc0
[X86] combineConcatVectorOps - add handling for CVTPS2DQ/CVTTPS2DQ vector ops (#172841) 2025-12-18 12:52:11 +00:00
Matt Arsenault
c68fa5ebab
AMDGPU: Handle amdgcn_rcp in computeKnownFPClass (#172490) 2025-12-18 13:50:11 +01:00
Med Ismail Bennani
6767b86c34
[lldb] Fix frame-format string missing space when module is invalid (#172767)
This patch is a follow-up to 96c733e to fix a missing space in the
frame.pc format entity. This space was intended to be prepended to the
module format entity scope but if the module is not valid, which is
often the case for python pc-less scripted frames, the space between the
pc and the function name is missing.

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
2025-12-18 13:38:58 +01:00
Stefan Gränitz
8fe1dddce6
[lldb] Restrict JITLoaderGDB test to native Linux environments (#172292)
This test used to work on non-Linux platforms that could run simple ELF
objects in a JIT session. However, there is a risk that this will become
too unstable for CI, so let's limit it to what we actually need.
2025-12-18 13:38:49 +01:00
nerix
d1e98939c8
[LLDB] Run MSVC STL vector tests with PDB (#172726) 2025-12-18 13:18:53 +01:00
Baranov Victor
0206f18a9c
[clang-tidy][NFC] Remove redundant braces with clang-format 'RemoveBracesLLVM' (3/N) (#172752) 2025-12-18 15:04:06 +03:00
Baranov Victor
233a88579f
[clang-tidy][NFC] Remove redundant braces with clang-format 'RemoveBracesLLVM' (1/N) (#172748)
Prepare codebase to enable
https://clang.llvm.org/docs/ClangFormatStyleOptions.html#removebracesllvm.
2025-12-18 14:58:21 +03:00
Paul Walker
cba7bb9d2f
[LLVM][CodeGen][X86] Make printConstant's output for vector ConstantFP match that of ConstantVector. (#172679) 2025-12-18 11:58:05 +00:00
Simon Pilgrim
5f84dfff53
[X86] Add tests showing failure to concat matching CVTPS2DQ/CVTTPS2DQ vector ops (#172836) 2025-12-18 11:55:21 +00:00
Alexey Moksyakov
6f748698f7
Revert "[bolt][aarch64] simplify rodata/literal load for X86 & AArch6… (#172822)
A few tests are broken on Ubuntu; the cause still needs to be found.
This reverts commit 999c9382571d6aadf9b786263862bf4085dd2dba.

Co-authored-by: yavtuk <yavtuk@ya.ru>
2025-12-18 14:19:17 +03:00
Frederik Harwath
5c05824d2b
[CodeGen] Rename expand-fp to expand-ir-insts (#172681)
The pass now contains a non-fp expansion and should
be used for any similar expansions regardless of the
types involved. Hence a generic name seems apt.

Rename the source files, pass, and adjust the pass
description. Move all tests for the expansions
that have previously been merged into the pass
to a single directory.
2025-12-18 11:15:04 +00:00
David Spickett
80e3548372
[llvm][AMDGPU] Fix signed/unsigned comparison warning in 32-bit builds (#172623)
llvm::count_if calls std::count_if which returns a difference_type.
difference_type is always signed but is never going to be a negative
value when used as the result of count_if.

This resulted in warnings in our 32-bit Arm builds like: 
```
AMDGPUIGroupLP.cpp:1050:20: warning: comparison of integers of different signs: 
'typename iterator_traits<const SDep *>::difference_type' (aka 'int') and 'unsigned int' [-Wsign-compare]
 1050 |       if (SuccSize >= Size)
      |           ~~~~~~~~ ^  ~~~~
```

I presume these warnings are not generated in 64-bit builds because
unsigned is 32-bit even for 64-bit platforms and there is no risk in
extending 32-bit unsigned into 64-bit signed.

To fix the warning I've changed the type of SuccSize to unsigned, and
the assignment acts like a static_cast into that type.
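
The pattern can be reproduced outside LLVM with `std::count_if`. The sketch below is a standalone illustration, not the code from `AMDGPUIGroupLP.cpp`. `countAtLeast` and `reachesSize` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// std::count_if returns the iterator's difference_type, which is signed.
// Storing the result in an unsigned variable converts it on assignment,
// which is safe here because a count is never negative.
unsigned countAtLeast(const std::vector<int> &values, int threshold) {
  unsigned matches = std::count_if(values.begin(), values.end(),
                                   [=](int v) { return v >= threshold; });
  return matches;
}

// Both operands are now unsigned, so -Wsign-compare has nothing to flag.
bool reachesSize(const std::vector<int> &values, int threshold,
                 unsigned size) {
  return countAtLeast(values, threshold) >= size;
}
```

An explicit `static_cast<unsigned>` would express the same conversion more visibly. The sketch follows the implicit form described in the commit message.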
2025-12-18 11:11:09 +00:00
Marco Elver
11e8237545
[LowerAllowCheck] Move tests to Transforms/LowerAllowCheck (#172028)
Group the LowerAllowCheck tests in their own directory, like other
Transforms tests.

NFC.
2025-12-18 12:09:46 +01:00
Jay Foad
35c2dbd481
[AMDGPU] Remove trivially true predicates from GCNSubtarget. NFC. (#172830) 2025-12-18 11:05:34 +00:00
Ilia Kuklin
f4e941b209
[lldb] Use AST nodes as Subscript and BitExtraction arguments in DIL (#169363)
Use AST nodes as Subscript and BitExtraction arguments instead of bare
integers. This enables using any supported expression as an array or bit
index.
2025-12-18 16:04:31 +05:00
Matt Arsenault
d6f159dd05
AMDGPU: Add pattern for copysign of 0 (#172699)
Avoiding v_bfi_b32 is desirable since on gfx9 it
requires materializing the constant.

Similar could be done for infinity, with or 0x7fffffff
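
The reason a cheaper pattern exists is that `copysign(0.0, x)` depends only on the sign bit of `x`. The sketch below shows that bit-level view in C++, assuming a 32-bit IEEE-754 `float`. `copysignZero` is a hypothetical name and the code is not the AMDGPU pattern itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// copysign(0.0f, x) computed with a single AND: keep only the sign bit of
// x, which yields +0.0f or -0.0f.
float copysignZero(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits &= 0x80000000u;
  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}
```

The result compares equal to `0.0f` in both cases, and only `std::signbit` distinguishes them.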
2025-12-18 11:34:24 +01:00
Charles Zablit
7fe5953a44
[lldb][windows] add Windows Virtual Console support (#168729) 2025-12-18 10:29:38 +00:00