58592 Commits

Author SHA1 Message Date
Durgadoss R
a03b2250db
[NVPTX][Docs] [NFC] Update docs on intrinsics (#133136)
Recently, we have added a set of complex intrinsics on
the TMA, tcgen05, and Cvt family of instructions.

This patch captures the key learnings from our experience
so far and documents them as guidelines for future design.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-04-04 15:39:25 +05:30
Tobias Stadler
1302610f03
[MergeFunc] Fix crash caused by bitcasting ArrayType (#133259)
createCast in MergeFunctions did not consider ArrayTypes, which results
in the creation of a bitcast between ArrayTypes in the thunk function,
leading to an assertion failure in the provided test case.

The version of createCast in GlobalMergeFunctions does handle
ArrayTypes, so this common code has been factored out into the
IRBuilder.
2025-04-04 10:16:40 +01:00
Fangrui Song
92c93f5286 [MC] Merge MCAsmLexer and AsmLexer
Follow-up to #134207

Both classes define `IsAtStartOfStatement` but the semantics are
confusingly different. Rename the base class one.
2025-04-03 22:11:49 -07:00
Fangrui Song
c9f6d26e04
[MC] Merge MCAsmLexer.{h,cpp} into AsmLexer.{h,cpp} (#134207)
2b11c7de4ae182496438e166cb6758d41b6e1740 introduced
`llvm/include/llvm/MC/MCAsmLexer.h` and made `AsmLexer` inherit from
`MCAsmLexer`, likely to allow target-specific parsers to depend solely
on `MCAsmLexer`. However, this separation now seems unnecessary and
confusing.

`MCAsmLexer` defines virtual functions with `AsmLexer` as its only
implementation, and `AsmLexer` itself has few extra public methods.

To simplify the codebase, this change merges MCAsmLexer.{h,cpp} into
AsmLexer.{h,cpp}. MCAsmLexer.h is temporarily kept as a forwarder.

Note: I doubt that a downstream lexer handling an assembly syntax
significantly different from the standard GNU Assembler syntax would
want to inherit from `MCAsmLexer`. Instead, it's more likely they'd
extend `AsmLexer` by adding new states and modifying its internal logic,
as seen with variables for MASM, M68k, and HLASM.
2025-04-03 19:22:45 -07:00
Mircea Trofin
2146826169
[ctxprof] Support for "move" semantics for the contextual root (#134192)
This PR finishes what PR #133992 started.
2025-04-03 18:36:45 -07:00
Sumit Agarwal
996cf5dc67
[HLSL] Implement dot2add intrinsic (#131237)
Resolves #99221 
Key points: For SPIRV backend, it decompose into a `dot` followed a
`add`.

- [x] Implement dot2add clang builtin,
- [x] Link dot2add clang builtin with hlsl_intrinsics.h
- [x] Add sema checks for dot2add to CheckHLSLBuiltinFunctionCall in
SemaHLSL.cpp
- [x] Add codegen for dot2add to EmitHLSLBuiltinExpr in CGBuiltin.cpp
- [x] Add codegen tests to clang/test/CodeGenHLSL/builtins/dot2add.hlsl
- [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dot2add-errors.hlsl
- [x] Create the int_dx_dot2add intrinsic in IntrinsicsDirectX.td
- [x] Create the DXILOpMapping of int_dx_dot2add to 162 in DXIL.td
- [x] Create the dot2add.ll and dot2add_errors.ll tests in
llvm/test/CodeGen/DirectX/
2025-04-03 16:23:09 -06:00
Alexander Yermolovich
4f902d2425
[llvm-dwarfdump] Make --verify for .debug_names multithreaded. (#127281)
This PR makes verification of .debug_names acceleration table
multithreaded. In local testing it improves verification of clang
.debug_names from four minutes to under a minute.
This PR relies on a current mechanism of extracting DIEs into a vector. 
Future improvements can include creating API to extract one DIE at a
time, or grouping Entires into buckets by CUs and extracting before
parallel step.

Single Thread
4:12.37 real,   246.88 user,    3.54 sys,       0 amem,10232004 mmem
Multi Thread
0:49.40 real,   612.84 user,    515.73 sys,     0 amem, 11226292 mmem
2025-04-03 14:02:27 -07:00
Henry Jiang
7d3dfc862d
[JITLink][XCOFF] Setup initial build support for XCOFF (#127266)
This patch starts the initial implementation of JITLink for XCOFF (Object format for AIX).
2025-04-03 17:01:18 -04:00
zhijian lin
1a540c3b8b
[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#133155)
ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated,
using ISD::UADDO_CARRY,ISD::USUBO_CARRY instead. Lowering the UADDO,
UADDO_CARRY, USUBO, USUBO_CARRY in the patch.
2025-04-03 13:22:49 -04:00
Luke Lau
9a5b0f302b
Reapply "[InstCombine] Match scalable splats in m_ImmConstant (#132522)" (#134262)
This reapplies #132522.

Previously casts of scalable m_ImmConstant splats weren't being folded
by ConstantFoldCastOperand, triggering the "Constant-fold of ImmConstant
should not fail" assertion.

There are no changes to the code in this PR, instead we just needed
#133207 to land first.

A test has been added for the assertion in
llvm/test/Transforms/InstSimplify/vec-icmp-of-cast.ll
@icmp_ult_sext_scalable_splat_is_true.

<hr/>

#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.

However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.

This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.

But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.

I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.
2025-04-03 18:03:16 +01:00
Luke Lau
b61e3874fa Revert "[InstCombine] Match scalable splats in m_ImmConstant (#132522)"
This reverts commit df9e5ae5b40c4d245d904a2565e46f5b7ab9c7c8.

This is triggering an assertion failure on llvm-test-suite with
-enable-vplan-native-path:
https://lab.llvm.org/buildbot/#/builders/198/builds/3365
2025-04-03 15:16:56 +01:00
Lukacma
ae8ad8649d
[Clang][AArch64] Model ZT0 table using inaccessible memory (#133727)
This patch changes how ZT0 table is modelled at LLVM-IR level. Currently
accesses to ZT0 are represented at LLVM-IR level as memory reads and
writes. This patch changes that and models them as purely Inaccessible
memory accesses without any unmodeled side-effects.
2025-04-03 14:22:48 +01:00
Nikita Popov
efbbdd69c7
[ADT] Make DenseMap::init() private (NFC) (#134229)
I believe this method was not supposed to be public, as it has
additional preconditions (it will misbehave when called on a non-empty
DenseMap).

The public API for this is reserve().
2025-04-03 15:14:45 +02:00
Juan Manuel Martinez Caamaño
041e84261a
[Clang][AMDGPU] Expose buffer load lds as a clang builtin (#132048)
CK is using either inline assembly or inline LLVM-IR builtins to
generate buffer_load_dword lds instructions.

This patch exposes this instruction as a Clang builtin available on gfx9 and gfx10.

Related to SWDEV-519702 and SWDEV-518861
2025-04-03 09:22:38 +02:00
Yingwei Zheng
b6c0ce0bb6
[IR][NFC] Use SwitchInst::defaultDestUnreachable (#134199) 2025-04-03 14:47:47 +08:00
Hua Tian
7e65944292
[llvm][CodeGen] avoid repeated interval calculation in window scheduler (#132352)
Some new registers are reused when replacing some old ones in
certain use case of ModuloScheduleExpander. It is necessary to
avoid repeated interval calculations for these registers.
2025-04-03 14:25:55 +08:00
Krzysztof Drewniak
554859c736
[TTI] Make isLegalMasked{Load,Store} take an address space (#134006)
In order to facilitate targets that only support masked loads/stores
on certain address spaces (AMDGPU will support them in an upcoming
patch, but only for address space 7), add an AddressSpace parameter
to isLegalMaskedLoad and isLegalMaskedStore
2025-04-02 15:38:10 -05:00
Luke Lau
df9e5ae5b4
[InstCombine] Match scalable splats in m_ImmConstant (#132522)
#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.

However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.

This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.

But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.

I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2025-04-02 21:21:52 +01:00
Florian Hahn
3bdf9a0880
[EquivalenceClasses] Use SmallVector for deterministic iteration order. (#134075)
Currently iterators over EquivalenceClasses will iterate over std::set,
which guarantees the order specified by the comperator. Unfortunately in
many cases, EquivalenceClasses are used with pointers, so iterating over
std::set of pointers will not be deterministic across runs.

There are multiple places that explicitly try to sort the equivalence
classes before using them to try to get a deterministic order
(LowerTypeTests, SplitModule), but there are others that do not at the
moment and this can result at least in non-determinstic value naming in
Float2Int.

This patch updates EquivalenceClasses to keep track of all members via a
extra SmallVector and removes code from LowerTypeTests and SplitModule
to sort the classes before processing.

Overall it looks like compile-time slightly decreases in most cases, but
close to noise:

https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions

PR: https://github.com/llvm/llvm-project/pull/134075
2025-04-02 20:27:43 +01:00
Juan Manuel Martinez Caamaño
0375ef07c3
[Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (#133741)
This built-in maps to `V_CVT_OFF_F32_I4` which treats its input as
a 4-bit signed integer and returns `0.0625f * src`.

SWDEV-518861
2025-04-02 19:51:40 +02:00
Jorge Gorbe Moya
a57023b6a0 Add missing include for llvm::Error after 74ec038ffb34575ee93fa313cd0ea0db0c0a7e0a 2025-04-02 10:43:05 -07:00
Fangrui Song
b6e2df54c4 [MC] Move some member variables from AsmParser to MCAsmParser
to eliminate some virtual functions and avoid duplication
between AsmParser/MasmParser.
2025-04-02 09:59:18 -07:00
Nikita Popov
74ec038ffb [OMPIRBuilder] Don't include MemorySSAUpdater.h (NFC)
This header does not use MemorySSA in any way -- it was just using
a typedef from it. Write out the type instead.
2025-04-02 18:48:51 +02:00
Yingwei Zheng
65ed35393c
[IR] Add helper CmpPredicate::dropSameSign (#134071)
Address review comment
https://github.com/llvm/llvm-project/pull/133711#discussion_r2024519641
2025-04-02 22:25:01 +08:00
dianqk
842785adf7
[MachineInstr] Remove the code that was accidentally added in #132536 (NFC) 2025-04-02 20:08:37 +08:00
Kazu Hirata
cc10896fa2
[SandboxVectorizer] Use llvm::erase (NFC) (#134018) 2025-04-01 21:58:57 -07:00
Finn Plummer
676755561d
Reland "[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses" (#133958)
This pr relands https://github.com/llvm/llvm-project/pull/133302.

It resolves two issues:
- Linking error during build,
[here](https://github.com/llvm/llvm-project/pull/133302#issuecomment-2767259848).
There was a missing dependency for `clangLex` for the
`ParseHLSLRootSignatureTest.cpp` unit testing. This library was added to
the dependencies to resolve the error. It wasn't caught previously as
the library was transitively linked in most build environments
- Warning of unused declaration,
[here](https://github.com/llvm/llvm-project/pull/133302#issuecomment-2767091368).
There was a usability line in `LexHLSLRootSignature.h` of the form
`using TokenKind = enum RootSignatureToken::Kind` which causes this
error. The declaration is removed from the header file to be used
locally in the `.cpp` files that use it.
Notably, the original pr would also exposed `clang::hlsl::TokenKind` to
everywhere it was included, which had a name clash with
`tok::TokenKind`. This is another motivation to change to the proposed
resolution.

---------

Co-authored-by: Finn Plummer <finnplummer@microsoft.com>
2025-04-01 14:58:30 -07:00
Florian Hahn
ec59313c04
[EquivalenceClasses] Use range-based for loops (NFC). 2025-04-01 21:45:01 +01:00
Matthias Braun
adba14acea
Stop using __attribute__((retain)) in GCC builds (#133793)
GCC sometimes produces warnings about `__attribute__((retain))` despite
`__has_attribute(retain)` being 1. See:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99587

The amount of users who benefit from the attribute is probably very
small compared to the amount of `-Werror` enabled builds or the desire
to keep `-Wattributes` enabled in the LLVM build. So for now drop usage
of the `retain` attribute in GCC builds.
2025-04-01 13:20:43 -07:00
Virginia Cangelosi
79487757b7
[Clang][LLVM] Implement multi-multi vectors MOP4{A/S} (#129230)
Implement all multi-multi {BF/F/S/U/SU/US}MOP4{A/S} instructions in
clang and llvm following the acle in
https://github.com/ARM-software/acle/pull/381/files
2025-04-01 19:20:27 +01:00
Matt Arsenault
5c4302442b
llvm-reduce: Reduce global variable code model (#133865)
The current API doesn't have a way to unset it. The query returns
an optional, but the set doesn't. Alternatively I could switch the
set to also use optional.
2025-04-01 23:54:10 +07:00
Petr Hosek
4b19db6db9
Revert "AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function" (#133935)
Reverts llvm/llvm-project#132684
2025-04-01 09:39:07 -07:00
Matt Arsenault
7e25b24073
IRNormalizer: Replace cl::opts with pass parameters (#133874)
Not sure why the "fold-all" option naming didn't match the
variable "FoldPreOutputs", but I've preserved the difference.

More annoyingly, the pass name "normalize" does not match the pass
name IRNormalizer and should probably be fixed one way or the other.

Also the existing test coverage for the flags is lacking. I've added
a test that shows they parse, but we should have tests that they
do something.
2025-04-01 23:27:20 +07:00
Jonathan Thackray
558ce50ebc
[Clang][LLVM] Implement multi-single vectors MOP4{A/S} (#129226)
Implement all multi-single {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and
llvm following the ACLE in https://github.com/ARM-software/acle/pull/381/files
2025-04-01 17:04:59 +01:00
Virginia Cangelosi
e92ff64bad
[Clang][LLVM] Implement single-multi vectors MOP4{A/S} (#128854)
Implement all single-multi {BF/F/S/U/SU/US}MOP4{A/S} instructions in
clang and llvm following the acle in
https://github.com/ARM-software/acle/pull/381/files.

This PR depends on https://github.com/llvm/llvm-project/pull/127797

This patch updates the semantics of template arguments in intrinsic
names for clarity and ease of use. Previously, template argument numbers
indicated which character in the prototype string determined the final
type suffix, which was confusing—especially for intrinsics using
multiple prototype modifiers per operand (e.g., intrinsics operating on
arrays of vectors). The number had to reference the correct character in
the prototype (e.g., the ‘u’ in “2.u”), making the system cumbersome and
error-prone.
With this patch, template argument numbers now refer to the operand
number that determines the final type suffix, providing a more intuitive
and consistent approach.
2025-04-01 15:05:30 +01:00
Virginia Cangelosi
6892d54286
[Clang][LLVM] Implement single-single vectors MOP4{A/S} (#127797)
Implement all single-single {BF/F/S/U/SU/US}MOP4{A/S} instructions in
clang and llvm following the acle in
https://github.com/ARM-software/acle/pull/381/files
2025-04-01 13:35:09 +01:00
Akshat Oke
4a68702455
[CodeGen][NPM] Port XRayInstrumentation to NPM (#129865) 2025-04-01 15:38:49 +05:30
Florian Hahn
9e5bfbf77d
[EquivalenceClasses] Update member_begin to take ECValue (NFC).
Remove a level of indirection and update code to use range-based for
loops.
2025-04-01 09:28:46 +01:00
Florian Hahn
64d493f987
[EquivalenceClasses] Return ECValue directly from insert (NFC).
Removes a redundant lookup in the mapping.:
2025-04-01 08:45:46 +01:00
Fangrui Song
36978fadb8 [MC] Add UseAtForSpecifier
Some ELF targets don't use @ for relocation specifiers.
We should not report `error: invalid variant` when @ is used.

Attempt to make expr@specifier parsing less hacky.
2025-04-01 00:06:05 -07:00
Fangrui Song
dd862356e2
AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function
https://reviews.llvm.org/D17938 introduced lowerRelativeReference to
give ConstantExpr sub (A-B) special semantics in ELF: when `A` is an
`unnamed_addr` function, create a PLT-generating relocation. This was
intended for C++ relative vtables, but C++ relative vtable ended up
using DSOLocalEquivalent (lowerDSOLocalEquivalent).

This special treatment of `unnamed_addr` seems unusual.
Let's remove it. Only COFF needs an overload to generate a @IMGREL32
relocation specifier (llvm/test/MC/COFF/cross-section-relative.ll).

Pull Request: https://github.com/llvm/llvm-project/pull/132684
2025-03-31 20:44:29 -07:00
Matt Arsenault
f77f2b9c56
llvm-reduce: Try to preserve instruction metadata as argument attributes (#133557)
Fixes #131825
2025-04-01 07:34:31 +07:00
Matthias Braun
5d1f27f349
GlobalISel: neg (and x, 1) --> SIGN_EXTEND_INREG x, 1 (#131367)
The pattern
```LLVM
%shl = shl i32 %x, 31
%ashr = ashr i32 %shl, 31
```
would be combined to `G_EXT_INREG %x, 1` by GlobalISel. However
InstCombine normalizes this pattern to:
```LLVM
%and = and i32 %x, 1
%neg = sub i32 0, %and
```
This adds a combiner for this variant as well.
2025-03-31 16:06:51 -07:00
Florian Hahn
32f24029c7
Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)."
This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d.

It includes updates to remaining users in Polly and Clang, to avoid
failures when building those projects.
2025-03-31 22:27:59 +01:00
Finn Plummer
5e2860a8d3
Revert "[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses" (#133790)
Reverts llvm/llvm-project#133302

Reverting to inspect build failures that were introduced from use of the
`clang::Preprocessor` in unit testing, as well as, the warning about an
unused declaration. See linked issue for failures.
2025-03-31 13:38:09 -07:00
Florian Hahn
616f447fc8
Revert "[EquivalenceClasses] Replace findValue with contains (NFC)."
Breaks clang builds.

This reverts commit 8e390dedd71d0c2bcbe8775aee2e234ef7a5b787.
2025-03-31 20:38:12 +01:00
Florian Hahn
8e390dedd7
[EquivalenceClasses] Replace findValue with contains (NFC).
Replace remaining use of findValue with more compact and limited
contains().
2025-03-31 20:11:00 +01:00
Finn Plummer
e4b9486056
[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses (#133302)
- defines the Parser class and an initial set of helper methods to
support consuming tokens. functionality is demonstrated through a simple
empty descriptor table test case
- defines an initial in-memory representation of a DescriptorTable
- implements a test harness that will be used to validate the correct
diagnostics are generated. it will construct a dummy pre-processor with
diagnostics consumer to do so

Implements the first part of
https://github.com/llvm/llvm-project/issues/126569
2025-03-31 10:26:51 -07:00
Rahul Joshi
74b7abf154
[IRBuilder] Add new overload for CreateIntrinsic (#131942)
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
2025-03-31 08:10:34 -07:00
Tom Tromey
68947342b7
Add support for fixed-point types (#129596)
This adds DWARF generation for fixed-point types. This feature is needed
by Ada.

Note that a pre-existing GNU extension is used in one case. This has
been emitted by GCC for years, and is needed because standard DWARF is
otherwise incapable of representing these types.
2025-03-31 07:42:21 -07:00