llvm-project

Author	SHA1	Message	Date
Durgadoss R	a03b2250db	[NVPTX][Docs] [NFC] Update docs on intrinsics (#133136 ) Recently, we have added a set of complex intrinsics on the TMA, tcgen05, and Cvt family of instructions. This patch captures the key learnings from our experience so far and documents them as guidelines for future design. Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-04-04 15:39:25 +05:30
Tobias Stadler	1302610f03	[MergeFunc] Fix crash caused by bitcasting ArrayType (#133259 ) createCast in MergeFunctions did not consider ArrayTypes, which results in the creation of a bitcast between ArrayTypes in the thunk function, leading to an assertion failure in the provided test case. The version of createCast in GlobalMergeFunctions does handle ArrayTypes, so this common code has been factored out into the IRBuilder.	2025-04-04 10:16:40 +01:00
Fangrui Song	92c93f5286	[MC] Merge MCAsmLexer and AsmLexer Follow-up to #134207 Both classes define `IsAtStartOfStatement` but the semantics are confusingly different. Rename the base class one.	2025-04-03 22:11:49 -07:00
Fangrui Song	c9f6d26e04	[MC] Merge MCAsmLexer.{h,cpp} into AsmLexer.{h,cpp} (#134207 ) 2b11c7de4ae182496438e166cb6758d41b6e1740 introduced `llvm/include/llvm/MC/MCAsmLexer.h` and made `AsmLexer` inherit from `MCAsmLexer`, likely to allow target-specific parsers to depend solely on `MCAsmLexer`. However, this separation now seems unnecessary and confusing. `MCAsmLexer` defines virtual functions with `AsmLexer` as its only implementation, and `AsmLexer` itself has few extra public methods. To simplify the codebase, this change merges MCAsmLexer.{h,cpp} into AsmLexer.{h,cpp}. MCAsmLexer.h is temporarily kept as a forwarder. Note: I doubt that a downstream lexer handling an assembly syntax significantly different from the standard GNU Assembler syntax would want to inherit from `MCAsmLexer`. Instead, it's more likely they'd extend `AsmLexer` by adding new states and modifying its internal logic, as seen with variables for MASM, M68k, and HLASM.	2025-04-03 19:22:45 -07:00
Mircea Trofin	2146826169	[ctxprof] Support for "move" semantics for the contextual root (#134192 ) This PR finishes what PR #133992 started.	2025-04-03 18:36:45 -07:00
Sumit Agarwal	996cf5dc67	[HLSL] Implement dot2add intrinsic (#131237 ) Resolves #99221 Key points: For the SPIRV backend, it decomposes into a `dot` followed by an `add`. - [x] Implement dot2add clang builtin, - [x] Link dot2add clang builtin with hlsl_intrinsics.h - [x] Add sema checks for dot2add to CheckHLSLBuiltinFunctionCall in SemaHLSL.cpp - [x] Add codegen for dot2add to EmitHLSLBuiltinExpr in CGBuiltin.cpp - [x] Add codegen tests to clang/test/CodeGenHLSL/builtins/dot2add.hlsl - [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dot2add-errors.hlsl - [x] Create the int_dx_dot2add intrinsic in IntrinsicsDirectX.td - [x] Create the DXILOpMapping of int_dx_dot2add to 162 in DXIL.td - [x] Create the dot2add.ll and dot2add_errors.ll tests in llvm/test/CodeGen/DirectX/	2025-04-03 16:23:09 -06:00
Alexander Yermolovich	4f902d2425	[llvm-dwarfdump] Make --verify for .debug_names multithreaded. (#127281 ) This PR makes verification of .debug_names acceleration table multithreaded. In local testing it improves verification of clang .debug_names from four minutes to under a minute. This PR relies on a current mechanism of extracting DIEs into a vector. Future improvements can include creating an API to extract one DIE at a time, or grouping Entries into buckets by CUs and extracting before parallel step. Single Thread 4:12.37 real, 246.88 user, 3.54 sys, 0 amem, 10232004 mmem Multi Thread 0:49.40 real, 612.84 user, 515.73 sys, 0 amem, 11226292 mmem	2025-04-03 14:02:27 -07:00
Henry Jiang	7d3dfc862d	[JITLink][XCOFF] Setup initial build support for XCOFF (#127266 ) This patch starts the initial implementation of JITLink for XCOFF (Object format for AIX).	2025-04-03 17:01:18 -04:00
zhijian lin	1a540c3b8b	[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#133155 ) ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated; use ISD::UADDO_CARRY and ISD::USUBO_CARRY instead. This patch lowers UADDO, UADDO_CARRY, USUBO, and USUBO_CARRY.	2025-04-03 13:22:49 -04:00
Luke Lau	9a5b0f302b	Reapply "[InstCombine] Match scalable splats in m_ImmConstant (#132522 )" (#134262 ) This reapplies #132522. Previously casts of scalable m_ImmConstant splats weren't being folded by ConstantFoldCastOperand, triggering the "Constant-fold of ImmConstant should not fail" assertion. There are no changes to the code in this PR, instead we just needed #133207 to land first. A test has been added for the assertion in llvm/test/Transforms/InstSimplify/vec-icmp-of-cast.ll @icmp_ult_sext_scalable_splat_is_true. --------- #118806 fixed an infinite loop in FoldShiftByConstant that could occur when the shift amount was a ConstantExpr. However this meant that FoldShiftByConstant no longer kicked in for scalable vectors because scalable splats are represented by ConstantExprs. This fixes it by allowing scalable splats of non-ConstantExprs in m_ImmConstant, which also fixes a few other test cases where scalable splats were being missed. But I'm also hoping that UseConstantIntForScalableSplat will eventually remove the need for this. I noticed this when trying to reverse a combine on RISC-V in #132245, and saw that the resulting vector and scalar forms were different.	2025-04-03 18:03:16 +01:00
Luke Lau	b61e3874fa	Revert "[InstCombine] Match scalable splats in m_ImmConstant (#132522 )" This reverts commit df9e5ae5b40c4d245d904a2565e46f5b7ab9c7c8. This is triggering an assertion failure on llvm-test-suite with -enable-vplan-native-path: https://lab.llvm.org/buildbot/#/builders/198/builds/3365	2025-04-03 15:16:56 +01:00
Lukacma	ae8ad8649d	[Clang][AArch64] Model ZT0 table using inaccessible memory (#133727 ) This patch changes how ZT0 table is modelled at LLVM-IR level. Currently accesses to ZT0 are represented at LLVM-IR level as memory reads and writes. This patch changes that and models them as purely Inaccessible memory accesses without any unmodeled side-effects.	2025-04-03 14:22:48 +01:00
Nikita Popov	efbbdd69c7	[ADT] Make DenseMap::init() private (NFC) (#134229 ) I believe this method was not supposed to be public, as it has additional preconditions (it will misbehave when called on a non-empty DenseMap). The public API for this is reserve().	2025-04-03 15:14:45 +02:00
Juan Manuel Martinez Caamaño	041e84261a	[Clang][AMDGPU] Expose buffer load lds as a clang builtin (#132048 ) CK is using either inline assembly or inline LLVM-IR builtins to generate buffer_load_dword lds instructions. This patch exposes this instruction as a Clang builtin available on gfx9 and gfx10. Related to SWDEV-519702 and SWDEV-518861	2025-04-03 09:22:38 +02:00
Yingwei Zheng	b6c0ce0bb6	[IR][NFC] Use `SwitchInst::defaultDestUnreachable` (#134199 )	2025-04-03 14:47:47 +08:00
Hua Tian	7e65944292	[llvm][CodeGen] avoid repeated interval calculation in window scheduler (#132352 ) Some new registers are reused when replacing some old ones in certain use cases of ModuloScheduleExpander. It is necessary to avoid repeated interval calculations for these registers.	2025-04-03 14:25:55 +08:00
Krzysztof Drewniak	554859c736	[TTI] Make isLegalMasked{Load,Store} take an address space (#134006 ) In order to facilitate targets that only support masked loads/stores on certain address spaces (AMDGPU will support them in an upcoming patch, but only for address space 7), add an AddressSpace parameter to isLegalMaskedLoad and isLegalMaskedStore	2025-04-02 15:38:10 -05:00
Luke Lau	df9e5ae5b4	[InstCombine] Match scalable splats in m_ImmConstant (#132522 ) #118806 fixed an infinite loop in FoldShiftByConstant that could occur when the shift amount was a ConstantExpr. However this meant that FoldShiftByConstant no longer kicked in for scalable vectors because scalable splats are represented by ConstantExprs. This fixes it by allowing scalable splats of non-ConstantExprs in m_ImmConstant, which also fixes a few other test cases where scalable splats were being missed. But I'm also hoping that UseConstantIntForScalableSplat will eventually remove the need for this. I noticed this when trying to reverse a combine on RISC-V in #132245, and saw that the resulting vector and scalar forms were different. --------- Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>	2025-04-02 21:21:52 +01:00
Florian Hahn	3bdf9a0880	[EquivalenceClasses] Use SmallVector for deterministic iteration order. (#134075 ) Currently iterators over EquivalenceClasses will iterate over std::set, which guarantees the order specified by the comparator. Unfortunately in many cases, EquivalenceClasses are used with pointers, so iterating over std::set of pointers will not be deterministic across runs. There are multiple places that explicitly try to sort the equivalence classes before using them to try to get a deterministic order (LowerTypeTests, SplitModule), but there are others that do not at the moment and this can result at least in non-deterministic value naming in Float2Int. This patch updates EquivalenceClasses to keep track of all members via an extra SmallVector and removes code from LowerTypeTests and SplitModule to sort the classes before processing. Overall it looks like compile-time slightly decreases in most cases, but close to noise: https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions PR: https://github.com/llvm/llvm-project/pull/134075	2025-04-02 20:27:43 +01:00
Juan Manuel Martinez Caamaño	0375ef07c3	[Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (#133741 ) This built-in maps to `V_CVT_OFF_F32_I4` which treats its input as a 4-bit signed integer and returns `0.0625f * src`. SWDEV-518861	2025-04-02 19:51:40 +02:00
Jorge Gorbe Moya	a57023b6a0	Add missing include for llvm::Error after 74ec038ffb34575ee93fa313cd0ea0db0c0a7e0a	2025-04-02 10:43:05 -07:00
Fangrui Song	b6e2df54c4	[MC] Move some member variables from AsmParser to MCAsmParser to eliminate some virtual functions and avoid duplication between AsmParser/MasmParser.	2025-04-02 09:59:18 -07:00
Nikita Popov	74ec038ffb	[OMPIRBuilder] Don't include MemorySSAUpdater.h (NFC) This header does not use MemorySSA in any way -- it was just using a typedef from it. Write out the type instead.	2025-04-02 18:48:51 +02:00
Yingwei Zheng	65ed35393c	[IR] Add helper `CmpPredicate::dropSameSign` (#134071 ) Address review comment https://github.com/llvm/llvm-project/pull/133711#discussion_r2024519641	2025-04-02 22:25:01 +08:00
dianqk	842785adf7	[MachineInstr] Remove the code that was accidentally added in #132536 (NFC)	2025-04-02 20:08:37 +08:00
Kazu Hirata	cc10896fa2	[SandboxVectorizer] Use llvm::erase (NFC) (#134018 )	2025-04-01 21:58:57 -07:00
Finn Plummer	676755561d	Reland "[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses" (#133958 ) This PR relands https://github.com/llvm/llvm-project/pull/133302. It resolves two issues: - Linking error during build, [here](https://github.com/llvm/llvm-project/pull/133302#issuecomment-2767259848). There was a missing dependency for `clangLex` for the `ParseHLSLRootSignatureTest.cpp` unit testing. This library was added to the dependencies to resolve the error. It wasn't caught previously as the library was transitively linked in most build environments - Warning of unused declaration, [here](https://github.com/llvm/llvm-project/pull/133302#issuecomment-2767091368). There was a usability line in `LexHLSLRootSignature.h` of the form `using TokenKind = enum RootSignatureToken::Kind` which causes this error. The declaration is removed from the header file to be used locally in the `.cpp` files that use it. Notably, the original PR would also have exposed `clang::hlsl::TokenKind` to everywhere it was included, which had a name clash with `tok::TokenKind`. This is another motivation to change to the proposed resolution. --------- Co-authored-by: Finn Plummer <finnplummer@microsoft.com>	2025-04-01 14:58:30 -07:00
Florian Hahn	ec59313c04	[EquivalenceClasses] Use range-based for loops (NFC).	2025-04-01 21:45:01 +01:00
Matthias Braun	adba14acea	Stop using __attribute__((retain)) in GCC builds (#133793 ) GCC sometimes produces warnings about `__attribute__((retain))` despite `__has_attribute(retain)` being 1. See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99587 The amount of users who benefit from the attribute is probably very small compared to the amount of `-Werror` enabled builds or the desire to keep `-Wattributes` enabled in the LLVM build. So for now drop usage of the `retain` attribute in GCC builds.	2025-04-01 13:20:43 -07:00
Virginia Cangelosi	79487757b7	[Clang][LLVM] Implement multi-multi vectors MOP4{A/S} (#129230 ) Implement all multi-multi {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the acle in https://github.com/ARM-software/acle/pull/381/files	2025-04-01 19:20:27 +01:00
Matt Arsenault	5c4302442b	llvm-reduce: Reduce global variable code model (#133865 ) The current API doesn't have a way to unset it. The query returns an optional, but the set doesn't. Alternatively I could switch the set to also use optional.	2025-04-01 23:54:10 +07:00
Petr Hosek	4b19db6db9	Revert "AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function" (#133935 ) Reverts llvm/llvm-project#132684	2025-04-01 09:39:07 -07:00
Matt Arsenault	7e25b24073	IRNormalizer: Replace cl::opts with pass parameters (#133874 ) Not sure why the "fold-all" option naming didn't match the variable "FoldPreOutputs", but I've preserved the difference. More annoyingly, the pass name "normalize" does not match the pass name IRNormalizer and should probably be fixed one way or the other. Also the existing test coverage for the flags is lacking. I've added a test that shows they parse, but we should have tests that they do something.	2025-04-01 23:27:20 +07:00
Jonathan Thackray	558ce50ebc	[Clang][LLVM] Implement multi-single vectors MOP4{A/S} (#129226 ) Implement all multi-single {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the ACLE in https://github.com/ARM-software/acle/pull/381/files	2025-04-01 17:04:59 +01:00
Virginia Cangelosi	e92ff64bad	[Clang][LLVM] Implement single-multi vectors MOP4{A/S} (#128854 ) Implement all single-multi {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the acle in https://github.com/ARM-software/acle/pull/381/files. This PR depends on https://github.com/llvm/llvm-project/pull/127797 This patch updates the semantics of template arguments in intrinsic names for clarity and ease of use. Previously, template argument numbers indicated which character in the prototype string determined the final type suffix, which was confusing—especially for intrinsics using multiple prototype modifiers per operand (e.g., intrinsics operating on arrays of vectors). The number had to reference the correct character in the prototype (e.g., the ‘u’ in “2.u”), making the system cumbersome and error-prone. With this patch, template argument numbers now refer to the operand number that determines the final type suffix, providing a more intuitive and consistent approach.	2025-04-01 15:05:30 +01:00
Virginia Cangelosi	6892d54286	[Clang][LLVM] Implement single-single vectors MOP4{A/S} (#127797 ) Implement all single-single {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the acle in https://github.com/ARM-software/acle/pull/381/files	2025-04-01 13:35:09 +01:00
Akshat Oke	4a68702455	[CodeGen][NPM] Port XRayInstrumentation to NPM (#129865 )	2025-04-01 15:38:49 +05:30
Florian Hahn	9e5bfbf77d	[EquivalenceClasses] Update member_begin to take ECValue (NFC). Remove a level of indirection and update code to use range-based for loops.	2025-04-01 09:28:46 +01:00
Florian Hahn	64d493f987	[EquivalenceClasses] Return ECValue directly from insert (NFC). Removes a redundant lookup in the mapping.	2025-04-01 08:45:46 +01:00
Fangrui Song	36978fadb8	[MC] Add UseAtForSpecifier Some ELF targets don't use @ for relocation specifiers. We should not report `error: invalid variant` when @ is used. Attempt to make expr@specifier parsing less hacky.	2025-04-01 00:06:05 -07:00
Fangrui Song	dd862356e2	AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function https://reviews.llvm.org/D17938 introduced lowerRelativeReference to give ConstantExpr sub (A-B) special semantics in ELF: when `A` is an `unnamed_addr` function, create a PLT-generating relocation. This was intended for C++ relative vtables, but C++ relative vtable ended up using DSOLocalEquivalent (lowerDSOLocalEquivalent). This special treatment of `unnamed_addr` seems unusual. Let's remove it. Only COFF needs an overload to generate a @IMGREL32 relocation specifier (llvm/test/MC/COFF/cross-section-relative.ll). Pull Request: https://github.com/llvm/llvm-project/pull/132684	2025-03-31 20:44:29 -07:00
Matt Arsenault	f77f2b9c56	llvm-reduce: Try to preserve instruction metadata as argument attributes (#133557 ) Fixes #131825	2025-04-01 07:34:31 +07:00
Matthias Braun	5d1f27f349	GlobalISel: neg (and x, 1) --> SIGN_EXTEND_INREG x, 1 (#131367 ) The pattern ```LLVM %shl = shl i32 %x, 31 %ashr = ashr i32 %shl, 31 ``` would be combined to `G_EXT_INREG %x, 1` by GlobalISel. However InstCombine normalizes this pattern to: ```LLVM %and = and i32 %x, 1 %neg = sub i32 0, %and ``` This adds a combiner for this variant as well.	2025-03-31 16:06:51 -07:00
Florian Hahn	32f24029c7	Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)." This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d. It includes updates to remaining users in Polly and Clang, to avoid failures when building those projects.	2025-03-31 22:27:59 +01:00
Finn Plummer	5e2860a8d3	Revert "[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses" (#133790 ) Reverts llvm/llvm-project#133302 Reverting to inspect build failures that were introduced from use of the `clang::Preprocessor` in unit testing, as well as the warning about an unused declaration. See linked issue for failures.	2025-03-31 13:38:09 -07:00
Florian Hahn	616f447fc8	Revert "[EquivalenceClasses] Replace findValue with contains (NFC)." Breaks clang builds. This reverts commit 8e390dedd71d0c2bcbe8775aee2e234ef7a5b787.	2025-03-31 20:38:12 +01:00
Florian Hahn	8e390dedd7	[EquivalenceClasses] Replace findValue with contains (NFC). Replace remaining use of findValue with more compact and limited contains().	2025-03-31 20:11:00 +01:00
Finn Plummer	e4b9486056	[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses (#133302 ) - defines the Parser class and an initial set of helper methods to support consuming tokens. functionality is demonstrated through a simple empty descriptor table test case - defines an initial in-memory representation of a DescriptorTable - implements a test harness that will be used to validate the correct diagnostics are generated. it will construct a dummy pre-processor with diagnostics consumer to do so Implements the first part of https://github.com/llvm/llvm-project/issues/126569	2025-03-31 10:26:51 -07:00
Rahul Joshi	74b7abf154	[IRBuilder] Add new overload for CreateIntrinsic (#131942 ) Add a new `CreateIntrinsic` overload with no `Types`, useful for creating calls to non-overloaded intrinsics that don't need additional mangling.	2025-03-31 08:10:34 -07:00
Tom Tromey	68947342b7	Add support for fixed-point types (#129596 ) This adds DWARF generation for fixed-point types. This feature is needed by Ada. Note that a pre-existing GNU extension is used in one case. This has been emitted by GCC for years, and is needed because standard DWARF is otherwise incapable of representing these types.	2025-03-31 07:42:21 -07:00

58592 Commits