llvm-project

Author	SHA1	Message	Date
Philip Reames	0517772b4a	Delete unused PoisonChecking utility pass This was introduced ~5yrs ago (by me), and has never really gotten any adoption. By now, it's significantly out of sync with new/changed poison propoagation rules. The idea is still reasonable, but the imagined use case is largely covered by alive2 these days anyways.	2024-12-19 14:23:38 -08:00
Florian Hahn	5f096fd221	Revert "[LoopVectorizer] Add support for partial reductions (#92418 )" This reverts commit 060d62b48aeb5080ffcae1dc56e41a06c6f56701. It looks like this is triggering an assertion when build llvm-test-suite on ARM64 macOS. Reproducer from MultiSource/Benchmarks/Ptrdist/bc/number.c target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-n32:64-S128-Fn32" target triple = "arm64-apple-macosx15.0.0" define void @test(i64 %idx.neg, i8 %0) #0 { entry: br label %while.body while.body: ; preds = %while.body, %entry %n1ptr.0.idx131 = phi i64 [ %n1ptr.0.add, %while.body ], [ %idx.neg, %entry ] %n2ptr.0.idx130 = phi i64 [ %n2ptr.0.add, %while.body ], [ 0, %entry ] %sum.1129 = phi i64 [ %add99, %while.body ], [ 0, %entry ] %n1ptr.0.add = add i64 %n1ptr.0.idx131, 1 %conv = sext i8 %0 to i64 %n2ptr.0.add = add i64 %n2ptr.0.idx130, 1 %1 = load i8, ptr null, align 1 %conv97 = sext i8 %1 to i64 %mul = mul i64 %conv97, %conv %add99 = add i64 %mul, %sum.1129 %cmp94 = icmp ugt i64 %n1ptr.0.idx131, 0 %cmp95 = icmp ne i64 %n2ptr.0.idx130, -1 %2 = and i1 %cmp94, %cmp95 br i1 %2, label %while.body, label %while.end.loopexit while.end.loopexit: ; preds = %while.body %add99.lcssa = phi i64 [ %add99, %while.body ] ret void } attributes #0 = { "target-cpu"="apple-m1" } > opt -p loop-vectorize Assertion failed: ((VF.isScalar() \|\| V->getType()->isVectorTy()) && "scalar values must be stored as (0, 0)"), function set, file VPlan.h, line 284.	2024-12-19 21:46:51 +00:00
Kazu Hirata	10d054e954	[memprof] Introduce IndexedCallstackIdConveter (NFC) (#120540 ) This patch introduces IndexedCallstackIdConveter as a convenience wrapper around FrameIdConverter and CallStackIdConverter just for tests. With the new wrapper, we get to replace idioms like: FrameIdConverter<decltype(MemProfData.Frames)> FrameIdConv( MemProfData.Frames); CallStackIdConverter<decltype(MemProfData.CallStacks)> CSIdConv( MemProfData.CallStacks, FrameIdConv); with: IndexedCallstackIdConveter CSIdConv(MemProfData); Unfortunately, this exact pattern occurs in tests only; the combinations of the frame ID converter and call stack ID converter are diverse in production code.	2024-12-19 12:20:25 -08:00
Finn Plummer	45c01e8a33	[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635 ) - update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for all uses, to allow specifiction of target specific intrinsics - add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api - update TTI api to provide `isTargetIntrinsicWith...` functions and consistently name them - move `isTriviallyScalarizable` to VectorUtils - update all uses of the api and provide the TTI parameter Resolves #117030	2024-12-19 11:54:26 -08:00
Justin Bogner	aa07f92210	[DirectX][SPIRV] Consistent names for HLSL resource intrinsics (#120466 ) Rename HLSL resource-related intrinsics to be consistent with the naming conventions discussed in [wg-hlsl:0014]. This is an entirely mechanical change, consisting of the following commands and automated formatting. ```sh git grep -l handle.fromBinding \| xargs perl -pi -e \ 's/(dx\|spv)(.)handle.fromBinding/$1$2resource$2handlefrombinding/g' git grep -l typedBufferLoad_checkbit \| xargs perl -pi -e \ 's/(dx\|spv)(.)typedBufferLoad_checkbit/$1$2resource$2loadchecked$2typedbuffer/g' git grep -l typedBufferLoad \| xargs perl -pi -e \ 's/(dx\|spv)(.)typedBufferLoad/$1$2resource$2load$2typedbuffer/g' git grep -l typedBufferStore \| xargs perl -pi -e \ 's/(dx\|spv)(.)typedBufferStore/$1$2resource$2store$2typedbuffer/g' git grep -l bufferUpdateCounter \| xargs perl -pi -e \ 's/(dx\|spv)(.)bufferUpdateCounter/$1$2resource$2updatecounter/g' git grep -l cast_handle \| xargs perl -pi -e \ 's/(dx\|spv)(.)cast.handle/$1$2resource$2casthandle/g' ``` [wg-hlsl:0014]: https://github.com/llvm/wg-hlsl/blob/main/proposals/0014-consistent-naming-for-dx-intrinsics.md	2024-12-19 12:17:21 -07:00
MagentaTreehouse	254ba78495	[GenericDomTree][NFC] Remove unnecessary `const_cast`s (#97638 )	2024-12-19 09:46:03 -08:00
Craig Topper	f139bde8d8	[SelectionDAG] Move SDNode::use_iterator::getOperandNo to SDUse. (#120536 ) This allows us to write more range based for loops because we no longer need the iterator. It also matches IR's Use class.	2024-12-19 09:07:42 -08:00
Craig Topper	e6b2495545	[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531 ) SDNode::use_iterator now returns an SDUse& when dereferenced. SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses work on use_iterator. SDNode::user_begin/user_end/users work on user_iterator. We can now write range based for loops using SDUse& and SDNode::uses(). I've converted many of these in this patch. I didn't update loops that have additional variables updated in their for statement. Some loops use SDNode::use_iterator::getOperandNo() which also prevents using range based for loops. I plan to move this into SDUse in a follow up patch.	2024-12-19 08:35:32 -08:00
Nicholas Guy	060d62b48a	[LoopVectorizer] Add support for partial reductions (#92418 ) Following on from https://github.com/llvm/llvm-project/pull/94499, this patch adds support to the Loop Vectorizer to emit the partial reduction intrinsics where they may be beneficial for the target. --------- Co-authored-by: Samuel Tebbs <samuel.tebbs@arm.com>	2024-12-19 11:42:40 +00:00
Mikhail Goncharov	cffe22a937	Revert "[NFC] Move DroppedVariableStats code to Analysis (#120502 )" that introduces a circular dependency of analysis -> codegen -> target This reverts commit e389492d6a00e1c49a034e13343098541ebd03c6.	2024-12-19 10:56:02 +01:00
Shubham Sandeep Rastogi	16d952898f	Revert "Add a pass to collect dropped var stats for MIR (#120501 )" This reverts commit 223c7648468cd4f649a578d3f9cbc27a63523192. Reverted due to vuildbot failure: flang-aarch64-libcxx Linking CXX shared library lib/libLLVMAnalysis.so.20.0git FAILED: lib/libLLVMAnalysis.so.20.0git	2024-12-19 00:48:40 -08:00
Shubham Sandeep Rastogi	223c764846	Add a pass to collect dropped var stats for MIR (#120501 ) Reland "Add a pass to collect dropped var stats for MIR" (#117044) I am trying to reland https://github.com/llvm/llvm-project/pull/115566 I also moved the DroppedVariableStats code to the Analysis lib This is part of a stack of patches with https://github.com/llvm/llvm-project/pull/120502 being the first one in the stack	2024-12-19 00:41:48 -08:00
Shubham Sandeep Rastogi	e389492d6a	[NFC] Move DroppedVariableStats code to Analysis (#120502 ) This is done because the CodeGen library and Passes library both link against Analysis, to avoid adding a dependency between CodeGen and Passes if we want to extend the DroppedVariableStats code for MIR stats as well, as seen in https://github.com/llvm/llvm-project/pull/120501	2024-12-18 23:42:24 -08:00
Craig Topper	bd261ecc5a	[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509 ) Most of these are just places that want the first user and aren't iterating over the whole list. While there I changed some use_size() == 1 to hasOneUse() which is more efficient. This is part of an effort to rename use_iterator to user_iterator and provide a use_iterator that dereferences to SDUse&. This patch helps reduce the diff on later patches.	2024-12-18 22:13:04 -08:00
Craig Topper	104ad9258a	[SelectionDAG] Rename SDNode::uses() to users(). (#120499 ) This function is most often used in range based loops or algorithms where the iterator is implicitly dereferenced. The dereference returns an SDNode * of the user rather than SDUse * so users() is a better name. I've long beeen annoyed that we can't write a range based loop over SDUse when we need getOperandNo. I plan to rename use_iterator to user_iterator and add a use_iterator that returns SDUse& on dereference. This will make it more like IR.	2024-12-18 20:09:33 -08:00
Kazu Hirata	6fb967ec9e	[memprof] Move Frame::hash and hashCallStack to IndexedMemProfData (NFC) (#120365 ) Now that IndexedMemProfData::{addFrame,addCallStack} are the only callers of Frame::hash and hashCallStack, respectively, this patch moves those functions into IndexedMemProfData and makes them private. With this patch, we can obtain FrameId and CallStackId only through addFrame and addCallStack, respectively.	2024-12-18 10:56:45 -08:00
Justin Bogner	0e2466f624	[DirectX] Create symbols for resource handles (#119775 ) We need to create symbols with "the original shape of resource and element type" to put in the resource metadata in order to generate valid DXIL. Note that DXC generally doesn't emit an actual symbol outside of library shaders (it emits an undef of a pointer to the type), but since we have to deal with opaque pointers we would need a way to smuggle the type through to match that. Instead, we simply emit symbols for now. Fixed #116849	2024-12-18 10:47:12 -07:00
Shubham Sandeep Rastogi	5717a99d8d	Reland 2de78815604e9027efd93cac27c517bf732587d2 (#119650 ) (#120454 ) [NFC] Move DroppedVariableStats to its own file and redesign it to be extensible. (#115563) Move DroppedVariableStats code to its own file and change the class to have an extensible design so that we can use it to add dropped statistics to MIR passes and the instruction selector.	2024-12-18 08:58:14 -08:00
Justin Bogner	3eca15cbb9	[DirectX] Split resource info into type and binding info. NFC (#119773 ) This splits the DXILResourceAnalysis pass into TypeAnalysis and BindingAnalysis passes. The type analysis pass is made immutable and populated lazily so that it can be used earlier in the pipeline without needing to carefully maintain the invariants of the binding analysis. Fixes #118400	2024-12-18 09:02:28 -07:00
Akash Banerjee	aadf606d90	Fix #110001 build error.	2024-12-18 15:55:56 +00:00
Florian Hahn	76714be5fd	Revert "Add support for single reductions in ComplexDeinterleavingPass (#112875 )" This reverts commit b3eede5e1fa7ab742b86e9be22db7bccd2505b8a. This has been breaking most AArch64 stage2 builds for 4+ hours, reverting to get the bots back to green. https://lab.llvm.org/buildbot/#/builders/41/builds/4172 https://lab.llvm.org/buildbot/#/builders/4/builds/4281 https://lab.llvm.org/buildbot/#/builders/199/builds/263 https://lab.llvm.org/buildbot/#/builders/198/builds/334 https://lab.llvm.org/buildbot/#/builders/143/builds/4276 https://lab.llvm.org/buildbot/#/builders/17/builds/4725	2024-12-18 15:06:52 +00:00
Akash Banerjee	6f0e9c4a56	[OpenMP][Clang] Migrate OpenMP UserDefinedMapper from Clang to OMPIRBuilder (#110001 ) This patch migrates the OpenMP UserDefinedMapper codegen from Clang to the OpenMPIRBuilder. I will be adding further patches in the near future so that OpenMP dialect in MLIR can make use of these.	2024-12-18 15:02:14 +00:00
NAKAMURA Takumi	5a5838fba3	Introduce CounterMappingRegion::isBranch(). NFC.	2024-12-18 20:00:02 +09:00
Nicholas Guy	b3eede5e1f	Add support for single reductions in ComplexDeinterleavingPass (#112875 ) The Complex Deinterleaving pass assumes that all values emitted will result in complex numbers, this patch aims to remove that assumption and adds support for emitting just the real or imaginary components, not both.	2024-12-18 10:34:26 +00:00
Benjamin Maxwell	1ee740a796	[VFABI] Add support for vector functions that return struct types (#119000 ) This patch updates the `VFABIDemangler` to support vector functions that return struct types. For example, a vector variant of `sincos` that returns a vector of sine values and a vector of cosine values within a struct. This patch also adds some helpers for vectorizing types (including struct types). Some of these are used in the `VFABIDemangler`, and others will be used in subsequent patches, so this patch simply adds tests for them.	2024-12-18 09:46:45 +00:00
David Sherwood	13107cb094	[LoopVectorize] Enable more early exit vectorisation tests (#117008 ) PR #112138 introduced initial support for dispatching to multiple exit blocks via split middle blocks. This patch fixes a few issues so that we can enable more tests to use the new enable-early-exit-vectorization flag. Fixes are: 1. The code to bail out for any loop live-out values happens too late. This is because collectUsersInExitBlocks ignores induction variables, which get dealt with in fixupIVUsers. I've moved the check much earlier in processLoop by looking for outside users of loop-defined values. 2. We shouldn't yet be interleaving when vectorising loops with uncountable early exits, since we've not added support for this yet. 3. Similarly, we also shouldn't be creating vector epilogues. 4. Similarly, we shouldn't enable tail-folding. 5. The existing implementation doesn't yet support loops that require scalar epilogues, although I plan to add that as part of PR #88385. 6. The new split middle blocks weren't being added to the parent loop.	2024-12-18 09:25:45 +00:00
Vitaly Buka	55e87a79b9	[BoundsChecking] Add parameters to pass (#119894 ) This check is a part of UBSAN, but does not support verbose output like other UBSAN checks. This is a step to fix that.	2024-12-17 22:07:14 -08:00
Jinsong Ji	c189b2a1ec	[DiagnosticInfo] Fix the default DiagnosticSeverity (#120342 ) After https://github.com/llvm/llvm-project/commit/ea632e1b34e1 the API call to LLVMContext->emitError(I, Errorstr) default to warning instead of error. This cause problems as the API mentioned it is "prefixed with error:".	2024-12-17 23:00:11 -05:00
Krzysztof Drewniak	b24caf3d2b	[llvm][TableGen] Add a !initialized predicate to allow testing for ? (#117964 ) There are cases (like in an upcoming patch to MLIR's `Property` class) where the ? value is a useful null value. However, existing predicates make ti difficult to test if the value in a record one is operating is ? or not. This commit adds the !initialized predicate, which is 1 on concrete, non-? values and 0 on ?. --------- Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>	2024-12-17 20:34:35 -06:00
Teresa Johnson	d7d0e740cc	[MemProf] Refactor single alloc type handling and use in more cases (#120290 ) Emit message when we have aliased contexts that are conservatively hinted not cold. This is not a change in behavior, just in message when the -memprof-report-hinted-sizes flag is enabled.	2024-12-17 12:50:49 -08:00
alx32	ad32576cff	[DWARFVerifier] Allow overlapping ranges for ICF-merged functions (#117952 ) This patch modifies the DWARF verifier to handle a valid case where two or more functions have identical address ranges due to being merged by ICF (Identical Code Folding). Previously, the verifier would incorrectly report these as errors, but functions merged via ICF (such as when using LLD's --keep-icf-stabs option) can legitimately share the same address range. A new test case has been added to verify this behavior using YAML-based DWARF data that simulates two DW_TAG_subprogram entries with identical address ranges. The test ensures that the verifier correctly identifies this as a valid case and doesn't emit any errors, while still maintaining the existing verification for truly invalid overlapping ranges in other scenarios. Before this change, the newly added test case would have failed, with `llvm-dwarfdump` marking the overlapping address ranges in the DWARF as an error. We also modify the existing tests `llvm-dwarfutil/ELF/X86/verify.test` and `llvm/test/tools/llvm-dwarfdump/X86/verify_parent_zero_length.yaml` which rely on the existence of the error that we're trying to suppress. We slightly change one offset so that the ranges don't perfectly overlap and an error is still generated.	2024-12-17 11:00:56 -08:00
Nick Sarnie	1c16807d0d	[LLVM] Add Intel vendor in Triple (#120250 ) We plan to make use of this in SPIR-V-based OpenMP offloading, for which there is already an initial patch in review. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>	2024-12-17 12:30:21 -06:00
alx32	558de0e1f9	[llvm-gsymutil] Add option to load callsites from DWARF (#119913 ) This change adds support for loading gSYM callsite information from DWARF. Previously the only support was for loading callsites info from YAML. For testing, we add a pass where `macho-gsym-merged-callsites-dsym` loads callsite info from DWARF rather than YAML.	2024-12-17 08:51:30 -08:00
Florian Hahn	a487b792e2	[TySan] Add initial Type Sanitizer (LLVM) (#76259 ) This patch introduces the LLVM components of a type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help. For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected. The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the byes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime. The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls. The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated. Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer. When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify. As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work. It goes together with the corresponding clang changes (https://github.com/llvm/llvm-project/pull/76260) and compiler-rt changes (https://github.com/llvm/llvm-project/pull/76261) PR: https://github.com/llvm/llvm-project/pull/76259	2024-12-17 13:57:34 +00:00
Florian Hahn	8ea9576d94	[SCEV] Add initial matchers for SCEV expressions. (NFC) (#119390 ) This patch adds initial matchers for unary and binary SCEV expressions and specializes it for SExt, ZExt and binary add expressions. Also adds matchers for SCEVConstant and SCEVUnknown. This patch only converts a few instances to use the new matchers to make sure everything builds as expected for now. The goal of the matchers is to hopefully make it slightly easier to write code matching SCEV patterns. Depends on https://github.com/llvm/llvm-project/pull/119389 PR: https://github.com/llvm/llvm-project/pull/119390	2024-12-17 12:12:56 +00:00
SpencerAbson	908e30658d	[AArch64] Implement intrinsics for FP8 SME FMLAL/FMLALL (multi) (#119546 ) This patch implements the following intrinsics: Multi-vector 8-bit floating-point multiply-add long (multiple vectors). ``` c // Only if __ARM_FEATURE_SME_F8F16 != 0 void svmla_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svmla_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` In accordance with https://github.com/ARM-software/acle/pull/323	2024-12-17 11:47:20 +00:00
Lang Hames	24c2744a18	[ORC] Fix LazyReexports resource key management. Multiple reentry points may be associated with a single key.	2024-12-17 22:38:15 +11:00
Benjamin Maxwell	a7dafea384	[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117 ) This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to check that it is safe to fold a store into a node that will expand to a library call that takes output pointers. This requires checking for two (independent) properties: 1. The store is not within a CALLSEQ_START..CALLSEQ_END pair * If it is, the expansion would lead to nested call sequences (which is invalid) 2. The node does not appear as a predecessor to the store * If it does, attempting to merge the store into the call would result in a cycle in the DAG These two properties are checked as part of the same traversal in `canFoldStoreIntoLibCallOutputPointers()`	2024-12-17 10:54:17 +00:00
Matt Arsenault	10b12e6e07	LiveVariables: Use Register (#120204 )	2024-12-17 17:45:24 +07:00
SpencerAbson	9c89b40f18	[AArch64] Implement intrinsics for FMLAL/FMLALL (single) (#119568 ) Multi-vector 8-bit floating-point multiply-add long (single) ```c // Only if __ARM_FEATURE_SME_F8F16 != 0 void svmla[_single]_za16[_mf8]_vg2x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svmla[_single]_za32[_mf8]_vg4x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla[_single]_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, fpm_t fpm) __arm_streaming __arm_inout("za"); ``` In accordance with https://github.com/ARM-software/acle/pull/323. Co-authored-by: Momchil Velikov momchil.velikov@arm.com	2024-12-17 09:31:54 +00:00
Lang Hames	300deebf41	[ORC] Make LazyReexportsManager implement ResourceManager. This ensures that the reexports mappings are cleared when the resource tracker associated with each mapping is removed.	2024-12-17 18:45:16 +11:00
Lang Hames	b3d2548d5b	[ORC] Introduce LinkGraphLayer interface and LinkGraphLinkingLayer. (#120182 ) Introduces a new layer interface, LinkGraphLayer, that can be used to add LinkGraphs to an ExecutionSession. This patch moves most of ObjectLinkingLayer's functionality into a new LinkGraphLinkingLayer which should (in the future) be able to be used without linking libObject. ObjectLinkingLayer now inherits from LinkGraphLinkingLayer and just handles conversion of object files to LinkGraphs, which are then handed down to LinkGraphLinkingLayer to be linked.	2024-12-17 17:18:58 +11:00
Ashley Coleman	41a6e9cfd6	[HLSL] Implement `WaveActiveAllTrue` Intrinsic (#117245 ) Resolves https://github.com/llvm/llvm-project/issues/99161 - [x] Implement `WaveActiveAllTrue` clang builtin, - [x] Link `WaveActiveAllTrue` clang builtin with `hlsl_intrinsics.h` - [x] Add sema checks for `WaveActiveAllTrue` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp` - [x] Add codegen for `WaveActiveAllTrue` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp` - [x] Add codegen tests to `clang/test/CodeGenHLSL/builtins/WaveActiveAllTrue.hlsl` - [x] Add sema tests to `clang/test/SemaHLSL/BuiltIns/WaveActiveAllTrue-errors.hlsl` - [x] Create the `int_dx_WaveActiveAllTrue` intrinsic in `IntrinsicsDirectX.td` - [x] Create the `DXILOpMapping` of `int_dx_WaveActiveAllTrue` to `114` in `DXIL.td` - [x] Create the `WaveActiveAllTrue.ll` and `WaveActiveAllTrue_errors.ll` tests in `llvm/test/CodeGen/DirectX/` - [x] Create the `int_spv_WaveActiveAllTrue` intrinsic in `IntrinsicsSPIRV.td` - [x] In SPIRVInstructionSelector.cpp create the `WaveActiveAllTrue` lowering and map it to `int_spv_WaveActiveAllTrue` in `SPIRVInstructionSelector::selectIntrinsic`. - [x] Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveAllTrue.ll`	2024-12-16 16:13:35 -08:00
Justin Bogner	482237e884	[DirectX] Get resource information via TargetExtType (#119772 ) Instead of storing an auxilliary structure with the information from the DXIL resource target extension types duplicated, access the information that we can via the type itself. This also means we need to handle some of the target extension types we haven't fully defined yet, like Texture and CBuffer. For now we make an educated guess to what those should look like based on llvm/wg-hlsl#76, and we can update them fairly easily when we've defined them more thoroughly. First part of #118400	2024-12-16 16:04:25 -07:00
Artem Pianykh	8402a0fab0	[NFC][Utils] Extract CloneFunctionBodyInto from CloneFunctionInto (#118624 ) Summary: This and previously extracted `CloneFunction*Into` functions will be used in later diffs. Test Plan: ninja check-llvm-unit check-llvm	2024-12-16 22:30:56 +00:00
SpencerAbson	38099d0608	[AArch64] Implement intrinsics for SME FP8 FMLAL/FMLALL (Indexed) (#118549 ) This patch implements the following intrinsics: Multi-vector 8-bit floating-point multiply-add long. ``` c // Only if __ARM_FEATURE_SME_F8F16 != 0 void svmla_lane_za16[_mf8]_vg2x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_lane_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm) __arm_streaming __arm_inout("za"); void svmla_lane_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, uint64_t imm_idx fpm_t fpm) __arm_streaming __arm_inout("za"); // Only if __ARM_FEATURE_SME_F8F32 != 0 void svmla_lane_za32[_mf8]_vg4x1_fpm(uint32_t slice, svmfloat8_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm)__arm_streaming __arm_inout("za"); void svmla_lane_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm)__arm_streaming __arm_inout("za"); void svmla_lane_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8_t zm, uint64_t imm_idx, fpm_t fpm)__arm_streaming __arm_inout("za"); ``` In accordance with: https://github.com/ARM-software/acle/pull/323	2024-12-16 21:45:38 +00:00
Jay Foad	2fe2969659	[CodeGen] Simplify LLT bitfields. NFC. (#120074 ) - Put the element size field in the same place for all non-pointer types. - Put the element size and address space fields in the same place for all pointer types. - Put the number of elements and scalable fields in the same place for all vector types. This simplifies initialization and accessor methods isScalable, getElementCount, getScalarSizeInBits and getAddressSpace.	2024-12-16 21:23:33 +00:00
vporpo	3769fcb3e7	[SandboxVec][Interval] Implement Interval::notifyMoveInstr() (#119471 ) This patch implements the notifier for Instruction intervals. It updates the interval's top/bottom.	2024-12-16 12:59:24 -08:00
Artem Pianykh	a9237b1a10	[NFC][Utils] Extract CloneFunctionMetadataInto from CloneFunctionInto (#118623 ) Summary: The new API expects the caller to populate the VMap. We need it this way for a subsequent change around coroutine cloning. Test Plan: ninja check-llvm-unit check-llvm	2024-12-16 20:50:05 +00:00
Ramkumar Ramachandra	290f38cd1a	IR: fix getSwappedCmpPredicate() return type (#120097 ) The change 51a895a (IR: introduce struct with CmpInst::Predicate and samesign) missed a change to ICmpInst::getSwappedCmpPredicate(), which intends to return a CmpPredicate, but returns a Predicate instead. Fix this.	2024-12-16 16:29:21 +00:00

1 2 3 4 5 ...

57581 Commits