llvm-project

Author	SHA1	Message	Date
Tex Riddell	818d715989	[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs (#113637 ) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 - Return true for atan2 from isTriviallyVectorizable - Add atan2 to VecFuncs.def for massv and accelerate libraries. - Add atan2 to hasOptimizedCodeGen - Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite and update vectorization tests - Add atan2 name check to isLoweredToCall in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h - Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition. Part of: Implement the atan2 HLSL Function #70096.	2024-11-08 16:07:38 -08:00
Rohit Aggarwal	dfb60bb919	Adding more vector calls for -fveclib=AMDLIBM (#109662 ) AMD has its own implementation of vector calls. New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos. Please refer to [https://github.com/amd/aocl-libm-ose] --------- Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>	2024-10-29 10:09:55 +00:00
Farzon Lotfi	dcbf2c2ca0	[Scalarizer][DirectX] support structs return types (#111569 ) Based on this RFC: https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306 LLVM intrinsics do not support out params. To get around this limitation implementers will make intrinsics return structs to capture a return type and an out param. This implementation detail should not impact scalarization since these cases should be elementwise operations. ## Three changes are needed. - The CallInst visitor needs to be updated to handle Structs - A new visitor is needed for `ExtractValue` instructions - finish needs to be updated to handle structs so that insert elements are properly propagated. ## Testing changes - Add support for `llvm.frexp` - Add support for `llvm.dx.splitdouble` fixes https://github.com/llvm/llvm-project/issues/111437	2024-10-21 12:51:01 -04:00
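The struct-return idea in the commit above can be illustrated outside of LLVM. The following is a minimal sketch, assuming a fixed width of four lanes; `FrexpResult` and `frexpElementwise` are hypothetical names for illustration and are not part of the Scalarizer. It shows an operation with a result and an "out param" packaged as a struct of vectors and computed one lane at a time, which is why scalarizing it is straightforward.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical model of a struct-returning vector operation: the fraction is
// the "return value" and the exponent is what a C API exposes as an out param.
struct FrexpResult {
  std::array<float, 4> Fraction;
  std::array<int, 4> Exponent;
};

// Each lane depends only on the matching input lane, so the vector operation
// decomposes into four independent scalar std::frexp calls.
FrexpResult frexpElementwise(const std::array<float, 4> &V) {
  FrexpResult R{};
  for (std::size_t I = 0; I < V.size(); ++I)
    R.Fraction[I] = std::frexp(V[I], &R.Exponent[I]);
  return R;
}
```

Reading `R.Fraction` or `R.Exponent` afterwards plays the role that `ExtractValue` plays on the IR struct.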
Amr Hesham	4ba1800be6	[LLVM][NFC] Reduce copying of parameter in lambda (#110299 ) Reduce redundant copy parameter in lambda Fixes #95642	2024-10-16 09:55:01 +01:00
Piotr Fusik	cc7b24a4d1	[NFC] Fix typos in comments (#109765 )	2024-09-24 11:19:56 +02:00
Yingwei Zheng	a156b5a47d	[SLP] Add vectorization support for [u|s]cmp (#106747 ) This patch adds vectorization support for [u|s]cmp intrinsic calls.	2024-09-02 17:06:07 +08:00
Simon Pilgrim	d58d105cda	[Analysis] isTriviallyVectorizable - add vectorization support for acos/asin/atan and cosh/sinh/tanh intrinsics (#106584 ) Show fallback cases in amdlibm tests where it doesn't have that specific op	2024-08-30 16:49:23 +01:00
mskamp	b22fa9093b	[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429 ) Add KnownBits computations to ValueTracking and X86 DAG lowering. These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types. There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations. Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation. Fixes #82516.	2024-07-16 15:50:21 +01:00
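The element pairing described in the phadd example above can be modeled in a few lines. This is a sketch of the operation's semantics only, assuming a single lane group and wrap-around (non-saturating) arithmetic; `horizontalAdd` is a hypothetical helper, and the KnownBits computation itself is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a horizontal add: the first half of the result holds the
// sums of adjacent pairs of X, the second half the sums of adjacent pairs
// of Y. Unsigned arithmetic is used so that overflow wraps.
std::vector<uint32_t> horizontalAdd(const std::vector<uint32_t> &X,
                                    const std::vector<uint32_t> &Y) {
  assert(X.size() == Y.size() && X.size() % 2 == 0 &&
         "operands must have the same even number of elements");
  std::vector<uint32_t> Result;
  Result.reserve(X.size());
  for (std::size_t I = 0; I + 1 < X.size(); I += 2)
    Result.push_back(X[I] + X[I + 1]);
  for (std::size_t I = 0; I + 1 < Y.size(); I += 2)
    Result.push_back(Y[I] + Y[I + 1]);
  return Result;
}
```

Because every result element is a sum of two elements from the same operand, anything known about all elements of one operand bounds the corresponding half of the result.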
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902 ) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Nikita Popov	d42b392696	[VectorUtils] Use SmallPtrSet::remove_if() (NFC)	2024-06-26 14:55:06 +02:00
Simon Pilgrim	5b4000dc58	[VectorUtils] Add llvm::scaleShuffleMaskElts wrapper for narrowShuffleMaskElts/widenShuffleMaskElts, NFC. (#96646 ) Using the target number of vector elements, scaleShuffleMaskElts will try to use narrowShuffleMaskElts/widenShuffleMaskElts to scale the shuffle mask accordingly. Working on #58895 I didn't want to create yet another case where we have to handle both re-scaling cases.	2024-06-26 10:43:58 +01:00
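For readers unfamiliar with shuffle mask rescaling, the narrowing direction can be sketched as follows. This is an illustrative stand-in written against `std::vector`, not the LLVM implementation; `narrowMask` is a hypothetical name, and the treatment of negative entries as sentinels that are simply repeated is an assumption based on the usual convention for undefined mask elements.

```cpp
#include <cassert>
#include <vector>

// Rewrite a mask over wide elements as an equivalent mask over elements that
// are Scale times narrower: wide index M covers narrow indices
// M*Scale .. M*Scale + Scale - 1. Negative entries (sentinels) are repeated.
std::vector<int> narrowMask(int Scale, const std::vector<int> &Mask) {
  assert(Scale > 0 && "scale factor must be positive");
  std::vector<int> Scaled;
  Scaled.reserve(Mask.size() * static_cast<std::size_t>(Scale));
  for (int M : Mask)
    for (int I = 0; I < Scale; ++I)
      Scaled.push_back(M < 0 ? M : M * Scale + I);
  return Scaled;
}
```

Widening is the inverse and can fail when a group of narrow indices does not form one whole wide element, which is why a wrapper that picks the direction from the target element count is convenient.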
Nikita Popov	605e18479c	[VectorUtils] Use poison instead of undef in findScalarElement() Out-of-range extractelement returns poison, and so do poison elements in the shufflevector mask.	2024-06-24 16:20:40 +02:00
Farzon Lotfi	1d87433593	[x86] Add tan intrinsic part 4 (#90503 ) This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics. Which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 Much of this change was following how G_FSIN and G_FCOS were used. Changes: - `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN` opcode - `llvm/docs/LangRef.rst` - Document the tan intrinsic - `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan intrinsic as a vector function similar to the tanf libcall. - `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to `ISD::FTAN` - `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic - `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall mappings - `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN` Opcode - `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN` Opcode handler - `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map `G_FTAN` to `ftan` - `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`, `strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a vector intrinsic - `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic to `G_FTAN` Opcode - `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to the list of floating point math operations also associate `G_FTAN` with the `TAN_F` runtime lib. - `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math operation common behaviors. - `llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp` - List the function expansion operations for `FTAN` and `STRICT_FTAN`. Also define both opcodes in `PromoteNode`. - `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN` and `STRICT_FTAN` handling in the legalizer - `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define `SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an intrinsic that doesn't return NaN. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` - Map `LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map `Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for `Intrinsic::tan`. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan` and `strict_ftan` names for the equivalent ISD opcodes. - `llvm/lib/CodeGen/TargetLoweringBase.cpp` - Define a Tan128 libcall and ISD::FTAN as a target lowering action. - `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for tan intrinsic resolves https://github.com/llvm/llvm-project/issues/70082	2024-06-05 15:01:33 -04:00
Pierre van Houtryve	cf328ff96d	[IR] Memory Model Relaxation Annotations (#78569 ) Implements the core/target-agnostic components of Memory Model Relaxation Annotations. RFC: https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5	2024-04-24 08:52:25 +02:00
Yingwei Zheng	a1a590ef12	[InstCombine] Fix miscompilation in PR83947 (#83993 ) `762f762504/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (L394-L407)` Comment from @topperc: > This transforms assumes the mask is a non-zero splat. We only know its a splat and not provably all 0s. The mask is a constexpr that includes the address of the global variable. We can't resolve the constant expression to an exact value. Fixes #83947.	2024-03-05 22:34:04 +08:00
Alexandros Lamprineas	92289db82f	[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513 ) This fixes #71892 allowing us to check mangled names in the IR verifier.	2024-01-17 09:55:30 +00:00
Alexandros Lamprineas	e512df3ecc	[LV] Fix crash when vectorizing function calls with linear args. (#76274 ) llvm/lib/IR/Type.cpp:694: Assertion `isValidElementType(ElementType) && "Element type of a VectorType must be an integer, floating point, or pointer type."' failed. Stack dump: llvm::FixedVectorType::get(llvm::Type, unsigned int) llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&) llvm::VPBasicBlock::execute(llvm::VPTransformState) llvm::VPRegionBlock::execute(llvm::VPTransformState) llvm::VPlan::execute(llvm::VPTransformState) ... Happens with function calls of void return type.	2024-01-02 18:14:16 +00:00
Paschalis Mpeis	ddb6db4d09	[VFABI] Create FunctionType for vector functions (#75058 ) `createFunctionType` returns a FunctionType that may contain a mask, which is currently placed as the last parameter to the Function. The placement happens according to `VFParameters` of `VFInfo`, and it should be able to handle VFABI specification changes. Regarding the return type, it uses the scalar type of the input instruction, as the specification does not encode in the mangled name such information. If that ever happens, that information should be available from `VFInfo`.	2023-12-19 12:05:28 +00:00
Paschalis Mpeis	7b83f69db4	[NFC] Replace CallInst with FunctionType in VFABI, VFShape API (#74569 ) Minor simplification applied to VFShape::getScalarShape, VFShape::get, and VFABI::tryDemangleForVFABI methods. Also, remove unnecessary `static_cast` in `SLPVectorizer.cpp`	2023-12-06 17:14:58 +00:00
Graham Hunter	b1fba568f6	[SVE] Don't require lookup when demangling vector function mappings (#72260 ) We can determine the VF from a combination of the mangled name (which indicates the arguments that take vectors) and the element sizes of the arguments for the scalar function the mapping has been established for. The assert when demangling fails has been removed in favour of just not adding the mapping, which prevents the crash seen in https://github.com/llvm/llvm-project/issues/71892 This patch also stops using _LLVM_ as an ISA for scalable vector tests, since there aren't defined rules for the way vector arguments should be handled (e.g. packed vs. unpacked representation).	2023-11-23 17:15:48 +00:00
Ramkumar Ramachandra	2302e4c327	Reland "VectorUtils: mark xrint as trivially vectorizable" (#71416 ) With the recent change 98c90a13 (ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering), it is now possible for SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint and llvm.llrint, with vector codegen for the RISC-V target. Make a trivial change to VectorUtils, and update the corresponding tests. A couple of important fixes have been landed since the original patch was landed and reverted, and it is now safe to re-land the patch: 5e1d81a (LegalizeIntegerTypes: implement PromoteIntRes for xrint) and fd887a3 (LegalizeVectorTypes: fix bug in widening of vec result in xrint). See also #71399, which proves that lrint and llrint will indeed produce vector codegen on RISC-V. Fixes #55208.	2023-11-06 18:49:49 +00:00
Ramkumar Ramachandra	ac7c816dc2	Revert "VectorUtils: mark lrint, llrint as trivially vectorizable (#69945 )" This reverts commit 5bfd89bda7c2d5ff167c7bcea0c8d69b0b498f08. It was causing build failures on ffmpeg on i686.	2023-11-01 09:57:22 +00:00
Ramkumar Ramachandra	5bfd89bda7	VectorUtils: mark lrint, llrint as trivially vectorizable (#69945 ) With the recent change 98c90a13 (ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering), it is now possible for SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint and llvm.llrint, with vector codegen for the RISC-V target. Make a trivial change to VectorUtils, and update the corresponding tests.	2023-10-31 21:29:15 +00:00
Kazu Hirata	3b7bfeb483	[llvm] Stop including llvm/ADT/SmallString.h (NFC) Identified with misc-include-cleaner.	2023-10-22 10:42:15 -07:00
JolantaJensen	01797dad86	Fix mechanism propagating mangled names for TLI function mappings (#66656 ) Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in mangled name following the pattern: _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)] The problem is that the mangled name uses _LLVM_ as the ISA name, which prevents the compiler from computing the vectorization factor for scalable vectors, as it cannot make any decision based on the _LLVM_ ISA. If we use "s" as the ISA name, the compiler can make decisions based on the VFABI specification, where SVE-specific rules are described. This patch is only a refactoring stage where there is no change to the compiler's behaviour.	2023-10-02 18:58:39 +01:00
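The naming pattern quoted in the commit above can be assembled mechanically. The sketch below is not the LLVM mangling code; `mangleVectorVariant` and its parameters are hypothetical, and the specific tokens (`M`/`N` for masked/unmasked, `x` for a scalable length, `v` for a vector parameter) are assumptions taken from common vector function ABI usage rather than from the commit text.

```cpp
#include <cassert>
#include <string>

// Build _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)].
// VLen == 0 is used here (by assumption) to request a scalable length.
std::string mangleVectorVariant(const std::string &ISA, bool Masked,
                                unsigned VLen, const std::string &Params,
                                const std::string &ScalarName,
                                const std::string &VectorName = "") {
  assert(!ScalarName.empty() && "a scalar function name is required");
  std::string Name = "_ZGV" + ISA;
  Name += Masked ? "M" : "N";
  Name += VLen == 0 ? std::string("x") : std::to_string(VLen);
  Name += Params + "_" + ScalarName;
  if (!VectorName.empty())
    Name += "(" + VectorName + ")";
  return Name;
}
```

With these assumptions, passing `_LLVM_` as the ISA gives a name such as `_ZGV_LLVM_N4v_sin(vec_sin)`, while passing `s` gives `_ZGVsMxv_sin(vec_sin)`, the form from which SVE rules can be applied.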
Anna Thomas	3cf24dbbdd	[LV] Complete load groups and release store groups. Try 2. This is a complete fix for CompleteLoadGroups introduced in D154309. We need to check for dependency between A and every member of the load Group of B. This patch also fixes another miscompile seen when we incorrectly sink stores below a depending load (see testcase in interleaved-accesses-sink-store-across-load.ll). This is fixed by releasing store groups correctly. This change was previously reverted (e85fd3cbdd68) due to Asan failure with use-after-free error. A testcase is added and the bug is fixed in this version of the patch. Differential Revision: https://reviews.llvm.org/D155520	2023-08-08 18:10:23 -04:00
Anna Thomas	e85fd3cbdd	Revert "[LV] Complete load groups and release store groups in presence of dependency" This reverts commit eaf6117f3388615f51198e47c0d6be0252729508 (D155520). There's an ASAN build failure that needs investigation.	2023-07-26 15:07:26 -04:00
Anna Thomas	eaf6117f33	[LV] Complete load groups and release store groups in presence of dependency This is a complete fix for CompleteLoadGroups introduced in D154309. We need to check for dependency between A and every member of the load Group of B. This patch also fixes another miscompile seen when we incorrectly sink stores below a depending load (see testcase in interleaved-accesses-sink-store-across-load.ll). This is fixed by releasing store groups correctly. Differential Revision: https://reviews.llvm.org/D155520	2023-07-25 17:32:09 -04:00
Anna Thomas	9675e3fa81	[LV] Address post-commit NFC comments in interleave Addressed most of post-commit comments in D154309.	2023-07-14 16:24:07 -04:00
Florian Hahn	d7e79bd7d4	[LV] Check if ops can safely be truncated in computeMinimumValueSizes. Update computeMinimumValueSizes to check if an instruction's operands can safely be truncated. If more than MinBW bits are demanded for the operand or if the operand is a constant and cannot be safely truncated, it is not safe to evaluate the instruction in the narrower MinBW. Skip those cases. Fixes https://github.com/llvm/llvm-project/issues/47927 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D154717	2023-07-11 20:18:55 +01:00
Elliot Goodrich	39d8e6e22c	Add missing StringExtras.h includes In preparation for removing the `#include "llvm/ADT/StringExtras.h"` from the header to source file of `llvm/Support/Error.h`, first add in all the missing includes that were previously included transitively through this header. This is fixing all files missed in b0abd4893fa1. Differential Revision: https://reviews.llvm.org/D154543	2023-07-08 10:19:07 +01:00
Florian Hahn	4d847bf4d0	[LV] Do not add load to group if it moves across conflicting store. This patch prevents invalid load groups from being formed, where a load needs to be moved across a conflicting store. Once we hit a store that conflicts with a load with an existing interleave group, we need to stop adding earlier loads to the group, as this would force hoisting the previous stores in the group across the conflicting load. To detect such cases, add a new CompletedLoadGroups set, which is used to keep track of load groups to which no earlier loads can be added. Fixes https://github.com/llvm/llvm-project/issues/63602 Reviewed By: anna Differential Revision: https://reviews.llvm.org/D154309	2023-07-07 11:06:30 +01:00
Philip Reames	e41dce4d49	[LAA/LV] Simplify stride speculation logic [NFC] (try 2) The original commit wasn't quite NFC, and this was caught by an arguably overly strong assert. Specifically, I'd failed to strip off the integer cast off the SCEV before saving it in the map. The result - other than a failed assert - is that we'd speculate on the casted unknown, not the unknown. The only case I can think of where that might change behavior would be a sext(i1 load). I doubt that case is interesting in practice, but it's good to be strictly NFC on this change regardless. Original commit message follows.. The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is profitable to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point. We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable. Differential Revision: https://reviews.llvm.org/D147750	2023-05-11 10:19:23 -07:00
Philip Reames	dc0d00c5fc	Revert "[LAA/LV] Simplify stride speculation logic [NFC]" This reverts commit d5b840131223f2ffef4e48ca769ad1eb7bb1869a. Running this through broader testing after rebasing is revealing a crash. Reverting while I investigate.	2023-05-11 09:26:35 -07:00
Philip Reames	d5b8401312	[LAA/LV] Simplify stride speculation logic [NFC] The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is profitable to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point. We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable. Differential Revision: https://reviews.llvm.org/D147750	2023-05-11 08:32:56 -07:00
ManuelJBrito	d22edb9794	[IR][NFC] Change UndefMaskElem to PoisonMaskElem Following the change in shufflevector semantics, poison will be used to represent undefined elements in shufflevector masks. Differential Revision: https://reviews.llvm.org/D149256	2023-04-27 18:01:54 +01:00
Jay Foad	593e25ffae	[Vectorize] Fix vectorization, scalarization and folding of llvm.is.fpclass llvm.is.fpclass is different from other vectorizable intrinsics in that it is overloaded on an argument type, not on the return type. Differential Revision: https://reviews.llvm.org/D148905	2023-04-24 13:42:08 +01:00
Jay Foad	2b81ec3265	Revert "[ConstantFolding] Fix crash when folding vector llvm.is.fpclass" This reverts commit 5fc6425fb6c77052a26cf0cf7b886449fabe1af4. It is reported to cause other crashes that require a larger fix.	2023-04-21 14:01:06 +01:00
Jay Foad	5fc6425fb6	[ConstantFolding] Fix crash when folding vector llvm.is.fpclass Differential Revision: https://reviews.llvm.org/D148803	2023-04-20 15:34:50 +01:00
Philip Reames	2d79b71366	[LAA] Continue moving utilities to sole use to isolate symbolic stride reasoning [nfc]	2023-04-06 08:27:57 -07:00
Philip Reames	800a99c4f4	[LAA] Group implementation of stride speculation into one file [nfc] These utilities are only used in one place, so move them there and make them static.	2023-04-05 20:39:08 -07:00
Paul Osmialowski	6b6f312cce	[TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions This commit extends D134719 "[AArch64] Enable libm vectorized functions via SLEEF" with the mappings for the scalable functions. It also introduces all the necessary changes needed to support masked interfaces. Reviewed By: danielkiss, sdesmalen Differential Revision: https://reviews.llvm.org/D146839	2023-03-29 13:07:09 +01:00
Paul Osmialowski	f8f1909d36	Revert "[TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions" Reverting it so I could land it with Arcanist. This reverts commit 59dcf927ee43e995374907b6846b657f68d7ea49.	2023-03-29 12:54:22 +01:00
Paul Osmialowski	59dcf927ee	[TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions This commit extends D134719 "[AArch64] Enable libm vectorized functions via SLEEF" with the mappings for the scalable functions. It also introduces all the necessary changes needed to support masked interfaces. Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>	2023-03-29 11:07:35 +01:00
Kazu Hirata	526966d07d	Use llvm::bit_ceil (NFC) Note that: std::has_single_bit(X) ? X : llvm::NextPowerOf2(X); is equivalent to: std::bit_ceil(X) even for input 0.	2023-01-28 16:13:09 -08:00
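The equivalence noted in the commit above, including the input 0 case, can be checked with small hand-written stand-ins. The helpers below are illustrative re-implementations, not the LLVM or standard library functions; `nextPowerOf2` is written under the assumption that it returns the power of two strictly greater than its argument, which is what makes the `has_single_bit` guard necessary.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit is set, i.e. X is a power of two.
bool hasSingleBit(uint64_t X) { return X != 0 && (X & (X - 1)) == 0; }

// Power of two strictly greater than A (assumed semantics): smear the top set
// bit into every lower position, then add one.
uint64_t nextPowerOf2(uint64_t A) {
  A |= A >> 1;
  A |= A >> 2;
  A |= A >> 4;
  A |= A >> 8;
  A |= A >> 16;
  A |= A >> 32;
  return A + 1;
}

// Smallest power of two that is greater than or equal to X.
uint64_t bitCeil(uint64_t X) {
  assert(X <= (uint64_t(1) << 63) && "result must fit in 64 bits");
  uint64_t P = 1;
  while (P < X)
    P <<= 1;
  return P;
}

// The expression that the commit replaces.
uint64_t bitCeilViaSelect(uint64_t X) {
  return hasSingleBit(X) ? X : nextPowerOf2(X);
}
```

For 0 the guard fails, `nextPowerOf2(0)` yields 1, and the smallest power of two not less than 0 is also 1, so the two forms agree there as well.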
Kazu Hirata	02a52b7306	[llvm] Use llvm::bit_width (NFC)	2023-01-28 15:04:20 -08:00
Kazu Hirata	55e2cd1609	Use llvm::count{lr}_{zero,one} (NFC)	2023-01-28 12:41:20 -08:00
Roman Lebedev	f487dfd830	[NFC][Analysis] Implement `getShuffleMaskWithWidestElts()` wrapper (+tests) It will be needed in an upcoming patch to implement some shuffle combining.	2022-12-26 01:04:48 +03:00
Fangrui Song	2fa744e631	std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). This commit fixes LLVMAnalysis and its dependencies.	2022-12-16 22:44:08 +00:00
Fangrui Song	d4b6fcb32e	[Analysis] llvm::Optional => std::optional	2022-12-14 07:32:24 +00:00

186 Commits