llvm-project

Author	SHA1	Message	Date
Chow	3e008cb333	Scalarizer: Fix vector shuffle issue when the vector can't be aligned to a customized minBits. (#163912 ) When minBits is set and the scalarizer pass runs, if the last remaining part of a boolean vector can't be aligned to min bits, the remaining bits should be processed one by one, and a direct shuffle during packing is not allowed. Problem: In the 'concatenate' step, when processing a boolean vector, if the last remaining bits (fragment) can't be aligned to minBits but need to be packed, those bits should be processed one by one. A direct call to vector shuffle assumes that the remaining boolean bits can be packed to the target pack size. For example, when processing a boolean vector with `size = 7` and `min bits = 4`, the first fragment with `4` bits can be packed correctly, but `3` bits remain which can't be used in a vector shuffle call. Solution: If the remaining bits can't be aligned to the required target (min bits) pack size, process them one by one. (This will mostly only affect boolean vectors, as their bit width is not aligned to a power of 2.) --------- Co-authored-by: Zhou, Shaochi(AMD) <shaozhou@amd.com>	2025-12-08 18:06:39 +00:00
Rahul Joshi	4b87d5861d	[NFC][LLVM] Namespace cleanup in Scalarizer.cpp (#163766 )	2025-10-16 10:15:55 -07:00
Farzon Lotfi	581ba1cbf7	[DirectX] Fix crash in passes when building with LLVM_ENABLE_EXPENSIVE_CHECKS (#150483 ) fixes #148681 fixes #148680 For the scalarizer pass we just need to indicate that scalarization took place; I used the logic for knowing when to eraseFromParent to indicate this. For the DXILLegalizePass the new `legalizeScalarLoadStoreOnArrays` did not use `ToRemove`, which means our use of `!ToRemove.empty();` was no longer correct. This meant each legalization now needed a means of indicating whether a change was made. For DXILResourceAccess.cpp the `Changed` bool was never set to true. So removed it and replaced it with `!Resources.empty();` since we only call `replaceAccess` if we have items in Resources.	2025-07-24 17:17:47 -04:00
Deric C.	0c14f0e891	[Scalarizer] Use correct key for ExtractValueInst gather (#149855 ) Fixes #149345 Effectively no-op pairs of insertelement-extractelement instructions were being created due to the ExtractValueInst visitor in the Scalarizer storing its scalarized result into the Scattered map using an incorrect key (specifically the type used in the key). This PR fixes this issue.	2025-07-21 17:12:15 -07:00
Deric C.	1440f02259	[Scalarizer] Ensure valid VectorSplits for each struct element in `visitExtractValueInst` (#128538 ) Fixes #127739 The `visitExtractValueInst` is missing a check that was present in `splitCall` / `visitCallInst`. This check ensures that each struct element has a VectorSplit, and that each VectorSplit contains the same number of elements packed per fragment. --------- Co-authored-by: Jay Foad <jay.foad@amd.com>	2025-03-04 13:10:31 -08:00
Finn Plummer	45c01e8a33	[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635 ) - update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for all uses, to allow specification of target specific intrinsics - add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api - update TTI api to provide `isTargetIntrinsicWith...` functions and consistently name them - move `isTriviallyScalarizable` to VectorUtils - update all uses of the api and provide the TTI parameter Resolves #117030	2024-12-19 11:54:26 -08:00
Finn Plummer	8663b8777e	[NFC][VectorUtils][TargetTransformInfo] Add `isVectorIntrinsicWithOverloadTypeAtArg` api (#114849 ) This change allows target intrinsics to specify and overwrite overloaded types. - Updates `ReplaceWithVecLib` to not provide TTI as there most probably won't be a use-case - Updates `SLPVectorizer` to use available TTI - Updates `VPTransformState` to pass down TTI - Updates `VPlanRecipe` to use passed-down TTI This change will let us add scalarization for `asdouble`: #114847	2024-11-21 11:04:25 -08:00
Farzon Lotfi	21b3769d1d	[Scalarizer] Fix to only scalarize if intrinsic was marked as isTriviallyScalarizable (#113625 ) fixes #113624	2024-10-24 23:26:02 -07:00
Farzon Lotfi	dcbf2c2ca0	[Scalarizer][DirectX] support structs return types (#111569 ) Based on this RFC: https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306 LLVM intrinsics do not support out params. To get around this limitation implementers will make intrinsics return structs to capture a return type and an out param. This implementation detail should not impact scalarization since these cases should be elementwise operations. ## Three changes are needed. - The CallInst visitor needs to be updated to handle Structs - A new visitor is needed for `ExtractValue` instructions - `finish` needs to be updated to handle structs so that insert elements are properly propagated. ## Testing changes - Add support for `llvm.frexp` - Add support for `llvm.dx.splitdouble` fixes https://github.com/llvm/llvm-project/issues/111437	2024-10-21 12:51:01 -04:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Farzon Lotfi	63a0a81e73	[NFC][Scalarizer][TargetTransformInfo] Add isTargetIntrinsicWithScalarOpAtArg api (#111441 ) This change allows target intrinsics to have scalar args fixes [111440](https://github.com/llvm/llvm-project/issues/111440) This change will let us add scalarization for WaveReadLaneAt: https://github.com/llvm/llvm-project/pull/111010	2024-10-07 19:57:07 -04:00
Matt Arsenault	1bc9b67bd8	Scalarizer: Replace cl::opts with pass parameters (#110645 ) Preserve the existing defaults (although load-store defaulting to false is a really bad one). Also migrate DirectX tests to new PM.	2024-10-02 14:45:26 +04:00
Rahul Joshi	1b7b3b8d35	[NFC] Move intrinsic related functions to Intrinsic namespace (#110125 ) Move static functions `Function::lookupIntrinsicID` and `Function::isTargetIntrinsic` to Intrinsic namespace.	2024-09-30 07:42:53 -07:00
Farzon Lotfi	0f97b4824a	[Scalarizer][DirectX] Add support for scalarization of Target intrinsics (#108776 ) Since we are using the Scalarizer pass in the backend we needed a way to allow this pass to operate on Target intrinsics. We achieved this by adding `TargetTransformInfo ` to the Scalarizer pass. This allowed us to call a function available to the DirectX backend to know if an intrinsic is a target intrinsic that should be scalarized.	2024-09-17 11:35:42 -04:00
Farzon Lotfi	c05e29bff0	[LegacyPM][DirectX] Add legacy scalarizer back for use in the DirectX backend (#107427 ) As discussed in this [proposal](https://github.com/llvm/wg-hlsl/pull/62/files?short_path=ac6e592#diff-ac6e59276afe8016e307eedc5c835f534c0cb353707760b44df0fa9d905a5cf8). We had to bring back the legacy pass manager interface for the scalarizer pass. Two reasons for this: 1. The DirectX backend is still using the legacy pass manager 2. The new PM isn't hooked up in clang yet via `BackendUtil.cpp`'s `AddEmitPasses` That means even if we add a `buildCodeGenPipeline` we won't be able to benefit from the new pass manager's scalarizer pass interface. The remaining changes are hooking up the scalarizer pass to the DirectX backend, updating the DirectX test cases, and allowing the `optdriver` to not block the legacy invocation of the scalarizer pass. Future work still needs to be done to allow the scalarizer pass to handle target specific intrinsics. closes #105178	2024-09-12 15:53:50 -04:00
Kazu Hirata	4b28b3fae4	[Transforms] Use range-based for loops (NFC) (#97195 )	2024-07-02 16:20:44 -07:00
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902 ) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497 ) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Aiden Grossman	2470857fe7	[NewPM] Remove ScalarizerLegacyPass (#72814 ) This pass isn't used anywhere upstream and thus has no test coverage. Because of these reasons, remove it.	2023-11-20 01:09:27 -08:00
Kazu Hirata	697082de74	[Scalar] Use LLVMContext::MD_mem_parallel_loop_access directly (NFC) (#69549 ) This patch "constant propagates" LLVMContext::MD_mem_parallel_loop_access into wherever ParallelLoopAccessMDKind is used.	2023-10-19 18:38:25 -07:00
Kazu Hirata	0187960cdd	[Scalar] Use LLVMContext::MD_mem_parallel_loop_access (NFC)	2023-10-15 00:14:14 -07:00
Mikhail Goncharov	74f4daef04	fix unused variable warnings in conditionals for 92023b15099012a657da07ebf49dd7d94a260f84	2023-08-30 14:36:42 +02:00
Nuno Lopes	31d8bdbcad	[Scalarizer] Fold -1 mask in shufflevector to poison instead of undef Per latest LangRef	2023-07-23 15:02:23 +01:00
Nikita Popov	9cf5254878	[llvm] Remove some uses of isOpaqueOrPointeeTypeEquals() (NFC)	2023-07-18 11:18:31 +02:00
Nicolai Hähnle	2cb5c6d124	Scalarizer: limit scalarization for small element types Scalarization can expose optimization opportunities for the individual elements of a vector, and can therefore be beneficial on targets like GPUs that tend to operate on scalars anyway. However, notably with 16-bit operations it is often beneficial to keep <2 x i16 / half> vectors around since there are packed instructions for those. Refactor the code to operate on "fragments" of split vectors. The fragments are usually scalars, but may themselves be smaller vectors when the scalarizer-min-bits option is used. If the split is uneven, the last fragment is a shorter remainder. This is almost NFC when the new option is unused, but it happens to clean up some code in the fully scalarized case as well. Differential Revision: https://reviews.llvm.org/D149842	2023-06-13 21:14:32 +02:00
Jay Foad	63901cb082	[Scalarizer] Scalarize freeze instruction Differential Revision: https://reviews.llvm.org/D152518	2023-06-09 13:54:24 +01:00
Nicolai Hähnle	d0a125a1e6	Scalarizer: use the canonical form of {extract,insert}element This leads to a bunch of trivial test churn, plus some extra test changes that are purely due to update_test_checks. Pulled out of https://reviews.llvm.org/D149842 as a preparatory change. Differential Revision: https://reviews.llvm.org/D149944	2023-05-05 13:05:31 +02:00
Jay Foad	593e25ffae	[Vectorize] Fix vectorization, scalarization and folding of llvm.is.fpclass llvm.is.fpclass is different from other vectorizable intrinsics in that it is overloaded on an argument type, not on the return type. Differential Revision: https://reviews.llvm.org/D148905	2023-04-24 13:42:08 +01:00
Fangrui Song	3152156334	[Transforms/Scalar] llvm::Optional => std::optional	2022-12-13 08:05:14 +00:00
Nicolai Hähnle	6c379cb318	Scalarizer: fix an opaque pointer bug With opaque pointers, it's possible for the same pointer value to be used to store different vector types (both number and type of elements), so we need to take that into account when caching the scattering. Differential Revision: https://reviews.llvm.org/D139359	2022-12-08 20:48:14 +01:00
Nicolai Hähnle	1a78c64654	Scalarizer: explicitly exclude scalable vectors They are unsupported and would previously crash, now we just skip them. Hypothetically, one could consider "scalarizing" a <vscale x n x T> into n copies of <vscale x 1 x T>. But (1) it's unclear how to do that because insertelement etc. don't work with scalable vectors in the required way, and (2) there is no user of such functionality. Differential Revision: https://reviews.llvm.org/D139358	2022-12-08 20:48:14 +01:00
Kazu Hirata	595f1a6aaf	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 19:47:13 -08:00
Kazu Hirata	343de6856e	[Transforms] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 21:11:37 -08:00
Manuel Brito	1e55d5b1f2	Use poison instead of undef as placeholder for vector construction [NFC] Differential Revision: https://reviews.llvm.org/D138450	2022-11-21 18:43:23 +00:00
Thomas Symalla	fc26a75280	[NFC] Fixed several misspellings of "Splitter" in Scalarizer Spliiter => Splitter	2022-10-22 15:13:56 +02:00
Nuno Lopes	0586d1cac2	[NFC] Switch a few uses of undef to poison as placeholders for unreachable code	2022-06-30 21:47:31 +01:00
serge-sans-paille	aaf1630ac3	[Scalarizer] No need to gather a scattered extracted element ExtractElement does not produce a vector out of a vector, so there's no need to call a gather once done. Fix #54469 Credits to npopov@redhat.com for the original approach. Differential Revision: https://reviews.llvm.org/D126012	2022-06-21 18:43:54 +02:00
Kazu Hirata	129b531c9c	[llvm] Use value_or instead of getValueOr (NFC)	2022-06-18 23:07:11 -07:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as a overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00
David Green	9727c77d58	[NFC] Rename Instrinsic to Intrinsic	2022-04-25 18:13:23 +01:00
Benoit Jacob	9879c555f2	Expose ScalarizerPass options to C++ (not just commandline) Context: I needed this for https://github.com/google/iree/pull/8474 . I found that TSan instrumentation expects vector sizes to be <= 16, and in my project (IREE) we have tests with higher vector sizes. That left some test functions uninstrumented, resulting in crashes as instrumented code called into them. Differential Revision: https://reviews.llvm.org/D121182	2022-03-14 12:00:35 +01:00
Nikita Popov	c262ba2aab	[Scalarizer] Avoid pointer element type accesses Pass through the load/store type to the Scatterer instead.	2022-03-03 10:28:58 +01:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
Daniele Vettorel	67887b0f81	[Scalarizer] Do not insert instructions between PHI nodes and debug intrinsics. The scalarizer pass seems to be inserting instructions in-between PHI nodes or debug intrinsics that end up staying at the end of the pass, resulting in malformed IR and violating assumptions. This patch adds a check to make sure the `extractelement` instructions that it adds are correctly placed after all PHI nodes and debug intrinsics. Patch by vettoreldaniele. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D112472	2021-11-02 09:53:59 -04:00
Kazu Hirata	4f0225f6d2	[Transforms] Migrate from getNumArgOperands to arg_size (NFC) Note that getNumArgOperands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-10-01 09:57:40 -07:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit 0ee439b705e82a4fe20e2, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seems to have been missing in that patch was to make sure that the rest of LLVM also handles that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions (typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Juneyoung Lee	1fc992bd86	[Scalarizer] Use poison as insertelement's placeholder This patch makes Scalarizer use poison as insertelement's placeholder. It contains two changes in Scalarizer.cpp, and neither change alters the semantics of the optimized program, because the placeholder value (poison) is already completely hidden by the following insertelement instructions. The first change at visitBitCastInst() creates a poison vector of MidTy and consecutively inserts FanIn times, which is the number of elements of MidTy. The second change at ScalarizerVisitor::finish() creates poison with Op->getType(), and it is filled with Count insertelements. The test diffs show that the poison value is never exposed after insertelements. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93989	2021-01-04 00:35:28 +09:00
Bjorn Pettersson	aa8be5aeea	[Scalarizer] Avoid changing name of non-instructions The "takeName" logic in ScalarizerVisitor::gather did not consider that the value vector could refer to non-instructions, such as global variables. This patch makes sure that we avoid changing the name of a value if it isn't an instruction. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D87685	2020-09-15 14:15:50 +02:00
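
The fragment splitting described in commits 2cb5c6d124 and 3e008cb333 can be illustrated with a small standalone sketch. This is not the code in Scalarizer.cpp: `fragmentSizes` and its parameters are hypothetical names chosen for illustration, and the packing rule (pack `minBits / elemBits` elements per fragment, with a shorter remainder last) is an assumption drawn only from the commit descriptions above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not LLVM's API: returns the element count of each
// fragment when a fixed-width vector is split under a minimum-bits setting.
std::vector<unsigned> fragmentSizes(unsigned numElems, unsigned elemBits,
                                    unsigned minBits) {
  assert(numElems > 0 && elemBits > 0 && "vector must be non-empty");
  // Elements that are already at least minBits wide are fully scalarized.
  unsigned packed = elemBits < minBits ? minBits / elemBits : 1;
  // Start with the full-size fragments.
  std::vector<unsigned> sizes(numElems / packed, packed);
  // An uneven split leaves a shorter remainder as the last fragment.
  if (unsigned rest = numElems % packed)
    sizes.push_back(rest);
  return sizes;
}
```

Under these assumptions, `fragmentSizes(7, 1, 4)` models the `<7 x i1>` example from 3e008cb333: one full fragment of 4 elements and a remainder of 3, which is the fragment that commit says must be handled element by element rather than with a direct shuffle.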


111 Commits