llvm-project

Author	SHA1	Message	Date
Benjamin Maxwell	95f00a63ce	[IR] Allow fast math flags on calls with homogeneous FP struct types (#110506 ) This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`).	2024-10-02 10:05:09 +01:00
Jay Foad	eb6e7e8f89	[unittests] Use {} instead of std::nullopt to initialize empty ArrayRef (#109388 ) Follow up to #109133.	2024-09-21 10:59:50 +01:00
Jorge Gorbe Moya	094e6b8276	[IR] Make UnaryInstruction::classof recognize FreezeInst. (#107944 )	2024-09-09 21:01:44 -07:00
Tim Besard	eb7d535199	[LLVM] Add a C API for creating instructions with custom syncscopes. (#104775 ) Another upstreaming of C API extensions we have in Julia/LLVM.jl. Although [we went](https://github.com/maleadt/LLVM.jl/pull/431) with a string-based API there, here I'm proposing something that's similar to existing metadata/attribute APIs: - explicit functions to map syncscope names to IDs, and back - `LLVM*SyncScope` versions of builder APIs that already take a `SingleThread` argument: atomic rmw, atomic xchg, fence - `LLVMGetAtomicSyncScopeID` and `LLVMSetAtomicSyncScopeID` for other atomic instructions - testing through `llvm-c-test`'s `--echo` functionality	2024-08-20 14:12:35 +02:00
James Y Knight	dfeb3991fb	Remove the `x86_mmx` IR type. (#98505 ) It is now translated to `<1 x i64>`, which allows the removal of a bunch of special casing. This _incompatibly_ changes the ABI of any LLVM IR function with `x86_mmx` arguments or returns: instead of passing in mmx registers, they will now be passed via integer registers. However, the real-world incompatibility caused by this is expected to be minimal, because Clang never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>` or `double`, depending on ABI. This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type. That type simply no longer corresponds to an IR type, and is used only by MMX intrinsics and inline-asm operands. Because SelectionDAGBuilder only knows how to generate the operands/results of intrinsics based on the IR type, it thus now generates the intrinsics with the type MVT::v1i64, instead of MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus have the X86 backend fix them up in DAGCombine. (This may be a short-lived hack, if all the MMX intrinsics can be removed in upcoming changes.) Works towards issue #98272.	2024-07-25 09:19:22 -04:00
Tsz Chan	298e292a76	[IR] Add overflow check in AllocaInst::getAllocationSize (#97170 ) Fixes #91380.	2024-07-03 12:48:04 +02:00
Stephen Tozer	dc726c3403	Reapply#4 "[RemoveDIs] Load into new debug info format by default in LLVM (#89799 )" Reapplies commit c5aeca73 (and its followup commit 21396be8), which were reverted due to missing functionality in MLIR and Flang regarding printing debug records. This has now been added in commit 08aa511, along with support for printing debug records in flang. This reverts commit 2dc2290860355dd2bac3b655eea895fe30fde257.	2024-06-14 09:54:56 +01:00
Stephen Tozer	2dc2290860	Revert new debug info format commits: "[Flang] Update test to not check for tail calls on debug intrinsics" & "Reapply#3 "[RemoveDIs] Load into new debug info format by default in LLVM (#89799)" Recent updates to flang have added debug info generation via MLIR, a path which currently does not support debug records. The patch that enables debug records by default (and a small followup patch) are thus being reverted until the MLIR path has been fixed. This reverts commits: 21396be865b4640abf6afa0b05de6708a1a996e0 c5aeca732d1ff6769b0659efebd1cfb5f60487e4	2024-06-11 12:19:06 +01:00
Stephen Tozer	c5aeca732d	Reapply#3 "[RemoveDIs] Load into new debug info format by default in LLVM (#89799 )" Reapplies commit 91446e2, which was reverted due to a downstream error, discussed on the pull request. The error could not be reproduced upstream, and cannot be reproduced downstream as-of current main, so until the error can be confirmed to still exist this patch should return. This reverts commit 23f8fac745bdde70ed4f9c585d19c4913734f1b8.	2024-06-10 13:04:40 +01:00
Fangrui Song	23f8fac745	Revert "Repply#2 "[RemoveDIs] Load into new debug info format by default in LLVM (#89799 )"" This reverts commit 91446e2aa687ec57ad88dc0df793d0c6e694a7c9 and a unittest followup 1530f319311908b06fe935c89fca692d3e53184f (#90476). In a stage-2 -flto=thin -gsplit-dwarf -g -fdebug-info-for-profiling -fprofile-sample-use= build of clang, a ThinLTO backend compile has assertion failures: Global is external, but doesn't have external or weak linkage! ptr @_ZN5clang12ast_matchers8internal18makeAllOfCompositeINS_8QualTypeEEENS1_15BindableMatcherIT_EEN4llvm8ArrayRefIPKNS1_7MatcherIS5_EEEE function declaration may only have a unique !dbg attachment ptr @_ZN5clang12ast_matchers8internal18makeAllOfCompositeINS_8QualTypeEEENS1_15BindableMatcherIT_EEN4llvm8ArrayRefIPKNS1_7MatcherIS5_EEEE The failures somehow go away if -fprofile-sample-use= is removed.	2024-05-13 16:37:39 -07:00
Stephen Tozer	91446e2aa6	Repply#2 "[RemoveDIs] Load into new debug info format by default in LLVM (#89799 )" Reapplies the original commit: 2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05 The previous application of this patch failed due to some missing DbgVariableRecord support in clang, which has been added now by commit 8805465e. This will probably break some downstream tools that don't already handle debug records. If your downstream code breaks as a result of this change, the simplest fix is to convert the module in question to the old debug format before you process it, using `Module::convertFromNewDbgValues()`. For more information about how to handle debug records or about what has changed, see the migration document: https://llvm.org/docs/RemoveDIsDebugInfo.html This reverts commit 4fd319ae273ed6c252f2067909c1abd9f6d97efa.	2024-05-03 12:55:31 +01:00
Stephen Tozer	4fd319ae27	Revert#2 "[RemoveDIs] Load into new debug info format by default in LLVM (#89799 )" Reverted following probably-causing failures on some clang buildbots: https://lab.llvm.org/buildbot/#/builders/245/builds/24037 This reverts commit a12622543de15df45fb9ad64e8ab723289d55169.	2024-05-02 17:52:02 +01:00
Stephen Tozer	a12622543d	Reapply "[RemoveDIs] Load into new debug info format by default in LLVM (#89799 )" Fixes the broken tests in the original commit: 2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05 This will probably break some downstream tools that don't already handle debug records. If your downstream code breaks as a result of this change, the simplest fix is to convert the module in question to the old debug format before you process it, using `Module::convertFromNewDbgValues()`. For more information about how to handle debug records or about what has changed, see the migration document: https://llvm.org/docs/RemoveDIsDebugInfo.html This reverts commit 00821fed09969305b0003d3313c44d1e761a7131.	2024-05-02 16:32:12 +01:00
Stephen Tozer	00821fed09	Revert "[RemoveDIs] Load into new debug info format by default in LLVM (#89799 )" A unit test was broken by the above commit: https://lab.llvm.org/buildbot/#/builders/139/builds/64627 This reverts commit 2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05.	2024-05-01 16:56:34 +01:00
Stephen Tozer	2f01fd99eb	[RemoveDIs] Load into new debug info format by default in LLVM (#89799 ) This patch enables parsing and creating modules directly into the new debug info format. Prior to this patch, all modules were constructed with the old debug info format by default, and would be converted into the new format just before running LLVM passes. This is an important milestone, in that this means that every tool will now be exposed to debug records, rather than those that run LLVM passes. As far as I've tested, all LLVM tools/projects now either handle debug records, or convert them to the old intrinsic format. There are a few unit tests that need updating for this patch; these are either cases of tests that previously needed to set the debug info format to function, or tests that depend on the old debug info format in some way. There should be no visible change in the output of any LLVM tool as a result of this patch, although the likelihood of this patch breaking downstream code means an NFC tag might be a little misleading, if not technically incorrect: This will probably break some downstream tools that don't already handle debug records. If your downstream code breaks as a result of this change, the simplest fix is to convert the module in question to the old debug format before you process it, using `Module::convertFromNewDbgValues()`. For more information about how to handle debug records or about what has changed, see the migration document: https://llvm.org/docs/RemoveDIsDebugInfo.html	2024-05-01 16:50:12 +01:00
Nikita Popov	1baa385065	[IR][PatternMatch] Only accept poison in getSplatValue() (#89159 ) In #88217 a large set of matchers was changed to only accept poison values in splats, but not undef values. This is because we now use poison for non-demanded vector elements, and allowing undef can cause correctness issues. This patch covers the remaining matchers by changing the AllowUndef parameter of getSplatValue() to AllowPoison instead. We also carry out corresponding renames in matchers. As a followup, we may want to change the default for things like m_APInt to m_APIntAllowPoison (as this is much less risky when only allowing poison), but this change doesn't do that. There is one caveat here: We have a single place (X86FixupVectorConstants) which does require handling of vector splats with undefs. This is because this works on backend constant pool entries, which currently still use undef instead of poison for non-demanded elements (because SDAG as a whole does not have an explicit poison representation). As it's just the single use, I've open-coded a getSplatValueAllowUndef() helper there, to discourage use in any other places.	2024-04-18 15:44:12 +09:00
Jeremy Morse	6b62a9135a	[RemoveDIs] Reapply 3fda50d3915, insert instructions using iterators I'd reverted this in 6c7805d5d1 after a bad stage. Original commit messsage follows: [NFC][RemoveDIs] Bulk update utilities to insert with iterators As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. I've also switched DemotePHIToStack to take an optional iterator: it needs to take an iterator, and having a no-insert-location behaviour appears to be important. The constructors for ICmpInst and FCmpInst have been updated too. They're the only instructions that take block _references_ rather than pointers for certain calls, and a future patch is going to make use of default-null block insertion locations. All of this should be NFC.	2024-03-04 13:14:39 +00:00
Youngsuk Kim	5a314609b4	[llvm] Replace calls to Type::getPointerTo (NFC) Clean-up towards removing method Type::getPointerTo.	2023-11-30 13:18:51 -06:00
Fangrui Song	dd3184c30f	[unittest,examples] Replace uses of IRBuilder::getInt8PtrTy with getPtrTy. NFC	2023-11-27 08:29:13 -08:00
Sander de Smalen	81b7f115fb	[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979 ) It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)	2023-11-22 08:52:53 +00:00
Fangrui Song	8e247b8f47	Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC	2023-10-27 00:30:41 -07:00
Alexey Bataev	e22818d5c9	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-05 06:17:07 -07:00
Arthur Eubanks	07389535a7	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit b186f1f68be11630355afb0c08b80374a6d31782. Causes crashes, see https://reviews.llvm.org/D158449.	2023-10-04 14:37:16 -07:00
Alexey Bataev	b186f1f68b	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-04 07:53:30 -07:00
Alexey Bataev	1129dec778	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix a crash reported in https://reviews.llvm.org/D158449.	2023-10-03 13:02:16 -07:00
Alexey Bataev	6f43d28f34	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-03 10:26:11 -07:00
Alexey Bataev	ebcb5d59fc	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.	2023-09-29 15:03:46 -07:00
Alexey Bataev	9f5960e004	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-29 13:16:03 -07:00
Alexey Bataev	3204f88a8b	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.	2023-09-28 11:57:32 -07:00
Alexey Bataev	c88c281cf1	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-28 11:03:21 -07:00
Bjorn Pettersson	fd05c34b18	Stop using legacy helpers indicating typed pointer types. NFC Since we no longer support typed LLVM IR pointer types, the code can be simplified into for example using PointerType::get directly instead of using Type::getInt8PtrTy and Type::getInt32PtrTy etc. Differential Revision: https://reviews.llvm.org/D156733	2023-08-02 12:08:37 +02:00
Teresa Johnson	85a13c716f	[IR] Add interface to remove a CallBase string function attribute Adds an interface to remove a string function attribute attached to a CallBase, and a corresponding unittest. This was extracted from D141077, and will be used by a follow on patch that removes memprof attributes when needed. Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D149192	2023-04-26 08:47:15 -07:00
Vasileios Porpodas	32b38d248f	[NFC] Rename Instruction::insertAt() to Instruction::insertInto(), to be consistent with BasicBlock::insertInto() Differential Revision: https://reviews.llvm.org/D140085	2022-12-15 12:27:45 -08:00
Vasileios Porpodas	adfb23c607	[NFC] Cleanup: Remove Function::getBasicBlockList() when not required. This is part of a series of patches that aim at making Function::getBasicBlockList() private. Differential Revision: https://reviews.llvm.org/D139910	2022-12-13 11:46:29 -08:00
Kazu Hirata	b6a01caa64	[llvm/unittests] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 22:10:37 -08:00
Vasileios Porpodas	606f790330	[IR][NFC] Adds Instruction::insertAt() for inserting at a specific point in the instr list. Currently the only way to do this is to work with the instruction list directly. This is part of a series of cleanup patches towards making BasicBlock::getInstList() private. Differential Revision: https://reviews.llvm.org/D138875	2022-11-29 20:15:10 -08:00
Nikita Popov	304f1d59ca	[IR] Switch everything to use memory attribute This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only. The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions. High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead. Differential Revision: https://reviews.llvm.org/D135780	2022-11-04 10:21:38 +01:00
Chuanqi Xu	645d2dd3a9	Revert "Don't treat readnone call in presplit coroutine as not access memory" This reverts commit 57224ff4a6833dca1f17568cc9cf77f9579030ae. This commit may trigger crashes on some workloads. Revert it for clearness.	2022-07-20 17:00:58 +08:00
Chuanqi Xu	57224ff4a6	Don't treat readnone call in presplit coroutine as not access memory To solve the readnone problems in coroutines. See https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015 for details. According to the discussion, we decide to fix the problem by inserting isPresplitCoroutine() checks in different passes instead of wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes. In this direction, we might not be able to cover every case at first. Let's take a "find and fix" strategy. Reviewed By: nikic, nhaehnle, jyknight Differential Revision: https://reviews.llvm.org/D127383	2022-07-20 10:37:23 +08:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though, I will follow up with patches to remove limtations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Kazu Hirata	d152e50c15	[llvm] Don't use Optional::{hasValue,getValue} (NFC)	2022-06-25 11:24:23 -07:00
Jack Andersen	09325d3606	[CAPI] Expose CastInst::getCastOpcode in C API Reviewed By: deadalnix Differential Revision: https://reviews.llvm.org/D91514	2022-04-30 18:40:04 -04:00
Serge Pavlov	881350a92d	Mapping of FP operations to constrained intrinsics A new function 'getConstrainedIntrinsic' is added, which for any gived instruction returns id of the corresponding constrained intrinsic. If there is no constrained counterpart for the instruction or the instruction is already a constrained intrinsic, the function returns zero. This is recommit of 115b3ace369254f573ca28934ef30ab9d8f497ef, reverted in 8160dd582b67430a5c24c836a57ae3c15cfa973c. Differential Revision: https://reviews.llvm.org/D69562	2022-03-31 11:07:47 +07:00
Serge Pavlov	8160dd582b	Revert "Mapping of FP operations to constrained intrinsics" This reverts commit 115b3ace369254f573ca28934ef30ab9d8f497ef. Starting from this commit the buildbot sanitizer-x86_64-linux-bootstrap-msan starts failing (build 10071). Reverted for investigation.	2022-03-30 16:46:43 +07:00
Serge Pavlov	115b3ace36	Mapping of FP operations to constrained intrinsics A new function 'getConstrainedIntrinsic' is added, which for any gived instruction returns id of the corresponding constrained intrinsic. If there is no constrained counterpart for the instruction or the instruction is already a constrained intrinsic, the function returns zero. Differential Revision: https://reviews.llvm.org/D69562	2022-03-30 12:21:30 +07:00
Serge Guelton	d2cc6c2d0c	Use a sorted array instead of a map to store AttrBuilder string attributes Using and std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599	2022-01-10 14:49:53 +01:00
Luke Benes	2249ecee8d	[IR][ShuffleVector] Fix Wdangling-else warning in InstructionsTest Fix a dangling else that gcc-11 warned about. The EXPECT_EQ macro expands to an if-else, so the whole construction contains a hidden dangling else. Differential Revision: https://reviews.llvm.org/D113346	2021-11-07 00:07:01 +03:00
Roman Lebedev	a5cd27880a	[IR] Improve member `ShuffleVectorInst::isReplicationMask()` When we have an actual shuffle, we can impose the additional restriction that the mask replicates the elements of the first operand, so we know the replication factor as a ratio of output and op0 vector sizes.	2021-11-06 00:09:27 +03:00
Roman Lebedev	0b36431810	[NFCI] InstructionTest: trim `InstructionsTest.ShuffleMaskIsReplicationMask_*` complexity These tests have pretty high O() complexity due to their nature, which leads to potentially-long runtimes. While in release build for me they took ~1 and ~2 sec, as noted in https://reviews.llvm.org/D113214#inline-1080479 they take minutes in debug build. Fine-tune the amount of permutations they deal with, without affecting the test coverage. After this, they take <~10ms each for me (in release build), hopefully that is good-enough for debug build too.	2021-11-05 19:22:48 +03:00
Roman Lebedev	01d8759ac9	[IR][ShuffleVector] Introduce `isReplicationMask()` matcher Avid readers of this saga may recall from previous installments, that replication mask replicates (lol) each of the `VF` elements in a vector `ReplicationFactor` times. For example, the mask for `ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`. More importantly, replication mask is used by LoopVectorizer when using masked interleaved memory operations. As discussed in previous installments, while it is used by LV, and we seem to support masked interleaved memory operations on X86, it's support in cost model leaves a lot to be desired: until basically yesterday even for AVX512 we had no cost model for it. As it has been witnessed in the recent AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()` costmodel patches, while it is hard-enough to query the cost of a particular assembly sequence [from llvm-mca], afterwards the check lines LV costmodel tests must be updated manually. This is, at the very least, boring. Okay, now we have decent costmodel coverage for interleaving shuffles, but now basically the same mind-killing sequence has to be performed for replication mask. I think we can improve at least the second half of the problem, by teaching the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize `Instruction::ShuffleVector` that are repetition masks, adding exhaustive test coverage using `-cost-model -analyze` + `utils/update_analyze_test_checks.py` This way we can have good exhaustive coverage for cost model, and only basic coverage for the LV costmodel. This patch adds precise undef-aware `isReplicationMask()`, with exhaustive test coverage. * `InstructionsTest.ShuffleMaskIsReplicationMask` shows that it correctly detects all the known masks. * `InstructionsTest.ShuffleMaskIsReplicationMask_undef` shows that replacing some mask elements in a known replication mask still allows us to recognize it as a replication mask. Note, with enough undef elts, we may detect a different tuple. * `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness` shows that if we detected the replication mask with given params, then if we actually generate a true replication mask with said params, it matches element-wise ignoring undef mask elements. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D113214	2021-11-05 16:53:47 +03:00

1 2 3

125 Commits