llvm-project

Author	SHA1	Message	Date
David Green	ac321cbb03	[AArch64][GlobalISel] Legalize Insert vector element (#81453) This attempts to standardize and extend some of the insert vector element lowering. Most notably: - More types are handled by splitting illegal vectors. - The index type for G_INSERT_VECTOR_ELT is canonicalized to TLI.getVectorIdxTy(), similar to extract_vector_element. - Some of the existing patterns now have the index type specified to make sure they can apply to GISel too. - The C++ selection code has been removed, relying on tablegen patterns. - G_INSERT_VECTOR_ELT with small GPR input elements are pre-selected to use a i32 type, allowing the existing patterns to apply. - Variable index inserts are lowered in post-legalizer lowering, expanding into a stack store and reload.	2024-04-08 08:44:13 +01:00
Tuan Chuong Goh	13a78fd1ac	[AArch64][GlobalISel] Re-commit Legalize G_SHUFFLE_VECTOR for Odd-Sized Vectors (#83038) Legalize smaller/larger than legal vectors with i8 and i16 element sizes. Vectors with elements smaller than i8 will get widened to i8 elements.	2024-03-04 15:03:55 +00:00
David Majnemer	3dd6750027	[AArch64] Add more complete support for BF16 We can use a small amount of integer arithmetic to round FP32 to BF16 and extend BF16 to FP32. While a number of operations still require promotion, this can be reduced for some rather simple operations like abs, copysign, fneg but these can be done in a follow-up. A few neat optimizations are implemented: - round-inexact-to-odd is used for F64 to BF16 rounding. - quieting signaling NaNs for f32 -> bf16 tries to detect if a prior operation makes it unnecessary.	2024-03-03 22:39:50 +00:00
chuongg3	a7cfff8dc6	[AArch64][GlobalISel] Lower Shuffle Vector to REV (#79591) Add lowering for i16 and i32 vectors for Shuffle Vector instructions with REV mask	2024-01-28 20:35:02 +00:00
David Green	c0931d4950	[AArch64][GlobalISel] Lower scalarizing G_UNMERGE_VALUES to G_EXTRACT_VECTOR_ELT This adds post-legalizing lowering of G_UNMERGE_VALUES which take a vector and produce scalar values for each lane. They are converted to a G_EXTRACT_VECTOR_ELT for each lane, allowing all the existing tablegen patterns to apply to them. A couple of tablegen patterns need to be altered to make sure the type of the constant operand is known, so that the patterns are recognized under global isel. Closes #75662	2023-12-21 09:22:23 +00:00
chuongg3	45f51f9f7c	[AArch64][GlobalISel] Select UMULL instruction (#65469) Global ISel now selects `UMULL` and `UMULL2` instructions. G_MUL instructions with input operands coming from `SEXT` or `ZEXT` operations are turned into UMULL. G_MUL instructions with a v2s64 result type are always scalarised, except for: `mul ( unmerge( ext ), unmerge( ext ))` In that case the extend can be unmerged and the unmerge in the middle folded away: `mul ( unmerge( ext ), unmerge( ext ))` => `mul ( unmerge( merge( ext( unmerge )), unmerge( merge( ext( unmerge ))))` => `mul ( ext(unmerge)), ( ext( unmerge )))`	2023-09-25 09:34:51 +01:00
Vladislav Dzhidzhoev	13b7629a58	[GlobalISel][AArch64] Combine unmerge(G_EXT v, undef) to unmerge(v). When having <N x t> d1, unused = unmerge(G_EXT <2*N x t> v1, undef, N), it is possible to express it just as unused, d1 = unmerge v1. It is useful for tackling regressions in arm64-vcvt_f.ll, introduced in https://reviews.llvm.org/D144670.	2023-09-05 16:14:44 +02:00
Vladislav Dzhidzhoev	7eeeeb0cc9	Revert "[GlobalISel][AArch64] Combine unmerge(G_EXT v, undef) to unmerge(v)." This reverts commit 6b37a65264bb4e7d400d5283a65f9e8e1575f2d7. Accidentally pushed before squashing.	2023-09-05 16:13:27 +02:00
Vladislav Dzhidzhoev	bb1a03df47	Addressed @aemerson comments	2023-09-05 16:00:49 +02:00
Vladislav Dzhidzhoev	0e826f0e6d	Refactored, added MIR test.	2023-09-05 16:00:48 +02:00
Vladislav Dzhidzhoev	6b37a65264	[GlobalISel][AArch64] Combine unmerge(G_EXT v, undef) to unmerge(v). When having <N x t> d1, unused = unmerge(G_EXT <2*N x t> v1, undef, N), it is possible to express it just as unused, d1 = unmerge v1. It is useful for tackling regressions in arm64-vcvt_f.ll, introduced in https://reviews.llvm.org/D144670.	2023-09-05 16:00:48 +02:00
pvanhout	aaf6755631	[GlobalISel] Refactor Combiner API Remove CodeGen leftovers from the old combiner backend and adapt the API to fit the new backend better. It's now quite a bit closer to how InstructionSelector works. - `CombinerInfo` is now a simple "options" struct. - `Combiner` is now the base class of all TableGen'd combiner implementation. - Many fields have been moved from derived classes into that class. - It has been refactored to create & own the Observer and Builder. - `tryCombineAll` TableGen'd method can now be renamed, which allows targets to implement the actual `tryCombineAll` call manually and do whatever they want to do before/after it. Note: `CombinerHelper` needs to be mutable because none of its methods are const. This can be revisited later. Depends on D158710 Reviewed By: aemerson, dsanders Differential Revision: https://reviews.llvm.org/D158713	2023-09-05 08:19:05 +02:00
David Green	a047dfe0d5	[AArch64][GISel] Lower EXT of 0 to a COPY This allows us to select G_SHUFFLE_VECTOR with identity masks (possibly including undef elements), but avoid the actual EXT instruction if the shift amount is 0.	2023-08-16 17:12:15 +01:00
David Green	bbe945b8a1	[AArch64][GISel] Expand G_DUP and G_DUPLANE to v8s8 and v4s16 This fills in the gaps with v8s8 and v4s8 vectors for G_DUP and G_DUPLANE, using the existing code that is generalized to more types.	2023-08-04 12:43:53 +01:00
pvanhout	af67b6760b	[AArch64] Split lowerVectorFCMP combine It's the only combine (AFAIK) that didn't use an apply function. There is no reason for it to mutate instructions in the matcher, so split it up. Reviewed By: aemerson, arsenm Differential Revision: https://reviews.llvm.org/D154947	2023-07-12 13:13:37 +02:00
pvanhout	655714a300	[AArch64] Use GlobalISel MatchTable Combiner Backend Only a few minor test changes needed because I removed the "helper" suffix from the combiner name, as it's not really a helper anymore but more like the implementation itself. Depends on D153757 NOTE: This would land iff D153757 (RFC) lands too. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D153850	2023-07-11 11:27:14 +02:00
pvanhout	5eb8cb0949	[NFC][GlobalISel] Don't return `bool` from apply functions There is no case where those functions return false. They always return true. Even if they were to return false, it's not really something we should rely on I think. With the current combiner implementation, it would just make `tryCombineAll` return false without trying any more rules. I also believe that if an applyer were to return false, it would mean that the match function is not good enough. Asserting on failure in an apply function is a better idea, IMO. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153619	2023-06-26 09:23:58 +02:00
David Green	c663b2576f	[AArch64][GISel] Add FP16 fcmp lowering This adds v4f16 and v8f16 lowering for fp16 vector compares. It splits the getActionDefinitionsBuilder of G_FCMP from G_ICMP, as they are quite different operations, and adds fp16 vector lowering. Differential Revision: https://reviews.llvm.org/D147947	2023-04-17 17:22:46 +01:00
Tim Northover	6b98824a58	AArch64: emit `fcmp ord %a, zeroinitializer` as a single fcmeq. Most "ord" checks need two real-world compares to implement, but this is the canonical form of a "!isnan" check, which is equivalent to comparing the input for equality against itself.	2022-12-07 19:17:30 +00:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Kazu Hirata	f0105ee968	[GISel] Use std::optional in AArch64PostLegalizerLowering.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 22:20:07 -08:00
Amara Emerson	8055aa8e8a	[AArch64][GlobalISel] Make vector G_SEXT_INREG legal and allow combining. As a result of making these legal, and tweaking the combine to allow vectors, we generate vector G_SEXT_INREG during legalization. The reason we want to make these legal in the first place is to allow for more combine opportunities. Once those have been done, we can just lower them back to shifts in the post-legalizer lowering. This needs to be one commit otherwise we start causing tests to fail due to incomplete support for selection etc.	2022-10-05 00:28:08 +01:00
Amara Emerson	3daf7ddaef	[GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo. Before, the isPreLegalize() query in CombinerHelper only checked for the presence of a LegalizerInfo object. This is problematic when we want to have a combine actually check for legality in a pre-legalizer combine pass, since if we pass a LegalizerInfo object to the constructor it causes the combines to think that we're running post legalizer, which isn't true. This change fixes it to instead check an explicit bool that passes to signal whether the pass will be run before or after legalization. Doing so exposed a bug in the extending loads combine, which tried to check for legality of candidate extending loads if LegalizerInfo was present. Since we only ran it pre-legalizer and therefore with a null LegalizerInfo, it never actually ran. Also fixes the legality checks to keep the tests passing. Differential Revision: https://reviews.llvm.org/D135044	2022-10-03 07:36:18 +01:00
Amara Emerson	7653586d88	[AArch64][GlobalISel] Implement another combine for shufflevector->AArch64 G_EXT. This is a port of an existing optimization in AArch64 ISelLowering, handling a case when the same input vector can be used for both ext inputs. Differential Revision: https://reviews.llvm.org/D134891	2022-09-29 22:53:24 +01:00
Kazu Hirata	b5188591a0	[llvm] Remove redundant virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-24 21:50:35 -07:00
Kazu Hirata	3a3cb929ab	[llvm] Use = default (NFC)	2022-02-06 22:18:35 -08:00
Petar Avramovic	d477a7c2e7	GlobalISel/Utils: Refactor integer/float constant match functions Rework getConstantstVRegValWithLookThrough in order to make it clear if we are matching integer/float constant only or any constant(default). Add helper functions that get DefVReg and APInt/APFloat from constant instr getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal and getIConstantVRegSExtVal. These now only match G_CONSTANT as described in comment. Relevant matchers now return both DefVReg and APInt/APFloat. Replace existing uses of getConstantstVRegValWithLookThrough and getConstantVRegVal with new helper functions. Any constant match is only required in: ConstantFoldBinOp: for constant argument that was bit-cast of float to int getAArch64VectorSplat: AArch64::G_DUP operands can be any constant amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant In other places use integer only constant match. Differential Revision: https://reviews.llvm.org/D104409	2021-09-17 11:22:13 +02:00
Amara Emerson	56a6686e0c	[AArch64][GlobalISel] Don't form truncstores in postlegalizer-lowering for s128. We don't support truncating s128 stores, so don't form them.	2021-07-20 00:04:34 -07:00
Sander de Smalen	c9acd2f32e	[GlobalISel] NFC: Change LLT::changeNumElements to LLT::changeElementCount. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104453	2021-06-25 15:54:00 +01:00
Sander de Smalen	d5e14ba88c	[GlobalISel] NFC: Change LLT::vector to take ElementCount. This also adds new interfaces for the fixed- and scalable case: * LLT::fixed_vector * LLT::scalable_vector The strategy for migrating to the new interfaces was as follows: * If the new LLT is a (modified) clone of another LLT, taking the same number of elements, then use LLT::vector(OtherTy.getElementCount()) or if the number of elements is halved/doubled, it uses .divideCoefficientBy(2) or operator*. That is because there is no reason to specifically restrict the types to 'fixed_vector'. If the algorithm works on the number of elements (as unsigned), then just use fixed_vector. This will need to be fixed up in the future when modifying the algorithm to also work for scalable vectors, and will then need additional tests to confirm the behaviour works the same for scalable vectors. * If the test used the `/*Scalable=*/true` flag of LLT::vector, then this is replaced by LLT::scalable_vector. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104451	2021-06-24 11:26:12 +01:00
Amara Emerson	ae2b36e8bd	[AArch64][GlobalISel] Support truncstorei8/i16 w/ combine to form truncating G_STOREs. This needs some tablegen changes so that we can actually import the patterns properly. Differential Revision: https://reviews.llvm.org/D102204	2021-05-11 11:33:03 -07:00
Jessica Paquette	79be9c59c6	[AArch64][GlobalISel] Add post-legalizer lowering for NEON vector fcmps This is roughly equivalent to the floating point portion of `AArch64TargetLowering::LowerVSETCC`. Main part that's missing is the v4s16 bit. This also adds helpers equivalent to `EmitVectorComparison`, and `changeVectorFPCCToAArch64CC`. This moves `changeFCMPPredToAArch64CC` out of the selector into AArch64GlobalISelUtils for the sake of code reuse. This is done in post-legalizer lowering with pseudos to simplify selection. The imported patterns end up handling selection for us this way. Differential Revision: https://reviews.llvm.org/D101782	2021-05-10 15:40:06 -07:00
Jessica Paquette	49c3565b9b	[AArch64][GlobalISel] Swap compare operands when it may be profitable This adds support for swapping comparison operands when it may introduce new folding opportunities. This is roughly the same as the code added to AArch64ISelLowering in 162435e7b5e026b9f988c730bb6527683f6aa853. For an example of a testcase which exercises this, see llvm/test/CodeGen/AArch64/swap-compare-operands.ll (Godbolt for that testcase: https://godbolt.org/z/43WEMb) The idea behind this is that sometimes, we may be able to fold away, say, a shift or extend in a compare by swapping its operands. e.g. in the case of this compare: ``` lsl x8, x0, #1 cmp x8, x1 cset w0, lt ``` The following is equivalent: ``` cmp x1, x0, lsl #1 cset w0, gt ``` Most of the code here is just a reimplementation of what already exists in AArch64ISelLowering. (See `getCmpOperandFoldingProfit` and `getAArch64Cmp` for the equivalent code.) Note that most of the AND code in the testcase doesn't actually fold. It seems like we're missing selection support for that sort of fold right now, since SDAG happily folds these away (e.g testSwapCmpWithShiftedZeroExtend8_32 in the original .ll testcase) Differential Revision: https://reviews.llvm.org/D89422	2021-04-09 15:46:48 -07:00
David Blaikie	b627802e81	Remove unused variable (rolling it into an assert)	2021-03-09 16:06:44 -08:00
Amara Emerson	45a9dca015	[AArch64][GlobalISel] Form G_DUPLANE32 for <2 x s32> shufflevectors in lowering. For <2 x s32>, we can use G_DUPLANE32, but with a <4 x s32> source. To make it work, we can just widen the original source with a concat_vectors. Doing this allows <2 x float> indexed fmul instruction selection patterns to fire, which gives a nice 0.3% code size saving on Bullet with -Os. Differential Revision: https://reviews.llvm.org/D98059	2021-03-09 11:36:26 -08:00
Jessica Paquette	5c26be214d	[AArch64][GlobalISel] Lower G_BUILD_VECTOR -> G_DUP If we have ``` %vec = G_BUILD_VECTOR %reg, %reg, ..., %reg ``` Then lower it to ``` %vec = G_DUP %reg ``` Also update the selector to handle constant splats on G_DUP. This will not combine when the splat is all zeros or ones. Tablegen-imported patterns rely on these being G_BUILD_VECTOR. Minor code size improvements on CTMark at -Os. Also adds some utility functions to make it a bit easier to recognize splats, and an AArch64-specific splat helper. Differential Revision: https://reviews.llvm.org/D97731	2021-03-08 13:01:10 -08:00
Jessica Paquette	daf7d7f0dc	[AArch64][GlobalISel] Correct function evaluation order in applyINS The order in which the nested calls to Builder.buildWhatever are evaluated in differs between GCC and Clang. This caused a bot failure because the MIR in the testcase was coming out in a different order than expected. Rather than using nested calls, pull them out in order to fix the order of evaluation.	2021-02-23 16:21:11 -08:00
Jessica Paquette	ef1f7f1d7d	Recommit "[AArch64][GlobalISel] Match G_SHUFFLE_VECTOR -> insert elt + extract elt" Attempted fix for the added test failing. https://lab.llvm.org/buildbot/#/builders/104/builds/2355/steps/5/logs/stdio I can't reproduce the failure anywhere, so I'm going to guess that passing a std::function as MatchInfo is sketchy in this context. Switch it to a std::tuple and hope for the best.	2021-02-23 11:55:16 -08:00
Jessica Paquette	662402a8b3	Revert "[AArch64][GlobalISel] Match G_SHUFFLE_VECTOR -> insert elt + extract elt" This reverts commit 867e379c0e14527eb7aa68485a10324693e35f5d. For some reason this is upsetting Linux/Windows bots. Reverting while I try to reproduce.	2021-02-22 17:36:17 -08:00
Jessica Paquette	867e379c0e	[AArch64][GlobalISel] Match G_SHUFFLE_VECTOR -> insert elt + extract elt Match a G_SHUFFLE_VECTOR with a mask that allows it to be represented as a G_INSERT_VECTOR_ELT and a G_EXTRACT_VECTOR_ELT. This ports `isINSMask` from AArch64ISelLowering and the portion of `AArch64TargetLowering::LowerVECTOR_SHUFFLE` which handles the equivalent transformation. This provides more opportunities for matching DUP. We don't have all of the necessary combines to actually make DUP out of these yet, but this is better for size than the full TBL expansion for G_SHUFFLE_VECTOR. This is a -0.1% code size improvement on CTMark/Bullet at -Os. IR example: https://godbolt.org/z/sdcevT Differential Revision: https://reviews.llvm.org/D97214	2021-02-22 14:44:09 -08:00
Matt Arsenault	581d13f8ae	GlobalISel: Return APInt from getConstantVRegVal Returning int64_t was arbitrarily limiting for wide integer types, and the functions should handle the full generality of the IR. Also changes the full form which returns the originally defined vreg. Add another wrapper for the common case of just immediately converting to int64_t (arguably this would be useful for the full return value case as well). One possible issue with this change is some of the existing uses did break without conversion to getConstantVRegSExtVal, and it's possible some without adequate test coverage are now broken.	2020-12-22 22:23:58 -05:00
Jessica Paquette	b184a2eccf	[GlobalISel] Add matchers for specific constants and a matcher for negations It's fairly common to need matchers for a specific constant value, or for common idioms like finding a negated register. Add - `m_SpecificICst`, which returns true when matching a specific value. - `m_ZeroInt`, which returns true when an integer 0 is matched. - `m_Neg`, which returns true when a register is negated. Also update a few places which use idioms related to the new matchers. Differential Revision: https://reviews.llvm.org/D91397	2020-11-13 09:24:54 -08:00
Amara Emerson	f347d78cca	[AArch64][GlobalISel] Add AArch64::G_DUPLANE[X] opcodes for lane duplicates. These were previously handled by pattern matching shuffles in the selector, but adding a new opcode and making it equivalent to the AArch64duplane SDAG node allows us to select more patterns, like lane indexed FMLAs (patch adding a test for that will be committed later). The pattern matching code has been simply moved to postlegalize lowering. Differential Revision: https://reviews.llvm.org/D90820	2020-11-05 11:18:11 -08:00
Jessica Paquette	19dc9c9780	[AArch64][GlobalISel] Move imm adjustment for G_ICMP to post-legalizer lowering Move the code which adjusts the immediate/predicate on a G_ICMP to AArch64PostLegalizerLowering. This - Reduces the number of places we need to test for optimized compares in the selector. We know that the compare should have been simplified by the time it hits the selector, so we can avoid testing this in selects, brconds, etc. - Allows us to potentially fold more compares (previously, this optimization was only done after calling `tryFoldCompare`, this may allow us to hit some more TST cases) - Simplifies the selection code in `emitIntegerCompare` significantly; we can just use an emitSUBS function. - Allows us to avoid checking that the predicate has been updated after `emitIntegerCompare`. Also add a utility header file for things that may be useful in the selector and various combiners. No need for an implementation file at this point, since it's just one constexpr function for now. I've run into a couple cases where having one of these would be handy, so might as well add it here. There are a couple functions in the selector that can probably be factored out into here. Differential Revision: https://reviews.llvm.org/D89823	2020-10-22 15:27:36 -07:00
Jessica Paquette	147b9497e7	[AArch64][GlobalISel] Split post-legalizer combiner to allow for lowering at -O0 There are a lot of combines in AArch64PostLegalizerCombiner which exist to facilitate instruction matching in the selector. (E.g. matching for G_ZIP and other shuffle vector pseudos) It still makes sense to select these instructions at -O0. Matching earlier in a combiner can reduce complexity in the selector significantly. For example, a good portion of our selection code for compares would be a lot easier to represent in a combine. This patch moves matching combines into an "AArch64PostLegalizerLowering" combiner which runs at all optimization levels. Also, while we're here, improve the documentation for the AArch64PostLegalizerCombiner, and fix up the filepath in its file comment. And also add a 'r' which somehow got dropped from a bunch of function names. https://reviews.llvm.org/D89820	2020-10-22 14:43:25 -07:00
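
The BF16 commit listed above (3dd6750027) notes that "a small amount of integer arithmetic" is enough to round FP32 to BF16 and to extend BF16 to FP32. The following is a minimal standalone sketch of that general idea, assuming round-to-nearest-even and a simple NaN-quieting step. The helper names `fp32ToBF16` and `bf16ToFP32` are hypothetical and are not LLVM APIs, and the code is not taken from the commit itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): round an FP32 value to BF16 using
// round-to-nearest-even, working only on the integer bit pattern.
inline uint16_t fp32ToBF16(float Value) {
  uint32_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));

  // NaN: keep the top 16 bits and force the quiet bit so the result is
  // still a NaN after the low payload bits are dropped.
  if ((Bits & 0x7fffffffu) > 0x7f800000u)
    return static_cast<uint16_t>((Bits >> 16) | 0x0040u);

  // Add 0x7fff plus the lowest kept bit, so exact ties round to even.
  uint32_t Lsb = (Bits >> 16) & 1u;
  Bits += 0x7fffu + Lsb;
  return static_cast<uint16_t>(Bits >> 16);
}

// Hypothetical helper: extending BF16 to FP32 is a 16-bit left shift.
inline float bf16ToFP32(uint16_t Half) {
  uint32_t Bits = static_cast<uint32_t>(Half) << 16;
  float Value;
  std::memcpy(&Value, &Bits, sizeof(Value));
  return Value;
}

// Small usage check: 1.0f survives a round trip unchanged.
inline void checkBF16RoundTrip() {
  assert(bf16ToFP32(fp32ToBF16(1.0f)) == 1.0f);
}
```

The sketch only covers the FP32 case; the round-inexact-to-odd step that the commit describes for F64 to BF16 is not shown here.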

45 Commits