llvm-project

Author	SHA1	Message	Date
chuongg3	f6f8e202f5	[AArch64][GlobalISel] Refactor Combine G_CONCAT_VECTOR (#80866) The combine now works using tablegen and checks if new instruction is legal before creating it.	2024-02-15 10:09:20 +00:00
chuongg3	927b8a0f4f	[AArch64][GlobalISel] Combine vecreduce(ext) to {U/S}ADDLV (#75832)	2024-01-15 18:26:27 +00:00
chuongg3	fcfe1b6482	[GlobalISel] Refactor extractParts() (#75223) Moved extractParts() and extractVectorParts() from LegalizerHelper to Utils to be able to use it in different passes. extractParts() will also try to use unmerge when doing irregular splits where possible, falling back to extract elements when not.	2024-01-15 16:40:39 +00:00
Simon Pilgrim	afdc0c5c96	Fix MSVC signed/unsigned mismatch warning. NFC.	2023-11-15 12:00:32 +00:00
chuongg3	692fbd6c00	[AArch64][GlobalISel] Support udot lowering for vecreduce add (#70784) vecreduce_add(mul(ext, ext)) -> vecreduce_add(udot) vecreduce_add(ext) -> vecreduce_add(ext) Vectors of scalar size of 8-bits with element count of multiples of 8	2023-11-15 11:41:46 +00:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Amara Emerson	730e8f659d	[AArch64][GlobalISel] Fix global offset folding combine inserting MIs into wrong place. Was causing use-before-def issues. Not sure how it remained undetected for so long.	2023-09-08 06:28:12 -07:00
pvanhout	aaf6755631	[GlobalISel] Refactor Combiner API Remove CodeGen leftovers from the old combiner backend and adapt the API to fit the new backend better. It's now quite a bit closer to how InstructionSelector works. - `CombinerInfo` is now a simple "options" struct. - `Combiner` is now the base class of all TableGen'd combiner implementation. - Many fields have been moved from derived classes into that class. - It has been refactored to create & own the Observer and Builder. - `tryCombineAll` TableGen'd method can now be renamed, which allows targets to implement the actual `tryCombineAll` call manually and do whatever they want to do before/after it. Note: `CombinerHelper` needs to be mutable because none of its methods are const. This can be revisited later. Depends on D158710 Reviewed By: aemerson, dsanders Differential Revision: https://reviews.llvm.org/D158713	2023-09-05 08:19:05 +02:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
pvanhout	655714a300	[AArch64] Use GlobalISel MatchTable Combiner Backend Only a few minor test changes needed because I removed the "helper" suffix from the combiner name, as it's not really a helper anymore but more like the implementation itself. Depends on D153757 NOTE: This would land iff D153757 (RFC) lands too. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D153850	2023-07-11 11:27:14 +02:00
pvanhout	5eb8cb0949	[NFC][GlobalISel] Don't return `bool` from apply functions There is no case where those functions return false. They always return true. Even if they were to return false, it's not really something we should rely on I think. With the current combiner implementation, it would just make `tryCombineAll` return false without trying any more rules. I also believe that if an applyer were to return false, it would mean that the match function is not good enough. Asserting on failure in an apply function is a better idea, IMO. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153619	2023-06-26 09:23:58 +02:00
Amara Emerson	3daf7ddaef	[GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo. Before, the isPreLegalize() query in CombinerHelper only checked for the presence of a LegalizerInfo object. This is problematic when we want to have a combine actually check for legality in a pre-legalizer combine pass, since if we pass a LegalizerInfo object to the constructor it causes the combines to think that we're running post legalizer, which isn't true. This change fixes it to instead check an explicit bool that is passed to signal whether the pass will be run before or after legalization. Doing so exposed a bug in the extending loads combine, which tried to check for legality of candidate extending loads if LegalizerInfo was present. Since we only ran it pre-legalizer and therefore with a null LegalizerInfo, it never actually ran. Also fixes the legality checks to keep the tests passing. Differential Revision: https://reviews.llvm.org/D135044	2022-10-03 07:36:18 +01:00
Kazu Hirata	b5188591a0	[llvm] Remove redundant virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-24 21:50:35 -07:00
Martin Storsjö	8d7a17b7c8	[AArch64] Fix the upper limit for folded address offsets for COFF In COFF, the immediates in IMAGE_REL_ARM64_PAGEBASE_REL21 relocations are limited to 21 bit signed, i.e. the offset has to be less than (1 << 20). The previous limit did intend to cover for this case, but had missed that the 21 bit field was signed. This fixes issue https://github.com/llvm/llvm-project/issues/54753. Differential Revision: https://reviews.llvm.org/D123160	2022-04-06 22:54:13 +03:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Florian Hahn	f64580f8d2	[AArch64][GISel] Optimize 8 and 16 bit variants of uaddo. Try to simplify G_UADDO with 8 or 16 bit operands to wide G_ADD and TBNZ if result is only used in the no-overflow case. It is restricted to cases where we know that the high-bits of the operands are 0. If there's an overflow, then the 9th or 17th bit must be set, which can be checked using TBNZ. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D111888	2021-11-05 19:11:15 +01:00
Petar Avramovic	d477a7c2e7	GlobalISel/Utils: Refactor integer/float constant match functions Rework getConstantstVRegValWithLookThrough in order to make it clear if we are matching integer/float constant only or any constant(default). Add helper functions that get DefVReg and APInt/APFloat from constant instr getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal and getIConstantVRegSExtVal. These now only match G_CONSTANT as described in comment. Relevant matchers now return both DefVReg and APInt/APFloat. Replace existing uses of getConstantstVRegValWithLookThrough and getConstantVRegVal with new helper functions. Any constant match is only required in: ConstantFoldBinOp: for constant argument that was bit-cast of float to int getAArch64VectorSplat: AArch64::G_DUP operands can be any constant amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant In other places use integer only constant match. Differential Revision: https://reviews.llvm.org/D104409	2021-09-17 11:22:13 +02:00
Jon Roelofs	a642872476	[GISel] Support llvm.memcpy.inline Differential revision: https://reviews.llvm.org/D105072	2021-06-30 12:39:05 -07:00
Jessica Paquette	6d8b070d96	[AArch64][GlobalISel] Enable memcpy family combines on minsize functions The combines in `tryCombineMemCpyFamily` have heuristics (e.g. `TLI.getMaxStoresPerMemset`) which consider size. So, theoretically, enabling these combines on minsize functions shouldn't be harmful. With this enabled we save 0.9% geomean on CTMark at -Oz, and 5.1% on Bullet. There are no code size regressions. Differential Revision: https://reviews.llvm.org/D102198	2021-05-10 15:25:23 -07:00
Amara Emerson	5b158093e2	[AArch64][GlobalISel] Create a new minimal combiner pass just for -O0. We never bothered to have a separate set of combines for -O0 in the prelegalizer before. This results in some minor performance hits for a mode where performance isn't a concern (although not regressing code size significantly is still preferable). This also removes the CSE option since we don't need it for -O0. Through experiments, I've arrived at a set of combines that gets the most code size improvement at -O0, while reducing the amount of time spent in the combiner by around 35% give or take. Differential Revision: https://reviews.llvm.org/D102038	2021-05-07 17:01:27 -07:00
Jessica Paquette	4d41810cf6	[AArch64][GlobalISel] Don't match thread-local globals in matchFoldGlobalOffset SelectionDAG has separate ISD opcodes for regular global values and thread-local global values, while GlobalISel does not. This combine was ported from SDAG directly without knowing that. As a result, it was running on TLS globals. This makes it so that `matchFoldGlobalOffset` doesn't match on TLS globals, and adds an assert to `selectTLSGlobalValue` to make sure that TLS globals never have offsets. Differential Revision: https://reviews.llvm.org/D101478	2021-04-28 13:48:18 -07:00
Jessica Paquette	23f657c165	[AArch64][GlobalISel] Emit bzero on Darwin Darwin platforms for both AArch64 and X86 can provide optimized `bzero()` routines. In this case, it may be preferable to use `bzero` in place of a memset of 0. This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can be generated by platforms which may want to use bzero. To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The conditions for this are largely a port of the bzero case in `AArch64SelectionDAGInfo::EmitTargetCodeForMemset`. The only difference in comparison to the SelectionDAG code is that, when compiling for minsize, this will fire for all memsets of 0. The original code notes that it's not beneficial to do this for small memsets; however, using bzero here will save a mov from wzr. For minsize, I think that it's preferable to prioritise omitting the mov. This also fixes a bug in the libcall legalization code which would delete instructions which could not be legalized. It also adds a check to make sure that we actually get a libcall name. Code size improvements (Darwin): - CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign) - CTMark -Oz: -0.2% geomean (-0.5% on bullet) Differential Revision: https://reviews.llvm.org/D99358	2021-03-25 17:14:25 -07:00
Jessica Paquette	0ca83730cc	Recommit "[AArch64][GlobalISel] Fold constants into G_GLOBAL_VALUE" This reverts commit 962b73dd0fc3906980e597f72a35eee7121cc5e2. This commit was reverted because of some internal SPEC test failures. It turns out that this wasn't actually relevant to anything in open source, so it's safe to recommit this.	2021-03-18 16:01:02 -07:00
Jessica Paquette	962b73dd0f	Revert "[AArch64][GlobalISel] Fold constants into G_GLOBAL_VALUE" This reverts commit 61b4702a408834228c1c139b0e9af98616774db4. We were seeing some test failures in SPECINT2006 due to this change. Reverting to investigate.	2021-02-16 10:50:12 -08:00
Jessica Paquette	61b4702a40	[AArch64][GlobalISel] Fold constants into G_GLOBAL_VALUE This pretty much just ports `performGlobalAddressCombine` from AArch64ISelLowering. (AArch64 doesn't use the generic DAG combine for this.) This adds a pre-legalize combine which looks for this pattern: ``` %g = G_GLOBAL_VALUE @x %ptr1 = G_PTR_ADD %g, cst1 %ptr2 = G_PTR_ADD %g, cst2 ... %ptrN = G_PTR_ADD %g, cstN ``` And then, if possible, transforms it like so: ``` %g = G_GLOBAL_VALUE @x %offset_g = G_PTR_ADD %g, -min(cst) %ptr1 = G_PTR_ADD %offset_g, cst1 %ptr2 = G_PTR_ADD %offset_g, cst2 ... %ptrN = G_PTR_ADD %offset_g, cstN ``` Where min(cst) is the smallest out of the G_PTR_ADD constants. This means we should save at least one G_PTR_ADD. This also updates code in the legalizer + selector which assumes that G_GLOBAL_VALUE will never have an offset and adds/updates relevant tests. Differential Revision: https://reviews.llvm.org/D96624	2021-02-12 14:55:15 -08:00
Amara Emerson	12b9b778d9	[AArch64][GlobalISel] Enable CSE for the prelegalizer combiner. Differential Revision: https://reviews.llvm.org/D95647	2021-01-28 16:38:49 -08:00
Amara Emerson	be62b3ba34	[AArch64][GlobalISel] Add a combine to fold away truncate in: G_ICMP EQ/NE (G_TRUNC(v), 0) We try to do this optimization if we can determine that testing for the truncated bits with an eq/ne predicate results in the same thing as testing the lower bits. Differential Revision: https://reviews.llvm.org/D95645	2021-01-28 16:29:14 -08:00
Jessica Paquette	147b9497e7	[AArch64][GlobalISel] Split post-legalizer combiner to allow for lowering at -O0 There are a lot of combines in AArch64PostLegalizerCombiner which exist to facilitate instruction matching in the selector. (E.g. matching for G_ZIP and other shuffle vector pseudos) It still makes sense to select these instructions at -O0. Matching earlier in a combiner can reduce complexity in the selector significantly. For example, a good portion of our selection code for compares would be a lot easier to represent in a combine. This patch moves matching combines into an "AArch64PostLegalizerLowering" combiner which runs at all optimization levels. Also, while we're here, improve the documentation for the AArch64PostLegalizerCombiner, and fix up the filepath in its file comment. And also add a 'r' which somehow got dropped from a bunch of function names. https://reviews.llvm.org/D89820	2020-10-22 14:43:25 -07:00
Matt Arsenault	0b7f6cc71a	GlobalISel: Add generic instructions for memory intrinsics AArch64, X86 and Mips currently directly consume these and custom lower them to produce a libcall, but really these should follow the normal legalization process through the libcall/lower action.	2020-08-26 20:08:45 -04:00
Daniel Sanders	e35ba09961	[gicombiner] Allow generated combiners to store additional members Summary: Adds the ability to add members to a generated combiner via a State base class. In the current AArch64PreLegalizerCombiner this is used to make Helper available without having to provide it to every call. As part of this, split the command line processing into a separate object so that it still only runs once even though the generated combiner is constructed more frequently. Depends on D81862 Reviewers: aditya_nandakumar, bogner, volkan, aemerson, paquette, arsenm Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81863	2020-06-16 14:47:04 -07:00
Amara Emerson	e53f558057	[AArch64][GlobalISel] Move GlobalISel source files to a dedicated subdir. Differential Revision: https://reviews.llvm.org/D81116	2020-06-04 10:51:38 -07:00

31 Commits