llvm-project

Author	SHA1	Message	Date
Jack Andersen	f108c7f59d	[GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues. Expanding on D109750. Since `DBG_VALUE` instructions have final register validity determined in `LDVImpl::handleDebugValue`, there is no apparent reason to immediately prune unused register operands as their defs are erased. Consequently, this renders `MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval` moot, giving a substantial performance improvement. The only necessary changes involve making relevant passes consider invalid DBG_VALUE vreg uses as valid. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D112852	2021-12-05 15:55:59 -05:00
Abinav Puthan Purayil	bc5dbb0bae	[GlobalISel] Add matchers for constant splat. This change exposes isBuildVectorConstantSplat() to the llvm namespace and uses it to implement the constant splat versions of m_SpecificICst(). CombinerHelper::matchOrShiftToFunnelShift() can now work with vector types and CombinerHelper::matchMulOBy2()'s match for a constant splat is simplified. Differential Revision: https://reviews.llvm.org/D114625	2021-11-30 15:18:50 +05:30
Mirko Brkusanin	0dd570ff56	[AMDGPU][GlobalISel] Transform (fsub (fpext (fneg (fmul x, y))), z) -> (fneg (fma (fpext x), (fpext y), z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98050	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	37c2a2201d	[AMDGPU][GlobalISel] Transform (fsub (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), (fneg z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98049	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	5fe7fcd28e	[AMDGPU][GlobalISel] Transform (fsub (fneg (fmul x, y)), z) -> (fma (fneg x), y, (fneg z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98048	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	a782169270	[AMDGPU][GlobalISel] Transform (fsub (fmul x, y), z) -> (fma x, y, -z) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D96614	2021-11-29 16:27:22 +01:00
Mirko Brkusanin	e5e49a08f1	[AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fpext (fmul u, v))), z) -> (fma x, y, (fma (fpext u), (fpext v), z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D98047	2021-11-29 16:27:21 +01:00
Mirko Brkusanin	f732292536	[AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y, (fma u, v, z)) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D97938	2021-11-29 16:27:21 +01:00
Mirko Brkusanin	8951136216	[AMDGPU][GlobalISel] Transform (fadd (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), z) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D97937	2021-11-29 16:27:21 +01:00
Mirko Brkusanin	881840fc26	[AMDGPU][GlobalISel] Transform (fadd (fmul x, y), z) -> (fma x, y, z) Patch by: Mateja Marjanovic Differential Revision: https://reviews.llvm.org/D93305	2021-11-29 16:27:21 +01:00
Abinav Puthan Purayil	4af45f10cc	[GlobalISel] Fold or of shifts to funnel shift. This change folds a basic funnel shift idiom: - (or (shl x, amt), (lshr y, sub(bw, amt))) -> fshl(x, y, amt) - (or (shl x, sub(bw, amt)), (lshr y, amt)) -> fshr(x, y, amt) This also helps in folding to rotate shift if x and y are equal since we already have a funnel shift to rotate combine. Differential Revision: https://reviews.llvm.org/D114499	2021-11-26 17:05:29 +05:30
Kazu Hirata	259cd6f893	[llvm] Use range-based for loops (NFC)	2021-11-25 22:17:10 -08:00
Kazu Hirata	d45cb1d7ea	[llvm] Use range-based for loops (NFC)	2021-11-23 08:54:48 -08:00
Mirko Brkusanin	db6bc2ab51	[AMDGPU][GlobalISel] Fold G_FNEG above when users cannot fold mods If possible fold fneg into instruction above if users cannot fold mods and we know it will decrease instruction count. Follows same logic as SDAG combiner in choosing opportunities to combine. Differential Revision: https://reviews.llvm.org/D112827	2021-11-17 14:25:13 +01:00
Jon Roelofs	b046eb19b8	[AArch64][GlobalISel] combine (and (or x, c1), c2) => (and x, c2) iff c1 & c2 == 0 https://godbolt.org/z/h8ejrG4hb rdar://83597585 Differential Revision: https://reviews.llvm.org/D111856	2021-10-20 12:11:52 -07:00
Jon Roelofs	1300677f97	[AArch64][GlobalISel] combine and + [la]sr => ubfx https://godbolt.org/z/h8ejrG4hb rdar://83597585 Differential Revision: https://reviews.llvm.org/D111839	2021-10-18 10:33:01 -07:00
Amara Emerson	53ebfa7c5d	[AArch64][GlobalISel] Fix combiner assertion in matchConstantOp(). We shouldn't call APInt::getSExtValue() on a >64b value.	2021-10-11 15:55:13 -07:00
Roman Lebedev	684cbae89a	[KnownBits] Introduce `countMaxActiveBits()` and use it in a few places	2021-10-11 23:36:06 +03:00
Amara Emerson	f95d9c95bb	[GlobalISel] Fix the stores of truncates -> wide store combine for non-evenly dividing type sizes. If the wide store we'd generate is not a multiple of the memory type of the narrow stores (e.g. s48 and s32), we'd assert. Fix that.	2021-10-09 21:18:20 -07:00
Amara Emerson	17b89f9daa	[GlobalISel] Improve G_UMULH -> LSHR combine to accept non-uniform constant vectors.	2021-10-08 11:25:26 -07:00
Mirko Brkusanin	d20840c937	[GlobalISel] Combine for eliminating redundant operand negations Differential Revision: https://reviews.llvm.org/D111319	2021-10-08 14:29:22 +02:00
Amara Emerson	08b3c0d995	[GlobalISel] Combine (G_UMULH x, (1 << c)) -> x >> (bitwidth - c) In order to not generate an unnecessary G_CTLZ, I extended the constant folder in the CSEMIRBuilder to handle G_CTLZ. I also added some extra handling of vector constants too. It seems we don't have any support for doing constant folding of vector constants, so the tests show some other useless G_SUB instructions too. Differential Revision: https://reviews.llvm.org/D111036	2021-10-07 23:51:37 -07:00
Amara Emerson	8bfc0e06dc	[GlobalISel] Port the udiv -> mul by constant combine. This is a straight port from the equivalent DAG combine. Differential Revision: https://reviews.llvm.org/D110890	2021-10-07 11:37:17 -07:00
Mirko Brkusanin	40e00063bc	[GlobalISel] Combine fabs(fneg(x)) to fabs(x) Differential Revision: https://reviews.llvm.org/D110943	2021-10-05 13:43:39 +02:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Amara Emerson	ca8316b704	[GlobalISel] Extend CombinerHelper::matchConstantOp() to match constant splat vectors. This allows the "x op 0 -> x" fold to optimize vector constant RHSs. Differential Revision: https://reviews.llvm.org/D110802	2021-09-30 14:31:25 -07:00
Amara Emerson	80f4bb5c61	[GlobalISel] Extend G_SELECT of known condition combine to vectors. Adds a new utility function: isConstantOrConstantSplatVector(). Differential Revision: https://reviews.llvm.org/D110786	2021-09-30 12:16:44 -07:00
Jessica Paquette	15a24e1fdb	[GlobalISel] Combine mulo x, 2 -> addo x, x Similar to what SDAG does when it sees a smulo/umulo against 2 (see: `DAGCombiner::visitMULO`) This pattern is fairly common in Swift code AFAICT. Here's an example extracted from a Swift testcase: https://godbolt.org/z/6cT8Mesx7 Differential Revision: https://reviews.llvm.org/D110662	2021-09-28 16:59:43 -07:00
Petar Avramovic	d477a7c2e7	GlobalISel/Utils: Refactor integer/float constant match functions Rework getConstantVRegValWithLookThrough in order to make it clear if we are matching integer/float constant only or any constant (default). Add helper functions that get DefVReg and APInt/APFloat from constant instr getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal and getIConstantVRegSExtVal. These now only match G_CONSTANT as described in comment. Relevant matchers now return both DefVReg and APInt/APFloat. Replace existing uses of getConstantVRegValWithLookThrough and getConstantVRegVal with new helper functions. Any constant match is only required in: ConstantFoldBinOp: for constant argument that was bit-cast of float to int getAArch64VectorSplat: AArch64::G_DUP operands can be any constant amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant In other places use integer only constant match. Differential Revision: https://reviews.llvm.org/D104409	2021-09-17 11:22:13 +02:00
Konstantin Schwarz	d2e66d7fa4	[GlobalISel] Add a combine for and(load , mask) -> zextload This only handles simple masks, not shifted masks, for now. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D109357	2021-09-16 10:42:46 +02:00
Ahmed Bougacha	94a2f9cdb6	[GlobalISel] Fix CombinerHelper::isPredecessor for same def/use MI. The doc comment for isPredecessor says: Returns true if \p DefMI precedes \p UseMI or they are the same instruction. And dominates relies on that behavior for its own: Returns true if \p DefMI dominates \p UseMI. By definition an instruction dominates itself. Make both statements correct by fixing isPredecessor. Found by inspection.	2021-09-15 16:45:27 -07:00
Amara Emerson	5ec1845cad	[AArch64][GlobalISel] Add a new reassociation for G_PTR_ADDs. (G_PTR_ADD (G_PTR_ADD X, C), Y) -> (G_PTR_ADD (G_PTR_ADD X, Y), C) Improves CTMark -Os on AArch64: Program before after diff sqlite3 286932 287024 0.0% kc 432512 432508 -0.0% SPASS 412788 412764 -0.0% pairlocalalign 249460 249416 -0.0% bullet 475740 475512 -0.0% 7zip-benchmark 568864 568356 -0.1% consumer-typeset 419088 418648 -0.1% tramp3d-v4 367628 367224 -0.1% clamscan 383184 382732 -0.1% lencod 430028 429284 -0.2% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D109528	2021-09-14 23:57:41 -07:00
Chris Lattner	735f46715d	[APInt] Normalize naming on key constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Amara Emerson	eae44c8a86	[GlobalISel] Implement merging of stores of truncates. This is a port of a combine which matches a pattern where a wide type scalar value is stored by several narrow stores. It folds it into a single store or a BSWAP and a store if the target supports it. Assuming little endian target: i8 p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; => ((i32)p) = val; On CTMark AArch64 -Os this results in a good amount of savings: Program before after diff SPASS 412792 412788 -0.0% kc 432528 432512 -0.0% lencod 430112 430096 -0.0% consumer-typeset 419156 419128 -0.0% bullet 475840 475752 -0.0% tramp3d-v4 367760 367628 -0.0% clamscan 383388 383204 -0.0% pairlocalalign 249764 249476 -0.1% 7zip-benchmark 570100 568860 -0.2% sqlite3 287628 286920 -0.2% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D109419	2021-09-08 17:06:33 -07:00
Mirko Brkusanin	36527cbe02	[AMDGPU][GlobalISel] Legalize memcpy family of intrinsics Legalize G_MEMCPY, G_MEMMOVE, G_MEMSET and G_MEMCPY_INLINE. Corresponding intrinsics are replaced by a loop that uses loads/stores in AMDGPULowerIntrinsics pass unless their length is a constant lower than MemIntrinsicExpandSizeThresholdOpt (default 1024). Any G_MEM* instruction that reaches the legalizer should have a const length argument and should be expanded into the appropriate number of loads + stores. Differential Revision: https://reviews.llvm.org/D108357	2021-09-07 12:24:07 +02:00
Konstantin Schwarz	90d5298759	[GlobalISel] Add convenience constructors to MemDesc This allows constructing a MemDesc from a MachineMemoryOperand, a pattern that starts to show up more frequently. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D109161	2021-09-03 12:52:18 +02:00
Jessica Paquette	844d8e0337	[GlobalISel] Combine icmp eq/ne x, 0/1 -> x when x == 0 or 1 This adds the following combines: ``` x = ... 0 or 1 c = icmp eq x, 1 -> c = x ``` and ``` x = ... 0 or 1 c = icmp ne x, 0 -> c = x ``` When the target's true value for the relevant types is 1. This showed up in the following situation: https://godbolt.org/z/M5jKexWTW SDAG currently supports the `ne` case, but not the `eq` case. This can probably be further generalized, but I don't feel like thinking that hard right now. This gives some minor code size improvements across the board on CTMark at -Os for AArch64. (0.1% for 7zip and pairlocalalign in particular.) Differential Revision: https://reviews.llvm.org/D109130	2021-09-02 15:05:31 -07:00
Konstantin Schwarz	4b4bc1ea16	[GlobalISel] Do not generate illegal G_SEXTLOADs after legalization The sext_inreg_of_load combine did not have the isLegalOrBeforeLegalizer check, leading to the generation of potentially illegal G_SEXTLOADs when run after legalization. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108626	2021-08-25 10:13:39 +02:00
Sebastian Neubauer	fbae34635d	[GlobalISel] Add combine for PTR_ADD with regbanks Combine two G_PTR_ADDs, but keep the register bank of the constant. That way, the combine can be used in post-regbank-select combines. Introduce two helper methods in CombinerHelper, getRegBank and setRegBank that get and set an optional register bank to a register. That way, they can be used before and after register bank selection. Differential Revision: https://reviews.llvm.org/D103326	2021-08-17 13:58:16 +02:00
Jessica Paquette	50efbf9cbe	[GlobalISel] Narrow binops feeding into G_AND with a mask This is a fairly common pattern: ``` %mask = G_CONSTANT iN <mask val> %add = G_ADD %lhs, %rhs %and = G_AND %add, %mask ``` We have combines to eliminate G_AND with a mask that does nothing. If we combined the above to this: ``` %mask = G_CONSTANT iN <mask val> %narrow_lhs = G_TRUNC %lhs %narrow_rhs = G_TRUNC %rhs %narrow_add = G_ADD %narrow_lhs, %narrow_rhs %ext = G_ZEXT %narrow_add %and = G_AND %ext, %mask ``` We'd be able to take advantage of those combines using the trunc + zext. For this to work (or be beneficial in the best case) - The operation we want to narrow then widen must only be used by the G_AND - The G_TRUNC + G_ZEXT must be free - Performing the operation at a narrower width must not produce a different value than performing it at the original width after masking. Example comparison between SDAG + GISel: https://godbolt.org/z/63jzb1Yvj At -Os for AArch64, this is a 0.2% code size improvement on CTMark/pairlocalalign. Differential Revision: https://reviews.llvm.org/D107929	2021-08-13 18:31:13 -07:00
Amara Emerson	7ec4ce157b	[AArch64][GlobalISel] Relax oneuse restriction for PTR_ADD chain combining to check addressing legality. With contributions by Sebastian Neubauer Differential Revision: https://reviews.llvm.org/D105676	2021-08-10 16:41:18 -07:00
Amara Emerson	4c2e01232c	[GlobalISel] Fix a combine causing DBG_VALUE with dangling vregs. We should use MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() instead of eraseFromParent(). We should probably use that in other places too but fix this issue which affects clang bootstrap builds for now.	2021-08-07 01:41:02 -07:00
Petar Avramovic	66de26b1f9	GlobalISel: Fix matchEqualDefs for instructions with multiple defs Instructions that produceSameValue produce same values for operands with same index. matchEqualDefs used to return true for any two values from different instructions that produce same values. Fix this by checking if values are defined by operands with the same index. Differential Revision: https://reviews.llvm.org/D107362	2021-08-05 15:05:45 +02:00
Dominik Montada	cc947e29ea	[GlobalISel] Combine shr(shl x, c1), c2 to G_SBFX/G_UBFX Reviewed By: foad Differential Revision: https://reviews.llvm.org/D107330	2021-08-05 13:52:10 +02:00
Amara Emerson	c54d5c9756	[GlobalISel] Use GMergeLikeOp to simplify a combine. NFC.	2021-07-29 13:53:16 -07:00
Amara Emerson	532c458fa8	[GlobalISel] Add GPtrAdd and use it in some combines.	2021-07-29 12:04:02 -07:00
Amara Emerson	c658b472f3	[GlobalISel] Add a constant folding combine. Use it in the AArch64 post-legal combiner. These don't always get folded because when the instructions are created the constants are obscured by artifacts. Differential Revision: https://reviews.llvm.org/D106776	2021-07-26 14:53:33 -07:00
Amara Emerson	dec34104bf	[GlobalISel] Add combine for merge(unmerge) and use it in the AArch64 postlegal-combiner. Differential Revision: https://reviews.llvm.org/D106761	2021-07-26 10:37:31 -07:00
Amara Emerson	03cdb5221d	[GlobalISel] Fix load-or combine moving loads across potential aliasing stores. Although this combine checks that there are no load folding barriers between the loads that it's trying to merge, it was inserting the load at the MIRBuilder's default insertion point, which is the G_OR use inst. This was causing a miscompile in the test suite's SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-bswap-2 Differential Revision: https://reviews.llvm.org/D106251	2021-07-19 10:23:23 -07:00
Matt Arsenault	5a0d940f2a	GlobalISel: Preserve memory type for memset expansion	2021-07-16 11:41:32 -04:00
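
Several of the combines in the log above rest on plain integer identities: the or-of-shifts to funnel shift fold (4af45f10cc), the (and (or x, c1), c2) fold (b046eb19b8), the G_UMULH by power of two fold (08b3c0d995), and the merging of stores of truncates (eae44c8a86). The following is a minimal sketch that brute-force checks those identities on small unsigned integers. It is not LLVM code: the 6-bit width and every helper name are assumptions chosen for illustration, and shift amounts are restricted to 0 < amt < bitwidth.

```python
# Hypothetical helpers modelling fixed-width unsigned arithmetic.
# BW is kept small so the identities can be checked exhaustively.
BW = 6
MASK = (1 << BW) - 1


def fshl(x, y, amt):
    """Funnel shift left: top BW bits of the concatenation x:y shifted left by amt."""
    return ((((x << BW) | y) << amt) >> BW) & MASK


def fshr(x, y, amt):
    """Funnel shift right: low BW bits of the concatenation x:y shifted right by amt."""
    return (((x << BW) | y) >> amt) & MASK


def umulh(x, y):
    """High BW bits of the full unsigned product."""
    return ((x * y) >> BW) & MASK


def check_funnel_shift_idiom():
    # (or (shl x, amt), (lshr y, bw - amt)) == fshl(x, y, amt)
    # (or (shl x, bw - amt), (lshr y, amt)) == fshr(x, y, amt)
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            for amt in range(1, BW):
                if (((x << amt) & MASK) | (y >> (BW - amt))) != fshl(x, y, amt):
                    return False
                if (((x << (BW - amt)) & MASK) | (y >> amt)) != fshr(x, y, amt):
                    return False
    return True


def check_and_of_or():
    # (and (or x, c1), c2) == (and x, c2) whenever c1 & c2 == 0
    for x in range(MASK + 1):
        for c1 in range(MASK + 1):
            for c2 in range(MASK + 1):
                if c1 & c2 == 0 and ((x | c1) & c2) != (x & c2):
                    return False
    return True


def check_umulh_pow2():
    # umulh(x, 1 << c) == x >> (bw - c)
    return all(
        umulh(x, 1 << c) == x >> (BW - c)
        for x in range(MASK + 1)
        for c in range(1, BW)
    )


def check_store_merge():
    # Four narrow byte stores of (val >> 8*i) & 0xFF equal one little-endian
    # 32-bit store; the reversed order equals a byte-swapped (big-endian) store.
    for val in (0, 1, 0xFF, 0x12345678, 0xDEADBEEF, 0xFFFFFFFF):
        narrow = bytes((val >> (8 * i)) & 0xFF for i in range(4))
        if narrow != val.to_bytes(4, "little"):
            return False
        if narrow[::-1] != val.to_bytes(4, "big"):
            return False
    return True


if __name__ == "__main__":
    print(check_funnel_shift_idiom(), check_and_of_or(),
          check_umulh_pow2(), check_store_merge())
```

Exhaustive checking at a small width only illustrates why the rewrites preserve values; the legality and profitability conditions the commits describe (one-use checks, target support for the resulting operations) are outside this sketch.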


207 Commits