llvm-project

Author	SHA1	Message	Date
Craig Topper	8e132c5c1d	[LegalizeTypes][ARM][X86] Change ExpandIntRes_ABS to use sra+xor+sub. Previously we used sra+add+xor if ADDCARRY is supported. This changes to sra+xor+sub is SUBCARRY is available. This is consistent with the recent change to the default expansion in LegalizeDAG. Differential Revision: https://reviews.llvm.org/D121039	2022-03-07 11:28:32 -08:00
Craig Topper	440c4b705a	[SelectionDAG][RISCV][ARM][PowerPC][X86][WebAssembly] Change default abs expansion to use sra (X, size(X)-1); sub (xor (X, Y), Y). Previous we used sra (X, size(X)-1); xor (add (X, Y), Y). By placing sub at the end, we allow RISCV to combine sign_extend_inreg with it to form subw. Some X86 tests for Z - abs(X) seem to have improved as well. Other targets look to be a wash. I had to modify ARM's abs matching code to match from sub instead of xor. Maybe instead ISD::ABS should be made legal. I'll try that in parallel to this patch. This is an alternative to D119099 which was focused on RISCV only. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D119171	2022-02-20 21:11:23 -08:00
Craig Topper	3ff9cc01f2	[X86] Use CMOVNS for abs instead of CMOVGE. CMOVGE reads SF and OF. CMOVNS only reads SF. This matches with other recent changes to use a single flag where possible. It also matches gcc codegen. I believe this technically changes whether the conditioanl move happens on INT_MIN, but for INT_MIN both registers are the same so it doesn't matter. Differential Revision: https://reviews.llvm.org/D111826	2021-10-14 12:28:28 -07:00
Guozhi Wei	6599961c17	[TwoAddressInstructionPass] Improve the SrcRegMap and DstRegMap computation This patch contains following enhancements to SrcRegMap and DstRegMap: 1 In findOnlyInterestingUse not only check if the Reg is two address usage, but also check after commutation can it be two address usage. 2 If a physical register is clobbered, remove SrcRegMap entries that are mapped to it. 3 In processTiedPairs, when create a new COPY instruction, add a SrcRegMap entry only when the COPY instruction is coalescable. (The COPY src is killed) With these enhancements isProfitableToCommute can do better commute decision, and finally more register copies are removed. Differential Revision: https://reviews.llvm.org/D108731	2021-10-11 15:28:31 -07:00
Matt Arsenault	4a36e96c3f	RegAllocGreedy: Account for reserved registers in num regs heuristic This simple heuristic uses the estimated live range length combined with the number of registers in the class to switch which heuristic to use. This was taking the raw number of registers in the class, even though not all of them may be available. AMDGPU heavily relies on dynamically reserved numbers of registers based on user attributes to satisfy occupancy constraints, so the raw number is highly misleading. There are still a few problems here. In the original testcase that made me notice this, the live range size is incorrect after the scheduler rearranges instructions, since the instructions don't have the original InstrDist offsets. Additionally, I think it would be more appropriate to use the number of disjointly allocatable registers in the class. For the AMDGPU register tuples, there are a large number of registers in each tuple class, but only a small fraction can actually be allocated at the same time since they all overlap with each other. It seems we do not have a query that corresponds to the number of independently allocatable registers. Relatedly, I'm still debugging some allocation failures where overlapping tuples seem to not be handled correctly. The test changes are mostly noise. There are a handful of x86 tests that look like regressions with an additional spill, and a handful that now avoid a spill. The worst looking regression is likely test/Thumb2/mve-vld4.ll which introduces a few additional spills. test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll shows a massive improvement by completely eliminating a large number of spills inside a loop.	2021-09-14 21:00:29 -04:00
Simon Pilgrim	9c86c5e8ad	[DAG] Legalize abs(x) -> umin(x,sub(0,x)) iff umin/sub are legal If umin() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method. Followup to D92095 Alive2: https://alive2.llvm.org/ce/z/8nuX6s https://alive2.llvm.org/ce/z/q2hB9w	2020-11-25 18:06:02 +00:00
Simon Pilgrim	0637dfe88b	[DAG] Legalize abs(x) -> smax(x,sub(0,x)) iff smax/sub are legal If smax() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method. This is also what PowerPC has been doing for its abs implementation, so it lets us get rid of a load of custom lowering code there (and which was never updated when they added smax lowering). Alive2: https://alive2.llvm.org/ce/z/xRk3cD Differential Revision: https://reviews.llvm.org/D92095	2020-11-25 15:03:03 +00:00
Craig Topper	da79b1eecc	[SelectionDAG][X86][ARM] Teach ExpandIntRes_ABS to use sra+add+xor expansion when ADDCARRY is supported. Rather than using SELECT instructions, use SRA, UADDO/ADDCARRY and XORs to expand ABS. This is the multi-part version of the sequence we use in LegalizeDAG. It's also the same as the Custom sequence uses for i64 on 32-bit and i128 on 64-bit. So we can remove the X86 customization. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D87215	2020-09-07 13:15:26 -07:00
Craig Topper	01b3e16757	[X86] Use the same sequence for i128 ISD::ABS on 64-bit targets as we use for i64 on 32-bit targets. Differential Revision: https://reviews.llvm.org/D87214	2020-09-07 11:14:05 -07:00
Nikita Popov	deb4bb2b3a	[IR] Add min/max/abs intrinsics This adds the llvm.abs(), llvm.umin(), llvm.umax(), llvm.smin(), and llvm.smax() intrinsics specified in D81829. For SelectionDAG, the ISD opcodes and all the legalization and lowering already exist, so this just wires them up to the intrinsic in the SDAG builder and adds rudimentary tests. For GlobalISel only the min/max intrinsics are wired up, as llvm.abs() will require the addition of a G_ABS op, and corresponding legalization support. Differential Revision: https://reviews.llvm.org/D84125	2020-07-23 20:56:19 +02:00

10 Commits