llvm-project

Author	SHA1	Message	Date
Nikita Popov	bdf2fbba9c	[AMDGPU] Convert some tests to opaque pointers (NFC)	2022-12-19 12:41:13 +01:00
Pierre van Houtryve	020a9d7b20	[GISel] Add (fsub +-0.0, X) -> fneg combine. Allows for better matching of VOP3 mods. Reviewed By: arsenm. Differential Revision: https://reviews.llvm.org/D136442	2022-11-03 08:21:50 +00:00
Pierre van Houtryve	d8258508d4	[AMDGPU][GISel] Update `isCanonicalized`. Recognize more opcodes in the function. Fixes some regressions introduced in D134857 for fdiv.f16 too. Depends on D134857. Reviewed By: arsenm, foad. Differential Revision: https://reviews.llvm.org/D134862	2022-09-30 14:13:35 +00:00
Pierre van Houtryve	7388520d1c	[GISel] Add more cases to isKnownNeverNaN. Make it even with the DAG implementation as of D134854. Reviewed By: arsenm, foad. Differential Revision: https://reviews.llvm.org/D134857	2022-09-30 14:10:56 +00:00
Pierre van Houtryve	9a67a6b72a	[AMDGPU][GISel] Legalize V2S16 G_BUILD_VECTOR. Preparation patch for D134354 to make V2S16 G_BUILD_VECTOR legal. Also removes RegBankInfo's scalarization of small BUILD_VECTORs, replacing it with InstructionSelector logic instead. This allows for V2S16 BUILD_VECTOR instructions to survive all the way to ISel so we can select FMA/MAD_MIX instructions in D134354. Reviewed By: arsenm. Differential Revision: https://reviews.llvm.org/D134433	2022-09-30 14:04:53 +00:00
Mirko Brkusanin	6cae753bf4	[AMDGPU][GlobalISel] Legalize G_FSUB for s16. Differential Revision: https://reviews.llvm.org/D128066	2022-06-20 12:25:49 +02:00
Jay Foad	3eb2281bc0	[AMDGPU] Aggressively fold immediates in SIFoldOperands. Previously SIFoldOperands::foldInstOperand would only fold a non-inlinable immediate into a single user, so as not to increase code size by adding the same 32-bit literal operand to many instructions. This patch removes that restriction, so that a non-inlinable immediate will be folded into any number of users. The rationale is: (1) It reduces the number of registers used for holding constant values, which might increase occupancy. (On the other hand, many of these registers are SGPRs which no longer affect occupancy on GFX10+.) (2) It reduces ALU stalls between the instruction that loads a constant into a register, and the instruction that uses it. (3) The above benefits are expected to outweigh any increase in code size. Differential Revision: https://reviews.llvm.org/D114643	2022-05-18 10:19:35 +01:00
Julien Pagès	46adccc5cc	[AMDGPU] Improve Codegen for build_vector. Improve the code generation of build_vector. Use the v_pack_b32_f16 instruction instead of v_and_b32 + v_lshl_or_b32. Differential Revision: https://reviews.llvm.org/D98081 Patch by Julien Pagès!	2021-05-12 14:17:44 +01:00

8 Commits