llvm-project

Author	SHA1	Message	Date
Piotr Sobczak	a3d7b3121c	[AMDGPU][NFC] Add getMaxMUBUFImmOffset Replace magic constant 4095 with the function getMaxMUBUFImmOffset(). Differential Revision: https://reviews.llvm.org/D144623	2023-02-23 11:29:59 +01:00
Joe Nash	80a8e6805a	[AMDGPU] Don't set src mods on permlane16 v_permlane16_b32 and v_permlanex16_b32 should not set abs and neg src modifiers on any input, but they can set op_sel on src0 or src1 to represent fi or bc when desired. The ISel patterns were setting the src_modifier bits to -1, effectively setting abs and neg as well, whenever it was intended to set op_sel, due to an error in ISel. ISel should now correctly only set the op_sel bits. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D144519	2023-02-22 11:41:52 -05:00
Kazu Hirata	f8f3db2756	Use APInt::count{l,r}_{zero,one} (NFC)	2023-02-19 22:04:47 -08:00
Kazu Hirata	cbde2124f1	Use APInt::popcount instead of APInt::countPopulation (NFC) This is for consistency with the C++20-style bit manipulation functions in <bit>.	2023-02-19 11:29:12 -08:00
Mirko Brkusanin	43924cbd29	[AMDGPU][GlobalISel] Fix selection of image sample g16 instructions Pre-GFX10 A16 modifier would imply G16. From GFX10 and onwards there are separate instructions for 16bit gradients. This fixes the condition for selecting G16 opcodes. Also stop adding G16 flag to instructions that do not use gradients for GFX10 onwards.	2023-02-09 16:26:55 +01:00
Matt Arsenault	93ec3fa402	AMDGPU: Support atomicrmw uinc_wrap/udec_wrap For now keep the existing intrinsics working.	2023-01-27 22:17:16 -04:00
Kazu Hirata	22cdc6a126	[llvm] Use llvm::bit_ceil instead of PowerOf2Ceil (NFC) The arguments to PowerOf2Ceil in this patch are all known to be nonzero, so we can safely use llvm::bit_ceil here.	2023-01-25 00:05:33 -08:00
Kazu Hirata	caa99a01f5	Use llvm::popcount instead of llvm::countPopulation (NFC)	2023-01-22 12:48:51 -08:00
Fangrui Song	21c4dc7997	std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). This fixes clang.	2022-12-17 00:42:05 +00:00
Jay Foad	6443c0ee02	[AMDGPU] Stop using make_pair and make_tuple. NFC. C++17 allows us to call constructors pair and tuple instead of helper functions make_pair and make_tuple. Differential Revision: https://reviews.llvm.org/D139828	2022-12-14 13:22:26 +00:00
Fangrui Song	67819a72c6	[CodeGen] llvm::Optional => std::optional	2022-12-13 09:06:36 +00:00
Justin Bogner	916ae0a060	[AMDGPU] Handle nnan and fast on the call in fpmed3 patterns We were only allowing these med3 patterns if the operands were known to not be NaN, but we should also allow it if the calls to max/min have the `nnan` or `fast` flags. Differential Revision: https://reviews.llvm.org/D139506	2022-12-06 22:57:52 -08:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Kazu Hirata	959c9cc7ac	[AMDGPU] Use std::optional in AMDGPUInstructionSelector.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 22:23:09 -08:00
Pierre van Houtryve	9e7febb4f7	[AMDGPU][GISel] Select llvm.amdgcn.fcmp intrinsics Adds FP CCs opcodes/selection logic, including src mods selection Depends on D136591, D136448 Resolves #58326 (https://github.com/llvm/llvm-project/issues/58326) Reviewed By: arsenm, foad Differential Revision: https://reviews.llvm.org/D136592	2022-11-22 14:18:58 +00:00
Pierre van Houtryve	a751676f98	[AMDGPU][GISel] Add llvm.amdgcn.icmp selection Add missing logic to select i16 variants and enable GISel testing. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136448	2022-11-22 08:26:50 +00:00
Mirko Brkusanin	e58b116843	[AMDGPU] Add subtarget feature for MAD_U64/I64 bug on GFX11 Differential Revision: https://reviews.llvm.org/D133012	2022-11-18 18:19:27 +01:00
Petar Avramovic	0f3e72e86c	AMDGPU/GlobalISel: Fix crash after mad/fma_mix fails selection When selectVOP3PMadMixModsImpl fails, it can still create new copy instr via selectVOP3ModsImpl. When selectG_FMA_FMAD gives up, new copy instr will remain dead but will not be automatically removed. InstructionSelect does not check if instructions created during selection are dead. Such dead copy doesn't have register class on dst operand and causes crash. Fix is to build copy when operands are being added to selected instruction. Differential Revision: https://reviews.llvm.org/D138044	2022-11-18 18:02:26 +01:00
Matt Arsenault	ae43420f39	AMDGPU/GlobalISel: Fix not selecting modifiers for f16 fma on gfx9 VOP3OpSel wasn't trying to match any modifiers. Just try to match the basic case, like the DAG does.	2022-11-17 18:51:45 -08:00
Jay Foad	342642dc75	[AMDGPU][GISel] Smaller code for scalar 32 to 64-bit extensions Differential Revision: https://reviews.llvm.org/D107639	2022-11-16 06:57:21 +00:00
Pierre van Houtryve	767999fca8	[AMDGPU][GlobalISel] Support mad/fma_mix selection Adds support for selecting the following instructions using GlobalISel: - v_mad_mix/v_fma_mix - v_mad_mixhi/v_fma_mixhi - v_mad_mixlo/v_fma_mixlo To select those instructions properly, some additional changes were needed which impacted other tests as well. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134354	2022-11-08 08:02:34 +00:00
Pierre van Houtryve	1809414fe1	[AMDGPU][GISel] Constrain selected operands in selectG_BUILD_VECTOR Small bugfix. Currently harmless but a case in D134354 triggers it. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136235	2022-10-21 06:50:16 +00:00
Jay Foad	ea09a426a9	[AMDGPU] Assume getDefIgnoringCopies will succeed. NFC. getDefIgnoringCopies and getSrcRegIgnoringCopies should not fail on valid MIR, so don't bother to check for failure. Differential Revision: https://reviews.llvm.org/D136238	2022-10-19 11:10:00 +01:00
Pierre van Houtryve	c93104073c	[AMDGPU] Always lower SHUFFLE_VECTOR Make it illegal, remove InstructionSelector logic for it Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134967	2022-10-04 14:23:17 +00:00
Pierre van Houtryve	9a67a6b72a	[AMDGPU][GISel] Legalize V2S16 G_BUILD_VECTOR Preparation patch for D134354 to make V2S16 G_BUILD_VECTOR legal. Also removes RegBankInfo's scalarization of small BUILD_VECTORs, replacing it with InstructionSelector logic instead. This allows for V2S16 BUILD_VECTOR instructions to survive all the way to ISel so we can select FMA/MAD_MIX instructions in D134354. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134433	2022-09-30 14:04:53 +00:00
Petar Avramovic	6db7921b65	AMDGPU: Use tablegen patterns for buffer global and flat atomic fadd Remove manual selection for atomic fadd from global-isel. Stop pre-isel translation to AtomicLoadFAdd/G_ATOMICRMW_FADD which corresponds to llvm-ir's atomicrmw fadd instruction. global and flat atomic fadd patterns changes: Split rtn/no-rtn patterns. Add missing patterns or fix predicates. Remove atomicrmw patterns for v2f16 (atomic rmw doesn't support vectors). Patterns now check addrspace of pointer, added patterns for flat intrinsic with global addrspace pointer that selects into global atomic instruction. buffer atomic fadd patterns changes: Edit patterns to import into global-isel. Remove gfx6/gfx7 _addr64 and _offset patterns. Remove patterns that can't be reached (same pattern but different feature). Differential Revision: https://reviews.llvm.org/D130579	2022-09-23 17:52:10 +02:00
Jay Foad	3822a01e0b	[AMDGPU] Add GFX11 ds_bvh_stack_rtn_b32 instruction Differential Revision: https://reviews.llvm.org/D133928	2022-09-15 16:46:14 +01:00
Ivan Kosarev	5db8d6fd2b	[AMDGPU][CodeGen] Support (base | offset) SMEM loads. Prevents generation of unnecessary s_or_b32 instructions. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D132552	2022-09-05 14:22:06 +01:00
Ivan Kosarev	f33645301e	[AMDGPU][CodeGen] Support (soffset + offset) s_buffer_load's. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D130263	2022-09-05 12:53:05 +01:00
Pierre van Houtryve	59cf9dd923	[AMDGPU][GISel] Enable Selection of ADD3 for G_PTR_ADD Allows things like `(G_PTR_ADD (G_PTR_ADD a, b), c)` to be simplified into a single ADD3 instruction instead of two adds. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D131254	2022-08-24 14:44:19 +00:00
Ivan Kosarev	75950be836	[AMDGPU][NFC] Validate G_MERGE_VALUES as we match zero-extended 32-bit scalars. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D130001	2022-07-21 14:49:57 +01:00
Stanislav Mekhanoshin	523a99c0eb	[AMDGPU] Support for gfx940 fp8 smfmac Differential Revision: https://reviews.llvm.org/D129908	2022-07-18 12:12:41 -07:00
Ivan Kosarev	432cbd7827	[AMDGPU][CodeGen] Support (register + immediate) SMRD offsets. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D129381	2022-07-18 11:29:31 +01:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Ivan Kosarev	8cd79bc12c	[AMDGPU][GlobalISel] Support register offsets for SMRDs. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D128836	2022-07-05 13:41:06 +01:00
Piotr Sobczak	4874838a63	[AMDGPU] gfx11 WMMA instruction support gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate) instructions. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D128756	2022-06-30 11:13:45 -04:00
Jay Foad	3fbc945c3a	[AMDGPU] llvm.amdgcn.exp.compr is not supported on GFX11 Differential Revision: https://reviews.llvm.org/D128259	2022-06-28 14:48:25 +01:00
Joe Nash	f1cfaa956d	[AMDGPU] Use GFX11 S_PACK_HL instruction in more cases Differential Revision: https://reviews.llvm.org/D128527	2022-06-28 14:35:19 +01:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Joe Nash	ae72fee74e	[AMDGPU] gfx11 Select on Buffer Atomic FAdd Rtn type Reviewed By: #amdgpu, foad, rampitec Differential Revision: https://reviews.llvm.org/D128205	2022-06-23 11:05:32 -04:00
Rodrigo Dominguez	971fa4b196	[AMDGPU] GFX11: remove ShaderType from ds_ordered_count offset field In GFX11 ShaderType is determined by the hardware and should no longer be written into bits[3:2] of the ds_ordered_count offset field. Differential Revision: https://reviews.llvm.org/D128196	2022-06-23 14:20:33 +01:00
Joe Nash	90254d524f	[AMDGPU] gfx11 Remove SDWA from shuffle_vector ISel gfx11 does not have SDWA Reviewed By: #amdgpu, rampitec Differential Revision: https://reviews.llvm.org/D128208	2022-06-21 14:55:00 -04:00
Joe Nash	20d20156f4	[AMDGPU] gfx11 VINTERP intrinsics and ISel support Depends on D127664 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D127756	2022-06-17 09:16:59 -04:00
Joe Nash	2d43de13df	[AMDGPU] gfx11 new dot instruction codegen support Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D127904	2022-06-16 14:19:34 -04:00
Jay Foad	c155a944fb	[AMDGPU] GFX11 CodeGen support for MIMG instructions This includes: - New llvm.amdgcn.image.msaa.load.* intrinsics - NSA changes, because MIMG-NSA is now limited to 3 dwords - Split CD forms of IMAGE_SAMPLE instructions out into separate test files since they are no longer supported in GFX11 Differential Revision: https://reviews.llvm.org/D127837	2022-06-16 18:23:14 +01:00
Jay Foad	7b9f620e78	[AMDGPU] Work around GFX11 flat scratch SVS swizzling bug Differential Revision: https://reviews.llvm.org/D127635	2022-06-13 21:00:42 +01:00
Jay Foad	d943c51465	[AMDGPU] Fix GFX11 codegen for V_MAD_U64_U32 and V_MAD_I64_I32 GFX11 uses different pseudos for these because of a new constraint on which operands' registers can overlap. Differential Revision: https://reviews.llvm.org/D127659	2022-06-13 20:59:18 +01:00
Stanislav Mekhanoshin	c9e242f6dd	[AMDGPU] Change GISel error handling for TFE on GFX90A Differential Revision: https://reviews.llvm.org/D126797	2022-06-01 11:07:25 -07:00

563 Commits