llvm-project

Author	SHA1	Message	Date
Stanislav Mekhanoshin	4f34c740ab	[AMDGPU] w/a for s_setreg_b32 gfx1250 hazard with MODE register (#153879 )	2025-08-15 16:08:13 -07:00
Stanislav Mekhanoshin	f1fc50748a	[AMDGPU] w/a hazard with writing s102/103 and reading FLAT_SCRATCH_BASE (#153878 )	2025-08-15 15:23:06 -07:00
Stanislav Mekhanoshin	1f25c4883e	[AMDGPU] Mitigate DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 bug (#153872 ) DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 shall not be claused (we already do not clause DS instructions) and needs waits before and after.	2025-08-15 14:17:54 -07:00
Stanislav Mekhanoshin	29976f2e58	[AMDGPU] Handle S_GETREG_B32 hazard on gfx1250 (#153848 ) GFX1250 SPG says: S_GETREG_B32 does not wait for idle before executing. The user must S_WAIT_ALU 0 before S_GETREG_B32 on: STATUS, STATE_PRIV, EXCP_FLAG_PRIV, or EXCP_FLAG_USER.	2025-08-15 11:38:22 -07:00
Stanislav Mekhanoshin	5d28284dbb	[AMDGPU] gfx1250 does not need nop before VGPR dealloc (#153844 ) This has no impact as the dealloc is now practically disabled.	2025-08-15 11:29:02 -07:00
Stanislav Mekhanoshin	8bce10ac6d	[AMDGPU] Enable kernarg preload on gfx1250 (#153686 )	2025-08-14 16:29:53 -07:00
Stanislav Mekhanoshin	a629119c75	[AMDGPU] Remove wave64 functions (#153690 ) gfx1250 only supports wave32.	2025-08-14 15:54:33 -07:00
Stanislav Mekhanoshin	57c1e01e48	[AMDGPU] Don't allow wgp mode on gfx1250 (#153680 ) gfx1250 only supports cu mode.	2025-08-14 15:16:56 -07:00
Stanislav Mekhanoshin	b296ea9c14	[AMDGPU] s_get_shader_cycles_u64 gfx1250 instruction (#152390 ) It is the same as reading SHADER_CYCLES_LO and SHADER_CYCLES_HI but with a single instruction.	2025-08-06 15:32:28 -07:00
Stanislav Mekhanoshin	d1b6ce50df	[AMDGPU] gfx1250 has fixed GETPC bug and also extended VA to 57 bits (#152373 )	2025-08-06 13:32:26 -07:00
Stanislav Mekhanoshin	c2eddec4ff	[AMDGPU] System scope atomics are emulated over PCIe in gfx1250 (#152369 ) HW will emulate unsupported PCIe atomics via CAS loop, we do not need to expand these anymore.	2025-08-06 13:08:12 -07:00
Stanislav Mekhanoshin	334d0be2d4	[AMDGPU] Support 64-bit LDS atomic fadd on gfx1250 (#152368 )	2025-08-06 13:07:56 -07:00
Stanislav Mekhanoshin	d08c2977e8	[AMDGPU] Add MC support for new gfx1250 src_flat_scratch_base_lo/hi (#152203 )	2025-08-05 14:35:48 -07:00
Stanislav Mekhanoshin	0988510ad4	[AMDGPU] gfx1250 v_perm_pk16_* instructions (#151773 )	2025-08-01 20:12:35 -07:00
Harrison Hao	f9b258c73a	[AMDGPU] Support function attribute to override postRA scheduling direction (#147708 ) This patch adds support for controlling the post-RA machine scheduler direction (topdown, bottomup, bidirectional) on a per-function basis using the "amdgpu-post-ra-direction" function attribute.	2025-08-01 16:07:09 +08:00
Stanislav Mekhanoshin	d70f228e83	[AMDGPU] Add gfx1250 V_ADD_{MIN|MAX}_{U|I}32 instructions (#151379 )	2025-07-30 13:12:14 -07:00
Stanislav Mekhanoshin	3dfd939a16	[AMDGPU] gfx1250 V_{MIN|MAX}_{I|U}64 opcodes (#151256 )	2025-07-29 19:13:51 -07:00
Stanislav Mekhanoshin	d99238263c	[AMDGPU] Implement v_mad_u32/v_mad_nc_u|i64_u32 on gfx1250 (#151226 )	2025-07-29 15:06:35 -07:00
Changpeng Fang	6184ef1c2f	[AMDGPU] Support f64 atomics on gfx1250 (#151172 ) - BUF/FLAT/GLOBAL_ADD/MIN/MAX_F64 - DS_ADD_F64 Co-authored-by: Konstantin Zhuravlyov <Konstantin Zhuravlyov@amd.com>	2025-07-29 09:41:00 -07:00
Pierre van Houtryve	be17791f26	[AMDGPU][gfx1250] Add `cu-store` subtarget feature (#150588 ) Determines whether we can use `SCOPE_CU` stores (on by default), or whether all stores must be done at `SCOPE_SE` minimum.	2025-07-29 11:38:43 +02:00
Matt Arsenault	44ff1ed16e	AMDGPU: Move getMaxNumVectorRegs into GCNSubtarget (NFC) (#150889 ) Addresses a TODO	2025-07-28 17:25:20 +09:00
Stanislav Mekhanoshin	96e5eed92a	[AMDGPU] Select VMEM prefetch for llvm.prefetch on gfx1250 (#150493 ) We have a choice between a scalar and a vector prefetch for a uniform pointer. Since we do not have scalar stores, our scalar cache is practically read-only. The rw argument of the prefetch intrinsic is used to force a vector operation even in the uniform case. On GFX12 a scalar prefetch will be used anyway; it is still useful, but it will only bring data to L2.	2025-07-24 13:22:50 -07:00
Stanislav Mekhanoshin	a70f7dafc1	[AMDGPU] gfx1250 flat and global prefetch MC support (#150455 )	2025-07-24 11:00:56 -07:00
Changpeng Fang	473bc0d188	[AMDGPU] Support V_FMA_MIX*_BF16 instructions on gfx1250 (#150381 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2025-07-24 09:43:49 -07:00
Changpeng Fang	eb43b79765	[AMDGPU] Disable SGPR read hazard mitigation for gfx1250 (#150344 ) Co-authored-by: Jay Foad <Jay.Foad@amd.com>	2025-07-24 00:05:58 -07:00
Changpeng Fang	9a563b08e2	[AMDGPU] Support V_PK_MIN3/MAX3_NUM_F16 on gfx1250 (#150326 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2025-07-23 15:15:19 -07:00
Stanislav Mekhanoshin	2346968807	[AMDGPU] Add V_ADD|SUB|MUL_U64 gfx1250 opcodes (#150291 )	2025-07-23 13:17:56 -07:00
Changpeng Fang	d385e9d86b	AMDGPU: Support V_PK_ADD_{MIN|MAX}_{I|U}16 and V_{MIN|MAX}3_{I|U}16 on gfx1250 (#150155 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2025-07-23 00:17:22 -07:00
Stanislav Mekhanoshin	a0973de745	[AMDGPU] Select scale_offset for global instructions on gfx1250 (#150107 ) Also switches immediate offset to signed for the subtarget.	2025-07-22 15:04:52 -07:00
Harrison Hao	8c14d3f44f	[MISched] Use SchedRegion in overrideSchedPolicy and overridePostRASchedPolicy (#149297 ) This patch updates `overrideSchedPolicy` and `overridePostRASchedPolicy` to take a `SchedRegion` parameter instead of just `NumRegionInstrs`. This provides access to both the instruction range and the parent `MachineBasicBlock`, which enables looking up function-level attributes. With this change, targets can select post-RA scheduling direction per function using a function attribute. For example: ```cpp void overridePostRASchedPolicy(MachineSchedPolicy &Policy, const SchedRegion &Region) const { const Function &F = Region.RegionBegin->getMF()->getFunction(); Attribute Attr = F.getFnAttribute("amdgpu-post-ra-direction"); ... } ```	2025-07-22 15:55:12 +08:00
Stanislav Mekhanoshin	a0b854d576	[AMDGPU] MC support for gfx1250 scale_offset modifier (#149881 )	2025-07-21 15:04:59 -07:00
Shilei Tian	7e105fbdbe	[AMDGPU] Add support for `v_tanh_f32` on gfx1250 (#149360 ) Co-authored-by: Mekhanoshin, Stanislav <Stanislav.Mekhanoshin@amd.com>	2025-07-17 15:42:35 -04:00
Stanislav Mekhanoshin	9912ccb0b4	[AMDGPU] gfx1250 MC support for FLAT GVS addressing (#149173 )	2025-07-16 14:35:07 -07:00
Stanislav Mekhanoshin	82d7405b3b	[AMDGPU] Use S_ADD_PC_I64 for long branches in gfx1250 (#148961 )	2025-07-15 17:14:56 -07:00
Stanislav Mekhanoshin	d1e3ab9c4b	[AMDGPU] Use v_mov_b64 in codegen on gfx1250 (#148272 )	2025-07-11 22:16:50 -07:00
Stanislav Mekhanoshin	f090554359	[AMDGPU] MC support for v_fmaak_f64/v_fmamk_f64 gfx1250 instructions (#148282 )	2025-07-11 14:17:03 -07:00
Stanislav Mekhanoshin	7920dff394	[AMDGPU] VOPD/VOPD3 changes for gfx1250 (#147602 )	2025-07-10 14:15:01 -07:00
Stanislav Mekhanoshin	00a85e5704	[AMDGPU] gfx1250: MC support for 64-bit literals (#147861 )	2025-07-09 22:25:47 -07:00
Stanislav Mekhanoshin	d0a4af725e	[AMDGPU] Add FeatureIEEEMinimumMaximumInsts. NFCI. (#147594 ) Co-authored-by: Mirko Brkušanin <Mirko.Brkusanin@amd.com>	2025-07-08 14:32:44 -07:00
Shilei Tian	d258457d42	[AMDGPU] Add support for `v_cvt_f32_fp8` on gfx1250 (#147579 ) Co-authored-by: Mekhanoshin, Stanislav <Stanislav.Mekhanoshin@amd.com>	2025-07-08 16:21:24 -04:00
Changpeng Fang	5035d20dcb	AMDGPU: Implement ds_atomic_async_barrier_arrive_b64/ds_atomic_barrier_arrive_rtn_b64 (#146409 ) These two instructions are supported by gfx1250. We define the instructions and implement the corresponding intrinsic and builtin. Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2025-07-01 11:08:49 -07:00
Changpeng Fang	4729242878	AMDGPU: Add MC layer support for load transpose instructions for gfx1250 (#146024 ) Co-authored with @jayfoad	2025-06-26 22:30:31 -07:00
Changpeng Fang	2550a637a1	AMDGPU: Remove LDS-direct(param)-loads and VINTERP ops from gfx1250 support (#145631 )	2025-06-24 22:44:35 -07:00
Changpeng Fang	3de2af3ef5	AMDGPU: Remove export and related instructions from gfx1250 support (#145624 )	2025-06-24 19:59:26 -07:00
Changpeng Fang	fe8a26263a	AMDGPU: Remove Formatted MUBUF instructions from gfx1250 support (#145590 )	2025-06-24 14:17:13 -07:00
Stanislav Mekhanoshin	fe0568389d	[AMDGPU] Require aligned VGPRs for gfx1250 (#145561 )	2025-06-24 12:16:01 -07:00
Changpeng Fang	ce4d214947	AMDGPU: Remove MTBUF instructions from gfx1250 support (#145563 )	2025-06-24 11:59:13 -07:00
Diana Picus	a201f8872a	[AMDGPU] Replace dynamic VGPR feature with attribute (#133444 ) Use a function attribute (amdgpu-dynamic-vgpr) instead of a subtarget feature, as requested in #130030.	2025-06-24 11:09:36 +02:00
Stanislav Mekhanoshin	40eee8ec7f	[AMDGPU] Add s_setprio_inc_wg gfx1250 instruction (#145152 )	2025-06-22 12:52:05 -07:00
Stanislav Mekhanoshin	affcc5e728	[AMDGPU] Add s_wait_xcnt gfx1250 instruction (#145086 )	2025-06-20 12:28:18 -07:00
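
Commit c2eddec4ff in the list above notes that the hardware emulates unsupported PCIe atomics via a CAS loop. For readers unfamiliar with the pattern, the following host-side C++ sketch shows the general shape of a compare-and-swap loop that emulates a floating-point atomic add. It is an illustration only: the helper name `fetch_add_via_cas` is hypothetical, and nothing here is taken from the hardware or from the LLVM patch.

```cpp
#include <atomic>

// Hypothetical helper (not from the commit): emulate an atomic
// fetch-and-add on a double using a compare-and-swap loop.
// Returns the value that was stored before the addition.
double fetch_add_via_cas(std::atomic<double> &target, double operand) {
  // Start from a snapshot of the current value.
  double observed = target.load(std::memory_order_relaxed);
  // Try to publish observed + operand. If another thread changed the
  // value in the meantime, compare_exchange_weak fails, refreshes
  // `observed` with the current value, and the loop retries.
  while (!target.compare_exchange_weak(observed, observed + operand,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  return observed;
}
```

A caller would use it like a normal fetch-add: `double old = fetch_add_via_cas(counter, 2.0);` leaves `counter` holding `old + 2.0`.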

236 Commits