llvm-project

Author	SHA1	Message	Date
Matt Arsenault	9cc298108a	AtomicExpand: Copy metadata from atomicrmw to cmpxchg (#109409 ) When expanding an atomicrmw with a cmpxchg, preserve any metadata attached to it. This will avoid unwanted double expansions in a future commit. The initial load should also probably receive the same metadata (which for some reason is not emitted as an atomic).	2024-10-31 11:54:07 -07:00
Matt Arsenault	e3222e6f80	AMDGPU: Add baseline tests for cmpxchg custom expansion (#109408 ) We need a non-atomic path if flat may access private.	2024-10-31 11:46:13 -07:00
Matt Arsenault	1d0370872f	AMDGPU: Expand flat atomics that may access private memory (#109407 ) If the runtime flat address resolves to a scratch address, 64-bit atomics do not work correctly. Insert a runtime address space check (which is quite likely to be uniform) and select between the non-atomic and real atomic cases. Consider noalias.addrspace metadata and avoid this expansion when possible (we also need to consider it to avoid infinitely expanding after adding the predication code).	2024-10-31 08:08:48 -07:00
Matt Arsenault	b0a25468fa	AMDGPU: Add baseline tests for flat-may-alias private atomic expansions (#109406 )	2024-10-15 22:29:24 +04:00
Matt Arsenault	0edd07770f	AMDGPU: Preserve alignment when custom expanding atomicrmw (#103768 )	2024-08-14 17:16:59 +04:00
Matt Arsenault	edded8d7b5	AMDGPU: Stop handling legacy amdgpu-unsafe-fp-atomics attribute (#101699 ) This is now autoupgraded to annotate atomicrmw instructions in old bitcode.	2024-08-13 22:02:25 +04:00
Matt Arsenault	80c51fad3b	AtomicExpand: Regenerate baseline checks (#103063 )	2024-08-13 20:43:39 +04:00
Matt Arsenault	1ae507d109	AMDGPU: Do not create phi user for atomicrmw with no uses (#103061 )	2024-08-13 19:24:52 +04:00
Matt Arsenault	42b5540211	AMDGPU: Preserve atomicrmw name when specializing address space (#102470 )	2024-08-09 00:43:04 +04:00
Matt Arsenault	bb7143f666	AMDGPU: Avoid creating unnecessary block split in atomic expansion (#102440 ) This was creating a new block to insert the is.shared check, but we can just do that in the original block.	2024-08-09 00:39:12 +04:00
Matt Arsenault	dfda9c5b9e	AMDGPU: Handle new atomicrmw metadata for fadd case (#96760 ) This is the most complex atomicrmw support case. Note we don't have accurate remarks for all of the cases, which I'm planning on fixing in a later change with more precise wording. Continue respecting amdgpu-unsafe-fp-atomics until its eventual removal. Also seems to fix a few cases not interpreting amdgpu-unsafe-fp-atomics appropriately aggressively.	2024-08-02 19:41:33 +04:00
Matt Arsenault	41439d5bb7	AMDGPU: Handle remote/fine-grained memory in atomicrmw fmin/fmax lowering (#96759 ) Consider the new atomic metadata when choosing to expand as cmpxchg instead.	2024-08-01 22:08:01 +04:00
Matt Arsenault	a2a73d892a	AMDGPU: Fix no return atomicrmw fadd v2f16 selection for gfx908 (#96948 ) We previously would always expand this with a cmpxchg loop, while it should be the same conditions as the f32 case (except for the denormal concern).	2024-06-27 21:17:16 +02:00
Matt Arsenault	a440a96ec2	AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (#95592 ) Define subtarget features for atomic fmin/fmax support. The flat/global support is a real mess. We had float/double support at the beginning in gfx6 and gfx7. gfx8 removed these. gfx10 reintroduced them. gfx11 removed the f64 versions again. gfx9 partially reintroduced them, in gfx90a and gfx940 but only for f64.	2024-06-23 10:10:41 +02:00
Matt Arsenault	8520061281	AMDGPU: Support local atomicrmw fmin/fmax for float/double (#95590 ) This has always been supported. Somehow, we ended up with 2 copies of clang builtins for this case, and the newer one erroneously requires gfx8-insts.	2024-06-18 18:34:34 +02:00
Matt Arsenault	cf5ce8cdf1	AMDGPU: Add some tests for i128 and fp128 atomic expansion These produce garbage libcalls, so the result is not useful but this at least shows we don't assert.	2024-06-18 10:33:25 +02:00
Matt Arsenault	4cf1a19b7e	Reapply "AMDGPU: Handle legal v2f16/v2bf16 atomicrmw fadd for global/flat (#95394 )" This reverts commit 95b77d90aae10725ea692e120aac083ef1c1297d.	2024-06-17 16:34:35 +02:00
Nico Weber	95b77d90aa	Revert "AMDGPU: Handle legal v2f16/v2bf16 atomicrmw fadd for global/flat (#95394 )" This reverts commit 5021e6dd548323e1169be3d466d440009e6d1f8e. Breaks tests, see https://github.com/llvm/llvm-project/pull/95394#issuecomment-2169394503	2024-06-15 12:33:13 -04:00
Matt Arsenault	5021e6dd54	AMDGPU: Handle legal v2f16/v2bf16 atomicrmw fadd for global/flat (#95394 ) Unlike the existing fadd cases, choose to ignore the requirement for amdgpu-unsafe-fp-atomics in case of fine-grained memory access. This is to minimize migration pain to the new atomic control metadata. This should not break any users, as the atomic intrinsics are still directly consumed, and clang does not yet produce vector FP atomicrmw.	2024-06-15 09:58:12 +02:00
Matt Arsenault	0a9a5f989f	AMDGPU: Legalize atomicrmw fadd for v2f16/v2bf16 for local memory (#95393 ) Make this legal for gfx940 and gfx12	2024-06-15 09:55:04 +02:00
Matt Arsenault	f3afdc4ad9	AtomicExpand: Fix creating invalid ptrmask for fat pointers (#94955 ) The ptrmask intrinsic requires the integer mask to be the index size, not the pointer size.	2024-06-12 10:45:42 +02:00
Matt Arsenault	a2bc50aa8b	AMDGPU: Add more tests for vector typed atomicrmw fadd Some cases should be legal for gfx940.	2024-06-11 14:44:28 +02:00
Matt Arsenault	d81170873c	AtomicExpand: Preserve metadata when expanding partword RMW (#89769 ) This will be important for AMDGPU in a future patch.	2024-05-23 10:04:47 +02:00
wanglei	9d4f7f44b6	[test][LoongArch] Add -mattr=+d option. NFC Because most of the tests assume target-abi=`lp64d`, adding the corresponding feature is reasonable. rg -l loongarch -g '!*.s' | xargs sed -i '/mtriple=loongarch/ {/-mattr=/!{/target-abi/! s/mtriple=loongarch.. /&-mattr=+d /}}'	2024-05-14 20:23:04 +08:00
Matt Arsenault	82bb2534d4	AMDGPU: Don't bitcast float typed atomic store in IR (#90116 ) Implement the promotion in the DAG. Depends #90113	2024-05-07 21:43:22 +02:00
Matt Arsenault	7927bcdb8a	AMDGPU: Do not bitcast atomicrmw in IR (#90045 ) This is the first step to eliminating shouldCastAtomicRMWIInIR. This and the other atomic expand casting hooks should be removed. This adds duplicate legalization machinery and interfaces. This is already what codegen is supposed to do, and already does for the promotion case. In the case of atomicrmw xchg, there seems to be some benefit to having the bitcasts moved outside of the cmpxchg loop on targets with separate int and FP registers, which we should be able to deal with by directly checking for the legality of the underlying operation. The casting path was also losing metadata when it recreated the instruction.	2024-05-07 18:26:32 +02:00
Matt Arsenault	4e67b5058e	AMDGPU: Add more tests for atomicrmw handling Add agent scope copies of atomicrmw atomics tests. Expand testing for the undo identity atomicrmw case. Test 16-bit atomic expansions.	2024-05-03 11:50:59 +02:00
Matt Arsenault	9f9856d623	AMDGPU: Update name for amdgpu.no.remote.memory metadata	2024-05-03 11:50:59 +02:00
Matt Arsenault	f1112ebe07	AMDGPU: Do not bitcast atomic load in IR (#90060 ) These hooks should be removed. This is a trivial legalization transform the legalizer needs to support. The IR just complicates things, and it was losing metadata. Implement the DAG promotion support, and switch AMDGPU over to using it. Really we'd be a lot better off merging ATOMIC_LOAD and LOAD like GlobalISel does.	2024-04-26 12:20:40 +02:00
Matt Arsenault	76a3be7c76	AMDGPU: Add baseline tests for bad bitcasting of atomic load/store	2024-04-25 16:08:11 +02:00
Matt Arsenault	a45eb62877	AtomicExpand: Fix dropping a syncscope when bitcasting atomicrmw	2024-04-24 19:09:34 +02:00
Pierre van Houtryve	cf328ff96d	[IR] Memory Model Relaxation Annotations (#78569 ) Implements the core/target-agnostic components of Memory Model Relaxation Annotations. RFC: https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5	2024-04-24 08:52:25 +02:00
Matt Arsenault	31af5e9001	AtomicExpand: Emit or with constant on RHS This will save later code from commuting it.	2024-04-23 15:00:31 +02:00
Matt Arsenault	5b6db43f29	AMDGPU: Simplify DS atomicrmw fadd handling (#89468 ) DS atomic fadd F32 does respect the denormal mode, so we do not need to consider the expected FP mode or unsafe-fp-atomics attribute. They don't respect the rounding mode, but we don't care outside of strictfp. This also reveals the fp-mode-is-flush check has been missing in the cases that should be considering it alongside amdgpu-unsafe-fp-atomics. This also stops considering the case where flushing is enabled for f64, as flushing isn't mandated and we barely handle this case.	2024-04-22 12:22:54 +02:00
Matt Arsenault	f433c3b380	AMDGPU: Add tests for atomicrmw handling of new metadata (#89248 ) Add baseline tests which should comprehensively test the new atomic metadata. Test codegen / expansion, and preservation in a few transforms. New metadata defined in #85052	2024-04-20 00:43:36 +02:00
Matt Arsenault	c8db069253	AMDGPU: Use common check prefix in atomic expand test	2024-04-19 15:46:01 +02:00
Matt Arsenault	db2f64ee1f	AMDGPU: Fix not handling atomicrmw fadd in exotic address spaces correctly We try to interpret unknown address space numbers as aliases of global, but this wasn't applied here. Also improve test coverage for the buffer fat pointer address space.	2024-04-17 21:39:26 +02:00
Matt Arsenault	9bd10853e5	AMDGPU: Undo atomicrmw add/sub/xor 0 -> atomicrmw or canonicalization (#87533 ) InstCombine transforms add of 0 to or of 0. For system atomics, this is problematic because while PCIe supports add, it does not support the other operations. Undo this for system scope atomics.	2024-04-13 00:24:12 +02:00
Matt Arsenault	4cb110a84f	[RFC] IR: Support atomicrmw FP ops with vector types (#86796 ) Allow using atomicrmw fadd, fsub, fmin, and fmax with vectors of floating-point type. AMDGPU supports atomic fadd for <2 x half> and <2 x bfloat> on some targets and address spaces. Note this only supports the proper floating-point operations; float vector typed xchg is still not supported. cmpxchg still only supports integers, so this inserts bitcasts for the loop expansion. I have support for fp vector typed xchg, and vector of int/ptr separately implemented but I don't have an immediate need for those beyond feature consistency.	2024-04-06 15:27:45 -04:00
Kevin P. Neal	fe893c93b7	[FPEnv][AtomicExpand] Correct strictfp attribute handling in AtomicExpandPass (#87082 ) The AtomicExpand pass was lowering function calls with the strictfp attribute to sequences that included function calls incorrectly lacking the attribute. This patch corrects that. The pass now also emits the correct constrained fp call instead of normal FP instructions when in a function with the strictfp attribute. Test changes verified with D146845.	2024-03-29 14:54:51 -04:00
Rishabh Bali	fe42e72db2	[CodeGen] Port AtomicExpand to new Pass Manager (#71220 ) Port the `atomicexpand` pass to the new Pass Manager. Fixes #64559	2024-02-25 18:42:22 +05:30
James Y Knight	137f785fa6	[AMDGPU] Set MaxAtomicSizeInBitsSupported. (#75185 ) This will result in larger atomic operations getting expanded to `__atomic_*` libcalls via AtomicExpandPass, which matches what Clang already does in the frontend. While AMDGPU currently disables the use of all libcalls, I've changed it to instead disable all of them _except_ the atomic ones. Those are already emitted by the Clang frontend, and enabling them in the backend allows the same behavior there.	2023-12-18 16:51:06 -05:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. One case that cannot currently be differentiated from a missing datalayout is an explicit `target datalayout = ""` in the IR file, since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Alex Richardson	e86d6a43f0	Regenerate test checks for tests affected by D141060	2023-10-04 10:51:35 -07:00
Pravin Jagtap	5f8fd68672	[AMDGPU] Pre-commit test for D157495 Reviewed By: yassingh Differential Revision: https://reviews.llvm.org/D158243	2023-08-18 06:52:32 -04:00
Matt Arsenault	7575ee7167	AMDGPU: Add more test coverage for FP-typed atomicrmw xchg	2023-08-10 17:38:25 -04:00
Matt Arsenault	35be9e2903	AtomicExpand: Preserve syncscope when expanding partword atomics	2023-08-08 14:38:06 -04:00
Matt Arsenault	3371849194	AMDGPU: Round out system atomics tests There were system scope tests only for integer min/max. Expand this to cover all of the integer operations.	2023-08-08 14:38:05 -04:00
Matt Arsenault	b97e9a9a03	AMDGPU: Fix some typed pointers in atomic expand test	2023-08-08 14:38:05 -04:00
Kai Luo	f26af16e2c	[PowerPC][AIX] Enable quadword atomics by default for AIX On AIX, a libatomic supporting inline quadword atomic operations has been released, so compatibility is no longer an issue and we can enable quadword atomics by default. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D151312	2023-07-25 08:21:07 +08:00

143 Commits