llvm-project

Author	SHA1	Message	Date
Matt Arsenault	143ca74ed3	AtomicExpand: Convert tests to opaque pointers	2022-11-28 08:43:16 -05:00
Manuel Brito	f408635b26	[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC] Differential Revision: https://reviews.llvm.org/D138483	2022-11-24 08:42:28 +00:00
Nuno Lopes	b50e1bd605	Revert "[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]" This reverts commit f50423c1a4422900aa1240fed643f5920451a88d.	2022-11-22 12:41:22 +00:00
Manuel Brito	f50423c1a4	[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC] Differential Revision: https://reviews.llvm.org/D138483	2022-11-22 11:40:25 +00:00
gonglingqin	19ae5391e3	[LoongArch] Expand atomicrmw fadd/fsub/fmin/fmax with CmpXChg Differential Revision: https://reviews.llvm.org/D137311	2022-11-14 10:11:37 +08:00
Matt Arsenault	3cfa03856f	AtomicExpand: Support cmpxchg expansion for small FP types Handles f16 atomics for AMDGPU.	2022-11-10 22:16:11 -08:00
Bjorn Pettersson	893e351f2f	[test] Avoid legacy PM default pipelines (O0,O1 etc) when running opt Two lit tests were found running something like this: opt -O<n> -pass-locked-to-legacy-PM ... The expand-atomicrmw-xchg-fp.ll seem to have used -O1 just to ensure that the -atomic-expand pass were thinking that it wasn't running at O0 level. Same thing can be ensured by using the -codegen-opt-level=1 option, making it possible to avoid using O1 in that test case. In the vector-reductions-expanded.ll test case it was possible to split the RUN line into using two opt invocations. First running "opt -O2" using the new PM, and then running "opt -expand-reductions" using the legacy PM. I think that given this patch we get closer to removing code related to 'AddOptimizationPasses' in opt.cpp. Differential Revision: https://reviews.llvm.org/D137626	2022-11-09 09:57:57 +01:00
Shilei Tian	1186e9d59f	[LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space The 32-bit floating-point atomic add instructions on AMDGPUs do not support a "flat" or "generic" address space. So, if the address space cannot be determined statically, the AMDGPU backend will fall back to a CAS loop (which does support "flat" addressing). Instead, this patch emits runtime address-space checks to allow native FP atomic add instructions for global and LDS memory (and non-atomic FP add instructions for private/scratch memory). In order to do that, this patch introduces a new interface function `emitExpandAtomicRMW`. It is expected to be called when a common atomic expand doesn't work for a specific target, such as the case we discussed here. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D129690	2022-11-04 14:11:05 -04:00
Matt Arsenault	b60a9ccd02	AtomicExpand: Use InstSimplifyFolder Automatically cleanup operations if we know the atomic has higher alignment.	2022-10-31 23:31:42 -07:00
Matt Arsenault	07f12170a2	AtomicExpand: Don't create unused instructions for some atomicrmw This wasn't used by every atomicrmw expansion.	2022-10-31 18:34:36 -07:00
Matt Arsenault	d0750ec475	AtomicExpand: Avoid some operations if the atomic is overaligned Let some of the pointer bithacking fold away if we know the LSB are 0.	2022-10-13 23:31:00 -07:00
Matt Arsenault	01adf1f3e5	AtomicExpand: Add some more overaligned atomic tests	2022-09-28 12:51:30 -04:00
Matt Arsenault	a61c3455c0	AtomicExpand: Use llvm.ptrmask instead of ptrtoint This removes the ptrtoint from the load's pointer operand, although we can't entirely eliminate these to get the LSB shift. In a future patch, this will avoid ptrtoint in the case where the atomic is overaligned to the word size.	2022-09-28 12:51:30 -04:00
Petar Avramovic	5cee9047d5	AMDGPU: Improve atomicrmw fadd selection Use same atomicrmw fadd expansion rules for gfx908, gfx940 and gfx11 as for gfx90a. Add missing globalisel legalizer support for flat atomicrmw fadd f32 on gfx940 and gfx11. Isel support for gfx11 will be added in D130579. Differential Revision: https://reviews.llvm.org/D131560	2022-09-23 17:52:10 +02:00
Petar Avramovic	48968c47b0	AMDGPU: Add detailed buffer, global and flat atomic fadd tests Precommit for D130579 that will remove manual selection and use patterns from td files. Tests are grouped based on target features. All patterns have rtn and no-rtn versions. buffer atomics patterns are selected based on the intrinsic used (raw or struct) and the offset operand (imm or vgpr): _offset raw with imm offset _offen raw with vgpr offset (or large imm offset) _idxen struct with imm offset _bothen struct with vgpr offset (or large imm offset) global and flat atomics are selected via intrinsic or the atomicrmw fadd. atomicrmw tests have amdgpu-unsafe-fp-atomics=true and non-system scope since they get expanded otherwise. atomicrmw fadd does not support vector type, test float and double. global atomics patterns are selected based on address type via (global or flat) intrinsic or atomicrmw fadd with global address(addrspace(1)). 'no suffix' vgpr addrspace(1) address _saddr sgpr addrspace(1)* address flat atomics patterns are selected via (flat)intrinsic or atomicrmw fadd with flat address (* - address space 0). Differential Revision: https://reviews.llvm.org/D131561	2022-09-23 17:52:10 +02:00
Matt Arsenault	b9a371f6d1	AtomicExpand: Use correct pointer size for integer This was using the default address space.	2022-09-20 16:51:05 -04:00
Matt Arsenault	4d322ba77b	AMDGPU: Add baseline test for expansion of 16-bit local atomics The expansion is currently using the wrong pointer size.	2022-09-20 16:51:05 -04:00
Matt Arsenault	784d2930c0	AtomicExpand: Switch test to generated checks	2022-09-20 16:51:05 -04:00
Matt Arsenault	28e03692ae	AMDGPU: Fix expansion of 16-bit atomicrmw Fixes issue 57830	2022-09-20 14:47:40 -04:00
Matt Arsenault	a4b1f7a8b5	AMDGPU: Add some tests for atomics with excess alignment	2022-09-19 19:27:21 -04:00
Matt Arsenault	3f77df8e29	AMDGPU: Update baseline test checks	2022-09-19 18:57:33 -04:00
Marco Elver	f0d6709e4a	[AtomicExpandPass] Always copy pcsections Metadata to expanded atomics When expanding IR atomics to target-specific atomics, copy all !pcsections Metadata to expanded atomics automatically. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130885	2022-09-07 11:36:01 +02:00
Kai Luo	ad2f7fd286	[AtomicExpand] Make floating point conversion happens before fence insertion IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations. This also fixes atomic load of floating point values which requires fence on PowerPC. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D127609	2022-08-31 09:54:58 +08:00
gonglingqin	e9a4b8e397	[LoongArch] Optimize the atomic store with amswap_db.[w/d] When AtomicOrdering is release or stronger, use amswap_db.[w/d] $zero, $a1, $a0 instead of dbar 0 st.[w/d] $a0, $a1, 0 Thanks to @xry111 for the suggestion: https://reviews.llvm.org/D128901#3626635 Differential Revision: https://reviews.llvm.org/D129838	2022-08-23 17:11:57 +08:00
gonglingqin	47f3dc6d49	[LoongArch] Add codegen support for atomic fence, atomic load and atomic store Differential Revision: https://reviews.llvm.org/D128901	2022-07-13 15:25:45 +08:00
Kai Luo	6710b21d46	[PowerPC] Allow llvm.ppc.cfence to accept pointer types In the context of atomic load, integer, pointer and float point types are allowed, thus we should allow llvm.ppc.cfence to accept any type mentioned. Fixes https://github.com/llvm/llvm-project/issues/55983. Reviewed By: shchenz, vchuravy Differential Revision: https://reviews.llvm.org/D127554	2022-06-24 10:55:32 +08:00
Kai Luo	8091f7120c	[PowerPC] Correct test RUN line. NFC.	2022-06-14 14:56:00 +08:00
Kai Luo	029fc37270	[PowerPC][AtomicExpand] Precommit IR tests for D127609. NFC.	2022-06-14 14:24:21 +08:00
Kai Luo	18679ac0d7	[PowerPC] Adjust `MaxAtomicSizeInBitsSupported` on PPC64 AtomicExpandPass uses this variable to determine emitting libcalls or not. The default value is 1024 and if we don't specify it for PPC64 explicitly, AtomicExpandPass won't emit `__atomic_` libcalls for those targets unable to inline atomic ops and finally the backend emits `__sync_` libcalls. Thanks @efriedma for pointing it out. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122868	2022-04-09 00:03:09 +00:00
Kai Luo	dc77769fc6	[PowerPC] Add cmpxchg test for pwr7 in atomic expand pass. NFC.	2022-04-01 13:27:54 +08:00
Kai Luo	31906a6090	[AtomicExpand][PowerPC] Fix all-one mask value When generating an all-one mask value whose bitwidth is larger than 64, signed extension should be used rather than zero extension. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D120865	2022-03-18 13:35:54 +08:00
Arthur Eubanks	2371c5a0e0	[OpaquePtr][ARM] Use elementtype on ldrex/ldaex/stlex/strex Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Basically the same as D120527. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D121847	2022-03-16 14:11:53 -07:00
Arthur Eubanks	250620f76e	[OpaquePtr][AArch64] Use elementtype on ldxr/stxr Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D120527	2022-03-14 10:09:59 -07:00
Kai Luo	1cfcbf197c	[PowerPC][atomics] Precommit test cases for i128 cmpxchg. NFC.	2022-03-03 10:47:52 +08:00
Kai Luo	1453f048cf	[PowerPC] Add lit.local.cfg in AtomicExpand tests Fixed build errors on other platforms.	2021-07-20 09:13:50 +00:00
Kai Luo	e2ee27b20b	[PowerPC] Fallback to base's implementation of shouldExpandAtomicCmpXchgInIR and shouldExpandAtomicCmpXchgInIR If we can't decide `shouldExpandAtomicCmpXchgInIR` or `shouldExpandAtomicCmpXchgInIR` in PPC's implementation after https://reviews.llvm.org/rGb9c3941cd61de1e1b9e4f3311ddfa92394475f4b, resort to base's implementation. This fixes internal build of OpenMP which uses atomic operations on float. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D106234	2021-07-20 06:14:24 +00:00
LemonBoy	b577ec4956	[AtomicExpandPass][AArch64] Promote xchg with floating-point types to integer ones Follow the same strategy used for atomic loads/stores by converting the operands to equally-sized integer types. This change prevents the atomic expansion pass from generating illegal LL/SC pairs when targeting AArch64: `expand-atomicrmw-xchg-fp.ll` would previously instantiate intrinsics such as `llvm.aarch64.ldaxr.p0f32` that cannot be lowered. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103232	2021-05-29 08:57:27 +02:00
Tomas Matheson	9d86095ff8	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit 753185031d939711f8733639a77a6fdc3bdbad22.	2021-05-03 21:48:20 +01:00
Tomas Matheson	753185031d	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164 Originally submitted as 3338290c187b254ad071f4b9cbf2ddb2623cefc0. Reverted in c7df6b1223d88dfd15248fbf7b7b83dacad22ae3.	2021-05-03 20:25:15 +01:00
LemonBoy	4751cadcca	[AArch64] Prevent spilling between ldxr/stxr pairs Apply the same logic used to check if CMPXCHG nodes should be expanded at -O0: the register allocator may end up spilling some register in between the atomic load/store pairs, breaking the atomicity and possibly stalling the execution. Fixes PR48017 Reviewed By: efriedman Differential Revision: https://reviews.llvm.org/D101163	2021-05-01 17:17:05 +02:00
Tomas Matheson	c7df6b1223	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit 3338290c187b254ad071f4b9cbf2ddb2623cefc0. Broke expensive checks on debian.	2021-04-30 16:53:14 +01:00
Tomas Matheson	3338290c18	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164	2021-04-30 16:40:33 +01:00
Stanislav Mekhanoshin	30b3aab329	Copy syncscope when expanding atomicrmw into cmpxchg loop Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902	2021-04-05 17:29:38 -07:00
Konstantin Zhuravlyov	6054a456da	AMDGPU: Add support for amdgpu-unsafe-fp-atomics attribute If amdgpu-unsafe-fp-atomics is specified, allow {flat|global}_atomic_add_f32 even if atomic modes don't match. Differential Revision: https://reviews.llvm.org/D95391	2021-02-04 08:09:34 -05:00
Pavel Iliin	4d7df43ffd	[AArch64] Out-of-line atomics (-moutline-atomics) implementation. This patch implements out of line atomics for LSE deployment mechanism. Details how it works can be found in llvm/docs/Atomics.rst Options -moutline-atomics and -mno-outline-atomics to enable and disable it were added to clang driver. This is clang and llvm part of out-of-line atomics interface, library part is already supported by libgcc. Compiler-rt support is provided in separate patch. Differential Revision: https://reviews.llvm.org/D91157	2020-11-20 13:30:12 +00:00
Alex Richardson	5bc438efcf	[AtomicExpand] Avoid creating an unnamed libcall I recently modified this pass to better support CHERI-RISC-V and while doing so I noticed that this pass was calling M->getOrInsertFunction() with the result of TLI->getLibcallName(RTLibType). However, AMDGPU fills the libcalls array with nullptr, so this creates an anonymous function instead. This patch changes expandAtomicOpToLibcall to return false in case the libcall does not exist and changes the assert() in the callees to a report_fatal_error() instead. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88800	2020-11-02 17:52:37 +00:00
Matt Arsenault	af0207f2ba	AMDGPU: Check global FP atomics match default FP mode We would always select global FP atomics from atomicrmw fadd, although they have a hardcoded FP mode.	2020-09-23 09:07:50 -04:00
Krzysztof Parzyszek	25a4b1904c	Handle part-word LL/SC in atomic expansion pass Differential Revision: https://reviews.llvm.org/D77213	2020-04-28 10:07:39 -05:00
Jonathan Roelofs	7c5d2bec76	[llvm] Fix missing FileCheck directive colons https://reviews.llvm.org/D77352	2020-04-06 09:59:08 -06:00
Matt Arsenault	32137699f7	AMDGPU: Fix copy-pasted test name error	2019-12-11 19:44:47 +05:30

81 Commits