119 Commits

Author SHA1 Message Date
Jessica Del
32f9983c06
[AMDGPU] - Add address space for strided buffers (#74471)
This is an experimental address space for strided buffers. These buffers
can have structs as elements and
a stride > 1.
These pointers allow the indexed access in units of stride, i.e., they
point at `buffer[index * stride]`.
Thus, we can use the `idxen` modifier for buffer loads.

We assign address space 9 to 192-bit buffer pointers which contain a
128-bit descriptor, a 32-bit offset and a 32-bit index. Essentially,
they are fat buffer pointers with an additional 32-bit index.
2023-12-15 15:49:25 +01:00
Kazu Hirata
f3dcc2351c
[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 08:54:13 -08:00
Dominik Adamski
276a024b49
[NFC][AMDGPU] Unify AMDGPU address space enum (#73944)
Types of AMDGPU address space were defined not only in Clang-specific class
but also in LLVM header.

If we unify the AMD GPU address space enumeration, then we can reuse it in
Clang, Flang and LLVM.
2023-12-11 10:45:21 +01:00
Yaxun (Sam) Liu
b8a9c50f22 [AMDGPU] Add target feature gws to clang
Reviewed by: Matt Arsenault

Differential Revision: https://reviews.llvm.org/D158367
2023-08-25 11:50:47 -04:00
Yaxun (Sam) Liu
7f12dcac79 [HIP] Fix regression about __fp16 args and return value
HIP allows __fp16 as function arguments and return value by passing
-fallow-half-arguments-and-returns to clang through hipcc.

https://reviews.llvm.org/D133885 removed -fallow-half-arguments-and-returns
and add a TargetInfo member to control it.

This caused regressions in some HIP apps
(https://github.com/ROCm-Developer-Tools/HIP/issues/3178).

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D145345

Fixes: https://github.com/ROCm-Developer-Tools/HIP/issues/3178
2023-08-01 11:29:19 -04:00
Yaxun (Sam) Liu
ad96f25b93 [AMDGPU] Rename predefined macro __AMDGCN_WAVEFRONT_SIZE
rename it to __AMDGCN_WAVEFRONT_SIZE__ for consistency.

__AMDGCN_WAVEFRONT_SIZE will be deprecated in the future.

Reviewed by: Matt Arsenault, Johannes Doerfert

Differential Revision: https://reviews.llvm.org/D154207
2023-07-02 11:10:06 -04:00
Yaxun (Sam) Liu
c0f0d50653 [HIP] emit macro __HIP_NO_IMAGE_SUPPORT
HIP texture/image support is optional as some devices
do not have image instructions. A macro __HIP_NO_IMAGE_SUPPORT
is defined for device not supporting images (d0448aa4c4/docs/reference/kernel_language.md (L426) )

Currently the macro is defined by HIP header based on predefined macros
for GPU, e.g __gfx*__ , which is error prone. This patch let clang
emit the predefined macro.

Reviewed by: Matt Arsenault, Artem Belevich

Differential Revision: https://reviews.llvm.org/D151349
2023-06-14 22:53:41 -04:00
Yaxun (Sam) Liu
6adb9a0602 [AMDGPU] Emit predefined macro __AMDGCN_CUMODE__
Predefine __AMDGCN_CUMODE__ as 1 or 0 when compilation assumes CU or WGP modes.

If WGP mode is not supported, ignore -mno-cumode and emit a warning.

This is needed for implementing device functions like __smid
(312dff7b79/include/hip/amd_detail/amd_device_functions.h (L957))

Reviewed by: Matt Arsenault, Artem Belevich, Brian Sumner

Differential Revision: https://reviews.llvm.org/D145343
2023-05-12 18:50:52 -04:00
Krzysztof Drewniak
f0415f2a45 Re-land "[AMDGPU] Define data layout entries for buffers""
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.

Differential Revision: https://reviews.llvm.org/D149776
2023-05-03 19:43:56 +00:00
Krzysztof Drewniak
3f2fbe92d0 Revert "[AMDGPU] Define data layout entries for buffers"
This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850.

Differential Revision: https://reviews.llvm.org/D149758
2023-05-03 16:11:00 +00:00
Krzysztof Drewniak
f9c1ede254 [AMDGPU] Define data layout entries for buffers
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. ptr
addrspace(8). These pointers are 128-bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, [addr_max]] value from applying to them.

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441
2023-05-03 15:25:58 +00:00
Dominik Adamski
e43247dd32 [Clang][Flang][AMDGPU] Add support for AMDGPU to Flang driver
Scope of changes:
  1) Extract common code between Clang and Flang for parsing AMDGPU features
  2) Add function which adds implicit target features for AMDGPU as Clang does
  3) Add AMDGPU target as one of valid targets for Flang

Differential Revision: https://reviews.llvm.org/D145579

Reviewed By: yaxunl, awarzynski
2023-03-29 02:23:37 -05:00
Anshil Gandhi
a955a31896 [AMDGPU] Replace target feature for global fadd32
Change target feature of __builtin_amdgcn_global_atomic_fadd_f32
to atomic-fadd-rtn-insts. Enable atomic-fadd-rtn-insts for gfx90a,
gfx940 and gfx1100 as they all support the return variant of
`global_atomic_add_f32`.

Fixes https://github.com/llvm/llvm-project/issues/61331.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D146840
2023-03-28 15:58:30 -06:00
Mariusz Sikora
ea064ee2a3 [AMDGPU] Create Subtarget Features for some of 16 bits atomic fadd instructions
Introducing Subtarget Features for instructions:
- ds_pk_add_bf16
- ds_pk_add_f16
- ds_pk_add_rtn_bf16
- ds_pk_add_rtn_f16
- flat_atomic_pk_add_f16
- flat_atomic_pk_add_bf16
- global_atomic_pk_add_f16
- global_atomic_pk_add_bf16
- buffer_atomic_pk_add_f16

Differential Revision: https://reviews.llvm.org/D146701
2023-03-24 13:10:40 +01:00
Stanislav Mekhanoshin
df0488369d [AMDGPU] Split dot7 feature
Differential Revision: https://reviews.llvm.org/D142507
2023-01-26 10:34:36 -08:00
Stanislav Mekhanoshin
870b92977e [AMDGPU] Split dot8 feature
Differential Revision: https://reviews.llvm.org/D142407
2023-01-24 11:16:07 -08:00
Stanislav Mekhanoshin
4ab2246d48 [AMDGPU] Remove dot1 and dot6 features from clang for gfx11
These are unsupported.

Differential Revision: https://reviews.llvm.org/D142493
2023-01-24 10:52:42 -08:00
serge-sans-paille
5a7f47cc02
[clang] Optimize clang::Builtin::Info density
Reorganize clang::Builtin::Info to have them naturally align on 4 bytes
boundaries.

Instead of storing builtin headers as a straight char pointer, enumerate
them and store the enum. It allows to use a small enum instead of a
pointer to reference them.

On a 64 bit machine, this brings sizeof(clang::Builtin::Info) from 56
down to 48 bytes.

On a release build on my Linux 64 bit machine, it shrinks the size of
libclang-cpp.so by 193kB.

The impact on performance is negligible in terms of instruction count,
but the wall time seems better, see
https://llvm-compile-time-tracker.com/compare.php?from=b3d8639f3536a4876b511aca9fb7948ff9266cee&to=a89b56423f98b550260a58c41e64aff9e56b76be&stat=task-clock

Differential Revision: https://reviews.llvm.org/D142024
2023-01-23 14:27:44 +01:00
serge-sans-paille
a3c248db87
Move from llvm::makeArrayRef to ArrayRef deduction guides - clang/ part
This is a follow-up to https://reviews.llvm.org/D140896, split into
several parts as it touches a lot of files.

Differential Revision: https://reviews.llvm.org/D141139
2023-01-09 12:15:24 +01:00
Matt Arsenault
81849497b4 clang/AMDGPU: Remove flat-address-space from feature map
This was only used for checking if is_shared/is_private were legal,
which we're not bothering to do anymore.

This is apparently visible to more than the target attribute (which
seems to silently ignore unrecognized features), so this has the
potential to break something (i.e. see the OpenMP test change)
2023-01-05 16:35:04 -05:00
Matt Arsenault
f4bcd7f598 AMDGPU/clang: Add builtins for llvm.amdgcn.ballot
Use explicit _w32/_w64 suffixes for the wave size to be consistent
with the existing other wave dependent intrinsics. Also start
diagnosing trying to use both wave32 and wave64.

I would have preferred to avoid the +wavefrontsize64 spam on targets
where that's the only option, but avoiding this seems to be more work
than I expected.
2022-12-29 17:58:55 -05:00
serge-sans-paille
d9ab3e82f3
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.

The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-27 09:55:19 +01:00
Pierre van Houtryve
678d8946ba [AMDGPU] Add bf16 storage support
- [Clang] Declare AMDGPU target as supporting BF16 for storage-only purposes on amdgcn
  - Add Sema & CodeGen tests cases.
  - Also add cases that D138651 would have covered as this patch replaces it.
- [AMDGPU] Add BF16 storage-only support
  - Support legalization/dealing with bf16 operations in DAGIsel.
  - bf16 as a type remains illegal and is represented as i16 for storage purposes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D139398
2022-12-13 10:34:26 -05:00
Alex Richardson
a602f76a24 [clang][TargetInfo] Use LangAS for getPointer{Width,Align}()
Mixing LLVM and Clang address spaces can result in subtle bugs, and there
is no need for this hook to use the LLVM IR level address spaces.
Most of this change is just replacing zero with LangAS::Default,
but it also allows us to remove a few calls to getTargetAddressSpace().

This also removes a stale comment+workaround in
CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does
return the expected size for ReferenceType (and handles address spaces).

Differential Revision: https://reviews.llvm.org/D138295
2022-11-30 20:24:01 +00:00
Xiang Li
7e04c0ad63 [HLSL] Add groupshare address space.
Added keyword, LangAS and TypeAttrbute for groupshared.

Tanslate it to LangAS with asHLSLLangAS.

Make sure it translated into address space 3 for DirectX target.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D135060
2022-10-20 09:29:09 -07:00
Kazu Hirata
981cbfb592 [clang] Don't include StringSwitch.h (NFC)
These files don't seem to use StringSwitch.
2022-09-18 22:21:32 -07:00
Fangrui Song
3f18f7c007 [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131346
2022-08-08 09:12:46 -07:00
Stanislav Mekhanoshin
9fa5a6b7e8 [AMDGPU] Support for gfx940 fp8 conversions
Differential Revision: https://reviews.llvm.org/D129902
2022-07-18 11:48:43 -07:00
Kazu Hirata
ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Yaxun (Sam) Liu
af9ee3357c [HIP] fix long double size
For amdgpu target long double type is the same as double type.
The width and align of long double type was incorrectly
overridden when copying aux target properties, which
caused assertion in codegen when emitting global
variables with long double type.

This patch fix that by saving and restoring width
and align of long double type.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D127771

Fixes: SWDEV-335515
2022-06-14 21:57:56 -04:00
Yaxun (Sam) Liu
559b8fc17e [AMDGPU] emit macro __GFX9__ etc
Emit predefined macros for GPU family. e.g.
for GPU gfx9xx emit __GFX9__, etc.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D125909
2022-05-19 12:06:56 -04:00
Joe Nash
8bdfc73f63 [AMDGPU][clang] Definition of gfx11 subtarget
Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 2/N for upstreaming of AMDGPU gfx11 architecture

Depends on D124536

Reviewed By: foad, kzhuravl, #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D124537
2022-04-29 13:55:56 -04:00
Matt Arsenault
a1303b23c9 clang/AMDGPU: Define macro for -munsafe-fp-atomics
The HIP headers want to use this to swap the implementation of the
function, rather than relying on backend expansion of the generic
atomic instruction.

Fixes: SWDEV-332998
2022-04-14 22:04:59 -04:00
Aakanksha
840695814a [AMDGPU] Add gfx1036 target
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin
2e2e64df4a [AMDGPU] Add gfx940 target
This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688
2022-03-02 13:54:48 -08:00
Jon Chesterfield
c2574e63ff [openmp][nfc] Refactor GridValues
Remove redundant fields and replace pointer with virtual function

Of fourteen fields, three are dead and four can be computed from the
remainder. This leaves a couple of currently dead fields in place as
they are expected to be used from the deviceRTL shortly. Two of the
fields that can be computed are only used from codegen and require a
log2() implementation so are inlined into codegen instead.

This change leaves the new methods in the same location in the struct
as the previous fields for convenience at review.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108380
2021-08-23 16:19:11 +01:00
Jon Chesterfield
b1efeface7 Revert "[openmp][nfc] Refactor GridValues"
Failed a nvptx codegen test
This reverts commit 2a47a84b40115b01e03e4d89c1d47ba74beb7bf3.
2021-08-20 18:17:27 +01:00
Jon Chesterfield
2a47a84b40 [openmp][nfc] Refactor GridValues
Remove redundant fields and replace pointer with virtual function

Of fourteen fields, three are dead and four can be computed from the
remainder. This leaves a couple of currently dead fields in place as
they are expected to be used from the deviceRTL shortly. Two of the
fields that can be computed are only used from codegen and require a
log2() implementation so are inlined into codegen instead.

This change leaves the new methods in the same location in the struct
as the previous fields for convenience at review.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108380
2021-08-20 16:41:26 +01:00
Jon Chesterfield
77579b99e9 [openmp][nfc] Replace OMPGridValues array with struct
[nfc] Replaces enum indices into an array with a struct. Named the
fields to match the enum, leaves memory layout and initialization unchanged.

Motivation is to later safely remove dead fields and replace redundant ones
with (compile time) computation. It should also be possible to factor some
common fields into a base and introduce a gfx10 amdgpu instance with less
duplication than the arrays of integers require.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D108339
2021-08-19 13:25:42 +01:00
Melanie Blower
aaba37187f [clang][PATCH][nfc] Refactor TargetInfo::adjust to pass DiagnosticsEngine to allow diagnostics on target-unsupported options
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D104729
2021-06-29 13:26:23 -04:00
Melanie Blower
1d85d0879a Revert "[clang][PATCH][nfc] Refactor TargetInfo::adjust to pass DiagnosticsEngine to allow diagnostics on target-unsupported options"
This reverts commit 2dbe1c675fe94eeb7973dcc25b049d25f4ca4fa0.
More buildbot failures
2021-06-28 15:47:21 -04:00
Melanie Blower
2dbe1c675f [clang][PATCH][nfc] Refactor TargetInfo::adjust to pass DiagnosticsEngine to allow diagnostics on target-unsupported options
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D104729
2021-06-28 15:09:53 -04:00
Melanie Blower
8815ef823c Revert "[clang][PATCH][nfc] Refactor TargetInfo::adjust to pass DiagnosticsEngine to allow diagnostics on target-unsupported options"
This reverts commit 2c02b0c3f45414ac6c64583e006a26113c028304.
buildbot fails
2021-06-28 12:42:59 -04:00
Melanie Blower
2c02b0c3f4 [clang][PATCH][nfc] Refactor TargetInfo::adjust to pass DiagnosticsEngine to allow diagnostics on target-unsupported options
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D104729
2021-06-28 12:26:53 -04:00
Aakanksha Patil
3453f3dd46 [AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
2021-06-24 14:32:41 -04:00
Brendon Cahoon
294efbbd3e Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2a4c032e4d573e7cdbffd622aad0a8f.

Fixed a use-after-free error that caused the sanitizers to fail.
2021-06-08 21:15:35 -04:00
Brendon Cahoon
211e584fa2 Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984ea73fcec3b12d22404a15f2f59b219.

A sanitizer buildbot reports an error.
2021-06-08 16:29:41 -04:00
Brendon Cahoon
ea10a86984 [AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
2021-06-08 12:49:49 -04:00
Alexey Bader
2ab513cd3e [SYCL] Enable opencl_global_[host,device] attributes for SYCL
Differential Revision: https://reviews.llvm.org/D100396
2021-05-18 10:27:35 +03:00
Aakanksha Patil
464e4dc50f [AMDGPU] Add gfx1034 target
Differential Revision: https://reviews.llvm.org/D102306
2021-05-13 14:25:18 -04:00