These use a new VOP3PX encoding for the v_mfma_scale_* instructions,
which bundles the pre-scale v_mfma_ld_scale_b32. None of the modifiers
are supported yet (op_sel, neg or clamp).
I'm not sure the intrinsic should really expose op_sel (or any of the
others). If I'm reading the documentation correctly, we should be able
to just have the raw scale operands and auto-match op_sel to byte
extract patterns.
The op_sel syntax also seems extra horrible in this usage, especially with the
usual assumed op_sel_hi=-1 behavior.
Recent change applied too strict check for old and src operands match.
These shall be compatible, but not necessarily exactly the same.
Fixes: SWDEV-493072
OpenCL has a reserved operator (^^), the use of which was diagnosed as
an error (735c6cdebdcd4292928079cb18a90f0dd5cd65fb). However, OpenCL
also encourages working with the blocks language extension. This token
has a parsing ambiguity as a result. Consider:
unsigned x=0;
unsigned y=x^^{return 0;}();
This should result in y holding the value zero (0^0) through an
immediately invoked block call as the right-hand side of the xor
operator. However, it causes errors instead because of this reserved
token: https://godbolt.org/z/navf7jTv1
This token is still reserved in OpenCL 3.0, so we still wish to issue a
diagnostic for its use. However, we do not need to create a token for an
extension point that's been unused for about a decade. So this patch
moves the diagnostic from a parsing diagnostic to a lexing diagnostic
and no longer forms a single token. The diagnostic behavior is slightly
worse as a result, but still seems acceptable.
Part of the reason this is coming up is because WG21 is considering
using ^^ as a token for reflection, so this token may come back in the
future.
This patch adds a new builtin type for AMDGPU's buffer rsrc data type,
which is effectively an AS 8 pointer. This is needed because we'd like
to expose certain intrinsics to users via builtins which take buffer
rsrc as argument.
This change seeks to add support for vendor flavoured SPIRV - more
specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that
carries some extra bits of information that are only usable by AMDGCN
targets, forfeiting absolute genericity to obtain greater expressiveness
for target features:
- AMDGCN inline ASM is allowed/supported, under the assumption that the
[SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc)
extension is enabled/used
- AMDGCN target specific builtins are allowed/supported, under the
assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is
enabled when using the downstream translator
- the featureset matches the union of AMDGCN targets' features
- the datalayout string is overspecified to affix both the program
address space and the alloca address space, the latter under the
assumption that the
[SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc)
extension is enabled/used, case in which the extant SPIRV datalayout
string would lead to pointers to function pointing to the private
address space, which would be wrong.
Existing AMDGCN tests are extended to cover this new target. It is
currently dormant / will require some additional changes, but I thought
I'd rather put it up for review to get feedback as early as possible. I
will note that an alternative option is to place this under AMDGPU, but
that seems slightly less natural, since this is still SPIRV, albeit
relaxed in terms of preconditions & constrained in terms of
postconditions, and only guaranteed to be usable on AMDGCN targets (it
is still possible to obtain pristine portable SPIRV through usage of the
flavoured target, though).
Using MMRAs, allow `builtin_amdgcn_fence` to emit fences that only
target one or more address spaces, instead of fencing all address spaces
at once.
This is done through a `amdgpu-as` MMRA. Currently focused on OpenCL
fences, but can very easily support more AS names and codegen on more
than just fences.
This tries to fix all of the places where a diagnostic message starts
with a capital letter (other than acroynyms or proper nouns) or ends
with punctuation (other than a question mark).
This is in support of a planned change to tablegen to start diagnosing
incorrect diagnostic message styles.
As it was pointed out in
https://github.com/llvm/llvm-project/pull/80724, we should not be
checking `getLangOpts().Bool` when determining something related to
logical operators, since it only indicates that bool keyword is present,
not which semantic logical operators have. As a side effect a missing
`-Wpointer-bool-conversion` in OpenCL C was restored since like C23,
OpenCL C has bool keyword but logical operators still return int.
Summary:
These tests likely always failed but was hidden by the expected return
value. Simply make them require AMDGPU as a registered target so they
don't fail on other machines.
Summary:
Currently we cannot compile `__builtin_amdgcn_ballot_w64` on non-wave64
targets even though it is valid. This is relevant for making library
code that can handle both without needing to check the wavefront size.
This patch relaxes the semantic check for w64 so it can be used
normally.
LLVM IR already allows floating point type in atomicrmw.
Update clang atomic fetch max/min builtins to accept
floating point type like we did for fetch add/sub.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D150985
Fixes: SWDEV-401056
This is an ongoing series of commits that are reformatting our
Python code.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D150761
This implements support for allowing {} to consistently zero initialize
objects. We already supported most of this work as a GNU extension, but
the C2x feature goes beyond what the GNU extension allowed.
The changes in this patch are:
* Removed the -Wgnu-empty-initializer warning group. The extension is
now a C2x extension warning instead. Note that use of
`-Wno-gnu-empty-initializer seems` to be quite low in the wild
(https://sourcegraph.com/search?q=context%3Aglobal+-file%3A.*test.*+%22-Wno-gnu-empty-initializer%22&patternType=standard&sm=1&groupBy=repo
which currently only gives 8 hits total), so this is not expected to
be an overly disruptive change. But I'm adding the clang vendors
review group just in case this expectation is wrong.
* Reworded the diagnostic wording to be about a C2x extension, added a
pre-C2x compat warning.
* Allow {} to zero initialize a VLA
This functionality is exposed as an extension in all older C modes
(same as the GNU extension was), but does *not* allow the extension for
VLA initialization in C++ due to concern about handling non-trivially
constructible types.
Differential Revision: https://reviews.llvm.org/D147349
Structs that contain global or local pointers can be passed as kernel
arguments starting OpenCL v2.0 with the introduction of shared virtual memory.
Differential Revision: https://reviews.llvm.org/D143849
`opencl-c-base.h` always defines 5 particular feature macros for
SPIR-V, making it impossible to disable those features.
To allow disabling any of those features, let the header recognize
`__undef_<feature>` macros. The user can then pass the
`-D__undef_<feature>` flag on the command line to disable a specific
feature. The __undef macro could potentially also be set from
`-cl-ext=-feature`, but for now only change the header and only
provide __undef macros for the 5 features that are always enabled in
`opencl-c-base.h`.
Differential Revision: https://reviews.llvm.org/D141297
It turns out we can codegen these on targets without flat addressing,
although the runtime probably didn't put anything useful there. The
proper diagnostic would be to disallow flat pointer uses or languages
with them, not this one edge case. Allows removing one of the special
cases requiring subtarget support in the device libraries.
Use explicit _w32/_w64 suffixes for the wave size to be consistent
with the existing other wave dependent intrinsics. Also start
diagnosing trying to use both wave32 and wave64.
I would have preferred to avoid the +wavefrontsize64 spam on targets
where that's the only option, but avoiding this seems to be more work
than I expected.
Added keyword, LangAS and TypeAttrbute for groupshared.
Tanslate it to LangAS with asHLSLLangAS.
Make sure it translated into address space 3 for DirectX target.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D135060