This PR adds a warning that's emitted when a non-streaming or
non-streaming-compatible builtin is called in an unsuitable function.
Uses work by Kerry McLaughlin.
This is a re-upload of #74064 and fixes a compile time increase.
This patch is needed for the reduction instructions in SVE2.1.
It adds a new header to SVE with all the fixed vector types.
The new types are only added if neon is not declared.
This adds new intrinsics to support the LDAP1 and STL1 Advanced SIMD
(Neon) instructions introduced as part of FEAT_LRCPC3.
The new intrinsics `vldap1(q)_lane`/`vstl1(q)_lane` generate IR code
similar to the existing `vld1(q)_lane`/`vst1(q)_lane` ones, but capturing
the difference in the atomic release/acquire memory model.
The LLVM code generation changes to ensure that this instruction pair
is lowered to the correct LDAP1/STL1 instructions will be covered in a
separate commit.
Based on a patch by Sam Elliott.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D153128
As of https://reviews.llvm.org/D79708, clang-tblgen generates `arm_neon.h`,
`arm_sve.h` and `arm_bf16.h`, and all those generated files will contain a
typedef of `bfloat16_t`. However, `arm_neon.h` and `arm_sve.h` include
`arm_bf16.h` immediately before their own typedef:
#include <arm_bf16.h>
typedef __bf16 bfloat16_t;
With a recent version of clang (I used 16.0.1) this results in warnings
(reported as errors here because of -Werror):
/usr/lib/clang/16/include/arm_neon.h:38:16: error: redefinition of typedef 'bfloat16_t' is a C11 feature [-Werror,-Wtypedef-redefinition]
Since `arm_bf16.h` is very likely supposed to be the one true place where
`bfloat16_t` is defined, I propose to delete the duplicate typedefs from the
generated `arm_neon.h` and `arm_sve.h`.
Reviewed By: sdesmalen, simonbutcher
Differential Revision: https://reviews.llvm.org/D148822
Reported by Coverity:
AUTO_CAUSES_COPY
Unnecessary object copies can affect performance.
1. Inside "ExtractAPIVisitor.h" file, in clang::extractapi::impl::ExtractAPIVisitorBase<<unnamed>::BatchExtractAPIVisitor>::VisitFunctionDecl(clang::FunctionDecl const *): Using the auto keyword without an & causes the copy of an object of type DynTypedNode.
2. Inside "NeonEmitter.cpp" file, in <unnamed>::Intrinsic::Intrinsic(llvm::Record *, llvm::StringRef, llvm::StringRef, <unnamed>::TypeSpec, <unnamed>::TypeSpec, <unnamed>::ClassKind, llvm::ListInit *, <unnamed>::NeonEmitter &, llvm::StringRef, llvm::StringRef, bool, bool): Using the auto keyword without an & causes the copy of an object of type Type.
3. Inside "MicrosoftCXXABI.cpp" file, in <unnamed>::MSRTTIBuilder::getClassHierarchyDescriptor(): Using the auto keyword without an & causes the copy of an object of type MSRTTIClass.
4. Inside "CGGPUBuiltin.cpp" file, in clang::CodeGen::CodeGenFunction::EmitAMDGPUDevicePrintfCallExpr(clang::CallExpr const *): Using the auto keyword without an & causes the copy of an object of type CallArg.
5. Inside "SemaDeclAttr.cpp" file, in threadSafetyCheckIsSmartPointer(clang::Sema &, clang::RecordType const *): Using the auto keyword without an & causes the copy of an object of type CXXBaseSpecifier.
6. Inside "ComputeDependence.cpp" file, in clang::computeDependence(clang::DesignatedInitExpr *): Using the auto keyword without an & causes the copy of an object of type Designator.
7. Inside "Format.cpp" file, in clang::format::affectsRange(llvm::ArrayRef<clang::tooling::Range>, unsigned int, unsigned int): Using the auto keyword without an & causes the copy of an object of type Range.
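To illustrate the class of issue Coverity flags here, the following is a minimal sketch. `Tracked`, `sumByValue` and `sumByReference` are hypothetical names that do not come from the clang sources; the type only exists to count copies.

```cpp
#include <vector>

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked {
  static int copies;
  int value = 0;
  Tracked() = default;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked &other) : value(other.value) { ++copies; }
  Tracked &operator=(const Tracked &) = default;
};
int Tracked::copies = 0;

// `auto` deduces a value type, so every iteration copies one element.
int sumByValue(const std::vector<Tracked> &items) {
  int sum = 0;
  for (auto item : items)
    sum += item.value;
  return sum;
}

// `const auto &` binds to the element in place, so nothing is copied.
int sumByReference(const std::vector<Tracked> &items) {
  int sum = 0;
  for (const auto &item : items)
    sum += item.value;
  return sum;
}
```

Both functions return the same sum; only the by-value loop pays for one copy per element.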
Reviewed By: tahonermann
Differential Revision: https://reviews.llvm.org/D149074
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
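As a rough sketch of what the spelling looks like after the migration, assuming a C++17 compiler: `findIndex` below is a hypothetical helper, not a function from the LLVM tree.

```cpp
#include <optional>
#include <string>

// Hypothetical lookup helper, used only to show the spelling change.
std::optional<int> findIndex(const std::string &haystack, char needle) {
  const std::string::size_type pos = haystack.find(needle);
  if (pos == std::string::npos)
    return std::nullopt; // with llvm::Optional this was `return None;`
  return static_cast<int>(pos);
}
```

Callers that only test `has_value()` or compare against a value should not need to change.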
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.
The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependent on target features. From there
the TargetGuards are used in two ways:
- For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
is added to the definition of the function. Along with the existing
always_inline attribute, this will give a compile time error if the
function is used in a context where the target feature is not available.
- For intrinsics emitted as macros, the __builtins are emitted into
arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
the target feature and gives an error if the builtin is found in a
function without the required features, similar to arm_sve.h.
The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.
Differential Revision: https://reviews.llvm.org/D132034
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
This patch adds the following SHA3 Intrinsics:
vsha512hq_u64
vsha512h2q_u64
vsha512su0q_u64
vsha512su1q_u64
veor3q_u8
veor3q_u16
veor3q_u32
veor3q_u64
veor3q_s8
veor3q_s16
veor3q_s32
veor3q_s64
vrax1q_u64
vxarq_u64
vbcaxq_u8
vbcaxq_u16
vbcaxq_u32
vbcaxq_u64
vbcaxq_s8
vbcaxq_s16
vbcaxq_s32
vbcaxq_s64
Note: +sha3 and +crypto need to be included when building from the front-end.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D96381
This patch adds the LANE variants for VCMLA on AArch64 as defined in
"Arm Neon Intrinsics Reference for ACLE Q3 2020" [1]
This patch also updates `dup_typed` to accept constant type strings directly.
Based on a patch by Tim Northover.
[1] https://developer.arm.com/documentation/ihi0073/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D93014
Summary:
Whenever Neon is not supported, a generic message is printed:
error: "NEON support not enabled"
Followed by a series of other error messages that are not useful once
the first one is printed.
This patch gives a more precise message in the case where Neon is
unsupported because an invalid float ABI was specified: the soft float
ABI.
error: "NEON intrinsics not available with the soft-float ABI. Please
use -mfloat-abi=softfp or -mfloat-abi=hard"
This message is the same one that GCC gives, so it also makes the two
compilers' diagnostics more compatible with each other.
Also, by rearranging preprocessor directives, these "unsupported" error
messages are now the only ones printed out, which is also GCC's
behaviour.
Differential Revision: https://reviews.llvm.org/D81847
Summary:
The poly64 types are guarded with ifdefs for AArch64 only. This is wrong. This
was also incorrectly documented in the ACLE spec, but this has been rectified in
the latest release. See paragraph 13.1.2 "Vector data types":
https://developer.arm.com/docs/101028/latest
This patch was written by Alexandros Lamprineas.
Reviewers: ostannard, sdesmalen, fpetrogalli, labrinea, t.p.northover, LukeGeeson
Reviewed By: ostannard
Subscribers: pbarrio, LukeGeeson, kristof.beyls, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79711
These cases all follow the same pattern:
struct A {
friend class X;
//...
class X {};
};
But 'friend class X;' injects 'X' into the surrounding namespace scope,
rather than introducing a class member. So the second 'class X {}' is a
completely different type, which changes the meaning of the earlier name
'X' from '::X' to 'A::X'.
Additionally, the friend declaration is pointless -- members of a class
don't need to be befriended to be able to access private members.
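A minimal sketch of the two distinct entities involved, written so that it stays well-formed: the nested class is given a different name (`Inner`, hypothetical) and everything sits in a hypothetical `demo` namespace.

```cpp
namespace demo {

struct A {
  // Declares demo::X, a class at namespace scope, as a friend.
  // This does not introduce a member of A.
  friend class X;

  // A nested class is a member of A, so it can read A's private
  // members without any friend declaration.
  class Inner {
  public:
    static int peek(const A &a) { return a.secret; }
  };

private:
  int secret = 42;
};

// The class that the friend declaration above actually names.
class X {
public:
  static int peek(const A &a) { return a.secret; }
};

} // namespace demo
```

Here `demo::X` needs the friend declaration to read `secret`, while `demo::A::Inner` does not.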
Summary:
Range checks were not properly performed in the lane arguments of Neon
intrinsics implemented based on splat operations. Calls to those
intrinsics were translated to `__builtin_shufflevector` calls directly
by the pre-processor through the arm_neon.h macros, missing the chance
for the proper range checks.
This patch enables the range check by introducing an auxiliary splat
instruction in arm_neon.td, delaying the translation to shufflevector
calls to CGBuiltin.cpp in clang after the checks were performed.
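As a loose analogy in plain C++ (this is not how clang implements the check), a splat that goes through a dedicated entry point can validate its lane index before any shuffle is built. `Vec4` and `splat_lane` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<int, 4>;

// Hypothetical checked splat: the lane index is a compile-time constant,
// so an out-of-range value is rejected before the shuffle is performed.
template <std::size_t Lane>
Vec4 splat_lane(const Vec4 &v) {
  static_assert(Lane < 4, "lane index out of range");
  return {v[Lane], v[Lane], v[Lane], v[Lane]};
}
```

With this shape, `splat_lane<4>(v)` fails to compile instead of silently reading past the vector.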
Reviewers: jmolloy, t.p.northover, rsmith, olista01, ostannard
Reviewed By: ostannard
Subscribers: ostannard, dnsampaio, danielkiss, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74619
Summary:
As multiple versions of the same Neon intrinsic can be created through
the same TableGen definition with the same argument types, the existing
`call` operator is not always able to properly perform overload
resolutions.
As these different intrinsic versions are differentiated later on by the
NeonEmitter through name mangling, this patch introduces a new
`call_mangled` operator to the TableGen definitions, which allows a call
for an otherwise ambiguous intrinsic by matching its mangled name with
the mangled variation of the caller.
Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio
Reviewed By: dnsampaio
Subscribers: dnsampaio, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74618
This reverts commit 62ab15ffa3f910c36758e99324deac12ee006c90.
Multiple commits were unintentionally squashed into this one. Reverting
so each of them can be pushed properly.
Summary:
Some of the `*_laneq` intrinsics defined in arm_neon.td were missing the
setting of the `isLaneQ` attribute. This patch sets the attribute on the
related definitions, as they will be required to properly perform range
checks on their lane arguments.
Reviewers: jmolloy, t.p.northover, rsmith, olista01, dnsampaio
Reviewed By: dnsampaio
Subscribers: dnsampaio, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74616
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
The modifier system used to mutate types on NEON intrinsic definitions had a
separate letter for all kinds of transformations that might be needed, and we
were quite quickly running out of letters to use. This patch converts to a much
smaller set of orthogonal modifiers that can be applied together to achieve the
desired effect.
When merging with downstream it is likely to cause a conflict with any local
modifications to the .td files. There is a new script in
utils/convert_arm_neon.py that was used to convert all .td definitions and I
would suggest running it on the last downstream version of those files before
this commit rather than resolving conflicts manually.
The original version broke vcreate_* because it became a macro and didn't
apply the normal integer promotion rules before bitcasting to a vector.
This adds a temporary.
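A minimal sketch of why the temporary helps, assuming a GCC or Clang compiler (it relies on the `vector_size` attribute and statement expressions). `u64x1`, `MAKE_U64X1` and `lane0` are hypothetical stand-ins, not the definitions generated into arm_neon.h.

```cpp
#include <cstdint>
#include <cstring>

// One-lane 64-bit vector built with the GCC/Clang vector_size extension.
// It stands in for uint64x1_t and is not the arm_neon.h definition.
typedef std::uint64_t u64x1 __attribute__((vector_size(8)));

// Casting the macro argument directly, as in `(u64x1)(p0)`, is rejected for
// MAKE_U64X1(0) because the literal has type int, which is narrower than the
// vector. Assigning to a 64-bit temporary converts the value first.
#define MAKE_U64X1(p0)          \
  __extension__({               \
    std::uint64_t tmp_ = (p0);  \
    (u64x1) tmp_;               \
  })

// Reads the single lane back out through memcpy.
inline std::uint64_t lane0(const u64x1 &v) {
  std::uint64_t out;
  std::memcpy(&out, &v, sizeof(out));
  return out;
}
```

The same-size cast from the temporary is a plain bit reinterpretation, so the argument's original type no longer matters.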
This broke the vcreate_u64 intrinsic. Example:
$ cat /tmp/a.cc
#include <arm_neon.h>
void g() {
auto v = vcreate_u64(0);
}
$ bin/clang -c /tmp/a.cc --target=arm-linux-androideabi16 -march=armv7-a
/tmp/a.cc:4:12: error: C-style cast from scalar 'int' to vector 'uint64x1_t' (vector of 1 'uint64_t' value) of different size
auto v = vcreate_u64(0);
^~~~~~~~~~~~~~
/work/llvm.monorepo/build.release/lib/clang/10.0.0/include/arm_neon.h:4144:11: note: expanded from macro 'vcreate_u64'
__ret = (uint64x1_t)(__p0); \
^~~~~~~~~~~~~~~~~~
Reverting until this can be investigated.
> The modifier system used to mutate types on NEON intrinsic definitions had a
> separate letter for all kinds of transformations that might be needed, and we
> were quite quickly running out of letters to use. This patch converts to a much
> smaller set of orthogonal modifiers that can be applied together to achieve the
> desired effect.
>
> When merging with downstream it is likely to cause a conflict with any local
> modifications to the .td files. There is a new script in
> utils/convert_arm_neon.py that was used to convert all .td definitions and I
> would suggest running it on the last downstream version of those files before
> this commit rather than resolving conflicts manually.
For some reason we were not casting a fairly obscure class of builtin calls we
expected to be polymorphic to vectors of char. It worked because the only
affected intrinsics weren't actually polymorphic after all, but is
unnecessarily complicated.
'a' used to implement a splat in C++ code in NeonEmitter.cpp, but this
can be done directly from .td expansions now (and most ops already did).
So removing it simplifies the overall code.
https://reviews.llvm.org/D69716
Previously we had a handful of bools (Signed, Floating, ...) that could
easily end up in an inconsistent state. This adds an enum Kind which
holds the mutually exclusive states a type might be in, retaining some
of the bools that modified an underlying type.
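The idea can be sketched roughly as follows. `ElementType`, its `Kind` values and its member names are hypothetical and are not the ones used in NeonEmitter.cpp.

```cpp
// Hypothetical sketch; names and fields are illustrative only.
class ElementType {
public:
  // Mutually exclusive states: exactly one holds at any time, so
  // combinations such as "signed and floating" cannot be represented.
  enum class Kind { Void, SInt, UInt, Float, Poly };

  ElementType(Kind kind, unsigned bits) : kind_(kind), bits_(bits) {}

  bool isInteger() const { return kind_ == Kind::SInt || kind_ == Kind::UInt; }
  bool isSigned() const { return kind_ == Kind::SInt; }
  bool isFloating() const { return kind_ == Kind::Float; }
  unsigned bits() const { return bits_; }

  // Changing signedness only makes sense for integer kinds.
  void makeUnsigned() {
    if (isInteger())
      kind_ = Kind::UInt;
  }

  // Flags that modify the underlying type stay as independent bools.
  bool isConstant = false;
  bool isPointer = false;

private:
  Kind kind_;
  unsigned bits_;
};
```

Queries are derived from the single `Kind` value, so they cannot disagree with each other the way separate bools could.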
https://reviews.llvm.org/D69715
It's completely impossible to check that I've actually found all the
issues, due to the use of macros in arm_neon.h, but hopefully this time
it'll take more than a few hours for someone to find another issue.
I have no idea why, but apparently there's a rule that some, but not
all, builtins which should take an fp16 vector actually take an int8
vector as an argument. Fix this, and add test coverage.
Differential Revision: https://reviews.llvm.org/D68838
llvm-svn: 375179
Just running -fsyntax-only over arm_neon.h doesn't cover some intrinsics
which are defined using macros. Add more test coverage for that.
arm-neon-header.c wasn't checking the full set of available NEON target
features; change the target architecture of the test to account for
that.
Fix the generator for arm_neon.h to generate casts in more cases where
they are necessary.
Fix VFMLAL_LOW etc. to express their signatures differently, so the
builtins have the expected type. Maybe the TableGen backend should
detect intrinsics that are defined the wrong way, and produce an error.
The rules here are sort of strange.
Differential Revision: https://reviews.llvm.org/D68743
llvm-svn: 374419
Really, we were already 99% of the way there; just needed a couple minor
fixes that affected 64-bit-only builtins. Based on D61717.
Note that the change to builtin_str changes the type of a few
__builtin_neon_* intrinsics that had the "wrong" type.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43341
Differential Revision: https://reviews.llvm.org/D68683
llvm-svn: 374191
Summary:
The declarations of Arm Neon intrinsics that are
"big endian safe" print the same code for big
and little endian targets.
This patch avoids duplicates by checking if an
intrinsic is safe to have a single definition
(this decreases the header by 11k lines out of 73k).
Reviewers: t.p.northover, ostannard, labrinea
Reviewed By: ostannard
Subscribers: kristof.beyls, cfe-commits, olista01
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66588
llvm-svn: 370716