98 Commits

Author SHA1 Message Date
CarolineConcatto
508fd966fb
[CLANG][AArch64]Add SVE tuple types for mfloat8_t (#112687)
This patch adds scalable tuple types vectors for MFloat_8 type,
according to the ACLE[1].

[1] https://github.com/ARM-software/acle.git
2024-10-18 09:10:17 +01:00
CarolineConcatto
cb43021e57
[CLANG]Add Scalable vectors for mfloat8_t (#101644)
This patch adds these new vector sizes for sve:
    svmfloat8_t

According to the ARM ACLE PR#323[1].

[1] ARM-software/acle#323
2024-10-17 09:22:55 +01:00
Sander de Smalen
f22e6d5919
[Clang][AArch64] Fix checkArmStreamingBuiltin for 'sve-b16b16' (#109420)
The implementation made the assumption that any feature starting with
"sve" meant that this was an SVE feature. This is not the case for
"sve-b16b16", as this is a feature that applies to both SVE and SME.

This meant that:
```
  __attribute__((target("+sme2,+sve2,+sve-b16b16")))
  svbfloat16_t foo(svbfloat16_t a, svbfloat16_t b, svbfloat16_t c)
                                                      __arm_streaming {
      return svclamp_bf16(a, b, c);
  }
```
would result in an incorrect diagnostic saying that `svclamp_bf16` could
only be used in non-streaming functions.
2024-10-08 10:01:40 +01:00
Rahul Joshi
a140931be5
[TableGen] Change getValueAsListOfDefs to return const pointer vector (#110713)
Change `getValueAsListOfDefs` to return a vector of const Record
pointer, and remove `getValueAsListOfConstDefs` that was added as a
transition aid.

This is a part of effort to have better const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-10-01 14:30:38 -07:00
Rahul Joshi
e9dbdb20f2
[Clang][TableGen] Change NeonEmitter to use const Record * (#110597)
This is a part of effort to have better const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-10-01 10:47:09 -07:00
Rahul Joshi
0e948bfd31
[NFC][clang][TableGen] Remove redundant llvm:: namespace qualifier (#108627)
Remove llvm:: from .cpp files, and add "using namespace llvm" if needed.
2024-09-16 06:35:34 -07:00
Rahul Joshi
711278e273
[clang][TableGen] Change SVE Emitter to use const RecordKeeper (#108503)
Change SVE Emitter to use const RecordKeeper.

This is a part of effort to have better const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-09-13 07:53:30 -07:00
SpencerAbson
1f70fcefa9
[Clang][AArch64] Add customisable immediate range checking to NEON (#100278)
This patch moves NEON immediate argument specification and checking to
the system currently shared by both SVE and SME.

In its current form, the TableGen definition of a NEON intrinsic cannot
control how its immediate arguments are range-checked, this information
must be inferred from the name of the intrinsic by NeonEmitter, which
also assumes that any NEON instruction will only ever receive a single
immediate argument. For SVE/SME instrinsics, this information is more
conveniently supplied in the TableGen definition.

As a result, for each immediate argument, NEON instructions must define
- The index of the immediate argument to be checked
- The type of immediate range check to be performed,
    (e.g., ImmCheckShiftRight)
- The index of the argument whose type defines the context
    of this immediate check (base type, vector size).
- **Difference from SVE/SME** If this definition generates a polymorphic
NEON builtin, the base type defined by this argument is overwritten by
that of the type code supplied to the overloaded builtin call. This
third argument is omitted in some cases due to this.

Here is an example for
[`vfma_laneq`](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vfma_laneq)
- The immediate is supplied in argument 3
- The immediate is used as an index into the lanes of argument 2
- So we must perform an immediate check on argument 3, based on the type
information of argument 2.
- `ImmCheck<3, ImmCheckLaneIndex, 2>`

During this work, we discovered that the existing immediate
range-checking system was largely untested, which made it difficult to
make reliable progress. Missing tests have been added to verify this
implementation against all intrinsics which take constrained immediate
arguments. All test immediate range checking tests for NEON intrinsics
are moved to a dedicated directory
`clang/test/Sema/aarch64-neon-immediate-ranges/`.
2024-09-06 13:12:37 +01:00
Kazu Hirata
1fa7f05b70
[clang] Construct SmallVector with ArrayRef (NFC) (#101898) 2024-08-04 23:46:34 -07:00
Sander de Smalen
09c0337a58
[Clang][SveEmitter] Split up TargetGuard into SVE and SME component. (#96482)
One reason to want to split this up is to simplify the code added in
#93802, where it checks the SME streaming-mode requirements for a
builtin by checking for the absence of SVE. If the target guards are
separate, we can generate a table and make the Sema code to verify the
runtime mode simpler.

Another reason is to avoid an issue with a check in SveEmitter.cpp where
it ensures that the 'VerifyRuntimeMode' is set correctly for functions
that have both SVE and SME target guards:

if (!Def->isFlagSet(VerifyRuntimeMode) &&
Def->getGuard().contains("sve") &&
      Def->getGuard().contains("sme"))
    llvm_unreachable("Missing VerifyRuntimeMode flag");

However, if we ever add a new feature with "sme" in the name, even
though it is unrelated to FEAT_SME, then this code no longer works.

Note that the arm_sve.td and arm_sme.td files could do with a bit of
restructuring after this but it seems better to follow that up in an NFC
patch.
2024-06-24 20:31:22 +01:00
Sander de Smalen
b39f523af7
[Clang][AArch64] Expose compatible SVE intrinsics with only +sme (#95787)
This allows code with SVE intrinsics to be compiled with +sme,+nosve,
assuming the encompassing function is in the correct mode (see #93802)
2024-06-21 08:13:18 +01:00
Sander de Smalen
1644a31ae9
[Clang][AArch64] Generalise streaming mode checks for builtins. (#93802)
PR #76975 added 'IsStreamingOrSVE2p1' to emit a diagnostic when a builtin marked
with 'IsStreamingOrSVE2p1' is used in a non-streaming function that is not
compiled with `+sve2p1`.

The problem is a bit more complex than only this case. For example, we've marked
lots of builtins with 'IsStreamingCompatible', meaning it can be used in either
streaming, streaming-compatible or non-streaming functions. But the code in
SemaChecking, doesn't check the appropriate target guards. This issue becomes
relevant when SVE builtins are only available in streaming mode, e.g. when
compiling for SME without SVE.

If we were to add the appropriate target guards, we'd have to add many more
combinations, e.g.:
  IsStreamingSMEOrSVE
  IsStreamingSME2OrSVE2
  IsStreamingSMEOrSVE2p1
  IsStreamingSME2OrSVE2p1
  etc.

To avoid having to add more combinations (and avoid having to add more in the
future for new extensions), we use a single 'IsSVEOrStreamingSVE' flag for all
builtins that are available in streaming mode for the appropriate SME flags, or
in non-streaming mode for the appropriate SVE flags, or both.  The code in
SemaChecking will then verify for which mode (or both) the builtin would be
defined, given the target features of the function/compilation unit.

For example:

  'svclamp' is enabled under FEAT_SVE2p1 and FEAT_SME2

* When we compile for SVE2p1 and SME (but not SME2), the builtin is undefined
  behaviour when called from a streaming function.

* When we compile for SME2 and SVE2 (but not SVE2p1), the builtin is undefined
  behaviour when called from a non-streaming function.

* When we compile for _both_ SVE2p1 and SME2, the builtin can be used in either
  mode (non-streaming, streaming or streaming-compatible)
2024-06-16 17:24:23 +01:00
Sander de Smalen
f81da75693
[Clang][AArch64] Use __clang_arm_builtin_alias for overloaded svreinterpret's (#92427)
The intrinsics are currently defined as:

```
  __aio __attribute__((target("sve")))
  svint8_t svreinterpret_s8(svuint8_t op) __arm_streaming_compatible {
    return __builtin_sve_reinterpret_s8_u8(op);
  }
```

which doesn't work when calling it from an __arm_streaming function when
only +sme is available. By defining it in the same way as we've defined
all the other intrinsics, we can leave it to the code in SemaChecking to
verify that either +sve or +sme is available.

This PR also fixes the target guards for the svreinterpret_c and
svreinterpret_b intrinsics, that convert between svcount_t and svbool_t,
as these are available both in SME2 and SVE2p1.
2024-05-23 10:42:11 +01:00
aniplcc
451cad3a27
[clang] Prefer logical && over & for boolean operations (#87276) 2024-04-02 19:21:39 +02:00
Sander de Smalen
3c90fce450
[Clang][AArch64] Add missing prototypes for streaming-compatible routines (#82649) 2024-02-23 11:31:53 +00:00
Sander de Smalen
1f6f19935c
[Clang][AArch64] Add diagnostics for builtins that use ZT0. (#79140)
Similar to what we did for ZA, this patch adds diagnostics to flag when
using a ZT0 builtin in a function that does not have ZT0 state.
2024-01-23 17:41:12 +01:00
Matthew Devereau
312acdfae1
[AArch64][SME] Take arm_sme.h out of draft (#78961) 2024-01-22 17:12:16 +00:00
Sander de Smalen
40a631f452
[Clang] Refactor diagnostics for SME builtins. (#78258)
The arm_sme.td file was still using `IsSharedZA` and `IsPreservesZA`,
which should be changed to match the new state attributes added in
#76971.

This patch adds `IsInZA`, `IsOutZA` and `IsInOutZA` as the state for the
Clang builtins and fixes up the code in SemaChecking and SveEmitter to
match.

Note that the code is written in such a way that it can be easily
extended with ZT0 state (to follow in a future patch).
2024-01-19 16:02:24 +00:00
Sander de Smalen
032c832719 [Clang][AArch64] Remove unnecessary and incorrect attributes from arm_sme.h.
These attributes were using the GNU attribute syntax, rather than the new
keyword attribute syntax, and they are no longer required as we have code
in SemaChecking to verify whether a builtin is compatible with its caller.
2024-01-16 11:20:59 +00:00
Sander de Smalen
8e7f073eb4
[Clang][AArch64] Change SME attributes for shared/new/preserved state. (#76971)
This patch replaces the `__arm_new_za`, `__arm_shared_za` and
`__arm_preserves_za` attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`

As described in https://github.com/ARM-software/acle/pull/276.

One change is that `__arm_in/out/inout/preserves(S)` are all mutually
exclusive, whereas previously it was fine to write `__arm_shared_za
__arm_preserves_za`. This case is now represented with `__arm_in("za")`.

The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can use
the existing `aarch64_pstate_za_shared` attribute in LLVM.

#77941 will add support for the new "zt0" state as introduced
with SME2.
2024-01-15 09:41:32 +00:00
Sam Tebbs
0eefcaf96d
[Clang][SME] Add IsStreamingOrSVE2p1 (#76975)
This patch adds IsStreamingOrSVE2p1 to the applicable builtins and a
warning for when those builtins are not used in a streaming or sve2p1
function.
2024-01-05 09:55:50 +00:00
Sam Tebbs
a7a78fd427
Revert "[Clang][SME] Add IsStreamingOrSVE2p1" (#76973)
Reverts llvm/llvm-project#75958

I mistakenly included a commit from my local main after rebasing.
2024-01-04 16:53:14 +00:00
Sam Tebbs
8f8152091c
[Clang][SME] Add IsStreamingOrSVE2p1 (#75958)
This patch adds IsStreamingOrSVE2p1 to the applicable builtins and a
warning for when those builtins are not used in a streaming or sve2p1
function.
2024-01-04 16:50:31 +00:00
Sander de Smalen
5055eeea52
[Clang][AArch64] Add missing SME functions to header file. (#75791)
This includes:
* __arm_in_streaming_mode()
* __arm_has_sme()
* __arm_za_disable()
* __svundef_za()
2024-01-02 09:43:30 +00:00
Sam Tebbs
a0a3c793d2
[Clang][SME] Warn when a function doesn't have ZA state (#75805)
This patch adds a warning that's emitted when a builtin call uses ZA
state but the calling function doesn't provide any.

Patch by David Sherwood <david.sherwood@arm.com>.
2023-12-18 16:14:25 +00:00
Sam Tebbs
945c645acb
[AArch64][SME] Warn when using a streaming builtin from a non-streaming function (#75487)
This PR adds a warning that's emitted when a non-streaming or
non-streaming-compatible builtin is called in an unsuitable function.

Uses work by Kerry McLaughlin.

This is a re-upload of #74064 and fixes a compile time increase.
2023-12-18 09:32:34 +00:00
Sam Tebbs
342384ca05
Revert "[AArch64][SME] Warn when using a streaming builtin from a non-streaming function" (#75449)
Reverts llvm/llvm-project#74064
2023-12-14 09:31:55 +00:00
Sam Tebbs
2e45326b08
[AArch64][SME] Warn when using a streaming builtin from a non-streaming function (#74064)
This PR adds a warning that's emitted when a non-streaming or
non-streaming-compatible builtin is called in an unsuitable function.

Uses work by Kerry McLaughlin.
2023-12-14 00:11:04 +00:00
CarolineConcatto
f2464ca317
[SVE2.1][Clang][LLVM]Int/FP reduce builtin in Clang and LLVM intrinsic (#69926)
This patch implements the builtins in Clang
and the LLVM-IR intrinsic for the following:

// Variants are also available for:
// _s8, _s16, _u16, _s32, _u32, _s64, _u64,
// _f16, _f32, _f64uint8x16_t svaddqv[_u8](svbool_t pg, svuint8_t zn);

// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64
uint8x16_t svandqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t
sveorqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t svorqv[_u8](svbool_t
pg, svuint8_t zn);

// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64;
uint8x16_t svmaxqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t
svminqv[_u8](svbool_t pg, svuint8_t zn);

// Variants are also available for _f32, _f64
float16x8_t svmaxnmqv[_f16](svbool_t pg, svfloat16_t zn); float16x8_t
svminnmqv[_f16](svbool_t pg, svfloat16_t zn);

According to the PR#257[1]

The reduction instruction uses scalable vectors as input and fixed
vectors as output, therefore we changed SVEEmitter to emit fixed vector
types in case the neon header(arm_neon.h) is not present.

[1]https://github.com/ARM-software/acle/pull/257

Co-author: Dinar Temirbulatov <dinar.temirbulatov@arm.com>
2023-12-13 15:45:59 +00:00
CarolineConcatto
ed2d497291
[Clang][AArch64] Add fix vector types to header into SVE (#73258)
This patch is needed for the reduction instructions in sve2.1
 It add a new header to sve with all the fixed vector types.
  The new types are only added if neon is not declared.
2023-12-13 08:59:41 +00:00
Matthew Devereau
6704d6aadd
[SME2] Add LUTI2 and LUTI4 quad Builtins and Intrinsics (#73317)
See https://github.com/ARM-software/acle/pull/217

Patch by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
2023-12-06 10:08:04 +00:00
David Spickett
d7c03a196e [clang][AArch64][NFC] Remove trailing space in SME intriniscs header 2023-11-27 13:31:15 +00:00
Kerry McLaughlin
48fb8ee081
[Clang][SME2] Add multi-vector add/sub builtins (#69725)
Adds the following SME2 builtins:
 - sv(add|sub)
 - sv(add|sub)_za32/za64,
 - sv(add|sub)_write_za32/za64

Other changes in this patch:
 - CGBuiltin.cpp: The GetAArch64SMEProcessedOperands function is created
    to avoid duplicating existing code from EmitAArch64SVEBuiltinExpr.
 - arm_sve.td: The add/sub SME2 builtins which do not operate on ZA have
been added to arm_sve.td, matching the corrosponding LLVM IR intrinsic
    names which start with @llvm.aarch64.sve for this reason.
- SveEmitter.cpp: Adds the createCoreHeaderIntrinsics function to remove
    duplicated code in createHeader & createSMEHeader. Uses a new enum
(ACLEKind) to choose either "__builtin_sme_" or "__builtin_sve_" when
    emitting the intrinsics.

See https://github.com/ARM-software/acle/pull/217/files
2023-11-07 15:42:43 +00:00
Momchil Velikov
9b3bb7a066
[AArch64] Implement reinterpret builtins for SVE vector tuples (#69598)
This patch adds reinterpret builtins as proposed here:
https://github.com/ARM-software/acle/pull/275.

The builtins take the form:

    sv<dst>x<N>_t svreinterpret_<dst>_<src>_x<N>(sv<src>x<N>_t op)

where
- <src> and <dst> designate the source and the destination type,
respectively, all pairs chosen from {s8, u8, s16, u8, s32, u32, s64,
u64, bf16, f16, f32, f64}
  - <N> designated the number of tuple elements, 2, 3 or 4

A short (overloaded) for is also provided, where the destination type is
explicitly designated and the source type is deduced from the parameter
type. These take the form

    sv<dst>x<N>_t svreinterpret_<dst>(sv<src>x<N>_t op)

For example:

    svuin16x2_t svreinterpret_u16_s32_x2(svint32x2_t op);
    svuin16x2_t svreinterpret_u16(svint32x2_t op);
2023-11-03 11:45:08 +00:00
Paul Walker
72561b3894
[CXXNameMangler] Correct the mangling of SVE ACLE types within function names. (#69460)
* Mark SVE ACLE types as substitution candidates.
* Change mangling of svbfloat16_t from __SVBFloat16_t to
  __SVBfloat16_t.

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst

This is an ABI break with the old behaviour available via
"-fclang-abi-compat=17".
2023-10-24 14:02:51 +01:00
Kazu Hirata
a5dca533bd Use llvm::count (NFC) 2023-10-22 21:18:23 -07:00
Caroline Concatto
200a92520c [Clang][SVE2.1] Add builtins and intrinsics for SVBFMLSLB/T
As described in: https://github.com/ARM-software/acle/pull/257

Patch by: Kerry McLaughlin <kerry.mclaughlin@arm.com>

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D151461
2023-10-19 16:44:39 +00:00
Caroline Concatto
7cad5a9eb4 [Clang][SVE2.1] Add svpext builtins
As described in: https://github.com/ARM-software/acle/pull/257

Reviewed By: hassnaa-arm

Differential Revision: https://reviews.llvm.org/D151081
2023-10-17 16:15:22 +00:00
Sander de Smalen
916415b837 [AArch64][SME] Make the overloaded svreinterpret_* functions streaming-compatible.
Otherwise these functions are not inlined when invoked from streaming
functions.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D159188
2023-09-04 10:15:26 +00:00
Bryan Chan
6dc94c54e5 [Clang][AArch64][SME] Add vector read/write (mova) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):

  - svread_hor_za8[_s8]_m    // also for u8
  - svread_hor_za16[_s16]_m  // also for u16, f16, bf16
  - svread_hor_za32[_s32]_m  // also for u32, f32
  - svread_hor_za64[_s64]_m  // also for u64, f64
  - svread_hor_za128[_s8]_m  // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
  - svread_ver_za8[_s8]_m    // also for u8
  - svread_ver_za16[_s16]_m  // also for u16, f16, bf16
  - svread_ver_za32[_s32]_m  // also for u32, f32
  - svread_ver_za64[_s64]_m  // also for u64, f64
  - svread_ver_za128[_s8]_m  // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
  - svwrite_hor_za8[_s8]_m   // also for u8
  - svwrite_hor_za16[_s16]_m // also for u16, f16, bf16
  - svwrite_hor_za32[_s32]_m // also for u32, f32
  - svwrite_hor_za64[_s64]_m // also for u64, f64
  - svwrite_hor_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
  - svwrite_ver_za8[_s8]_m   // also for u8
  - svwrite_ver_za16[_s16]_m // also for u16, f16, bf16
  - svwrite_ver_za32[_s32]_m // also for u32, f32
  - svwrite_ver_za64[_s64]_m // also for u64, f64
  - svwrite_ver_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64

Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>

Reviewed By: sdesmalen, kmclaughlin

Differential Revision: https://reviews.llvm.org/D128648
2023-07-20 06:06:33 -04:00
Caroline Concatto
fc8acb563a [Clang][SVE2.1] Add clang support for builtins using svcount_t
In this patch it is used for the prototype:
  * svptrue_c8 (and _c16/_c32/_c64)

 As described in: https://github.com/ARM-software/acle/pull/257

Patch by: Sander de Smalen <sander.desmalen@arm.com>

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D150953
2023-05-31 15:57:44 +00:00
Bryan Chan
9f6250f591 [Clang][AArch64][SME] Add vector load/store (ld1/st1) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):

  - svld1_hor_za8      // also for _za16, _za32, _za64 and _za128
  - svld1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
  - svld1_ver_za8      // also for _za16, _za32, _za64 and _za128
  - svld1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
  - svst1_hor_za8      // also for _za16, _za32, _za64 and _za128
  - svst1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
  - svst1_ver_za8      // also for _za16, _za32, _za64 and _za128
  - svst1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128

SveEmitter.cpp is extended to generate arm_sme.h (currently named
arm_sme_draft_spec_subject_to_change.h) and other SME definitions from
arm_sme.td, which is modeled after arm_sve.td. Common TableGen definitions
are moved into arm_sve_sme_incl.td.

Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>

Reviewed By: sdesmalen, kmclaughlin

Differential Revision: https://reviews.llvm.org/D127910
2023-05-28 21:08:13 -04:00
Manna, Soumi
245549c575 [NFC][CLANG] Fix Static Code Analysis Concerns
Reported by Static Analyzer Tool, Coverity:

  Bad bit shift operation
  The operation may have an undefined behavior or yield an unexpected result.

  In <unnamed>::SVEEmitter::encodeFlag(unsigned long long, llvm::StringRef): A bit shift operation has a shift amount which is too large or has a negative value.

    // Returns the SVETypeFlags for a given value and mask.
    uint64_t encodeFlag(uint64_t V, StringRef MaskName) const {
      auto It = FlagTypes.find(MaskName);
     	//Condition It != llvm::StringMap<unsigned long long, llvm::MallocAllocator>::const_iterator const(this->FlagTypes.end()), taking true branch.
      if (It != FlagTypes.end()) {
        uint64_t Mask = It->getValue();
        //return_constant: Function call llvm::countr_zero(Mask) may return 64.
        //assignment: Assigning: Shift = llvm::countr_zero(Mask). The value of Shift is now 64.
        unsigned Shift = llvm::countr_zero(Mask);

       //Bad bit shift operation (BAD_SHIFT)
       //large_shift: In expression V << Shift, left shifting by more than 63 bits has undefined behavior. The shift amount, Shift, is 64.
        return (V << Shift) & Mask;
      }
      llvm_unreachable("Unsupported flag");
    }

Asserting Mask != 0 will not suffice to silence Coverity. While Coverity can specifically observe that countr_zero might return 0 (because TrailingZerosCounter<T, 8>::count() has a return 64 statement), It seems like Coverity can not determine that the function can't return 65 or higher. Coverity is reporting is that the shift might overflow,
so that is what should be guarded.
assert(Shift < 64 && "Mask value produced an invalid shift value");

Reviewed By: tahonermann, sdesmalen, erichkeane

Differential Revision: https://reviews.llvm.org/D150140
2023-05-14 20:07:24 -07:00
Dimitry Andric
db49231639 [clang][BFloat] Avoid redefining bfloat16_t in arm_neon.h
As of https://reviews.llvm.org/D79708, clang-tblgen generates `arm_neon.h`,
`arm_sve.h` and `arm_bf16.h`, and all those generated files will contain a
typedef of `bfloat16_t`. However, `arm_neon.h` and `arm_sve.h` include
`arm_bf16.h` immediately before their own typedef:

    #include <arm_bf16.h>
    typedef __bf16 bfloat16_t;

With a recent version of clang (I used 16.0.1) this results in warnings:

    /usr/lib/clang/16/include/arm_neon.h:38:16: error: redefinition of typedef 'bfloat16_t' is a C11 feature [-Werror,-Wtypedef-redefinition]

Since `arm_bf16.h` is very likely supposed to be the one true place where
`bfloat16_t` is defined, I propose to delete the duplicate typedefs from the
generated `arm_neon.h` and `arm_sve.h`.

Reviewed By: sdesmalen, simonbutcher

Differential Revision: https://reviews.llvm.org/D148822
2023-05-03 17:54:58 +02:00
Matt Devereau
a1fae98ba9 [AArch64] Add svboolx2_t and svboolx4_t tuple types
https://reviews.llvm.org/D145505
2023-03-14 10:16:51 +00:00
Kazu Hirata
55e2cd1609 Use llvm::count{lr}_{zero,one} (NFC) 2023-01-28 12:41:20 -08:00
David Green
6cac7c285e [AArch64] Alter arm_sve.h to be target-based, not preprocessor based.
This patch makes SVE intrinsics more useable by gating them on the
target, not by ifdef preprocessor macros. See #56480. This alters the
SVEEmitter for arm_sve.h to remove the #ifdef guards and instead use
TARGET_BUILTIN with the correct features so that the existing "'func'
needs target feature sve" error will be generated when sve is not
present.

The ArchGuard containing defines in the SVEEmitter are changed to
TargetGuard containing target features. In the arm_neon.h emitter there
are both existing ArchGuard ifdefs mixed with new TargetGuard target
feature guards, so the name is change in the SVE too for consistency.
The few functions that are present in arm_sve.h (as opposed to builtin
aliases) have __attribute__((target("sve"))) added. Some of the tests
needed to be rejigged a little, as well as updating the error message,
as the error now happens at a later point.

Differential Revision: https://reviews.llvm.org/D131064
2023-01-04 11:22:20 +00:00
Maciej Gabka
48e1250a91 [clang][SVE] Undefine preprocessor macro defined in
arm_sve.h defines and uses __ai macro which needs to be undefined (as it is
already in arm_neon.h).

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D131580
2022-08-12 12:25:49 +00:00
Fangrui Song
3f18f7c007 [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131346
2022-08-08 09:12:46 -07:00
Sander de Smalen
204aaf8795 [AArch64][SVE] Always use overloaded methods instead of preprocessor macro.
This fixes a subtle issue where:

  svprf(pg, ptr, SV_ALL /*is sv_pattern instead of sv_prfop*/)

would be quietly accepted. With this change, the function declaration
guards that the third parameter is a `enum sv_prfop`. Previously `svprf`
would map directly to `__builtin_sve_svprfb`, which accepts the enum
operand as a signed integer and only checks that the incoming range is
valid, meaning that SV_ALL would be discarded as being outside the valid
immediate range, but would have allowed SV_VL1 without issuing a warning
(C) or error (C++).

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D100297
2021-04-13 21:12:53 +01:00