The implementation made the assumption that any feature starting with
"sve" meant that this was an SVE feature. This is not the case for
"sve-b16b16", as this is a feature that applies to both SVE and SME.
This meant that:
```
__attribute__((target("+sme2,+sve2,+sve-b16b16")))
svbfloat16_t foo(svbfloat16_t a, svbfloat16_t b, svbfloat16_t c)
    __arm_streaming {
  return svclamp_bf16(a, b, c);
}
```
would result in an incorrect diagnostic saying that `svclamp_bf16` could
only be used in non-streaming functions.
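The name-based assumption can be sketched as follows. This is only an illustration, not the actual Clang code: the helper names `isSVEOnlyByPrefix` and `isSVEOnly` are hypothetical, feature names are written without the leading `+`, and special-casing the one feature is just one possible way to express the exception.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helpers for illustration; not the real Clang implementation.

// The old assumption: any feature whose name starts with "sve" is SVE-only.
constexpr bool isSVEOnlyByPrefix(std::string_view Feature) {
  return Feature.substr(0, 3) == "sve";
}

// One possible correction: "sve-b16b16" applies to both SVE and SME,
// so it must not be classified as SVE-only despite its prefix.
constexpr bool isSVEOnly(std::string_view Feature) {
  if (Feature == "sve-b16b16")
    return false;
  return Feature.substr(0, 3) == "sve";
}

void checkFeatureExamples() {
  assert(isSVEOnlyByPrefix("sve-b16b16")); // prefix check misclassifies it
  assert(!isSVEOnly("sve-b16b16"));        // corrected check does not
  assert(isSVEOnly("sve2"));               // ordinary SVE features unchanged
}
```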
This patch moves NEON immediate argument specification and checking to
the system currently shared by both SVE and SME.
In its current form, the TableGen definition of a NEON intrinsic cannot
control how its immediate arguments are range-checked. Instead, this
information must be inferred from the name of the intrinsic by
NeonEmitter, which also assumes that any NEON instruction will only ever
receive a single immediate argument. For SVE/SME intrinsics, this
information is more conveniently supplied in the TableGen definition.
As a result, for each immediate argument, NEON instructions must define:
- The index of the immediate argument to be checked.
- The type of immediate range check to be performed
(e.g., ImmCheckShiftRight).
- The index of the argument whose type defines the context
of this immediate check (base type, vector size).
- **Difference from SVE/SME:** If this definition generates a polymorphic
NEON builtin, the base type defined by this argument is overwritten by
that of the type code supplied to the overloaded builtin call. This
third argument is omitted in some cases for this reason.
Here is an example for
[`vfma_laneq`](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vfma_laneq):
- The immediate is supplied in argument 3
- The immediate is used as an index into the lanes of argument 2
- So we must perform an immediate check on argument 3, based on the type
information of argument 2.
- `ImmCheck<3, ImmCheckLaneIndex, 2>`
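To make the lane-index check concrete, here is a minimal sketch of the range it implies. The `VectorType` struct and the function names are hypothetical and are not part of NeonEmitter or Sema. It assumes zero-based argument indices and the `vfma_laneq_f32` variant, whose argument 2 is a `float32x4_t` (four 32-bit lanes in 128 bits).

```cpp
#include <cassert>

// Hypothetical description of the vector type of the "context" argument.
struct VectorType {
  unsigned ElementBits; // e.g. 32 for float32
  unsigned TotalBits;   // e.g. 128 for float32x4_t
};

struct ImmRange {
  int Min;
  int Max;
};

// A lane index must select an existing lane: 0 .. (number of lanes - 1).
constexpr ImmRange laneIndexRange(VectorType VT) {
  return {0, static_cast<int>(VT.TotalBits / VT.ElementBits) - 1};
}

constexpr bool isValidLaneIndex(int Imm, VectorType VT) {
  ImmRange R = laneIndexRange(VT);
  return Imm >= R.Min && Imm <= R.Max;
}

// Argument 2 of vfma_laneq_f32 is a float32x4_t.
constexpr VectorType Float32x4{32, 128};

void checkLaneExamples() {
  assert(laneIndexRange(Float32x4).Max == 3);
  assert(isValidLaneIndex(3, Float32x4));
  assert(!isValidLaneIndex(4, Float32x4));
}
```

In this sketch the immediate in argument 3 is accepted only when it falls in the lane range derived from argument 2, which is the relationship `ImmCheck<3, ImmCheckLaneIndex, 2>` describes.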
During this work, we discovered that the existing immediate
range-checking system was largely untested, which made it difficult to
make reliable progress. Missing tests have been added to verify this
implementation against all intrinsics which take constrained immediate
arguments. All immediate range-checking tests for NEON intrinsics
are moved to a dedicated directory,
`clang/test/Sema/aarch64-neon-immediate-ranges/`.
The primary motivation behind this is to allow the enum type to be
referred to earlier in the Sema.h file which is needed for #106321.
It was requested in #106321 that a scoped enum be used (rather than
moving the enum declaration earlier in the Sema class declaration).
Unfortunately, doing this creates a lot of churn, as all use sites of the
enum constants had to be changed. Apologies to all downstream forks in
advance.
Note that the AA_ prefix has been dropped from the enum value names, as
it is now redundant.
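For readers unfamiliar with why a scoped enum helps here, the following sketch shows the general C++ mechanism. The enum name `Action` and its enumerators are purely illustrative and are not claimed to match the real Sema declarations. A scoped enum may be declared before it is defined (its underlying type defaults to `int`), so earlier declarations can refer to it, and the enumerators are qualified by the enum name, which makes a prefix such as `AA_` unnecessary.

```cpp
#include <cassert>
#include <string_view>

// Opaque declaration: allowed for scoped enums without listing enumerators.
enum class Action;

// Code appearing earlier in the header can already mention the type.
std::string_view describe(Action A);

// The full definition can come later. Enumerators are scoped, so no
// "AA_"-style prefix is needed to avoid name clashes.
enum class Action { Assigning, Passing };

std::string_view describe(Action A) {
  switch (A) {
  case Action::Assigning:
    return "assigning";
  case Action::Passing:
    return "passing";
  }
  return "unknown";
}

void checkEnumExamples() {
  assert(describe(Action::Assigning) == "assigning");
  assert(describe(Action::Passing) == "passing");
}
```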
When various `Sema*.h` and `Sema*.cpp` files were created, cleanup of
`Sema.h` includes and forward declarations was left for later.
Now's the time. This commit touches `Sema.h` and Sema components:
1. Unused includes are removed.
2. Unused forward declarations are removed.
3. Missing includes are added (those files are largely IWYU-clean now).
4. Includes were converted into forward declarations where possible.
As this commit focuses on headers, all changes to `.cpp` files were
minimal, and were aiming at keeping everything buildable.
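As a reminder of what item 4 relies on, here is a single-file sketch of the general technique. The `Widget` class and `countOf` function are hypothetical stand-ins and do not correspond to anything in Clang. A declaration that only uses a type by reference or pointer needs just a forward declaration. The complete definition is only required where members are accessed, which is typically in the `.cpp` file.

```cpp
#include <cassert>

// "Header" part: a forward declaration replaces an #include of the
// (hypothetical) header that defines Widget.
class Widget;

// Declaring a function that takes Widget by reference does not require
// the complete type.
int countOf(const Widget &W);

// "Source file" part: here the complete type is needed, so this is where
// the defining header would be included.
class Widget {
public:
  explicit Widget(int N) : Count(N) {}
  int count() const { return Count; }

private:
  int Count;
};

int countOf(const Widget &W) { return W.count(); }

void checkForwardDeclExample() { assert(countOf(Widget(3)) == 3); }
```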
[clang][ARM] Fix warning for using VFP from interrupts.
This warning has three issues:
- The interrupt attribute causes the function to return using an exception
return instruction. This warning allows calls from one function with
the interrupt attribute to another, and the diagnostic text suggests
that not having the attribute on the callee is a problem. Actually
making such a call will lead to a double exception return, which is
unpredictable according to the ARM architecture manual section
B9.1.1, "Restrictions on exception return instructions". Even on
machines where an exception return from user/system mode is
tolerated, if the callee's interrupt type is anything other than a
supervisor call or secure monitor call, it will also return to a
different address than a normal function would. For example,
returning from an "IRQ" handler will return to lr - 4, which will
generally result in calling the same function again.
- The interrupt attribute currently does not cause caller-saved VFP
registers to be saved and restored if they are used, so putting
__attribute__((interrupt)) on a called function doesn't prevent it
from clobbering VFP state.
- It is part of the -Wextra diagnostic group and can't be individually
disabled when using -Wextra, which also means the diagnostic text of
this specific warning appears in the documentation of -Wextra.
This change addresses all three issues by instead generating a warning
for any interrupt handler where the vfp feature is enabled. The warning
is also given its own diagnostic group.
Closes #34876.
[clang][ARM] Emit an error when an interrupt handler is called.
Closes #95359.
Add __hlt, which is an MSVC ARM64 intrinsic.
This intrinsic is just the HLT instruction. MSVC's version seems to
return something undefined; in this patch it will just return zero.
MSVC intrinsics are defined here
https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics.
I used unsigned int as the return type, because that is what the MSVC
intrin.h header uses, even though it conflicts with the documentation.
PR #76975 added 'IsStreamingOrSVE2p1' to emit a diagnostic when a builtin marked
with 'IsStreamingOrSVE2p1' is used in a non-streaming function that is not
compiled with `+sve2p1`.
The problem is a bit more complex than only this case. For example, we've marked
lots of builtins with 'IsStreamingCompatible', meaning they can be used in
streaming, streaming-compatible or non-streaming functions. But the code in
SemaChecking doesn't check the appropriate target guards. This issue becomes
relevant when SVE builtins are only available in streaming mode, e.g. when
compiling for SME without SVE.
If we were to add the appropriate target guards, we'd have to add many more
combinations, e.g.:
IsStreamingSMEOrSVE
IsStreamingSME2OrSVE2
IsStreamingSMEOrSVE2p1
IsStreamingSME2OrSVE2p1
etc.
To avoid having to add more combinations (and avoid having to add more in the
future for new extensions), we use a single 'IsSVEOrStreamingSVE' flag for all
builtins that are available in streaming mode for the appropriate SME flags, or
in non-streaming mode for the appropriate SVE flags, or both. The code in
SemaChecking will then verify for which mode (or both) the builtin would be
defined, given the target features of the function/compilation unit.
For example:
'svclamp' is enabled under FEAT_SVE2p1 and FEAT_SME2
* When we compile for SVE2p1 and SME (but not SME2), calling the builtin from a
streaming function is undefined behaviour.
* When we compile for SME2 and SVE2 (but not SVE2p1), calling the builtin from a
non-streaming function is undefined behaviour.
* When we compile for _both_ SVE2p1 and SME2, the builtin can be used in any
mode (non-streaming, streaming or streaming-compatible).
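The three cases above can be summarised by a small decision function. This is a sketch under assumptions, not the SemaChecking code: the `Mode` enum and `isBuiltinAvailable` are hypothetical names, the two booleans stand for "the function's target features satisfy the builtin's SVE guard" and "... its SME guard", and treating a streaming-compatible caller as requiring both guards is an inference from the last case.

```cpp
#include <cassert>

// Hypothetical model of the calling function's streaming mode.
enum class Mode { NonStreaming, Streaming, StreamingCompatible };

// SatisfiesSVEGuard: target features meet the builtin's SVE requirement
//                    (e.g. SVE2p1 for svclamp).
// SatisfiesSMEGuard: target features meet the builtin's SME requirement
//                    (e.g. SME2 for svclamp).
constexpr bool isBuiltinAvailable(Mode M, bool SatisfiesSVEGuard,
                                  bool SatisfiesSMEGuard) {
  switch (M) {
  case Mode::NonStreaming:
    return SatisfiesSVEGuard;
  case Mode::Streaming:
    return SatisfiesSMEGuard;
  case Mode::StreamingCompatible:
    // Assumption: such a function may run in either mode, so both
    // guards must hold.
    return SatisfiesSVEGuard && SatisfiesSMEGuard;
  }
  return false;
}

void checkSvclampExamples() {
  // SVE2p1 and SME (but not SME2): not usable from a streaming function.
  assert(!isBuiltinAvailable(Mode::Streaming, true, false));
  // SME2 and SVE2 (but not SVE2p1): not usable from a non-streaming function.
  assert(!isBuiltinAvailable(Mode::NonStreaming, false, true));
  // Both SVE2p1 and SME2: usable in every mode.
  assert(isBuiltinAvailable(Mode::StreamingCompatible, true, true));
}
```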
This patch moves language- and target-specific functions out of
`SemaDeclAttr.cpp`. As a consequence, `SemaAVR`, `SemaM68k`,
`SemaMSP430`, `SemaOpenCL`, `SemaSwift` were created (but they are not
the only languages and targets affected).
Notably, `Sema.h` actually grew a bit, because of templated helpers that
rely on `Sema`, which I had to make available outside of
`SemaDeclAttr.cpp`. I also had to leave CUDA-related functions in
`SemaDeclAttr.cpp`, because it looks like HIP is building on top of
CUDA attributes.
This is a follow-up to #93179 and continuation of efforts to split
`Sema` up. Additional context can be found in #84184 and #92682.
This patch introduces `SemaAMDGPU`, `SemaARM`, `SemaBPF`, `SemaHexagon`,
`SemaLoongArch`, `SemaMIPS`, `SemaNVPTX`, `SemaPPC`, `SemaSystemZ`,
`SemaWasm`. This continues previous efforts to split Sema up. Additional
context can be found in #84184 and #92682.
I decided to bundle target-specific components together because of their
low impact on `Sema`. That said, their impact on `SemaChecking.cpp` is
far from low, and I consider it a success.
Somewhat accidentally, I also moved Wasm- and AMDGPU-specific functions
from `SemaDeclAttr.cpp`, because they were exposed in `Sema`. That went
well, and I consider it a success, too. I'd like to move the rest of the
static target-specific functions out of `SemaDeclAttr.cpp`, like we're
doing with built-ins in `SemaChecking.cpp`.