Add new elementwise popcount builtin to support HLSL function
'countbits'.
elementwise popcount only accepts integer types.
Add hlsl intrinsic 'countbits'
Closes#99094
The last argument of an FP-classify function was checked for vailidity
as an expression, but we never ensured that the usual unary
conversions/etc properly resulted in a valid value. Thus, when we got
the value, it was null, so we had a null dereference.
This patch instead fails out/marks the function call as invalid if the
argument is incorrect. I DID consider just allowing it to continue, but
the result was an extraneous error about how the last argument wasn't a
float (in this case, it was an overload set).
Fixes: #107411
[P2641R4](https://wg21.link/P2641R4)
This new builtin function is declared `consteval`. Support for
`-fexperimental-new-constant-interpreter` will be added in a later
patch.
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
HLSL output parameters are denoted with the `inout` and `out` keywords
in the function declaration. When an argument to an output parameter is
constructed a temporary value is constructed for the argument.
For `inout` pamameters the argument is initialized via copy-initialization
from the argument lvalue expression to the parameter type. For `out`
parameters the argument is not initialized before the call.
In both cases on return of the function the temporary value is written
back to the argument lvalue expression through an implicit assignment
binary operator with casting as required.
This change introduces a new HLSLOutArgExpr ast node which represents
the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.
Fixes#87526
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
The primary motivation behind this is to allow the enum type to be
referred to earlier in the Sema.h file which is needed for #106321.
It was requested in #106321 that a scoped enum be used (rather than
moving the enum declaration earlier in the Sema class declaration).
Unfortunately doing this creates a lot of churn as all use sites of the
enum constants had to be changed. Appologies to all downstream forks in
advanced.
Note the AA_ prefix has been dropped from the enum value names as they
are now redundant.
Also changes the behaviour of `__builtin_is_layout_compatible`
None of the historic nor the current definition of layout-compatible
classes mention anything about base classes (other than implicitly
through being standard-layout) and are defined in terms of members, not
direct members.
The __builtin_alloca was returning a flat pointer with no address space
when compiled using openCL1.2 or below but worked fine with openCL2.0
and above. This accounts to the fact that later uses the concept of
generic address space which supports cast to other address space(i.e to
private address space which is used for stack allocation) .
But, in actuality, as it returns pointer to the stack, it should be
pointing to private address space irrespective of openCL version becuase
builtin_alloca allocates stack memory used for current function in which
it is called. Thus,it requires redefintion of the builtin function with
appropraite return pointer to private address space.
This patch fixes:
clang/lib/Sema/SemaChecking.cpp:8220:3: error: default label in
switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
In finite math mode when special math builtins `__builtin_inf` and
`__builtin_nan` are used a warning is emitted when the builtin is
expanded and at call point.
This warning at call point was missing for` __builtin_inf` and this
patch fixes the issue
(https://github.com/llvm/llvm-project/issues/98018).
The builtin computes the discriminator for a type, which can be used to
sign/authenticate function pointers and member function pointers.
If the type passed to the builtin is a C++ member function pointer type,
the result is the discriminator used to signed member function pointers
of that type. If the type is a function, function pointer, or function
reference type, the result is the discriminator used to sign functions
of that type. It is ill-formed to use this builtin with any other type.
A call to this function is an integer constant expression.
Co-Authored-By: John McCall rjmccall@apple.com
This patch moves documentation of `Sema` functions from `.cpp` files to `Sema.h` when there was no documentation in the latter, or it can be trivially subsumed. More complicated cases when there's less trivial divergence between documentation attached to declaration and the one attached to implementation are left for a later PR that would require review.
It appears that doxygen can find the documentation for a function defined out-of-line even if it's attached to an implementation, and not declaration. But other tools, e.g. clangd, are not as powerful. So this patch significantly improves autocompletion experience for (at least) clangd-based IDEs.
This patch fixes the crash when pointers to incomplete type are passed
to atomic builtins such as `__atomic_load`.
`ASTContext::getTypeInfoInChars` assumes that the argument type is a
complete type, so I added a check to eliminate cases where incomplete
types gets passed to this function
Relevant PR: https://github.com/llvm/llvm-project/pull/91057
Fixes https://github.com/llvm/llvm-project/issues/96289
The builtin causes the program to stop its execution abnormally and
shows a human-readable description of the reason for the termination
when a debugger is attached or in a symbolicated crash log.
The motivation for the builtin is explained in the following RFC:
https://discourse.llvm.org/t/rfc-adding-builtin-verbose-trap-string-literal/75845
clang's CodeGen lowers the builtin to `llvm.trap` and emits debugging
information that represents an artificial inline frame whose name
encodes the category and reason strings passed to the builtin.
This is a constant-expression equivalent to
ptrauth_sign_unauthenticated. Its constant nature lets us guarantee
a non-attackable sequence is generated, unlike
ptrauth_sign_unauthenticated which we generally discourage using.
It being a constant also allows its usage in global initializers, though
requiring constant pointers and discriminators.
The value must be a constant expression of pointer type which evaluates
to a non-null pointer.
The key must be a constant expression of type ptrauth_key.
The extra data must be a constant expression of pointer or integer type;
if an integer, it will be coerced to ptrauth_extra_data_t.
The result will have the same type as the original value.
This can be used in constant expressions.
Co-authored-by: John McCall <rjmccall@apple.com>
This exposes the ABI-stable hash function that allows computing a 16-bit
discriminator from a constant string.
This allows manually matching the implicit string discriminators
computed in the ABI (e.g., from mangled names for vtable pointer/entry
signing), as well as enabling the use of interesting discriminators when
manually annotating specific pointers with the __ptrauth qualifier.
The argument must be a string literal of char character type. The
result has type ptrauth_extra_data_t.
The result value is never zero and always within range for both the
__ptrauth qualifier and ptrauth_blend_discriminator.
This can be used in constant expressions.
Co-authored-by: John McCall <rjmccall@apple.com>
This patch introduces `SemaAMDGPU`, `SemaARM`, `SemaBPF`, `SemaHexagon`,
`SemaLoongArch`, `SemaMIPS`, `SemaNVPTX`, `SemaPPC`, `SemaSystemZ`,
`SemaWasm`. This continues previous efforts to split Sema up. Additional
context can be found in #84184 and #92682.
I decided to bundle target-specific components together because of their
low impact on `Sema`. That said, their impact on `SemaChecking.cpp` is
far from low, and I consider it a success.
Somewhat accidentally, I also moved Wasm- and AMDGPU-specific function
from `SemaDeclAttr.cpp`, because they were exposed in `Sema`. That went
well, and I consider it a success, too. I'd like to move the rest of
static target-specific functions out of `SemaDeclAttr.cpp` like we're
doing with built-ins in `SemaChecking.cpp` .
When an atomic builtin is called with a pointer to an object of size
zero, an arithmetic exception gets thrown because there is a modulo
operation with the objects size in codegen.
Diagnose this in sema instead.
Fixes#90330.
This patch moves `Sema` functions that are specific for x86 into the new
`SemaX86` class. This continues previous efforts to split `Sema` up.
Additional context can be found in #84184 and #92682.
This patch moves `Sema` functions that are specific for RISC-V into the
new `SemaRISCV` class. This continues previous efforts to split `Sema`
up. Additional context can be found in
https://github.com/llvm/llvm-project/pull/84184.
This PR is somewhat different from previous PRs on this topic:
1. Splitting out target-specific functions wasn't previously discussed.
It felt quite natural to do, though.
2. I had to make some static function in `SemaChecking.cpp` member
functions of `Sema` in order to use them in `SemaRISCV`.
3. I dropped "RISCV" from identifiers, but decided to leave "RVV"
(RISC-V "V" vector extensions) intact. I think it's an idiomatic
abbreviation at this point, but I'm open to input from contributors in
that area.
4. I repurposed `SemaRISCVVectorLookup.cpp` for `SemaRISCV`.
I think this was a successful experiment, which both helps the goal of
splitting `Sema` up, and shows a way to approach `SemaChecking.cpp`,
which I wasn't sure how to approach before. As we move more
target-specific function out of there, we'll gradually make the checking
"framework" inside `SemaChecking.cpp` public, which is currently a whole
bunch of static functions. This would enable us to move more functions
outside of `SemaChecking.cpp`.
The following program produces a diagnostic in Clang and EDG, but
compiles correctly in GCC and MSVC:
```cpp
#include <vector>
consteval std::vector<int> fn() { return {1,2,3}; }
constexpr int a = fn()[1];
```
Clang's diagnostic is as follows:
```cpp
<source>:6:19: error: call to consteval function 'fn' is not a constant expression
6 | constexpr int a = fn()[1];
| ^
<source>:6:19: note: pointer to subobject of heap-allocated object is not a constant expression
/opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/14.0.1/../../../../include/c++/14.0.1/bits/allocator.h:193:31: note: heap allocation performed here
193 | return static_cast<_Tp*>(::operator new(__n));
| ^
1 error generated.
Compiler returned: 1
```
Based on my understanding of
[`[dcl.constexpr]/6`](https://eel.is/c++draft/dcl.constexpr#6):
> In any constexpr variable declaration, the full-expression of the
initialization shall be a constant expression
It seems to me that GCC and MSVC are correct: the initializer `fn()[1]`
does not evaluate to an lvalue referencing a heap-allocated value within
the `vector` returned by `fn()`; it evaluates to an lvalue-to-rvalue
conversion _from_ that heap-allocated value.
This PR turns out to be a bug fix on the implementation of
[P2564R3](https://wg21.link/p2564r3); as such, it only applies to C++23
and later. The core problem is that the definition of a
constant-initialized variable
([`[expr.const/2]`](https://eel.is/c++draft/expr.const#2)) is contingent
on whether the initializer can be evaluated as a constant expression:
> A variable or temporary object o is _constant-initialized_ if [...]
the full-expression of its initialization is a constant expression when
interpreted as a _constant-expression_, [...]
That can't be known until we've finished parsing the initializer, by
which time we've already added immediate invocations and consteval
references to the current expression evaluation context. This will have
the effect of evaluating said invocations as full expressions when the
context is popped, even if they're subexpressions of a larger constant
expression initializer. If, however, the variable _is_
constant-initialized, then its initializer is [manifestly
constant-evaluated](https://eel.is/c++draft/expr.const#20):
> An expression or conversion is _manifestly constant-evaluated_ if it
is [...] **the initializer of a variable that is usable in constant
expressions or has constant initialization** [...]
which in turn means that any subexpressions naming an immediate function
are in an [immediate function
context](https://eel.is/c++draft/expr.const#16):
> An expression or conversion is in an immediate function context if it
is potentially evaluated and either [...] it is a **subexpression of a
manifestly constant-evaluated expression** or conversion
and therefore _are not to be considered [immediate
invocations](https://eel.is/c++draft/expr.const#16) or
[immediate-escalating
expressions](https://eel.is/c++draft/expr.const#17) in the first place_:
> An invocation is an _immediate invocation_ if it is a
potentially-evaluated explicit or implicit invocation of an immediate
function and **is not in an immediate function context**.
> An expression or conversion is _immediate-escalating_ if **it is not
initially in an immediate function context** and [...]
The approach that I'm therefore proposing is:
1. Create a new expression evaluation context for _every_ variable
initializer (rather than only nonlocal ones).
2. Attach initializers to `VarDecl`s _prior_ to popping the expression
evaluation context / scope / etc. This sequences the determination of
whether the initializer is in an immediate function context _before_ any
contained immediate invocations are evaluated.
3. When popping an expression evaluation context, elide all evaluations
of constant invocations, and all checks for consteval references, if the
context is an immediate function context. Note that if it could be
ascertained that this was an immediate function context at parse-time,
we [would never have
registered](760910ddb9/clang/lib/Sema/SemaExpr.cpp (L17799))
these immediate invocations or consteval references in the first place.
Most of the test changes previously made for this PR are now reverted
and passing as-is. The only test updates needed are now as follows:
- A few diagnostics in `consteval-cxx2a.cpp` are updated to reflect that
it is the `consteval tester::tester` constructor, not the more narrow
`make_name` function call, which fails to be evaluated as a constant
expression.
- The reclassification of `warn_impcast_integer_precision_constant` as a
compile-time diagnostic adds a (somewhat duplicative) warning when
attempting to define an enum constant using a narrowing conversion. It
also, however, retains the existing diagnostics which @erichkeane
(rightly) objected to being lost from an earlier revision of this PR.
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
If you want an overarching view of how this will all connect see:
https://github.com/llvm/llvm-project/pull/90088
Changes:
- `clang/docs/LanguageExtensions.rst` - Document the new elementwise tan
builtin.
- `clang/include/clang/Basic/Builtins.td` - Implement the tan builtin.
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the tan intrinsic on uses
of the builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the tan builtin
with the equivalent hlsl apis
- `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks as well as
HLSL specifc sema checks to the tan builtin
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
Summary:
Previous patches added support for the LLVM rounding intrinsic
functions. This patch allows them to me emitted using the clang builtins
when targeting AMDGPU.
The PR implements a subset of features of function
__builtin_cpu_support() for AIX OS based on the information which AIX
kernel runtime variable `_system_configuration` and function call `getsystemcfg()` of
/usr/include/sys/systemcfg.h in AIX OS can provide.
Following subset of features are supported in the PR
"arch_3_00", "arch_3_1","booke","cellbe","darn","dfp","dscr" ,"ebb","efpsingle","efpdouble","fpu","htm","isel",
"mma","mmu","pa6t","power4","power5","power5+","power6x","ppc32","ppc601","ppc64","ppcle","smt",
"spe","tar","true_le","ucache","vsx"
A Darwin extension '%P' combined with an Objective-C pointer seems to
always be a bug.
'%P' will dump bytes at the pointed-to address (in contrast to '%p'
which dumps the pointer itself). This extension is only allowed in "OS
Log" contexts and is intended to be used like `%{uuid_t}.*16P` or
`%{timeval}.*P`. If an ObjC pointer is used, then the internal runtime
structure (aka, the is-a pointer and other runtime metadata) will be
dumped, which (IMO) is never the expectation.
A simple diagnostic can help flag these scenarios.
Resolves https://github.com/llvm/llvm-project/issues/89968
Co-authored-by: Jared Grubb <jgrubb@apple.com>
Currently, a lot of `__builtin_reduce_*` function do not support
scalable vectors, i.e., ARM SVE and RISCV V. This PR adds support for
them. The main code change is to use a different path to extract the
type from the vectors, the rest is the same and LLVM supports the reduce
functions for `vscale` vectors.
This PR adds scalable vector support for:
- `__builtin_reduce_add`
- `__builtin_reduce_mul`
- `__builtin_reduce_xor`
- `__builtin_reduce_or`
- `__builtin_reduce_and`
- `__builtin_reduce_min`
- `__builtin_reduce_max`
Note: For all except `min/max`, the element type must still be an
integer value. Adding floating point support for `add` and `mul` is
still an open TODO.
OpenACC is going to need an array sections implementation that is a
simpler version/more restrictive version of the OpenMP version.
This patch moves `OMPArraySectionExpr` to `Expr.h` and renames it `ArraySectionExpr`,
then adds an enum to choose between the two.
This also fixes a couple of 'drive-by' issues that I discovered on the way,
but leaves the OpenACC Sema parts reasonably unimplemented (no semantic
analysis implementation), as that will be a followup patch.
Add separate messages about passing arguments or returning parameters
with scalable types.
---------
Co-authored-by: Sander de Smalen <sander.desmalen@arm.com>
This patch implements intrinsic that supports
`std::is_pointer_interconvertible_base_of` type trait from
[P0466R5](https://wg21.link/p0466r5) "Layout-compatibility and
Pointer-interconvertibility Traits".
Normative wording:
> Comment: If `Base` and Derived are non-union class types and are not
(possibly _cv_-qualified) versions of the same type, `Derived` is a
complete type.
> Condition: `Derived` is unambiguously derived from `Base` without
regard to _cv_-qualifiers, and each object of type `Derived` is
pointer-interconvertible (6.7.2 [basic.compound]) with its `Base`
subobject, or `Base` and `Derived` are not unions and name the same
class type without regard to _cv_-qualifiers.
The paper also express the following intent:
> Note that `is_pointer_interconvertible_base_of_v<T,T>` is always true
under this wording, even though `T` is not derived from itself.
I find the treatment of unions in the wording contradictory to this
intent, and I'm not able to find anything relevant in minutes or on the
reflector. That said, this patch implements what the wording says, since
it's very explicit about unions.
The compiler doesn't know in advance if the streaming and non-streaming
vector-lengths are different, so it should be safe to give a warning
diagnostic to warn the user about possible undefined behaviour. If the
user knows the vector lengths are equal, they can disable the warning
separately.
rldimi is 64-bit instruction, due to backward compatibility, it needs to
be expanded into series of rotate and masking in 32-bit environment. In
the future, we may improve bit permutation selector and remove such
direct codegen.
Start of #83882
- `Builtins.td` - add the `hlsl` `all` elementwise builtin.
- `CGBuiltin.cpp` - Show a use case for CGHLSLUtils via an `all`
intrinsic codegen.
- `CGHLSLRuntime.cpp` - move `thread_id` to use CGHLSLUtils.
- `CGHLSLRuntime.h` - Create a macro to help pick the right intrinsic
for the backend.
- `hlsl_intrinsics.h` - Add the `all` api.
- `SemaChecking.cpp` - Add `all` builtin type checking
- `IntrinsicsDirectX.td` - Add the `all` `dx` intrinsic
- `IntrinsicsSPIRV.td` - Add the `all` `spv` intrinsic
Work still needed:
- `SPIRVInstructionSelector.cpp` - Add an implementation of `OpAll` for
`spv_all` intrinsic