This both reapplies #118734, the initial attempt at this, and updates it
significantly.
First, it uses the newly added `StringTable` abstraction for string
tables, and simplifies the construction to build the string table and
info arrays separately. This should reduce any `constexpr` compile time
memory or CPU cost of the original PR while significantly improving the
APIs throughout.
It also restructures the builtins to support sharding across several
independent tables. This accomplishes two improvements from the
original PR:
1) It improves the APIs used significantly.
2) When builtins are defined from different sources (like SVE vs MVE in
AArch64), this allows each of them to build their own string table
independently rather than having to merge the string tables and info
structures.
3) It allows each shard to factor out a common prefix, often cutting the
size of the strings needed for the builtins by a factor two.
The second point is important both to allow different mechanisms of
construction (for example a `.def` file and a tablegen'ed `.inc` file,
or different tablegen'ed `.inc files), it also simply reduces the sizes
of these tables which is valuable given how large they are in some
cases. The third builds on that size reduction.
Initially, we use this new sharding rather than merging tables in
AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system
works, as without further changes these still push scaling limits.
Subsequent commits will more deeply leverage the new structure,
including using the prefix capabilities which cannot be easily factored
out here and requires deep changes to the targets.
This caused assertion failures:
clang/lib/Analysis/CFG.cpp:822:
void (anonymous namespace)::CFGBuilder::appendStmt(CFGBlock *, const Stmt *):
Assertion `!isa<Expr>(S) || cast<Expr>(S)->IgnoreParens() == S' failed.
See comment on the PR.
This reverts commit 44aa618ef67d302f5ab77cc591fb3434fe967a2e.
This commit restricts the use of scalar types in vector math builtins,
particularly the `__builtin_elementwise_*` builtins.
Previously, small scalar integer types would be promoted to `int`, as
per the usual conversions. This would silently do the wrong thing for
certain operations, such as `add_sat`, `popcount`, `bitreverse`, and
others. Similarly, since unsigned integer types were promoted to `int`,
something like `add_sat(unsigned char, unsigned char)` would perform a
*signed* operation.
With this patch, promotable scalar integer types are not promoted to
int, and are kept intact. If any of the types differ in the binary and
ternary builtins, an error is issued. Similarly an error is issued if
builtins are supplied integer types of different signs. Mixing enums of
different types in binary/ternary builtins now consistently raises an
error in all language modes.
This brings the behaviour surrounding scalar types more in line with
that of vector types. No change is made to vector types, which are both
not promoted and whose element types must match.
Fixes#84047.
RFC:
https://discourse.llvm.org/t/rfc-change-behaviour-of-elementwise-builtins-on-scalar-integer-types/83725
GCC supports three flags related to overflow behavior:
* `-fwrapv`: Makes signed integer overflow well-defined.
* `-fwrapv-pointer`: Makes pointer overflow well-defined.
* `-fno-strict-overflow`: Implies `-fwrapv -fwrapv-pointer`, making both
signed integer overflow and pointer overflow well-defined.
Clang currently only supports `-fno-strict-overflow` and `-fwrapv`, but
not `-fwrapv-pointer`.
This PR proposes to introduce `-fwrapv-pointer` and adjust the semantics
of `-fwrapv` to match GCC.
This allows signed integer overflow and pointer overflow to be
controlled independently, while `-fno-strict-overflow` still exists to
control both at the same time (and that option is consistent across GCC
and Clang).
HandleImmediateInvocation can call MarkExpressionAsImmediateEscalating
and should always be called before
CheckImmediateEscalatingFunctionDefinition.
However, we were not doing that in `ActFunctionBody`.
We simply move CheckImmediateEscalatingFunctionDefinition to
PopExpressionEvaluationContext.
Fixes#119046
Reimplement Neon FP8 vector types using attribute `neon_vector_type`
instead of having them as builtin types.
This allows to implement FP8 Neon intrinsics without the need to add
special cases for these types when using `__builtin_shufflevector`
or bitcast (using C-style cast operator) between vectors, both
extensively used in the generated code in `arm_neon.h`.
In addition to the invocation case that is already diagnosed, also
diagnose when a block reference appears on either side of a ternary
selection operator.
Until now, clang would accept the added test case only to crash during
code generation.
It turns out that the substitution for expression comparing also needs
an unevaluated context, otherwise any reference to immediate functions
might not be properly handled.
As a fallout, this also guards the VLA transformation under unevaluated
context
with `InConditionallyConstantEvaluateContext` to avoid duplicate
diagnostics.
Fixes https://github.com/llvm/llvm-project/issues/123472
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Closes#119360.
This bug occurs when passing a VLA to `va_arg`. Since the return value
is inferred to be an array, it triggers
`ScalarExprEmitter::VisitCastExpr`, which converts it to a pointer and
subsequently calls `CodeGenFunction::EmitAggExpr`. At this point,
because the inferred type is an `AggExpr` instead of a `ScalarExpr`,
`ScalarExprEmitter::VisitVAArgExpr` is not invoked, and as a result,
`CodeGenFunction::EmitVariablyModifiedType` is also not called, leading
to the size of the VLA not being retrieved.
The solution is to move the call to
`CodeGenFunction::EmitVariablyModifiedType` into
`CodeGenFunction::EmitVAArg`, ensuring that the size of the VLA is
correctly obtained regardless of whether the expression is an `AggExpr`
or a `ScalarExpr`.
For structured bindings, a call to getCapturedDeclRefType(...) was
missing. This PR fixes that behavior and adds the related diagnostics
too.
This fixes https://github.com/llvm/llvm-project/issues/95081.
This diagnoses comparisons like `ptr + unsigned_index < ptr` and `ptr +
unsigned_index >= ptr`, which are always false/true because addition of
a pointer and an unsigned index cannot wrap (or the behavior is
undefined).
This warning is intended to help find broken bounds checks (which must
be implemented in terms of uintptr_t instead).
Fixes https://github.com/llvm/llvm-project/issues/120214.
For a FunctionParmPackExpr that is used as the argument of a
sizeof...(pack) expression, we might exercise the logic that checks the
CXXRecordDecl's members regardless of the type being incomplete, when
rebuilding the DeclRefExpr into non-ODR-used forms.
Fixes https://github.com/llvm/llvm-project/issues/81436
Currently, we support `-wdeprecated-array-compare` for C++20 or above
and don't report any warning for older versions, this PR supports
`-Warray-compare` for older versions and for GCC compatibility.
Fixes#114770
This aligns with the logic in `TreeTransform::RebuildQualifiedType()`
where we refrain from adding const qualifiers to function types.
Previously, we seemed to overlook this edge case when copy-capturing a
variable that is of function type within a const-qualified lambda.
This issue also reveals other related problems as in incorrect type
printout and a suspicious implementation in DeduceTemplateArguments. I
decide to leave them in follow-up work.
Fixes#84961
This introduces `__builtin_hlsl_resource_getpointer`, which lowers to
`llvm.dx.resource.getpointer` and is used to implement indexing into
resources.
This will only work through the backend for typed buffers at this point,
but the changes to structured buffers should be correct as far as the
frontend is concerned.
Note: We probably want this to return a reference in the HLSL device
address space, but for now we're just using address space 0. Creating a
device address space and updating this code can be done later as
necessary.
Fixes#95956
Introduces `__builtin_hlsl_buffer_update_counter` clang buildin that is
used to implement the `IncrementCounter` and `DecrementCounter` methods
on `RWStructuredBuffer` and `RasterizerOrderedStructuredBuffer` (see
Note).
The builtin is translated to LLVM intrisic `llvm.dx.bufferUpdateCounter`
or `llvm.spv.bufferUpdateCounter`.
Introduces `BuiltinTypeMethodBuilder` helper in `HLSLExternalSemaSource`
that enables adding methods to builtin types using builder pattern like
this:
```
BuiltinTypeMethodBuilder(Sema, RecordBuilder, "MethodName", ReturnType)
.addParam("param_name", Type, InOutModifier)
.callBuiltin("buildin_name", { BuiltinParams })
.finalizeMethod();
```
Fixes#113513
[First version](llvm/llvm-project#114148) of this PR was reverted
because of build break.
Introduces `__builtin_hlsl_buffer_update_counter` clang buildin that is
used to implement the `IncrementCounter` and `DecrementCounter` methods
on `RWStructuredBuffer` and `RasterizerOrderedStructuredBuffer` (see
Note).
The builtin is translated to LLVM intrisic `llvm.dx.bufferUpdateCounter`
or `llvm.spv.bufferUpdateCounter`.
Introduces `BuiltinTypeMethodBuilder` helper in `HLSLExternalSemaSource`
that enables adding methods to builtin types using builder pattern like
this:
```
BuiltinTypeMethodBuilder(Sema, RecordBuilder, "MethodName", ReturnType)
.addParam("param_name", Type, InOutModifier)
.callBuiltin("buildin_name", { BuiltinParams })
.finalizeMethod();
```
Fixes#113513
This PR uses the existing lifetime analysis for the `capture_by`
attribute.
The analysis is behind `-Wdangling-capture` warning and is disabled by
default for now. Once it is found to be stable, it will be default
enabled.
Planned followup:
- add implicit inference of this attribute on STL container methods like
`std::vector::push_back`.
- (consider) warning if capturing `X` cannot capture anything. It should
be a reference, pointer or a view type.
- refactoring temporary visitors and other related handlers.
- start discussing `__global` vs `global` in the annotation in a
separate PR.
---------
Co-authored-by: Boaz Brickner <brickner@google.com>
Summary:
Address spaces are used in several embedded and GPU targets to describe
accesses to different types of memory. Currently we use the address
space enumerations to control which address spaces are considered
supersets of eachother, however this is also a target level property as
described by the C standard's passing mentions. This patch allows the
address space checks to use the target information to decide if a
pointer conversion is legal. For AMDGPU and NVPTX, all supported address
spaces can be converted to the default address space.
More semantic checks can be added on top of this, for now I'm mainly
looking to get more standard semantics working for C/C++. Right now the
address space conversions must all be done explicitly in C/C++ unlike
the offloading languages which define their own custom address spaces
that just map to the same target specific ones anyway. The main question
is if this behavior is a function of the target or the language.
The __builtin_counted_by_ref builtin is used on a flexible array
pointer and returns a pointer to the "counted_by" attribute's COUNT
argument, which is a field in the same non-anonymous struct as the
flexible array member. This is useful for automatically setting the
count field without needing the programmer's intervention. Otherwise
it's possible to get this anti-pattern:
ptr = alloc(<ty>, ..., COUNT);
ptr->FAM[9] = 42; /* <<< Sanitizer will complain */
ptr->count = COUNT;
To prevent this anti-pattern, the user can create an allocator that
automatically performs the assignment:
#define alloc(TY, FAM, COUNT) ({ \
TY __p = alloc(get_size(TY, COUNT)); \
if (__builtin_counted_by_ref(__p->FAM)) \
*__builtin_counted_by_ref(__p->FAM) = COUNT; \
__p; \
})
The builtin's behavior is heavily dependent upon the "counted_by"
attribute existing. It's main utility is during allocation to avoid
the above anti-pattern. If the flexible array member doesn't have that
attribute, the builtin becomes a no-op. Therefore, if the flexible
array member has a "count" field not referenced by "counted_by", it
must be set explicitly after the allocation as this builtin will
return a "nullptr" and the assignment will most likely be elided.
---------
Co-authored-by: Bill Wendling <isanbard@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
This fixes a crash when instantiating default arguments for templated
friend function declarations which lack a definition.
There are implementation limits which prevents us from finding the
pattern for such functions, and this causes difficulties
setting up the instantiation scope for the function parameters.
This patch skips instantiating the default argument in these cases,
which causes a minor regression in error recovery, but otherwise avoids
the crash.
Fixes#113324
We made the incorrect assumption that names of fields are unique when
creating their default initializers.
We fix that by keeping track of the instantiaation pattern for field
decls that are placeholder vars,
like we already do for unamed fields.
Fixes#114069
Swift ClangImporter now supports concurrency annotations on imported
declarations and their parameters/results, to make it possible to use
imported APIs in Swift safely there has to be a way to annotate
individual parameters and result types with relevant attributes that
indicate that e.g. a block is called on a particular actor or it accepts
a `Sendable` parameter.
To faciliate that `SwiftAttr` is switched from `InheritableAttr` which
is a declaration attribute to `DeclOrTypeAttr`. To support this
attribute in type context we need access to its "Attribute" argument
which requires `AttributedType` to be extended to include `Attr *` when
available instead of just `attr::Kind` otherwise it won't be possible to
determine what attribute should be imported.
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
When parsing its function parameters, we don't change the CurContext to
the lambda's function declaration. However,
CheckIfAnyEnclosingLambdasMustCaptureAnyPotentialCaptures() has not
yet adapted to such behavior when nested lambdas come into play.
Consider the following case,
struct Foo {};
template <int, Foo f> struct Arr {};
constexpr void foo() {
constexpr Foo F;
[&]<int I>() {
[&](Arr<I, F>) {};
}.template operator()<42>();
}
As per [basic.def.odr]p5.2, the use of F constitutes an ODR-use. And
per [basic.def.odr]p10, F should be ODR-usable in that interleaving
scope.
We failed to accept the case because the call to tryCaptureVariable()
in getStackIndexOfNearestEnclosingCaptureCapableLambda() suggested
that F is needlessly captureable. That was due to a missed handling
for AfterParameterList in FunctionScopeIndexToStopAt, where it still
presumed DC and LSI matched.
Fixes#47400Fixes#90896
This paper adds 'i' and 'j' as suffixes for forming a _Complex constant.
This feature has been supported in Clang since at least Clang 3.0, so
only test coverage is needed.
It does remove -Wgnu-imaginary-constant in C mode (still used in C++
mode) because the feature is now a C2y feature rather than a GNU one.
Consider #109148:
```c++
template <typename ...Ts>
void f() {
[] {
(^Ts);
};
}
```
When we encounter `^Ts`, we try to parse a block and subsequently call
`DiagnoseUnexpandedParameterPack()` (in `ActOnBlockArguments()`), which
sees `Ts` and sets `ContainsUnexpandedParameterPack` to `true` in the
`LambdaScopeInfo` of the enclosing lambda. However, the entire block is
subsequently discarded entirely because it isn’t even syntactically
well-formed. As a result, `ContainsUnexpandedParameterPack` is `true`
despite the lambda’s body no longer containing any unexpanded packs,
which causes an assertion the next time
`DiagnoseUnexpandedParameterPack()` is called.
This pr moves handling of unexpanded parameter packs into
`CapturingScopeInfo` instead so that the same logic is used for both
blocks and lambdas. This fixes this issue since the
`ContainsUnexpandedParameterPack` flag is now part of the block (and
before that, its `CapturingScopeInfo`) and no longer affects the
surrounding lambda directly when the block is parsed. Moreover, this
change makes blocks actually usable with pack expansion.
This fixes#109148.
While profiling a clang invocation using `-ftime-trace`, now we add
instant events when template instantiation is deferred.
These events include the fully qualified name of the function template
being deferred and therefore could be very verbose. This is therefore
only added in verbose mode (when `TimeTraceVerbose` is enabled).
The point of time when a particular instantiation is deferred can be
used to identify the parent TimeTrace scope (usually another function
instantiation), which is responsible for deferring this instantiation.
This relationship can be used to attribute the cost of a deferred
template instantiation to the function deferring this particular
instantiation.
The special-casing for RequiresExprBodyDecl caused a regression, as
reported in #110785.
The original fix for #84020 has been superseded by fd87d765c0, which
establishes a `DependentScopeDeclRefExpr` instead of a
`CXXDependentScopeMemberExpr` for the case in issue. So the spurious
diagnostic in #84020 would no longer occur.
This also merges the test for #84020 together with that for #110785 into
clang/test/SemaTemplate/instantiate-requires-expr.cpp.
No release note because I think this merits a backport.
Fixes#110785
- In Sema, when encountering Decls with function effects needing
verification, add them to a vector, DeclsWithEffectsToVerify.
- Update AST serialization to include DeclsWithEffectsToVerify.
- In AnalysisBasedWarnings, use DeclsWithEffectsToVerify as a work
queue, verifying functions with declared effects, and inferring (when
permitted and necessary) whether their callees have effects.
---------
Co-authored-by: Doug Wyatt <dwyatt@apple.com>
Co-authored-by: Sirraide <aeternalmail@gmail.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
HLSL has a different set of usual arithmetic conversions for vector
types to resolve a common type for binary operator expressions.
This PR implements the current spec proposal from:
https://github.com/microsoft/hlsl-specs/pull/311
There is one case that may need additional handling for implicitly
truncating vector<T,1> to T early to allow other transformations.
Fixes#106253
Re-lands #108659
HLSL has a different set of usual arithmetic conversions for vector
types to resolve a common type for binary operator expressions.
This PR implements the current spec proposal from:
https://github.com/microsoft/hlsl-specs/pull/311
There is one case that may need additional handling for implicitly
truncating `vector<T,1>` to `T` early to allow other transformations.
Fixes#106253