Whereas it is UB in terms of the standard to delete an array of objects
via pointer whose static type doesn't match its dynamic type, MSVC
supports an extension allowing to do it.
Aside from array deletion not working correctly in the mentioned case,
currently not having this extension implemented causes clang to generate
code that is not compatible with the code generated by MSVC, because
clang always puts scalar deleting destructor to the vftable. This PR
aims to resolve these problems.
It was reverted due to link time errors in chromium with sanitizer
coverage enabled,
which is fixed by https://github.com/llvm/llvm-project/pull/131929 .
The second commit of this PR also contains a fix for a runtime failure
in chromium reported
in
https://github.com/llvm/llvm-project/pull/126240#issuecomment-2730216384
.
Fixes https://github.com/llvm/llvm-project/issues/19772
Closes#99156.
Tasks completed:
- Implement `smoothstep` using HLSL source in `hlsl_intrinsics.h`
- Implement the `smoothstep` SPIR-V target built-in in
`clang/include/clang/Basic/BuiltinsSPIRV.td`
- Add sema checks for `smoothstep` to `CheckSPIRVBuiltinFunctionCall` in
`clang/lib/Sema/SemaSPIRV.cpp`
- Add codegen for spv `smoothstep` to `EmitSPIRVBuiltinExpr` in
`clang/lib/CodeGen/TargetBuiltins/SPIR.cpp`
- Add codegen tests to `clang/test/CodeGenHLSL/builtins/smoothstep.hlsl`
- Add spv codegen test to
`clang/test/CodeGenSPIRV/Builtins/smoothstep.c`
- Add sema tests to
`clang/test/SemaHLSL/BuiltIns/smoothstep-errors.hlsl`
- Add spv sema tests to
`clang/test/SemaSPIRV/BuiltIns/smoothstep-errors.c`
- Create the `int_spv_smoothstep` intrinsic in `IntrinsicsSPIRV.td`
- In SPIRVInstructionSelector.cpp create the `smoothstep` lowering and
map it to `int_spv_smoothstep` in
`SPIRVInstructionSelector::selectIntrinsic`
- Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/smoothstep.ll`
- Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/opencl/smoothstep.ll`
Summary:
When we were first porting to COV5, this lead to some ABI issues due to
a change in how we looked up the work group size. Bitcode libraries
relied on the builtins to emit code, but this was changed between
versions. This prevented the bitcode libraries, like OpenMP or libc,
from being used for both COV4 and COV5. The solution was to have this
'none' functionality which effectively emitted code that branched off of
a global to resolve to either version.
This isn't a great solution because it forced every TU to have this
variable in it. The patch in
https://github.com/llvm/llvm-project/pull/131033 removed support for
COV4 from OpenMP, which was the only consumer of this functionality.
Other users like HIP and OpenCL did not use this because they linked the
ROCm Device Library directly which has its own handling (The name was
borrowed from it after all).
So, now that we don't need to worry about backward compatibility with
COV4, we can remove this special handling. Users can still emit COV4
code, this simply removes the special handling used to make the OpenMP
device runtime bitcode version agnostic.
We can use *Set::insert_range to collapse:
for (auto Elem : Range)
Set.insert(E);
down to:
Set.insert_range(Range);
In some cases, we can further fold that into the set declaration.
C2y adds the `_Countof` operator which returns the number of elements in
an array. As with `sizeof`, `_Countof` either accepts a parenthesized
type name or an expression. Its operand must be (of) an array type. When
passed a constant-size array operand, the operator is a constant
expression which is valid for use as an integer constant expression.
This is being exposed as an extension in earlier C language modes, but
not in C++. C++ already has `std::extent` and `std::size` to cover these
needs, so the operator doesn't seem to get the user enough benefit to
warrant carrying this as an extension.
Fixes#102836
When pragma of loop transformations is specified, follow-up metadata for
loops is generated after each transformation. On the LLVM side,
follow-up metadata is expected to be a list of properties, such as the
following:
```
!followup = !{!"llvm.loop.vectorize.followup_all", !mp, !isvectorized}
!mp = !{!"llvm.loop.mustprogress"}
!isvectorized = !{"llvm.loop.isvectorized"}
```
However, on the clang side, the generated metadata contains an MDNode
that has those properties, as shown below:
```
!followup = !{!"llvm.loop.vectorize.followup_all", !loop_id}
!loop_id = distinct !{!loop_id, !mp, !isvectorized}
!mp = !{!"llvm.loop.mustprogress"}
!isvectorized = !{"llvm.loop.isvectorized"}
```
According to the
[LangRef](https://llvm.org/docs/TransformMetadata.html#transformation-metadata-structure),
the LLVM side is correct. Due to this inconsistency, follow-up metadata
was not interpreted correctly, e.g., only one transformation is applied
when multiple pragmas are used.
This patch fixes clang side to emit followup metadata in correct format.
clang/lib/CodeGen/CGBuiltin.cpp is over 1MB long (>23k LoC), and can
take minutes to recompile (depending on compiler and host system) when
modified, and 5 seconds for clangd to update for every edit. Splitting
this file was discussed in this thread:
https://discourse.llvm.org/t/splitting-clang-s-cgbuiltin-cpp-over-23k-lines-long-takes-1min-to-compile/
and the idea has received a number of +1 votes, hence this change.
Lower the `SV_GroupIndex` semantic as the
`llvm.spv.flattened.thread.id.in.group` intrinsic.
Depends on #130670.
---------
Co-authored-by: Steven Perron <stevenperron@google.com>
Original PR: #130537
Originally reverted due to revert of dependent commit. Relanding with no
changes.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the base class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntatically, and
they represent the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements.
Relands #130976 with adjustments to test requirements.
Calls to __noreturn__ functions result in region termination for
coverage mapping. But this creates incorrect coverage results when
__noreturn__ functions (or other constructs that result in region
termination) occur within [GNU statement expressions][1].
In this scenario an extra gap region is introduced within VisitStmt,
such that if the following line does not introduce a new region it
is unconditionally counted as uncovered.
This change adjusts the mapping such that terminate statements
within statement expressions do not propagate that termination
state after the statement expression is processed.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.htmlFixes#124296
This reverts commit 0dba5381d8c8e4cadc32a067bf2fe5e3486ae53d.
YMM rounding was removed from AVX10 whitepaper. Ref:
https://cdrdv2.intel.com/v1/dl/getContent/784343
The MINMAX and SATURATING CONVERT instructions will be removed as a
follow up.
- Use `Intrinsic::` directly instead of `llvm::Intrinsic::`.
- Eliminate redundant `nullptr` for some `CreateIntrinsic` calls.
- Eliminate redundant `ArrayRef` casts.
- Use C++17 structured binding instead of `std::tie`.
Original PR: #130537
Reland after updating lldb too.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the base class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntatically, and
they represent the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements.
This changes the MemberPointerType representation to use a
NestedNameSpecifier instead of a Type to represent the class.
Since the qualifiers are always parsed as nested names, there was an
impedance mismatch when converting these back and forth into types, and
this led to issues in preserving sugar.
The nested names are indeed a better match for these, as the differences
which a QualType can represent cannot be expressed syntactically, and it
also represents the use case more exactly, being either dependent or
referring to a CXXRecord, unqualified.
This patch also makes the MemberPointerType able to represent sugar for
a {up/downcast}cast conversion of the base class, although for now the
underlying type is canonical, as preserving the sugar up to that point
requires further work.
As usual, includes a few drive-by fixes in order to make use of the
improvements, and removing some duplications, for example
CheckBaseClassAccess is deduplicated from across SemaAccess and
SemaCast.
This pull request is the third part of an ongoing effort to extends PGO
instrumentation to GPU device code and depends on
https://github.com/llvm/llvm-project/pull/93365. This PR makes the
following changes:
- Allows PGO flags to be supplied to GPU targets
- Pulls version global from device
- Modifies `__llvm_write_custom_profile` and `lprofWriteDataImpl` to
allow the PGO version to be overridden
Calls to __noreturn__ functions result in region termination for
coverage mapping. But this creates incorrect coverage results when
__noreturn__ functions (or other constructs that result in region
termination) occur within [GNU statement expressions][1].
In this scenario an extra gap region is introduced within VisitStmt,
such that if the following line does not introduce a new region it
is unconditionally counted as uncovered.
This change adjusts the mapping such that terminate statements
within statement expressions do not propagate that termination
state after the statement expression is processed.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.htmlFixes#124296
When `-fcomplex-arithmetic=promoted` is set complex divassign `/=` should
promote to a wider type the same way division (without assignment) does.
Prior to this change, Smith's algorithm would be used for divassign.
Fixes: https://github.com/llvm/llvm-project/issues/131129
Introduce a trait to determine the number of bindings that would be
produced by
```cpp
auto [...p] = expr;
```
This is necessary to implement P2300
(https://eel.is/c++draft/exec#snd.concepts-5), but can also be used to
implement a general get<N> function that supports aggregates
`__builtin_structured_binding_size` is a unary type trait that evaluates
to the number of bindings in a decomposition
If the argument cannot be decomposed, a sfinae-friendly error is
produced.
A type is considered a valid tuple if `std::tuple_size_v<T>` is a valid
expression, even if there is no valid `std::tuple_element`
specialization or suitable `get` function for that type.
Fixes#46049
This builtin is supported by GCC and is a way to improve diagnostic
behavior for va_start in C23 mode. C23 no longer requires a second
argument to the va_start macro in support of variadic functions with no
leading parameters. However, we still want to diagnose passing more than
two arguments, or diagnose when passing something other than the last
parameter in the variadic function.
This also updates the freestanding <stdarg.h> header to use the new
builtin, same as how GCC works.
Fixes#124031
Processes `HLSLResourceBindingAttr` attributes that represent
`register(c#)` annotations on default constant buffer declarations and
applies its value to the buffer layout. Any default buffer declarations
without an explicit `register(c#)` annotation are placed after the
elements with explicit layout.
This PR also adds a test case for a `cbuffer` that does not have
`packoffset` on all declarations. Same layout rules apply here as well.
Fixes#126791
This caused link errors when building with sancov. See comment on the PR.
> Whereas it is UB in terms of the standard to delete an array of objects
> via pointer whose static type doesn't match its dynamic type, MSVC
> supports an extension allowing to do it.
> Aside from array deletion not working correctly in the mentioned case,
> currently not having this extension implemented causes clang to generate
> code that is not compatible with the code generated by MSVC, because
> clang always puts scalar deleting destructor to the vftable. This PR
> aims to resolve these problems.
>
> Fixes https://github.com/llvm/llvm-project/issues/19772
This reverts commit d6942d54f677000cf713d2b0eba57b641452beb4.
Clang has __builtin_elementwise_exp and __builtin_elementwise_exp2
intrinsics, but no __builtin_elementwise_exp10.
There doesn't seem to be a good reason not to expose the exp10 flavour
of this intrinsic too.
This commit introduces this intrinsic following the same pattern as the
exp and exp2 versions.
Fixes: SWDEV-519541