The implementation reuses C++ code paths where possible, including the
narrowing check, in order to provide diagnostics when the initializer of
a constexpr variable is not exactly representable in the target type.
The following does not work yet due to lack of support for other features:
- Diagnosing underspecified declarations involving constexpr
- Constexpr attached to compound literals
Also, due to lack of support for char8_t, some examples with UTF-8
strings don't work properly.
Fixes https://github.com/llvm/llvm-project/issues/64742
When analysing whether we should handle a binary expression as an
overloaded operator call or a builtin operator, we were calling
`checkPlaceholderForOverload()`, which takes care of any placeholders
that are not overload sets—which would usually make sense since those
need to be handled as part of overload resolution.
Unfortunately, we were also doing that for `.*`, which is not
overloadable, and then proceeding to create a builtin operator anyway.
That would crash if the RHS happened to be an unresolved overload set,
because `CreateBuiltinBinOp()` (specifically, one of its callees) asserts
in the `.*` case that its arguments aren't placeholders.
This PR instead checks for *all* placeholders early if the operator is
`.*`.
It's worth noting that:
1. In the `.*` case, we now also check for *any* placeholders (not just
non-overload-sets) in the LHS. This shouldn't make a difference: at
least I couldn't think of a way to trigger the assertion with an
overload set as the LHS of `.*`. Note, though, that the assertion in
question would also complain if the LHS happened to be of placeholder
type.
2. There is another case in which we also don't perform overload
resolution, as in the `.*` case: `=`, if the LHS is not of class or
enumeration type after handling non-overload-set placeholders. As with
1., I couldn't think of a way of getting this case to crash. Moreover,
`CreateBuiltinBinOp()` doesn't seem to care about placeholders in the
LHS or RHS in the `=` case (from what I can tell, it, or rather one of
its callees, only checks that the LHS is not a pseudo-object type, but
those will have already been handled by the call to
`checkPlaceholderForOverload()` by the time we get to this function),
so I don't think this case suffers from the same problem.
This fixes #53815.
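For context, here is a minimal sketch of well-formed uses of the builtin `.*` operator. `Counter`, `readField`, and `callGetter` are hypothetical names invented for illustration and do not come from the patch. Because `.*` cannot be overloaded, an overloaded member name has to be resolved to a single member (here through the target type of the pointer being initialized) before it can appear on the right-hand side.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Counter {
  int Value;
  int get() const { return Value; }
  int get(int Offset) const { return Value + Offset; } // overloaded member
};

// Reads a data member through a pointer to member.
int readField() {
  Counter C{40};
  int Counter::*Field = &Counter::Value;
  return C.*Field;
}

// Calls one specific overload through a pointer to member function.
int callGetter() {
  Counter C{40};
  // `&Counter::get` names an overload set; the declared pointer type selects
  // the (int) overload before `.*` is ever applied.
  int (Counter::*Getter)(int) const = &Counter::get;
  return (C.*Getter)(2);
}
```

The crash described above concerned the ill-formed case where the right-hand side was still an unresolved overload set, which this sketch deliberately avoids.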
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
There are two issues here. First, `ICK_Floating_Integral` was always
defaulting to `CK_FloatingToIntegral` for vectors regardless of the
direction of the cast. The check was scalar-only, so a vector-of-float
check was added to the conditional.
The second issue was that float-to-int casts were resolving to
ICK_Integral_Promotion when they need to resolve to
CK_FloatingToIntegral. This was fixed by changing the ordering of the
conversion checks.
This fixes #82826
HLSL supports vector truncation and element conversions as part of
standard conversion sequences. The vector truncation conversion is a C++
second conversion in the conversion sequence. If a vector truncation is
in a conversion sequence, an element conversion may occur after it and
before the standard C++ third conversion.
Vector element conversions can be boolean conversions, floating-point or
integral conversions, or promotions.
[HLSL Draft
Specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)
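The ordering described above (truncate first, then convert each remaining element) can be emulated in plain C++. This is only an analogy and not HLSL: `std::array` stands in for an HLSL vector, and `truncateAndConvert` is a hypothetical helper that does not exist in Clang or in the HLSL specification.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical emulation: keep the first M of N elements (the "vector
// truncation"), then convert each kept element to the destination element
// type (the "element conversion").
template <typename To, std::size_t M, typename From, std::size_t N>
std::array<To, M> truncateAndConvert(const std::array<From, N> &Src) {
  static_assert(M <= N, "truncation cannot add elements");
  std::array<To, M> Dst{};
  for (std::size_t I = 0; I < M; ++I)
    Dst[I] = static_cast<To>(Src[I]); // e.g. a floating-to-integral conversion
  return Dst;
}
```

For example, `truncateAndConvert<int, 2>` applied to a four-element `float` array drops the last two elements and converts the first two to `int`.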
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
This patch converts `Sema::TemplateDeductionResult` into a scoped enum
in namespace scope, making it eligible for forward declaring. This is
useful in certain contexts, such as `preferred_type` annotations on
bit-fields.
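As a rough illustration of why a namespace-scope scoped enum helps, the sketch below forward-declares an enum and uses it in a class with a bit-field before the enumerators are visible. `DeductionResult` and `DeductionInfo` are hypothetical stand-ins, not the actual Clang declarations, and the `preferred_type` attribute itself is left out so the sketch compiles with any C++11 compiler.

```cpp
#include <cassert>

// Opaque declaration: a scoped enum has a fixed underlying type (int by
// default), so it is a complete type from this point on.
enum class DeductionResult;

// Hypothetical holder that stores the enum in a bit-field. Only the opaque
// declaration above is needed here.
struct DeductionInfo {
  unsigned Result : 4;
  void set(DeductionResult R) { Result = static_cast<unsigned>(R); }
  DeductionResult get() const { return static_cast<DeductionResult>(Result); }
};

// The enumerators can be defined later, possibly in another header.
enum class DeductionResult { Success, Invalid, Incomplete };
```

An enum nested inside a class cannot be declared like this from outside the class, which is the limitation that moving the enum to namespace scope removes.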
Trying to compile a C-style variadic member function with an explicit
object parameter was crashing in Sema because of an out-of-bounds
access.
This fixes #80971.
This is yet another one-line patch to fix crashes on constraint
substitution.
```cpp
template <class, class> struct formatter;
template <class, class> struct basic_format_context {};
template <typename CharType>
concept has_format_function = format(basic_format_context<CharType, CharType>());
template <typename ValueType, typename CharType>
requires has_format_function<CharType>
struct formatter<ValueType, CharType> {
template <typename OutputIt>
CharType format(basic_format_context<OutputIt, CharType>);
};
```
In this case, we would build up a `RecoveryExpr` for a call within a
constraint expression due to the absence of viable functions. The
heuristic algorithm attempted to find such a function inside of a
ClassTemplatePartialSpecialization, from which we started to substitute
its requires-expression, and it succeeded with a FunctionTemplate such
that
1) It has only one parameter, which is dependent.
2) That single parameter depends on two template parameters. They are,
in canonical form, `<template-parameter-1-0>` and
`<template-parameter-0-1>` respectively.
Before we emit an error, we still want to recover the most viable
functions. This goes downhill to deducing template parameters against
its arguments, where we would collect the argument type with the same
depth as the parameter type into a Deduced set. The size of the set is
presumed to be that of function template parameters, which is 1 in this
case. However, since we haven't yet properly set the template depth
before the dance, we'll end up putting the type for
`<template-parameter-0-1>` to the second position of Deduced set, which
is unfortunately an access violation!
The bug seems to have been present since clang 12.0.
This fixes [the
case](https://github.com/llvm/llvm-project/issues/58548#issuecomment-1287935336).
This re-applies 30155fc0 with a fix for clangd.
### Description
clang currently doesn't evaluate the object argument of `static operator()`
and `static operator[]`, for example:
```cpp
#include <iostream>
struct Foo {
static int operator()(int x, int y) {
std::cout << "Foo::operator()" << std::endl;
return x + y;
}
static int operator[](int x, int y) {
std::cout << "Foo::operator[]" << std::endl;
return x + y;
}
};
Foo getFoo() {
std::cout << "getFoo()" << std::endl;
return {};
}
int main() {
std::cout << getFoo()(1, 2) << std::endl;
std::cout << getFoo()[1, 2] << std::endl;
}
```
`getFoo()` is expected to be called, but clang currently (17.0.6)
doesn't call it. This PR fixes this issue.
Fixes #67976, reland #68485.
### Walkthrough
- **clang/lib/Sema/SemaOverload.cpp**
- **`Sema::CreateOverloadedArraySubscriptExpr` &
`Sema::BuildCallToObjectOfClassType`**
Previously clang generated a `CallExpr` for static operators, ignoring the
object argument. In this PR a `CXXOperatorCallExpr` is generated for
static operators instead, with the object argument as the first
argument.
- **`TryObjectArgumentInitialization`**
`const` / `volatile` objects are allowed for static methods, so that we
can call static operators on them.
- **clang/lib/CodeGen/CGExpr.cpp**
- **`CodeGenFunction::EmitCall`**
CodeGen changes for `CXXOperatorCallExpr` with static operators: emit
and ignore the object argument first, then emit the operator call.
- **clang/lib/AST/ExprConstant.cpp**
- **`ExprEvaluatorBase::handleCallExpr`**
Evaluation of static operators in constexpr also needs some small changes
to work, so that the arguments won't be out of position.
- **clang/lib/Sema/SemaChecking.cpp**
- **`Sema::CheckFunctionCall`**
The code for argument checking also needs to be modified, or it will fail
the test `clang/test/SemaCXX/overloaded-operator-decl.cpp`.
- **clang-tools-extra/clangd/InlayHints.cpp**
- **`InlayHintVisitor::VisitCallExpr`**
Now that the `CXXOperatorCallExpr` for static operators also has an object
argument, we should also take care of this situation in clangd.
### Tests
- **Added:**
- **clang/test/AST/ast-dump-static-operators.cpp**
Verify the AST generated for static operators.
- **clang/test/SemaCXX/cxx2b-static-operator.cpp**
Static operators should be able to be called on const / volatile
objects.
- **Modified:**
- **clang/test/CodeGenCXX/cxx2b-static-call-operator.cpp**
- **clang/test/CodeGenCXX/cxx2b-static-subscript-operator.cpp**
Matching the new CodeGen.
### Documentation
- **clang/docs/ReleaseNotes.rst**
Update release notes.
---------
Co-authored-by: Shafik Yaghmour <shafik@users.noreply.github.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
### Description
clang currently doesn't evaluate the object argument of `static operator()`
and `static operator[]`, for example:
```cpp
#include <iostream>
struct Foo {
static int operator()(int x, int y) {
std::cout << "Foo::operator()" << std::endl;
return x + y;
}
static int operator[](int x, int y) {
std::cout << "Foo::operator[]" << std::endl;
return x + y;
}
};
Foo getFoo() {
std::cout << "getFoo()" << std::endl;
return {};
}
int main() {
std::cout << getFoo()(1, 2) << std::endl;
std::cout << getFoo()[1, 2] << std::endl;
}
```
`getFoo()` is expected to be called, but clang currently (17.0.2)
doesn't call it. This PR fixes this issue.
Fixes #67976.
### Walkthrough
- **clang/lib/Sema/SemaOverload.cpp**
- **`Sema::CreateOverloadedArraySubscriptExpr` &
`Sema::BuildCallToObjectOfClassType`**
Previously clang generated a `CallExpr` for static operators, ignoring the
object argument. In this PR a `CXXOperatorCallExpr` is generated for
static operators instead, with the object argument as the first
argument.
- **`TryObjectArgumentInitialization`**
`const` / `volatile` objects are allowed for static methods, so that we
can call static operators on them.
- **clang/lib/CodeGen/CGExpr.cpp**
- **`CodeGenFunction::EmitCall`**
CodeGen changes for `CXXOperatorCallExpr` with static operators: emit
and ignore the object argument first, then emit the operator call.
- **clang/lib/AST/ExprConstant.cpp**
- **`ExprEvaluatorBase::handleCallExpr`**
Evaluation of static operators in constexpr also needs some small changes
to work, so that the arguments won't be out of position.
- **clang/lib/Sema/SemaChecking.cpp**
- **`Sema::CheckFunctionCall`**
The code for argument checking also needs to be modified, or it will fail
the test `clang/test/SemaCXX/overloaded-operator-decl.cpp`.
### Tests
- **Added:**
- **clang/test/AST/ast-dump-static-operators.cpp**
Verify the AST generated for static operators.
- **clang/test/SemaCXX/cxx2b-static-operator.cpp**
Static operators should be able to be called on const / volatile
objects.
- **Modified:**
- **clang/test/CodeGenCXX/cxx2b-static-call-operator.cpp**
- **clang/test/CodeGenCXX/cxx2b-static-subscript-operator.cpp**
Matching the new CodeGen.
### Documentation
- **clang/docs/ReleaseNotes.rst**
Update release notes.
---------
Co-authored-by: Shafik Yaghmour <shafik@users.noreply.github.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Previously committed as 9e08e51a20d0d2b1c5724bb17e969d036fced4cd, and
reverted because a dependency commit was reverted, then committed again
as 4b574008aef5a7235c1f894ab065fe300d26e786 and reverted again because
"dependency commit" 5a391d38ac6c561ba908334d427f26124ed9132e was
reverted. But it doesn't seem that 5a391d38ac6c was a real dependency
for this.
This commit incorporates 4b574008aef5a7235c1f894ab065fe300d26e786 and
18e093faf726d15f210ab4917142beec51848258 by Richard Smith (@zygoloid),
with some minor fixes, most notably:
- `UncommonValue` renamed to `StructuralValue`
- `VK_PRValue` instead of `VK_RValue` as default kind in lvalue and
member pointer handling branch in
`BuildExpressionFromNonTypeTemplateArgumentValue`;
- handling of `StructuralValue` in `IsTypeDeclaredInsideVisitor`;
- filling in `SugaredConverted` along with `CanonicalConverted`
parameter in `Sema::CheckTemplateArgument`;
- minor cleanup in
`TemplateInstantiator::transformNonTypeTemplateParmRef`;
- `TemplateArgument` constructors refactored;
- `ODRHash` calculation for `UncommonValue`;
- USR generation for `UncommonValue`;
- more correct MS compatibility mangling algorithm (tested on MSVC ver.
19.35; toolset ver. 143);
- IR emitting fixed on using a subobject as a template argument when the
corresponding template parameter is used in an lvalue context;
- `noundef` attribute and opaque pointers in `template-arguments` test;
- analysis for C++17 mode is turned off for templates in
`warn-bool-conversion` test; in C++17 and C++20 mode, array reference
used as a template argument of pointer type produces template argument
of UncommonValue type, and
`BuildExpressionFromNonTypeTemplateArgumentValue` makes
`OpaqueValueExpr` for it, and `DiagnoseAlwaysNonNullPointer` cannot see
through it; despite of "These cases should not warn" comment, I'm not
sure about correct behavior; I'd expect a suggestion to replace `if` by
`if constexpr`;
- `temp.arg.nontype/p1.cpp` and `dr18xx.cpp` tests fixed.
Closes #77638, #24186
Rebased from <https://reviews.llvm.org/D156032>, see there for more
information.
Implements wording change in [CWG2137](https://wg21.link/CWG2137) in the
first commit.
This also implements an approach to [CWG2311](https://wg21.link/CWG2311)
in the second commit, because too much code that relies on `T{T_prvalue}`
being an elision would break otherwise. Because that issue is still open
and the CWG issue doesn't provide wording to fix it, other compilers may
behave differently.
Fixes a regression from 69066ab3 in which we compared the template lists
of potential overloads before checking their declaration contexts.
This would cause a crash when doing constraint substitution as part of
that template check, because we would try to refer to not yet
instantiated entities (the underlying cause is unclear).
This patch reorders (again) when we look at template parameters so we
don't do it when checking friends in different lexical contexts.
Fixes #77953
Fixes #78101
This patch replaces the `__arm_new_za`, `__arm_shared_za` and
`__arm_preserves_za` attributes with:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`
As described in https://github.com/ARM-software/acle/pull/276.
One change is that `__arm_in/out/inout/preserves(S)` are all mutually
exclusive, whereas previously it was fine to write `__arm_shared_za
__arm_preserves_za`. This case is now represented with `__arm_in("za")`.
The current implementation uses the same LLVM attributes under the hood,
since `__arm_in/out/inout` are all variations of "shared ZA", so can use
the existing `aarch64_pstate_za_shared` attribute in LLVM.
#77941 will add support for the new "zt0" state as introduced
with SME2.
Re-applies https://github.com/llvm/llvm-project/pull/69595 with extra
[diff](79181efd0d)
### New changes
Further relax ambiguities with a warning for member operators of a
template class (primary templates of such ops do not match). Eg:
```cpp
template <class T>
struct S {
template <typename OtherT>
bool operator==(const OtherT &rhs);
};
struct A : S<int> {};
struct B : S<bool> {};
bool x = A{} == B{}; // accepted with a warning.
```
This is important for making llvm build using previous clang versions in
C++20 mode (eg: this makes the commit
e558be51bab051d1471d92e967f8a2aecc13567a keep working with a warning
instead of an error).
### Description from https://github.com/llvm/llvm-project/pull/69595
https://github.com/llvm/llvm-project/pull/68999 correctly computed
conversion sequence for reversed args to a template operator. This was a
breaking change as code, previously accepted in C++17, starts to break
in C++20.
Example:
```cpp
struct P {};
template<class S> bool operator==(const P&, const S &);
struct A : public P {};
struct B : public P {};
bool check(A a, B b) { return a == b; } // This is now ambiguous in C++20.
```
In order to minimise widespread breakages, as a clang extension, we had
previously accepted such ambiguities with a warning
(`-Wambiguous-reversed-operator`) for non-template operators. Due to the
same reasons, we extend this relaxation for template operators.
Fixes https://github.com/llvm/llvm-project/issues/53954
Prior to this, attempts to bind a bit-field to an NTTP of reference type
produced an error because references to subobjects in NTTPs are
disallowed. But C++20 allows references to subobjects in NTTPs generally
(see
[P1907R1](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1907r1.html)).
Without this change, implementing P1907R1 would cause a bug allowing
bit-fields to be bound to reference template arguments.
Extracted from https://reviews.llvm.org/D140996
Functions which correspond but have different template parameter lists
are not redeclarations.
Fixes a regression introduced by af4751.
(The patch just moves the template parameter check above the
signature check.)
Fixes #76358
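A minimal sketch of the rule, using a well-known SFINAE idiom rather than the code from the bug report: both templates below declare `int kind(T)`, so their signatures correspond, but their template parameter lists differ, so they are two distinct templates rather than redeclarations of one. The name `kind` is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Same function parameter list and return type in both declarations; only
// the type of the second (unnamed) template parameter differs.
template <typename T,
          typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
int kind(T) { return 1; } // selected for integral arguments

template <typename T,
          typename std::enable_if<std::is_floating_point<T>::value, int>::type = 0>
int kind(T) { return 2; } // selected for floating-point arguments
```

If the second declaration were treated as a redeclaration of the first, it would be diagnosed as a redefinition instead of participating in overload resolution.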
This reverts commit a1e2c6566305061c115954b048f2957c8d55cb5b.
Revert this patch due to regression. A testcase is:
`template <typename T>
class C {
explicit C() {};
};
template <> C<int>::C() {};
`
When deciding whether a previous function declaration is an overload or
override, implicit host/device attrs should not be considered.
This fixes the failure for the following code:
`template <typename T>
class C {
explicit C() {};
};
template <> C<int>::C() {};
`
The issue was introduced by
https://github.com/llvm/llvm-project/pull/72394
since the template specialization is treated as an overload because implicit
host/device attrs are considered for overload/override differentiation.
This is a follow-up to febf5c97bba7910e796041c9518fce01f31ae826 and
another instance of #64121.
The added test only fails if Clang is built with libc++ and an enabled
debug check for strict weak ordering.
This patch introduces a new enumerator `Invalid = 0`, shifting other enumerators by +1. Contrary to how it might sound, this actually affirms status quo of how this enum is stored in `clang::Decl`:
```
/// If 0, we have not computed the linkage of this declaration.
/// Otherwise, it is the linkage + 1.
mutable unsigned CacheValidAndLinkage : 3;
```
This patch keeps debuggers from misreading the enumerator stored in this bit-field. It also converts `clang::Linkage` to a scoped enum.
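The storage scheme can be sketched as follows. This is a simplified model under stated assumptions: `DeclBits`, its member functions, and the reduced enumerator list are hypothetical and do not mirror the real `clang::Decl` or the full `clang::Linkage` enum.

```cpp
#include <cassert>

// Simplified stand-in for the enum: 0 is reserved for "not computed yet", so
// the stored bits can be read back as an enumerator without any +1 / -1.
enum class Linkage : unsigned char { Invalid = 0, None, Internal, External };

// Hypothetical holder with a 3-bit cache, analogous to the bit-field quoted
// above.
struct DeclBits {
  mutable unsigned CachedLinkage : 3;

  DeclBits() : CachedLinkage(0) {}

  bool hasCachedLinkage() const {
    return static_cast<Linkage>(CachedLinkage) != Linkage::Invalid;
  }
  void setCachedLinkage(Linkage L) const {
    CachedLinkage = static_cast<unsigned>(L);
  }
  Linkage getCachedLinkage() const {
    return static_cast<Linkage>(CachedLinkage);
  }
};
```

With an explicit `Invalid` enumerator, a debugger that interprets the raw bits as the enum shows the value that was actually cached.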
This patch moves `ArraySizeModifier` before `Type` declaration so that it's complete at `ArrayTypeBitfields` declaration. It's also converted to scoped enum along the way.
This breaks the C++20 build of LLVM by clang 17 and earlier.
The next step should be to reduce the error to a warning for
https://godbolt.org/z/s99bvq4sG
b100ca6f219fda1fed5b92aba8471aa9a6ef8906 or similar should be reapplied
after the bug fix has reached clang-18.
`S.getScopeForContext` determines the **active** scope associated with
the given `declContext`.
This fails to find the matching `operator!=` if the candidate `operator==`
was found via ADL, since that scope is not active.
Instead, look up directly using the namespace decl of `operator==`.
Fixes #68901
Make it a strict weak order.
Fixes #64121.
The current implementation uses the definition of ordering from the C++ Standard.
That definition provides only a partial order and cannot be used in sorting
algorithms.
The debug builds of libc++ are capable of detecting that problem
and this failure was found when building Clang with libc++ and
those extra checks enabled, see #64121.
The new ordering is a strict weak order and still
pushes most interesting functions to the start of the list.
In some cases, it leads to better results, e.g.
```
struct Foo {
operator int();
operator const char*();
};
void test() { Foo() - Foo(); }
```
This now produces a list with the two most relevant builtin operators at the top,
i.e. `operator-(int, int)` and `operator-(const char*, const char*)`.
Previously `operator-(const char*, const char*)` was the first element,
but `operator-(int, int)` was only the 13th element in the output.
This is a consequence of `stable_sort` now being able to compare those
two candidates, which are indistinguishable in the semantic partial order
despite being two local minimums in their respective comparable
subsets.
However, the new implementation does not take into account some aspects of
C++ semantics, e.g. which function template is more specialized. This
can also lead to worse ordering sometimes.
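To illustrate the general technique (not the comparator used in the patch), a comparator that orders elements by a tuple of keys is always a strict weak order, which makes it safe for `std::stable_sort`. `Candidate`, its fields, `candidateLess`, and `sortedNames` are hypothetical names chosen for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, heavily simplified overload candidate.
struct Candidate {
  std::string Name;
  bool Viable;
  unsigned ConversionRank; // lower is better
};

// Lexicographic comparison of a key tuple: viable candidates first, then by
// rank. Candidates with equal keys are equivalent, and that equivalence is
// transitive, which is what a partial order fails to guarantee.
bool candidateLess(const Candidate &L, const Candidate &R) {
  return std::make_tuple(!L.Viable, L.ConversionRank) <
         std::make_tuple(!R.Viable, R.ConversionRank);
}

// Sorts a copy of the candidates and returns their names in order.
std::vector<std::string> sortedNames(std::vector<Candidate> Cands) {
  std::stable_sort(Cands.begin(), Cands.end(), candidateLess);
  std::vector<std::string> Names;
  for (const Candidate &C : Cands)
    Names.push_back(C.Name);
  return Names;
}
```

Because `std::stable_sort` keeps the relative order of equivalent elements, candidates with the same key stay in their original order.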
Reviewed By: #clang-language-wg, aaron.ballman
Differential Revision: https://reviews.llvm.org/D159351
https://github.com/llvm/llvm-project/pull/68999 correctly computed
conversion sequence for reversed args to a template operator. This was
a breaking change as code, previously accepted in C++17, starts to break
in C++20.
Example:
```cpp
struct P {};
template<class S> bool operator==(const P&, const S &);
struct A : public P {};
struct B : public P {};
bool check(A a, B b) { return a == b; } // This is now ambiguous in C++20.
```
In order to minimise widespread breakages, as a clang extension, we had
previously accepted such ambiguities with a warning
(`-Wambiguous-reversed-operator`) for non-template operators. Due to the
same reasons, we extend this relaxation for template operators.
Fixes https://github.com/llvm/llvm-project/issues/53954
We associated the conversion sequence for args (when reversed) with the
wrong index.
This led clang to believe that the reversed `operator==` was a worse
overload candidate than the `operator==` without reversed args when both
candidates were ambiguous.
Fixes https://github.com/llvm/llvm-project/issues/53954
If there are two guides, one of them generated from a non-templated
constructor and the other from a templated constructor, then the standard
gives priority to the first. Clang detected ambiguity before; now the
correct guide is chosen.
The correct behavior is described in this paper:
http://wg21.link/P0620R0
Example for the bug: http://godbolt.org/z/ee3e9qG78
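A minimal sketch of the rule, modelled on the constructors used in the C++ standard's example for this tie-breaker and assuming a C++17 compiler that implements it. The `Picked` member and `pickedConstructor` are hypothetical additions that only record which constructor ran.

```cpp
#include <cassert>

template <class T> struct A {
  int Picked;
  A(T, T, int) : Picked(3) {}                    // #3: non-template constructor
  template <class U> A(int, T, U) : Picked(4) {} // #4: constructor template
};

// Both implicit guides deduce A<int> from (1, 2, 3) with equally good
// conversions; the guide generated from the non-template constructor #3 is
// preferred, so deduction is not ambiguous.
int pickedConstructor() {
  A x(1, 2, 3);
  return x.Picked;
}
```

Once `A<int>` is deduced, ordinary overload resolution also prefers the non-template constructor, so `Picked` ends up as 3.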
As an unrelated minor change, fix the issue
https://github.com/llvm/llvm-project/issues/64020,
which could've led to incorrect behavior if further development inserted
code after a call to `isAddressSpaceSubsetOf()`, which specified the two
parameters in the wrong order.
---------
Co-authored-by: hobois <horvath.botond.istvan@gmial.com>
We are currently rejecting an _Atomic qualified integer in a switch statement.
This fixes the issue by doing an lvalue conversion before trying to match on the type.
Fixes #65557
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D159522
https://reviews.llvm.org/D158247 caused regressions for HIP on Windows
and was reverted.
A reduced test case is:
```
typedef void (__stdcall* funcTy)();
void invoke(funcTy f);
static void __stdcall callee() noexcept {
}
void foo() {
invoke(callee);
}
```
It is due to clang not handling host/device attributes for calling
conventions in a few places.
This patch fixes that.
The goal of this change is to clean up some of the code surrounding
HLSL using CXXThisExpr as a non-pointer l-value. This change cleans up
a bunch of assumptions and inconsistencies around how the type of
`this` is handled through the AST and code generation.
This change is mostly NFC for HLSL, and completely NFC for other
language modes.
This change introduces a new member to query for the this object's type
and seeks to clarify the normal usages of the this type.
With the introduction of HLSL to clang, CXXThisExpr may now be an
l-value and behave like a reference type rather than C++'s normal
method of it being an r-value of pointer type.
With this change there are now three ways in which a caller might need
to query the type of `this`:
* The type of the `CXXThisExpr`
* The type of the object `this` refers to
* The type of the implicit (or explicit) `this` argument
This change codifies those three ways you may need to query
respectively as:
* CXXMethodDecl::getThisType()
* CXXMethodDecl::getThisObjectType()
* CXXMethodDecl::getThisArgType()
This change then revisits all uses of `getThisType()`, and in cases
where the only use was to resolve the pointee type, it replaces the
call with `getThisObjectType()`. In other cases it evaluates whether
the desired returned type is the type of the `this` expr, or the type
of the `this` function argument. The `this` expr type is used for
creating additional expr AST nodes and for member lookup, while the
argument type is used mostly for code generation.
Additionally some cases that used `getThisType` in simple queries could
be substituted for `getThisObjectType`. Since `getThisType` is
implemented in terms of `getThisObjectType` calling the latter should be
more efficient if the former isn't needed.
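The distinction between the type of the `this` expression and the type of the object it refers to can be seen in standard C++ itself. The sketch below uses only the standard library, not the Clang API. `Widget` and its member function are hypothetical, and the mapping to `getThisType()` and `getThisObjectType()` in the comments is an analogy for the C++ (non-HLSL) case.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class used only for illustration.
struct Widget {
  int Size = 0;

  bool thisIsPointerToConstWidget() const {
    // In C++, `this` is a prvalue of pointer type; in a const member function
    // it points to a const object. This is the "type of the CXXThisExpr".
    using ThisExprType = decltype(this);
    // Stripping the pointer gives the cv-qualified class type, i.e. the
    // "type of the object `this` refers to".
    using ThisObjectType = std::remove_pointer<ThisExprType>::type;
    return std::is_same<ThisExprType, const Widget *>::value &&
           std::is_same<ThisObjectType, const Widget>::value;
  }
};
```

Callers that previously took the pointer type only to strip the pointer again are the ones the change moves over to the object-type query.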
Reviewed By: aaron.ballman, bogner
Differential Revision: https://reviews.llvm.org/D159247
This reverts commit de0df639724b10001ea9a74539381ea494296be9.
It was reverted due to regression in HIP unit test on Windows:
In file included from C:\hip-tests\catch\unit\graph\hipGraphClone.cc:37:
In file included from C:\hip-tests\catch\.\include\hip_test_common.hh:24:
In file included from C:\hip-tests\catch\.\include/hip_test_context.hh:24:
In file included from C:/install/native/Release/x64/hip/include\hip/hip_runtime.h:54:
C:/dk/win\vc\14.31.31107\include\thread:76:70: error: cannot initialize a parameter of type '_beginthreadex_proc_type' (aka 'unsigned int (*)(void *) __attribute__((stdcall))') with an lvalue of type 'const unsigned int (*)(void *) noexcept __attribute__((stdcall))': different exception specifications
76 | reinterpret_cast<void*>(_CSTD _beginthreadex(nullptr, 0, _Invoker_proc, _Decay_copied.get(), 0, &_Thr._Id));
| ^~~~~~~~~~~~~
C:\hip-tests\catch\unit\graph\hipGraphClone.cc:290:21) &>' requested here
90 | _Start(_STD forward<_Fn>(_Fx), _STD forward<_Args>(_Ax)...);
| ^
C:\hip-tests\catch\unit\graph\hipGraphClone.cc:290:21) &, 0>' requested here
311 | std::thread t(lambdaFunc);
| ^
C:/dk/win\ms_wdk\e22621\Include\10.0.22621.0\ucrt\process.h:99:40: note: passing argument to parameter '_StartAddress' here
99 | _In_ _beginthreadex_proc_type _StartAddress,
| ^
1 error generated when compiling for gfx1030.
Currently, clang does not resolve certain overloaded functions correctly in the initializer
of global variables, e.g.
template<typename T1, typename U>
T1 mypow(T1, U);
__attribute__((device)) double mypow(double, int);
double t_extent = mypow(1.0, 2);
In the above example, mypow is supposed to resolve to the host version
but clang resolves it to the device version instead, and emits an error
(https://godbolt.org/z/17xxzaa67).
However, if the variable is assigned in a host function, there is no error.
The discrepancy in overloading resolution inside and outside of
a function is due to clang not accounting for the host/device target
when resolving functions called in the initializer of a global variable.
This patch introduces a global host/device target context for CUDA/HIP
for functions called outside of functions. For global variable initialization,
it is determined by the host/device attribute of the variable. For other
situations, a default value of host_device is sufficient.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D158247
Fixes: SWDEV-416731
nvcc allows using std::malloc and std::free in device code.
When std::malloc or std::free is passed as a template
function argument with template argument deduction,
there are no diagnostics, e.g.
__global__ void kern() {
void *p = std::malloc(1);
std::free(p);
}
int main()
{
std::shared_ptr<float> a;
a = std::shared_ptr<float>(
(float*)std::malloc(sizeof(float) * 100),
std::free
);
return 0;
}
However, the same code fails to compile with clang
(https://godbolt.org/z/1roGvo6YY). The reason is
that clang does not have logic to choose a function
argument from an overloaded set of candidates
based on host/device attributes for template argument
deduction.
Currently, clang does have logic to choose a candidate
based on the constraints of the candidates. This patch
extends that logic to account for the CUDA host/device-based
preference.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D154300
This patch adds all the language-level function keywords defined in:
https://github.com/ARM-software/acle/pull/188 (merged)
https://github.com/ARM-software/acle/pull/261 (update after D148700 landed)
The keywords are used to control PSTATE.ZA and PSTATE.SM, which are
respectively used for enabling the use of the ZA matrix array and Streaming
mode. This information needs to be available on call sites, since the use
of ZA or streaming mode may have to be enabled or disabled around the
call-site (depending on the IR attributes set on the caller and the
callee). For calls to functions from a function pointer, there is no IR
declaration available, so the IR attributes must be added explicitly to the
call-site.
With the exception of '__arm_locally_streaming' and '__arm_new_za' the
information is part of the function's interface, not just the function
definition, and thus needs to be propagated through the
FunctionProtoType::ExtProtoInfo.
This patch adds the definitions of these keywords, as well as codegen and
semantic analysis to ensure conversions between function pointers are valid
and that no conflicting keywords are set. For example, '__arm_streaming'
and '__arm_streaming_compatible' are mutually exclusive.
Differential Revision: https://reviews.llvm.org/D127762