[HLSL][Doc] Document multi-argument resolution (#104474)
This updates the expected differences document to capture the difference in multi-argument overload resolution between Clang and DXC. Fixes #99530.
parent d66765ddf1
commit 02654f7370
@@ -54,6 +54,19 @@ HLSL 202x based on proposal
and
`0008 <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0008-non-member-operator-overloading.md>`_.

The largest difference between Clang and DXC's overload resolution is the
algorithm used for identifying best-match overloads. There are more details
about the algorithmic differences in the :ref:`multi_argument_overloads` section
below. There are three high level differences that should be highlighted:

* **There should be no cases** where DXC and Clang both successfully
  resolve an overload where the resolved overload is different between the two.
* There are cases where Clang will successfully resolve an overload that DXC
  wouldn't because we've trimmed the overload set in Clang to remove ambiguity.
* There are cases where DXC will successfully resolve an overload that Clang
  will not for two reasons: (1) DXC only generates partial overload sets for
  builtin functions and (2) DXC resolves cases that probably should be ambiguous.

Clang's implementation extends standard overload resolution rules to HLSL
library functionality. This causes subtle changes in overload resolution
behavior between Clang and DXC. Some examples include:
@@ -71,18 +84,23 @@ behavior between Clang and DXC. Some examples include:
    uint U;
    int I;
    float X, Y, Z;
    double3 A, B;
    double3 R, G;
  }

  void twoParams(int, int);
  void twoParams(float, float);
  void takesSingleDouble(double);
  void takesSingleDouble(vector<double, 1>);

  void scalarOrVector(double);
  void scalarOrVector(vector<double, 2>);

  export void call() {
    halfOrInt16(U); // DXC: Fails with call ambiguous between int16_t and uint16_t overloads
                    // Clang: Resolves to halfOrInt16(uint16_t).
    halfOrInt16(I); // All: Resolves to halfOrInt16(int16_t).
    half H;
    halfOrInt16(I); // All: Resolves to halfOrInt16(int16_t).

  #ifndef IGNORE_ERRORS
    halfOrInt16(U); // All: Fails with call ambiguous between int16_t and uint16_t
                    // overloads

    // asfloat16 is a builtin with overloads for half, int16_t, and uint16_t.
    H = asfloat16(I); // DXC: Fails to resolve overload for int.
                      // Clang: Resolves to asfloat16(int16_t).
@@ -94,21 +112,28 @@ behavior between Clang and DXC. Some examples include:

    takesDoubles(X, Y, Z); // Works on all compilers
  #ifndef IGNORE_ERRORS
    fma(X, Y, Z); // DXC: Fails to resolve no known conversion from float to double.
    fma(X, Y, Z); // DXC: Fails to resolve no known conversion from float to
                  // double.
                  // Clang: Resolves to fma(double,double,double).
  #endif

    double D = dot(A, B); // DXC: Resolves to dot(double3, double3), fails DXIL Validation.
    double D = dot(R, G); // DXC: Resolves to dot(double3, double3), fails DXIL Validation.
                          // FXC: Expands to compute double dot product with fmul/fadd
                          // Clang: Resolves to dot(float3, float3), emits conversion warnings.
                          // Clang: Fails to resolve as ambiguous against
                          // dot(half, half) or dot(float, float)
  #endif

  #ifndef IGNORE_ERRORS
    tan(B); // DXC: resolves to tan(float).
            // Clang: Fails to resolve, ambiguous between integer types.

    twoParams(I, X); // DXC: resolves twoParams(int, int).
                     // Clang: Fails to resolve ambiguous conversions.
  #endif

    double D;
    takesSingleDouble(D); // All: Fails to resolve ambiguous conversions.
    takesSingleDouble(R); // All: Fails to resolve ambiguous conversions.

    scalarOrVector(D); // All: Resolves to scalarOrVector(double).
    scalarOrVector(R); // All: Fails to resolve ambiguous conversions.
  }

.. note::
@@ -119,3 +144,75 @@ behavior between Clang and DXC. Some examples include:
  diagnostic notifying the user of the conversion rather than silently altering
  precision relative to the other overloads (as FXC does) or generating code
  that will fail validation (as DXC does).

.. _multi_argument_overloads:

Multi-Argument Overloads
------------------------

In addition to the differences in single-element conversions, Clang and DXC
differ dramatically in multi-argument overload resolution. C++ multi-argument
overload resolution behavior (or something very similar) is required to
implement
`non-member operator overloading <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0008-non-member-operator-overloading.md>`_.

Clang adopts the C++-inspired language from the
`draft HLSL specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_,
where an overload ``f1`` is a better candidate than ``f2`` if, for every
argument, the conversion sequence in ``f1`` is not worse than the corresponding
conversion sequence in ``f2``, and for at least one argument it is better.

.. code-block:: c++

  cbuffer CB {
    int I;
    float X;
    float4 V;
  }

  void twoParams(int, int);
  void twoParams(float, float);
  void threeParams(float, float, float);
  void threeParams(float4, float4, float4);

  export void call() {
    twoParams(I, X); // DXC: resolves twoParams(int, int).
                     // Clang: Fails to resolve ambiguous conversions.

    threeParams(X, V, V); // DXC: resolves threeParams(float4, float4, float4).
                          // Clang: Fails to resolve ambiguous conversions.
  }

In the examples above, calling ``twoParams`` with mixed arguments produces the
implicit conversion sequences { ExactMatch, FloatingIntegral } for
``twoParams(int, int)`` and { FloatingIntegral, ExactMatch } for
``twoParams(float, float)``. Each candidate has an argument whose conversion is
worse than in the other candidate, so the call is ambiguous.

In the ``threeParams`` example the sequences are { ExactMatch, VectorTruncation,
VectorTruncation } and { VectorSplat, ExactMatch, ExactMatch }. Again, each
candidate has at least one argument whose conversion is worse than in the other
candidate, so the call is ambiguous.
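
The pairwise rule described above can be sketched in plain C++. This is a minimal illustration under stated assumptions, not Clang's implementation: ``Rank``, ``isBetterCandidate``, and ``resolve`` are hypothetical names, and collapsing each implicit conversion sequence to a single integer (lower is better) is a simplification of how conversion sequences are actually compared.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical encoding: one integer rank per argument's implicit
// conversion sequence. Lower is better (0 stands in for ExactMatch).
using Rank = int;
using Candidate = std::vector<Rank>;

// Returns true if `a` is not worse than `b` for every argument and strictly
// better for at least one. Assumes both candidates have the same arity.
bool isBetterCandidate(const Candidate &a, const Candidate &b) {
  bool strictlyBetter = false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] > b[i])
      return false; // worse for this argument, so `a` cannot be better
    if (a[i] < b[i])
      strictlyBetter = true;
  }
  return strictlyBetter;
}

// Returns the index of the candidate that is better than every other
// candidate, or -1 if no such candidate exists (the call is ambiguous).
int resolve(const std::vector<Candidate> &candidates) {
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    bool best = true;
    for (std::size_t j = 0; j < candidates.size(); ++j) {
      if (i != j && !isBetterCandidate(candidates[i], candidates[j])) {
        best = false;
        break;
      }
    }
    if (best)
      return static_cast<int>(i);
  }
  return -1;
}
```

Modeling the ``twoParams(I, X)`` call with the assumed encoding 0 for ExactMatch and 1 for FloatingIntegral gives the candidates ``{0, 1}`` and ``{1, 0}``. Neither is better than the other under ``isBetterCandidate``, so ``resolve`` returns ``-1``, matching the ambiguity described above.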

.. note::

  DXC's behavior described below is not officially documented; this
  description is gleaned from observation and a bit of reading the source.

DXC's approach for determining the best overload produces an integer score value
for each implicit conversion sequence for each argument expression. Scores for
casts are based on a bitmask construction that is complicated to reverse
engineer. It seems that:

* Exact match is 0
* Dimension increase is 1
* Promotion is 2
* Integral -> Float conversion is 4
* Float -> Integral conversion is 8
* Cast is 16

The masks are OR'd together to produce a score for the cast.

The scores of each conversion sequence are then summed to generate a score for
the overload candidate. The overload candidate with the lowest score is the best
candidate. If more than one overload candidate shares the lowest score, the call
is ambiguous.
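
The scoring arithmetic described above can be sketched in plain C++. This is an illustration of the described model only, not DXC source code: the constant names, ``scoreCandidate``, and ``pickLowestScore`` are hypothetical, and the mask values are the observed ones listed above, so the sketch should not be read as predicting DXC's result for any particular call.

```cpp
#include <cstddef>
#include <vector>

// Mask values as listed above (observed, not officially documented).
constexpr unsigned kExactMatch = 0;
constexpr unsigned kDimensionIncrease = 1;
constexpr unsigned kPromotion = 2;
constexpr unsigned kIntegralToFloat = 4;
constexpr unsigned kFloatToIntegral = 8;
constexpr unsigned kCast = 16;

// Each element is the score of one argument's conversion, already built by
// OR-ing the relevant masks (for example kPromotion | kDimensionIncrease).
// The candidate's score is the sum of its per-argument scores.
unsigned scoreCandidate(const std::vector<unsigned> &argumentScores) {
  unsigned total = 0;
  for (unsigned score : argumentScores)
    total += score;
  return total;
}

// Returns the index of the unique lowest score, or -1 if the set is empty
// or the lowest score is shared (the call is ambiguous).
int pickLowestScore(const std::vector<unsigned> &scores) {
  int best = -1;
  bool tied = false;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    if (best < 0 || scores[i] < scores[best]) {
      best = static_cast<int>(i);
      tied = false;
    } else if (scores[i] == scores[best]) {
      tied = true;
    }
  }
  return tied ? -1 : best;
}
```

As a hypothetical example, a candidate whose arguments need ``kPromotion | kDimensionIncrease`` and ``kExactMatch`` scores 3, while one needing ``kIntegralToFloat`` and ``kExactMatch`` scores 4, so the first is picked. Two candidates that both score 4 would be reported as ambiguous.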