This reverts commit f4fe26ddfde0d5bb1c512e89a9cdd442a7518ee1.
Reason: 4a63f4d301c0e044073e1b1f8f110015ec1778a1 reverted "[HLSL] set alwaysinline on HLSL functions (#106588)" due to a buildbot failure; this test (which builds upon the reverted patch) also needs to be reverted.
The Alwaysinline change made the mangled form of entry points get
removed. The StructuredBuffer-subscript.hlsl test was introduced in the
meantime depending on that version of the entry point. This revises it
in the same way as RWBuffer-subscript
Follow up to #89282
HLSL inlines all its functions by default. This uses the alwaysinline
attribute to make the alwaysinliner pass inline any function not
explicitly marked noinline by the user or autogeneration. The
alwayslinline marking takes place in `SetLLVMFunctionAttributesForDefinitions`
where all other inlining interactions are determined.
The outermost entry function is marked noinline because there's no
reason to inline it. Any user calls to an entry function will instead call
the internal mangled version of the entry function.
Adds tests for function and constructor inlining and augments some
existing tests to verify correct inlining of implicitly created
functions as well.
Incidentally restore RUN line that I believe was mistakenly removed as
part of #88918Fixes#89282
Implements support for the `asuint` HLSL function casting behaviour.
Addressing the `splitdouble` scenario will be addressed in a future PR.
Fixes: #70097
---------
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Co-authored-by: Justin Bogner <mail@justinbogner.com>
Under HLSL 202x+ move assignment can occur and when targeting `this`
move assignment was generating some really odd errors. This corrects the
errors by properly generating the `this` object reference for HLSL and
always treating it as a reference.
This mirrors the implementation added eariler for copy assignment, and
extends the test case to cover both move and copy assignment under HLSL
202x+.
HLSL allows implicit conversions to truncate vectors to scalar
pr-values. These conversions are scored as vector truncations and should
warn appropriately.
This change allows forming a truncation cast to a pr-value, but not an
l-value. Truncating a vector to a scalar is performed by loading the
first element of the vector and disregarding the remaining elements.
Fixes#102964
This PR adds `StructuredBuffer` to `HLSLExternalSemaSource.cpp`, by
copying the logic from RWBuffer but just replacing the name with
StructuredBuffer. The change now allows StructuredBuffers to be defined
in HLSL, though they function the same as RWBuffers.
Further work to apply the appropriate attributes that distinguish
StructuredBuffers from other Buffer types will be deferred.
This improves our position on
https://github.com/llvm/llvm-project/issues/106189
HLSL 202x inherits from C++11, which generates additional loop hint
information for loops that must progress. Since HLSL 202x is going to be
the default for Clang we want to make sure all our tests pass with it.
Required for https://github.com/llvm/llvm-project/issues/108044
Adds target codegen info class for DirectX. For now it always translates
`__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)`
(`RWBuffer<int>`). More work is needed to determine the actual target
exp type and parameters based on the resource handle attributes.
Part 1/2 of #95952
partially fixes#70078
### Changes
- Implemented `sign` clang builtin
- Linked `sign` clang builtin with `hlsl_intrinsics.h`
- Added sema checks for `sign` to `CheckHLSLBuiltinFunctionCall` in
`SemaChecking.cpp`
- Add codegen for `sign` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- Add codegen tests to `clang/test/CodeGenHLSL/builtins/sign.hlsl`
- Add sema tests to `clang/test/SemaHLSL/BuiltIns/sign-errors.hlsl`
### Related PRs
- https://github.com/llvm/llvm-project/pull/101987
- https://github.com/llvm/llvm-project/pull/101988
### Discussion
- Should there be a `usign` intrinsic that handles the unsigned cases?
Add nuw attribute to inbounds GEPs where the expression used to form the
GEP is an addition of unsigned indices.
Relands #105496, which was reverted because it exposed a miscompilation
arising from #98608. This is now fixed by #106512.
This commits add the WaveIsFirstLane() hlsl intrinsinc. This intrinsic
uses the convergence intrinsincs for the SPIR-V backend. On the DXIL
side, I'm not sure what the strategy is for convergence, so I
implemented that like in DXC: a normal builtin function.
Signed-off-by: Nathan Gauër <brioche@google.com>
HLSL output parameters are denoted with the `inout` and `out` keywords
in the function declaration. When an argument to an output parameter is
constructed a temporary value is constructed for the argument.
For `inout` pamameters the argument is initialized via copy-initialization
from the argument lvalue expression to the parameter type. For `out`
parameters the argument is not initialized before the call.
In both cases on return of the function the temporary value is written
back to the argument lvalue expression through an implicit assignment
binary operator with casting as required.
This change introduces a new HLSLOutArgExpr ast node which represents
the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.
Fixes#87526
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
Previously, functions named "main" got the NoRecurse attribute
consistent with the behavior of C++, which HLSL largely follows.
However, standard recursion is not allowed in HLSL, so all functions
should really have this attribute. This doesn't prevent recursion, but
rather signals that these functions aren't expected to recurse.
Practically, this was done so that entry point functions named "main"
would have all have the same attributes as otherwise identical entry
points with other names.
This required small changes to the this assignment tests because they no
longer generate so many attribute sets since more of them match.
related to #105244
but done to simplify testing for #89806
Thread init guards are generated for local static variables when using
the Microsoft CXX ABI. This ABI is also used for HLSL generation, but
DXIL doesn't need the corresponding _Init_thread_header/footer calls and
doesn't really have a way to handle them in its output targets.
This modifies the language ops when the target is DXIL to exclude this
so that they won't be generated and an alternate guardvar method is used
that is compatible with the usage.
Done to facilitate testing for #89806, but isn't really related
This adds the SPIRV fdot, sdot, and udot intrinsics and allows them to
be created at codegen depending on the target architecture. This
required moving some of the DXIL-specific choices to DXIL instruction
expansion out of codegen and providing it with at a more generic fdot
intrinsic as well.
Removed some stale comments that gave the obsolete impression that type
conversions should be expected to match overloads.
The SPIRV intrinsic handling involves generating multiply and add
operations for integers and the existing OpDot operation for floating
point.
New tests for generating SPIRV float and integer dot intrinsics are
added as well as expanding HLSL tests to include SPIRV generation
Used new dot product intrinsic generation to implement normalize() in SPIRV
Incidentally changed existing dot intrinsic definitions to use
DefaultAttrsIntrinsic to match the newly added inrinsics
Fixes#88056
Implement support for HLSL intrinsic saturate.
Implement DXIL codegen for the intrinsic saturate by lowering it to DXIL
Op dx.saturate.
Implement SPIRV codegen by transforming saturate(x) to clamp(x, 0.0f,
1.0f).
Add tests for DXIL and SPIRV CodeGen.
Marks exported functions with `"hlsl.export"` attribute. This
information will be later used by DXILFinalizeLinkage pass (coming soon)
to determine which functions should have internal linkage in the final
DXIL code.
Related to #llvm/llvm-project#92071
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.
Regression tests are updated using update scripts where possible, and by
find + replace where not.
This PR adds the length intrinsic and an HLSL function that uses it.
The SPIRV implementation is left for a future PR.
This PR addresses #99134, though some SPIR-V changes still need to be
made to complete the task. Below is how this PR addresses #99134.
- "Implement `length` clang builtin" was done by defining `HLSLL ength`
in Builtins.td
- "Link `length` clang builtin with hlsl_intrinsics.h" was done by using
the alias attribute to make `length` an alias of
`__builtin_hlsl_elementwise_length` in hlsl_intrinsics.h
- "Add sema checks for `length` to `CheckHLSLBuiltinFunctionCall` in
`SemaChecking.cpp` " was done, but in this case not in SemaChecking.cpp,
rather SemaHLSL.cpp. A case was added to the builtin to check for
semantic failures, and set `TheCall` up to have the right return type.
- "Add codegen for `length` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`"
was done. For scalars, fabs is emitted, otherwise, length is emitted.
- "Add codegen tests to `clang/test/CodeGenHLSL/builtins/length.hlsl`
was done to test that `length` in HLSL emits the right intrinsic.
- "Add sema tests to `clang/test/SemaHLSL/BuiltIns/length-errors.hlsl`"
was done to test for diagnostics emitted in SemaHLSL.cpp
- "Create the `int_dx_length` intrinsic in `IntrinsicsDirectX.td`" was
done. Specifying return types and parameter types was difficult, but
`idot` was used for reference, and `llvm\include\llvm\IR\Intrinsics.td`
contains all the ways to express return / parameter types.
- "Create an intrinsic expansion of `int_dx_length` in
`llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp`" was done, and was
mostly derived by looking at `TranslateLength` in `HLOperationLower.cpp`
in the DXC codebase.
- "Create the `length.ll` and `length_errors.ll` tests in
`llvm/test/CodeGen/DirectX/`" was done by taking the DXIL output of
`clang/test/CodeGenHLSL/builtins/length.hlsl` and running `opt -S
-dxil-intrinsic-expansion` and ` opt -S -dxil-op-lower` on it, checking
for how the length intrinsic was either expanded or lowered.
- "Create the `int_spv_length` intrinsic in `IntrinsicsSPIRV.td`" was
done by copying `IntrinsicsDirectX.td`.
---------
Co-authored-by: Justin Bogner <mail@justinbogner.com>
There are two problems with _BitInt prior to this patch:
1. For at least some values of N, we cannot use LLVM's iN for the type
of struct elements, array elements, allocas, global variables, and so
on, because the LLVM layout for that type does not match the high-level
layout of _BitInt(N).
Example: Currently for i128:128 targets correct implementation is
possible either for __int128 or for _BitInt(129+) with lowering to iN,
but not both, since we have now correct implementation of __int128 in
place after a21abc7.
When this happens, opaque [M x i8] types used, where M =
sizeof(_BitInt(N)).
2. LLVM doesn't guarantee any particular extension behavior for integer
types that aren't a multiple of 8. For this reason, all _BitInt types
are now have in-memory representation that is a whole number of bytes.
I.e. for example _BitInt(17) now will have memory layout type i32.
This patch also introduces concept of load/store type and adds an API to
CodeGenTypes that returns the IR type that should be used for load and
store operations. This is particularly useful for the case when a
_BitInt ends up having array of bytes as memory layout type. For
_BitInt(N), let M = sizeof(_BitInt(N)), and let BITS = M * 8. Loads and
stores of iM would both (1) produce far better code from the backends
and (2) be far more optimizable by IR passes than loads and stores of [M
x i8].
Fixes https://github.com/llvm/llvm-project/issues/85139
Fixes https://github.com/llvm/llvm-project/issues/83419
---------
Co-authored-by: John McCall <rjmccall@gmail.com>
This PR reworks HLSL's implicit conversion sequences. Initially I was
seeking to match DXC's behavior more closely, but that was leading to a
pile of special case rules to tie-break ambiguous cases that should
really be left as ambiguous. We've decided that we're going to break
compatibility with DXC here, and we may port this new behavior over to
DXC instead.
This change is a bit closer to C++'s overload resolution rules, but it
does have a bit of nuance around how dimension adjustment conversions
are ranked. Conversion sequence ranks for HLSL are:
* Exact match
* Scalar Widening (i.e. splat)
* Promotion
* Scalar Widening with Promotion
* Conversion
* Scalar Widening with Conversion
* Dimension Reduction (i.e. truncation)
* Dimension Reduction with Promotion
* Dimension Reduction with Conversion
In this implementation I've folded the disambiguation into the
conversion sequence ranks which does add some complexity as compared to
C++, however this avoids needing to add special casing in
`CompareStandardConversionSequences`. I believe the added conversion
rank values provide a simpler approach, but feedback is appreciated.
The HLSL language spec updates are in the PR here:
https://github.com/microsoft/hlsl-specs/pull/261
Implements `export` keyword in HLSL.
There are two ways the `export` keyword can be used:
1. On individual function declarations
```
export void f() {}
```
2. On a group of function declaration:
```
export {
void f1();
void f2() {}
}
```
Functions declared with the `export` keyword have external linkage. The
implementation does not include validation of when a function can or
cannot be exported, such as when it has resource argument or semantic
annotations. That will be covered by llvm/llvm-project#93330.
Currently all function declarations in global or named namespaces have
external linkage by default so there are no specific code changes
required right now to make sure exported function have external linkage
as well. That will change as part of llvm/llvm-project#92071. Any
additional changes to make sure exported functions still have external
linkage will be done as part of this work item.
Fixes#92812
The data-layout independent constant folding currently has some rather
gnarly code for canonicalizing GEP indices to reduce "notional
overindexing", and then infers inbounds based on that canonicalization.
Now that we canonicalize to i8 GEPs, this canonicalization is
essentially useless, as we'll discard it as soon as the GEP hits the
data-layout aware constant folder anyway. As such, I'd like to remove
this code entirely.
This shouldn't have any impact on optimization capabilities.
PR #80680 added bits in the codegen to lazily add convergence intrinsics
when required. This logic relied on the LoopStack. The issue is when
parsing the condition, the loopstack doesn't yet reflect the correct
values, as expected since we are not yet in the loop.
However, convergence tokens should sometimes already be available. The
solution which seemed the simplest is to greedily generate the tokens
when we generate SPIR-V.
Fixes#88144
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
This change set restores commit 080978dd2067d0c9ea7e229aa7696c2480d89ef1 that was reverted to address ASAN
failures and includes a fix for the ASAN failures.
Following is the description of the change:
An earlier commit provided a way to decouple DXIL version from Shader
Model version by representing the DXIL version as `SubArch` in the DXIL
Target Triple and adding corresponding valid DXIL Arch types.
This change constructs DXIL target triple with DXIL version that is
deduced from Shader Model version specified in the following scenarios:
1. When compilation target profile is specified:
For e.g., DXIL target triple `dxilv1.8-unknown-shader6.8-library` is
constructed when `-T lib_6_8` is specified.
2. When DXIL target triple without DXIL version is specified:
For e.g., DXIL target triple `dxilv1.8-pc-shadermodel6.8-library` is
constructed when `-mtriple=dxil-pc-shadermodel6.8-library` is specified.
Updated relevant HLSL tests that check for target triple.
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
If you want an overarching view of how this will all connect see:
https://github.com/llvm/llvm-project/pull/90088
Changes:
- `clang/docs/LanguageExtensions.rst` - Document the new elementwise tan
builtin.
- `clang/include/clang/Basic/Builtins.td` - Implement the tan builtin.
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the tan intrinsic on uses
of the builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the tan builtin
with the equivalent hlsl apis
- `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks as well as
HLSL specifc sema checks to the tan builtin
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
An earlier commit provided a way to decouple DXIL version from Shader
Model version by representing the DXIL version as `SubArch` in the DXIL
Target Triple and adding corresponding valid DXIL Arch types.
This change constructs DXIL target triple with DXIL version that is
deduced from Shader Model version specified in the following scenarios:
1. When compilation target profile is specified:
For e.g., DXIL target triple `dxilv1.8-unknown-shader6.8-library` is
constructed when `-T lib_6_8` is specified.
2. When DXIL target triple without DXIL version is specified:
For e.g., DXIL target triple `dxilv1.8-pc-shadermodel6.8-library` is
constructed when `-mtriple=dxil-pc-shadermodel6.8-library` is specified.
Updated relevant HLSL tests that check for target triple.
Validated that Clang (`check-clang`) and LLVM (`check-llvm`) regression
tests pass.
- `CGBuiltin.cpp` - Switch to using
`CGM.getHLSLRuntime().get##NAME##Intrinsic()`
- `CGHLSLRuntime.h` - Add any to backend intrinsic abstraction
- `IntrinsicsSPIRV.td` - Add any intrinsic to SPIR-V.
- `SPIRVInstructionSelector.cpp` - Add means of selecting any intrinsic.
Any and All share the same behavior up to the opCode. They are only
different in vector cases.
Completes #88045