fixes https://github.com/llvm/llvm-project/issues/179859
For matrix types we need to check the language mode so we can change the
matrix memory layout to arrays of vectors. To make this play nice with
how the rest of clang treats matrices we need to modify the
MaybeConvertMatrixAddress and the CreateMemTemp function to know how to
reconstruct a flattened vector.
Rest of changes is just test updates.
Fixes#174629
This PR is similar to that of #169144 but for matrices.
When storing to a matrix element or matrix row, `insertelement`
instructions have been replaced by GEPs followed by stores to individual
matrix elements. There is no longer storing of the entire matrix to
memory all at once, thus avoiding data races when writing to independent
matrix elements from multiple threads.
Fixes#175808
This PR adds support for boolean matrix splats by adding tests and
fixing a bug in `CodeGenFunction::EmitToMemory` when the type of a
boolean matrix already matches the type expected of a load/store.
This PR also addresses the todo comment in `clang/lib/Sema/SemaExpr.cpp`
regarding support for boolean matrix splats by removing the comment
altogether since it is not necessary.
---------
Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>
Fixes#172711
Fixes the type mismatch issues preventing single matrix subscript
getters and setters from working with boolean matrices.
The changes from this PR also happens to make matrix splats work for
boolean matrices, but adding tests for that and (re)introducing
boolean-matrix-specific sema will be relegated to its own PR.
fixes#170777
If we don't use vector type and instead continue to pass on the matrix
type when we enter `EmitExtVectorElementExpr` Then we don't need to
store the row and column length on the LValue.
Using the Matrix type means we can reuse the isMatrixRow() cases in
EmitLoadOfLValue and EmitStoreThroughLValue and not have to support a
new lValue that is a hybrid between the ExtVectorElt and MatrixRow
cases.
All we need to do to support this is pass the list of column indices as
a `ConstantDataVector` and check the size of this Vector to know how
many column iterations we need to do. Further just index into the vector
to fetch the right encoded element index value.
fixes#172805
For the constant case if we know the row index then we can compute the
offsets via `E->getEncodedElementAccess(Elts)`. We had to also add
column and row sizes to LValue so that we could compute the right index.
the emitter for `MatrixSingleSubscriptExpr` collects the sizes off the
type and passes it to `MakeMatrixRow`.
---------
Co-authored-by: Justin Bogner <mail@justinbogner.com>
fixes#167617
In DXC HLSL supports different indexing modes via codegen for its
equivalent of the MatrixSubscriptExpr when the /Zpr and /Zpc flags are
used see: https://godbolt.org/z/bz5Y5WG36.
This change modifies EmitMatrixSubscriptExpr to consider the
MatrixRowMajor/MatrixColMajor Layout flags before generating an index.
Similarly it introduces `createRowMajorIndex` and
`createColumnMajorIndex` in `MatrixBuilder.h` for use in
`VisitMatrixSubscriptExpr`.
This change creates an XFAIL that will be addressed by #172805
This part of the change was not intended to make it in. It was a way to
optimize for constant index setters\getters but had the unintended
consequence of doing swizzles without considering the encoded element
offsets.
We need to be calling `MakeExtVectorElt` in `EmitExtVectorElementExpr`
so that we can call `E->getEncodedElementAccess(Elts);` off of the
`ExtVectorElementExpr` Node.
Calling it in `EmitMatrixSingleSubscriptExpr` was wrong because we doing
have the vector encoded elementts yet. But also to get the right index
we need a way to pass on the matrix column and row lengths so
`MakeMatrixRow` needed to be expanded to store row and column sizes on
the LValue.
fixes#166206
- Add swizzle support if row index is constant
- Add test cases
- Add new AST type
- Add new LValue for Matrix Row Type
- TODO: Make the new LValue a dynamic index version of ExtVectorElt
fixes#168960
Adds `ICK_HLSL_Matrix_Splat` and hooks it up to
`PerformImplicitConversion` and `IsMatrixConversion`. Map these to
`CK_HLSLAggregateSplatCast`.
fixes#168737fixes#168755
This change fixes adds support for Matrix truncations via the
ICK_HLSL_Matrix_Truncation enum. That ends up being most of the files
changed.
It also allows Matrix as an HLSL Elementwise cast as long as the cast
does not perform a shape transformation ie 3x2 to 2x3.
Tests for the new elementwise and truncation behavior were added. As
well as sema tests to make sure we error n the shape transformation
cast.
I am punting right now on the ConstExpr Matrix support. That will need
to be addressed later. Will file a seperate issue for that if reviewers
agree it can wait.
When individual elements of a vector are updated via vector swizzle, it needs to be handled as separate store operations to the individual vector elements.
Clang treats vectors as one unit, so if a part of a vector needs to be updated, the whole vector is loaded, some elements modified, and then the whole vector is stored.
In HLSL vector elements are handled separately. We need to avoid this load/modify/store sequence to prevent overwriting other vector elements that might be getting updated in parallel.
Fixes#152815
fixes#165663
The bug was that we were using the initalizer lists index to populate
the matrix. This meant that [0..n] would coorelate to [0..n] indicies of
the flattened matrix. Hence why we were seeing the Row-major order: [ 0
1 2 3 4 5 ]. To fix this we can simply converted these indicies to the
Column-major order: [ 0 3 1 4 2 5 ].
The net effect of this is the layout of the matrix is now correct and we
don't need to change the MatrixSubscriptExpr indexing scheme.
---------
Co-authored-by: Deric C. <cheung.deric@gmail.com>
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Add a new langopt NativeInt16Type to control support for 16 bit
integers.
Enable by default for all languages but HLSL.
HLSL defines uint16_t and int16_t as a typedef of short. If
-enable-16bit-types is not used, the typedefs don't exist so int16_t and
uint16_t can't be used. However, short was still allowed. This change
will produce an error 'unknown type name short' if -enable-16bit-types
isn't used.
Update failing tests.
Add new test.
Closes#81779
This simplifies and cleans up the vector list initialization behavior.
This simplifies the work we do in SemaInit by just relying on SemaHLSL's
initialization list flattening. This change fixes some outstanding
limitations by supporting structure to vector initialization, but
re-introduces HLSL's limitations around overload resolution in
initializers.
---------
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Adds support for elementwise and aggregate splat casting struct types
with bitfields. Replacing existing Flattening function which used to
produce a list of GEPs representing a flattened object with one that
produces a list of LValues representing a flattened object. The LValues
can be used by EmitStoreThroughLValue and EmitLoadOfLValue, ensuring
bitfields are properly loaded and stored. This also simplifies the code
in the elementwise and aggregate splat casting functions.
Closes#125986
Fixes#148063 by preventing the ByVal attribute from being placed on out
and inout function parameters which causes them to be eliminated by the
Dead Store Elimination (DSE) pass.
Implements
https://github.com/llvm/wg-hlsl/blob/main/proposals/0026-symbol-visibility.md.
The change is to stop using the `hlsl.export` attribute. Instead,
symbols with "program linkage" in HLSL will have export linkage with
default visibility, and symbols with "external linkage" in HLSL will
have export linkage with hidden visibility.
When an HLSL Init list is producing a Scalar, handle OpaqueValueExprs in
the Init List with 'emitInitListOpaqueValues'
Copied from 'AggExprEmitter::VisitCXXParenListOrInitListExpr'
Closes#136408
---------
Co-authored-by: Chris B <beanz@abolishcrlf.org>
fixes#135122
SemaExpr.cpp - Make all doubles fail. Add sema support for float scalars
and vectors when language mode is HLSL.
CGExprScalar.cpp - Allow emit frem when language mode is HLSL.
Make the memory representation of boolean vectors in HLSL, vectors of
i32.
Allow boolean swizzling for boolean vectors in HLSL.
Add tests for boolean vectors and boolean vector swizzling.
Closes#91639
This PR implements HLSL's initialization list behvaior as specified in
the draft language specifcation under
[*Decl.Init.Agg*](https://microsoft.github.io/hlsl-specs/specs/hlsl.html#Decl.Init.Agg).
This behavior is a bit unusual for C/C++ because intermediate braces in
initializer lists are ignored and a whole array of additional
conversions occur unintuitively to how initializaiton works in C.
The implementaiton in this PR generates a valid C/C++ initialization
list AST for the HLSL initializer so that there are no changes required
to Clang's CodeGen to support this. This design will also allow us to
use Clang's rewrite to convert HLSL initializers to valid C/C++
initializers that are equivalent. It does have the downside that it will
generate often redundant accesses during codegen. The IR optimizer is
extremely good at eliminating those so this will have no impact on the
final executable performance.
There is some opportunity for optimizing the initializer list generation
that we could consider in subsequent commits. One notable opportunity
would be to identify aggregate objects that occur in the same place in
both initializers and do not require converison, those aggregates could
be initialized as aggregates rather than fully scalarized.
Closes#56067
---------
Co-authored-by: Finn Plummer <50529406+inbelic@users.noreply.github.com>
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Co-authored-by: Justin Bogner <mail@justinbogner.com>
Implement HLSL Aggregate Splat casting that handles splatting for arrays
and structs, and vectors if splatting from a vec1.
Closes#100609 and Closes#100619
Depends on #118842
Implement HLSLElementwiseCast excluding support for splat cases
Do not support casting types that contain bitfields.
Partly closes#100609 and partly closes#100619
Get inout/out parameters working for HLSL Arrays.
Utilizes the fix from #109323, and corrects the assignment behavior
slightly to allow for Non-LValues on the RHS.
Closes#106917
---------
Co-authored-by: Chris B <beanz@abolishcrlf.org>
To consolidate behavior of function mangling and limit the number of
places that ABI changes will need to be made, this switches the DirectX
target used for HLSL to use the Itanium ABI from the Microsoft ABI. The
Itanium ABI has greater flexibility in decisions regarding mangling of
new types of which we have more than a few yet to add.
One effect of this will be that linking library shaders compiled with
DXC will not be possible with shaders compiled with clang. That isn't
considered a terribly interesting use case and one that would likely
have been onerous to maintain anyway.
This involved adding a function to call all global destructors as the
Microsoft ABI had done.
This requires a few changes to tests. Most notably the mangling style
has changed which accounts for most of the changes. In making those
changes, I took the opportunity to harmonize some very similar tests for
greater consistency. I also shaved off some unneeded run flags that had
probably been copied over from one test to another.
Other changes effected by using the new ABI include using different
types when manipulating smaller bitfields, eliminating an unnecessary
alloca in one instance in this-assignment.hlsl, changing the way static
local initialization is guarded, and changing the order of inout
parameters getting copied in and out. That last is a subtle change in
functionality, but one where there was sufficient inconsistency in the
past that standardizing is important, but the particular direction of
the standardization is less important for the sake of existing shaders.
fixes#110736
We already infer this in IPSCCP (which runs very early, so cannot
benefit from inlining and simplifications) and SCCP (which runs without
PredicateInfo, so does not use assumes). Do it in CVP as well, so it can
handle cases that IPSCCP/SCCP can't.
Fixes https://github.com/llvm/llvm-project/issues/98946 (everything
apart from f2, where the assume is dropped by the frontend).
HLSL allows implicit conversions to truncate vectors to scalar
pr-values. These conversions are scored as vector truncations and should
warn appropriately.
This change allows forming a truncation cast to a pr-value, but not an
l-value. Truncating a vector to a scalar is performed by loading the
first element of the vector and disregarding the remaining elements.
Fixes#102964
HLSL output parameters are denoted with the `inout` and `out` keywords
in the function declaration. When an argument to an output parameter is
constructed a temporary value is constructed for the argument.
For `inout` pamameters the argument is initialized via copy-initialization
from the argument lvalue expression to the parameter type. For `out`
parameters the argument is not initialized before the call.
In both cases on return of the function the temporary value is written
back to the argument lvalue expression through an implicit assignment
binary operator with casting as required.
This change introduces a new HLSLOutArgExpr ast node which represents
the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.
Fixes#87526
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
This PR reworks HLSL's implicit conversion sequences. Initially I was
seeking to match DXC's behavior more closely, but that was leading to a
pile of special case rules to tie-break ambiguous cases that should
really be left as ambiguous. We've decided that we're going to break
compatibility with DXC here, and we may port this new behavior over to
DXC instead.
This change is a bit closer to C++'s overload resolution rules, but it
does have a bit of nuance around how dimension adjustment conversions
are ranked. Conversion sequence ranks for HLSL are:
* Exact match
* Scalar Widening (i.e. splat)
* Promotion
* Scalar Widening with Promotion
* Conversion
* Scalar Widening with Conversion
* Dimension Reduction (i.e. truncation)
* Dimension Reduction with Promotion
* Dimension Reduction with Conversion
In this implementation I've folded the disambiguation into the
conversion sequence ranks which does add some complexity as compared to
C++, however this avoids needing to add special casing in
`CompareStandardConversionSequences`. I believe the added conversion
rank values provide a simpler approach, but feedback is appreciated.
The HLSL language spec updates are in the PR here:
https://github.com/microsoft/hlsl-specs/pull/261
HLSL supports vector truncation and element conversions as part of
standard conversion sequences. The vector truncation conversion is a C++
second conversion in the conversion sequence. If a vector truncation is
in a conversion sequence an element conversion may occur after it before
the standard C++ third conversion.
Vector element conversions can be boolean conversions, floating point or
integral conversions or promotions.
[HLSL Draft
Specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>