This addresses the following issue I opened:
https://github.com/llvm/llvm-project/issues/118851.
This change generalizes the Type Legalization mechanism that currently
handles `v8[i/f/bf]16` upsizing to include loads _and_ stores of `v8i8`
+ `v16i8`, allowing all of the mentioned vectors to be lowered to ptx as
vectors of `b32`. This extension also allows us to remove the DagCombine
that only handled exactly `load v16i8`, thus centralizing all the
upsizing logic into one place.
Test changes include adding v8i8, v16i8, and v8i16 cases to
load-store.ll, and updating the CHECKs for other tests to match the
improved codegen.
Fixes generation of invalid loads leading to misaligned access errors.
The bug got exposed by SLP vectorizer change ec360d6 which allowed SLP
to produce `v16i8` vectors.
Also updated the tests to use automatic check generator.
This is done by lowering v16i8 loads into LoadV4 operations with i32
results instead of letting ReplaceLoadVector split it into smaller
loads during legalization. This is done at dag-combine1 time, so that
vector operations with i8 elements can be optimised away instead of
being needlessly split during legalization, which involves storing to
the stack and loading it back.
ptxas is a proprietary compiler from Nvidia that can compile PTX to
machine code (SASS). It has a lot of diagnostics to catch errors
in PTX, which can be used to verify PTX output from llc.
Set -DPXTAS_EXECUTABLE=/path/to/ptxas CMake option to enable it.
If this option is not set, then ptxas is substituted to true which
effectively disables all ptxas RUN lines.
LLVM_PTXAS_EXECUTABLE environment variable takes precedence over
the CMake option, and allows to override ptxas executable that is used for LIT
without complete re-configuration.
Differential Revision: https://reviews.llvm.org/D121727
This patch enables support for .f16x2 operations.
Added new register type Float16x2.
Added support for .f16x2 instructions.
Added handling of vectorized loads/stores of v2f16 values.
Differential Revision: https://reviews.llvm.org/D30057
Differential Revision: https://reviews.llvm.org/D30310
llvm-svn: 296032