Code cleanups for TableGen files. The changes include function names,
variable names, and unused imports.
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Summary:
This patch handles the types (MVT) in `SelectionDAG` for RISCV vector
tuples.
As described in the previous patch handling LLVM types, the MVTs also have
32 variants:
```
riscv_nxv1i8x2, riscv_nxv1i8x3, riscv_nxv1i8x4, riscv_nxv1i8x5, riscv_nxv1i8x6, riscv_nxv1i8x7, riscv_nxv1i8x8,
riscv_nxv2i8x2, riscv_nxv2i8x3, riscv_nxv2i8x4, riscv_nxv2i8x5, riscv_nxv2i8x6, riscv_nxv2i8x7, riscv_nxv2i8x8,
riscv_nxv4i8x2, riscv_nxv4i8x3, riscv_nxv4i8x4, riscv_nxv4i8x5, riscv_nxv4i8x6, riscv_nxv4i8x7, riscv_nxv4i8x8,
riscv_nxv8i8x2, riscv_nxv8i8x3, riscv_nxv8i8x4, riscv_nxv8i8x5, riscv_nxv8i8x6, riscv_nxv8i8x7, riscv_nxv8i8x8,
riscv_nxv16i8x2, riscv_nxv16i8x3, riscv_nxv16i8x4,
riscv_nxv32i8x2.
```
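The list above looks consistent with the RISC-V vector constraint that the register group size times the number of fields may not exceed 8 registers. The sketch below regenerates the names under that assumption. `tupleTypeNames` is a hypothetical helper, not an LLVM API, and it assumes `nxv8i8` corresponds to exactly one vector register (LMUL = 1).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): enumerate tuple type names,
// assuming nxv8i8 occupies one vector register (LMUL = 1) and that
// registers-per-field * NF must not exceed 8.
std::vector<std::string> tupleTypeNames() {
  std::vector<std::string> Names;
  for (unsigned MinNumElts = 1; MinNumElts <= 32; MinNumElts *= 2) {
    // A fractional LMUL still occupies a whole register per field.
    unsigned RegsPerField = std::max(1u, MinNumElts / 8);
    for (unsigned NF = 2; NF <= 8 && RegsPerField * NF <= 8; ++NF)
      Names.push_back("riscv_nxv" + std::to_string(MinNumElts) + "i8x" +
                      std::to_string(NF));
  }
  return Names;
}
```

Under these assumptions the helper yields 7 names each for `nxv1i8` through `nxv8i8`, 3 for `nxv16i8`, and 1 for `nxv32i8`, which adds up to the 32 variants listed.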
Detail:
An intuitive way to model the vector tuple type is as a nested scalable
vector, e.g. `nElts=NF, EltTy=nxv2i32`. However, that is not compatible with
how TargetLowering currently handles scalable vectors, so it would
need more effort to change the code to handle this concept.
Another approach is to encode the `MinNumElts` info in the `sz` of the `MVT`,
e.g. `nElts=NF, sz=(NF*MinNumElts*8)`. This is much easier to handle
and changes less code.
This patch adopts the latter approach.
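As a rough illustration of the size encoding, the sketch below computes `sz` from `NF` and `MinNumElts` and recovers `MinNumElts` again. `TupleVT`, `makeTupleVT`, and `minNumEltsPerField` are hypothetical names chosen for this example and do not appear in the patch.

```cpp
#include <cstdint>

// Hypothetical model of a tuple MVT record; the field names mirror the
// `nElts` and `sz` wording above, not a real LLVM struct.
struct TupleVT {
  unsigned NElts;      // NF, the number of fields in the tuple
  uint64_t SizeInBits; // NF * MinNumElts * 8, since each element is i8
};

constexpr TupleVT makeTupleVT(unsigned NF, unsigned MinNumElts) {
  return TupleVT{NF, uint64_t(NF) * MinNumElts * 8};
}

// Recover the MinNumElts of one field from the encoded size.
constexpr unsigned minNumEltsPerField(TupleVT VT) {
  return static_cast<unsigned>(VT.SizeInBits / (VT.NElts * 8));
}

// riscv_nxv2i8x3: NF = 3, MinNumElts = 2, so sz = 3 * 2 * 8 = 48.
static_assert(makeTupleVT(3, 2).SizeInBits == 48, "sz = NF * MinNumElts * 8");
static_assert(minNumEltsPerField(makeTupleVT(3, 2)) == 2, "round trip");
```

The point of the encoding is that no nested type is needed: the field count and the total size together are enough to recover the per-field element count.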
Stacked on https://github.com/llvm/llvm-project/pull/97992
MachineValueTypeSet in tablegen allocates an array with a bit per MVT.
This used to be 256 bits; with the introduction of 16-bit MVT it
ballooned to 65536 bits. I suspect this is increasing the memory usage
of many of the data structures used by CodeGenDAGPatterns.
Since we don't need the full 16-bit range yet, this patch proposes
lowering the maximum MVT to 511 and using only 512 bits for
MachineValueTypeSet's storage.
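To make the storage cost concrete, here is a minimal sketch of a one-bit-per-MVT set capped at 512 entries. `SmallVTSet` is a hypothetical stand-in written for this example; the real `MachineValueTypeSet` has a different interface.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a fixed-capacity MVT set; not the real
// MachineValueTypeSet from TableGen.
class SmallVTSet {
public:
  static constexpr unsigned Capacity = 512; // one bit per MVT, values 0..511
  static constexpr unsigned WordBits = 64;
  static constexpr unsigned NumWords = Capacity / WordBits; // 8 words

  // Returns false if the value is out of range or already present.
  bool insert(unsigned VT) {
    if (VT >= Capacity || contains(VT))
      return false;
    Words[VT / WordBits] |= uint64_t(1) << (VT % WordBits);
    return true;
  }

  bool contains(unsigned VT) const {
    return VT < Capacity && ((Words[VT / WordBits] >> (VT % WordBits)) & 1);
  }

private:
  std::array<uint64_t, NumWords> Words{}; // zero-initialized storage
};
```

With 512 bits the storage is 8 words of 64 bits, compared with 1024 such words for a full 65536-bit range.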
RFC:
https://discourse.llvm.org/t/rfc-extend-machine-value-type-from-uint8-t-to-uint16-t/80274
compile-time-tracker:
https://llvm-compile-time-tracker.com/compare.php?from=4b9fab591916eec9fd1942f37afe3b137b564089&to=177d28247efe5a4d59a8d8150b4daf01e4f57d74&stat=wall-time
Currently 208 out of 256 MVTs are used and they will run out soon, so
ultimately we need to extend the original `MVT::SimpleValueType` from
`uint8_t` to `uint16_t` to accommodate more types.
The `MatcherTable` uses `unsigned char` for encoding the matcher code,
so the extended MVTs no longer fit into the table; thus we need to
use VBR to encode them, as we do for other values that are wider than 8 bits.
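For readers unfamiliar with VBR (variable bit rate) encoding, the following is a minimal sketch of the general idea: 7 payload bits per byte, low bits first, with the top bit marking that more bytes follow. `emitVBR` and `readVBR` are names invented for this example and are not claimed to match the emitter or the interpreter in LLVM.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append Val as a VBR sequence: 7 payload bits per byte, low bits first,
// with bit 7 set on every byte except the last.
void emitVBR(uint64_t Val, std::vector<unsigned char> &Table) {
  while (Val >= 128) {
    Table.push_back(static_cast<unsigned char>((Val & 127) | 128));
    Val >>= 7;
  }
  Table.push_back(static_cast<unsigned char>(Val));
}

// Decode one VBR value starting at Idx and advance Idx past it.
uint64_t readVBR(const std::vector<unsigned char> &Table, std::size_t &Idx) {
  uint64_t Val = 0;
  unsigned Shift = 0;
  unsigned char Byte;
  do {
    Byte = Table[Idx++];
    Val |= uint64_t(Byte & 127) << Shift;
    Shift += 7;
  } while (Byte & 128);
  return Val;
}
```

In this scheme a value below 128 still occupies a single byte, while a value of 128 or more occupies at least two.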
The statistics below show the difference in "Total Array size" of the
matcher table that appears in each file:
```
Table                      Before   After    Change(%)
WebAssemblyGenDAGISel.inc  23576    23775    0.844
NVPTXGenDAGISel.inc        173498   173498   0
RISCVGenDAGISel.inc        2179121  2369929  8.756
AVRGenDAGISel.inc          2754     2754     0
PPCGenDAGISel.inc          163315   163617   0.185
MipsGenDAGISel.inc         47280    47447    0.353
SystemZGenDAGISel.inc      56243    56461    0.388
AArch64GenDAGISel.inc      467893   487830   4.261
MSP430GenDAGISel.inc       8069     8069     0
LoongArchGenDAGISel.inc    78928    79131    0.257
XCoreGenDAGISel.inc        3432     3432     0
BPFGenDAGISel.inc          3733     3733     0
VEGenDAGISel.inc           65174    66456    1.967
LanaiGenDAGISel.inc        2067     2067     0
X86GenDAGISel.inc          628787   636987   1.304
ARMGenDAGISel.inc          170968   171036   0.040
HexagonGenDAGISel.inc      155764   155764   0
SparcGenDAGISel.inc        5762     5798     0.625
AMDGPUGenDAGISel.inc       504356   504463   0.021
R600GenDAGISel.inc         29785    29785    0
```
The statistics below show the peak runtime memory usage when compiling a
simple C program:
`/bin/time -v clang -target $TARGET -O3 -c test.c`
```
int test(int a) {
  return a * 3;
}
```
```
Target       Before(kbytes)  After(kbytes)  Change(%)
wasm64       110172          110088         -0.076
nvptx64      109784          109980         0.179
riscv64      114020          113656         -0.319
avr          110352          110068         -0.257
ppc64        112612          112476         -0.120
mips64       113588          113668         0.070
systemz      110860          110760         -0.090
aarch64      113704          113432         -0.239
msp430       110284          110200         -0.076
loongarch64  111052          110756         -0.267
xcore        108340          108020         -0.295
bpf          110620          110708         0.080
ve           110960          110920         -0.036
lanai        110180          109960         -0.200
x86_64       113640          113304         -0.296
arm64        113540          113172         -0.324
hexagon      114620          114684         0.056
sparc        110412          110136         -0.250
amdgcn       118164          117144         -0.863
r600         111200          110508         -0.622
```
Most MVTs have a simple mapping to an LLVM type, so it would be nice to
auto-generate that mapping from ValueTypes.td. That could reduce the effort
of adding a new MVT, especially a new vector MVT with a different size.
Implement MVT::getVectorElementType and MVT::getVectorMinNumElements
with table lookup instead of switch.
This speeds up "check-llvm-codegen-amdgpu" by about 7% in my Release
build.
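The general shape of replacing a switch with a table lookup can be sketched as follows. The enum, the table contents, and the free functions are all hypothetical miniatures written for this example; the real `MVT` methods operate on a much larger generated list.

```cpp
// Hypothetical miniature value-type enum; the real MVT list is far larger.
enum SimpleVT : unsigned char { i8, i32, v4i8, v2i32, nxv2i32, NumVTs };

struct VTInfo {
  SimpleVT ElementType;    // element type, or the type itself for scalars
  unsigned MinNumElements; // 0 for scalars in this sketch
};

// One row per enumerator, in declaration order.
constexpr VTInfo VTTable[NumVTs] = {
    {i8, 0}, {i32, 0}, {i8, 4}, {i32, 2}, {i32, 2}};

// Each query is a single indexed load instead of a multi-case switch.
constexpr SimpleVT getVectorElementType(SimpleVT VT) {
  return VTTable[VT].ElementType;
}
constexpr unsigned getVectorMinNumElements(SimpleVT VT) {
  return VTTable[VT].MinNumElements;
}

static_assert(getVectorElementType(nxv2i32) == i32, "table lookup");
```

A table like this has to stay in sync with the enum order, which is one reason to generate it rather than write it by hand.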
Add a new bit to ValueTypes.td to indicate whether a type should be
part of the [FIRST_VALUETYPE,LAST_VALUETYPE] range or not.
This was reviewed as part of #93654.
- Implement `VTEmitter` as `llvm-tblgen -gen-vt`.
- Create a copy of `llvm/Support/MachineValueType.h` into `unittests/Support`.
It includes `GenVT.inc` generated by `VTEmitter`.
- Implement `MVTTest` in `SupportTests`. It checks equivalence between
`llvm/Support/MachineValueType.h` and the generated header.
Differential Revision: https://reviews.llvm.org/D146906