The pseudos for strided loads and stores use the SEW coming from the
name. For example, vlse8 has SEW=8 and vlse16 has SEW=16.
When llvm-mca tries to lookup (VLSE8_V, SEW=S, LMUL=L) in the inverse
pseudo table, a result will only be found when S=8, where S was set from
the previous vsetvli instruction. Instead, for a match to be found, we
must lookup (VLSE8_V, SEW=8, LMUL=L') where L' is the EMUL which was
calculated by scaling the LMUL and SEW from the previous vsetvli and the
SEW=8.
For instructions like vmv.s.x and friends where we don't care about LMUL
or the
SEW/LMUL ratio, we can change the LMUL in its state so that it has the
same
SEW/LMUL ratio as the previous state. This allows us to avoid more VL
toggles
later down the line (i.e. use vsetvli zero, zero, which requires that
the
SEW/LMUL ratio must be the same)
This is an alternative approach to the idea in #69259, but note that
they
don't catch exactly the same test cases.
…L based on instruction EEW
Vector Unit Stride Loads and stores EEW and EMUL depend on the EEW given
in the instruction name and the SEW from vtype. llvm-mca needs some help to correctly report
this information.
This means llvm-mc should now build without depending on the target
CodeGen libraries.
Fix up a few includes in RISCV, AMDGPU, and X86 MCA to avoid transitive
deps on CodeGen.
Fixes#64166
Now that RISCV pseudo instructions now account for SEW in some cases,
it useful that RISCV SEW instruments exist so that llvm-mca can use
the SEW specific scheduler classes.
Differential Revision: https://reviews.llvm.org/D154142
Now that scheduler resources are split by SEW for some instructions,
add the ability to map (BaseInstr, LMUL, SEW) -> Pseudo. For
BaseInstrs that are not split by SEW, 0 is the default key.
This does not change the size of the table since there was an 8
bit hole.
Differential Revision: https://reviews.llvm.org/D154136
Since the LMUL data that is needed to create an instrument is
avaliable statically from vsetivli and vsetvli instructions,
LMUL instruments can be automatically generated so that clients
of the tool do no need to manually insert instrument comments.
Instrument comments may be placed after a vset{i}vli instruction,
which will override instrument that was automatically inserted.
As a result, clients of llvm-mca instruments do not need to update
their existing instrument comments. However, if the instrument
has the same LMUL as the vset{i}vli, then it is reccomended to
remove the instrument comment as it becomes redundant.
Differential Revision: https://reviews.llvm.org/D154526
There was a memory leak that presented itself once the llvm-mca
tests were committed. This leak was not checked for by the pre-commit
tests. This change changes the shared_ptr to a unique_ptr to avoid
this problem.
We will know that this fix works once committed since I don't know
whether it is possible to force a lit test to use LSan. I spent the
day trying to build llvm with LSan enabled without much luck. If
anyone knows how to build llvm with LSan for the lit-tests, I am
happy to give it another try locally.
Differential Revision: https://reviews.llvm.org/D150816
Fixes createInstrument to return instrument when LMUL data is valid, and
return nullptr when LMUL data is not valid for RISCV target.
Differential Revision: https://reviews.llvm.org/D149068
On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction
itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer
elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind
of information impacts how the instruction takes to execute and what dependencies this may cause.
On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or
vl, in addition with the instruction itself. But MCA does not track or use the data in these
registers. This patch fixes this problem by introducing Instruments into MCA.
* Replace `CodeRegions` with `AnalysisRegions`
* Add `Instrument` and `InstrumentManager`
* Add `InstrumentRegions`
* Add RISCV Instrument and `InstrumentManager`
* Parse `Instruments` in driver
* Use instruments to override schedule class
* RISCV use lmul instrument to override schedule class
* Fix unit tests to pass empty instruments
* Add -ignore-im clopt to disable this change
A prior version of this patch was commited in 5e82ee537321. 2323a4ee610f reverted
that change because the unit test files caused build errors. The change with fixes
were committed in b88b8307bf9e but reverted once again e8e92c8313a0 due to more
build errors.
This commit adds the prior changes and fixes the build error.
Differential Revision: https://reviews.llvm.org/D137440
On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction
itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer
elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind
of information impacts how the instruction takes to execute and what dependencies this may cause.
On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or
vl, in addition with the instruction itself. But MCA does not track or use the data in these
registers. This patch fixes this problem by introducing Instruments into MCA.
* Replace `CodeRegions` with `AnalysisRegions`
* Add `Instrument` and `InstrumentManager`
* Add `InstrumentRegions`
* Add RISCV Instrument and `InstrumentManager`
* Parse `Instruments` in driver
* Use instruments to override schedule class
* RISCV use lmul instrument to override schedule class
* Fix unit tests to pass empty instruments
* Add -ignore-im clopt to disable this change
A prior version of this patch was commited in. It was reverted in
5e82ee5373211db8522181054800ccd49461d9d8. 2323a4ee610f5e1db74d362af4c6fb8c704be8f6 reverted
that change because the unit test files caused build errors. This commit adds the original changes
and the fixed test files.
Differential Revision: https://reviews.llvm.org/D137440
On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction
itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer
elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind
of information impacts how the instruction takes to execute and what dependencies this may cause.
On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or
vl, in addition with the instruction itself. But MCA does not track or use the data in these
registers. This patch fixes this problem by introducing Instruments into MCA.
* Replace `CodeRegions` with `AnalysisRegions`
* Add `Instrument` and `InstrumentManager`
* Add `InstrumentRegions`
* Add RISCV Instrument and `InstrumentManager`
* Parse `Instruments` in driver
* Use instruments to override schedule class
* RISCV use lmul instrument to override schedule class
* Fix unit tests to pass empty instruments
* Add -ignore-im clopt to disable this change
Differential Revision: https://reviews.llvm.org/D137440