This PR handles`v_pk_fmac_f16` inline constant encoding/decoding
differences between pre-GFX11 and GFX11+ hardware.
- Pre-GFX11: fp16 inline constants produce `(f16, 0)` - value in low 16
bits, zero in high.
- GFX11+: fp16 inline constants are duplicated to both halves `(f16,
f16)`.
Fixes#94116.
On targets that require even aligned 64-bit VGPRs, GWS operands
require even alignment of a 32-bit operand. Previously we had a hacky
post-processing which added an implicit operand to try to manage
the constraint. This would require special casing in other passes
to avoid breaking the operand constraint. This moves the handling
into the instruction definition, so other passes no longer need
to consider this edge case. MC still does need to special case this,
to print/parse as a 32-bit register. This also still ends up net
less work than introducing even aligned 32-bit register classes.
This also should be applied to the image special case.
The ds_gws_* instructions require gds as an operand. However, when nogds
is given, it is treated the same as gds. This patch fixes this to
disallow nogds.
`MCInstrDesc` counts the SWZ operand for `NumOperands` -- thus, since we
do not parse this into the `MCInst` operands, there will be a mismatch
between `MCInst.getNumOperands` and `MCInstrDesc.getNumOperands` .
`llvm-mca` assumes that each operand counted by
`MCInstrDesc.getNumOperands` will be present in `MCInst.operands`
263377a175/llvm/lib/MCA/InstrBuilder.cpp (L324)
This parses a dummy operand for the buffer_loads as a placeholder for
the implicit SWZ operand. This is similar to the parsing of `tbuffer_`
variants which automatically parse the dummy operand
263377a175/llvm/utils/TableGen/AsmMatcherEmitter.cpp (L1853)
There should normally be no need to generate implicit lit64()
modifiers on the assembler side. It's the encoder's responsibility
to recognise literals that are implicitly 64 bits wide.
The exceptions are where we rewrite floating-point operand values
as integer ones, which would not be assembled back to the original
values unless wrapped into lit64().
Respect explicit lit() modifiers for non-inline values as
necessary to avoid regressions in MC tests. This change still
doesn't prevent use of inline constants where lit()/lit64 is
specified; subject to a separate patch.
On disassembling, only create lit64() operands where necessary for
correct round-tripping.
Add round-tripping tests where useful and feasible.
This removes special case processing in TargetInstrInfo::getRegClass to
fixup register operands which depending on the subtarget support AGPRs,
or require even aligned registers.
This regresses assembler diagnostics, which currently work by hackily
accepting invalid cases and then post-rejecting a validly parsed
instruction.
On the plus side this now emits a comment when disassembling unaligned
registers for targets with the alignment requirement.
Make sure we cannot be in a mode with both wavesizes. This
prevents assertions in a future change. This should probably
just be an error, but we do not have a good way to report
errors from the MCSubtargetInfo constructor.
And rework the lit64() support to use it.
The rules for when to add lit64() can be simplified and
improved. In this change, however, we just follow the existing
conventions on the assembler and disassembler sides.
In codegen we do not (and normally should not need to) add explicit
lit() and lit64() modifiers, so the codegen tests lose them. The change
is an NFCI otherwise.
Simplifies printing operands.
We have proper encoding facilities to encode operands and instructions;
there's no need to pollute the MC representation with encoding details.
Supposed to be an NFCI, but happens to fix some re-encoded instruction
codes in disassembler tests.
The 64-bit operands are to be addressed in following patches introducing
MC-level representation for lit() and lit64() modifiers, to then be
respected by both the assembler and disassembler.
Remember indexes of MCOperands in MCParsedAsmOperands as we add them to
instructions. Then use the indexes to find locations by known MCOperands
indexes.
Happens to fix some reported locations in tests. NFCI otherwise.
getImmLoc() is to be eliminated as well; there's enough work for another
patch.
```
warning: implicit capture of 'this' with a capture default of '=' is deprecated [-Wdeprecated-this-capture]
```
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Sec. 4.6.7.1 of the gfx1250 SPG states that if an SGPR is used
as an operand, only one SGPR will be read for both the low and high
operations. As a result, the corresponding bits in `op_sel` and
`op_sel_hi` must be the same when the operand is an SGPR.
Co-authored-by: Tian, Shilei <Shilei.Tian@amd.com>
Co-authored-by: Tian, Shilei <Shilei.Tian@amd.com>