8 Commits

Author SHA1 Message Date
Vang Thao
28322c2514 [AMDGPU] Add scheduler pass to rematerialize trivial defs
Add a new pass in the pre-ra AMDGPU scheduler to check if sinking trivially rematerializable defs that only has one use outside of the defining block will increase occupancy. If we can determine that occupancy can be increased, then rematerialize only the minimum amount of defs required to increase occupancy. Also re-schedule all regions that had occupancy matching the previous min occupancy using the new occupancy.

This is based off of the discussion in https://reviews.llvm.org/D117562.

The logic to determine the defs we should collect and determining if sinking would be beneficial is mostly the same. Main differences is that we are no longer limiting it to immediate defs and the def and use does not have to be part of a loop.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D119475
2022-03-09 09:34:33 -08:00
Vang Thao
3a17a81773 [AMDGPU] Regenerate tests to include -NEXT. NFC. 2022-02-21 14:14:53 -08:00
Stanislav Mekhanoshin
3bffb1cd0e [AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

Additional advantage that parser will accept these flags in any order unlike
now.

Differential Revision: https://reviews.llvm.org/D96469
2021-03-15 13:00:59 -07:00
Stanislav Mekhanoshin
a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Joe Nash
314e29ed2b [AMDGPU] Add _e64 suffix to VOP3 Insts
Previously, instructions which could be
expressed as VOP3 in addition to another
encoding had a _e64 suffix on the tablegen
record name, while those
only available as VOP3 did not. With this
patch, all VOP3s will have the _e64 suffix.
The assembly does not change, only  the mir.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D94341

Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423
2021-01-12 18:33:18 -05:00
Matt Arsenault
4b4496312e AMDGPU: Start adding MODE register uses to instructions
This is the groundwork required to implement strictfp. For now, this
should be NFC for regular instructoins (many instructions just gain an
extra use of a reserved register). Regalloc won't rematerialize
instructions with reads of physical registers, but we were suffering
from that anyway with the exec reads.

Should add it for all the related FP uses (possibly with some
extras). I did not add it to either the gpr index mode instructions
(or every single VALU instruction) since it's a ridiculous feature
already modeled as an arbitrary side effect.

Also work towards marking instructions with FP exceptions. This
doesn't actually set the bit yet since this would start to change
codegen. It seems nofpexcept is currently not implied from the regular
IR FP operations. Add it to some MIR tests where I think it might
matter.
2020-05-27 14:47:00 -04:00
Stanislav Mekhanoshin
a6322941ff [AMDGPU] gfx1010 VMEM and SMEM implementation
Differential Revision: https://reviews.llvm.org/D61330

llvm-svn: 359621
2019-04-30 22:08:23 +00:00
Matt Arsenault
2e9ddcc30e RegPressure: Fix crash on blocks with only dbg_value
If there were only dbg_values in the block, recede would hit the
beginning of the block and try to use thet dbg_value as a real
instruction.

llvm-svn: 357105
2019-03-27 18:14:02 +00:00