7 Commits

Author SHA1 Message Date
Carl Ritson
86627149f6
[AMDGPU] Mitigate GFX12 VALU read SGPR hazard (#100067)
Any SGPR read by a VALU can potentially obscure SALU writes to the same
register.
Insert s_wait_alu instructions to mitigate the hazard on affected paths.

Compute a global cache of SGPRs with any VALU reads and use this to
avoid inserting mitigation for SGPRs never accessed by VALUs.

To avoid excessive search when compile time is priority implement
secondary mode where all SALU writes are mitigated.

Co-authored-by: Shilei Tian <shilei.tian@amd.com>
2024-09-04 12:15:20 +09:00
Matt Arsenault
212b78aad4
DAG: Improve fminimum/fmaximum vector expansion logic (#93579)
First, expandFMINIMUM_FMAXIMUM should be a never-fail API. The client
wanted it expanded, and it can always be expanded. This logic was tied
up with what the VectorLegalizer wanted.
    
Prefer using the min/max opcodes, and unrolling if we don't have a
vselect.
This seems to produce better code in all the changed tests.
2024-06-06 19:01:39 +02:00
Matt Arsenault
fe5d791517 AMDGPU: Add some multi-use negative tests for minimum3/maximum3 2024-05-28 14:47:56 +02:00
Matt Arsenault
2401b6126d
AMDGPU: Fix creating minimum3/maximum3 nodes pre-gfx12 (#93027)
These would fail to select.
2024-05-23 15:59:43 +02:00
Matt Arsenault
f06c1ce860
AMDGPU: Clean up maximum3/minimum3 tests (#93025)
These were using patterns copied from older tests, before non-kernel
functions were supported and manually written checks. Also stop using
-flat-for-global, which only exists to try to share tests between SI/CI
and VI+.

This was also missing test coverage, we're incorrectly forming
maximum3/minimum3 pre-gfx12. This is a pre-commit before fixing that.
2024-05-23 15:30:12 +02:00
Fangrui Song
9e9907f1cf
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.

This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:

```
  LLVM :: CodeGen/AMDGPU/fabs.f64.ll
  LLVM :: CodeGen/AMDGPU/fabs.ll
  LLVM :: CodeGen/AMDGPU/floor.ll
  LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
  LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
  LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
  LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```
2024-01-16 21:54:58 -08:00
Piotr Sobczak
6eec80133b
[AMDGPU] Min/max changes for GFX12 (#75214)
Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
2023-12-13 14:18:10 +01:00