13 Commits

Author SHA1 Message Date
Fangrui Song
9e9907f1cf
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.

For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.

Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.

This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:

```
  LLVM :: CodeGen/AMDGPU/fabs.f64.ll
  LLVM :: CodeGen/AMDGPU/fabs.ll
  LLVM :: CodeGen/AMDGPU/floor.ll
  LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
  LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
  LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
  LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```
2024-01-16 21:54:58 -08:00
Matt Arsenault
460ffcddd9
AMDGPU: Make bf16/v2bf16 legal types (#76215)
There are some intrinsics are using i16 vectors in place of bfloat
vectors.
Move towards making bf16 vectors legal so these can migrate. Leave the
larger vectors for a later change.

Depends #76213 #76214
2024-01-04 22:31:18 +07:00
Jay Foad
a4196666ac
[AMDGPU] Revert "Preliminary patch for divergence driven instruction selection. Operands Folding 1." (#71710)
This reverts commit 201f892b3b597f24287ab6a712a286e25a45a7d9.
2023-11-13 13:53:10 +00:00
Amara Emerson
6b69584660
[GlobalISel] Fall back for bf16 conversions. (#71470)
We don't support these correctly since we don't yet have FP types.
AMDGPU tests were silently miscompiling bf16 as if they were fp16.
2023-11-06 21:18:57 -08:00
Matt Arsenault
1faa4797ca AMDGPU: Handle unsafe exp.f32 with denormal handling
I somehow missed this path when adding the new expansions. Saves a lot
of instructions for afn + IEEE.

https://reviews.llvm.org/D157867
2023-08-14 18:36:01 -04:00
Matt Arsenault
fb54afd1b7 AMDGPU: Fold fsub [+-0] into fneg when folding source modifiers
This isn't always folded to fneg for a freestanding fsub depending on
the denormal mode. When matching source modifiers, we're implicitly
canonicalizing the input so we can fold it here.

Doesn't bother handling the VOP3P case since it's only relevant with
DAZ, which nobody really uses with f16.

For f64, tests show an existing bug where DAGCombiner tries to respect
the denormal mode for fsub -0, x, but not after it's lowered to fadd
-0, (fneg x). Either the fold is wrong or we shouldn't restrict the
fsub case based on the denormal mode.

https://reviews.llvm.org/D155652
2023-07-20 19:29:40 -04:00
Matt Arsenault
0295513238 AMDGPU: Filter out contract flags when lowering exp
It is unsafe to contract the fsub into the fmul. It also increases
code size by duplicating a constant.
2023-07-20 18:14:24 -04:00
Matt Arsenault
076bc374fc AMDGPU: Add some new baseline tests for exp lowering 2023-07-20 18:14:24 -04:00
Matt Arsenault
fbe4ff8149 AMDGPU: Partially fix not respecting dynamic denormal mode
The most notable issue was producing v_mad_f32 in functions with the
dynamic mode, since it just ignores the mode. fdiv lowering is still
somewhat broken because it involves a mode switch and we need to query
the original mode.
2023-07-11 15:14:52 -04:00
Matt Arsenault
20964c901a DAG: Fix dropping flags when widening unary vector ops 2023-07-05 17:25:24 -04:00
Matt Arsenault
5491666248 AMDGPU: Correctly lower llvm.exp.f32
The library expansion has too many paths for all the permutations of
DAZ, unsafe and the 3 exp functions. It's easier to expand it in the
backend when we know all of these things. The library currently misses
the no-infinity check on the overflow, which this handles optimizing
out.

Some of the <3 x half> fast tests regress due to vector widening
dropping flags which will be fixed separately.

Apparently there is no exp10 intrinsic, but there should be. Adds some
deadish code in preparation for adding one while I'm following along
with the current library expansion.
2023-07-05 17:23:49 -04:00
Matt Arsenault
ed556a1ad5 AMDGPU: Correctly lower llvm.exp2.f32
Previously this did a fast math expansion only.
2023-07-05 17:23:48 -04:00
Matt Arsenault
c56e4a8c42 AMDGPU: Modernize exp codegen tests
Find and replace on the new log tests (plus <3 x half> which was
missing). Apparently exp10 never worked.
2023-06-23 09:11:06 -04:00