1170 Commits

Author SHA1 Message Date
Heejin Ahn
08bba6503b
[WebAssembly] Support binary generation for new EH (#109027)
This adds support for binary generation for the new EH proposal.

So far the only case that we emitted variable immediate operands in
binary has been `br_table`'s destinations. (Other `variable_ops` uses in
TableGen files are register operands, such as the operands of `call`, so
they don't get emitted in binary as a part of the same instruction.)

With this PR, variable immediate operands can include `try_table`'s
operands:
- The number of of catch clauses
- catch clauses sub-opcodes
  - `catch`: 0x00
  - `catch_ref`: 0x01
  - `catch_all`: 0x02
  - `catch_all_ref`: 0x03
- catch clauses' destinations

With `try_table`, we now have variable expr operands for `try_table`'s
catch clauses' tags. We treat their fixups in the same way we do for
tags in other instructions such as in `throw`.

Diff without whitespace will be easier to view.
2024-09-17 14:58:19 -07:00
Brendan Dahl
c076638c70
[WebAssembly] Support BUILD_VECTOR with F16x8. (#108117)
Convert BUILD_VECTORS with FP16x8 to I16x8 since there's no FP16 scalar
value to intialize v128.const.
2024-09-11 10:00:10 -07:00
Brendan Dahl
415288a2a7
[WebAssembly] Add load and store patterns for V8F16. (#108119) 2024-09-11 09:53:53 -07:00
Heejin Ahn
6bbf7f06d8
[WebAssembly] Add assembly support for final EH proposal (#107917)
This adds the basic assembly generation support for the final EH
proposal, which was newly adopted in Sep 2023 and advanced into Phase 4
in Jul 2024:

https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md

This adds support for the generation of new `try_table` and `throw_ref`
instruction in .s asesmbly format. This does NOT yet include
- Block annotation comment generation for .s format
- .o object file generation
- .s assembly parsing
- Type checking (AsmTypeCheck)
- Disassembler
- Fixing unwind mismatches in CFGStackify

These will be added as follow-up PRs.

---

The format for `TRY_TABLE`, both for `MachineInstr` and `MCInst`, is as
follows:
```
TRY_TABLE type number_of_catches catch_clauses*
```
where `catch_clause` is
```
catch_opcode tag+ destination
```
`catch_opcode` should be one of 0/1/2/3, which denotes
`CATCH`/`CATCH_REF`/`CATCH_ALL`/`CATCH_ALL_REF` respectively. (See
`BinaryFormat/Wasm.h`) `tag` exists when the catch is one of `CATCH` or
`CATCH_REF`.
The MIR format is printed as just the list of raw operands. The
(stack-based) assembly instruction supports pretty-printing, including
printing `catch` clauses by name, in InstPrinter.

In addition to the new instructions `TRY_TABLE` and `THROW_REF`, this
adds four pseudo instructions: `CATCH`, `CATCH_REF`, `CATCH_ALL`, and
`CATCH_ALL_REF`. These are pseudo instructions to simulate block return
values of `catch`, `catch_ref`, `catch_all`, `catch_all_ref` clauses in
`try_table` respectively, given that we don't support block return
values except for one case (`fixEndsAtEndOfFunction` in CFGStackify).
These will be omitted when we lower the instructions to `MCInst` at the
end.

LateEHPrepare now will have one more stage to covert
`CATCH`/`CATCH_ALL`s to `CATCH_REF`/`CATCH_ALL_REF`s when there is a
`RETHROW` to rethrow its exception. The pass also converts `RETHROW`s
into `THROW_REF`. Note that we still use `RETHROW` as an interim pseudo
instruction until we convert them to `THROW_REF` in LateEHPrepare.

CFGStackify has a new `placeTryTableMarker` function, which places
`try_table`/`end_try_table` markers with a necessary `catch` clause and
also `block`/`end_block` markers for the destination of the `catch`
clause.

In MCInstLower, now we need to support one more case for the multivalue
block signature (`catch_ref`'s destination's `(i32, exnref)` return
type).

InstPrinter has a new routine to print the `catch_list` type, which is
used to print `try_table` instructions.

The new test, `exception.ll`'s source is the same as
`exception-legacy.ll`, with the FileCheck expectations changed. One
difference is the commands in this file have `-wasm-enable-exnref` to
test the new format, and don't have `-wasm-disable-explicit-locals
-wasm-keep-registers`, because the new custom InstPrinter routine to
print `catch_list` only works for the stack-based instructions (`_S`),
and we can't use `-wasm-keep-registers` for them.

As in `exception-legacy.ll`, the FileCheck lines for the new tests do
not contain the whole program; they mostly contain only the control flow
instructions for readability.
2024-09-10 21:32:24 -07:00
anjenner
4af249fe6e
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
2024-09-06 16:19:20 +01:00
Heejin Ahn
aecbc92410
[WebAssembly] Rename CATCH/CATCH_ALL to *_LEGACY (#107187)
This renames MIR instruction `CATCH` and `CATCH_ALL` to `CATCH_LEGACY`
and `CATCH_ALL_LEGACY` respectively.

Follow-up PRs for the new EH (exnref) implementation will use `CATCH`,
`CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF` as pseudo-instructions
that return extracted values or `exnref` or both, because we don't
currently support block return values in LLVM. So to give the old (real)
`CATCH`es and the new (pseudo) `CATCH`es different names, this attaches
`_LEGACY` prefix to the old names.

This also rearranges `WebAssemblyInstrControl.td` so that the old legacy
instructions are listed all together at the end.
2024-09-04 16:14:13 -07:00
Heejin Ahn
b2223b4d7e
[WebAssembly] Rename legacy EH mir tests (#107189)
We added `-legacy` suffix to the legacy EH `ll` tests in #107166 but
forgot to do the same for `mir` tests.
2024-09-04 09:52:42 -07:00
Heejin Ahn
8b28e2ebb3
[WebAssembly] Rename legacy EH tests (#107166)
Give each test in `cfg-stackify-eh-legacy.ll` a name rather than
something like `test5`, because I plan to copy many of these test into a
new file that tests for the new EH (exnref) and some of the tests here
are not applicable to the new EH so the numbering will be different,
which can make things confusing.

Also this removes `test_` prefixes in the test function names in
`exception-legacy.ll`, because, well, we all know they are tests.
2024-09-03 21:14:36 -07:00
Brendan Dahl
5703d8572f
[WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (#106465)
Getting this to work required a few additional changes:
- Add builtins for any instructions that can't be done with plain C
currently.
- Add support for the saturating version of fp_to_<s,i>_I16x8. Other
vector sizes supported this already.
- Support bitcast of f16x8 to v128. Needed to return a __f16x8 as
v128_t.
2024-08-30 08:42:37 -07:00
Brendan Dahl
7d373cef49
[WebAssembly] Change half-precision feature name to fp16. (#105434)
This better aligns with how the feature is being referred to and what
runtimes (V8) are calling it.
2024-08-22 09:44:33 -07:00
Froster
234cb4c6e3
[SelectionDAG] Scalarize binary ops of splats before legal types (#100749)
Fixes #65072. This allows binary ops of splats to be scalarized if the
operation isn't legal on the element type isn't legal, but is legal on
the type it will be legalized to. I assume if an Op is legal both in
scalar and vector, choose scalar version should always be better no
matter what the type is.

There are some cases that my approach can't scalarize, for example:
``` llvm
; test/CodeGen/RISCV/rvv/select-int.ll
define <vscale x 4 x i64> @select_nxv4i64(i1 zeroext %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b) {
  %v = select i1 %c, <vscale x 4 x i64> %a, <vscale x 4 x i64> %b
  ret <vscale x 4 x i64> %v
}
```
https://godbolt.org/z/xzqrKrxvK
`xor (splat i1, splat i1)` is generated in late step after LegalizeType,
from select. I didn't figure out how to make `xor i1, i1` legal at this
time.

---------

Co-authored-by: Luke Lau <luke@igalia.com>
2024-08-15 00:07:00 +08:00
Hari Limaye
94473f4db6
[IRBuilder] Generate nuw GEPs for struct member accesses (#99538)
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.

Regression tests are updated using update scripts where possible, and by
find + replace where not.
2024-08-09 13:25:04 +01:00
Nikita Popov
0564d0665b [SDAG] Transfer gep nusw/nuw to SDAG
The resulting add is nuw if either the gep was nuw or it was
nusw+nneg. Previously only inbounds+nneg was handled.

Test via wasm load offsets, which seems to most directly expose
these SDAG flags.
2024-08-07 09:26:10 +02:00
Sam Parker
76c4529515
[WebAssembly] Fix assertion in LowerBUILD_VECTOR (#101961)
The assertion was failing in the case where we were trying to lower to
loadxx_zero, but lane zero was undef.
2024-08-05 14:38:12 -07:00
Sam Parker
08decd20a9
[WebAssembly] load_zero to initialise build_vector (#100610)
Instead of splatting a single lane, to initialise a build_vector, lower
to scalar_to_vector which can be selected to load_zero.

Also add load_zero and load_lane patterns for f32x4 and f64x2.
2024-08-02 10:11:21 +01:00
Heejin Ahn
0af7542135 Reapply "[WebAssembly] Fix phi handling for Wasm SjLj (#99730)"
This reapplies #99730. #99730 contained a nondeterministic iteration
which failed the reverse-iteration bot
(https://lab.llvm.org/buildbot/#/builders/110/builds/474) and reverted
in
f3f0d9928f.

The fix is make the order of iteration of new predecessors
determintistic by using `SmallSetVector`.
```diff
--- a/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
@@ -1689,7 +1689,7 @@ void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForWasmSjLj(
     }
   }

-  SmallDenseMap<BasicBlock *, SmallPtrSet<BasicBlock *, 4>, 4>
+  SmallDenseMap<BasicBlock *, SmallSetVector<BasicBlock *, 4>, 4>
       UnwindDestToNewPreds;
   for (auto *CI : LongjmpableCalls) {
     // Even if the callee function has attribute 'nounwind', which is true for
```
2024-07-25 00:00:59 +00:00
Brendan Dahl
0dbd72d6ab
[WebAssembly] Implement f16x8.replace_lane instruction. (#99388)
Use a builtin and intrinsic until half types are better supported for
instruction selection.
2024-07-24 11:55:36 -07:00
Sam Parker
a3de21cac1
[WebAssembly] Ofast pmin/pmax pattern matchers (#100107)
With fast-math, the ordered setcc nodes are converted to setcc nodes
which do not care about NaNs, so add patterns that use setlt, setle,
setgt and setge.
2024-07-24 09:23:49 +01:00
Heejin Ahn
f3f0d9928f Revert "[WebAssembly] Fix phi handling for Wasm SjLj (#99730)"
This reverts commit 2bf71b8bc851b49745b795f228037db159005570.
This broke the builbot at
https://lab.llvm.org/buildbot/#/builders/110/builds/474.
2024-07-24 00:14:58 +00:00
Heejin Ahn
2bf71b8bc8
[WebAssembly] Fix phi handling for Wasm SjLj (#99730)
In Wasm SjLj, longjmpable `call`s that in functions that call `setjmp`
are converted into `invoke`s. Those `invoke`s are meant to unwind to
`catch.dispatch.longjmp` to figure out which `setjmp` those `longjmp`
buffers belong to:

fada922732/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L250-L260)

But in case a longjmpable call is within another `catchpad` or
`cleanuppad` scope, to maintain the nested scope structure, we should
make them unwind to the scope's next unwind destination and not directly
to `catch.dispatch.longjmp`:

fada922732/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1698-L1727)
In this case the longjmps will eventually unwind to
`catch.dispatch.longjmp` and be handled there.

In this case, it is possible that the unwind destination (which is an
existing `catchpad` or `cleanuppad`) may already have `phi`s. And
because the unwind destinations get new predecessors because of the
newly created `invoke`s, those `phi`s need to have new entries for those
new predecessors.

This adds new preds as new incoming blocks to those `phi`s, and we use a
separate `SSAUpdater` to calculate the correct incoming values to those
blocks.

I have assumed `SSAUpdaterBulk` used in `rebuildSSA` would take care of
these things, but apparently it doesn't. It takes available defs and
adds `phi`s in the defs' dominance frontiers, i.e., where each def's
dominance ends, and rewrites other uses based on the newly added `phi`s.
But it doesn't add entries to existing `phi`s, and the case in this bug
may not even involve dominance frontiers; this bug is simply about
existing `phis`s that have gained new preds need new entries for them.
It is kind of surprising that this bug was only reported recently, given
that this pass has not been changed much in years.

Fixes #97496 and fixes
https://github.com/emscripten-core/emscripten/issues/22170.
2024-07-23 16:06:00 -07:00
Heejin Ahn
735852f5ab
[WebAssembly] Enable simd128 when relaxed-simd is set in AsmPrinter (#99803)
Even though in `Subtarget` we defined `SIMDLevel` as a number so
`hasRelaxedSIMD` automatically means `hasSIMD128`,
0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h (L36-L40)
0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h (L107)

specifying only `relaxed-simd` feature on a program that needs `simd128`
instructions to compile fails, because of this query in `AsmPrinter`:
d0d05aec3b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (L644-L645)

This `verifyInstructionPredicates` function (and other functions called
by this function) is generated by

https://github.com/llvm/llvm-project/blob/main/llvm/utils/TableGen/InstrInfoEmitter.cpp,
and looks like this (you can check it in the
`lib/Target/WebAssembly/WebAssemblyGenInstrInfo.inc` in your build
directory):
```cpp
void verifyInstructionPredicates(
    unsigned Opcode, const FeatureBitset &Features) {
  FeatureBitset AvailableFeatures = computeAvailableFeatures(Features);
  FeatureBitset RequiredFeatures = computeRequiredFeatures(Opcode);
  FeatureBitset MissingFeatures =
      (AvailableFeatures & RequiredFeatures) ^
      RequiredFeatures;
  ...
}
```

And `computeAvailableFeatures` is just a set query, like this:
```cpp
inline FeatureBitset computeAvailableFeatures(const FeatureBitset &FB) {
  FeatureBitset Features;
  if (FB[WebAssembly::FeatureAtomics])
    Features.set(Feature_HasAtomicsBit);
  if (FB[WebAssembly::FeatureBulkMemory])
    Features.set(Feature_HasBulkMemoryBit);
  if (FB[WebAssembly::FeatureExceptionHandling])
    Features.set(Feature_HasExceptionHandlingBit);
  ...
```

So this is how currently `HasSIMD128` is defined:

0caf0c93e7/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td (L79-L81)

The things being checked in this `computeAvailableFeatures`, and in turn
in `AsmPrinter`, are `AssemblerPredicate`s. These only check which bits
are set in the features set and are different from `Predicate`s, which
can call `Subtarget` functions like `Subtarget->hasSIMD128()`.

But apparently we can use `all_of` and `any_of` directives in
`AssemblerPredicate`, and we can make `simd128`'s `AssemblerPredicate`
set in `relaxed-simd` is set by the condition as an 'or' of the two.

Fixes #98502.
2024-07-23 11:50:56 -07:00
Farzon Lotfi
def3944df8
[WebAssembly] Add Support for Arc and Hyperbolic trig llvm intrinsics (#98755)
## Change:
- WebAssemblyRuntimeLibcallSignatures.cpp: Expose the RTLIB's for use by
WASM
-  Add trig specific test cases

## History
This change is part of an implementation of
https://github.com/llvm/llvm-project/issues/87367's investigation on
supporting IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This change adds wasm lowering cases for `acos`, `asin`, `atan`, `cosh`,
`sinh`, and `tanh`.

https://github.com/llvm/llvm-project/issues/70079
https://github.com/llvm/llvm-project/issues/70080
https://github.com/llvm/llvm-project/issues/70081
https://github.com/llvm/llvm-project/issues/70083
https://github.com/llvm/llvm-project/issues/70084
https://github.com/llvm/llvm-project/issues/95966

## Why Web Assembly?
From past changes to try and support constraint intrinsics the changes
to the trig builtins to emit intrinsics\constraint intrinsics broke the
WASM build. This is an attempt to preempt any such build break.

- https://github.com/llvm/llvm-project/pull/95082
-
https://github.com/llvm/llvm-project/pull/94559#issuecomment-2159923215
2024-07-19 10:18:58 -04:00
Sam Parker
d28ed29d6b
[TTI][WebAssembly] Pairwise reduction expansion (#93948)
WebAssembly doesn't support horizontal operations nor does it have a way
of expressing fast-math or reassoc flags, so runtimes are currently
unable to use pairwise operations when generating code from the existing
shuffle patterns.

This patch allows the backend to select which, arbitary, shuffle pattern
to be used per reduction intrinsic. The default behaviour is the same as
the existing, which is by splitting the vector into a top and bottom
half. The other pattern introduced is for a pairwise shuffle.

WebAssembly enables pairwise reductions for int/fp add/sub.
2024-07-17 09:21:52 +01:00
Volodymyr Vasylkun
e094abde42
[SelectionDAG] Expand [US]CMP using arithmetic on boolean values instead of selects (#98774)
The previous expansion of [US]CMP was done using two selects and two
compares. It produced decent code, but on many platforms it is better to
implement [US]CMP nodes by performing the following operation:

  ```
[us]cmp(x, y) = (x [us]> y) - (x [us]< y)
```

This patch adds this new expansion, as well as a hook in TargetLowering to allow some targets to still use the select-based approach. AArch64 and SystemZ are currently the only targets to prefer the former approach, but other targets may also start to use it if it provides for better codegen.
2024-07-16 20:56:18 +01:00
Heejin Ahn
fb6e024f49
[WebAssembly] Update generic and bleeding-edge CPUs (#96584)
This updates the list of features in 'generic' and 'bleeding-edge' CPUs
in the backend to match

4e0a0eae58/clang/lib/Basic/Targets/WebAssembly.cpp (L150-L178)

This updates existing CodeGen tests in a way that, if a test has
separate RUN lines for a reference-types test and a non-reference-types
test, I added -mattr=-reference-types to the no-reftype test's RUN
command line. I didn't delete existing -mattr=+reference-types lines in
reftype tests because having it helps readability.

Also, when tests is not really about reference-types but they have to
updated because they happen to contain call_indirect lines because now
call_indirect will take __indirect_function_table as an argument, I just
added the table argument to the expected output.

`target-features-cpus.ll` has been updated reflecting the newly added
features.
2024-07-01 19:12:01 -07:00
Heejin Ahn
a54704de0d
[WebAssembly] Split and tidy up target features test (#96735)
This splits `target-features.ll` into two tests:
`target-features-attrs.ll` and `target-features-cpus.ll`.

Now `target-features-attrs.ll` contains tests with bitcode function
attributes and `-mattr=` options. The current `target-features.ll`
file's FileCheck lines are confusing, mainly because it is unclear how
`CHECK` and `ATTRS` lines are meant to be different. Turns out, before
67ec8744d7,
`-mattr=` options used to override any existing bitcode function
attributes, but after the commit that's not the case anymore. So the
original test had a line that tested `i32.atomic.rmw.cmpxchg` was not
generated when `-mattr=+simd128` was given (because the existing
`+atomics` in the function attributes is overriden). That commit deleted
that line and changed some `ATTRS` lines into `CHECK`, which was
confusing. This PR simplifies that part and does not test the absence of
any instructions, and the effect of `-mattr=` option is only tested with
the target features section.

And `target-features-cpus.ll` only tests the sets of features enabled by
`-mcpu=` lines. It is better to have this as a separate file because
once you have bitcode function attributes they end up in the target
features section too, making the testing of only the `-mcpu=` options
difficult.
2024-06-26 13:28:55 -07:00
Heejin Ahn
1822e3183d
[WebAssembly] Rename target-features.ll (#96716)
I'm planning on a PR that splits `target-features.ll` into two different
files and fix some other stuff on them:
- `target-features-attrs.ll` that tests target features by bitcode
function attributes and `-mattr=` options
- `target-features-cpus.ll` that tests target features by `-mcpu=`
options

But `target-features-attrs.ll` will share a bulk of the lines with the
current `target-features.ll`. And if I remove `target-features.ll` and
create the two new files in a single PR, git doesn't recognize either of
them as a copy (I hoped at least `target-features-attrs.ll` would be
recognized as a copy because it shares many lines with the current file)

So to make the diff smaller and easier to review, I'm renaming the file
first. I'll follow up with the PR that does the actual splitting.
2024-06-25 23:14:04 -07:00
Brendan Dahl
928b780840
[WebAssembly] Implement trunc_sat and convert instructions for f16x8. (#95180)
These instructions can be generated using regular LL intrinsics.

Specified at:

29a9b9462c/proposals/half-precision/Overview.md
2024-06-25 10:39:05 -07:00
Heejin Ahn
3c8f3b91d8
[WebAssembly] Treat 'rethrow' as terminator in custom isel (#95967)
`rethrow` instruction is a terminator, but when when its DAG is built in
`SelectionDAGBuilder` in a custom routine, it was NOT treated as such.

```ll
rethrow:                                          ; preds = %catch.start
  invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ]
          to label %unreachable unwind label %ehcleanup

ehcleanup:                                        ; preds = %rethrow, %catch.dispatch
  %tmp = phi i32 [ 10, %catch.dispatch ], [ 20, %rethrow ]
  ...
```

In this bitcode, because of the `phi`, a `CONST_I32` will be created in
the `rethrow` BB. Without this patch, the DAG for the `rethrow` BB looks
like this:
```
  t0: ch,glue = EntryToken
      t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
      t5: ch = llvm.wasm.rethrow t0, TargetConstant:i32<12161>
    t6: ch = TokenFactor t3, t5
  t8: ch = br t6, BasicBlock:ch<unreachable 0x562532e43c50>
```
Note that `CopyToReg` and `llvm.wasm.rethrow` don't have dependence so
either can come first in the selected code, which can result in the code
like
```mir
bb.3.rethrow:
  RETHROW 0, implicit-def dead $arguments
  %9:i32 = CONST_I32 20, implicit-def dead $arguments
  BR %bb.6, implicit-def dead $arguments
```

After this patch, `llvm.wasm.rethrow` is treated as a terminator, and
the DAG will look like
```
        t0: ch,glue = EntryToken
      t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
    t5: ch = llvm.wasm.rethrow t3, TargetConstant:i32<12161>
  t7: ch = br t5, BasicBlock:ch<unreachable 0x5555e3d32c70>
```
Note that now `rethrow` takes a token from `CopyToReg`, so `rethrow` has
to come after `CopyToReg`. And the resulting code will be
```mir
bb.3.rethrow:
  %9:i32 = CONST_I32 20, implicit-def dead $arguments
  RETHROW 0, implicit-def dead $arguments
  BR %bb.6, implicit-def dead $arguments
```

I'm not very familiar with the internals of `getRoot` vs.
`getControlRoot`, but other terminator instructions seem to use the
latter, and using it for `rethrow` too worked.
2024-06-18 21:56:41 -07:00
Farzon Lotfi
6355fb45a5
[CodeGen] Support vectors across all backends (#95518)
Add a default f16 type promotion
2024-06-14 17:18:20 -04:00
Farzon Lotfi
38ccee0034
[WASM] Fix for wasi libc build break add tan to RuntimeLibcallSignatureTable (#95082)
The wasm backend fetches the tan runtime lib call in
`llvm/include/llvm/IR/RuntimeLibcalls.def` via `StaticLibcallNameMap()`,
but ignores the runtime function because a function sinature mapping is
not specified in RuntimeLibcallSignatureTable(). The fix is to specify
the function signatures for float32-128.

This is a fix for a build break reported on PR
https://github.com/llvm/llvm-project/pull/94559#issuecomment-2159923215.
2024-06-11 10:43:51 -04:00
Matt Arsenault
84b026690d
DAG: Pass flags to FoldConstantArithmetic (#93663)
There is simply way too much going on inside getNode. The complicated
constant folding of vector handling works by looking for build_vector
operands, and then tries to getNode the scalar element and then checks
if
constants were the result. As a side effect, this produces unused scalar
operation nodes (previously, without flags). If the vector operation
were later scalarized, it would find the flagless constant folding
temporary and lose the flag. I don't think this is a reasonable way for
constant folding to operate, but for now fix this by ensuring flags
on the original operation are preserved in the temporary.
    
This yields a clear code improvement for AMDGPU when f16 isn't legal.
The Wasm cases switch from using a libcall to compare and select. We are
evidently
missing the fcmp+select to fminimum/fmaximum handling, but this would be
further
improved when that's handled. AArch64 also avoids the libcall, but looks
worse and
has a different call for some reason.
2024-06-06 16:44:07 +02:00
Jon Chesterfield
8516f54e6a
[AMDGPU] Implement variadic functions by IR lowering (#93362)
This is a mostly-target-independent variadic function optimisation and
lowering pass. It is only enabled for AMDGPU in this initial commit.

The purpose is to make C style variadic functions a zero cost
abstraction. They are lowered to equivalent IR which is then amenable to
other optimisations. This is inherently slightly target specific but
much less so than one might expect - the C varargs interface heavily
constrains the ABI design divergence.

The pass is primarily tested from webassembly. This is because wasm has
a straightforward variadic lowering strategy which coincides exactly
with what this pass transforms code into and a struct passing convention
with few cases to check. Adding further targets conventions is
straightforward and elided from this patch primarily to simplify the
review. Implemented in other branches are Linux X86, AMD64, AArch64 and
NVPTX.

Testing for targets that have existing lowering for va_arg from clang is
most efficiently done by checking that clang | opt completely elides the
variadic syntax from test cases. The lowering produces a struct for each
call site which can be inspected to check the various alignment and
indirections are correct.

AMDGPU presently has no variadic support other than some ad hoc printf
handling. Combined with the pass being inactive on all other targets
landing this represents strict increase in capability with zero risk.
Testing and refining will continue post commit.

In addition to the compiler tests included here, a self contained x64
clang/musl toolchain was constructed using the "lowering" instead of the
systemv ABI and used to build various C programs like lua and libxml2.
2024-06-06 10:44:53 +01:00
Brendan Dahl
dfd1a2f081
[WebAssembly] Implement all f16x8 unary instructions. (#94063)
All of these instructions can be generated using regular LL intrinsics.

Specified at:

29a9b9462c/proposals/half-precision/Overview.md
2024-06-04 13:06:16 -04:00
Nikita Popov
deab451e7a
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.

This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
2024-06-04 08:31:03 +02:00
Brendan Dahl
8aa8019975
[WebAssembly] Implement all f16x8 relation instructions. (#93751)
All of these instructions can be generated using regular LL
instructions.

Specified at:

29a9b9462c/proposals/half-precision/Overview.md
2024-05-30 09:02:17 -07:00
Brendan Dahl
60bce6eab4
[WebAssembly] Implement all f16x8 binary instructions. (#93360)
This reuses most of the code that was created for f32x4 and f64x2 binary
instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LL instructions
min/max - use the minimum/maximum intrinsic, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins

Specified at:

29a9b9462c/proposals/half-precision/Overview.md
2024-05-28 16:33:20 -07:00
Heejin Ahn
722a5fce58
[WebAssembly] Add -wasm-enable-exnref option (#93597)
This adds `-wasm-enable-exnref`, which will enable the new EH
instructions using `exnref` (adopted in Oct 2023 CG meeting):
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md
This option should be used with `-wasm-enable-eh`.
2024-05-28 16:27:04 -07:00
Heejin Ahn
c179d50fd3
[WebAssembly] Add exnref type (#93586)
This adds (back) the exnref type restored in the new EH proposal adopted
in Oct 2023 CG meeting:

https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md:x
2024-05-28 16:10:11 -07:00
Heejin Ahn
c108c1e945
[WebAssembly] Rename old EH tests to *-legacy (#93585)
I think test files for the legacy and the new EH (exnref) are better be
separate, and I'd like to use the current test file names for the new
EH, rather than keeping the current files and naming the new ones as
`-new` or something.
2024-05-28 13:26:36 -07:00
Heejin Ahn
08de0b3cf5
[WebAssembly] Add tests for EH/SjLj option errors (#93583)
This adds tests for EH/SjLj option errors and swaps the error checking
order for unimportant cosmetic reasons (I think checking EH/SjLj
conflicts is more important than the model checking)
2024-05-28 11:36:48 -07:00
Brendan Dahl
4ebe9bba59
[WebAssembly] Implement prototype f16x8.extract_lane instruction. (#93272)
Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f16x8.extract_lane as opcode 0x124, but this
is incorrect and will be changed to 0x121 soon.
2024-05-24 08:31:07 -07:00
Brendan Dahl
09c5525610
[WebAssembly] Implement prototype f16x8.splat instruction. (#93228)
Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is
incorrect and will be changed to 0x120 soon.
2024-05-23 20:05:22 -07:00
Sam Clegg
39d32b238d
[WebAssembly] Use 64-bit table when targeting wasm64 (#92042)
See https://github.com/WebAssembly/memory64/issues/51
2024-05-23 18:25:58 -07:00
Alex Voicu
10edb4991c
[Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182)
At the moment, Clang is rather liberal in assuming that 0 (and by extension unqualified) is always a safe default. This does not work for targets that actually use a different value for the default / generic AS (for example, the SPIRV that obtains from HIPSPV or SYCL). This patch is a first, fairly safe step towards trying to clear things up by querying a modules' default AS from the target, rather than assuming it's 0, alongside fixing a few places where things break / we encode the 0 == DefaultAS assumption. A bunch of existing tests are extended to check for non-zero default AS usage.
2024-05-19 14:59:03 +01:00
Brendan Dahl
8a3277acbc
[WebAssembly] Implement prototype f32.store_f16 instruction. (#91545)
Adds a builtin and intrinsic for the f32.store_f16 instruction.

The instruction stores an f32 value as an f16 memory. Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is
incorrect and will be changed to 0xFC31 soon.
2024-05-09 15:38:13 -07:00
Brendan Dahl
1a2a1fbd7c
[WebAssembly] Implement prototype f32.load_f16 instruction. (#90906)
Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and puts it in an f32.
Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.
2024-05-07 11:33:10 -07:00
Congcong Cai
1a46229636
Revert "Revert "[WebAssembly] remove instruction after builtin trap" (#90354)" (#90366)
`llvm.trap` will be convert as unreachable which is terminator.
Instruction after terminator will cause validation failed.
This PR introduces a pass to clean instruction after terminator.
Fixes: https://github.com/llvm/llvm-project/issues/68770
Reapply: #90207
2024-04-28 10:13:02 +08:00
Mehdi Amini
38a2051c52
Revert "[WebAssembly] remove instruction after builtin trap" (#90354)
Reverts llvm/llvm-project#90207

LLD Bots are broken.
2024-04-27 21:14:46 +02:00
Congcong Cai
ff03f23be8
[WebAssembly] remove instruction after builtin trap (#90207)
`llvm.trap` will be convert as `unreachable` which is terminator.
Instruction after terminator will cause validation failed.
This PR introduces a pass to clean instruction after terminator.
Fixes: #68770.
2024-04-27 22:11:47 +08:00