Getting this to work required a few additional changes:
- Add builtins for any instructions that can't be done with plain C
currently.
- Add support for the saturating version of fp_to_<s,i>_I16x8. Other
vector sizes supported this already.
- Support bitcast of f16x8 to v128. Needed to return a __f16x8 as
v128_t.
This reuses most of the code that was created for f32x4 and f64x2 binary
instructions and tries to follow how they were implemented.
add/sub/mul/div - use regular LL instructions
min/max - use the minimum/maximum intrinsic, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins
Specified at:
29a9b9462c/proposals/half-precision/Overview.md
Adds a builtin and intrinsic for the f16x8.splat instruction.
Specified at:
29a9b9462c/proposals/half-precision/Overview.md
Note: the current spec has f16x8.splat as opcode 0x123, but this is
incorrect and will be changed to 0x120 soon.
Adds a builtin and intrinsic for the f32.store_f16 instruction.
The instruction stores an f32 value as an f16 memory. Specified at:
29a9b9462c/proposals/half-precision/Overview.md
Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is
incorrect and will be changed to 0xFC31 soon.
Adds a builtin and intrinsic for the f32.load_f16 instruction.
The instruction loads an f16 value from memory and puts it in an f32.
Specified at:
29a9b9462c/proposals/half-precision/Overview.md
Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.