AMDGPU has native instructions and target intrinsics for this, but
these really should be subject to legalization and generic
optimizations. This will enable legalization of f16->f32 on targets
without f16 support.
Implement a somewhat horrible inline expansion for targets without
libcall support. This could be better if we could introduce control
flow (GlobalISel version not yet implemented). Support for strictfp
legalization is less complete but works for the simple cases.
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svld1_hor_za8 // also for _za16, _za32, _za64 and _za128
- svld1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svld1_ver_za8 // also for _za16, _za32, _za64 and _za128
- svld1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svst1_hor_za8 // also for _za16, _za32, _za64 and _za128
- svst1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svst1_ver_za8 // also for _za16, _za32, _za64 and _za128
- svst1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
SveEmitter.cpp is extended to generate arm_sme.h (currently named
arm_sme_draft_spec_subject_to_change.h) and other SME definitions from
arm_sme.td, which is modeled after arm_sve.td. Common TableGen definitions
are moved into arm_sve_sme_incl.td.
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen, kmclaughlin
Differential Revision: https://reviews.llvm.org/D127910
Add more builtins for stdio functions as in GCC, along with their
mutations under IEEE float128 ABI.
Reviewed By: tuliom
Differential Revision: https://reviews.llvm.org/D150087
For the cover letter of this patch-set, please checkout D146872.
Depends on D147731.
This is the 4th patch of the patch-set.
This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple unit-stride segment store is
not removed, and only signed integer unit-strided segment store of NF=2,
EEW=32 is defined here.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D147774
Clang has mechanism to specify required target features of a built-in
function. This patch adds such definitions to Altivec, VSX, HTM,
PairedVec and MMA builtins.
This will help frontend to detect incompatible target features of
bulitin when using target attribute syntax.
Reviewed By: nemanjai, kamaub
Differential Revision: https://reviews.llvm.org/D143467
Currently clang doesn't verify if the first argument in
`_builtin_assume_aligned` is of pointer type. This leads to an
assertion build failure. This patch aims to add a check if the first
argument is of pointer type or not and diagnose it with
diag::err_typecheck_convert_incompatible if its not of pointer type.
Fixes https://github.com/llvm/llvm-project/issues/62305
Differential Revision: https://reviews.llvm.org/D149514
Since we don't always need the vendor extension to be in riscv_vector.td,
so it's better to make it be in separated header.
Depends on D148223 and D148680
Differential Revision: https://reviews.llvm.org/D148308
This commit implements the two NTLH intrinsic functions.
```
type __riscv_ntl_load (type *ptr, int domain);
void __riscv_ntl_store (type *ptr, type val, int domain);
```
```
enum {
__RISCV_NTLH_INNERMOST_PRIVATE = 2,
__RISCV_NTLH_ALL_PRIVATE,
__RISCV_NTLH_INNERMOST_SHARED,
__RISCV_NTLH_ALL
};
```
We encode the non-temporal domain into MachineMemOperand flags.
1. Create the RISC-V built-in function with custom semantic checking.
2. Assume the domain argument is a compile time constant,
and make it as LLVM IR metadata (nontemp_node).
3. Encode domain value as two bits MachineMemOperand TargetMMOflag.
4. According to MachineMemOperand TargetMMOflag, select corrsponding ntlh instruction.
Currently, it supports scalar type and fixed-length vector type.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D143364
Change the return type of `getClobbers` function from `const char*`
to `std::string_view`. Update the function usages in CodeGen module.
The reasoning of these changes is to remove unsafe `const char*`
strings and prevent unnecessary allocations for constructing the
`std::string` in usages of `getClobbers()` function.
Differential Revision: https://reviews.llvm.org/D148799
Make CheckAtomicAlignment() return the computed pointer for reuse to avoid
emitting it twice.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D148422
Addresses the FIXME and removes the only in tree use of
llvm::ConstantFP::getZeroValueForNegation for an FP type.
Reviewed By: dmgreen, SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D147497
Plumbing from the language level to the assume intrinsics with
separate_storage operand bundles.
Patch by David Goldblatt (davidtgoldblatt)
Differential Revision: https://reviews.llvm.org/D136515
This is the funcref counterpart to 890146b. We introduce a new attribute
that marks a function pointer as a funcref. It also implements builtin
__builtin_wasm_ref_null_func(), that returns a null funcref value.
Differential Revision: https://reviews.llvm.org/D128440
On SystemZ, int128 values are generally aligned to only 8 bytes per the ABI
while 128 bit atomic ISA instructions exist with a full 16 byte alignment
requirement.
__sync builtins are emitted as atomicrmw instructions which always require
the natural alignment (16 bytes in this case), and they always get it
regardless of the alignment of the value being addressed.
This patch improves this situation by giving a warning if the alignment is
not known to be sufficient. This check is done in CodeGen instead of in Sema
as this is currently the only place where the alignment can be computed. This
could/should be moved into Sema in case the alignment computation could be
made there eventually.
Reviewed By: efriedma, jyknight, uweigand
Differential Revision: https://reviews.llvm.org/D143813
Also check if native half types are supported to give more descriptive
error message, without it clang only reports incorrect intrinsic return
type.
Differential Revision: https://reviews.llvm.org/D145238
This builtin will be converted to llvm.set.rounding intrinsic
in IR level and should be work with "#pragma STDC FENV_ACCESS ON"
since it changes default FP environment. Users can change rounding
mode via this builtin without introducing libc dependency.
Reviewed by: andrew.w.kaylor, rjmccall, sepavloff, aaron.ballman
Differential Revision: https://reviews.llvm.org/D145765
Signed-off-by: jinge90 <ge.jin@intel.com>
Add codegen for llvm exp/exp2 elementwise builtin
The exp/exp2 elementwise builtins are necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types.
The new builtins are restricted to floating point types only.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D145270
This builtin will be converted to llvm.set.rounding intrinsic
in IR level and should be work with "#pragma STDC FENV_ACCESS ON"
since it changes default FP environment. Users can change rounding
mode via this builtin without introducing libc dependency.
Reviewed by: andrew.w.kaylor, rjmccall, sepavloff
Differential Revision: https://reviews.llvm.org/D144454
Signed-off-by: jinge90 <ge.jin@intel.com>
To generate FROUND instructions in https://reviews.llvm.org/D143982, we need to use llvm intrinsics in IR files. Now I add some corresponding builtin functions to make sure these roundtointegral instructions can be generated from C codes.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D144935
I didn't understand why the other builtins have promotion logic,
or how it would apply for a ternary operation. Implicit conversions
are evil to begin with, and even more so when the purpose is to get
an exact IR intrinsic. This checks all the arguments have the same type.
This patch introduces a new type __externref_t that denotes a WebAssembly opaque
reference type. It also implements builtin __builtin_wasm_ref_null_extern(),
that returns a null value of __externref_t. This lays the ground work
for further builtins and reference types.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122215
Add codegen for llvm log2 / log10 elementwise builtin
The log2/log10 elementwise builtin is necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types.
The new builtins are restricted to floating point types only.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D143207
This patch changes the lowering for the following builtins to emit
calls to the new aarch64.sve.###.u intrinsics.
svabd_x
svabd_n_x
svadd_x
svadd_n_x
svasr_x
svasr_n_x
svdiv_x
svdiv_n_x
svdivr_x
svdivr_n_x
svlsl_x
svlsl_n_x
svlsr_x
svlsr_n_x
svmax_x
svmax_n_x
svmin_x
svmin_n_x
svmul_x
svmul_n_x
svmulh_x
svmulh_n_x
svsub_x
svsub_n_x
svsubr_x
svsubr_n_x
Depends on D141938
Differential Revision: https://reviews.llvm.org/D141939
Removes the forwarding header `llvm/Support/AArch64TargetParser.h`.
I am proposing to do this for all the forwarding headers left after
rGf09cf34d00625e57dea5317a3ac0412c07292148 - for each header:
- Update all relevant in-tree includes
- Remove the forwarding Header
Differential Revision: https://reviews.llvm.org/D140999
Add codegen for llvm log elementwise builtin
The log elementwise builtin is necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types.
The new builtin is restricted to floating point types only.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D140489
This patch introduces a new type __externref_t that denotes a WebAssembly opaque
reference type. It also implements builtin __builtin_wasm_ref_null_extern(),
that returns a null value of __externref_t. This lays the ground work
for further builtins and reference types.
Differential Revision: https://reviews.llvm.org/D122215
This diff extends D123345 by adding support for std::forward_like.
Test plan: ninja check-clang check-clang-tools check-llvm
Differential revision: https://reviews.llvm.org/D142430
The current way creates a fallacy that checking for
`PolicyAttrs == TAIL_AGNOSTIC` is implicitly equivalant to
`TAIL_AGNOSTIC_MASK_UNDISTURBED`. This works under the assumption that
an unmasked intrinsic has a policy of TAMU. The expression here is
mis-leading and will not be correct when the default policy is not
TAMU.
As this patch-set targets to change the default policy from TAMU to
TAMA, this commit is necessary before changing the default.
This is the 12th commit of a patch-set that aims to change the default policy
for RVV intrinsics from TAMU to TAMA.
Please refer to the cover letter in the 1st commit (D141573) for an
overview.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D141789
The SVE builtins tests rely on optimisations to remove clutter from
the resulting IR that is not relevant to the tests. However, with
the increasing number of target intrinsic combines the clang tests
are moving further away from verifying what is relevant to clang.
During early SVE (or rather scalable vector) bringup, we chose to
mitigate bugs by minimising our usage of LLVM IR instructions then
later implemented the combines to lower the calls to generic IR once
scalable vector support had matured. With the mitigations no longer
required and the combines mostly trivial I have moved the logic into
CGBuiltins, which allows the existing tests to remain unchanged once
they stop using instcombine.
The optimisations include:
* Using shifts in place of multiplies by power-of-two values.
* Don't emit getelementptrs when offset is zero.
* Use IR based vector splats rather than calls to dup_x.
* Use IR based vector selects rather than calls to sel.
* Use i64 based indices for insertelement.
The test changes are the result of "sed -i -e 's/instcombine,//'",
with the exception of acle_sve_dupq.c which required regeneration
due to its previous reliance on a zext->tunc->zext combine.
The following tests still rely on instcombine because they require
changes beyond CGBuiltin.cpp:
CodeGen/aarch64-sve-intrinsics/acle_sve_clasta.c
CodeGen/aarch64-sve-intrinsics/acle_sve_clastb.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c
CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_dup.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sb.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sh.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sw.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ub.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1uh.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1uw.c
CodeGen/aarch64-sve-intrinsics/acle_sve_len-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_len.c
CodeGen/aarch64-sve-intrinsics/acle_sve_rdffr.c
CodeGen/aarch64-sve-intrinsics/acle_sve_sel-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_sel.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1b.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1h.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1w.c
Tests within aarch64-sve2-intrinsics don't use opt but instead use
-O1 to cleanup their output. These tests remain unchanged and will
be visited by a later patch.
Depends on D140983
Differential Revision: https://reviews.llvm.org/D141056