This patch introduces a new type, __externref_t, that denotes a WebAssembly opaque
reference type. It also implements the builtin __builtin_wasm_ref_null_extern(),
which returns a null value of type __externref_t. This lays the groundwork
for further builtins and reference types.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D122215
Add codegen for llvm log2 / log10 elementwise builtin
The log2/log10 elementwise builtin is necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types.
The new builtins are restricted to floating point types only.
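As a rough illustration of the "floating point types only" restriction, here is a portable sketch of the per-lane semantics. It is not the Clang builtin itself: `elementwiseLog2` is a hypothetical helper and `std::array` stands in for a vector type.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>

// Hypothetical helper (not a Clang API): applies log2 to every lane of a
// fixed-size vector, modelled here with std::array.
template <typename T, std::size_t N>
std::array<T, N> elementwiseLog2(const std::array<T, N> &v) {
  // Mirrors the restriction described above: integer element types are
  // rejected at compile time.
  static_assert(std::is_floating_point<T>::value,
                "elementwise log2 requires a floating point element type");
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = std::log2(v[i]);
  return out;
}
```

Instantiating the helper with an integer element type fails to compile, which is the kind of error the added tests check for in the real builtin.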
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D143207
This patch changes the lowering for the following builtins to emit
calls to the new aarch64.sve.###.u intrinsics.
svabd_x
svabd_n_x
svadd_x
svadd_n_x
svasr_x
svasr_n_x
svdiv_x
svdiv_n_x
svdivr_x
svdivr_n_x
svlsl_x
svlsl_n_x
svlsr_x
svlsr_n_x
svmax_x
svmax_n_x
svmin_x
svmin_n_x
svmul_x
svmul_n_x
svmulh_x
svmulh_n_x
svsub_x
svsub_n_x
svsubr_x
svsubr_n_x
Depends on D141938
Differential Revision: https://reviews.llvm.org/D141939
Removes the forwarding header `llvm/Support/AArch64TargetParser.h`.
I am proposing to do this for all the forwarding headers left after
rGf09cf34d00625e57dea5317a3ac0412c07292148 - for each header:
- Update all relevant in-tree includes
- Remove the forwarding header
Differential Revision: https://reviews.llvm.org/D140999
Add codegen for llvm log elementwise builtin
The log elementwise builtin is necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types.
The new builtin is restricted to floating point types only.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D140489
This patch introduces a new type, __externref_t, that denotes a WebAssembly opaque
reference type. It also implements the builtin __builtin_wasm_ref_null_extern(),
which returns a null value of type __externref_t. This lays the groundwork
for further builtins and reference types.
Differential Revision: https://reviews.llvm.org/D122215
This diff extends D123345 by adding support for std::forward_like.
Test plan: ninja check-clang check-clang-tools check-llvm
Differential revision: https://reviews.llvm.org/D142430
The current code gives the false impression that checking for
`PolicyAttrs == TAIL_AGNOSTIC` is implicitly equivalent to checking for
`TAIL_AGNOSTIC_MASK_UNDISTURBED`. This only works under the assumption that
an unmasked intrinsic has a policy of TAMU. The expression is
misleading and will not be correct when the default policy is not
TAMU.
As this patch set aims to change the default policy from TAMU to
TAMA, this commit is necessary before changing the default.
This is the 12th commit of a patch set that aims to change the default policy
for RVV intrinsics from TAMU to TAMA.
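A minimal sketch of why the equality check is fragile. The bit encoding below (bit 0 = tail agnostic, bit 1 = mask agnostic) and both helper names are assumptions made for illustration, not code taken from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit encoding, for illustration only: bit 0 = tail agnostic,
// bit 1 = mask agnostic.
enum PolicyBits : std::uint8_t {
  TAIL_UNDISTURBED_MASK_UNDISTURBED = 0,                       // TUMU
  TAIL_AGNOSTIC = 1,                                           // same value as TAMU
  MASK_AGNOSTIC = 2,
  TAIL_AGNOSTIC_MASK_AGNOSTIC = TAIL_AGNOSTIC | MASK_AGNOSTIC  // TAMA
};

// Fragile: true only when the mask bit is clear, so a TAMA policy is
// reported as "not tail agnostic".
constexpr bool isTailAgnosticByEquality(std::uint8_t policy) {
  return policy == TAIL_AGNOSTIC;
}

// Explicit: tests only the tail bit, independent of the mask policy.
constexpr bool isTailAgnostic(std::uint8_t policy) {
  return (policy & TAIL_AGNOSTIC) != 0;
}
```

With TAMU as the default both helpers agree. Once the default becomes TAMA, only the bit test still gives the intended answer.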
Please refer to the cover letter in the 1st commit (D141573) for an
overview.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D141789
The SVE builtins tests rely on optimisations to remove clutter from
the resulting IR that is not relevant to the tests. However, with
the increasing number of target intrinsic combines the clang tests
are moving further away from verifying what is relevant to clang.
During early SVE (or rather scalable vector) bringup, we chose to
mitigate bugs by minimising our usage of LLVM IR instructions then
later implemented the combines to lower the calls to generic IR once
scalable vector support had matured. With the mitigations no longer
required and the combines mostly trivial I have moved the logic into
CGBuiltins, which allows the existing tests to remain unchanged once
they stop using instcombine.
The optimisations include:
* Use shifts in place of multiplies by power-of-two values.
* Don't emit getelementptrs when the offset is zero.
* Use IR-based vector splats rather than calls to dup_x.
* Use IR-based vector selects rather than calls to sel.
* Use i64-based indices for insertelement.
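The first item in the list above can be sketched in plain C++. This shows the arithmetic only, not the IRBuilder code in CGBuiltin.cpp, and `scaleIndex` and its helpers are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// True when v has exactly one bit set.
inline bool isPowerOfTwo(std::uint64_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Shift amount equivalent to multiplying by v; v must be a power of two.
inline unsigned log2OfPowerOfTwo(std::uint64_t v) {
  assert(isPowerOfTwo(v) && "expected a power-of-two value");
  unsigned n = 0;
  while ((v & 1) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Hypothetical helper: scales an element index to a byte offset, using a
// shift instead of a multiply whenever the element size allows it.
inline std::uint64_t scaleIndex(std::uint64_t index, std::uint64_t elemBytes) {
  if (elemBytes == 1)
    return index; // nothing to scale
  if (isPowerOfTwo(elemBytes))
    return index << log2OfPowerOfTwo(elemBytes);
  return index * elemBytes; // general case keeps the multiply
}
```

Emitting the shift directly means the generated IR already has the form that instcombine would otherwise have produced.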
The test changes are the result of "sed -i -e 's/instcombine,//'",
with the exception of acle_sve_dupq.c which required regeneration
due to its previous reliance on a zext->trunc->zext combine.
The following tests still rely on instcombine because they require
changes beyond CGBuiltin.cpp:
CodeGen/aarch64-sve-intrinsics/acle_sve_clasta.c
CodeGen/aarch64-sve-intrinsics/acle_sve_clastb.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c
CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c
CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_dup.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sb.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sh.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sw.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ub.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1uh.c
CodeGen/aarch64-sve-intrinsics/acle_sve_ld1uw.c
CodeGen/aarch64-sve-intrinsics/acle_sve_len-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_len.c
CodeGen/aarch64-sve-intrinsics/acle_sve_rdffr.c
CodeGen/aarch64-sve-intrinsics/acle_sve_sel-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_sel.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1-bfloat.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1b.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1h.c
CodeGen/aarch64-sve-intrinsics/acle_sve_st1w.c
Tests within aarch64-sve2-intrinsics don't use opt but instead use
-O1 to clean up their output. These tests remain unchanged and will
be visited by a later patch.
Depends on D140983
Differential Revision: https://reviews.llvm.org/D141056
To preserve the previous semantics after D141386, adjust places
that currently emit !range metadata to also emit !noundef metadata.
This retains range violation as immediate undefined behavior,
rather than just poison.
Differential Revision: https://reviews.llvm.org/D141494
The CACOP instruction is mainly used for cache initialization
and cache-consistency maintenance.
Depends on D140872
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D140527
Instruction formats:
`movgr2fcsr fcsr, rj`
`movfcsr2gr rd, fcsr`
MOVGR2FCSR modifies the value of the software writable field
corresponding to the FCSR (floating-point control and status
register) `fcsr` according to the value of the lower 32 bits of
the GR (general purpose register) `rj`.
MOVFCSR2GR sign extends the 32-bit value of the FCSR `fcsr`
and writes it into the GR `rd`.
Add the "i32 @llvm.loongarch.movfcsr2gr(i32)" intrinsic for the MOVFCSR2GR
instruction. The argument is the FCSR register number. The return value
is the value in the FCSR.
Add the "void @llvm.loongarch.movgr2fcsr(i32, i32)" intrinsic for the MOVGR2FCSR
instruction. The first argument is the FCSR number, and the second argument
is the value in the GR.
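The semantics described above can be modelled with a toy sketch. `ToyFcsr` and its `writableMask` are hypothetical; the real FCSR field layout is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not LoongArch code: one FCSR with a caller-supplied mask of
// software-writable bits (an assumption made for illustration).
struct ToyFcsr {
  std::uint32_t value;
  std::uint32_t writableMask;
};

// MOVGR2FCSR: only the low 32 bits of the GR are used, and only the
// writable bits of the FCSR change.
inline void movgr2fcsr(ToyFcsr &fcsr, std::uint64_t rj) {
  const std::uint32_t low = static_cast<std::uint32_t>(rj);
  fcsr.value = (fcsr.value & ~fcsr.writableMask) | (low & fcsr.writableMask);
}

// MOVFCSR2GR: sign-extend the 32-bit FCSR value into a 64-bit GR.
inline std::uint64_t movfcsr2gr(const ToyFcsr &fcsr) {
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(fcsr.value)));
}
```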
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D140685
Use explicit _w32/_w64 suffixes for the wave size to be consistent
with the other existing wave-dependent intrinsics. Also start
diagnosing attempts to use both wave32 and wave64.
I would have preferred to avoid the +wavefrontsize64 spam on targets
where that's the only option, but avoiding this seems to be more work
than I expected.
This avoids recomputing string length that is already known at compile time.
It has a slight impact on preprocessing / compile time, see
https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u
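A minimal sketch of the general technique, assuming a table of string literals. `literalView`, `BuiltinEntry` and `kBuiltins` are hypothetical names, not the actual Clang data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: the array bound N (which counts the terminating NUL)
// already gives the length, so no strlen call is needed at run time.
template <std::size_t N>
constexpr std::string_view literalView(const char (&str)[N]) {
  return std::string_view(str, N - 1);
}

// Hypothetical table entry, loosely modelled on a builtin-name table.
struct BuiltinEntry {
  std::string_view name;
};

// File-scope static table: every name's length is a compile-time constant.
static constexpr BuiltinEntry kBuiltins[] = {
    {literalView("__builtin_example_a")},
    {literalView("__builtin_example_bc")},
};

static_assert(kBuiltins[0].name.size() == 19, "length known at compile time");
```

Storing the length next to the pointer trades a few bytes per entry for not walking each string whenever the table is consulted.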
This is a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.
The above patch set caused some versions of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make the BuiltinInfo tables compilation-unit static variables instead of private static member variables.
Differential Revision: https://reviews.llvm.org/D139881
The naming here is strange since the value may still be updated.
Reviewed By: kito-cheng, khchen
Differential Revision: https://reviews.llvm.org/D140389
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4 (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"
GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d
The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.
This reverts commit caa713559bd38f337d7d35de35686775e8fb5175.
This reverts commit 06b90e2e9c991e211fecc97948e533320a825470.
This reverts commit e953ae5bbc313fd0cc980ce021d487e5b5199ea4.
@arsenm raised a good point that we should guard this with a flag.
I found that it is not a problem as long as the user uses intrinsics only: https://godbolt.org/z/WoYsqqjh3
Still, it is nice to have.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D140467
The RVV intrinsic definitions generate riscv_vector_builtin_cg.inc, which CGBuiltin.cpp uses to produce the LLVM IR for each RVV intrinsic.
At this stage, riscv_vector.td contains a number of manual codegen C++ snippets that tell CGBuiltin how to handle these instructions.
This patch merges the masked and unmasked RVV manual codegen to reduce the amount of manual codegen and to make adding further policies easier in the future.
This is a clean-up that does not affect RVV intrinsic functionality.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D140361
This reverts commit e43924a75145d2f9e722f74b673145c3e62bfd07.
Reason: Patch broke the MSan buildbots. More information is available on
the original phabricator review: https://reviews.llvm.org/D127812
Address the inconsistency between FLT_ROUNDS_ and SET_ROUNDING SDAG
node. Rename FLT_ROUNDS_ to GET_ROUNDING and add llvm.get.rounding
intrinsic to replace flt.rounds.
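For orientation, a sketch of the value encoding that FLT_ROUNDS uses in the C standard, assuming the renamed intrinsic keeps the same encoding as flt.rounds. `fltRoundsFromFenv` and `getRounding` are hypothetical helpers built on `<cfenv>`, not LLVM code.

```cpp
#include <cassert>
#include <cfenv>

// Maps a <cfenv> rounding-mode macro to the value the C standard assigns to
// FLT_ROUNDS: 0 toward zero, 1 to nearest, 2 toward +infinity,
// 3 toward -infinity, -1 indeterminable.
inline int fltRoundsFromFenv(int fenvMode) {
  switch (fenvMode) {
  case FE_TOWARDZERO:
    return 0;
  case FE_TONEAREST:
    return 1;
  case FE_UPWARD:
    return 2;
  case FE_DOWNWARD:
    return 3;
  default:
    return -1;
  }
}

// Reads the current dynamic rounding mode in the FLT_ROUNDS encoding.
inline int getRounding() { return fltRoundsFromFenv(std::fegetround()); }
```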
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D139507
This change:
- Modifies the ACLE code to allow the new SLC value (3) for the prefetch
target.
- Introduces a new intrinsic, @llvm.aarch64.prefetch which matches the
PRFM family instructions much more closely, and can represent all
values for the PRFM immediate.
The target-independent @llvm.prefetch intrinsic does not have enough
information for us to be able to lower to it from the ACLE intrinsics
correctly.
- Lowers the ACLE calls to the new intrinsic on aarch64 (the ARM
lowering is unchanged).
- Implements code generation for the new intrinsic in both SelectionDAG
and GlobalISel. We specifically choose to continue to support lowering
the target-independent @llvm.prefetch intrinsic so that other
frontends can continue to use it.
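To show why a richer intrinsic is needed, here is a sketch of how the PRFM immediate packs its fields, as I understand the layout: type in bits [4:3], target in bits [2:1], policy in bit 0. The enum and function names are hypothetical and do not come from this patch.

```cpp
#include <cassert>
#include <cstdint>

enum class PrefetchType : std::uint8_t { PLD = 0, PLI = 1, PST = 2 };
enum class PrefetchTarget : std::uint8_t { L1 = 0, L2 = 1, L3 = 2, SLC = 3 };
enum class PrefetchPolicy : std::uint8_t { KEEP = 0, STRM = 1 };

// Packs the three fields into a 5-bit prfop value (assumed layout:
// type << 3 | target << 1 | policy).
constexpr std::uint8_t encodePrfop(PrefetchType type, PrefetchTarget target,
                                   PrefetchPolicy policy) {
  return static_cast<std::uint8_t>((static_cast<unsigned>(type) << 3) |
                                   (static_cast<unsigned>(target) << 1) |
                                   static_cast<unsigned>(policy));
}
```

Under this layout the new SLC target simply occupies the previously unused target value 3.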
Differential Revision: https://reviews.llvm.org/D139443
The global constant arguments could be in a different address space
than the first argument, so we have to add another overloaded argument.
This patch was originally made for CHERI LLVM (where globals can be in
address space 200), but it also appears to be useful for in-tree targets
as can be seen from the test diffs.
Differential Revision: https://reviews.llvm.org/D138722
We've exploited test data class instructions introduced in ISA 3.0.
This change unifies the scalar intrinsics into ppc_test_data_class
and adds support for 128-bit precision float values using xststdcqp.
Vector versions of the intrinsic can't be unified because they return
vector int instead of int.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D138105
This only contains the SelectionDAG implementation. GlobalISel to
follow.
The broad approach is:
- Introduce new builtins for 128-bit wide instructions.
- Lower these to @llvm.read_register.i128/@llvm.write_register.i128
- Introduce target-specific ISD nodes which have legal operands (two
i64s rather than an i128). These are named AArch64::{MRRS, MSRR} to
match the instructions they are for. These are a little complex as
they need to match the "shape" of what they're replacing or the
legaliser complains.
- Select these using the existing tryReadRegister/tryWriteRegister to
share the MDString parsing code, and introduce additional code to
ensure these are selected into the right MRRS/MSRR instructions. What
makes this hard is ensuring that the two i64s end up in an XSeqPair
register pair, because SelectionDAG doesn't care that much about
register classes if it can avoid doing so.
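The "two i64s rather than an i128" step in the list above amounts to splitting a 128-bit value into halves and rejoining them. The portable sketch below models the 128-bit value as a little-endian byte image; `Bytes128`, `RegPair`, `split` and `join` are hypothetical names, and which half lands in which register of the pair is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Little-endian byte image of a 128-bit value (portable stand-in for i128).
using Bytes128 = std::array<std::uint8_t, 16>;

// Hypothetical stand-in for the pair of 64-bit registers.
struct RegPair {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Splits the 128-bit value into two 64-bit halves.
inline RegPair split(const Bytes128 &v) {
  RegPair p{0, 0};
  for (int i = 0; i < 8; ++i) {
    p.lo |= static_cast<std::uint64_t>(v[i]) << (8 * i);
    p.hi |= static_cast<std::uint64_t>(v[8 + i]) << (8 * i);
  }
  return p;
}

// Rebuilds the 128-bit value from the two halves.
inline Bytes128 join(const RegPair &p) {
  Bytes128 v{};
  for (int i = 0; i < 8; ++i) {
    v[i] = static_cast<std::uint8_t>(p.lo >> (8 * i));
    v[8 + i] = static_cast<std::uint8_t>(p.hi >> (8 * i));
  }
  return v;
}
```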
The main change to existing code is the reorganisation of
tryReadRegister and tryWriteRegister to try to keep the string parsing
code separate from the instruction creating code.
This also includes the changes to clang to define and use the ACLE
feature macro named `__ARM_FEATURE_SYSREG128`.
Contributors:
Sam Elliott
Lucas Prates
Differential Revision: https://reviews.llvm.org/D139086
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
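A small sketch of what code looks like after the replacement. `lookupWidth` is a hypothetical function, not one touched by this patch.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical lookup: returns a bit width for a known type name, or an
// empty optional otherwise.
inline std::optional<int> lookupWidth(std::string_view name) {
  if (name == "i32")
    return 32;
  if (name == "i64")
    return 64;
  return std::nullopt; // previously spelled `return None;`
}
```

Because `std::nullopt` is the standard spelling of the empty state, a return like this keeps compiling once the function's return type moves from `llvm::Optional` to `std::optional`.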
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716