77363 Commits

Author SHA1 Message Date
Craig Topper
999b9e6ddb [RISCV] Use vector getConstant instead of getSplatVector+getConstant. NFC 2024-04-10 19:39:41 -07:00
Jianjian Guan
fd50151180
[RISCV] Only support SPLAT_VECTOR for Zvfhmin when also enable the scalar extension of half fp (#88275) 2024-04-11 10:23:26 +08:00
Freddy Ye
f4509cf284
[X86][MC] Support enc/dec for SETZUCC and promoted SETCC. (#86473)
apx-spec: https://cdrdv2.intel.com/v1/dl/getContent/784266
apx-syntax-recommendation:
https://cdrdv2.intel.com/v1/dl/getContent/817241
2024-04-11 10:18:29 +08:00
Matthias Braun
acb7ddc5cf
[WebAssembly] Remove threadlocal.address when disabling TLS (#88209)
Remove `llvm.threadlocal.address` intrinsic usage when disabling TLS.
This fixes errors revealed by the stricter IR verification introduced in
PR #87841.
2024-04-10 16:24:02 -07:00
Stanislav Mekhanoshin
2fdfea088c
[AMDGPU] Add v2i32 to the VS_64 types. NFCI. (#88318)
I am trying to use VOP3Inst with intrinsic taking v2i32 operand and it
fails to create patterm without it.
2024-04-10 14:50:54 -07:00
Farzon Lotfi
05093e2438
[Spirv][HLSL] Add OpAll lowering and float vec support (#87952)
The main point of this change was to add support for HLSL's all
intrinsic.
In the process of doing that I found a few issues around creating an
`OpConstantComposite` via `buildZerosVal`.

First the current code didn't support floats so the process of adding
`buildZerosValF` meant I needed a
float version of `getOrCreateIntConstVector`. After doing so I renamed
both versions to `getOrCreateConstVector`. That meant I needed to create
a float type version of `getOrCreateIntCompositeOrNull`. Luckily the
type information was low for this function so was able to split it out
into a helpwe and rename `getOrCreateIntCompositeOrNull` to
`getOrCreateCompositeOrNull` With the exception of type handling
differences of the code and Null vs 0 Constant Op codes these functions
should be identical.

To handle scalar floats I could not use `buildConstantFP` like this PR
did:
https://github.com/llvm/llvm-project/commit/0a2aaab5aba46#diff-733a189c5a8c3211f3a04fd6e719952a3fa231eadd8a7f11e6ecf1e584d57411R1603
because that would create too many superfluous registers (that causes
problems in the validator), I had to create a float version of
`getOrCreateConstInt` which I called `getOrCreateConstFP`.
similar problems with doing it like this:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/SPIRV/SPIRVBuiltins.cpp#L1540.

`buildZerosValF` also has a use of a function `getZeroFP`. This is
because half, float, and double scalar values of 0 would collide in
`SPIRVDuplicatesTracker<Constant> CT` if you use `APFloat(0.0f)`.

`getORCreateConstFP` needed its own version of `getOrCreateConstIntReg`
which I called `getOrCreateConstFloatReg` The one difference in this
function is `getOrCreateConstFloatReg` returns a bit width so we don't
have to call `getScalarOrVectorBitWidth` twice ie when it is used again
in `getOrCreateConstFP` for `OpConstantF` `addNumImm`.

`getOrCreateConstFloatReg` needed an `assignFloatTypeToVReg` helper
which called a `getOrCreateSPIRVFloatType` helper. There was no
equivalent IntegerType::get for floats so I handled this with a switch
statement on bit widths to get the right LLVM float type.

Finally, there is the use of `bool ZeroAsNull = STI.isOpenCLEnv();` This
is partly a cosmetic change. When Zeros are treated as nulls, we don't
create `OpConstantComposite` vectors which is something we do in the
DXCs SPIRV backend. The DXC SPIRV backend also does not use
`OpConstantNull`. Finally, I needed a means to test the behavior of the
OpConstantNull and `OpConstantComposite` changes and this was one way I
could do that via the same tests.
2024-04-10 16:27:44 -04:00
shamithoke
e3ef4612c1
Perform bitreverse using AVX512 GFNI for i32 and i64. (#81764)
Currently, the lowering operation for bitreverse using Intel AVX512 GFNI only supports byte vectors

Extend the operation to i32 and i64.

---------

Co-authored-by: shami <shami_thoke@yahoo.com>
2024-04-10 20:22:44 +01:00
Jun Wang
86842e1f72
[AMDGPU] New clang option for emitting a waitcnt instruction after each memory instruction (#79236)
This patch introduces a new command-line option for clang, namely,
amdgpu-precise-mem-op (or precise-memory in the backend). When this option is specified, a waitcnt
instruction is generated after each memory load/store instruction. The
counter values are always 0, but which counters are involved depends on
the memory instruction.

---------

Co-authored-by: Jun Wang <jun.wang7@amd.com>
2024-04-10 10:47:04 -07:00
Craig Topper
f27f369710
[RISCV] Remove interrupt handler special case from RISCVFrameLowering::determineCalleeSaves. (#88069)
This code was trying to save temporary argument registers in interrupt
handler functions that contain calls. With the exception that all FP
registers are saved including the normally callee saved registers.

If all of the callees use an FP ABI and the interrupt handler doesn't
touch the normally callee saved FP registers, we don't need to save
them.

It doesn't appear that we need to special case functions with calls. The
normal callee saved register handling will already check each of the calls
and consider a register clobbered if the call doesn't explicitly say it is preserved.

All of the test changes are from the removal of the FP callee saved
registers. There are tests for interrupt handlers with F and D extension
that use ilp32 or lp64 ABIs that are not affected by this change. They
still save the FP callee saved registers as they should.

gcc appears to have a bug where the D extension being enabled with the
ilp32f or lp64f ABI does not save the FP callee saved regs. The callee
would only save/restore the lower 32 bits and clobber the upper bits.
LLVM saves the FP callee saved regs in this case and there is an
unchanged test for it.

The unnecessary save/restore was raised in this thread
https://discourse.llvm.org/t/has-bugs-when-optimizing-save-restore-csrs-by-changing-csr-xlen-f32-interrupt/78200/1
2024-04-10 10:28:54 -07:00
Vyacheslav Levytskyy
335d5d5f47
[SPIRV] Tweak parsing of base type name in builtins (#88255)
This PR is a small improvement of parsing of base type name in builtins,
allowing to understand `unsigned ...` types. The test case that fails
without the fix is attached.
2024-04-10 19:04:31 +02:00
Craig Topper
323d3ab257
[RISCV] Optimize undef Even vector in getWideningInterleave. (#88221)
We recently optimized the code when the Odd vector was undef to fix a
poison bug.

There are additional optimizations we can do if the even vector is
undef. With Zvbb, we can use a single vwsll. Without Zvbb, we can use a
vzext.vf2 and a vsll.
2024-04-10 09:08:50 -07:00
Craig Topper
7f1b9adfc8
[RISCV] Add MachineCombiner to fold (sh3add Z, (add X, (slli Y, 6))) -> (sh3add (sh3add Y, Z), X). (#87884)
This improves a pattern that occurs in 531.deepsjeng_r. Reducing the
dynamic instruction count by 0.5%.

This may be possible to improve in SelectionDAG, but given the special
cases around shXadd formation, it's not obvious it can be done in a
robust way without adding multiple special cases.

I've used a GEP with 2 indices because that mostly closely resembles the
motivating case. Most of the test cases are the simplest GEP case. One
test has a logical right shift on an index which is closer to the
deepsjeng code. This requires special handling in isel to reverse a
DAGCombiner canonicalization that turns a pair of shifts into (srl (and
X, C1), C2).
2024-04-10 08:39:56 -07:00
Philip Reames
5ae9ffbd18 [RISCV] Address review comment from 88062
As pointed out by Fraser, KillSrcReg is always false at this point in
code, and having the inconcistency on whether we check the flag between
the if and else blocks is confusing.
2024-04-10 07:21:41 -07:00
Dinar Temirbulatov
990c4bc95f
[AArch64][SVE2] Generate SVE2 BSL instruction in LLVM for bit-twiddling. (#83514)
Allow to fold or/and-and to BSL instuction for scalable vectors.
2024-04-10 11:07:59 +01:00
hev
0d17e1f0e5
[LoongArch] Revert sp adjustment in prologue (#88110)
After commit 18c5f3c3 ("[RegisterScavenger][RISCV] Don't search for
FrameSetup instrs if we were searching from Non-FrameSetup instrs"), we
can revert the `sp` adjustment 4e2364a2 ("[LoongArch] Add emergency
spill slot for GPR for large frames") to generate better code, as the
issue with `RegScavenger` has been resolved.

Fixes #88109
2024-04-10 17:13:25 +08:00
Chia
469caa31e7
[RISCV] Use vwadd.vx for splat vector with extension (#87249)
This patch allows `combineBinOp_VLToVWBinOp_VL` to handle patterns like
`(splat_vector (sext op))` or `(splat_vector (zext op))`. Then we can
use `vwadd.vx` and `vwadd.w` for such a case.

### Source code
```
define <vscale x 8 x i64> @vwadd_vx_splat_sext(<vscale x 8 x i32> %va, i32 %b) {
     %sb = sext i32 %b to i64
     %head = insertelement <vscale x 8 x i64> poison, i64 %sb, i32 0
     %splat = shufflevector <vscale x 8 x i64> %head, <vscale x 8 x i64> poison, <vscale x 8 x i32> zeroinitializer
     %vc = sext <vscale x 8 x i32> %va to <vscale x 8 x i64>
     %ve = add <vscale x 8 x i64> %vc, %splat
     ret <vscale x 8 x i64> %ve
}
```

### Before this patch
[Compiler Explorer](https://godbolt.org/z/sq191PsT4)
```
vwadd_vx_splat_sext:
  sext.w a0, a0
  vsetvli a1, zero, e64, m8, ta, ma
  vmv.v.x v16, a0
  vsetvli zero, zero, e32, m4, ta, ma
  vwadd.wv v16, v16, v8
  vmv8r.v v8, v16
  ret
```
### After this patch
```
vwadd_vx_splat_sext
  vsetvli a1, zero, e32, m4, ta, ma
  vwadd.vx v16, v8, a0
  vmv8r.v v8, v16
  ret
```
2024-04-10 15:26:17 +09:00
Shih-Po Hung
3d985a6f1b
[RISCV][TTI] Scale the cost of Select with LMUL (#88098)
Use the Val type to estimate the instruction cost for SelectInst.
2024-04-10 14:18:15 +08:00
Noah Goldstein
6c40d463c2 [X86] Use nneg flag when trying to convert uitofp -> sitofp
Closes #86694
2024-04-09 23:06:55 -05:00
Pengcheng Wang
8dc006ea40
[RISCV] Make EmitToStreamer return whether Inst is compressed
This is helpful to reduce calls of `RISCVRVC::compress` in #77337.

Reviewers: asb, lukel97, topperc

Reviewed By: topperc

Pull Request: https://github.com/llvm/llvm-project/pull/88120
2024-04-10 11:02:55 +08:00
Shih-Po Hung
ee52add6cb
[RISCV][TTI] Implement cost of intrinsic active_lane_mask (#87931)
This patch uses the argument type to infer the LMUL cost for the index
generation, add, and comparison.
2024-04-10 10:08:33 +08:00
Dinar Temirbulatov
528943f153
[AArch64][SME] Allow memory operations lowering to custom SME functions. (#79263)
This change allows to lower memcpy, memset, memmove to custom SME
version provided by LibRT.
2024-04-09 17:27:46 +01:00
David Green
f0e79d9152 [AArch64] Add a cost for identity shuffles.
These are mostly handled at a higher level when costing shuffles, but some
masks can end up being identity or concat masks which we can treat as free.
2024-04-09 17:16:14 +01:00
Peter Lafreniere
614a578034
[M68k] Add support for bitwise NOT instruction (#88049)
Currently the bitwise NOT instruction is not recognized. Add support for
using NOT on data registers. This is a partial implementation that puts
NOT at the same level of support as NEG currently enjoys.

Using not rather than eori cuts the length of the encoded instruction
in half or in thirds, leading to a reduction of 4-10 cycles per
instruction, on the original 68000.

This change includes tests for both bitwise and arithmetic negation.
2024-04-09 09:07:26 -07:00
David Green
4ac2721e51
[AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (#87934)
This tries to add some costs for the shuffle in a ST3/ST4 instruction,
which are represented in LLVM IR as store(interleaving shuffle). In
order to detect the store, it needs to add a CxtI context instruction to
check the users of the shuffle. LD3 and LD4 are added, LD2 should be a
zip1 shuffle, which will be added in another patch.

It should help fix some of the regressions from #87510.
2024-04-09 16:36:08 +01:00
Sam Tebbs
fb8dbd1fb6
[AArch64] Remove copy in SVE/SME predicate spill and fill (#81716)
7dc20ab introduced an extra COPY when spilling and filling a PNR
register, which can't be elided as the input (PNR predicate) and output
(PPR predicate) register classes differ. The patch adds a new register
class that covers both PPR and PNR so that STR_PXI and LDR_PXI can
take either of them, removing the need for the copy.
2024-04-09 16:17:27 +01:00
Philip Reames
e47fd09f8e
[RISCV] Use shNadd for scalable stack offsets (#88062)
If we need to multiply VLENB by 2, 4, or 8 and add it to the stack
pointer, we can do so with a shNadd instead of separate shift and add
instructions.
2024-04-09 07:29:10 -07:00
Vyacheslav Levytskyy
23b058cb7f
[SPIR-V] Re-implement switch and improve validation of forward calls (#87823)
This PR fixes issue https://github.com/llvm/llvm-project/issues/87763
and preserves valid CFG in cases when previous scheme failed to generate
valid code for a switch statement. The PR hardens one existing test case
and adds one more test case as a validation of a new switch generation.
Tests are passing spirv-val now.

This PR also improves validation of forward calls.
2024-04-09 16:15:44 +02:00
Natalie Chouinard
1e44d9ac5e
[SPIR-V] Map llvm.{min,max}num to GL::N{Min,Max} (#88009)
SPIR-V intsruction selection was mapping the LLVM float min/max
intrinsics to FMin and FMax respectively for GL/Vulkan environments,
which does not match the intrinsics' documented treatment of NaN
operands. This patch switches the mapping to the correctly matched NMin
and NMax operations.

Fixes #87072
2024-04-09 09:41:47 -04:00
Simon Pilgrim
4023329bbf [X86] collectConcatOps - add ability to recurse through insert_subvector chains
Allows us to match insert_subvector(insert_subvector(undef, insert_subvector(insert_subvector(undef, x, 0), y, 1), 0), 0),
                                    insert_subvector(insert_subvector(undef, z, 0), w, 1), 2)
2024-04-09 13:23:44 +01:00
Paul Walker
8795822f6a
[NFC][LLVM][CodeGen] Refactor SVE unpredicated binop isel patterns. (#84045)
Create PatFrags for fadd, fmul, fsub, mul, smulh and umulh so that a
single set of patterns can be used. Patch then removes unused classes
and some redundant whitespace.
2024-04-09 12:19:53 +01:00
Simon Pilgrim
0bbe953aa3 [X86] Fold extract_subvector(cvtps2dq(x),c) -> cvtps2dq(extract_subvector(x,c))
Help unblock #83402
2024-04-09 11:06:18 +01:00
Jay Foad
9c58f3a234
[AMDGPU] Fix implicit $vcc operands after parsing MIR (#87781)
MIParser checks that implicit operands match the instruction definition,
so they have to be $vcc even in wave32 mode. Use the mirFileLoaded hook
to fix them after MIParser's checks, converting them to $vcc_lo which is
what that rest of CodeGen expects.

This is all just extending the fixImplicitOperands hack which was
introduced with GFX10, but at least it makes it possible to write a MIR
test which creates the same instructions that normal CodeGen would
generate.
2024-04-09 09:10:45 +01:00
Luke Lau
9c660362c4
[RISCV] Support vwsll in combineBinOp_VLToVWBinOp_VL (#87620)
If the subtarget has +zvbb then we can attempt folding shl and shl_vl to
vwsll nodes.

There are few test cases where we still don't pick up the vwsll:
- For fixed vector vwsll.vi on RV32, see the FIXME for VMV_V_X_VL in
fillUpExtensionSupport for support implicit sign extension
- For scalable vector vwsll.vi we need to support ISD::SPLAT_VECTOR, see
#87249
2024-04-09 16:10:35 +08:00
Luke Lau
0f20b9b92f
[RISCV] Don't require mask or VL to be the same in combineBinOp_VLToVWBinOp_VL (#87997)
In NodeExtensionHelper we keep track of the VL and mask of the operand
being extended and check that they are the same as the root node's.
However for the nodes that we support, none of them have a passthru
operand with the exception of RISCV::VMV_V_X_VL, but we check that it's
passthru is undef anyway.

So it's safe to just discard the extend node's VL and mask and just use
the root's instead. (This is the same type of reasoning we use to treat
any vmset_vl as an all ones mask)

This allows us to match some more cases where we mix VP/non-VP/VL nodes,
but these don't seem to appear in practice. The main benefit from this
would be to simplify the code.
2024-04-09 16:04:10 +08:00
Alexandre Ganea
ec1af63dde
[Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm (#87639)
This fixes an edge case where functions starting with inline assembly
would assert while trying to lower that inline asm instruction.

After this PR, for now we always add a no-op (xchgw in this case) without
considering the size of the next inline asm instruction. We might want
to revisit this in the future.

This fixes Unreal Engine 5.3.2 compilation with clang-cl and /HOTPATCH.

Should close https://github.com/llvm/llvm-project/issues/56234
2024-04-08 20:02:19 -04:00
Arthur Eubanks
922700df44 Revert "[X86] Change how we treat functions with explicit sections as small/large (#87838)"
This reverts commit e27c3736f975ca463476223c465e4777186f603f.

Breaks ExecutionEngine/MCJIT/test-global-ctors.ll on windows, e.g. https://lab.llvm.org/buildbot/#/builders/117/builds/18749.
2024-04-08 23:00:01 +00:00
Arthur Eubanks
e27c3736f9
[X86] Change how we treat functions with explicit sections as small/large (#87838)
Following #78348, we should treat functions with an explicit section as
small, unless the section name is (or has the prefix) ".ltext".

Clang emits global initializers into a ".text.startup" section on Linux.
If we mix small/medium code model object files with large code model
object files, we'll end up mixing sections with and without the large
section flag.
2024-04-08 15:40:19 -07:00
Eli Friedman
7ad481e76c
Revert "[AArch64] Add support for -ffixed-x30" (#88019)
This reverts commit e770153865c53c4fd72a68f23acff33c24e42a08.

This wasn't reviewed, and the functionality in question was
intentionally rejected the last time it was discussed in
https://reviews.llvm.org/D56305 .
2024-04-08 15:16:00 -07:00
Philip Reames
eb26edbbf8
[RISCV] Exploit sh3add/sh2add for stack offsets by shifted 12-bit constants (#87950)
If we're falling back to generic constant formation in a register +
add/sub, we can check if we have a constant which is 12-bits but left
shifted by 2 or 3. If so, we can use a sh2add or sh3add to perform the
shift and add in a single instruction.

This is profitable when the unshifted constant would require two
instructions (LUI/ADDI) to form, but is never harmful since we're going
to need at least two instructions regardless of the constant value.

Since stacks are aligned to 16 bytes by default, sh3add allows addresing
(aligned) data out to 2^14 (i.e. 16kb) in at most two instructions
w/zba.
2024-04-08 14:53:21 -07:00
Philip Reames
39f6d015dd
[RISCV] Eliminate getVLENFactoredAmount and expose muladd [nfc] (#87881)
This restructures the code to make the fact that most of
getVLENFactoredAmount is just a generic multiply w/immediate more
obvious and prepare for a couple of upcoming enhancements to this code.

Note that I plan to switch mulImm to early return, but decided I'd do
that as a separate commit to keep this diff readable.

---------

Co-authored-by: Luke Lau <luke_lau@icloud.com>
2024-04-08 10:24:27 -07:00
Min-Yih Hsu
f6315a9572
[AArch64][LoopIdiom] Disable LoopIdiomTransform when NoImplicitFloat is present (#87677)
This behavior is aligned with both LoopVectorizer and SLPVectorizer.
2024-04-08 09:10:23 -07:00
Luke Lau
8b3b4a92ad
[RISCV] Fix canFoldToVWWithSameExtension allowing different FP extensions (#87978) 2024-04-08 19:20:36 +08:00
Simon Pilgrim
170c525d79 [X86] combineExtractVectorElt - fold extract(trunc(x),c) -> trunc(extract(x,c)) 2024-04-08 11:01:19 +01:00
Pengcheng Wang
364028a1a5
[RISCV] Zimop/Zcmop are ratified
Remove them from experimental.

See also:
https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc

Reviewers: kito-cheng

Reviewed By: kito-cheng

Pull Request: https://github.com/llvm/llvm-project/pull/87966
2024-04-08 16:40:02 +08:00
David Green
ac321cbb03
[AArch64][GlobalISel] Legalize Insert vector element (#81453)
This attempts to standardize and extend some of the insert vector
element lowering. Most notably:
- More types are handled by splitting illegal vectors.
- The index type for G_INSERT_VECTOR_ELT is canonicalized to
  TLI.getVectorIdxTy(), similar to extact_vector_element.
- Some of the existing patterns now have the index type specified to
  make sure they can apply to GISel too.
- The C++ selection code has been removed, relying on tablegen patterns.
- G_INSERT_VECTOR_ELT with small GPR input elements are pre-selected to
  use a i32 type, allowing the existing patterns to apply.
- Variable index inserts are lowered in post-legalizer lowering,
  expanding into a stack store and reload.
2024-04-08 08:44:13 +01:00
Pengcheng Wang
73ddb2a747
[RISCV] Store VLMul/NF into RegisterClass's TSFlags
This TSFlags was introduced by https://reviews.llvm.org/D108767.

A base class of all RISCV RegisterClass is added and we store
IsVRegClass/VLMul/NF into TSFlags and add helpers to get them.

This can reduce some lines and I think there will be more usages.

Reviewers: preames, topperc

Reviewed By: topperc

Pull Request: https://github.com/llvm/llvm-project/pull/84894
2024-04-08 13:35:37 +08:00
Pengcheng Wang
f3b5597364
[RISCV] Use larger copies when register tuples are aligned
When the encoding of register tuples are aligned, we can use a copy
with larger LMUL to reduce copies.

Reviewers: preames, topperc, lukel97

Reviewed By: topperc, lukel97

Pull Request: https://github.com/llvm/llvm-project/pull/84455
2024-04-08 13:24:57 +08:00
Jianjian Guan
bc8726b16b
[RISCV] Support codegen of vfmv.v.f for bfloat vector with both Zvfbfmin and Zfbfmin (#87318)
vfmv, vfmerge should support bfloat vector when we have both Zvfbfmin
and Zfbfmin, this patch tries to support vfmv first.
2024-04-07 10:41:47 +08:00
AtariDreams
8389b3bf60
[X86] Fix typo: QWORD alignment is greater than or equal to 8, not greater than 8 (#87819)
Align(8) is QWORD aligned, but this was checking to see if alignment was
greater than that, when it should have been checking for being greater
than OR EQUAL to Align(8).

This bug was introduced in
https://github.com/llvm/llvm-project/commit/6a6af30d433d7 during the
transition to the Align type.
2024-04-07 08:43:13 +08:00
Sizov Nikita
d38bff460a
[AArch64] SimplifyDemandedBitsForTargetNode - add AArch64ISD::BICi handling (#76644)
Fold BICi if all destination bits are already known to be zeroes

```llvm
define <8 x i16> @haddu_known(<8 x i8> %a0, <8 x i8> %a1) {
  %x0 = zext <8 x i8> %a0 to <8 x i16>
  %x1 = zext <8 x i8> %a1 to <8 x i16>
  %hadd = call <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16> %x0, <8 x i16> %x1)
  %res = and <8 x i16> %hadd, <i16 511, i16 511, i16 511, i16 511,i16 511, i16 511, i16 511, i16 511>
  ret <8 x i16> %res
}
declare <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16>, <8 x i16>)
```

```
haddu_known:                            // @haddu_known
        ushll   v0.8h, v0.8b, #0
        ushll   v1.8h, v1.8b, #0
        uhadd   v0.8h, v0.8h, v1.8h
        bic     v0.8h, #254, lsl #8 <-- this one will be removed as we know high bits are zero extended
        ret
```

Fixes #53881
Fixes #53622
2024-04-06 21:41:24 +01:00