2326 Commits

Author SHA1 Message Date
Craig Topper
efa859cd3e [ARM] Use SelectionDAG::getSignedConstant. 2024-08-17 18:02:41 -07:00
Simon Pilgrim
11ba72e651
[KnownBits] Add KnownBits::add and KnownBits::sub helper wrappers. (#99468) 2024-08-12 10:21:28 +01:00
Kazu Hirata
f4fb735840
[llvm] Construct SmallVector<SDValue> with ArrayRef (NFC) (#102578) 2024-08-09 09:15:42 -07:00
Oliver Stannard
50a2b31800
[ARM] Be more precise about conditions for indirect tail-calls (#102451)
This code was trying to predict the conditions in which an indirect
tail call will have a free register to hold the target address, and
falling back to a non-tail call if all non-callee-saved registers are
used for arguments or return address authentication.

However, it was only taking the number of arguments into account, not
which registers they are allocated to, so floating-point arguments could
cause this to give the wrong result, causing either a later error due to
the lack of a free register, or a missed optimisation of not doing the
tail call.

The assignment of arguments to registers is available at this point in
the code, so we can calculate exactly which registers will be available
for the tail-call.
2024-08-09 08:50:21 +01:00
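The register check described in the commit above can be pictured with a small standalone sketch. It assumes the candidate registers for the call target are r0-r3 plus r12, with r12 unavailable when return address authentication needs it; the `Reg` enum and `hasFreeAddressReg` are hypothetical names for illustration and are not the ARM backend's actual code. Floating-point arguments never appear in the list of occupied core registers, which is why counting arguments alone gives the wrong answer.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical register numbering, used only in this sketch.
enum Reg : unsigned { R0, R1, R2, R3, R12 };

constexpr std::uint32_t bit(Reg R) { return 1u << R; }

// Returns true if at least one candidate register is still free to hold
// the target address of an indirect tail call. ArgRegs lists the core
// registers that arguments were actually assigned to.
bool hasFreeAddressReg(std::initializer_list<Reg> ArgRegs, bool R12Reserved) {
  std::uint32_t Free = bit(R0) | bit(R1) | bit(R2) | bit(R3);
  if (!R12Reserved)
    Free |= bit(R12); // Assumption: r12 is usable unless reserved.
  for (Reg R : ArgRegs)
    Free &= ~bit(R);  // Remove every register an argument occupies.
  return Free != 0;
}
```

With four integer arguments and r12 reserved, no register is left and the call would fall back to a normal call. With two integer and two floating-point arguments, r2 and r3 stay free even though the argument count is the same.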
Sergei Barannikov
411d31ad69
Partially revert 92e18ffd803365c64910760ba20278f875d93681 (#101673)
It is likely to cause stage2 build failures:

https://lab.llvm.org/buildbot/#/builders/122/builds/389
https://lab.llvm.org/buildbot/#/builders/79/builds/552

I don't have an ARM machine to investigate, so I'm just reverting ARM
changes to see if it helps make the bots green again.
2024-08-02 16:38:31 +03:00
Sergei Barannikov
92e18ffd80
[SDag][ARM][RISCV] Allow lowering CTPOP into a libcall (#99752)
The main change is adding CTPOP to `RuntimeLibcalls.def` to allow
targets to use LibCall action for CTPOP. DAG legalizers are changed
accordingly.
2024-08-02 12:29:39 +03:00
Joseph Huber
615b7eeaa9 Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.

I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
2024-07-20 09:29:31 -05:00
NAKAMURA Takumi
740161a9b9 Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69.
(llvmorg-19-init-17714-gc05126bdfc3b)
See #99610
2024-07-20 12:36:57 +09:00
Joseph Huber
c05126bdfc
[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevents internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.

This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but do not want the overhead of
uncalled functions. (This trivially adds about a second to the link time.)
2024-07-16 06:22:09 -05:00
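A minimal sketch of the opt-out convention described above, assuming a plain table of libcall names in which `nullptr` marks a libcall the target does not support. The function `neededSymbols` is hypothetical and only illustrates the filtering idea; it is not the class this patch adds.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Collects the runtime symbols that must be preserved at link time.
// Entries set to nullptr are libcalls the target has opted out of, so
// they are skipped instead of being forced into the link.
std::vector<std::string> neededSymbols(const std::vector<const char *> &Names) {
  std::vector<std::string> Kept;
  for (const char *Name : Names)
    if (Name) // Only libcalls with a name can be emitted by the backend.
      Kept.emplace_back(Name);
  return Kept;
}
```

Calling `neededSymbols({"__addsf3", nullptr, "memcpy"})` would keep only the two named entries.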
Kazu Hirata
5e22a53698
[Target] Use range-based for loops (NFC) (#98705) 2024-07-13 17:40:51 -07:00
Joseph Huber
3f1a767572
[LLVM] Factor disabled Libcalls into the initializer (#98421)
Summary:
These Libcalls represent which functions are available to the backend.
If a runtime call is not available, the target sets the name to
`nullptr`. Currently, this logic is spread around the various targets.
This patch pulls all of the locations that disable libcalls into the
initializer. This patch is effectively NFC.

The motivation behind this patch is that currently the LTO handling uses
the list of all runtime calls to determine which functions cannot be
internalized and must be extracted from static libraries. We do not want
this to happen for libcalls that are not emitted by the backend. A
follow-up patch will move out this logic so the LTO pass can know which
rtlib calls are actually used by the backend.
2024-07-11 12:59:25 -05:00
hstk30-hw
ef465bf8b1
[ARM] Fix arm32be softfp mode miscompilation for neon sdiv (#97883)
Related issue: https://github.com/llvm/llvm-project/issues/97782
2024-07-08 14:18:38 +08:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Eli Friedman
39a0aa5876
[SelectionDAG] Lower llvm.ldexp.f32 to ldexp() on Windows. (#95301)
This reduces codesize. As discussed in #92707.
2024-06-25 10:25:48 -07:00
Lucas Duarte Prates
78ff617d3f
[ARM] CMSE security mitigation on function arguments and returned values (#89944)
The ABI mandates two things related to function calls:
 - Function arguments must be sign- or zero-extended to the register
   size by the caller.
 - Return values must be sign- or zero-extended to the register size by
   the callee.

As a consequence, callees can assume that function arguments have been
extended, and callers can assume the same of return values.

Here lies the problem: Nonsecure code might deliberately ignore this
mandate with the intent of attempting an exploit. It might try to pass
values that lie outside the expected type's value range in order to
trigger undefined behaviour, e.g. out of bounds access.

With the mitigation implemented, Secure code always performs extension
of values passed by Nonsecure code.

This addresses the vulnerability described in CVE-2024-0151.

Patches by Victor Campos.

---------

Co-authored-by: Victor Campos <victor.campos@arm.com>
2024-06-20 10:22:01 +01:00
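The mitigation described above amounts to re-extending a narrow value on the Secure side instead of trusting the upper register bits. A minimal sketch for an 8-bit argument held in a 32-bit register follows; the helper names `signExtend8` and `zeroExtend8` are hypothetical and this is not the code the backend emits.

```cpp
#include <cassert>
#include <cstdint>

// Re-derive a signed 8-bit value from the low byte of a register,
// ignoring whatever the (untrusted) caller left in the upper 24 bits.
constexpr std::int32_t signExtend8(std::uint32_t RegValue) {
  return static_cast<std::int32_t>((RegValue & 0xFFu) ^ 0x80u) - 0x80;
}

// Same idea for an unsigned 8-bit value: keep only the low byte.
constexpr std::uint32_t zeroExtend8(std::uint32_t RegValue) {
  return RegValue & 0xFFu;
}

// The garbage in the upper bits does not influence the result.
static_assert(signExtend8(0x12345680u) == -128, "upper bits are ignored");
static_assert(zeroExtend8(0xFFFFFFFFu) == 255u, "upper bits are ignored");
```

After this step the value is guaranteed to lie in the range of its declared type, so a later array index or comparison cannot be pushed out of bounds by a malformed register.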
Sergei Barannikov
23c1b488fe
[ARM] Remove duplicate custom SDag node (NFCI) (#93419)
ARMISD::SUBS is a duplicate of ARMISD::SUBC.
The node was introduced in 5745b6ac. This patch replaces SUBS with SUBC
and reverts changes in *.td files.
2024-06-15 12:08:32 +03:00
Farzon Lotfi
7ad12a7c04
[ARM] Add tan intrinsic lowering (#95439)
- `ARMISelLowering.cpp` - Add f16 type and neon and mve vector support
for tan
2024-06-14 10:35:50 -04:00
Jay Foad
d4a0154902
[llvm-project] Fix typo "seperate" (#95373) 2024-06-13 20:20:27 +01:00
Simon Pilgrim
af3ffff34f
[DAG] Always allow folding XOR patterns to ABS pre-legalization (#94601)
Removes residual ARM handling for vXi64 ABS nodes to prevent infinite loops.
2024-06-07 11:02:50 +01:00
Simon Pilgrim
c0b468523c
[ARM] Add NEON support for ISD::ABDS/ABDU nodes. (#94504)
As noted on #94466, NEON has ABDS/ABDU instructions but only handles them via intrinsics, plus some VABDL custom patterns.

This patch flags basic ABDS/ABDU for neon types as legal and updates all tablegen patterns to use abds/abdu instead.

Fixes #94466
2024-06-07 10:18:45 +01:00
David Green
264b1b2486 [ARM] Convert vector fdiv+fcvt fixed-point combine to fmul.
InstCombine will convert an fdiv by a power of 2 into an fmul, so this changes
PerformVDIVCombine, which converts fdiv+fcvt to a fixed-point fcvt, to match
fmul+fcvt instead. The fdiv tests will look worse, but won't appear in practice
(and should be improved again by #93882).
2024-06-03 09:31:36 +01:00
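The fdiv-to-fmul rewrite mentioned above is safe because the reciprocal of a power of two is exactly representable, so both forms round the same real number. A small sketch, with hypothetical helper names:

```cpp
#include <cassert>

// Division by 2^3.
float divByEight(float X) { return X / 8.0f; }

// Multiplication by 2^-3, which is exactly representable as a float.
float mulByEighth(float X) { return X * 0.125f; }
```

For inputs such as `1.0f`, `3.5f` or `-7.25f` the two functions return identical values, which is what lets a combine written for one form be retargeted at the other.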
AtariDreams
1d3329c2e8
[Thumb] Resolve FIXME: Transform "(and (shl x, c2), c1)" into "(shl (and x, c1>>c2), c2)" (#82120)
Transform "(and (shl x, c2), c1)" into "(shl (and x, c1>>c2), c2)" if
"c1 >> c2" is a cheaper immediate than "c1" using
HasLowerConstantMaterializationCost
2024-05-26 14:58:32 -04:00
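The identity behind this transform can be checked on plain 32-bit integers. The sketch below uses hypothetical function names and assumes a shift amount below 32; it does not model the cost check done by HasLowerConstantMaterializationCost.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (and (shl x, c2), c1)
constexpr std::uint32_t andOfShl(std::uint32_t X, std::uint32_t C1, unsigned C2) {
  return (X << C2) & C1;
}

// Transformed form: (shl (and x, c1 >> c2), c2)
constexpr std::uint32_t shlOfAnd(std::uint32_t X, std::uint32_t C1, unsigned C2) {
  return (X & (C1 >> C2)) << C2;
}

// With c1 = 0x3FC00 and c2 = 10, the new mask is c1 >> c2 = 0xFF.
static_assert((0x3FC00u >> 10) == 0xFFu, "shifted mask is a small immediate");
static_assert(andOfShl(0xDEADBEEFu, 0x3FC00u, 10) ==
                  shlOfAnd(0xDEADBEEFu, 0x3FC00u, 10),
              "both forms agree");
```

The low `c2` bits of `x << c2` are always zero, so the low `c2` bits of `c1` never matter; that is why shifting the mask down loses nothing.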
Reid Kleckner
385faf9cde
[ARM/X86] Standardize the isEligibleForTailCallOptimization prototypes (#90688)
Pass in CallLoweringInfo (CLI) instead of passing in the various fields
directly. Also pass in CCState (CCInfo), which is computed in both the
caller and the callee for a minor efficiency saving. There may also be a
small correctness improvement for sibcalls with vectorcall, which has an
odd way of recomputing argument locations.

This is a step towards improving the handling of musttail on armv7,
which we have numerous issues filed about in our tracker.

I took inspiration for this from the RISCV tail call eligibility check,
which uses a similar prototype.
2024-05-03 13:56:55 -07:00
Qiu Chaofan
4a8f2f2e1a
[Legalizer] Expand fmaximum and fminimum (#67301)
According to langref, llvm.maximum/minimum has -0.0 < +0.0 semantics and
propagates NaN.

Expand the nodes on targets not supporting the operation, by adding
extra check for NaN and using is_fpclass to check zero signs.
2024-04-29 15:09:54 +08:00
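The semantics the expansion has to reproduce can be written out as scalar code. This is only a sketch of the behaviour described above (NaN propagation and -0.0 ordered below +0.0); `maximumSketch` is a hypothetical name and the legalizer's actual node sequence differs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Scalar model of llvm.maximum: propagate NaN, and treat -0.0 as
// smaller than +0.0.
double maximumSketch(double A, double B) {
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<double>::quiet_NaN(); // Extra NaN check.
  if (A == 0.0 && B == 0.0)
    return std::signbit(A) ? B : A; // Zero sign check: prefer +0.0.
  return A > B ? A : B;
}
```

An ordinary `A > B ? A : B` gets both special cases wrong: it returns `B` whenever `A` is NaN, and it cannot tell the two zeros apart because they compare equal.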
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.

Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Prabhuk
212b1a84a6
[CallSiteInfo][NFC] CallSiteInfo -> CallSiteInfo.ArgRegPairs (#86842)
CallSiteInfo was originally used only for argument - register pairs. Make
it a struct, in which we can store additional data for call sites.

Also, the variables/methods used for CallSiteInfo are named for its
original use case, e.g., CallFwdRegsInfo. Refactor these for the
upcoming use, e.g. addCallArgsForwardingRegs() -> addCallSiteInfo().

An upcoming patch will add type ids for indirect calls to propagate them
from the middle-end to the back-end. The type ids will then be used to
emit the call graph section.

Original RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Differential Revision: https://reviews.llvm.org/D107109?id=362888

Co-authored-by: Necip Fazil Yildiran <necip@google.com>
2024-04-02 13:05:16 -07:00
Arthur Eubanks
94c988bcfd [NFC] Remove unused parameter from shouldAssumeDSOLocal() 2024-03-11 19:48:17 +00:00
Noah Goldstein
61c06775c9 [KnownBits] Add API for nuw flag in computeForAddSub; NFC 2024-03-05 12:59:58 -06:00
Fangrui Song
21c83feca5 [ARM] Simplify shouldAssumeDSOLocal for ELF. NFC 2024-03-01 16:14:48 -08:00
Simon Pilgrim
b45de48be2
[MVE] Expand64BitShift - handle all constant shift amounts less than 32 (#81261)
Expand64BitShift was always dropping to generic shift legalization if the shift amount type was larger than i64, even if the constant shift amount was actually very small. I've adjusted the constant bounds checks to work with APInt types so we can always perform the comparison.

This results in the MVE long shift instructions being used more often, and it looks like this is preventing some additional combines from happening. This could be addressed in the future.

This came about while I was trying to extend the DAGTypeLegalizer::ExpandShift* helpers and needed to move to consistently using the legal shift amount types instead of reusing the shift amount type from the original wider shift.
2024-02-11 15:02:27 +00:00
Harald van Dijk
52864d9c7b
[ARM] Switch to soft promoting half types. (#80440)
The traditional promotion is known to generate wrong code.

Fixes #73805.
2024-02-02 21:40:40 +00:00
Kazu Hirata
ae46855f53 [Target] Use getConstantOperand (NFC) 2024-01-28 18:03:38 -08:00
Nico Weber
184ca39529
[llvm] Move CodeGenTypes library to its own directory (#79444)
Finally addresses https://reviews.llvm.org/D148769#4311232 :)

No behavior change.
2024-01-25 12:01:31 -05:00
David Green
2c49586e1b
[ARM] Fix MVEFloatOps check on creating VCVTN (#79291)
In the past PerformSplittingToNarrowingStores handled both int and float
ops, but since the introduction of MVETRUNC it only operates on float
operations, creating VCVTN nodes. It should be guarded by hasMVEFloatOps
to prevent a failure to select.
2024-01-25 08:12:51 +00:00
Kazu Hirata
7528cf5ef2 [Target] Use getConstantOperandVal (NFC) 2024-01-14 00:53:29 -08:00
Kazu Hirata
be76f1646f [Target] Use getConstantOperandAPInt (NFC) 2024-01-10 21:06:01 -08:00
Alex Bradbury
197214e39b
[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710)
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZExtVal();`
    
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
2024-01-09 12:25:17 +00:00
Alex Bradbury
a181b42565 [llvm][NFC] Use SDValue::getConstantOperandAPInt(i) where possible
The helper function allows examples like
`cast<ConstantSDNode>(Op.getOperand(0))->getAPIntValue();` to be changed
to `Op.getConstantOperandAPInt(0);`.

See #76708 for further context. Although there are far fewer
opportunities for replacement, I used a similar git grep and sed combo
as before, given I already had it to hand:

`git grep -l "cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getAPIntValue\(\)" | xargs sed -E -i 's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getAPIntValue\(\)/\1->getConstantOperandAPInt(\2)/'`
and
`git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getAPIntValue\(\)" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getAPIntValue\(\)/\1.getConstantOperandAPInt(\2)/'`
2024-01-02 14:43:55 +00:00
Alex Bradbury
80aeb62211
[llvm][NFC] Use SDValue::getConstantOperandVal(i) where possible (#76708)
This helper function shortens examples like
`cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to
`Node->getConstantOperandVal(1);`.

Implemented with:
`git grep -l
"cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getZExtValue\(\)/\1->getConstantOperandVal(\2)/'`
and `git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getZExtValue\(\)/\1.getConstantOperandVal(\2)/'`.
A couple of simple manual fixes were needed. The result was then processed by
`git clang-format`.
2024-01-02 13:14:28 +00:00
Serge Pavlov
2f81788067
[ARM][FPEnv] Lowering of fpmode intrinsics (#74054)
LLVM intrinsics `get_fpmode`, `set_fpmode` and `reset_fpmode` operate on
control modes, the bits of the FP environment that affect FP operations. On
ARM these bits are in FPSCR together with the status bits. The
implementation of these intrinsics produces code close to that of the
functions `fegetmode` and `fesetmode` from GLIBC.

Pull request: https://github.com/llvm/llvm-project/pull/74054
2023-12-18 18:57:36 +07:00
Craig Topper
e888e83fb6
[ARM][AArch64] Use SelectionDAG::SplitScalar to simplify some code. (#74411)
We know we're splitting a type in half to two legal values. Instead of
using shift and truncate that need to be legalized, we can use two
ISD::EXTRACT_ELEMENTs.

Spotted while reviewing #67918 for RISC-V which copied this code.
2023-12-05 07:51:54 -08:00
Simon Pilgrim
5c672d87ea Fix MSVC signed/unsigned mismatch warning. NFC. 2023-12-04 10:48:33 +00:00
Nikita Popov
c1e3a94105 [TargetLowering] Don't include ComplexDeinterleavingPass.h (NFC)
TargetLowering.h shouldn't include any passes and thus pull in
the entire pass infrastructure. Replace the include with forward
declarations.
2023-11-24 12:13:38 +01:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize, and LHS and RHS are both objects of class TypeSize. So this is
evaluating if the pointer to the function Scalable == the pointer to the
function Scalable, which is always true because LHS and RHS have the
same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`, so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
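The name clash described above can be reproduced outside LLVM. The sketch below uses hypothetical stand-in types (`QuantityBase`, `BuggySize`, `RenamedSize`), not the real TypeSize classes: a static factory named `Scalable` in the derived class hides the base class's `Scalable` data member, so the comparison ends up comparing the address of one function with itself.

```cpp
#include <cassert>

// Stand-in for the base class that stores the flag.
struct QuantityBase {
  unsigned Quantity;
  bool Scalable;
};

// Before the fix: the static factory hides QuantityBase::Scalable.
struct BuggySize : QuantityBase {
  BuggySize(unsigned Q, bool S) : QuantityBase{Q, S} {}
  static BuggySize Fixed(unsigned Q) { return BuggySize(Q, false); }
  static BuggySize Scalable(unsigned Q) { return BuggySize(Q, true); }
};

// Name lookup finds the static function, so both sides decay to the
// same function pointer and the result is always true.
bool sameKindBuggy(const BuggySize &L, const BuggySize &R) {
  return L.Scalable == R.Scalable;
}

// After the fix: verb-phrase factory names no longer hide the member.
struct RenamedSize : QuantityBase {
  RenamedSize(unsigned Q, bool S) : QuantityBase{Q, S} {}
  static RenamedSize getFixed(unsigned Q) { return RenamedSize(Q, false); }
  static RenamedSize getScalable(unsigned Q) { return RenamedSize(Q, true); }
};

// Now the comparison reads the bool data member of each object.
bool sameKindRenamed(const RenamedSize &L, const RenamedSize &R) {
  return L.Scalable == R.Scalable;
}
```

`sameKindBuggy(BuggySize::Fixed(4), BuggySize::Scalable(4))` returns true even though the two values differ in kind, while the renamed version correctly returns false for the equivalent call.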
Serge Pavlov
a2e1de1934 [ARM][FPEnv] Lowering of fpenv intrinsics
The change implements lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv`.

Differential Revision: https://reviews.llvm.org/D81843
2023-11-20 15:08:25 +07:00
Serge Pavlov
5b0f703918 Revert "[ARM][FPEnv] Lowering of fpenv intrinsics"
This reverts commit d62f040418bd167d1ddd2b79c640a90c0c2ea353.
Some CUDA buildbots started failing.
2023-11-10 16:24:51 +07:00
Serge Pavlov
d62f040418 [ARM][FPEnv] Lowering of fpenv intrinsics
The change implements lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv`.

Differential Revision: https://reviews.llvm.org/D81843
2023-11-10 16:06:33 +07:00
Paulo Matos
7b9d73c2f9
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-07 17:26:26 +01:00
David Green
8a701024f3 [ARM] Lower i1 concat via MVETRUNC
The MVETRUNC operation can perform the same truncate of two vectors, without
requiring lane inserts/extracts from every vector lane. This moves the concat
i1 lowering to use it for v8i1 and v16i1 result types, trading a bit of extra
stack space for fewer instructions.
2023-10-18 19:40:11 +01:00