2250 Commits

Author SHA1 Message Date
Prabhuk
212b1a84a6
[CallSiteInfo][NFC] CallSiteInfo -> CallSiteInfo.ArgRegPairs (#86842)
CallSiteInfo is originally used only for argument - register pairs. Make
it struct, in which we can store additional data for call sites.

Also, the variables/methods used for CallSiteInfo are named for its
original use case, e.g., CallFwdRegsInfo. Refactor these for the
upcoming
use, e.g. addCallArgsForwardingRegs() -> addCallSiteInfo().

An upcoming patch will add type ids for indirect calls to propogate them
from
middle-end to the back-end. The type ids will be then used to emit the
call
graph section.

Original RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Differential Revision: https://reviews.llvm.org/D107109?id=362888

Co-authored-by: Necip Fazil Yildiran <necip@google.com>
2024-04-02 13:05:16 -07:00
Arthur Eubanks
94c988bcfd [NFC] Remove unused parameter from shouldAssumeDSOLocal() 2024-03-11 19:48:17 +00:00
Noah Goldstein
61c06775c9 [KnownBits] Add API for nuw flag in computeForAddSub; NFC 2024-03-05 12:59:58 -06:00
Fangrui Song
21c83feca5 [ARM] Simplify shouldAssumeDSOLocal for ELF. NFC 2024-03-01 16:14:48 -08:00
Simon Pilgrim
b45de48be2
[MVE] Expand64BitShift - handle all constant shift amounts less than 32 (#81261)
Expand64BitShift was always dropping to generic shift legalization if the shift amount type was larger than i64, even if the constant shift amount was actually very small. I've adjusted the constant bounds checks to work with APInt types so we can always perform the comparison.

This results in the MVE long shift instructions being used more often, and it looks like this is preventing some additional combines from happening. This could be addressed in the future.

This came about while I was trying to extend the DAGTypeLegalizer::ExpandShift* helpers and need to move to consistently using the legal shift amount types instead of reusing the shift amount type from the original wider shift.
2024-02-11 15:02:27 +00:00
Harald van Dijk
52864d9c7b
[ARM] Switch to soft promoting half types. (#80440)
The traditional promotion is known to generate wrong code.

Fixes #73805.
2024-02-02 21:40:40 +00:00
Kazu Hirata
ae46855f53 [Target] Use getConstantOperand (NFC) 2024-01-28 18:03:38 -08:00
Nico Weber
184ca39529
[llvm] Move CodeGenTypes library to its own directory (#79444)
Finally addresses https://reviews.llvm.org/D148769#4311232 :)

No behavior change.
2024-01-25 12:01:31 -05:00
David Green
2c49586e1b
[ARM] Fix MVEFloatOps check on creating VCVTN (#79291)
In the past PerformSplittingToNarrowingStores handled both int and float
ops, but since the introduction of MVETRUNC now only operates on float
operations, creating VCVTN nodes. It should be guarded by hasMVEFloatOps
to prevent a failure to select.
2024-01-25 08:12:51 +00:00
Kazu Hirata
7528cf5ef2 [Target] Use getConstantOperandVal (NFC) 2024-01-14 00:53:29 -08:00
Kazu Hirata
be76f1646f [Target] Use getConstantOperandAPInt (NFC) 2024-01-10 21:06:01 -08:00
Alex Bradbury
197214e39b
[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710)
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZextVal();`
    
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
2024-01-09 12:25:17 +00:00
Alex Bradbury
a181b42565 [llvm][NFC] Use SDValue::getConstantOperandAPInt(i) where possible
The helper function allows examples like
`cast<ConstantSDNode>(Op.getOperand(0))->getAPIntValue();` to be changed
to `Op.getConstantOperandAPInt(0);`.

See #76708 for further context. Although there are far fewer
opportunities for replacement, I used a similar git grep and sed combo
as before, given I already had it to hand:

`git grep -l "cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getAPIntValue\(\)" | xargs sed -E -i 's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getAPIntValue\(\)/\1->getConstantOperandAPInt(\2)/'`
and
`git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getAPIntValue\(\)" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getAPIntValue\(\)/\1.getConstantOperandAPInt(\2)/'`
2024-01-02 14:43:55 +00:00
Alex Bradbury
80aeb62211
[llvm][NFC] Use SDValue::getConstantOperandVal(i) where possible (#76708)
This helper function shortens examples like
`cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to
`Node->getConstantOperandVal(1);`.

Implemented with:
`git grep -l
"cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i

's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getZExtValue\(\)/\1->getConstantOperandVal(\2)/`
and `git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i

's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getZExtValue\(\)/\1.getConstantOperandVal(\2)/'`.
With a couple of simple manual fixes needed. Result then processed by
`git clang-format`.
2024-01-02 13:14:28 +00:00
Serge Pavlov
2f81788067
[ARM][FPEnv] Lowering of fpmode intrinsics (#74054)
LLVM intrinsics `get_fpmode`, `set_fpmode` and `reset_fpmode` operate
control modes, the bits of FP environment that affect FP operations. On
ARM these bits are in FPSCR together with the status bits. The
implementation of these intrinsics produces code close to that of
functions `fegetmode` and `fesetmode` from GLIBC.

Pull request: https://github.com/llvm/llvm-project/pull/74054
2023-12-18 18:57:36 +07:00
Craig Topper
e888e83fb6
[ARM][AArch64] Use SelectionDAG::SplitScalar to simplify some code. (#74411)
We know we're splitting a type in half to two legal values. Instead of
using shift and truncate that need to be legalized, we can use two
ISD::EXTRACT_ELEMENTs.

Spotted while reviewing #67918 for RISC-V which copied this code.
2023-12-05 07:51:54 -08:00
Simon Pilgrim
5c672d87ea Fix MSVC signed/unsigned mismatch warning. NFC. 2023-12-04 10:48:33 +00:00
Nikita Popov
c1e3a94105 [TargetLowering] Don't include ComplexDeinterleavingPass.h (NFC)
TargetLowering.h shouldn't include any passes and thus pull in
the entire pass infrastructure. Replace the include with forward
declarations.
2023-11-24 12:13:38 +01:00
Sander de Smalen
81b7f115fb
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
Serge Pavlov
a2e1de1934 [ARM][FPEnv] Lowering of fpenv intrinsics
The change implements lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv`.

Differential Revision: https://reviews.llvm.org/D81843
2023-11-20 15:08:25 +07:00
Serge Pavlov
5b0f703918 Revert "[ARM][FPEnv] Lowering of fpenv intrinsics"
This reverts commit d62f040418bd167d1ddd2b79c640a90c0c2ea353.
Some cuda buildbots start failing.
2023-11-10 16:24:51 +07:00
Serge Pavlov
d62f040418 [ARM][FPEnv] Lowering of fpenv intrinsics
The change implements lowering of `get_fpenv`, `set_fpenv` and
`reset_fpenv`.

Differential Revision: https://reviews.llvm.org/D81843
2023-11-10 16:06:33 +07:00
Paulo Matos
7b9d73c2f9
[NFC] Remove Type::getInt8PtrTy (#71029)
Replace this with PointerType::getUnqual().
Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-07 17:26:26 +01:00
David Green
8a701024f3 [ARM] Lower i1 concat via MVETRUNC
The MVETRUNC operation can perform the same truncate of two vectors, without
requiring lane inserts/extracts from every vector lane. This moves the concat
i1 lowering to use it for v8i1 and v16i1 result types, trading a bit of extra
stack space for less instructions.
2023-10-18 19:40:11 +01:00
David Green
c060757bcc [ARM] Correct v2i1 concat extract types.
For two v2i1 concat into a v4i1, we cannot extract each i64 element as an i32.
This casts to a v4i32 instead and extracts the correct vector lanes.
2023-10-18 13:40:38 +01:00
Alexey Bataev
e22818d5c9 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-05 06:17:07 -07:00
Arthur Eubanks
07389535a7 Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit b186f1f68be11630355afb0c08b80374a6d31782.

Causes crashes, see https://reviews.llvm.org/D158449.
2023-10-04 14:37:16 -07:00
Alexey Bataev
b186f1f68b [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-04 07:53:30 -07:00
Alexey Bataev
1129dec778 Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix
a crash reported in https://reviews.llvm.org/D158449.
2023-10-03 13:02:16 -07:00
Alexey Bataev
6f43d28f34 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-10-03 10:26:11 -07:00
Youngsuk Kim
4346aaf05b [llvm] Remove uses of Type::getPointerTo() (NFC)
* Remove if its sole use is to support an unnecessary ptr-to-ptr bitcast
  (remove the bitcast as well)
* Replace with use of other APIs.

NFC opaque pointer cleanup effort.
2023-09-30 06:55:41 -04:00
Alexey Bataev
ebcb5d59fc Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix
buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.
2023-09-29 15:03:46 -07:00
Alexey Bataev
9f5960e004 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-09-29 13:16:03 -07:00
Alexey Bataev
3204f88a8b Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."
This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the
crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.
2023-09-28 11:57:32 -07:00
Alexey Bataev
c88c281cf1 [IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.
Need to add NumSrcElts param to is..Mask functions in
ShuffleVectorInstruction class for better mask analysis. Mask.size() not
always matches the sizes of the permuted vector(s). Allows to better
estimate the cost in SLP and fix uses of the functions in other cases.

Differential Revision: https://reviews.llvm.org/D158449
2023-09-28 11:03:21 -07:00
Nick Desaulniers
330fa7d2a4
[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057)
Given a list of constraints for InlineAsm (ex. "imr") I'm looking to
modify the order in which they are chosen. Before doing so, I noticed a
fair
amount of logic is duplicated between SelectionDAGISel and GlobalISel
for this.

That is because SelectionDAGISel is also trying to lower immediates
during selection. If we detangle these concerns into:
1. choose the preferred constraint
2. attempt to lower that constraint

Then we can slide down the list of constraints until we find one that
can be lowered. That allows the implementation to be shared between
instruction selection frameworks.

This makes it so that later I might only need to adjust the priority of
constraints in one place, and have both selectors behave the same.
2023-09-25 08:53:03 -07:00
Jon Roelofs
83e6d2edfc
Revert "[ARM] Always lower direct calls as direct when the outliner is enabled (#66434)"
This reverts commit 003bcad9a8b21e15e3786a52b1dafa844075ab84.

ARM folks say it regresses some of their benchmarks:
https://github.com/llvm/llvm-project/pull/66434#issuecomment-1722424162
2023-09-18 09:45:46 -07:00
Yingwei Zheng
b423e1f05d
[SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a constant rhs
This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs.
Comparison with GCC: https://godbolt.org/z/c5zPdP7j4

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D158673
2023-09-16 17:09:41 +08:00
Jon Roelofs
003bcad9a8
[ARM] Always lower direct calls as direct when the outliner is enabled (#66434)
The indirect lowering hinders the outliner's ability to see that
sequences are in fact common, since the sequence similarity is rendered
opaque by the register callee. The size savings from making them
indirect seems to be dwarfed by the outliner's savings from
de-duplication.

rdar://115178034
rdar://115459865
2023-09-15 10:04:56 -07:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
Matt Arsenault
b14e83d1a4 IR: Add llvm.exp10 intrinsic
We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10
to fix this asymmetry. AMDGPU already has most of the code for f32
exp10 expansion implemented alongside exp, so the current
implementation is duplicating nearly identical effort between the
compiler and library which is inconvenient.

https://reviews.llvm.org/D157871
2023-09-01 19:45:03 -04:00
Nicholas Guy
d65feccb12 [ARM] Set preferred function alignment
Aligning functions yields small performance gains on
embedded cores, moreso with numerous small function calls.
Similar to aligning loops, if the function can fit within
a single cache line then the performance overhead of
fetching more instructions can be limited.

Differential Revision: https://reviews.llvm.org/D157514
2023-08-16 17:31:21 +01:00
Jay Foad
fdbc944385 Fix typos in comments 2023-08-15 13:57:21 +01:00
Bjorn Pettersson
e53b28c833 [llvm] Drop some bitcasts and references related to typed pointers
Differential Revision: https://reviews.llvm.org/D157551
2023-08-10 15:07:07 +02:00
Jay Foad
2dcf051259 [CodeGen] Store call frame size in MachineBasicBlock
Record the call frame size on entry to each basic block. This is usually
zero except when a basic block has been split in the middle of a call
sequence.

This simplifies PEI::replaceFrameIndices which previously had to visit
basic blocks in a specific order and had special handling for
unreachable blocks. More importantly it paves the way for an equally
simple implementation of a backwards version of replaceFrameIndices,
which is required to fully convert PrologEpilogInserter to backwards
register scavenging, which is preferred because it does not rely on
accurate kill flags.

Differential Revision: https://reviews.llvm.org/D156113
2023-07-27 10:32:00 +01:00
John Brawn
cee7e7b245 [ARM] Correctly handle execute-only in EmitStructByval
Currently when compiling for an execute-only target without movt then
EmitStructByval will generate a constant pool load which isn't
compatible with execute-only. Handle this by emitting tMOVi32imm,
and also simplify the existing movt handling by emitting t2MOVi32imm
or MOVi32imm.

Differential Revision: https://reviews.llvm.org/D154944
2023-07-19 13:56:36 +01:00
Oliver Stannard
aea8db8eb9 Revert "[CodeGen] Store SP adjustment in MachineBasicBlock. NFCI."
This reverts commit 58d1eaa3b6ce4f7285c51f83faff7a3ac374c746.
2023-07-13 14:25:39 +01:00
Caslyn Tonelli
6d9065a716 Revert "[ARM] Correctly handle execute-only in EmitStructByval"
This reverts commit 210f61cbddeddac47b347db072d674ee142520f6.

Differential Revision: https://reviews.llvm.org/D155138
2023-07-12 23:29:54 +00:00
Jay Foad
58d1eaa3b6 [CodeGen] Store SP adjustment in MachineBasicBlock. NFCI.
Record the SP adjustment on entry to each basic block. This is almost
always zero except on targets like ARM which can split a basic block in
the middle of a call sequence.

This simplifies PEI::replaceFrameIndices which previously had to visit
basic blocks in a specific order and had special handling for
unreachable blocks. More importantly it paves the way for an equally
simple implementation of a backwards version of replaceFrameIndices,
which is required to fully convert PrologEpilogInserter to backwards
register scavenging, which is preferred because it does not rely on
accurate kill flags.

Differential Revision: https://reviews.llvm.org/D154281
2023-07-12 14:29:26 +01:00
John Brawn
210f61cbdd [ARM] Correctly handle execute-only in EmitStructByval
Currently when compiling for an execute-only target without movt then
EmitStructByval will generate a constant pool load which isn't
compatible with execute-only. Handle this by emitting tMOVi32imm,
and also simplify the existing movt handling by emitting t2MOVi32imm
or MOVi32imm.

Differential Revision: https://reviews.llvm.org/D154944
2023-07-12 11:48:01 +01:00