6814 Commits

Author SHA1 Message Date
David Tenty
8042699a30 [LLVM] Add exported visibility style for XCOFF
For the AIX linker, under default options, global or weak symbols which
have no visibility bits set to zero (i.e. no visibility, similar to ELF
default) are only exported if specified on an export list provided to
the linker. So AIX has an additional visibility style called
"exported" which indicates to the linker that the symbol should
be explicitly globally exported.

This change maps "dllexport" in the LLVM IR to correspond to XCOFF
exported as we feel this best models the intended semantic (discussion
on the discourse RFC thread: https://discourse.llvm.org/t/rfc-adding-exported-visibility-style-to-the-ir-to-model-xcoff-exported-visibility/61853)
and allows us to enable writing this visibility for the AIX target
in the assembly path.

Reviewed By: DiggerLin

Differential Revision: https://reviews.llvm.org/D123951
2022-04-28 14:56:00 -04:00
Vasileios Porpodas
fa8a9fea47 Recommit "[SLP][TTI] Refactoring of getShuffleCost Args to work like getArithmeticInstrCost"
This reverts commit 6a9bbd9f20dcd700e28738788bb63a160c6c088c.

Code review: https://reviews.llvm.org/D124202
2022-04-26 14:02:40 -07:00
David Green
9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Fangrui Song
fb193db2c7 [PowerPC] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-04-19 22:35:05 -07:00
Stefan Pintilie
ef34442232 [NFC][PowerPC] Move the Regsiter Operands for PowerPC into PPCRegisterInfo.td
Currently the regsiter operand definitions are found in three separate files.
This patch moves all of the definitions into PPCRegisterInfo.td.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D123543
2022-04-18 14:50:24 -05:00
Stefan Pintilie
bc9916fff2 [NFC][PowerPC] Style and ordering changes for PPCInstrP10.td
Renamed the two classes 8LS_DForm_R_SI34_RTA5 and 8LS_DForm_R_SI34_XT6_RA5 to
8LS_DForm_R_SI34_RTA5_MEM and 8LS_DForm_R_SI34_XT6_RA5_MEM because the
instructions that use the classes use memory reads/writes.

Moved the instruction defs up closer to the classes.
Removed unnecessary whitespace.
2022-04-18 13:28:48 -05:00
Qiu Chaofan
1e23175df6 [PowerPC] Mark side effects of Power9 darn instruction
This fixes CVE-2019-15847, preventing random number generation from
being merged.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D122783
2022-04-18 13:21:40 +08:00
Kai Luo
18679ac0d7 [PowerPC] Adjust MaxAtomicSizeInBitsSupported on PPC64
AtomicExpandPass uses this variable to determine emitting libcalls or not. The default value is 1024 and if we don't specify it for PPC64 explicitly, AtomicExpandPass won't emit `__atomic_*` libcalls for those target unable to inline atomic ops and finally the backend emits `__sync_*` libcalls. Thanks @efriedma for pointing it out.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122868
2022-04-09 00:03:09 +00:00
Kai Luo
549e118e93 [PowerPC] Support 16-byte lock free atomics on pwr8 and up
Make 16-byte atomic type aligned to 16-byte on PPC64, thus consistent with GCC. Also enable inlining 16-byte atomics on non-AIX targets on PPC64.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D122377
2022-04-08 23:25:56 +00:00
Ting Wang
b389354b28 [Clang][PowerPC] Add max/min intrinsics to Clang and PPC backend
Add support for builtin_[max|min] which has below prototype:
A builtin_max (A1, A2, A3, ...)
All arguments must have the same type; they must all be float, double, or long double.
Internally use SelectCC to get the result.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D122478
2022-04-05 22:43:48 -04:00
Stefan Pintilie
585c85abe5 [PowerPC] Fix lowering of byval parameters for sizes greater than 8 bytes.
To store a byval parameter the existing code would store as many 8 byte elements
as was required to store the full size of the byval parameter.
For example, a paramter of size 16 would store two element of 8 bytes.
A paramter of size 12 would also store two elements of 8 bytes.
This would sometimes store too many bytes as the size of the paramter is not
always a factor of 8.

This patch fixes that issue and now byval paramters are stored with the correct
number of bytes.

Reviewed By: nemanjai, #powerpc, quinnp, amyk

Differential Revision: https://reviews.llvm.org/D121430
2022-03-31 15:12:46 -05:00
Stefan Pintilie
2e55bc9f3c [PowerPC] Set the special DSCR with a compiler option.
Add a compiler option and the instructions required to set the
special Data Stream Control Register (DSCR). The special register will
not be set by default.

Original patch by: Muhammad Usman

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D117013
2022-03-31 14:06:30 -05:00
Shao-Ce SUN
662b9fa02c [NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557
2022-03-29 09:53:24 +08:00
Kazu Hirata
6212871968 [Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC) 2022-03-27 22:22:37 -07:00
Maksim Panchenko
4ae9745af1 [Disassember][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00
Vasileios Porpodas
39aa202aff Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b774718113007ecb1a4449d7af0cbcfeb.
2022-03-23 18:32:17 -07:00
Stefan Pintilie
2c25c65cdc [PowerPC] The BL8_NOTOC_RM instruction needs to produce a notoc relocation.
The BL8_NOTOC_RM instruction was incorrectly producing a relocation that reqired
a TOC restore after the call. This patch fixes that issue and the notoc
relocation is now used.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D122012
2022-03-23 19:01:05 -05:00
Arthur Eubanks
e6ead19b77 Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f94928201f87f6b659fc2228efd539e8245.

Causes crashes, see comments in D121973
2022-03-23 10:57:45 -07:00
Vasileios Porpodas
27bd8f9492 Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354

This reverts commit f7d7d2a08d16356c57f6d2d36bc2fc0589a55df9.
2022-03-22 16:41:55 -07:00
Arthur Eubanks
f7d7d2a08d Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d305013de743cdbd6690e4d77c8af27e.

Causes crashes, see comments in https://reviews.llvm.org/D121973.
2022-03-22 13:33:49 -07:00
Vasileios Porpodas
79613185d3 Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354

The original commit 9136145eb019e1d18c966d4d06a3df349b88cc14 broke the build on several targets.

Differential Revision: https://reviews.llvm.org/D121973
2022-03-21 15:57:32 -07:00
Chen Zheng
9ada761be3 [PowerPC][NFC] rename file for PPCCTRLoopsVerify pass.
Rename file for PPCCTRLoopsVerify pass from PPCCTRLoops.cpp
to PPCCTRLoopsVerify.cpp.

There will be a new file PPCCTRLoops.cpp for PPC CTR loops
generation later.
2022-03-21 03:42:14 -04:00
Aaron Puchert
c1a31ee65b [PPCISelLowering] Avoid emitting calls to __multi3, __muloti4
After D108936, @llvm.smul.with.overflow.i64 was lowered to __multi3
instead of __mulodi4, which also doesn't exist on PowerPC 32-bit, not
even with compiler-rt. Block it as well so that we get inline code.

Because libgcc doesn't have __muloti4, we block that as well.

Fixes #54460.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122090
2022-03-20 20:59:30 +01:00
Shengchen Kan
37b378386e [NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated comments 2022-03-16 20:25:42 +08:00
serge-sans-paille
989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Stefan Pintilie
78406ac898 [PowerPC][P10] Add Vector pair calling convention
Add the calling convention for the vector pair registers.
These registers overlap with the vector registers.

Part of an original patch by: Lei Huang

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D117225
2022-03-15 14:08:42 -05:00
Qiu Chaofan
300e1293de [PowerPC] Disable perfect shuffle by default
We are going to remove the old 'perfect shuffle' optimization since it
brings performance penalty in hot loop around vectors. For example, in
following loop sharing the same mask:

  %v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
  %v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>

The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead
of `vperm-vperm`. In some large loop cases, this causes 20%+ performance
penalty.

The original attempt to resolve this is to pre-record masks of every
shufflevector operation in DAG, but that is somewhat complex and brings
unnecessary computation (to scan all nodes) in optimization. Here we
disable it by default. There're indeed some cases becoming worse after
this, which will be fixed in a more careful way in future patches.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D121082
2022-03-15 15:52:24 +08:00
Nemanja Ivanovic
766ca2c59e [PowerPC] Add missed VSX shuffles instead of Altivec ones
VSX introduced some permute instructions that are direct
replacements for Altivec ones except they can target all
the VSX registers. We have added code generation for most
of these but somehow missed the low/hi word merges (XXMRG[LH]W).
This caused some additional spills on some large
computationally intensive code.

This patch simply adds the missed patterns.
2022-03-14 10:11:54 -05:00
serge-sans-paille
ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Nico Weber
a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille
7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Masoud Ataei
30f30e1c12 [PowerPC] Fix the none tail call in scalar MASS conversion
This patch is proposing a fix for patch https://reviews.llvm.org/D101759
on none tail call math function conversion to MASS call.

Differential: https://reviews.llvm.org/D121016

reviewer: @nemanjai
2022-03-08 08:59:17 -08:00
Qiu Chaofan
b2497e5435 [PowerPC] Add generic fnmsub intrinsic
Currently in Clang, we have two types of builtins for fnmsub operation:
one for float/double vector, they'll be transformed into IR operations;
one for float/double scalar, they'll generate corresponding intrinsics.

But for the vector version of builtin, the 3 op chain may be recognized
as expensive by some passes (like early cse). We need some way to keep
the fnmsub form until code generation.

This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub
intrinsics.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D116015
2022-03-07 13:00:06 +08:00
Mircea Trofin
cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Stefan Pintilie
a84a8c937b [PowerPC] Remove redundant MMA patterns.
There are two MMA patterns that have been added twice. This patch just removes
one set of petterns. Should not change the way MMA behaves.

Reviewed By: lei, #powerpc

Differential Revision: https://reviews.llvm.org/D120680
2022-03-01 09:13:21 -06:00
Jameson Nash
c4b1a63a1b mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D120518
2022-02-25 14:30:44 -05:00
Stefan Pintilie
0625aed2fc [PowerPC][NFC] Split out the MMA instructions from the P10 instructions.
Currently all of the MMA instructions as well as the MMA related register info
is bundled with the Power 10 instructions. This patch just splits them out.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D120515
2022-02-25 11:41:09 -06:00
Stefan Pintilie
4fbe60fd13 [PowerPC][NFC] Add file info and license that was missing from this file.
Added the license info as well as description about how classes should be named
based on existing documentation.

Reviewed By: lei, #powerpc

Differential Revision: https://reviews.llvm.org/D120530
2022-02-25 10:28:02 -06:00
Stefan Pintilie
eb1c5a9862 [PowerPC] Add the Power10 LXVKQ instrution.
Add the Power 10 instruction LXVKQ.

This patch was taken from an original patch by: Yi-Hong Lyu

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D117507
2022-02-23 08:48:59 -06:00
Nemanja Ivanovic
2aaba44b5c [PowerPC] Allow absolute expressions in relocations
The Linux kernel build uses absolute expressions suffixed with @lo/@ha
relocations. This currently doesn't work for DS/DQ form instructions and
there is no reason for it not to. It also works with GAS.
This patch allows this as long as the value is a multiple of 4/16
for DS/DQ form.

Differential revision: https://reviews.llvm.org/D115419
2022-02-22 09:53:08 -06:00
Qiu Chaofan
43d48ed220 [PowerPC] Add option to disable perfect shuffle
Perfect shuffle was introduced into PowerPC backend years ago, and only
available in big-endian subtargets. This optimization has good effects
in simple cases, but brings serious negative impact in large programs
with many shuffle instructions sharing the same mask.

Here introduces a temporary backend hidden option to control it until we
implemented better way to fix the gap in vectorshuffle decomposition.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D120072
2022-02-21 01:39:35 +08:00
Shao-Ce SUN
21ac474392 [NFC] Correct typo interger to integer 2022-02-17 21:17:47 +08:00
Lei Huang
5abe6c312b [PowerPC] Rename PPCInstrPrefix.td to PPCInstrP10.td 2022-02-16 10:22:41 -06:00
Shao-Ce SUN
2aed07e96c [NFC][MC] remove unused argument MCRegisterInfo in MCCodeEmitter
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 13:10:09 +08:00
Amy Kwan
ac5a5a9cfe [PowerPC] Add default handling for single element vectors, and split/promote vNi1 vectors.
This patch updates the handling of vectors in getPreferredVectorAction():

For single-element and scalable vectors, fall back to default vector legalization
handling. For vNi1 vectors, add handling to either split or promote them in
order to prevent the production of wide v256i1/v512i1 types.

The following assertion is fixed by this patch, as we ended up producing the
wide vector types (that are used for MMA) in the backend prior to this fix.

```
Assertion failed: VT.getSizeInBits() == Operand.getValueSizeInBits() &&
"Cannot BITCAST between types of different sizes!"
```

Differential Revision: https://reviews.llvm.org/D119521
2022-02-15 08:44:08 -06:00
Stefan Pintilie
a601db30c6 [PowerPC] Remove the LDMX instruction.
The LDMX instruction was to be potentially added in P9 but it was never added
in either ISA 3.0 or ISA 3.1. This patch removes that instruction as it is
currently still an invalid instruction.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D118074
2022-02-14 17:03:48 -06:00
Ting Wang
097a95f2df [PowerPC] Add custom lowering for SELECT_CC fp128 using xsmaxcqp
Power ISA 3.1 adds xsmaxcqp/xsmincqp for quad-precision type-c max/min selection,
and this opens the opportunity to improve instruction selection on: llvm.maxnum.f128,
llvm.minnum.f128, and select_cc ordered gt/lt and (don't care) gt/lt.

Reviewed By: nemanjai, shchenz, amyk

Differential Revision: https://reviews.llvm.org/D117006
2022-02-09 21:48:28 -05:00
serge-sans-paille
ef736a1c39 Cleanup LLVMMC headers
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:

llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h

Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after:  1049293745

Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
2022-02-09 11:09:17 +01:00
Nikita Popov
149195f576 [PPCISelLowering] Avoid use of getPointerElementType()
Use the value type instead.
2022-02-07 14:30:15 +01:00
Wael Yehia
addd073325 [AIX][PowerPC][PGO] Generate .ref for some PGO sections
For PGO on AIX, when we switch to the linux-style PGO variable access
(via _start and _stop labels), we need the compiler to generate a .ref
assembly for each of the three csects:

 -   __llvm_prf_data[RW]
 -   __llvm_prf_names[RO]
 -   __llvm_prf_vnds[RW]

We insert the .ref inside the __llvm_prf_cnts[RW] csect so that if it's
live then the 3 csects are live.

For example, for a testcase with at least one function definition, when
compiled with -fprofile-generate we should generate:

        .csect __llvm_prf_cnts[RW],3
        .ref __llvm_prf_data[RW]   <<============ needs to be inserted
        .ref __llvm_prf_names[RO]  <<===========

the __llvm_prf_vnds is not always present, so we reference it only when
it's present.

Reviewed By: sfertile, daltenty

Differential Revision: https://reviews.llvm.org/D116607
2022-02-05 06:34:20 -05:00