3400 Commits

Author SHA1 Message Date
Fangrui Song
2417618d5c [Verifier] Reject dllexport with non-default visibility
Add a visibility check for dllimport and dllexport. Note: dllimport with a
non-default visibility (implicit dso_local) is already rejected, but with a less
clear dso_local error.

The MC level visibility `MCSA_Exported` (D123951) is mapped from IR level
default visibility when dllexport is specified. The D123951 error is now very
difficult to trigger (needs to disable the IR verifier).

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D133267
2022-09-05 10:53:41 -07:00
Kai Luo
ad2f7fd286 [AtomicExpand] Make floating point conversion happens before fence insertion
IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations.
This also fixes atomic load of floating point values which requires fence on PowerPC.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D127609
2022-08-31 09:54:58 +08:00
Ting Wang
710923cdc8 [PowerPC] CTRLoop pseudo instructions should not be duplicated
Add isNotDuplicable to CTRLoop pseudo instructions, to avoid other pass
such as early-tailduplication break the loop structure by duplicating
pseudo instructions.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D132738
2022-08-30 04:32:29 -04:00
Ting Wang
f908cbc36f [NFC][PowerPC] Add test case to show ctrloop mi shall not be duplicated
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D132899
2022-08-30 01:57:22 -04:00
Maryam Moghadas
a09140f34f Revert "[PowerPC] Remove extra swap for extract+vperm on LE"
This reverts commit f7294ac8093a2fbd8c00254580eaac6c4e1f7b24.
2022-08-25 12:34:43 -05:00
Matthias Braun
5364f49407 Fix CSR update check
D132080 introduced a bug leading to `RegisterClassInfo` caches not
getting invalidated when there was exactly one more CSR register added.

Differential Revision: https://reviews.llvm.org/D132606
2022-08-24 18:09:49 -07:00
esmeyi
dfe55cc1cd [AIX] use the original name as the input to create the new symbol for TLS symbol.
Summary: Currently, an error was reported when a thread local symbol has an invalid name. D100956 create a new symbol to prefix the TLS symbol name with a dot. When the symbol name is renamed, the error occurs. This patch uses the original symbol name (name in the symbol table) as the input for the symbol for TOC entry.

Reviewed By: shchenz, lkail

Differential Revision: https://reviews.llvm.org/D132348
2022-08-24 01:36:40 -04:00
Amaury Séchet
7e24354c8b [PowerPC] Autogenerate crbits.ll . NFC 2022-08-23 13:50:25 +00:00
Matthias Braun
b2542c40b9 RegisterClassInfo: Fix CSR cache invalidation
`RegisterClassInfo` caches information like allocation orders and reuses
it for multiple machine functions where possible. However the `MCPhysReg
*CalleeSavedRegs` field used to test whether the set of callee saved
registers changed did not work: After D28566
`MachineRegisterInfo::getCalleeSavedRegs()` can return dynamically
computed CSR sets that are only valid while the `MachineRegisterInfo`
object of the current function exists.

This changes the code to make a copy of the CSR list instead of keeping
a possibly invalid pointer around.

Differential Revision: https://reviews.llvm.org/D132080
2022-08-22 09:28:26 -07:00
Stefan Pintilie
1492c88f49 [PowerPC] Fix bugs in sign-/zero-extension elimination
This patch fixes the following two bugs in `PPCInstrInfo::isSignOrZeroExtended` helper, which is used from sign-/zero-extension elimination in PPCMIPeephole pass.
- Registers defined by load with update (e.g. LBZU) were identified as already sign or zero-extended. But it is true only for the first def (loaded value) and not for the second def (i.e. updated pointer).
- Registers defined by ORIS/XORIS were identified as already sign-extended. But, it is not true for sign extension depending on the immediate (while it is ok for zero extension).

To handle the first case, the parameter for the helpers is changed from `MachineInstr` to a register number to distinguish first and second defs. Also, this patch moves the initialization of PPCMIPeepholePass to allow mir test case.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D40554
2022-08-19 07:05:40 -05:00
Nick Desaulniers
6b0e2fa6f0 [SelectionDAG] make INLINEASM_BR use MachineBasicBlocks instead of BlockAddresses
As part of re-architecting callbr to no longer use blockaddresses
(https://reviews.llvm.org/D129288), we don't really need them in MIR.
They make comparing MachineBasicBlocks of indirect targets during
MachineVerifier a PITA.

Suggested by @efriedma from the discussion:
https://reviews.llvm.org/D130290#3669531

Reviewed By: efriedma, void

Differential Revision: https://reviews.llvm.org/D130316
2022-08-17 09:34:31 -07:00
Amy Kwan
a5bef98c75 [PowerPC][NFC] Add additional vector_shuffle tests involving scalar_to_vector.
This patch adds additional test cases involving vector_shuffles where either its
left, right or both inputs are scalar_to_vector nodes. These test cases involve
v16i8, v2i64, v4i32 and v8i16 vector shuffles, and were generated in preparation
for D130487.

Differential Revision: https://reviews.llvm.org/D130485
2022-08-15 12:30:58 -05:00
Filipp Zhinkin
1626ee6a95 [DAGCombine] Hoist shifts out of a logic operations tree.
Hoist and combine shift operations from logic operations tree:
logic (logic (SH x0, s), y), (logic (SH x1, s), z)  --> logic (SH (logic x0, x1), s), (logic y, z)

The transformation improves code generated for some cases related to the issue https://github.com/llvm/llvm-project/issues/49541.

Correctness:
https://alive2.llvm.org/ce/z/pVqVgY
https://alive2.llvm.org/ce/z/YVvT-q
https://alive2.llvm.org/ce/z/W5zTBq
https://alive2.llvm.org/ce/z/YfJsvJ
https://alive2.llvm.org/ce/z/3YSyDM
https://alive2.llvm.org/ce/z/Bs2kzk
https://alive2.llvm.org/ce/z/EoQpzU
https://alive2.llvm.org/ce/z/Jnc_5H
https://alive2.llvm.org/ce/z/_LP6k_
https://alive2.llvm.org/ce/z/KvZNC9

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D131189
2022-08-12 12:42:16 +03:00
Ting Wang
13c1e7a8aa [PowerPC] Fix test case changed by "Add XXEVAL TD pattern" [NFC] 2022-08-12 02:56:54 -04:00
Ting Wang
12e1936f64 [PowerPC] Add XXEVAL TD pattern
Add xxeval TD pattern for P10 on: eqv, nor, or, xor.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D131654
2022-08-12 01:27:24 -04:00
Chen Zheng
8d19cfb72e [PowerPC] omit location attribute for TLS variable on AIX
TLS debug on AIX is not ready for now.
The location generated in no-integrated-as mode is wrong and
in integrated-as mode causes AIX linker error.

Reviewed By: Esme

Differential Revision: https://reviews.llvm.org/D130245
2022-08-12 00:54:48 -04:00
esmeyi
7a70e6e224 [XCOFF] ignore the cold attribute.
Summary: AIX XCOFF doesn't support the cold feature.
    While it shouldn't be a function error when XCOFF catching the cold attribute.
    As with the behavior of other formats, we just ignore the attribute for now.

Reviewed By: DiggerLin

Differential Revision: https://reviews.llvm.org/D131473
2022-08-11 01:13:05 -04:00
Umesh Kalappa
9757f4f2dd [PowerPC] Don't use the S30 and S31 regs for the pic code
These changes to address issue
https://github.com/llvm/llvm-project/issues/55857.

Since R30/S30 is used as pointer (32 bits) for GOT Table in the ppc32 ABI,
remove it from the SPE callee save register when PIC is enabled.

This prevents emitting the SPE load and store for S30 and S31 regs.

Differential revision: https://reviews.llvm.org/D127495
2022-08-10 10:31:27 -05:00
Justin Hibbits
f43b228581 PowerPC: Don't hoist float multiply + add to fused operation on SPE
SPE doesn't have a fmadd instruction, so don't bother hoisting a
multiply and add sequence to this, as it'd become just a library call.
Hoisting happens too late for the CTR usability test to veto using the
CTR in a loop, and results in an assert "Invalid PPC CTR loop!".
2022-08-10 11:04:27 -04:00
Edd Barrett
fa250250b2
Migrate llvm.experimental.patchpoint() to ptr.
This intrinsic used a typed pointer for a call target operand. This
change updates the operand to be an opaque pointer and updates all
pointers in all test files that use the intrinsic.

Differential revision: https://reviews.llvm.org/D131261
2022-08-10 13:18:02 +01:00
Yuta Mukai
5357dd2f43 [MachinePipeliner] Fix Phi generation failure for large stages
The previous code overwrites VRMap for prologue stages during Phi
generation if a register spans many stages.
As a result, the wrong register is used as the one coming from
the prologue in Phis at later stages. (A process exists to correct
this, but it does not work in all cases.)
In addition, VRMap for prologue must be preserved until addBranches().

This patch fixes them by separating the map for Phis into a different
variable (VRMapPhi).

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D127840
2022-08-09 13:14:26 +09:00
Chen Zheng
d9004dfbab [PowerPC] mapping hardward loop intrinsics to powerpc pseudo
Map hardware loop intrinsics loop_decrement and set_loop_iteration
to the new PowerPC pseudo instructions, so that the hardware loop
intrinsics will be expanded to normal cmp+branch form or ctrloop
form based on the CTR register usage on MIR level.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D123366
2022-08-08 21:34:20 -04:00
Roland Froese
d6bd3d373e [DAGCombiner] Add some BE store forwarding tests; NFC
Add tests before D130115. NFC.
2022-08-08 16:33:01 -04:00
Chen Zheng
13016f1f1b [NFC] add test cases for D123366 2022-08-06 08:56:42 -04:00
Chen Zheng
ef60e44fe8 [PowerPC] fix stack size allocated for float point argument
This is for https://github.com/llvm/llvm-project/issues/56469

Allocate 4 bytes for float point arguments on PPC32.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D129558
2022-08-06 08:38:52 -04:00
Chen Zheng
e99ffe6ae8 [NFC] add test case for D129558 2022-08-05 23:15:48 -04:00
Amaury Séchet
1e15e24a76 [NFC] Autogenerate CodeGen/PowerPC/pzero-fp-xored.ll 2022-07-28 16:18:43 +00:00
Simon Pilgrim
69d5a038b9 [DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.

This is another step towards removing SelectionDAG::GetDemandedBits and just using TargetLowering::SimplifyMultipleUseDemandedBits.

There a few cases where we end up with extra register moves which I think we can accept in exchange for the increased ILP.

Differential Revision: https://reviews.llvm.org/D77804
2022-07-28 14:10:44 +01:00
Umesh Kalappa
f38ea84a9f [PowerPC] Change long to int64_t (which is always 64 bit or 8 bytes )
We can't guarantee the long always 64 bits like WINDOWS or LLP64 data
model (rare but we should consider).

So use int64_t from inttypes.h and safe in this case.

Fixes https://github.com/llvm/llvm-project/issues/55911 .
2022-07-27 09:34:45 -07:00
Fangrui Song
7225213c0a [LegacyPM] Remove {,PostInline}EntryExitInstrumenterPass
Following recent changes removing non-core features of the legacy
PM/optimization pipeline.
2022-07-23 15:30:15 -07:00
Stefan Pintilie
475a39fbc3 [PowerPC][NFC] Convert the MMA test cases to use opaque pointers.
This patch modifies only test cases.
Converted the MMA test cases to use opaque pointers.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D130090
2022-07-22 11:43:40 -05:00
Chen Zheng
bc5c637376 enable P10 vector builtins test on AIX 64 bit; NFC
Verify that P10 vector builtins with type `vector signed __int128`
and `vector unsigned __int128` work well on AIX 64 bit.
2022-07-21 04:23:02 -04:00
esmeyi
339392ecf2 [AIX] follow-up of D124654.
Emitting the remaining aliases instead of reporting
an error to avoid SPEC2017 PEAK failures.
And mark this as a TODO.
2022-07-21 01:10:09 -04:00
esmeyi
b1847ff068 [XCOFF] write the aux header when the visibility is specified in XCOFF32.
The n_type field in the symbol table entry has two interpretations in XCOFF32, and a single interpretation in XCOFF64.
The new interpretation is used in XCOFF32 if the value of the o_vstamp field in the auxiliary header is 2.
In XCOFF64 and the new XCOFF32 interpretation, the n_type field is used for the symbol type and visibility.
The patch writes the aux header with an o_vstamp field value of 2 when the visibility is specified in XCOFF32 to make the new XCOFF32 interpretation used.

Reviewed By: DiggerLin, jhenderson

Differential Revision: https://reviews.llvm.org/D128148
2022-07-20 07:09:34 -04:00
Simon Pilgrim
9fc347aa4e [DAG] PromoteIntRes_BUILD_VECTOR - extend constant boolean vectors according to target BooleanContents
PromoteIntRes_BUILD_VECTOR currently always ANY_EXTENDs build vector operands, but if this is a constant boolean vector we're losing the useful ability to keep the vector matching the BooleanContents mode used by the target.

This patch extends constant boolean vectors according to target BooleanContents, allowing a number of additional all-bits folds (notable XOR -> NOT conversions) to occur.

Differential Revision: https://reviews.llvm.org/D129641
2022-07-20 10:49:31 +01:00
esmeyi
28b1ba1c07 [PowerPC] Add an ISEL pattern for i32 MULLI.
We add the following ISEL pattern for i64 imm in D87384, this patch is for i32.
`mul with (2^N * int16_imm) -> MULLI + RLWINM`

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D129708
2022-07-18 04:40:51 -04:00
Nikita Popov
2a721374ae [IR] Don't use blockaddresses as callbr arguments
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:

    ; Before:
    %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
    to label %asm.fallthrough [label %foo]
    ; After:
    %res = callbr i8* asm "", "=r,r,!i"(i8* %x)
    to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:

* Allow unrolling/peeling/rotation of callbr, or any other
  clone-based optimizations
  (https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
  (https://github.com/llvm/llvm-project/issues/45248)

This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.

Differential Revision: https://reviews.llvm.org/D129288
2022-07-15 10:18:17 +02:00
Simon Pilgrim
64ffcba1f8 [PowerPC] Regenerate pr35402.ll test checks 2022-07-13 11:01:44 +01:00
esmeyi
100319cdb4 [AIX] follow-up of D124654.
Report an error when alias symbols are not emitted all.
2022-07-13 03:39:08 -04:00
Kai Nacke
42f7364fcb [GISel] Check useLoadStackGuardNode() before generating LOAD_STACK_GUARD
When lowering llvm::stackprotect intrinsic, the SDAG implementation
checks useLoadStackGuardNode() to either create a LOAD_STACK_GUARD or use
the first argument of the intrinsic. This check is not present in the
IRTranslator, which results in always generating a LOAD_STACK_GUARD even
if the target does not support it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D129505
2022-07-12 11:44:42 -04:00
Nikita Popov
4bb7b6fae3 [IR] Remove support for float binop constant expressions
As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
this removes support for the floating-point binop constant expressions
fadd, fsub, fmul, fdiv and frem.

As part of this change, the C APIs LLVMConstFAdd, LLVMConstFSub,
LLVMConstFMul, LLVMConstFDiv and LLVMConstFRem are removed.
The LLVMBuild APIs should be used instead.

Differential Revision: https://reviews.llvm.org/D129478
2022-07-12 09:40:49 +02:00
Sanjay Patel
8b75671314 [SDAG] try to replace subtract-from-constant with xor
This is almost the same as the abandoned D48529, but it
allows splat vector constants too.

This replaces the x86-specific code that was added with
the alternate patch D48557 with the original generic
combine.

This transform is a less restricted form of an existing
InstCombine and the proposed SDAG equivalent for that
in D128080:
https://alive2.llvm.org/ce/z/OUm6N_

Differential Revision: https://reviews.llvm.org/D128123
2022-07-08 08:14:24 -04:00
Nikita Popov
9936d732cd [PowerPC] Simplify test for PR33636 (NFC)
There was a lot of unnecessary code here. Add the -O0 flag to
avoid using constant expressions, otherwise it may get folded
away during EarlyCSE.

Verified that this test fails prior to the fixing commit.
2022-07-06 09:47:42 +02:00
Jay Foad
3ff319c690 [PowerPC] PPCTLSDynamicCall does not preserve LiveIntervals
According to D127731, PPCTLSDynamicCall does not preserve
LiveIntervals, so stop claiming that it does and remove the code
that tried to repair them. NFCI.

Differential Revision: https://reviews.llvm.org/D128421
2022-07-05 20:09:42 +01:00
esmeyi
d2a35e4d39 [AIX] Handling the label alignment of a global
variable with its multiple aliases.

This patch handles the case where a variable has
multiple aliases.
AIX's assembly directive .set is not usable for the
aliasing purpose, and using different labels allows
AIX to emulate symbol aliases. If a value is emitted
between any two labels, meaning they are not aligned,
XCOFF will automatically calculate the offset for them.

This patch implements:
1) Emits the label of the alias just before emitting
the value of the sub-element that the alias referred to.
2) A set of aliases that refers to the same offset
should be aligned.
3) We didn't emit aliasing labels for common and
zero-initialized local symbols in
PPCAIXAsmPrinter::emitGlobalVariableHelper, but
emitted linkage for them in
AsmPrinter::emitGlobalAlias, which caused a FAILURE.
This patch fixes the bug by blocking emitting linkage
for the alias without a label.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D124654
2022-07-03 23:16:16 -04:00
Chen Zheng
370127b7d5 [XCOFF] change default program code csect alignment to 32
This is the same with commercial XLC on AIX.

Reviewed By: Esme

Differential Revision: https://reviews.llvm.org/D114419
2022-06-29 04:16:01 +00:00
Ting Wang
88b6d22791 [PowerPC] Improve getNormalLoadInput to reach more splat load
opportunities

There are straight forward splat load opportunities blocked by
getNormalLoadInput(), since those cases involve consecutive bitcasts.
Improve by looking through bitcasts.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D128703
2022-06-28 08:02:49 -04:00
Ting Wang
22b8f3511a [PowerPC] Add base test case for load splat opportunity
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D128718
2022-06-28 06:55:23 -04:00
Nikita Popov
217e85761c [ArgPromotion] Remove legacy PM support
Support for the legacy pass manager in ArgPromotion causes
complications in D125485. As the legacy pass manager for middle-end
optimizations is unsupported, drop ArgPromotion from the legacy
pipeline, rather than introducing additional complexity to deal
with it.

Differential Revision: https://reviews.llvm.org/D128536
2022-06-27 09:42:17 +02:00
Kai Luo
106657df4c [PowerPC][AIX] Fix assertion message on AIX. NFC.
Fixes build https://lab.llvm.org/buildbot/#/builders/214/builds/1980.
2022-06-24 12:03:57 +08:00