3618 Commits

Author SHA1 Message Date
Matt Arsenault
eece6ba283 IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics
AMDGPU has native instructions and target intrinsics for this, but
these really should be subject to legalization and generic
optimizations. This will enable legalization of f16->f32 on targets
without f16 support.

Implement a somewhat horrible inline expansion for targets without
libcall support. This could be better if we could introduce control
flow (GlobalISel version not yet implemented). Support for strictfp
legalization is less complete but works for the simple cases.
2023-06-06 17:07:18 -04:00
JP Lehr
c9998ec145 Revert "[DAGCombine] Make sure combined nodes are added back to the worklist in topological order."
This reverts commit e69fa03ddd85812be3143d79a0359c3e8d43bd45.

This patch lead to build time outs on the AMDGPU OpenMP runtime
buildbot.
2023-06-05 10:55:58 -04:00
Amaury Séchet
e69fa03ddd [DAGCombine] Make sure combined nodes are added back to the worklist in topological order.
Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D127115
2023-06-05 11:09:18 +00:00
Qiu Chaofan
9e17e08324 [PowerPC] Combine fptoint-store under strict cases
Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D141249
2023-06-05 16:24:02 +08:00
esmeyi
6f57d8df2d Revert "[XCOFF][DWARF] XCOFF64 should be able to select the dwarf format in intergrated-as mode."
This reverts commit 4054c68644dfebbb584bca698a25d18d1d312bae.

Due to AIX system linker requires DWARF64 for XCOFF64.
2023-06-05 02:50:47 -04:00
Qiu Chaofan
69bc8ff766 Reland "[PowerPC] Simplify fp-to-int store optimization"
The build failure should be fixed by de681d53. Follow-up refactor will
be done in future patches.

This reverts commit e7c5ced0b9f0551ea17e1d2b48be86f03a772c59.
2023-06-05 13:53:08 +08:00
sgokhale
c4a60c9d34 [CodeGen][ShrinkWrap] Enable PostShrinkWrap by default
This is an attempt to reland D42600 and enabling this optimisation by default.

This also resolves the issue pointed out in the context of PGO build.

Differential Revision: https://reviews.llvm.org/D42600
2023-05-25 13:56:29 +05:30
Vitaly Buka
e7c5ced0b9 Revert "[PowerPC] Simplify fp-to-int store optimization"
Breaks https://lab.llvm.org/buildbot/#/builders/18/builds/9118

This reverts commit 8064caf83fb166b709bfe0e7641c5181341cb064.
2023-05-24 10:05:28 -07:00
Nemanja Ivanovic
de681d53ba [PowerPC] Do not attempt to combine fptoui without FPCVT
Commit 8064caf83fb166b709bfe0e7641c5181341cb064 added a call
to a function that performs this combine without checking whether
the target supports FPCVT. This caused asserts to trip on BE bots
as the default target does not have this feature.
2023-05-24 11:14:26 -05:00
Fangrui Song
e018cbf720 [IR] Make stack protector symbol dso_local according to -f[no-]direct-access-external-data
There are two motivations.

`-fno-pic -fstack-protector -mstack-protector-guard=global` created
`__stack_chk_guard` is referenced directly on all ELF OSes except FreeBSD.
This patch allows referencing the symbol indirectly with
-fno-direct-access-external-data.

Some Linux kernel folks want
`-fno-pic -fstack-protector -mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard`
created `__stack_chk_guard` to be referenced directly, avoiding
R_X86_64_REX_GOTPCRELX (even if the relocation may be optimized out by the linker).
https://github.com/llvm/llvm-project/issues/60116
Why they need this isn't so clear to me.

---

Add module flag "direct-access-external-data" and set the dso_local property of
the stack protector symbol. The module flag can benefit other LLVMCodeGen
synthesized symbols that are not represented in LLVM IR.

Nowadays, with `-fno-pic` being uncommon, ideally we should set
"direct-access-external-data" when it is true. However, doing so would require
~90 clang/test tests to be updated, which are too much.

As a compromise, we set "direct-access-external-data" only when it's different
from the implied default value.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D150841
2023-05-23 09:49:57 -07:00
Qiu Chaofan
8064caf83f [PowerPC] Simplify fp-to-int store optimization
On PowerPC VSX targets, fp-to-int will be transformed into xscv with
mfvsr. When the result is to be stored, mfvsr can be replaced by a
direct store.

This change simplifies the optimization by using existing fp-to-int
code, which helps CSE and handling strictfp cases.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D141473
2023-05-23 16:40:54 +08:00
Kai Luo
330319557f [PowerPC] Precommit test for D151055. NFC. 2023-05-22 12:14:22 +08:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
esmeyi
4054c68644 [XCOFF][DWARF] XCOFF64 should be able to select the dwarf format in intergrated-as mode.
Summary: DWARF32 is not supported for XCOFF64 under non-integrated-as mode on AIX, because system assembler will fill the debug section lengths according to DWARF64 format. While in intergrated-as mode, XCOFF64 should be able to select the DWARF format.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D150181
2023-05-16 03:02:00 -04:00
Florian Hahn
5d57a9fd2b
[PowerPC] Adjust tests after e351b9b66da088.
Those tests were missed when landing e351b9b66da088.
2023-05-12 20:20:13 +01:00
Felipe de Azevedo Piovezan
33b69b9756 [YamlMF] Serialize EntryValueObjects
This commit implements the serialization and deserialization of the Machine
Function's EntryValueObjects.

Depends on D149879, D149778

Differential Revision: https://reviews.llvm.org/D149880
2023-05-11 10:20:05 -04:00
Alan Zhao
f4999d3535 Revert "[CodeGen][ShrinkWrap] Split restore point"
This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4.

The original commit causes a Chrome build assertion failure with
ThinLTO: https://crbug.com/1443635
2023-05-08 16:27:59 -07:00
Stefan Pintilie
be95b4dec2 [PowerPC] Look through OR, AND, XOR instructions when checking a clear.
This patch adds the additional step of looking through AND, OR, XOR
instructions when we check the number of leading zeros.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D149223
2023-05-08 14:25:20 -04:00
sgokhale
1ddfd1c818 [CodeGen][ShrinkWrap] Split restore point
Try to reland D42600

Differential Revision: https://reviews.llvm.org/D42600
2023-05-08 13:21:07 +05:30
sgokhale
7cba800104 [CodeGen] Autogen tests as prerequisite for D42600
Autogenerating tests as suggested in D42600
2023-05-08 12:25:51 +05:30
Florian Hahn
4e2b4f97a0
[ShrinkWrap] Use underlying object to rule out stack access.
Allow shrink-wrapping past memory accesses that only access globals or
function arguments. This patch uses getUnderlyingObject to try to
identify the accessed object by a given memory operand. If it is a
global or an argument, it does not access the stack of the current
function and should not block shrink wrapping.

Note that the caller's stack may get accessed when passing an argument
via the stack, but not the stack of the current function.

This addresses part of the TODO from D63152.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D149668
2023-05-03 09:28:07 +01:00
Nikita Popov
0659000ff7 [LICM] Don't duplicate instructions just because they're free
D37076 makes LICM duplicate instructions into exit blocks if the
instruction is free. For GEPs, the motivation appears to be that
this allows the GEP to be folded into addressing modes, while
non-foldable users outside the loop might prevent this. TBH I don't
think LICM is the place to do this (why doesn't CGP apply this
heuristic itself?) but at least I understand the motivation.

However, the transform is also applied to all other "free"
instructions, which are just that (removed during lowering and not
"folded" in some way). For such instructions, this transform seems
somewhere between useless, counter-productive (undoing CSE/GVN) and
actively incorrect. For example, this transform can duplicate freeze
instructions, which is illegal.

This patch limits the transform to just foldable GEPs, though we
might want to drop it from LICM entirely as a followup.

This is a small compile-time improvement, because querying TTI cost
model for every single instruction is expensive.

Differential Revision: https://reviews.llvm.org/D149136
2023-04-28 14:31:23 +02:00
ManuelJBrito
8b56da5e9f [IR] Change shufflevector undef mask to poison
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.

Differential Revision: https://reviews.llvm.org/D149210
2023-04-27 14:41:10 +01:00
Fangrui Song
398d68f624 [PPCMIPeephole] Fix incorrect compare elimination
D38236 moves a redundant compare instruction from the loop body to the
preheader.

It has a bug: when `MBB1 == &MBB2`, there may be only one compare instruction in the
loop. The code will lift the compare instruction to the preheader, failing to
account for the change of the compare result in a tail call, leading to a miscompile.

Suppress the compare elimination to fix https://github.com/llvm/llvm-project/issues/62294

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D149030
2023-04-24 10:02:06 -07:00
Maryam Moghadas
e5de760c31 [PowerPC] Add a new test for vperm with a swapped vector operand and a constant pool
This patch adds a new test that includes a vperm instruction with xxswapd as its
vector operand on little-endian Power8. The test demonstrates the constant pool
for the mask operand, which is intended to indicate the optimization of vperm
and the modification of the constant pool in subsequent patches.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D148942
2023-04-24 15:41:23 +00:00
Stefan Pintilie
1162a38685 [NFC][PowerPC] Added a test case to show extra clear instructions.
Added a number of functions that have a clear instruction that is not
actually required. This test is added first and then a patch will be
added later in order to remove the unnecessary instructions.
2023-04-24 09:47:44 -04:00
David Tenty
8d2e9fc855 [PowerPC] Add function pointer alignment to DataLayout
The alignment of function pointers was added to the Datalayout by
D57335 but currently is unset for the Power target. This will cause us
to compute a conservative minimum alignment of one if places like
Value::getPointerAlignment.

This patch implements the function pointer alignment in the Datalayout
for the Power backend and Power targets in clang, so we can query the
value for a particular Power target.

We come up with the correct value one of two ways:

- If the target uses function descriptor objects (i.e. ELFv1 & AIX ABIs),
  then a function pointer points to the descriptor, so use the alignment
  we would emit the descriptor with.
- If the target doesn't use function descriptor objects (i.e. ELFv2), a
  function pointer points to the global entry point, so use the minimum
  alignment for code on Power (i.e. 4-bytes).

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D147016
2023-04-18 13:00:27 -04:00
Kai Luo
eee024bf1b [PowerPC] Update incr after resetting the register in MI
After performing signed extension, we update the register in MI. We should also update `incr` register which is tracking the register in `MI`.

Fixes https://github.com/llvm/llvm-project/issues/61882.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D147594
2023-04-14 17:36:30 +08:00
sgokhale
bb5befefc6 Revert "[CodeGen][ShrinkWrap] Split restore point"
This reverts commit 5f0bccc3d1a74111458c71f009817c9995f4bf83.

An issue has been reported here: https://github.com/ClangBuiltLinux/linux/issues/1833
2023-04-13 10:52:28 +05:30
Nikita Popov
b8917ac62a [LICM] Reassociate GEPs to allow hoisting
Reassociate gep (gep ptr, idx1), idx2 to gep (gep ptr, idx2), idx1
if this would make the inner GEP loop invariant and thus hoistable.

This is intended to replace an InstCombine fold that does this (in
04f61fb73d/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (L2006)).
The problem with the InstCombine fold is that LoopInfo is an optional
dependency, so it is not performed reliably.

Differential Revision: https://reviews.llvm.org/D146813
2023-04-11 10:34:04 +02:00
sgokhale
5f0bccc3d1 [CodeGen][ShrinkWrap] Split restore point
This patch splits a restore point to allow it to only post-dominate blocks reachable by use
or def of CSRs(Callee Saved Registers)/FI(Frame Index).

Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change
for others.

Co-authored-by: junbuml

Differential Revision: https://reviews.llvm.org/D42600
2023-04-11 11:58:50 +05:30
Maryam Moghadas
6dbb2a717a [PowerPC] Update pr61315.ll to address D146632 failure
This patch is to update pr61315.ll what was needed as part of
D146632 and caused build failures.

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D147675
2023-04-05 21:24:59 -05:00
Maryam Moghadas
cf0395f816 [PowerPC] Fix the xxperm swap requirements
This patch is to fix the xxperm vector operand swap condition so that the
single-use operand is in V2 to prevent copying, it also fixes the subtarget
condition to exploit the xpperm.

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D146632
2023-04-05 20:13:40 -05:00
Kai Luo
4639653492 [PowerPC] Precommit test case for issue 61882. NFC. 2023-04-05 16:25:12 +08:00
Nikita Popov
ab232c9ddf [PowerPC] Convert tests to opaque pointers (NFC) 2023-04-04 12:11:26 +02:00
Nikita Popov
39eb7ae9c9 [PowerPC] Name instructions in tests (NFC) 2023-04-04 12:08:03 +02:00
Zequan Wu
321d02cc6b [NFC] Update CodeGen/*/nomerge.ll tests with utils/update_llc_test_checks.py.
Precommit this patch for better diff view on D146749.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D147454
2023-04-03 19:52:39 -04:00
Stefan Pintilie
be15db8cf2 [PowerPC][NFC] Forgot to add requires asserts to ppc-TOC-stats.ll
When I sumbitted the original patch I forgot to add that:
GodeGen/PowerPC/ppc-TOC-stats.ll
requires asserts.
Added that now.
2023-04-03 19:45:27 -04:00
Stefan Pintilie
398effac36 [PowerPC] Add statistics to show the number of entries in the TOC.
On Power PC some data is stored in the TOC. This pass adds statistics
to show how many entries are emitted to the TOC and what types of
entries those are.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D146325
2023-04-03 14:20:51 -04:00
Qiu Chaofan
5b8ea2d0e1 [PowerPC] Lower IS_FPCLASS by test data class instruction
Power ISA 3.0 introduced new 'test data class' instructions, which
accept flags for: NaN/Infinity/Zero/Denormal. This instruction can be
used to implement custom lowering for llvm.is.fpclass, but some extra
bits provided by the intrinsic are missing (normal and QNaN/SNaN).

For those categories not natively supported, this patch uses a two-way
or three-way combination to implement correct behavior.

Reviewed By: sepavloff, shchenz

Differential Revision: https://reviews.llvm.org/D140381
2023-04-03 11:37:17 +08:00
Qiongsi Wu
f624372ccb [AIX][CodeGen] Renaming mroptr to xcoff-mroptr
This patch renames the `mroptr` option to `mxcoff-roptr` to indicate in the option itself that it is xcoff specific.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D147161
2023-03-31 10:09:48 -04:00
Amy Kwan
3508f12335 [PowerPC][GISel] Add initial GlobalISel support for vector functions.
This patch adds the initial support for vector functions and register banks
within GlobalISel. With this patch, we are able to support simple functions that
return vectors, and also functions that perform simple operations.

This patch also:
- Legalizes vector types for G_AND, G_OR, G_XOR, G_ADD, G_SUB, G_BITCAST, G_FADD, G_FSUB
- Introduce initial support for bitcasting (that will need to be extended upon)
- Add various different test cases to for test vector support within GlobalISel

Differential Revision: https://reviews.llvm.org/D137785
2023-03-27 08:23:05 -05:00
Amy Kwan
6126356d82 [PowerPC] Implement 64-bit ELFv2 Calling Convention in TableGen (for integers/floats/vectors in registers)
This patch partially implements the parameter passing rules outlined in the
ELFv2 ABI within TableGen. Specifically, it implements the parameter assignment
of integers, floats, and vectors within registers - where the GPR numbering will
be "skipped" depending on the ordering of floats and vectors that appear within
a parameter list.

As we begin to adopt GlobalISel to the PowerPC backend, there is a need for a
TableGen definition that encapsulates the ELFv2 parameter passing rules. Thus,
this patch also changes the default calling convention that is returned within
the ccAssignFnForCall() function used in our GlobalISel implementation, and also
adds some additional testing of the calling convention that is implemented.

Future patches that build on top of this initial TableGen definition will aim to
add more of the ABI complexities, including support for additional types and
also in-memory arguments.

Differential Revision: https://reviews.llvm.org/D137504
2023-03-27 08:23:04 -05:00
Nemanja Ivanovic
e7c35d7100 [SelectionDAG] Correctly reduce BV to shuffle with zero on big endian
This DAG combine is correct on little endian targets but
is incorrect on big endian targets.
Add big endian code to correct it.

Differential revision: https://reviews.llvm.org/D146460
2023-03-24 10:57:17 -04:00
Qiongsi Wu
4f9929add5 [AIX][CodeGen] Storage Locations for Constant Pointers
This patch adds an `llc` option `-mroptr` to specify storage locations for constant pointers on AIX.

When the `-mroptr` option is specified, constant pointers, virtual function tables, and virtual type tables are placed in read-only storage. Otherwise, by default, pointers, virtual function tables, and virtual type tables are placed are placed in read/write storage.

https://reviews.llvm.org/D144190 enables the `-mroptr` option for `clang`.

Reviewed By: hubert.reinterpretcast, stephenpeckham, myhsu, MaskRay, serge-sans-paille

Differential Revision: https://reviews.llvm.org/D144189
2023-03-23 09:44:47 -04:00
esmeyi
49dcd08c3d [XCOFF] support the ref directive for object generation.
Summary: A R_REF relocation as a non-relocating reference is required to prevent garbage collection (by the binder) of the ref symbol in object generation.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D144356
2023-03-23 05:09:47 -04:00
Ting Wang
f64dc9bc6e [PowerPC][NFC] add const-nonsplat-array-init.ll
When doing store constant vector/scalar, some duplicated values can be reused.
Add test case and will show combiner can improve these.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D146500
2023-03-22 00:32:18 -04:00
Nemanja Ivanovic
6ee4ea8e2f [PowerPC][NFC] Test needs to include constant pool values 2023-03-20 16:43:59 -05:00
Nemanja Ivanovic
da40f7e8b1 [PowerPC][NFC] Pre-commit a test case for upcoming patch 2023-03-20 15:42:07 -05:00
Nikita Popov
687b5b9a0c [SCEVExpander] Always use scevgep as name
With opaque pointers the scevgep / uglygep distinction no longer
makes sense -- GEPs are always emitted in offset-based representation.
2023-03-17 14:27:03 +01:00