This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4.
The original commit causes a Chrome build assertion failure with
ThinLTO: https://crbug.com/1443635
This patch adds the additional step of looking through AND, OR, XOR
instructions when we check the number of leading zeros.
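As a rough illustration of the idea (not the patch's actual code), a lower bound on the number of leading zeros can be propagated through bitwise operators: an AND keeps the larger of its operands' bounds, while OR and XOR keep the smaller. The expression encoding, the `("zext", n)` leaf, the 32-bit width and the name `min_leading_zeros` are all hypothetical.

```python
WIDTH = 32  # assumed bit width for this sketch


def min_leading_zeros(expr):
    """Return a lower bound on the leading zero bits of `expr`.

    `expr` is one of (hypothetical encoding):
      - an int constant,
      - ("zext", n): an unknown value zero-extended from n bits,
      - (op, lhs, rhs) with op in {"and", "or", "xor"}.
    """
    if isinstance(expr, int):
        return WIDTH - expr.bit_length()
    if expr[0] == "zext":
        return WIDTH - expr[1]
    op, lhs, rhs = expr
    left = min_leading_zeros(lhs)
    right = min_leading_zeros(rhs)
    if op == "and":
        # A result bit can only be set if it is set in both operands.
        return max(left, right)
    if op in ("or", "xor"):
        # A result bit can be set if it is set in either operand.
        return min(left, right)
    raise ValueError(f"unsupported operation: {op}")
```

For example, `min_leading_zeros(("and", ("zext", 8), 0xFFFF))` looks through the AND and reports the bound of the narrower operand.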
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D149223
Allow shrink-wrapping past memory accesses that only access globals or
function arguments. This patch uses getUnderlyingObject to try to
identify the accessed object by a given memory operand. If it is a
global or an argument, it does not access the stack of the current
function and should not block shrink wrapping.
Note that the caller's stack may get accessed when passing an argument
via the stack, but not the stack of the current function.
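The decision described above can be sketched as follows. This is a simplified model, not the pass itself: the string labels stand in for whatever object the underlying-object walk finds, and both function names are hypothetical.

```python
def may_access_current_stack(underlying_kind):
    """Return True if a memory operand might touch the current function's stack.

    `underlying_kind` is a hypothetical label for the object the operand was
    traced back to: "global", "argument", or anything else (e.g. "alloca",
    "unknown"). Only globals and arguments are known to be safe.
    """
    return underlying_kind not in ("global", "argument")


def blocks_shrink_wrapping(memory_operand_kinds):
    """An instruction blocks shrink wrapping if any operand may hit the stack."""
    return any(may_access_current_stack(kind) for kind in memory_operand_kinds)
```

Anything that cannot be identified as a global or an argument is treated conservatively as a possible stack access.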
This addresses part of the TODO from D63152.
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D149668
D37076 makes LICM duplicate instructions into exit blocks if the
instruction is free. For GEPs, the motivation appears to be that
this allows the GEP to be folded into addressing modes, while
non-foldable users outside the loop might prevent this. TBH I don't
think LICM is the place to do this (why doesn't CGP apply this
heuristic itself?) but at least I understand the motivation.
However, the transform is also applied to all other "free"
instructions, which are just that (removed during lowering and not
"folded" in some way). For such instructions, this transform seems
somewhere between useless, counter-productive (undoing CSE/GVN) and
actively incorrect. For example, this transform can duplicate freeze
instructions, which is illegal.
This patch limits the transform to just foldable GEPs, though we
might want to drop it from LICM entirely as a followup.
This is a small compile-time improvement, because querying the TTI cost
model for every single instruction is expensive.
Differential Revision: https://reviews.llvm.org/D149136
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.
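A small sketch of the printing rule, assuming undefined mask elements are represented by a negative index (LLVM uses -1 for this). The helper name `format_mask` is hypothetical and only mimics the textual form of a mask operand.

```python
def format_mask(mask):
    """Render a shuffle mask, printing undefined elements as poison.

    Negative indices are assumed to mark undefined mask elements.
    """
    elems = ", ".join("i32 poison" if m < 0 else f"i32 {m}" for m in mask)
    return f"<{len(mask)} x i32> <{elems}>"
```

With this rule, `format_mask([0, -1, 2, -1])` yields `<4 x i32> <i32 0, i32 poison, i32 2, i32 poison>`.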
Differential Revision: https://reviews.llvm.org/D149210
D38236 moves a redundant compare instruction from the loop body to the
preheader.
It has a bug: when `MBB1 == &MBB2`, there may be only one compare instruction in the
loop. The code will lift the compare instruction to the preheader, failing to
account for the change of the compare result in a tail call, leading to a miscompile.
Suppress the compare elimination to fix https://github.com/llvm/llvm-project/issues/62294
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D149030
This patch adds a new test that includes a vperm instruction with xxswapd as its
vector operand on little-endian Power8. The test shows the constant pool
for the mask operand, so that subsequent patches can demonstrate the optimization
of vperm and the resulting modification of the constant pool.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D148942
Adds a number of functions that contain a clear instruction that is not
actually required. The test is added first; a later patch will
remove the unnecessary instructions.
The alignment of function pointers was added to the Datalayout by
D57335 but currently is unset for the Power target. This will cause us
to compute a conservative minimum alignment of one in places like
Value::getPointerAlignment.
This patch implements the function pointer alignment in the Datalayout
for the Power backend and Power targets in clang, so we can query the
value for a particular Power target.
We determine the correct value in one of two ways:
- If the target uses function descriptor objects (i.e. ELFv1 & AIX ABIs),
then a function pointer points to the descriptor, so use the alignment
we would emit the descriptor with.
- If the target doesn't use function descriptor objects (i.e. ELFv2), a
function pointer points to the global entry point, so use the minimum
alignment for code on Power (i.e. 4-bytes).
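The two cases above can be sketched as a small selection function. The ABI labels, the function name and the `descriptor_alignment` parameter are hypothetical; the descriptor alignment is passed in because the text does not state its value.

```python
MIN_CODE_ALIGNMENT = 4  # minimum alignment for code on Power, in bytes


def function_pointer_alignment(abi, descriptor_alignment):
    """Pick the function pointer alignment for a Power ABI.

    `abi` is one of "ELFv1", "AIX", "ELFv2" (labels assumed for this sketch).
    `descriptor_alignment` is whatever alignment the descriptor is emitted with.
    """
    if abi in ("ELFv1", "AIX"):
        # The pointer refers to a function descriptor object.
        return descriptor_alignment
    if abi == "ELFv2":
        # The pointer refers to the global entry point, i.e. to code.
        return MIN_CODE_ALIGNMENT
    raise ValueError(f"unknown ABI: {abi}")
```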
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D147016
Reassociate gep (gep ptr, idx1), idx2 to gep (gep ptr, idx2), idx1
if this would make the inner GEP loop invariant and thus hoistable.
This is intended to replace an InstCombine fold that does this (in
04f61fb73d/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (L2006)).
The problem with the InstCombine fold is that LoopInfo is an optional
dependency, so it is not performed reliably.
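To illustrate why the reassociation helps, the sketch below models address arithmetic with plain list indexing. It is only an analogy for the IR transform: `inv` plays the role of the loop-invariant index and `i` the loop-varying one, and both function names are hypothetical.

```python
def sum_original(data, inv, n):
    """gep (gep ptr, i), inv: the inner address depends on the loop variable."""
    total = 0
    for i in range(n):
        inner = 0 + i          # inner GEP, recomputed every iteration
        total += data[inner + inv]
    return total


def sum_reassociated(data, inv, n):
    """gep (gep ptr, inv), i: the inner address is invariant and hoisted."""
    inner = 0 + inv            # inner GEP, computed once outside the loop
    total = 0
    for i in range(n):
        total += data[inner + i]
    return total
```

Both functions read the same elements; only the place where the invariant part of the address is computed changes.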
Differential Revision: https://reviews.llvm.org/D146813
This patch splits a restore point to allow it to only post-dominate blocks reachable by use
or def of CSRs (Callee Saved Registers)/FI (Frame Index).
On SPEC2017, this gives around a 4% improvement on povray and no significant change
for the others.
Co-authored-by: junbuml
Differential Revision: https://reviews.llvm.org/D42600
This patch updates pr61315.ll, which was needed as part of
D146632 and caused build failures.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D147675
This patch fixes the xxperm vector operand swap condition so that the
single-use operand is in V2 to prevent copying. It also fixes the subtarget
condition to exploit xxperm.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D146632
On PowerPC, some data is stored in the TOC. This pass adds statistics
to show how many entries are emitted to the TOC and what types of
entries those are.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D146325
Power ISA 3.0 introduced new 'test data class' instructions, which
accept flags for: NaN/Infinity/Zero/Denormal. These instructions can be
used to implement custom lowering for llvm.is.fpclass, but some extra
bits provided by the intrinsic are missing (normal and QNaN/SNaN).
For those categories not natively supported, this patch uses a two-way
or three-way combination to implement correct behavior.
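One plausible way to build the missing categories from the native ones is sketched below on raw IEEE 754 double bit patterns. This is an illustration, not the lowering the patch emits: "normal" is taken as "none of the four native classes", and SNaN/QNaN combine the NaN flag with a check of the quiet bit. All names are hypothetical.

```python
import struct

EXP_MASK = 0x7FF
FRAC_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51  # most significant fraction bit of a double


def bits_of(x):
    """Return the 64-bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def native_flags(bits):
    """Model the four natively testable classes for a double bit pattern."""
    exp = (bits >> 52) & EXP_MASK
    frac = bits & FRAC_MASK
    return {
        "nan": exp == EXP_MASK and frac != 0,
        "inf": exp == EXP_MASK and frac == 0,
        "zero": exp == 0 and frac == 0,
        "denormal": exp == 0 and frac != 0,
    }


def is_normal(bits):
    """Normal means: not NaN, not infinity, not zero, not denormal."""
    return not any(native_flags(bits).values())


def is_snan(bits):
    """Signaling NaN: NaN flag set and quiet bit clear."""
    return native_flags(bits)["nan"] and not (bits & QUIET_BIT)


def is_qnan(bits):
    """Quiet NaN: NaN flag set and quiet bit set."""
    return native_flags(bits)["nan"] and bool(bits & QUIET_BIT)
```

The NaN helpers take bit patterns rather than floats so that the signaling/quiet distinction is not lost in conversion.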
Reviewed By: sepavloff, shchenz
Differential Revision: https://reviews.llvm.org/D140381
This patch renames the `mroptr` option to `mxcoff-roptr` to indicate in the option itself that it is XCOFF-specific.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D147161
This patch adds the initial support for vector functions and register banks
within GlobalISel. With this patch, we are able to support simple functions that
return vectors, and also functions that perform simple operations.
This patch also:
- Legalizes vector types for G_AND, G_OR, G_XOR, G_ADD, G_SUB, G_BITCAST, G_FADD, G_FSUB
- Introduces initial support for bitcasting (which will need to be extended)
- Adds various test cases for vector support within GlobalISel
Differential Revision: https://reviews.llvm.org/D137785
This patch partially implements the parameter passing rules outlined in the
ELFv2 ABI within TableGen. Specifically, it implements the parameter assignment
of integers, floats, and vectors within registers - where the GPR numbering will
be "skipped" depending on the ordering of floats and vectors that appear within
a parameter list.
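The "skipping" can be pictured with a simplified model of register assignment. The sketch assumes the usual ELFv2 register ranges (r3 to r10, f1 to f13, v2 to v13), that a float shadows one GPR, and that a vector is quadword aligned and shadows two GPRs. It ignores aggregates, varargs and other rules, and the function name and the `"int"`/`"float"`/`"vector"` labels are hypothetical.

```python
def assign_registers(param_kinds):
    """Assign each parameter a register name, or "memory" if none is left.

    `param_kinds` is a list of "int", "float" or "vector" (assumed labels).
    """
    slot = 0  # next doubleword of the parameter area; slots 0..7 shadow r3..r10
    fpr = 1   # next floating-point register, f1..f13
    vr = 2    # next vector register, v2..v13
    assigned = []
    for kind in param_kinds:
        if kind == "int":
            assigned.append(f"r{3 + slot}" if slot < 8 else "memory")
            slot += 1
        elif kind == "float":
            assigned.append(f"f{fpr}" if fpr <= 13 else "memory")
            fpr += 1
            slot += 1          # the GPR shadowing this doubleword is skipped
        elif kind == "vector":
            slot += slot % 2   # align to a quadword boundary
            assigned.append(f"v{vr}" if vr <= 13 else "memory")
            vr += 1
            slot += 2          # two GPRs are skipped
        else:
            raise ValueError(f"unknown parameter kind: {kind}")
    return assigned
```

For a parameter list of int, float, int, the second integer lands in r5 rather than r4 because the float skipped one GPR.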
As we begin to adopt GlobalISel in the PowerPC backend, there is a need for a
TableGen definition that encapsulates the ELFv2 parameter passing rules. Thus,
this patch also changes the default calling convention that is returned within
the ccAssignFnForCall() function used in our GlobalISel implementation, and also
adds some additional testing of the calling convention that is implemented.
Future patches that build on top of this initial TableGen definition will aim to
add more of the ABI complexities, including support for additional types and
also in-memory arguments.
Differential Revision: https://reviews.llvm.org/D137504
This DAG combine is correct on little endian targets but
is incorrect on big endian targets.
Add big endian code to correct it.
Differential revision: https://reviews.llvm.org/D146460
This patch adds an `llc` option `-mroptr` to specify storage locations for constant pointers on AIX.
When the `-mroptr` option is specified, constant pointers, virtual function tables, and virtual type tables are placed in read-only storage. Otherwise, by default, pointers, virtual function tables, and virtual type tables are placed in read/write storage.
https://reviews.llvm.org/D144190 enables the `-mroptr` option for `clang`.
Reviewed By: hubert.reinterpretcast, stephenpeckham, myhsu, MaskRay, serge-sans-paille
Differential Revision: https://reviews.llvm.org/D144189
Summary: An R_REF relocation as a non-relocating reference is required to prevent garbage collection (by the binder) of the ref symbol in object generation.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D144356
When storing a constant vector/scalar, some duplicated values can be reused.
Add a test case to show that the combiner can improve these.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D146500
Summary:
In function PPCAIXAsmPrinter::emitTracebackTable(), the bit "IsBackChainStored" of the traceback
table is always set to true. This causes the AIX debug tool "dbx" to emit the error
libdebug assertion "(framep->getGpr(STKP, &addr) == DB_SUCCESS && *nextStkpp == addr)"
when debugging a leaf function with no stack frame.
For a leaf function with no stack frame, the bit IsBackChainStored should be unset.
Reviewers: ChenZheng
Differential Revision: https://reviews.llvm.org/D146071
[DAGCombiner] handle more store value forwarding
When lowering calls on targets like PPC, some stack loads
will be generated for by-value parameters. The CALLSEQ_START node
prevents such loads from being combined.
As suggested by @RolandF, this patch removes the unnecessary
loads for byval parameters by extending ForwardStoreValueToDirectLoad.
Reviewed By: nemanjai, RolandF
Differential Revision: https://reviews.llvm.org/D138899
We don't do this transform in InstCombine in the general case for arbitrary values, because the cost of
an AND and 2 ICMPs isn't higher than that of a MIN and an ICMP. However, LICM also has knowledge
of the loop structure. This transform becomes profitable if `A` and `B` are loop-invariant and
`X` is not: by doing this, we can compute min outside the loop.
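A minimal sketch of the profitability argument, assuming the pattern has the form `X < A && X < B` (the exact predicate is an assumption here). The first version evaluates two comparisons and an AND per iteration; the second computes the min of the invariant values once and keeps a single comparison in the loop. Function names are hypothetical.

```python
def count_below_both(xs, a, b):
    """Loop body with two compares and an AND on every iteration."""
    count = 0
    for x in xs:
        if x < a and x < b:
            count += 1
    return count


def count_below_min(xs, a, b):
    """Same result with min(a, b) hoisted out of the loop."""
    bound = min(a, b)  # loop-invariant, computed once
    count = 0
    for x in xs:
        if x < bound:
            count += 1
    return count
```

Outside a loop the two forms cost about the same, which is why the rewrite only pays off when the min can be hoisted.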
Differential Revision: https://reviews.llvm.org/D143726
Reviewed By: nikic
Summary: Fixes #60990. A crash is reported while running the pass 'Prepare loop for ppc preferred instruction forms'. The crash occurs on 32-bit PowerPC.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D145350
Default MaxDivRemBitWidthSupported to 128, so that divisions larger
than 128 bits are always expanded, without requiring additional
configuration from the target.
Note that this may still emit calls to __udivti3 on 32-bit targets,
which likely don't have an implementation of that builtin. However,
I believe this is sufficient to fix
https://github.com/llvm/llvm-project/issues/60531, because Zig must
already be defining those builtins.
Differential Revision: https://reviews.llvm.org/D144871
Summary: Currently we lower MEMCPY/MEMMOVE/MEMSET/BZERO to the corresponding libc functions, and on AIX those libc functions call the millicode functions. We can lower these intrinsics directly to save one call layer.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D143997
Add a member function isPPC64ELFv2ABI() to determine what ABI is used on the
64-bit PowerPC big endian operating environment.
Reviewed By: nemanjai, dim, pkubaj
Differential Revision: https://reviews.llvm.org/D144321
I introduced new tests in
commit 5cc1016a57b3 ("[llvm][SelectionDAGBuilder] codegen callbr.landingpad intrinsic")
https://reviews.llvm.org/D140160
that fail expensive checks. Disable -verify-machineinstrs in those
tests for now. Enable it in other tests for now, since MachineVerifier
isn't on by default for assertion builds.
Link: https://github.com/llvm/llvm-project/issues/60827