792 Commits

Author SHA1 Message Date
Nikita Popov
41d5033eb1 [IR] Enable opaque pointers by default
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:

* If IR that contains *neither* explicit ptr nor %T* types is passed
  to tools, we will now use opaque pointer mode, unless
  -opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
  It is possible to opt-out by calling setOpaquePointers(false) on
  LLVMContext.

A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.

Differential Revision: https://reviews.llvm.org/D126689
2022-06-02 09:40:56 +02:00
Nuno Lopes
80b3dcc045 [Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace
There are a few places where we use report_fatal_error when the input is broken.
Currently, this function always crashes LLVM with an abort signal, which
then triggers the backtrace printing code.
I think this is excessive, as wrong input shouldn't give a link to
LLVM's github issue URL and tell users to file a bug report.
We shouldn't print a stack trace either.

This patch changes report_fatal_error so it uses exit() rather than
abort() when its argument GenCrashDiag=false.

Reviewed by: nikic, MaskRay, RKSimon

Differential Revision: https://reviews.llvm.org/D126550
2022-05-30 19:19:23 +01:00
Edd Barrett
d245974e1a Test stackmap support for floating point types.
It appears that float support is complete, or at least, the stackmap records
emitted are not inconceivable (I must admit that I don't know about many of the
architectures under test here).

One curiosity, the SystemZ tests highlight an undocumented (or maybe incorrect)
quirk of the stackmap format: in the case of a Register record, the Offset or
SmallConstant field can encode a sub-register index! I've only ever seen this
field zero for Register entries up until now.
2022-05-30 10:49:32 +01:00
Edd Barrett
c5e5cf1258 Test stackmap support for i128
This diff adds tests that check the currently-working stackmap cases for i128.
This will help ensure no regressions are later introduced by D125680 (when
ready).

Note that i128 stackmap support is currently incomplete, so we cant test all
i128 functionality:

    i128 constants >= 2^{63} crash LLVM
    non-constant i128s crash LLVM

So this change tests only constant i128 operands of value < 2^{63}.

A couple of incorrect comments are also fixed.
2022-05-23 11:56:24 +01:00
Jonas Paulsson
4273e616e5 [SystemZ] Bugfix in SystemZTargetLowering::combineINT_TO_FP()
Make sure to also handle extended value types to avoid crashing.

Resulting integers greater than 64 bits are not optimized (i128 is not a
legal type), and vectorizing seems to result in libcalls instead of just
scalarization.

Other extended vector types like <10 x float> are however now handled and
should result in vectorized conversions.

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D125881
2022-05-18 16:32:37 +02:00
Simon Pilgrim
d40b7f0d5a [DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses
If we're using shift pairs to mask, then relax the one use limit if the shift amounts are equal - we'll only be generating a single AND node.

AArch64 has a couple of regressions due to this, so I've enforced the existing one use limit inside a AArch64TargetLowering::shouldFoldConstantShiftPairToMask callback.

Part of the work to fix the regressions in D77804

Differential Revision: https://reviews.llvm.org/D125607
2022-05-17 13:40:11 +01:00
Simon Pilgrim
1ecc3d86ae [DAG] Enable ISD::SHL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
Pulled out of D77804 as its going to be easier to address the regressions individually.

This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.

The lost RISCV gorc2 fold shouldn't be a problem - instcombine would have already destroyed that pattern - see https://github.com/llvm/llvm-project/issues/50553

Differential Revision: https://reviews.llvm.org/D124839
2022-05-14 09:50:01 +01:00
Jonas Paulsson
eaa78035c6 [SystemZ] Patchset for expanding memcpy/memset using at most two stores.
* Set MaxStoresPerMemcpy and MaxStoresPerMemset to 2.

* Optimize stores of replicated values in SystemZ::combineSTORE(). This
  handles the now expanded memory operations and as well some other
  pre-existing cases.

* Reject a big displacement in isLegalAddressingMode() for a vector type.

* Return true from shouldConsiderGEPOffsetSplit().

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122105
2022-05-13 15:31:09 +02:00
Craig Topper
084f967370 [SelectionDAG] Constant fold (sext_inreg undef, VT) to 0 instead of undef.
The result of sign_extend_inreg needs to have as many sign bits
as requested by the VT argument. The easiest way to guarantee this
is to fold it to 0.

SystemZ test was modified to avoid using undef.

Fixes https://github.com/llvm/llvm-project/issues/55178

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124696
2022-05-05 09:45:35 -07:00
Jonas Paulsson
fbaec11683 [SystemZ] Avoid crashing in tryRISBGZero().
Bail out from cases where the result is a ConstantSDNode as it cannot be
selected and should typically not end up here.

Fixes: #55204

Reviewed By: Ulrich Weigand
2022-05-04 11:38:50 +02:00
Serge Pavlov
c96cc500f0 [SystemZ] Custom lowering of llvm.is_fpclass
Differential Revision: https://reviews.llvm.org/D114695
2022-04-29 13:27:36 +07:00
Ulrich Weigand
1283ccb610 Support z16 processor name
The recently announced IBM z16 processor implements the architecture
already supported as "arch14" in LLVM.  This patch adds support for
"z16" as an alternate architecture name for arch14.
2022-04-21 19:58:22 +02:00
Jonas Paulsson
4aa5dc15f0 [SystemZ] Handle SystemZ specific inline assembly address operands.
Handle ZQ, ZR, ZS and ZT inline assembly operand constraints.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D110267
2022-04-19 16:55:45 +02:00
Jonas Paulsson
27e8c50a4c [SystemZ] Implement adjustInliningThreshold().
This patch boosts the inlining threshold for a particular type of functions
that are using an incoming argument only as a memcpy source.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D121341
2022-04-13 14:48:10 +02:00
Jonas Paulsson
46f83caebc [InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220
2022-04-13 12:50:21 +02:00
Daniil Kovalev
62a983ebc5 Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if"
This reverts commit 83a798d4b0e17ac41d5430f1290d3661343eee1e.

As discussed in D120714 with @thakis, the patch added unneeded complexity
without noticeable benefits.
2022-04-06 20:32:53 +03:00
Daniil Kovalev
83a798d4b0 [CodeGen] Place SDNode debug ID declaration under appropriate #if
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to
reduce memory usage when it is not needed.

Differential Revision: https://reviews.llvm.org/D120714
2022-04-06 14:09:32 +03:00
Sanjay Patel
5b4bbaa8d8 [SystemZ] generate full checks for tests; NFC
These may change if we transform the fcmp (setcc) to avoid a constant operand.
2022-03-30 08:37:15 -04:00
Neumann Hon
eb3e09c9bf [SystemZ] [z/OS] Add support for generating huge (1 MiB) stack frames in XPLINK64
This patch extends support for generating huge stack frames on 64-bit XPLINK by implementing the ABI-mandated call to the stack extension routine.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D120450
2022-02-25 02:37:08 -05:00
Kai Nacke
30053c1445 [SystemZ/z/OS] Add va intrinsics for XPLINK
Add support for va intrinsics for the XPLINK ABI.
Only the extended vararg variant, which uses a pointer to next
argument, is supported. The standard variant will build on this.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D120148
2022-02-22 14:35:05 -05:00
Jonas Paulsson
cf426100d6 [SystemZ] Improve emission of alignment hints.
Handle multiple memoperands in lowerAlignmentHint().

Review: Ulrich Weigand
2022-02-17 12:30:43 -06:00
Mubariz Afzal
1a5b881d4c Revert [SystemZ][z/OS] Fix f32 variadic argument assertion
This reverts ea0676f97d734196b15da7553cd407e6a36cef2d
2022-02-15 23:28:40 -05:00
Mubariz Afzal
ea0676f97d [SystemZ][z/OS] Fix f32 variadic argument assertion
The tablegen lines that specify the XPLINK64 calling convention for promoting an f32 vararg to an f64 are effectively overwritten by the following tablegen line which bitcast an f64 vararg to an i64 (so that it can be used in the GPRs). It becomes a bitcast from f32 to i64.

Since we don't handle a bitcast for f32s this caused an assertion.
2022-02-15 18:11:57 -05:00
Kai Nacke
713496d9c9 [SystemZ/z/OS] Add XPLINK dynamic stack allocation
With XPLINK, dynamic stack allocations requires calling
a runtime function, which allocates the stack memory,
moves the register save area, and returns the new
stack pointer.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D119732
2022-02-14 13:35:28 -05:00
Kai Nacke
62ba528a68 [Systemz/z/OS] Centralize emitting the call type information
With XPLINK, a no-op with information about the call type is emitted
after each call instruction. Centralizing it has the advantage that it is
easy to document all cases, and it makes it easier to extend it later
(e.g. dynamic stack allocation, 32 bit mode).
Also add a test checking the call types emitted so far.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D119557
2022-02-14 12:00:50 -05:00
Jonas Paulsson
9ca9fee6e8 [SystemZ] Don't shrink 64-bit FP constants.
Return false from ShouldShrinkFPConstant(), so that these constants are stored
in their full size on the constant pool, even if they could have been shrunk
and used with an extending load.

This is better since LD is faster than LDE, and it also enables reg/mem opcodes.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D117927
2022-01-27 16:14:53 -06:00
Jonas Paulsson
f541a5048a [SystemZ] Implement orderFrameObjects().
By reordering the objects on the stack frame after looking at the users, a
better utilization of displacement operands will result. This means less
needed Load Address instructions for the accessing of these objects.

This is important for very large functions where otherwise small changes
could cause a lot more/less accesses go out of range.

Note: this is not yet enabled for SystemZXPLINKFrameLowering, but should be.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D115690
2022-01-27 16:09:19 -06:00
Nick Desaulniers
79ebc3b0dd [llvm][test] rewrite callbr to use i rather than X constraint NFC
In D115311, we're looking to modify clang to emit i constraints rather
than X constraints for callbr's indirect destinations. Prior to doing
so, update all of the existing tests in llvm/ to match.

Reviewed By: void, jyknight

Differential Revision: https://reviews.llvm.org/D115410
2022-01-11 11:31:08 -08:00
Nikita Popov
291660e62f [SystemZ] Add missing elementtype in python test (NFC)
Missed this originally because it does not run locally.
2022-01-07 09:08:38 +01:00
Nikita Popov
f430c1eb64 [Tests] Add elementtype attribute to indirect inline asm operands (NFC)
This updates LLVM tests for D116531 by adding elementtype attributes
to operands that correspond to indirect asm constraints.
2022-01-06 14:23:51 +01:00
Neumann Hon
9a35844990 [z/OS] Implement prologue and epilogue generation for z/OS target.
This patch adds support for prologue and epilogue generation for the z/OS target under the XPLINK64 ABI for functions with a stack size of less than 1048576 bytes (huge stack frames).

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D114457
2021-12-16 09:04:05 -05:00
Muiez Ahmed
ebf5497b26 Revert "[z/OS] Implement prologue and epilogue generation for z/OS target."
This reverts commit ffad4d777b227f91be04020e2cd86ab38e969e39 because it introduced buildbot failures.
2021-12-14 14:22:11 -05:00
Neumann Hon
ffad4d777b [z/OS] Implement prologue and epilogue generation for z/OS target.
This patch adds support for prologue and epilogue generation for
the z/OS target under the XPLINK64 ABI for functions with a stack
size of less than 1048576 bytes (huge stack frames).

Reviewed by: uweigand, Kai

Differential Revision: https://reviews.llvm.org/D114457
2021-12-13 17:03:23 -05:00
Jonas Paulsson
cbf682cb1c [SystemZ] Improve codegen for memset.
Memset with a constant length was implemented with a single store followed by
a series of MVC:s. This patch changes this so that one store of the byte is
emitted for each MVC, which avoids data dependencies between the MVCs. An
MVI/STC + MVC(len-1) is done for each block.

In addition, memset with a variable length is now also handled without a
libcall. Since the byte is first stored and then MVC is used from that
address, a length of two must now be subtracted instead of one for the loop
and EXRL. This requires an extra check for the one-byte case, which is
handled in a special block with just a single MVI/STC (like GCC).

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D112004
2021-12-06 12:10:58 -06:00
Guozhi Wei
f1d8345a2a [TwoAddressInstructionPass] Create register mapping for registers with multiple uses in the current MBB
Currently we create register mappings for registers used only once in current
MBB. For registers with multiple uses, when all the uses are in the current MBB,
we can also create mappings for them similarly according to the last use.
For example

    %reg101 = ...
            = ... reg101
    %reg103 = ADD %reg101, %reg102

We can create mapping between %reg101 and %reg103.

Differential Revision: https://reviews.llvm.org/D113193
2021-11-29 19:01:59 -08:00
Jonas Paulsson
bb506938be [SystemZ] Improvement of emitMemMemWrapper()
It was discovered that an extra register COPY remained when expanding a
(variable length) memory operation with a loop and there was another use of
the involved address register(s) afterwards.

A simple fix for this is to COPY the address registers before the loop and
use that new vreg instead.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D112065
2021-10-26 17:03:01 +02:00
Jonas Paulsson
9f8872779a [SystemZ] Provide size values for PATCHPOINT, STACKMAP and FENTRY_CALL.
All instructions must have a correct size value close to emission when
SystemZLongBranch runs, or a necessary branch relaxation may be missed.

This patch also adds an assert for instruction sizes in SystemZLongBranch.

Review: Ulrich Weigand
2021-10-26 12:07:22 +02:00
Anirudh Prasad
aa3519f178 [SystemZ][z/OS] Initial implementation for lowerCall on z/OS
- This patch provides the initial implementation for lowering a call on z/OS according to the XPLINK64 calling convention
- A series of changes have been made to SystemZCallingConv.td to account for these additional XPLINK64 changes including adding a new helper function to shadow the stack along with allocation of a register wherever appropriate
- For the cases of copying a f64 to a gr64 and a f128 / 128-bit vector type to a gr64, a `CCBitConvertToType` has been added and has been bitcasted appropriately in the lowering phase
- Support for the ADA register (R5) will be provided in a later patch.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D111662
2021-10-21 09:48:59 -04:00
Jonas Paulsson
ccbfcfda1e [SystemZ] Handle huge immediates in SystemZInstrInfo::loadImmediate().
This is needed during isel pseudo expansion in order not to crash on huge
immediates.

Review: Ulrich Weigand
2021-10-15 19:08:45 +02:00
Jonas Paulsson
a33e4c8ae9 [SystemZ] Reapply memcmp and memcpy patches.
This reverts 3562076 and includes some refactoring as well.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D111733
2021-10-14 10:37:33 +02:00
Jonas Paulsson
00baad35b2 [SystemZ] Bugfix and refactorization of mem-mem operations
This patch fixes the bug that consisted of treating variable / immediate
length mem operations (such as memcpy, memset, ...) differently. The variable
length case needs to have the length minus 1 passed due to the use of EXRL
target instructions. However, the DAGCombiner can convert a register length
argument into a constant one, and whenever that happened one byte too little
would end up being performed.

This is also a refactorization by reducing the number of opcodes and variants
involved. For any opcode (variable or constant length), only the length minus
one is passed on to the ISD node. The rest of the logic is now instead
handled during isel pseudo expansion.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D111729
2021-10-14 10:37:33 +02:00
Kai Nacke
0a950a2e94 [SystemZ/z/OS] Implement save of non-volatile registers on z/OS XPLINK
This PR implements the save of the XPLINK callee-saved registers
on z/OS.

Reviewed By: uweigand, Kai

Differential Revision: https://reviews.llvm.org/D111653
2021-10-13 12:57:57 -04:00
Guozhi Wei
6599961c17 [TwoAddressInstructionPass] Improve the SrcRegMap and DstRegMap computation
This patch contains following enhancements to SrcRegMap and DstRegMap:

  1 In findOnlyInterestingUse not only check if the Reg is two address usage,
    but also check after commutation can it be two address usage.

  2 If a physical register is clobbered, remove SrcRegMap entries that are
    mapped to it.

  3 In processTiedPairs, when create a new COPY instruction, add a SrcRegMap
    entry only when the COPY instruction is coalescable. (The COPY src is
    killed)

With these enhancements isProfitableToCommute can do better commute decision,
and finally more register copies are removed.

Differential Revision: https://reviews.llvm.org/D108731
2021-10-11 15:28:31 -07:00
Jay Foad
df2d4bc4cb [TwoAddressInstruction] Fix ReplacedAllUntiedUses in processTiedPairs
Fix the calculation of ReplacedAllUntiedUses when any of the tied defs
are early-clobber. The effect of this is to fix the placement of kill
flags on an instruction like this (from @f2 in
 test/CodeGen/SystemZ/asm-18.ll):

  INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], killed %4:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit

After TwoAddressInstruction without this patch:

  %3:grh32bit = COPY killed %4:grh32bit
  INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit

Note that the COPY kills %4, even though there is a later use of %4 in
the INLINEASM. This fails machine verification if you force it to run
after TwoAddressInstruction (currently it is disabled for other
reasons).

After TwoAddressInstruction with this patch:

  %3:grh32bit = COPY %4:grh32bit
  INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit

Differential Revision: https://reviews.llvm.org/D110848
2021-10-07 10:10:11 +01:00
Jay Foad
85abedd750 [TwoAddressInstruction] Pre-commit a test case for D110848 2021-10-07 10:10:11 +01:00
Jonas Paulsson
3562076dfc [SystemZ] Temporarily revert memcmp and memcpy patches
Seem to cause test failures in compiler-rt.

Revert "[SystemZ] Implement memcmp of variable length with CLC."
This reverts commit 7a4e9a0c73667cb80e4572d41535a9e48f1ed9ef.

Revert "[SystemZ] Implement memcpy of variable length with MVC."
This reverts commit c6c13c58eebda605a9a05f1f13cac1e46407afc7.
2021-10-06 11:05:18 +02:00
Jonas Paulsson
7a4e9a0c73 [SystemZ] Implement memcmp of variable length with CLC.
Following the same pattern of memset/memcpy, this patch implements a variable
length memcmp with a CLC loop followed by an EXRL instruction.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D107380
2021-10-05 18:20:36 +02:00
Jonas Paulsson
c6c13c58ee [SystemZ] Implement memcpy of variable length with MVC.
Instead of making a memcpy libcall, emit an MVC loop and an EXRL instruction
the same way as is already done for memset 0.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D106874
2021-10-05 17:14:41 +02:00
Jay Foad
20c0280733 [LiveIntervals] Repair subreg ranges in processTiedPairs
In TwoAddressInstructionPass::processTiedPairs, update subranges of the
live interval for RegB as well as the main range.

This is a small step towards switching TwoAddressInstructionPass over
from LiveVariables to LiveIntervals. Currently this path is only tested
if you explicitly enable -early-live-intervals.

Differential Revision: https://reviews.llvm.org/D110526
2021-09-28 08:10:16 +01:00
Jonas Paulsson
ea92283449 [SystemZ] Implement ISD::BITCAST for fp128 -> i128.
The type legalizer has by default no method of doing this bitcast other than
storing and reloading the value from stack.

This patch implements a custom lowering of this operation using extractions
of subregs (z13 and earlier using FP128 register pairs), or of vector
elements (with 'vector enhancements 1' using VR128 FP registers).

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D110346
2021-09-24 10:26:45 +02:00