Now that the GNU and HLASM `InstPrinter` paths are separated in
https://github.com/llvm/llvm-project/pull/112975, differentiate between
them in `SystemZInstrFormats.td`.
The main difference are:
- Tabs converted to space
- Remove space after comma for instruction operands
---------
Co-authored-by: Tony Tao <tonytao@ca.ibm.com>
In preparation for future work on separating the output of the GNU/HLASM
ASM dialects, we first separate the SystemZInstPrinter classes to two
versions, one for each ASM dialect.
The common code remains in a SystemZInstPrinterCommon class instead.
---------
Co-authored-by: Tony Tao <tonytao@ca.ibm.com>
The previous behavior could be harmful in some edge cases, such as
emitting a call to `fma()` in the `fma()` implementation itself.
Do this by just being more accurate in `isFMAFasterThanFMulAndFAdd()`.
This was already done for PowerPC; this commit just extends that to Arm,
z/Arch, and x86. MIPS and SPARC already got it right, but I added tests
for them too, for good measure.
Note: I don't have commit access.
The ATT assembler dialect on SystemZ seems to have been taken from the
existing ATT/Intel code. However, on SystemZ, ATT does not hold any
meaning. In reality, we are splitting the difference between GNU Asm
syntax and HLASM Asm syntax, so it makes sense to rename ATT to GNU
instead.
Co-authored-by: Tony Tao <tonytao@ca.ibm.com>
Some targets (e.g. PPC and Hexagon) already did this. I think it's best
to do this consistently so that frontend authors don't run into
inconsistent results when they emit `naked` functions. For example, in
Zig, we had to change our emit code to also set `frame-pointer=none` to
get reliable results across targets.
Note: I don't have commit access.
Add support for using a thread-local variable with a specified offset
for holding the stack guard canary value. This supports both 32- and 64-
bit PowerPC targets.
This mirrors changes from #108942 but targeting PowerPC instead of
RISCV. Because both of these PRs modify the same driver functions, this
series is stack on top of the RISC-V one.
---------
Signed-off-by: Keith Packard <keithp@keithp.com>
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
Due to recently reported problems with having the inlining threshold multiplier
set fairly high (x3), this patch removes the multiplier while addressing
the regressions seen by doing so in adjustInliningThreshold().
The specific cases that benefit from inlining that were now found to be in need
of handling contain a considerable number of memory accesses to the same
memory in both caller and callee.
This patch fixes:
llvm/lib/Target/SystemZ/SystemZISelLowering.cpp:9858:18: error:
using the result of an assignment as a condition without parentheses
[-Werror,-Wparentheses]
This follows in the spirit of 7d82c99403f615f6236334e698720bf979959704,
and extends the costing API for compares and selects to provide
information about the operands passed in an analogous manner. This
allows us to model the cost of materializing the vector constant, as
some select-of-constants are significantly more expensive than others
when you account for the cost of materializing the constants involved.
This is a stepping stone towards fixing
https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch
will be required to utilize the new API.
A (csmith) test case appeared where combineExtract() crashed when the
input vector was a bitcast into a vector of i1:s. Fix this by adding a check
with canTreatAsByteVector() before the call.
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
For the purpose of verifying proper arguments extensions per the target's ABI,
introduce the NoExt attribute that may be used by a target when neither sign-
or zeroextension is required (e.g. with a struct in register). The purpose of
doing so is to be able to verify that there is always one of these attributes
present and by this detecting cases where sign/zero extension is actually
missing.
As a first step, this patch has the verification step done for the SystemZ
backend only, but left off by default until all known issues have been
addressed.
Other targets/front-ends can now also add NoExt attribute where needed and do
this check in the backend.
Unlike scalar, where AArch64 prefers expanding scmp/ucmp with select,
under Neon we can use the arithmetic expansion to generate fewer
instructions. Notably it also prevents the scalarization of vselect
during vector-legalization.
The renamable flag is useful during MachineCopyPropagation but renamable
flag will be dropped after lowerCopy in some case.
This patch introduces extra arguments to pass the renamable flag to
copyPhysReg.
Enabling __ptr32 keyword to support in Clang for z/OS. It is represented
by addrspace(1) in LLVM IR. Unlike existing implementation, __ptr32 is
not mangled into symbol names for z/OS.
The current MCInstBuilder for generating an ALGFI when loading something
from the ADA is incorrect and will crash the compiler.
r0 must also be excluded from the registers returned as the result,
since it is treated as the value "0" on z/OS.
Also add some tests to properly test the paths where LLILF and ALGFI are
generated.
---------
Co-authored-by: Tony Tao <tonytao@ca.ibm.com>
A test case showed up where the new vector type is v24i16, which is not a simple
MVT. In order to get an extended value type for cases like this, EVT::getVectorVT()
needs to be called instead of MVT::getVectorVT(), otherwise the following call
to getVectorElementType() in combineExtract() will fail.
The parameter is confusing as it duplicates MCStreamer::isVeboseAsm
(initialized from MCTargetOptions::AsmVerbose). After
233cca169237b91d16092c82bd55ee6a283afe98, no in-tree target uses the
parameter.
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
The HFP instructions `MAY` and `MAYR`, unlike any other floating point
instructions, allow the specification of a 128bit register pair by
either the lower-numbered or the higher-numbered component register. In
order to support this, but change as little about codegen as possible,
the existing `MAY(R)?` definition is made `CodeGenOnly`, while a copy is
provided for the assembler and disassembler, which simply accepts a
64bit floating point register in place of the 128bit one. This copy is
stripped of its pattern to prevent codegen from using it.
The corresponding assembly tests that checked the register specification
rule that this commit removes from `MAY(R)?` have also been removed.
Well, not quite that simple. We can tc memset since it returns the first
argument but bzero doesn't do that and therefore we can end up
miscompiling.
This patch also refactors the logic out of isInTailCallPosition() into the callers.
As a result memcpy and memmove are also modified to do the same thing
for consistency.
rdar://131419786
The previous expansion of [US]CMP was done using two selects and two
compares. It produced decent code, but on many platforms it is better to
implement [US]CMP nodes by performing the following operation:
```
[us]cmp(x, y) = (x [us]> y) - (x [us]< y)
```
This patch adds this new expansion, as well as a hook in TargetLowering to allow some targets to still use the select-based approach. AArch64 and SystemZ are currently the only targets to prefer the former approach, but other targets may also start to use it if it provides for better codegen.
The combineSIGN_EXTEND_INREG routine was using
DAG.getConstant(-1, DL, VT), which does not result in
the expected value when VT has more than 64 bits.
Fix this by using DAG.getAllOnesConstant(DL, VT) instead.
Also add test cases for v1i128 comparisons (which triggers
the bug).
This PR adds a number of thus-far missing extended mnemonics to the
assembler and disassembler for SystemZ.
The following mnemonics have been added and are supported for the
assembler and disassembler:
- `NOP(R)?`
- `LFI`
- `RISBG(N)?Z`
The following mnemonics have been added and are supported for the
assembler only:
- `JC(TH)?`
- `LLG(F|H)I`
- `NOT(G)?R`
Summary:
These Libcalls represent which functions are available to the backend.
If a runtime call is not available, the target sets the the name to
`nullptr`. Currently, this logic is spread around the various targets.
This patch pulls all of the locations that disable libcalls into the
intializer. This patch is effectively NFC.
The motivation behind this patch is that currently the LTO handling uses
the list of all runtime calls to determine which functions cannot be
internalized and must be extracted from static libraries. We do not want
this to happen for libcalls that are not emitted by the backend. A
follow-up patch will move out this logic so the LTO pass can know which
rtlib calls are actually used by the backend.
Avoid needless copying of instructions and fixups and directly emit into
the fragment small vectors.
This (optionally, second commit) also removes the single use of the
MCCompactEncodedInstFragment to simplify code.