Most addresses in SystemZ instructions take two registers,
an index register and a base register. However, either of
those can be omitted. If there is just a single register,
this usually is taken as the base register - however, there
are certain rare cases where you specifically want to use
an index register but no base register. This is currently
not handled consistently by the assembler / disassembler.
Fix this by
- always emitting a dummy 0 as base register for index-
only addresses
- correctly handle dummy 0 as indicating no base register
when parsing an address
This is compatible with current GNU binutils behavior.
Some fixes for style issues pointed out by clang-tidy:
- Upper case/lower case fixes
- No else after return
- Removed unused #include's
- Added NOLINTNEXTLINE() for the LLVM* functions
All changes are NFC.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
Set ups the infrastructure to create an empty GOFF file.
Also adds a GOFF writer which writes only HDR/END records.
Reviewed By: jhenderson, kpn
Differential Revision: https://reviews.llvm.org/D111437
These are small include-only changes in the X86, Mips, and SystemZ
backend that seem sufficiently small to commit separately without
review.
See issue #64166 for more information about layering.
Currently mentioning any symbols in immediate asm operands is not
supported, for example:
error: invalid operand for instruction
lghi %r4,foo_end-foo
The immediate problem is that is*Imm() and print*Operand() functions do
not accept MCExprs, but simply relaxing these checks is not enough:
after symbol addresses are computed, range checks need to run against
resolved values.
Add a number of SystemZ::FixupKind members for each kind of immediate
value and process them in SystemZMCAsmBackend::applyFixup(). Only
perform the range checks, do not change anything.
Adjust the tests: move previously failing cases like the one shown
above out of insn-bad.s.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D154899
In order to generate relocations or to apply fixups after the layout
has been computed, the targets need to know the offsets of the
respective operands. There are indirect ways to figure them out in some
cases, for example, on SystemZ, the first memory operand is always at
offset 2, and the second one is always at offset 4. But there are no
such tricks for the immediate operands on SystemZ, so one has to refer
to individual instruction encodings.
This information, however, is available to TableGen. Generate
the getOperandBitOffset() method to access it, and use it to simplify
getting memory operand offsets on SystemZ. This also paves the way for
implementing symbolic immediates on this platform.
For the multi-lit operands, getOperandBitOffset() returns the offset of
the first lit.
An alternative way to obtain offsets would be to pass them to the
encoder methods, but this would require reworking all targets. Also,
VarLenCodeEmitter already does this, but adopting it requires
reworking the respective targets without other significant benefits.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155329
Prepare for removing the MemOpsEmitted workaround for symbolic
displacements by letting TableGen know about the offsets of the
displacement sub-operands within the instruction.
There are alternative ways to do this that were tried and rejected:
- Creating encoders and decoders for each possible displacement offset.
This is too repetitive.
- Use VarLenCodeEmitter [1]. The resulting diff is quite large.
Instead, use the named sub-operand support introduced by commit
a538d1f13a13 ("[TableGen][CodeEmitterGen] Allow local names for
sub-operands in a operand list.").
Describe instruction encodings in terms of sub-operands instead of
operands (e.g. B, D, L vs BDL) - this also better matches the pictures
from the Principles of Operation. Decompose operands into sub-operands
using the new (bdaddr12only $B1, $D1):$BD1 syntax. Replace the
encoders and the decoders of the operands with these of the
sub-operands.
Since DecodeADDR64BitRegisterClass() is now used for bases and indices,
change it to return NoRegister when decoding 0. This also changes the
disassembly of some instructions, e.g., br %r0 becomes br 0. Since this
better captures the instruction semantics, namely, that the value of
%r0 is not used, keep this change and update the tests.
[1] https://m680x0.github.io/blog/2022/02/varlen-encoder.html
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D155194
This patch adds support for the ADA (associated data area), doing the following:
-Creates the ADA table to handle displacements
-Emits the ADA section in the SystemZAsmPrinter
-Lowers the ADA_ENTRY node into the appropriate load instruction
Differential Revision: https://reviews.llvm.org/D153788
In the SystemZMCObjectWriter, we currently just abort in case
some unsupported relocation in requested. However, as this
situation can be triggered by invalid (inline) assembler input,
we should really get a regular error message instead.
- Creates the ADA table to handle displacements
- Emits the ADA section in the SystemZAsmPrinter
- Lowers the ADA_ENTRY node into the appropriate load instruction
Differential Revision: https://reviews.llvm.org/D153788
This reduces dependencies on `llvm-tblgen` so much.
`CodeGenTypes` depends on `Support` at the moment.
Be careful to append deps on this, since Targets' tablegens
depend on this.
Depends on D149024
Differential Revision: https://reviews.llvm.org/D148769
This is rework of;
- D30046 (LLT)
Since I have introduced `llvm-min-tblgen` as D146352, `llvm-tblgen`
may depend on `CodeGen`.
`LowLevlType.h` originally belonged to `CodeGen`. Almost all userse are
still under `CodeGen` or `Target`. I think `CodeGen` is the right place
to put `LowLevelType.h`.
`MachineValueType.h` may be moved as well. (later, D149024)
I have made many modules depend on `CodeGen`. It is consistent but
inefficient. It will be split out later, D148769
Besides, I had to isolate MVT and LLT in modmap, since
`llvm::PredicateInfo` clashes between `TableGen/CodeGenSchedule.h`
and `Transforms/Utils/PredicateInfo.h`.
(I think better to introduce namespace llvm::TableGen)
Depends on D145937, D146352, and D148768.
Differential Revision: https://reviews.llvm.org/D148767
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo. The allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.
Recommitted with some fixes for the leftover MCII variables in release
builds.
Differential Revision: https://reviews.llvm.org/D129506
This reverts commit e2fb8c0f4b940e0285ee36c112469fa75d4b60ff as it does
not build for Release builds, and some buildbots are giving more warning
than I saw locally. Reverting to fix those issues.
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.
This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo. The allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.
The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.
Differential Revision: https://reviews.llvm.org/D129506
Properly handle the case where only the second operand of e.g. an MVC
instruction uses a fixup for the displacement.
Reviewed By: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D125982
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
The AsmParser checks the range of a PC-relative operand, but only if it is
immediate.
This patch adds range checks for operands in applyFixup(), at which point the
offset to a label is known.
The diagnostic message for an operand that is out of range is explicit (with
given value and min/max limits). This is now also done for displacement
fixups.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D114194
This patch adds support for symbolic displacements, e.g. like 'lg %r0,
sym(%r1)', which is done using relocations. This is needed to compile the
kernel without disabling the integrated assembler.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D113341
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
- This patch adds in the GOFFMCAsmInfo interfaces for the z/OS target.
- This patch decouples the previously existing SystemZMCAsmInfo interface for the ELF target and the z/OS target.
- This patch also removes a small test in the SystemZAsmLexerTest.cpp. The reason for this is because, the test is set up for the s390x-ibm-linux (SystemZ ELF triple), and the test checks a function which is overridden only for the z/OS target. The reason we can't change the test to use a z/OS triple outright is because there is still missing support which prevents the successful running of a test (assert in AsmParser.cpp due to missing GOFFAsmParser support)
Reviewed By: uweigand, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D110077
SystemZ adds the EXRL target instructions in the end of each file. This must
be done before debug info emission since that may end the text section, and
therefore this is now done in emitConstantPools() (instead of in
emitEndOfAsmFile).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109513
The .machine directive can be used in assembly files to specify the ISA for
the instructions following it.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109660
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
Add support for the .reloc directive along the lines of
other back-ends.
This fixes a regression after https://reviews.llvm.org/D104080
was merged, since that patch presupposed support for .reloc.
- Currently, the emitting of labels in the parsePrimaryExpr function is case independent. It just takes the identifier and emits it.
- However, for HLASM the emitting of labels is case independent. We are emitting them in the upper case only, to enforce case independency. So we need to ensure that at the time of parsing the label we are emitting the upper case (in `parseAsHLASMLabel`), but also, when we are processing a PC-relative relocatable expression, we need to ensure we emit it in upper case (in `parsePrimaryExpr`)
- To achieve this a new MCAsmInfo attribute has been introduced which corresponding targets can override if needed.
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D104715
- Currently, LLVM supports symbols of the name "token1@token2".
- "token2" is used to identify whether an appropriate symbol reference can be used for the symbol.
- Now, if the symbol reference couldn't be found, the AsmParser usually emits an error, unless the backend is configured to accept the "@" in a symbol name
- Thus, this patch aims to do that. It sets the `AllowAtInName` attribute in the SystemZ backend for the HLASM dialect.
- Setting this attribute ensures that, if a particular symbol reference is found, it uses that. If it doesn't, and there exists an "@" in the symbol name, it will use that instead of explicitly erroring out.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103111
- Currently, before printing a label in MCSymbol.cpp (MCSymbol::print), the current code "validates" the label that is to be printed.
- If it fails the validation step, then it prints the label within double quotes.
- However, the validation is provided as a virtual function in MCAsmInfo.h (i.e. isAcceptableChar() function). So we can override this for the AD_HLASM dialect in SystemZMCAsmInfo.cpp.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103091
- This patch (is one in a series of patches) which introduces HLASM Parser support (for the first parameter of inline asm statements) to LLVM ([[ https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html | main RFC here ]])
- This patch in particular introduces HLASM Parser support for Z machine instructions.
- The approach taken here was to subclass `AsmParser`, and make various functions and variables as "protected" wherever appropriate.
- The `HLASMAsmParser` class overrides the `parseStatement` function. Two new private functions `parseAsHLASMLabel` and `parseAsMachineInstruction` are introduced as well.
The general syntax is laid out as follows (more information available in [[ https://www.ibm.com/support/knowledgecenter/SSENW6_1.6.0/com.ibm.hlasm.v1r6.asm/asmr1023.pdf | HLASM V1R6 Language Reference Manual ]] - Chapter 2 - Instruction Statement Format):
```
<TokA><spaces.*><TokB><spaces.*><TokC><spaces.*><TokD>
```
1. TokA is referred to as the Name Entry. This token is optional
2. TokB is referred to as the Operation Entry. This token is mandatory.
3. TokC is referred to as the Operand Entry. This token is mandatory
4. TokD is referred to as the Remarks Entry. This token is optional
- If TokA is provided, then we either parse TokA as a possible comment or as a label (Name Entry), Tok B as the Operation Entry and so on.
- If TokA is not provided (i.e. we have one or more spaces and then the first token), then we will parse the first token (i.e TokB) as a possible Z machine instruction, TokC as the operands to the Z machine instruction and TokD as a possible Remark field
- TokC (Operand Entry), no spaces are allowed between OperandEntries. If a space occurs it is classified as an error.
- TokD if provided is taken as is, and emitted as a comment.
The following additional approach was examined, but not taken:
- Adding custom private only functions to base AsmParser class, and only invoking them for z/OS. While this would eliminate the need for another child class, these private functions would be of non-use to every other target. Similarly, adding any pure virtual functions to the base MCAsmParser class and overriding them in AsmParser would also have the same disadvantage.
Testing:
- This patch doesn't have tests added with it, for the sole reason that MCStreamer Support and Object File support hasn't been added for the z/OS target (yet). Hence, it's not possible generate code outright for the z/OS target. They are in the process of being committed / process of being worked on.
- Any comments / feedback on how to combat this "lack of testing" due to other missing required features is appreciated.
Reviewed By: Kai, uweigand
Differential Revision: https://reviews.llvm.org/D98276
- This patch attempts to implement the location counter syntax (*) for the HLASM variant for PC-relative instructions.
- In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for " *" which is parsed as "<pc-rel-insn 0>"
- For combinations of absolute values and relocatable values, we don't expect the "*" preceding the token.
When you have a " * " what’s accepted is:
```
*<space>.*{.*} -> <pc-rel-insn> 0
*[+|-][constant-value] -> <pc-rel-insn> [+|-]constant-value
```
When you don’t have a " * " what’s accepted is:
```
brasl 1,func is allowed (MCSymbolRef type)
brasl 1,func+4 is allowed (MCBinary type)
brasl 1,4+func is allowed (MCBinary type)
brasl 1,-4+func is allowed (MCBinary type)
brasl 1,func-4 is allowed (MCBinary type)
brasl 1,*func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func+4 is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+4+func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*-4+8+func is not allowed (* cannot be used for non-MCConstantExprs)
```
Reviewed By: Kai
Differential Revision: https://reviews.llvm.org/D100987
- Currently, the "." (Dot) character, when not identifying an Identifier or a Constant, refers to the current PC (Program Counter)
- However, in z/OS, for the HLASM dialect, it strictly accepts only the "*" as the current PC (Support for this will be put up in a follow-up patch)
- The changes in this patch allow individual platforms to choose whether they would like to use the "." (Dot) character as a marker for the current PC or not.
- It is achieved by introducing a new field in MCAsmInfo.h called `DotIsPC` (similar to `DollarIsPC`)
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100975
- Previously, https://reviews.llvm.org/D99889 changed the framework in the AsmLexer to treat special tokens, if they occur at the start of the string, as Identifiers.
- These are used by the MASM Parser implementation in LLVM, and we can extend some of the changes made in the previous patch to SystemZ.
- In SystemZ, the special "tokens" referred to here are "_", "$", "@", "#". [_|$|@|#] are already supported as "part" of an Identifier.
- The changes in this patch ensure that these special tokens, when they occur at the start of the Identifier, are treated as Identifiers.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100959
- This patch is the first part in enforcing prefix-less registers for the HLASM dialect in z/OS
- This patch removes the "%[r|f|v]" prefix while printing registers
- To achieve this, the `AssemblerDialect` field of MAI was used
- There is also a bit of refactoring done to ensure code repetition is reduced.
- Currently the LLVM assembler for SystemZ/z/OS accepts both prefixed registers and prefix-less registers. A subsequent follow-up patch will restrict the SystemZAsmParser to only accept prefix-less registers.
Crediting @kianm as an author as well.
Reviewed By: uweigand, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D101308
- Currently, MCAsmInfo provides a CommentString attribute, that various targets can set, so that the AsmLexer can appropriately lex a string as a comment based on the set value of the attribute.
- However, AsmLexer also supports a few additional comment syntaxes, in addition to what's specified as a CommentString attribute. This includes regular C-style block comments (/* ... */), regular C-style line comments (// .... ) and #. While I'm not sure as to why this behaviour exists, I am assuming it does to maintain backward compatibility with GNU AS (see https://sourceware.org/binutils/docs/as/Comments.html#Comments for reference)
For example:
Consider a target which sets the CommentString attribute to '*'.
The following strings are all lexed as comments.
```
"# abc" -> comment
"// abc" -> comment
"/* abc */ -> comment
"* abc" -> comment
```
- In HLASM however, only "*" is accepted as a comment string, and nothing else.
- To achieve this, an additional attribute (`AllowAdditionalComments`) has been added to MCAsmInfo. If this attribute is set to false, then only the string specified by the CommentString attribute is used as a possible comment string to be lexed by the AsmLexer. The regular C-style block comments, line comments and "#" are disabled. As a final note, "#" will still be treated as a comment, if the CommentString attribute is set to "#".
Depends on https://reviews.llvm.org/D99277
Reviewed By: abhina.sreeskantharajan, myiwanch
Differential Revision: https://reviews.llvm.org/D99286
- https://reviews.llvm.org/rGb605cfb336989705f391d255b7628062d3dfe9c3 was reverted due to sanitizer bugs in the introduced unit-test (specifically in the Address sanitizer https://lab.llvm.org/buildbot/#/builders/5/builds/5697)
- This patch attempts to rectify that, as well as re-factor parts of the test
- The issue was previously, within the `setupCallToAsmParser` function in the unit-test, `SrcMgr` was declared as a local variable. `SrcMgr` owns a unique pointer. Since the variable goes out of scope at the end of the function, the unique pointer is released.
- This patch, moves the declaration of the `SrcMgr` variable to a class field, since the scope will remain until the class's destructor is invoked (which in this case is at the end of the unit test)
- Furthermore, this patch also moves the `MCContext Ctx` declaration from a local variable instance inside a function, to a unique pointer class field. This ensures the instantiation of the MCContext remains until the tear down of the test.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D99004