AAPCS64 reserves any of X9-X15 for a compiler to choose to use for this
purpose, and says not to use X16 or X18 like GCC (and the previous
implementation) chose to use. The X18 register may need to get used by
the kernel in some circumstances, as specified by the platform ABI, so
it is generally an unwise choice. Simply choosing a different register
fixes the problem of this being broken on any platform that actually
follows the platform ABI (which is all of them except EABI, if I am
reading this linux kernel bug correctly
https://lkml2.uits.iu.edu/hypermail/linux/kernel/2001.2/01502.html). As
a side benefit, also generate slightly better code and avoids needing
the compiler-rt to be present. I did that by following the XCore
implementation instead of PPC (although in hindsight, following the
RISCV might have been slightly more readable). That X18 is wrong to use
for this purpose has been known for many years (e.g.
https://www.mail-archive.com/gcc@gcc.gnu.org/msg76934.html) and also
known that fixing this to use one of the correct registers is not an ABI
break, since this only appears inside of a translation unit. Some of the
other temporary registers (e.g. X9) are already reserved inside llvm for
internal use as a generic temporary register in the prologue before
saving registers, while X15 was already used in rare cases as a scratch
register in the prologue as well, so I felt that seemed the most logical
choice to choose here.
* Translate the following versions to 26.
* watchOS 12 -> 26
* visionOS 3 -> 26
* macos 16 -> 26
* iOS 19 -> 26
* tvOS 19 -> 26
* Emit diagnostics, but allow conversion when clients attempt to use
invalid gaps in OS versioning in availability.
* For target-triples, only allow "valid" versions for implicit
conversions.
This adds support under LoongArch for the target("..") attributes.
The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
function as per the -march=arch option.
- "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as
per -mtune.
- "<feature>", "no-<feature>" enabled/disables the specific feature.
This adds assembler/disassembler support for XSfmmbase 0.6 and related
SiFive matrix multiplication extensions based on the spec here
https://www.sifive.com/document-file/xsfmm-matrix-extensions-specification
Functionality-wise, this is the same as the Zvma extension proposal that
SiFive shared with the Attached Matrix Extension Task Group. The
extension names and instruction mnemonics have been changed to use
vendor prefixes.
Note this is a non-conforming extension as the opcodes used here are in
the standard opcode space in OP-V or OP-VE.
---------
Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
This patch adds initial support for the recently announced Armv9
Cortex-A320 processor.
For more information, including the Technical Reference Manual, see:
https://developer.arm.com/Processors/Cortex-A320
---------
Co-authored-by: Oliver Stannard <oliver.stannard@arm.com>
The `llvm-headers` target wasn't depending on the generated TargetParser
headers, so they'd be flakily installed or not installed depending on
which order the build steps ran in. Add an explicit dependency to fix
this, and switch to a single `target_parser_gen` target to mirror the
pattern used by `intrinsics_gen` (which also fixes a few other missing
dependencies). Switch `llvm-headers` to use `add_dependencies` instead
of `DEPENDS` for the tablegen dependencies as well, since `DEPENDS` is
only intended for creating a file-level dependency on the output of an
`add_custom_command` in the same CMakeLists.txt (see `DEPENDS` under
https://cmake.org/cmake/help/latest/command/add_custom_target.html).
MSYS2 uses i686-pc-msys and x86_64-pc-msys as target, and is a fork of
Cygwin. There's an effort underway to try to switch as much as possible
to use -pc-cygwin targets, but the -msys target will be hanging around
for the forseeable future.
`+simd` and `+nosimd` are used to enable or disable NEON Instructions
when compiling for ARM Targets. However, up until now, using these
has not been possible. To enable this, these options are mapped to the
relevant LLVM backend option (`+neon` and `-neon`) so it can be both
enabled and disabled successfully by the user.
Tests have been added to ensure this behaviour is maintained in the
future, along with updates to existing tests as behaviour has now changed
relating to the use of `+simd` and `+nosimd`.
As `simd` has been mapped within the ARMTargetParser.def, support for
this extension is also added for the `--print-support-extensions` command
when the target is AArch32. This will print the `simd` option, along with the
description that relates to the Neon feature. This previously was not
possible as `simd` did not have a related Feature or Negative Feature.
To make this functional as intended, MVE and MVE.FP now rely on their
own Enum identifier, rather than `AEK_SIMD`. While SIMD does refer to
both Neon and Helium technologies, in terms of command line options,
SIMD relates to Neon. Helium relates to MVE and MVE.FP. The Enum
now reflects this too.
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.
This patch also add the scheduler description for the z17 processor,
provided by Jonas Paulsson.
This patch introduces the `vmem-to-lds-load-insts` target feature, which
can be used to enable builtins `__builtin_amdgcn_global_load_lds` and
`__builtin_amdgcn_raw_ptr_buffer_load_lds` on platforms which have this
feature.
This feature is only available on gfx9/10.
A limitation of using a common target feature for both builtins is that
we could have made `__builtin_amdgcn_raw_ptr_buffer_load_lds` available
on gfx6,7,8.
This extension adds two external input output instructions for
non-memory-mapped device.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
Like cryptography extensions like `Zk`, `B` (a combination of `Zba`,
`Zbb` and `Zbs` extensions) can be useful if we handle this extension as
a combination.
If all `Zba`, `Zbb` and `Zbs` extensions are enabled, it also enables
the `B` extension.
With a minor fix for the build failures.
Original message:
This extension adds nine instructions, eight for non-memory-mapped devices synchronization and delay instruction.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli quic_svs@quicinc.com
This extension adds nine instructions, eight for non-memory-mapped
devices synchronization and delay instruction.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
This extension adds 10 instructions that provide hints to the interface
simulation environment.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/
This patch adds assembler only support.
This extension adds twelve conditional branch instructions that use an
immediate operand for the source.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
The Xqcili extension includes a two instructions that load large
immediates than is available with the base RISC-V ISA.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
I was reviewing encodings to put the disassembling of vendor
instructions after after standard instructions and found that these
overlap with c.fldsp and c.fsdsp.
This extension adds thirty eight bit manipulation instructions.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.6
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
Xqccmp is a new spec by Qualcomm that makes a vendor-specific effort to
solve the push/pop + frame pointers issue. Broadly, it takes the Zcmp
instructions and reverse the order they push/pop registers in, which
ends up matching the frame pointer convention.
This extension adds a new instruction not present in Zcmp,
`qc.cm.pushfp`, which will set `fp` to the incoming `sp` value after it
has pushed the registers.
This change duplicates the Zcmp implementation, with minor changes to
mnemonics (for the `qc.` prefix), predicates, and the addition of
`qc.cm.pushfp`. There is also new logic to prevent combining Xqccmp and
Zcmp. Xqccmp is kept separate to Xqci for decoding/encoding etc, as the
specs are separate today.
Specification:
https://github.com/quic/riscv-unified-db/releases/tag/Xqccmp_extension-0.1.0
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.
This PR removes all non-documentation occurrences of gfx940/gfx941 from
the llvm directory, and the remaining occurrences in clang.
Documentation changes will follow.
For SWDEV-512631
The VLMUL and policy enums originally lived in RISCVBaseInfo.h in the
backend which is where everything else in the RISCVII namespace is
defined.
RISCVTargetParser.h is used by much more of the compiler and it
doesn't really make sense to have 2 different namespaces exposed.
These enums are both associated with VTYPE so using the RISCVVType
namespace seems like a good home for them.
Previously, when selecting a Single Precision FPU, LLVM would ensure all
elements of the Candidate FPU matched the InputFPU that was given.
However, for cases such as Cortex-R52, there are FPU options where not
all fields match exactly, for example NEON Support or Restrictions on
the Registers available.
This change ensures that LLVM can select the FPU correctly, removing the
requirement for Neon Support and Restrictions for the Candidate FPU to
be the same as the InputFPU.
This extension adds eight 48 bit load store instructions.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest
This patch adds assembler only support.
---------
Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
This patch adds support for the next-generation arch15
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch15 as host processor.
- Assembler/disassembler support for new instructions.
- Exploitation of new instructions for code generation.
- New vector (signed|unsigned|bool) __int128 data types.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10305.
Note: No currently available Z system supports the arch15
architecture. Once new systems become available, the
official system name will be added as supported -march name.