That instruction is not supported on GFX12.
Added a testcase which previously crashed without this change.
Co-authored-by: pvanhout <pierre.vanhoutryve@amd.com>
Add support for specifying the logical SPIR-V target environment in the
triple as Vulkan. When compiling HLSL, this replaces the DirectX Shader
Model with a Vulkan environment instead.
Currently, the only supported combinations of SPIR-V version and Vulkan
environment are:
- Vulkan 1.2 and SPIR-V 1.5
- Vulkan 1.3 and SPIR-V 1.6
Fixes#70051
llvm-project/llvm/lib/TargetParser/AArch64TargetParser.cpp:157:1:
error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type]
}
^
1 error generated.
Currently there are several bits of code in the AArch64 driver which
attempt to enforce dependencies between optional features in the -march=
and -mcpu= options. However, these are based on the list of feature
names being enabled/disabled, so they have a lot of logic to consider
the order in which features were turned on and off, which doesn't scale
well as dependency chains get longer.
This patch moves the code handling these dependencies to TargetParser,
and changes them to use a Bitset of enabled features. This makes it easy
to check which features are enabled, and is converted back to a list of
LLVM feature names once all of the command-line options are parsed.
The motivating example for this was the -mcpu=cortex-r82+nofp option.
Previously, the code handling the dependency between the fp16 and
fp16fml extensions did not consider the nofp modifier, so it added
+fullfp16 to the feature list. This should have been disabled by the
+nofp modifier, and also the backend did follow the dependency between
fullfp16 and fp, resulting in fp being turned back on in the backend.
Most of the dependencies added to AArch64TargetParser.h weren't known
about by clang before, I built that list by checking what the backend
thinks the dependencies between SubtargetFeatures are.
This patch extends the -mcpu/mtune=native support to handle the
Microsoft Azure Cobalt 100 CPU as a Neoverse N2. We expect users to use
-mcpu=neoverse-n2 when targeting this CPU and all the architecture and
codegen decisions to be identical.
The only difference is that the Microsoft Azure Cobalt 100 has a
different Implementer ID in the /proc/cpuinfo entry that needs to be
detected in getHostCPUNameForARM appropriately.
-mbranch-protection=gcs (enabled by -mbranch-protection=standard) causes
generated objects to be marked with the gcs feature. This is done via
the guarded-control-stack module flag, in a similar way to
branch-target-enforcement and sign-return-address.
Enabling GCS causes the GNU_PROPERTY_AARCH64_FEATURE_1_GCS bit to be set
on generated objects. No code generation changes are required, as GCS
just requires that functions are called using BL and returned from using
RET (or other similar variant instructions), which is already the case.
a15532d7647a8a4b7fd2889bd97f6f72f273c4bf landed a patch that added
support for detecting more AMD znver2 CPUs and cleaned up some of the
surrounding code, including the znver3 detection. Since one model group
is 00h-0fh, I adjusted the check to include checking if the value is
greater than zero. Since the value is unsigned, this is always true and
gcc warns on it. This patch removes the comparison with zero to get rid
of the compiler warning.
This patch adds proper detection support for more znver2 CPUs.
Specifically, this adds in support for CPUs codenamed Renoir, Lucienne,
and Mendocino.
This was originally proposedfor Renoir in
https://reviews.llvm.org/D96220 and
got approved, but slipped through the cracks. However, there is still a
demand for this feature.
In addition to adding support for more znver2 CPUs, this patch also includes
some additional refactoring and comments related to cpu model
information for zen CPUs.
Fixes https://github.com/llvm/llvm-project/issues/74934.
- Adds a new +pc option to -mbranch-protection that will enable
the use of PC as a diversifier in PAC branch protection code.
- When +pauth-lr is enabled (-march=armv9.5a+pauth-lr) in combination
with -mbranch-protection=pac-ret+pc, the new 9.5-a instructions
(pacibsppc, retaasppc, etc) are used.
Documentation for the relevant instructions can be found here:
https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/
Co-authored-by: Lucas Prates <lucas.prates@arm.com>
Some tools with a specified target arch, but no full triple default to
the host triple. On macos hosts, this would then force using macho on
targets that didn't expect it, resulting in assertions.
We should also probably emit explicit errors if the object format is
specified on targets which don't handle it.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Positive options: -mapx-features=<comma-separated-features>
Negative options: -mno-apx-features=<comma-separated-features>
-m[no-]apx-features is designed to be able to control separate APX
features.
Besides, we also support the flag -m[no-]apxf, which can be used like an
alias of -m[no-]apx-features=< all APX features covered by CPUID APX_F>
Behaviour when positive and negative options are used together:
For boolean flags, the last one wins
-mapxf -mno-apxf -> -mno-apxf
-mno-apxf -mapxf -> -mapxf
For flags that take a set as arguments, it sets the mask by order of the
flags
-mapx-features=egpr,ndd -mno-apx-features=egpr -> -egpr,+ndd
-mapx-features=egpr -mno-apx-features=egpr,ndd -> -egpr,-ndd
-mno-apx-features=egpr -mapx-features=egpr,ndd -> +egpr,+ndd
-mno-apx-features=egpr,ndd -mapx-features=egpr -> -ndd,+egpr
The design is aligned with gcc
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628905.html
This patch reference ac1ffd3caca12c254e0b8c847aa8ce8e51b6cfbf to suppot
a soft coding way to identify whether a cpu has a feature
`unaligned-scalar-mem` by `RISCVProcessors.td`.
This patch does not provide test case since there is no risc-v cpu
support `unaligned-scalar-mem` in llvm upstream now.
This reverts commit 122c89b271af30b86536cad7bac64ea9c56615ed. Change does not build, with errors such as:
In file included from ../llvm-project/llvm/tools/dsymutil/DebugMap.h:24,
from ../llvm-project/llvm/tools/dsymutil/DwarfLinkerForBinary.h:13,
from ../llvm-project/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp:9:
../llvm-project/llvm/tools/dsymutil/RelocationMap.h:60:17: error: declaration of ‘llvm::dsymutil::SymbolMapping llvm::dsymutil::ValidReloc::SymbolMapping’ changes meaning of ‘SymbolMapping’ [-fpermissive]
60 | SymbolMapping SymbolMapping;
| ^~~~~~~~~~~~~
../llvm-project/llvm/tools/dsymutil/RelocationMap.h:36:8: note: ‘SymbolMapping’ declared here as ‘struct llvm::dsymutil::SymbolMapping’
36 | struct SymbolMapping {
| ^~~~~~~~~~~~~
In file included from ../llvm-project/llvm/tools/dsymutil/DwarfLinkerForBinary.h:13,
from ../llvm-project/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp:9:
../llvm-project/llvm/tools/dsymutil/DebugMap.h:198:32: error: declaration of ‘std::optional<llvm::dsymutil::RelocationMap> llvm::dsymutil::DebugMapObject::RelocationMap’ changes meaning of ‘RelocationMap’ [-fpermissive]
198 | std::optional<RelocationMap> RelocationMap;
| ^~~~~~~~~~~~~
In file included from ../llvm-project/llvm/tools/dsymutil/DebugMap.h:24,
from ../llvm-project/llvm/tools/dsymutil/DwarfLinkerForBinary.h:13,
from ../llvm-project/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp:9:
../llvm-project/llvm/tools/dsymutil/RelocationMap.h:76:7: note: ‘RelocationMap’ declared here as ‘class llvm::dsymutil::RelocationMap’
76 | class RelocationMap {
| ^~~~~~~~~~~~~
This adds support in dsymutil for mergeable libraries [1].
dsymutil reads a new stab emitted by ld, allowing it to operate on
dynamic libraries instead of object files. It also now loads the DWARF
files associated to the libraries, and build the debug map for each
binary from the list of symbols exported by the library. For each Debug
Map Object, there is a new associated Relocation Map which is serialized
from the information retrieved in the original debug_info (or
debug_addr) section of the .o file.
The final DWARF file has multiple compile units, so the offsets
information of the relocations are adjusted relatively to the compile
unit they will end up belonging to, inside the final linked DWARF file.
[1] https://developer.apple.com/documentation/xcode/configuring-your-project-to-use-mergeable-libraries
Differential revision: https://reviews.llvm.org/D158124
When the FPU was selected with "+(no)fp(.dp)" extensions in "-march" or
"-mcpu" options, the FPU used for multilib selection was still the
default one for given architecture or CPU.