As discussed in the last sync-up call, because these profiles are not
yet finalised they shouldn't be exposed to users unless they opt-in to
them (much like experimental extensions). We may later want to add a
more specific flag, but reusing `-menable-experimental-extensions`
solves the immediate problem.
This is implemented using the new support for marking profiles s
experimental added in #91993 to move the unratified profiles to
RISCVExperimentalProfile and making the necessary changes to logic in
RISCVISAInfo to handle this.
After patch https://github.com/llvm/llvm-project/pull/88805
`I` Ext will be added automatically when we running the command like
`./build/bin/llc -mtriple=riscv32 -mattr=+e -target-abi ilp32e
-verify-machineinstrs llvm/test/CodeGen/RISCV/zcmp-additional-stack.ll`
it will generate
```
.text
.attribute 4, 16
.attribute 5, "rv32i2p1_e2pe"
.file "zcmp-additional-stack.ll"
.globl func # -- Begin function func
.p2align 1
.type func,@function
```
This patch reset the I ext in FeatureBit when `+e` has been specify
hasExtension checks if the extension name is a known extension name.
That should always be true for the extensions listed here so we can
skip that check.
We can extract each extension as we process them without much
complexity.
I changed the error message for cases where there are double underscores
or a trailing underscore. I think this is an improvement over the
previous error.
hasExtension check isSupportedExtension before the map lookup. All
of the extensions we check for in updateCombination should be valid
extension names so we can bypass that to save some time.
We can use a SmallVector<StringRef>.
Adjust the code so we check for empty strings in the loop instead of
making a copy of the vector returned from StringRef::split.
This overlaps with #91532 which also removed the std::vector, but
that PR may be more controversial.
This detects the same extension name being added twice. Mostly I'm
worried about the case that the same string appears with two different
versions. We will only preserve one of the versions.
We could allow the same version to be repeated, but that doesn't seem
useful at the moment.
I've updated addExtension to use map::emplace instead of
map::operator[]. This means we only keep the first version if there are
duplicates. Previously we kept the last version, but that shouldn't matter
now that we don't allow duplicates. parseArchString already doesn't allow
duplicates.
Extensions starting with 's' or 'x' should always be followed by an
alphabetical character. I don't know of any crashes from this currently,
but it seemed better to be defensive.
If 'z' is given as the complete extension name or with a digit after it,
it will crash in the extension map compare function. Check for these
cases and give an error.
Previously we only rejected upper case characters. We should instead
reject anything except lower case, numbers, and underscore. Other
characters will likely confuse the extension sorting.
I tried also printing the -march they correspond to, but it seemed
overly verbose and caused line wraps. It might be better if we remove
the versions numbers from the string or did a more intelligent line
wrap.
As the list of profiles grow, this will be a more efficient lookup.
Because the profile name is a prefix of the Arch string, we use
upper_bound to find the first profile that definitely comes after the
Arch string. If that isn't the first supported profile, we move back 1
profile and see if that profile is a prefix of our Arch string.
Instead of hardcoding the 4 current profile prefixes, treat profile
selection as a fallback if we don't find "rv32" or "rv64".
Update the error message accordingly.
We weren't fully checking that we parsed Zve*x/f/d correctly. This could
break if new extension is added that starts with Zve.
We were assuming the Zve64d is present whenever V is so we only
inferred from Zve*. It's more correct to infer ELEN from V itself too.
This replaces some starts_with calls wth consume_front. This allows us
to remove a later assumption that prefix was 4 characters. We would
eventually need to fix this anyway if we ever support rv128.
Noticed while reviewing the RISCVISAInfo code for other reasons.
Previously you got:
clang: error: invalid arch name 'rv64v', first letter should be 'e', 'i'
or 'g'
Which to me, unfamiliar with riscv, reads as if I should have used
"[eig]rv64v". Which is not what clang means.
Include the first bit in the error message to make this clearer:
clang: error: invalid arch name 'rv64v', first letter after 'rv64'
should be 'e', 'i' or 'g'
Previously we had an individiaul global array of implied extensions for
each extension that needed it. This allowed each array to have a
different length. Then we had a sorted table that stored pointers and
size for the indivual arrays keyed by the extension name.
This patch changes the sorted table to use multiple rows if multiple
extensions are implied. We use equal_range instead of lower_bound to
find all the rows that apply to a given extension.
The CombineIntoExts array was also modified to store only the extension
name that need to be combined. This extension name is looked up in the
implied table to find all the extensions it depends on.
This generates the SupportedExtensions and ImpliedExts information from
the RISCVExtension records found in RISCVFeatures.td.
Some of the extensions listed in the individual `ImpliedExts*` arrays
may be in a different, but the order in those array doesn't matter. I
manually verified the all the extensions were still present in each
array.
I've added the new information to the existing RISCVTargetParserDef.inc
and RISCVTargetDefEmitter.cpp so we don't need to re-parse the entirety
of RISCV.td a second time for a new file.
Zabha and Zacas are both documented as depending on Zaamo. I'm
hesitant to make them imply Zaamo instead.
So remove the implication and replace with a check that either
A or Zaamo is enabled.
This introduces a new file, RISCVISAUtils.cpp and moves the rest of
RISCVISAInfo to the TargetParser library.
This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
So that there is no cyclic dependency if we want to use it in
tablegen.
Reviewed By: fpetrogalli
Differential Revision: https://reviews.llvm.org/D140529
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838