If 'z' is given as the complete extension name or with a digit after it,
it will crash in the extension map compare function. Check for these
cases and give an error.
Previously we only rejected upper case characters. We should instead
reject anything except lower case, numbers, and underscore. Other
characters will likely confuse the extension sorting.
I tried also printing the -march they correspond to, but it seemed
overly verbose and caused line wraps. It might be better if we remove
the versions numbers from the string or did a more intelligent line
wrap.
As the list of profiles grow, this will be a more efficient lookup.
Because the profile name is a prefix of the Arch string, we use
upper_bound to find the first profile that definitely comes after the
Arch string. If that isn't the first supported profile, we move back 1
profile and see if that profile is a prefix of our Arch string.
Instead of hardcoding the 4 current profile prefixes, treat profile
selection as a fallback if we don't find "rv32" or "rv64".
Update the error message accordingly.
We weren't fully checking that we parsed Zve*x/f/d correctly. This could
break if new extension is added that starts with Zve.
We were assuming the Zve64d is present whenever V is so we only
inferred from Zve*. It's more correct to infer ELEN from V itself too.
This replaces some starts_with calls wth consume_front. This allows us
to remove a later assumption that prefix was 4 characters. We would
eventually need to fix this anyway if we ever support rv128.
Noticed while reviewing the RISCVISAInfo code for other reasons.
Previously you got:
clang: error: invalid arch name 'rv64v', first letter should be 'e', 'i'
or 'g'
Which to me, unfamiliar with riscv, reads as if I should have used
"[eig]rv64v". Which is not what clang means.
Include the first bit in the error message to make this clearer:
clang: error: invalid arch name 'rv64v', first letter after 'rv64'
should be 'e', 'i' or 'g'
Previously we had an individiaul global array of implied extensions for
each extension that needed it. This allowed each array to have a
different length. Then we had a sorted table that stored pointers and
size for the indivual arrays keyed by the extension name.
This patch changes the sorted table to use multiple rows if multiple
extensions are implied. We use equal_range instead of lower_bound to
find all the rows that apply to a given extension.
The CombineIntoExts array was also modified to store only the extension
name that need to be combined. This extension name is looked up in the
implied table to find all the extensions it depends on.
This generates the SupportedExtensions and ImpliedExts information from
the RISCVExtension records found in RISCVFeatures.td.
Some of the extensions listed in the individual `ImpliedExts*` arrays
may be in a different, but the order in those array doesn't matter. I
manually verified the all the extensions were still present in each
array.
I've added the new information to the existing RISCVTargetParserDef.inc
and RISCVTargetDefEmitter.cpp so we don't need to re-parse the entirety
of RISCV.td a second time for a new file.
Zabha and Zacas are both documented as depending on Zaamo. I'm
hesitant to make them imply Zaamo instead.
So remove the implication and replace with a check that either
A or Zaamo is enabled.
This introduces a new file, RISCVISAUtils.cpp and moves the rest of
RISCVISAInfo to the TargetParser library.
This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
So that there is no cyclic dependency if we want to use it in
tablegen.
Reviewed By: fpetrogalli
Differential Revision: https://reviews.llvm.org/D140529
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838