21 Commits

Author SHA1 Message Date
Brian Cain
c6bd020714
Revert "[Hexagon] NFC: Reduce the amount of version-specific code" (#146193)
Reverts llvm/llvm-project#145812
2025-06-27 23:13:36 -05:00
Alexey Karyakin
e9c9adcefe
[Hexagon] NFC: Reduce the amount of version-specific code (#145812)
There is a lot of redundant code that needs to be modified when new
Hexagon versions are added. Reduce the amount of this redundancy.

- compute ELF flags and attributes based on version feature names;
- simplify EnableHVX option handling by using arch features instead of
arch version enums;
- simplify completeHVXFeatures() by using features;
- delete several unused or redundant functions and constants:
isCPUValid, getCpu, getHexagonCPUSuffix;
- do not set HexagonArchVersion in initializeSubtargetDependencies, it
is set in ParseSubtargetFeatures;

Signed-off-by: Alexey Karyakin <akaryaki@quicinc.com>
2025-06-27 15:05:28 -05:00
Chandler Carruth
cd269fee05 [StrTable] Switch Clang builtins to use string tables
This both reapplies #118734, the initial attempt at this, and updates it
significantly.

First, it uses the newly added `StringTable` abstraction for string
tables, and simplifies the construction to build the string table and
info arrays separately. This should reduce any `constexpr` compile time
memory or CPU cost of the original PR while significantly improving the
APIs throughout.

It also restructures the builtins to support sharding across several
independent tables. This accomplishes two improvements from the
original PR:

1) It improves the APIs used significantly.

2) When builtins are defined from different sources (like SVE vs MVE in
   AArch64), this allows each of them to build their own string table
   independently rather than having to merge the string tables and info
   structures.

3) It allows each shard to factor out a common prefix, often cutting the
   size of the strings needed for the builtins by a factor two.

The second point is important both to allow different mechanisms of
construction (for example a `.def` file and a tablegen'ed `.inc` file,
or different tablegen'ed `.inc files), it also simply reduces the sizes
of these tables which is valuable given how large they are in some
cases. The third builds on that size reduction.

Initially, we use this new sharding rather than merging tables in
AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system
works, as without further changes these still push scaling limits.
Subsequent commits will more deeply leverage the new structure,
including using the prefix capabilities which cannot be easily factored
out here and requires deep changes to the targets.
2025-02-04 18:04:57 +00:00
Chandler Carruth
ca79ff07d8
Revert "Switch builtin strings to use string tables" (#119638)
Reverts llvm/llvm-project#118734

There are currently some specific versions of MSVC that are miscompiling
this code (we think). We don't know why as all the other build bots and
at least some folks' local Windows builds work fine.

This is a candidate revert to help the relevant folks catch their
builders up and have time to debug the issue. However, the expectation
is to roll forward at some point with a workaround if at all possible.
2024-12-13 23:58:48 -08:00
Chandler Carruth
be2df95e92
Switch builtin strings to use string tables (#118734)
The Clang binary (and any binary linking Clang as a library), when built
using PIE, ends up with a pretty shocking number of dynamic relocations
to apply to the executable image: roughly 400k.

Each of these takes up binary space in the executable, and perhaps most
interestingly takes start-up time to apply the relocations.

The largest pattern I identified were the strings used to describe
target builtins. The addresses of these string literals were stored into
huge arrays, each one requiring a dynamic relocation. The way to avoid
this is to design the target builtins to use a single large table of
strings and offsets within the table for the individual strings. This
switches the builtin management to such a scheme.

This saves over 100k dynamic relocations by my measurement, an over 25%
reduction. Just looking at byte size improvements, using the `bloaty`
tool to compare a newly built `clang` binary to an old one:

```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```

We get a 16% reduction in the `.data.rel.ro` section, and nearly 30%
reduction in `.rela.dyn` where those reloctaions are stored.

This is also visible in my benchmarking of binary start-up overhead at
least:

```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```

We get about 2ms faster `--version` runs. While there is a lot of noise
in binary execution time, this delta is pretty consistent, and
represents over 10% improvement. This is particularly interesting to me
because for very short source files, repeatedly starting the `clang`
binary is actually the dominant cost. For example, `configure` scripts
running against the `clang` compiler are slow in large part because of
binary start up time, not the time to process the actual inputs to the
compiler.

----

This PR implements the string tables using `constexpr` code and the
existing macro system. I understand that the builtins are moving towards
a TableGen model, and if complete that would provide more options for
modeling this. Unfortunately, that migration isn't complete, and even
the parts that are migrated still rely on the ability to break out of
the TableGen model and directly expand an X-macro style `BUILTIN(...)`
textually. I looked at trying to complete the move to TableGen, but it
would both require the difficult migration of the remaining targets, and
solving some tricky problems with how to move away from any macro-based
expansion.

I was also able to find a reasonably clean and effective way of doing
this with the existing macros and some `constexpr` code that I think is
clean enough to be a pretty good intermediate state, and maybe give a
good target for the eventual TableGen solution. I was also able to
factor the macros into set of consistent patterns that avoids a
significant regression in overall boilerplate.
2024-12-08 19:00:14 -08:00
Aaron Ballman
0b53e7b1cb Silence a signed/unsigned comparison mismatch in MSVC; NFC 2024-07-09 16:31:02 -04:00
Brian Cain
c91b50ed92
[hexagon] Add {con,de}structive interference size defn (#94877)
This support was originally added in 72c373bfdc98 ([C++17] Support
__GCC_[CON|DE]STRUCTIVE_SIZE (#89446), 2024-04-26). We're overriding the
values for Hexagon here.

Signed-off-by: Brian Cain <bcain@quicinc.com>
2024-07-09 12:18:48 -05:00
Stoorx
42d758bfa6 [clang] Return std::string_view from TargetInfo::getClobbers()
Change the return type of `getClobbers` function from `const char*`
to `std::string_view`. Update the function usages in CodeGen module.

The reasoning of these changes is to remove unsafe `const char*`
strings and prevent unnecessary allocations for constructing the
`std::string` in usages of `getClobbers()` function.

Differential Revision: https://reviews.llvm.org/D148799
2023-04-24 12:16:54 +03:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
serge-sans-paille
d9ab3e82f3
[clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.

The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-27 09:55:19 +01:00
Aaron Ballman
6c75ab5f66 Introduce _BitInt, deprecate _ExtInt
WG14 adopted the _ExtInt feature from Clang for C23, but renamed the
type to be _BitInt. This patch does the vast majority of the work to
rename _ExtInt to _BitInt, which accounts for most of its size. The new
type is exposed in older C modes and all C++ modes as a conforming
extension. However, there are functional changes worth calling out:

* Deprecates _ExtInt with a fix-it to help users migrate to _BitInt.
* Updates the mangling for the type.
* Updates the documentation and adds a release note to warn users what
is going on.
* Adds new diagnostics for use of _BitInt to call out when it's used as
a Clang extension or as a pre-C23 compatibility concern.
* Adds new tests for the new diagnostic behaviors.

I want to call out the ABI break specifically. We do not believe that
this break will cause a significant imposition for early adopters of
the feature, and so this is being done as a full break. If it turns out
there are critical uses where recompilation is not an option for some
reason, we can consider using ABI tags to ease the transition.
2021-12-06 12:52:01 -05:00
Erich Keane
8a1c999c9b Implement _ExtInt ABI for all ABIs in Clang, enable type for ABIs
This is the result of an audit of all of the ABIs in clang to implement
and enable the type for those targets.

Additionally, this finds an issue with integer-promotion passing for a
few platforms when using _ExtInt of < int, so this also corrects that
resulting in signext/zeroext being on a params of those types in some
platforms.

Differential Revisions: https://reviews.llvm.org/D79118
2020-05-06 06:52:18 -07:00
Sid Manning
81194bfeea [Hexagon] MaxAtomicPromoteWidth and MaxAtomicInlineWidth are not getting set.
Noticed when building llvm's c++ library.

Differential Revision: https://reviews.llvm.org/D76546
2020-03-30 12:33:51 -05:00
Sid Manning
b0da094983 [Hexagon] Add support for Linux/Musl ABI (part 2)
A continuation of https://reviews.llvm.org/D72701.  This
adds support needed in clang.

Differential Revision: https://reviews.llvm.org/D75638
2020-03-26 17:19:46 -05:00
Krzysztof Parzyszek
b1d47467e2 [Hexagon] Change HVX vector predicate types from v512/1024i1 to v64/128i1
This commit removes the artificial types <512 x i1> and <1024 x i1>
from HVX intrinsics, and makes v512i1 and v1024i1 no longer legal on
Hexagon.

It may cause existing bitcode files to become invalid.

* Converting between vector predicates and vector registers must be
  done explicitly via vandvrt/vandqrt instructions (their intrinsics),
  i.e. (for 64-byte mode):
    %Q = call <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32> %V, i32 -1)
    %V = call <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1> %Q, i32 -1)

  The conversion intrinsics are:
    declare  <64 x i1> @llvm.hexagon.V6.vandvrt(<16 x i32>, i32)
    declare <128 x i1> @llvm.hexagon.V6.vandvrt.128B(<32 x i32>, i32)
    declare <16 x i32> @llvm.hexagon.V6.vandqrt(<64 x i1>, i32)
    declare <32 x i32> @llvm.hexagon.V6.vandqrt.128B(<128 x i1>, i32)
  They are all pure.

* Vector predicate values cannot be loaded/stored directly. This directly
  reflects the architecture restriction. Loading and storing or vector
  predicates must be done indirectly via vector registers and explicit
  conversions via vandvrt/vandqrt instructions.
2020-02-19 14:14:56 -06:00
Krzysztof Parzyszek
305bf5b21d [Hexagon] Add support for Hexagon v67t microarchitecture (tiny core) 2020-01-21 11:35:10 -06:00
Krzysztof Parzyszek
c12a5917d2 [Hexagon] Add support for Hexagon/HVX v67 ISA 2020-01-20 16:16:49 -06:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Erich Keane
e44bdb3f70 Add Rest of Targets Support to ValidCPUList (enabling march notes)
A followup to: https://reviews.llvm.org/D42978

Most of the rest of the Targets were pretty rote, so this
patch knocks them all out at once. 

Differential Revision: https://reviews.llvm.org/D43057

llvm-svn: 324676
2018-02-08 23:16:55 +00:00
Sumanth Gundapaneni
57098f5ac3 [Hexagon] Handling of new HVX flags and target-features
This patch has the following changes
A new flag "-mhvx-length={64B|128B}" is introduced to specify the length of the vector.
Previously we have used "-mhvx-double" for 128 Bytes. This adds the target-feature "+hvx-length{64|128}b"

The "-mhvx" flag must be provided on command line to enable HVX for Hexagon. If no -mhvx-length flag
is specified, a default length is picked from the arch mentioned in this priority order from either -mhvx=vxx
or -mcpu. For v60 and v62 the default length is 64 Byte. For unknown versions, the length is 128 Byte. The 
-mhvx flag adds the target-feature "+hvxv{hvx_version}"

The 64 Byte mode is soon going to be deprecated. A warning is emitted if 64 Byte is enabled. A warning is
still emitted for the default 64 Byte as well. This warning can be suppressed with a -Wno flag.

The "-mhvx-double" and "-mno-hvx-double" flags are deprecated. A warning is emitted if the driver sees
them on commandline. "-mhvx-double" is an alias to "-mhvx-length=128B"

The compilation will error out if -mhvx-length is specified with out an -mhvx/-mhvx= flag

The macro HVX_LENGTH is defined and is set to the length of the vector. 
Eg: #define HVX_LENGTH 64

The macro HVX_ARCH is defined and is set to the version of the HVX. 
Eg: #define HVX_ARCH 62

Differential Revision: https://reviews.llvm.org/D38852

llvm-svn: 316102
2017-10-18 18:10:13 +00:00
Erich Keane
ebba592682 Break up Targets.cpp into a header/impl pair per target type[NFCI]
Targets.cpp is getting unwieldy, and even minor changes cause the entire thing 
to cause recompilation for everyone. This patch bites the bullet and breaks 
it up into a number of files.

I tended to keep function definitions in the class declaration unless it 
caused additional includes to be necessary. In those cases, I pulled it 
over into the .cpp file. Content is copy/paste for the most part, 
besides includes/format/etc.


Differential Revision: https://reviews.llvm.org/D35701

llvm-svn: 308791
2017-07-21 22:37:03 +00:00