Move the implementation of the `toString` function from
`llvm/Support/Error.h` to the source file, which allows us to move
`#include "llvm/ADT/StringExtras.h"` to the source file as well.
As `Error.h` is present in a large number of translation units this
means we are unnecessarily bringing in the contents of
`StringExtras.h` - itself a large file with lots of includes - and
slowing down compilation.
Also move the `#include "llvm/ADT/SmallVector.h"` directive to the
source file as it's no longer needed, but this does not give as much of
a benefit.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,920,413,050 to 1,903,629,230 - a
reduction of ~0.87%. This should result in a small improvement in
compilation time.
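As a rough illustration of the pattern (not the actual LLVM code), the sketch below keeps only a declaration in the "header" part and moves the definition, together with the heavier include it needs, into the "source" part. The function name `describeCode` and the file names in the comments are hypothetical.

```cpp
#include <cassert>
#include <string>

// ---- "Header" part (hypothetical Example.h) ----
// Only the declaration is visible to includers, so they pay for <string>
// and nothing else.
std::string describeCode(int Code);

// ---- "Source" part (hypothetical Example.cpp) ----
// The heavier dependency is included only where the definition lives.
#include <sstream>

std::string describeCode(int Code) {
  std::ostringstream OS;
  OS << "error code " << Code;
  return OS.str();
}
```

Every translation unit that includes the header now avoids parsing `<sstream>`; only the single source file that defines the function does.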
Differential Revision: https://reviews.llvm.org/D154763
Move the implementation of the `toString` function from
`llvm/Support/Error.h` to the source file, which allows us to move
`#include "llvm/ADT/StringExtras.h"` to the source file as well.
As `Error.h` is present in a large number of translation units this
means we are unnecessarily bringing in the contents of
`StringExtras.h` - itself a large file with lots of includes - and
slowing down compilation.
Also move the `#include "llvm/ADT/SmallVector.h"` directive to the
source file as it's no longer needed, but this does not give as much of
a benefit.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,920,413,050 to 1,903,629,230 - a
reduction of ~0.87%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D154543
BOLT YAML profile reading time gets marginally faster (14.1572->13.9207 s) for
a large YAML profile (121MB/31K functions). No statistical significance is
claimed, though.
Reviewed By: hintonda
Differential Revision: https://reviews.llvm.org/D154553
We're in favor of the llvm::writeToOutput API, and all
writeFileAtomically usages have been migrated to writeToOutput.
Differential Revision: https://reviews.llvm.org/D153740
These extensions don't contain any instructions on their own, they
are just aliases for a set of extensions. We should set the preprocessor
define anytime all the sub-extensions are supported.
Reviewed By: kito-cheng, eopXD
Differential Revision: https://reviews.llvm.org/D154171
An .ARM.attributes section is divided into subsections, each labelled
with a vendor name. There is one standardised vendor name, which must
be used for all attributes that affect compatibility. Subsections
labelled with other vendor names can be used for optimisation
purposes, but it has to be safe for an object file consumer to ignore
them if it doesn't recognise the vendor name.
LLD currently terminates parsing of the whole attributes section as
soon as it encounters a subsection with a vendor name it doesn't
recognise (which is anything other than the standard one). This can
prevent it from detecting compatibility issues, if a standard
subsection followed the vendor-specific one.
This patch modifies the attribute parser so that unrecognised vendor
subsections are silently skipped, and the subsections beyond them are
still processed.
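A minimal sketch of the skipping behaviour, assuming a simplified subsection layout (a 4-byte little-endian length that counts the whole subsection, a NUL-terminated vendor name, then the payload). The function names and the layout details are assumptions for illustration and are not taken from the real parser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Builds one subsection in the assumed layout:
//   [u32 LE total length][vendor name, NUL-terminated][payload]
std::vector<uint8_t> makeSubsection(const std::string &Vendor,
                                    const std::vector<uint8_t> &Payload) {
  uint32_t Length =
      static_cast<uint32_t>(4 + Vendor.size() + 1 + Payload.size());
  std::vector<uint8_t> Out;
  for (int Shift = 0; Shift < 32; Shift += 8)
    Out.push_back(static_cast<uint8_t>(Length >> Shift));
  Out.insert(Out.end(), Vendor.begin(), Vendor.end());
  Out.push_back(0);
  Out.insert(Out.end(), Payload.begin(), Payload.end());
  return Out;
}

// Returns the payloads of all subsections whose vendor equals KnownVendor.
// Subsections with any other vendor name are skipped rather than ending
// the parse; only malformed input stops the loop.
std::vector<std::vector<uint8_t>>
parseSubsections(const std::vector<uint8_t> &Data,
                 const std::string &KnownVendor) {
  std::vector<std::vector<uint8_t>> Payloads;
  size_t Offset = 0;
  while (Offset + 4 <= Data.size()) {
    uint32_t Length = static_cast<uint32_t>(Data[Offset]) |
                      (static_cast<uint32_t>(Data[Offset + 1]) << 8) |
                      (static_cast<uint32_t>(Data[Offset + 2]) << 16) |
                      (static_cast<uint32_t>(Data[Offset + 3]) << 24);
    // A subsection needs at least the length field plus a NUL byte.
    if (Length < 5 || Length > Data.size() - Offset)
      break;
    size_t End = Offset + Length;
    size_t NameBegin = Offset + 4;
    size_t NameEnd = NameBegin;
    while (NameEnd < End && Data[NameEnd] != 0)
      ++NameEnd;
    if (NameEnd == End)
      break; // vendor name is not terminated
    std::string Vendor(Data.begin() + NameBegin, Data.begin() + NameEnd);
    if (Vendor == KnownVendor)
      Payloads.emplace_back(Data.begin() + NameEnd + 1, Data.begin() + End);
    Offset = End; // advance past this subsection whether or not it matched
  }
  return Payloads;
}
```

The key line is the unconditional `Offset = End`: because each subsection carries its own length, an unrecognised one can be stepped over without understanding its contents.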
(Relanded with no change from the original commit 8f208edd44d0832. I
reverted it in 949bb7e4de62cd0 due to widespread buildbot breakage,
failing to notice that 975f71faa72aaaa had already fixed the failing
unit test. Also, the *revert* caused at least one buildbot to fail,
because I switched the affected lld test to making %t a directory, and
then the reverted version tried to treat it as a file without cleaning
the output directory first.)
Differential Revision: https://reviews.llvm.org/D153335
This reverts commit 8f208edd44d0832ac2580e0ec4238be4ecfd5737.
I completely missed the compiled unit test for ELFAttributeParser,
which also needs updating. I'll reland this change once I make further
fixes.
This was discussed somewhat in D148315. As it stands, we require in
RISCVISAInfo::parseArchString (used for e.g. -march parsing in Clang)
that extensions are given in the order of z, then s, then x prefixed
extensions (after the standard single-letter extensions). However, we
recently (in D148315) moved to that order from z/x/s as the canonical
ordering was changed in the spec. In addition, recent GCC seems to
require z* extensions before s*.
My recollection of the history here is that we thought keeping -march as
close to the rules for ISA naming strings as possible would simplify
things, as there's an existing spec to point to. My feeling is that now
that we've had incompatible changes, and an incompatibility with GCC,
there's no real benefit to sticking to this restriction, and it risks
making it much more painful than it needs to be to copy a -march= string
between GCC and Clang.
This patch removes all ordering restrictions so you can freely mix x/s/z
extensions.
To be very explicit, this doesn't change our behaviour when emitting a
canonically ordered extension string (e.g. in build attributes). We of
course sort according to the canonical order (as we understand it) in
that case.
Differential Revision: https://reviews.llvm.org/D149246
An .ARM.attributes section is divided into subsections, each labelled
with a vendor name. There is one standardised vendor name, which must
be used for all attributes that affect compatibility. Subsections
labelled with other vendor names can be used for optimisation
purposes, but it has to be safe for an object file consumer to ignore
them if it doesn't recognise the vendor name.
LLD currently terminates parsing of the whole attributes section as
soon as it encounters a subsection with a vendor name it doesn't
recognise (which is anything other than the standard one). This can
prevent it from detecting compatibility issues, if a standard
subsection followed the vendor-specific one.
This patch modifies the attribute parser so that unrecognised vendor
subsections are silently skipped, and the subsections beyond them are
still processed.
Differential Revision: https://reviews.llvm.org/D153335
Move the implementation of the `toString` function from
`llvm/Support/Error.h` to the source file, which allows us to move
`#include "llvm/ADT/StringExtras.h"` to the source file as well.
As `Error.h` is present in a large number of translation units this
means we are unnecessarily bringing in the contents of
`StringExtras.h` - itself a large file with lots of includes - and
slowing down compilation.
Also move the `#include "llvm/ADT/SmallVector.h"` directive to the
source file as it's no longer needed, but this does not give as much of
a benefit.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,920,413,050 to 1,903,629,230 - a
reduction of ~0.87%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D153229
In preparation for moving the `#include "llvm/ADT/StringExtras.h"`
directive from the `llvm/Support/Error.h` header to its source file,
first add all the missing includes that were previously pulled in
transitively through this header.
After D153170 the tables are now sorted by extension name so we can use that to
avoid a linear search.
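A minimal sketch of the idea, assuming a table sorted by name and searched with `std::lower_bound`. The struct, the table contents, and the version numbers are placeholders, not the real RISC-V extension tables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical table entry; the real tables may carry different fields.
struct ExtensionEntry {
  const char *Name;
  unsigned Major;
  unsigned Minor;
};

// Must stay sorted by Name, otherwise the binary search below is invalid.
// Version numbers here are placeholder values.
static const ExtensionEntry ExampleExtensions[] = {
    {"zba", 1, 0}, {"zbb", 1, 0}, {"zbc", 1, 0}, {"zbs", 1, 0}, {"zicsr", 2, 0},
};

// Returns the matching entry, or nullptr if Name is not in the table.
const ExtensionEntry *findExtension(const char *Name) {
  const ExtensionEntry *Begin = std::begin(ExampleExtensions);
  const ExtensionEntry *End = std::end(ExampleExtensions);
  const ExtensionEntry *It = std::lower_bound(
      Begin, End, Name, [](const ExtensionEntry &E, const char *N) {
        return std::strcmp(E.Name, N) < 0;
      });
  if (It != End && std::strcmp(It->Name, Name) == 0)
    return It;
  return nullptr;
}
```

`std::lower_bound` only finds the first entry that is not less than the key, so the explicit equality check afterwards is what distinguishes a hit from a miss.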
Reviewed By: asb, MaskRay
Differential Revision: https://reviews.llvm.org/D153598
This was committed with D153598 merged into it. Reverting to recommit as separate patches.
This reverts commit 690b1c847f0b188202a86dc25a0a76fd8c4618f4.
For the same reasons as D151284, this requires custom lowering of the
truncate libcall on hard float ABIs (the normal libcall code path is
used on soft ABIs).
The extend operation is implemented by a shift just as in the standard
legalisation, but needs to be custom lowered because i32 isn't a legal
type on RV64.
This patch aims to make the minimal changes that result in correct
codegen for the bfloat.ll tests.
Differential Revision: https://reviews.llvm.org/D151663
As the extension list continues to grow it probably makes sense
to use a binary search rather than linear search. Sorting the strings
will make this possible.
This also avoids any question about where to add new strings in
the tables.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D153170
We already had an error for Zcmt, though it appears to be untested.
Add a similar one for Zcmp, along with tests for both.
Factor the code to share the strings as much as possible.
Reviewed By: VincentWu
Differential Revision: https://reviews.llvm.org/D153159
`TrigramIndex` was added back in https://reviews.llvm.org/D27188 as an optimization to make `SpecialCaseList::match()` faster. I've found that `TrigramIndex` actually makes the function slower and it has no functional use, so we can remove it.
I grabbed the list of queries passed to `SpecialCaseList::match()` on a random very large file (`AArch64ISelLowering.cpp`) and measured the runtime to call `match()` on all of them with [this line](8e1f820bb4/llvm/lib/Support/SpecialCaseList.cpp (L64)) disabled and then enabled.
```
$ hyperfine --warmup 3 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests' 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests'
Benchmark 1: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests
Time (mean ± σ): 575.9 ms ± 20.3 ms [User: 573.1 ms, System: 2.7 ms]
Range (min … max): 555.5 ms … 620.0 ms 10 runs
Benchmark 2: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests
Time (mean ± σ): 283.4 ms ± 6.7 ms [User: 280.3 ms, System: 3.0 ms]
Range (min … max): 277.0 ms … 294.9 ms 10 runs
Summary
'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests' ran
2.03 ± 0.09 times faster than 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests'
```
Using `perf` I found that most of the runtime in `TrigramIndex::isDefinitelyOut()` comes from a division operation that seems to come from `std::unordered_map`: 8e1f820bb4/llvm/include/llvm/Support/TrigramIndex.h (L62)
Removing `TrigramIndex` will make it easier to potentially switch to using `GlobPattern` instead of a full regex for `SpecialCaseList`. See discussion in https://reviews.llvm.org/D152762 for details.
Reviewed By: MaskRay, #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D153171
This fixes
```
% ninja -C out/play LLVMSupport
ninja: Entering directory `out/play'
[151/158] Building ASM object lib/Support/BLAKE3/CMakeFiles/LLVMSupportBlake3.dir/blake3_avx512_x86-64_unix.S.o
clang: warning: argument unused during compilation: '-mavx512vl' [-Wunused-command-line-argument]
```
and applies `disable_blake3_x86_simd()`.
This fixes the root cause of commit 5160f6fefb0021a0b23e99c7cf621a330241c211 ("broke cross-builds of llvm from x86_64 to arm64 mac"...)
Add missing include guards to LLVM header files that did not previously
have them and update existing include guards to ensure that they enclose
all non-whitespace, non-comment text to enable these headers for the
multiple-include optimization.
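For illustration, a header shaped for this optimization might look like the sketch below. The macro and function names are hypothetical. The point is that nothing except comments and whitespace appears outside the `#ifndef`/`#endif` pair, which lets a compiler recognise the guard and skip re-reading the file on later inclusions.

```cpp
// Hypothetical header: example/Widget.h
// Comments and blank lines may precede the guard.

#ifndef EXAMPLE_WIDGET_H
#define EXAMPLE_WIDGET_H

// All declarations and definitions live inside the guard.
constexpr int widgetCount() { return 3; }

#endif // EXAMPLE_WIDGET_H
```

A stray declaration or `#include` placed before the `#ifndef` or after the `#endif` would still be correct C++, but it can prevent the compiler from treating the guard as covering the whole file.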
Differential Revision: https://reviews.llvm.org/D150511
Update the RISC-V Zvk (vector cryptography) extension support from 0.5
to version 0.9.7 (2023-05-31), per
<https://github.com/riscv/riscv-crypto/releases/download/v20230531/riscv-crypto-spec-vector.pdf>
Differences:
- Zvbc is dropped from Zvkn and Zvks, and by extension
from Zvkng and Zvksg;
- new combo extensions Zvknc and Zvksc are introduced,
adding Zvbc to Zvkn and Zvks;
- the experimental extensions are tagged as "0.9",
from the earlier "0.5".
Reviewed By: 4vtomat
Differential Revision: https://reviews.llvm.org/D152117
There are some new cases if the division is `exact`:
1: If `TZ(LHS) == TZ(RHS)` then the result is always Odd
2: If `TZ(LHS) > TZ(RHS)` then the `TZ(LHS)-TZ(RHS)` bits of the
result are zero.
Proofs: https://alive2.llvm.org/ce/z/3rAZqF
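As an informal cross-check alongside the proofs, the sketch below brute-forces both rules over every exact unsigned 8-bit division with concrete (fully known) operands. It is a standalone illustration with hypothetical helper names, not the KnownBits implementation, and it covers only the unsigned, nonzero case.

```cpp
#include <cassert>
#include <cstdint>

// Counts trailing zero bits of a nonzero value below 256.
unsigned countTrailingZeros8(unsigned V) {
  unsigned N = 0;
  while (N < 8 && ((V >> N) & 1u) == 0)
    ++N;
  return N;
}

// Returns true if both rules hold for every exact division L / R
// with 1 <= L, R <= 255.
bool exactUDivRulesHold() {
  for (unsigned L = 1; L < 256; ++L) {
    for (unsigned R = 1; R < 256; ++R) {
      if (L % R != 0)
        continue; // the division is not exact
      unsigned Q = L / R;
      unsigned TzL = countTrailingZeros8(L);
      unsigned TzR = countTrailingZeros8(R);
      // Rule 1: equal trailing-zero counts give an odd quotient.
      if (TzL == TzR && (Q & 1u) == 0)
        return false;
      // Rule 2: the low TZ(LHS) - TZ(RHS) bits of the quotient are zero.
      if (TzL > TzR && countTrailingZeros8(Q) < TzL - TzR)
        return false;
    }
  }
  return true;
}
```

Both rules follow from `L == R * Q` when the division is exact: the trailing-zero count of a product is the sum of the trailing-zero counts of its factors.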
As well, return zero in known poison cases to be consistent rather
than just worrying about the bits we are changing.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D150923
It seems consistent to always return zero for known poison rather than
varying the value. We do the same elsewhere.
Differential Revision: https://reviews.llvm.org/D150922
Fix the chronic misspelling of 'denominator' as 'denuminator', along with
a few other cases.
On the logic side, no longer require `RHS` to be strictly positive in
`sdiv`. This in turn means we need to handle a possible zero `denom`
in the APInt division.
Differential Revision: https://reviews.llvm.org/D150921
In https://reviews.llvm.org/D147812 I introduced the class
`BalancedPartitioning` and it seemed to trigger a warning in flang
```
C:\Users\buildbot-worker\minipc-ryzen-win\flang-x86_64-windows\llvm-project\llvm\include\llvm/Support/BalancedPartitioning.h(89): warning C4305: 'initializing': truncation from 'double' to 'float'
```
For good measure, I converted all double literals to floats. This should
be NFC.
In [0] we described an algorithm called //BalancedPartitioning// (bp) to consume function traces [1] and compute a function order that reduces the number of page faults during startup.
This patch adds the `order` command to the `llvm-profdata` tool which uses bp to output a function order that can be passed to the linker via `--symbol-ordering-file=`.
Special thanks to Sergey Pupyrev and Julian Mestre for designing this balanced partitioning algorithm.
[0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
[1] https://reviews.llvm.org/D147287
Reviewed By: spupyrev
Differential Revision: https://reviews.llvm.org/D147812
Make ValueTracking directly call the KnownBits shift helpers, which
provides more precise results.
Unfortunately, ValueTracking has a special case where sometimes we
determine non-zero shift amounts using isKnownNonZero(). I have my
doubts about the usefulness of that special-case (it is only tested
in a single unit test), but I've reproduced the special-case via an
extra parameter to the KnownBits methods.
Differential Revision: https://reviews.llvm.org/D151816
Encountered an ASAN crash and found that a pointer was being dereferenced without a check.
Reviewed By: kito-cheng, eklepilkina
Differential Revision: https://reviews.llvm.org/D151716
This reverts commit 35a0079238ce9fc36cdc8c6a2895eb5538bf7b4a.
The backend support is not present yet. The intrinsics will crash
the compiler if compiled to assembly or binary.
D111241 added support for extractBits() with zero width. Extend this
to extractBitsAsZExtValue() as well for consistency (in which case
it will always return zero).
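A minimal sketch of the zero-width behaviour on a plain `uint64_t`, assuming the same argument order (width, then position). The function name is hypothetical and this is not the APInt implementation.

```cpp
#include <cassert>
#include <cstdint>

// Extracts NumBits bits starting at BitPosition and zero-extends them.
// Precondition: BitPosition + NumBits <= 64.
uint64_t extractBitsZExt(uint64_t Value, unsigned NumBits,
                         unsigned BitPosition) {
  assert(BitPosition <= 64 && NumBits <= 64 - BitPosition &&
         "bit range out of bounds");
  // Zero-width extraction always yields zero. Returning early also avoids
  // an out-of-range shift when BitPosition is 64.
  if (NumBits == 0)
    return 0;
  // A shift by 64 would be undefined, so build the full mask separately.
  uint64_t Mask = NumBits == 64 ? ~0ULL : ((1ULL << NumBits) - 1);
  return (Value >> BitPosition) & Mask;
}
```

Handling the zero-width case explicitly means callers that compute the width dynamically do not need their own special case.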
Differential Revision: https://reviews.llvm.org/D151788
We currently don't call into KnownBits::shl() from ValueTracking
if the shift amount is unknown. If we do try to do so, we get
significant compile-time regressions, because evaluating all 64
shift amounts is quite expensive, and mostly pointless in this case.
Add a fast-path for the case where the shift amount is the full
[0, BitWidth-1] range. This primarily requires a more accurate
estimate of the max shift amount, to avoid taking the fast-path in
too many cases.
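To show why a fast path is possible at all, here is a sketch on a hypothetical 8-bit known-bits pair (`Known8`, not `llvm::KnownBits`). When the shift amount can be anything in [0, 7], the only bits still known to be zero in the result are the low bits covered by the operand's minimum trailing-zero count, so they can be computed directly instead of intersecting all shift amounts. This illustrates the idea only and is not necessarily what the patch implements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit stand-in: a set bit in Zero/One means that bit of the
// value is known to be 0/1. Zero and One never overlap.
struct Known8 {
  uint8_t Zero;
  uint8_t One;
};

// Slow path: intersect the known-zero bits over every shift amount in [0, 7].
uint8_t shlKnownZeroSlow(Known8 LHS) {
  uint8_t Result = 0xFF;
  for (unsigned Amt = 0; Amt < 8; ++Amt) {
    // Shifting moves known zeros up and shifts in Amt known-zero low bits.
    uint8_t Zero =
        static_cast<uint8_t>((LHS.Zero << Amt) | ((1u << Amt) - 1));
    Result = static_cast<uint8_t>(Result & Zero);
  }
  return Result;
}

// Fast path: only the low MinTZ bits stay known zero.
uint8_t shlKnownZeroFast(Known8 LHS) {
  unsigned MinTZ = 0;
  while (MinTZ < 8 && ((LHS.Zero >> MinTZ) & 1u))
    ++MinTZ;
  return static_cast<uint8_t>((1u << MinTZ) - 1);
}

// Compares both paths for every consistent Known8 input.
bool fastPathMatchesSlowPath() {
  for (unsigned Z = 0; Z < 256; ++Z)
    for (unsigned O = 0; O < 256; ++O) {
      if (Z & O)
        continue; // a bit cannot be known zero and known one at once
      Known8 K{static_cast<uint8_t>(Z), static_cast<uint8_t>(O)};
      if (shlKnownZeroSlow(K) != shlKnownZeroFast(K))
        return false;
    }
  return true;
}
```

The sketch tracks only the known-zero side; known-one bits almost never survive an arbitrary shift, which is part of why the exhaustive evaluation is mostly pointless in this case.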
Differential Revision: https://reviews.llvm.org/D151540
Add overloads of sshl_ov, ushl_ov, sshl_sat and ushl_sat that take the
shift amount as unsigned instead of APInt. This matches what we do for
the normal shift operators and can help to avoid creating temporary
APInts in some cases.
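The sketch below shows the shape of such an interface on a fixed 32-bit integer, with the shift amount passed as a plain `unsigned`. The function names are hypothetical, and edge cases (for example shifting zero by the full width or more) may be defined differently in APInt.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Shifts left and reports whether any set bit was shifted out.
uint32_t shlWithOverflow(uint32_t Value, unsigned Amt, bool &Overflow) {
  if (Amt >= 32) {
    // Assumption of this sketch: only a nonzero value overflows here.
    Overflow = Value != 0;
    return 0;
  }
  uint32_t Result = Value << Amt;
  // If shifting back does not restore the input, bits were lost.
  Overflow = (Result >> Amt) != Value;
  return Result;
}

// Shifts left, clamping to the maximum value on overflow.
uint32_t shlSaturating(uint32_t Value, unsigned Amt) {
  bool Overflow = false;
  uint32_t Result = shlWithOverflow(Value, Amt, Overflow);
  return Overflow ? std::numeric_limits<uint32_t>::max() : Result;
}
```

Taking the amount as `unsigned` lets a caller write `shlSaturating(X, 3)` directly, without first wrapping the constant in a wide-integer object.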
Differential Revision: https://reviews.llvm.org/D151420
Implement precise nuw/nsw support in the KnownBits implementation,
replacing the rather crude handling in ValueTracking.
Differential Revision: https://reviews.llvm.org/D151208