This corrects a Darwin build failure due to a missing stub and an
environment-specific Linux test failure due to an overly restrictive
test regex. This also backs out of the previous fix attempt; %p is
intended to be printed and parsed with the semantics of #.
When the environment LLVM_ENABLE_SYMBOLIZER_MARKUP is set, if
llvm-symbolizer fails or is disabled, this change will print a backtrace
in llvm-symbolizer markup instead of falling back to in-process
symbolization mechanisms.
This allows llvm-symbolizer to be run on the output later to produce a
high quality backtrace, even for fully-stripped LLVM utilities.
Reviewed By: mcgrathr
Differential Revision: https://reviews.llvm.org/D139750
Currently, the `-fprofile-udpate` is ignored when `-fprofile-generate` is in effect. This patch enables `-fprofile-update` for `-fprofile-generate`. This patch continues the work from https://reviews.llvm.org/D87737, which added `-fprofile-update` in the first place.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D157280
Since zihintntl is ratified now, we could remove the experimental prefix and change its version to 1.0.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D151547
Avoid both memory leaks and expensive dynamic allocations by using
SpecificBumpPtrAllocator for HNode types. It's expected they're not deallocated
until the `Input` class is destroyed, so deallocating all at once works well in
this case.
Reduces YAML profile pre-processing time in BOLT from
> 11.2067 ( 3.2%) 1.6487 ( 7.3%) 12.8554 ( 3.5%) 12.8635 ( 5.6%) pre-process profile data
to
> 10.6613 ( 3.1%) 1.6489 ( 6.7%) 12.3102 ( 3.3%) 12.3134 ( 5.3%) pre-process profile data
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D155006
Update this diagnostic to mention the reason (unmatched '['), matching
the other diagnostic about stray '\'. The original pattern is omitted,
as some users may mention the original pattern in another position, not
repeating it.
The `Twine::str()` function currently always allocates heap memory via `std::string`. However, some instances of `Twine` don't need an intermediate buffer at all, and the rest can attempt to print into a stack buffer first.
This is intentionally not making use of `Twine::isSingleStringLiteral()` from D157010 to skip saving the string in the bump-pointer allocator, since the `StringSaver` documentation suggests that MUST happen for every given string.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D157015
The `const char *` storage backing StringLiteral has static lifetime. Making `Twine` aware of that allows us to avoid allocating heap memory in some contexts (e.g. avoid passing it to `StringSaver::save()` in a follow-up Clang patch).
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D157010
Implement XCVbi intrinsics for CV32E40P according to the specification.
This commit is part of a patch-set to upstream the 7 vendor specific extensions of CV32E40P.
Contributors: @CharKeaney, @jeremybennett, @lewis-revill, @liaolucy, Nandni Jamnadas, @paolos, @simoncook, @xmj.
bf2ad26b4ff856aab9a62ad168e6bdefeedc374f originally commited.
e4777dc4b9cb371971523cc603e1b8a5c7255e7e reverted due to test failures caused by a merge conflict marker in llvm/test/CodeGen/RISCV/attributes that was accidentally checked in.
This commit removed the conflict marker and recommitted.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154412
In a Unicode name was stored in a way that caused
a medial hyphen to be at the end of a a chunk, it would not
be properly ignored by the loose matching algorithm.
For example if `LEFT-TO-RIGHT OVERRIDE` was stored as
`LEFT-` [...], the `-` would not be ignored.
The generators now ensures nodes are not cut accross
medial hyphen boundaries.
Fixes#64161
Differential Revision: https://reviews.llvm.org/D156518
Implement XCVbi intrinsics for CV32E40P according to the specification.
This commit is part of a patch-set to upstream the 7 vendor specific extensions of CV32E40P.
Contributors: @CharKeaney, @jeremybennett, @lewis-revill, @liaolucy, Nandni Jamnadas, @PaoloS, @simoncook, @xmj.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154412
Implement XCVsimd intrinsics for CV32E40P according to the specification.
This commit is part of a patch-set to upstream the 7 vendor specific extensions of CV32E40P.
Contributors: @CharKeaney, @jeremybennett, @lewis-revill, @liaolucy, Nandni Jamnadas, @PaoloS, @simoncook, @xmj.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153721
According to the latest spec, Zvfbfwma requires Zvfbfmin and Zvfbfmin requires Zfbfmin, with FLH/FSH/FMV.H.X/HMV.X.H removed from Zvfbfwma.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D155916
Implement XCValu intrinsics for CV32E40P according to the specification.
This is a commit of the patch-set to upstream the 7 vendor specific extensions of CV32E40P.
Contributors: @CharKeaney, Nandni Jamnadas, Serkan Muhcu, @jeremybennett, @lewis-revill, @liaolucy, @simoncook, @xmj
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153748
The current implementation has two primary issues:
* Matching `a*a*a*b` against `aaaaaa` has exponential complexity.
* BitVector harms data cache and is inefficient for literal matching.
and a minor issue that `\` at the end may cause an out of bounds access
in `StringRef::operator[]`.
Switch to an O(|S|*|P|) greedy algorithm instead: factor the pattern
into segments split by '*'. The segment is matched sequentianlly by
finding the first occurrence past the end of the previous match. This
algorithm is used by lots of fnmatch implementations, including musl and
NetBSD's.
In addition, `optional<StringRef> Exact, Suffix, Prefix` wastes space.
Instead, match the non-metacharacter prefix against the haystack, then
match the pattern with the rest.
In practice `*suffix` style patterns are less common and our new
algorithm is fast enough, so don't bother storing the non-metacharacter
suffix.
Note: brace expansions (D153587) can leverage the `matchOne` function.
Differential Revision: https://reviews.llvm.org/D156046
Similar to D142862.
xxh3 is significantly faster than xxh64. Switch to xxh3, as we did for
for lld and llvm-dwarfutil to increase performance (D154813 D155675).
While I think StringMap is not a bottleneck for most applications, it
seems good to eliminate the slower xxh64.
In addition, according to Erik Desjardins, an artificial benchmark of
Rust with very large constant strings improves by ~3% locally.
I have fixed all found issues (~20) separately, but one is remaining:
* ExecutionEngine/RuntimeDyld/ARM/MachO_ARM_PIC_relocations.s has a
failure due to StringMap iteration order. It now passes
with LLVM_ENABLE_REVERSE_ITERATION=on while failing with
LLVM_ENABLE_REVERSE_ITERATION=off.
Reviewed By: erikdesjardins
Differential Revision: https://reviews.llvm.org/D155781
There is a libstdc++ stl_map.h bug that is only back ported to 7.5.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78595#c13
```
/tmp/opt/gcc-7.3.0/include/c++/7.3.0/bits/stl_tree.h:2091:28: error: no matching function for call to ‘std::_Rb_tree<std::__cxx11::basic_string<char>, std::pair<const std::__cxx11::basic_string<char>, std::unique_ptr<llvm::vfs::detail::InMemoryNode> >, std::_Select1st<std::pair<const std::__cxx11::basic_string<char>, std::unique_ptr<llvm::vfs::detail::InMemoryNode> > >, std::less<std::__cxx11::basic_string<char> >, std::allocator<std::pair<const std::__cxx11::basic_string<char>, std::unique_ptr<llvm::vfs::detail::InMemoryNode> > > >::_M_get_insert_unique_pos(std::pair<llvm::StringRef, std::unique_ptr<llvm::vfs::detail::InMemoryNode> >::first_type&)’
= _M_get_insert_unique_pos(_KeyOfValue()(__v));
~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
```
Just construct a std::string from StringRef to work around it.
ProgrammersManual.html says
> StringMap iteration order, however, is not guaranteed to be deterministic, so any uses which require that should instead use a std::map.
This patch makes -DLLVM_REVERSE_ITERATION=on (currently
-DLLVM_ENABLE_REVERSE_ITERATION=on works as well) shuffle StringMap
iteration order (actually flipping the hash so that elements not in the
same bucket are reversed) to catch violations, similar to D35043 for
DenseMap. This should help change the hash function (e.g., D142862,
D155781).
With a lot of fixes, there are still some violations. This patch
implements the "reverse_iteration" lit feature to skip such tests.
Eventually we should remove this feature.
`ninja check-{llvm,clang,clang-tools}` are clean with
`#define LLVM_ENABLE_REVERSE_ITERATION 1`.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D155789
Link time thinLTO spawns pthreads to parallelize optimization and
codegen of the input bitcode files. On AIX, the default pthread
stack size limit is ~192k for 64-bit programs; insufficient for a
normal LLVM compilation.
Reviewed By: ZarkoCA, MaskRay
Differential Revision: https://reviews.llvm.org/D155731
ld.lld SHF_MERGE|SHF_STRINGS duplicate elimination is computation heavy
and utilitizes llvm::xxHash64, a simplified version of XXH64.
Externally many sources confirm that a new variant XXH3 is much faster.
I have picked a few hash implementations and computed the
proportion of time spent on hashing in the overall link time (a debug
build of clang 16 on a machine using AMD Zen 2 architecture):
* llvm::xxHash64: 3.63%
* official XXH64 (`#define XXH_VECTOR XXH_SCALAR`): 3.53%
* official XXH3_64bits (`#define XXH_VECTOR XXH_SCALAR`): 1.21%
* official XXH3_64bits (default, essentially `XXH_SSE2`): 1.22%
* this patch llvm::xxh3_64bits: 1.19%
The remaining part of lld remains unchanged. Consequently, a lower ratio
indicates that hashing is faster. Therefore, it is evident that XXH3 from xxhash
is significantly faster than both the official version and our llvm::xxHash64.
(
string length: count
1-3: 393434
4-8: 2084056
9-16: 2846249
17-128: 5598928
129-240: 1317989
241-: 328058
)
This patch adds heavily simplified https://github.com/Cyan4973/xxHash,
taking account of many simplification ideas from Devin Hussey's xxhash-clean.
Important x86-64 optimization ideas:
* Make XXH3_len_129to240_64b and XXH3_hashLong_64b noinline
* Unroll XXH3_len_17to128_64b
* __restrict does not affect Clang code generation
Beside SHF_MERGE|SHF_STRINGS duplicate elimination, llvm/ADT/StringMap.h
StringMapImpl::LookupBucketFor and a few places in lld can potentially be
accelerated by switching to llvm::xxh3_64bits.
Link: https://github.com/llvm/llvm-project/issues/63750
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D154812
With the new behaviour, the /MD or similar options aren't added to
e.g. CMAKE_CXX_FLAGS_RELEASE, but are added separately by CMake.
They can be changed by the cmake variable
CMAKE_MSVC_RUNTIME_LIBRARY or with the target property
MSVC_RUNTIME_LIBRARY.
LLVM has had its own custom CMake flags, e.g. LLVM_USE_CRT_RELEASE,
which affects which CRT is used for release mode builds. Deprecate
these and direct users to use CMAKE_MSVC_RUNTIME_LIBRARY directly
instead (and do a best effort attempt at setting CMAKE_MSVC_RUNTIME_LIBRARY
based on the existing LLVM_USE_CRT_ flags). This only handles the
simple cases, it doesn't handle multi-config generators with
different LLVM_USE_CRT_* variables for different configs though,
but that's probably fine - we should move over to the new upstream
CMake mechanism anyway, and push users towards that.
Change code in compiler-rt, that previously tried to override the
CRT choice to /MT, to set CMAKE_MSVC_RUNTIME_LIBRARY instead of
meddling in the old variables.
This resolves the policy issue in
https://github.com/llvm/llvm-project/issues/63286, and should
handle the issues that were observed originally when the
minimum CMake version was bumped, in
https://github.com/llvm/llvm-project/issues/62719 and
https://github.com/llvm/llvm-project/issues/62739.
Differential Revision: https://reviews.llvm.org/D155233
This is a mostly NFC change cleaning up and clarifying components of the
in-tree CORE-V (xcv*) extensions following discussions on the remaining
extensions.
This makes the following changes to the xcbitmanip and xcvmac support:
1. Add missing extensions from RISCVISAInfo, such that they can be
supported in clang's -march option.
2. Clarify the extension version number is 1.0.0 in documentation.
3. Clarify the extensions are by OpenHW Group, and the capitilization
of the CORE-V extension family.
4. Add CORE-V to extension name in RISCVFeatures, both to be consistent
with other vendors, and also better distinguish e.g. CORE-V bit
manipulation vs RISC-V's standard Zb extensions.
Differential Revision: https://reviews.llvm.org/D155283
Move the implementation of the `toString` function from
`llvm/Support/Error.h` to the source file, which allows us to move
`#include "llvm/ADT/StringExtras.h"` to the source file as well.
As `Error.h` is present in a large number of translation units this
means we are unnecessarily bringing in the contents of
`StringExtras.h` - itself a large file with lots of includes - and
slowing down compilation.
Also move the `#include "llvm/ADT/SmallVector.h"` directive to the
source file as it's no longer needed, but this does not give as much of
a benefit.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,920,413,050 to 1,903,629,230 - a
reduction of ~0.87%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D155178
According to the spec, Zce is an alias for Zca, Zcb, Zcmp, and Zcmt.
If F is enabled on RV32 it also includes Zcf.
This patch adds the Zce and the implication rule which unfortunately
requires custom handling for adding Zcf.
I've also made all the Zc* extensions imply Zca.
I've also added an error for Zcf without RV32.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D153742
Move the implementation of the `toString` function from
`llvm/Support/Error.h` to the source file, which allows us to move
`#include "llvm/ADT/StringExtras.h"` to the source file as well.
As `Error.h` is present in a large number of translation units this
means we are unnecessarily bringing in the contents of
`StringExtras.h` - itself a large file with lots of includes - and
slowing down compilation.
Also move the `#include "llvm/ADT/SmallVector.h"` directive to the
source file as it's no longer needed, but this does not give as much of
a benefit.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,920,413,050 to 1,903,629,230 - a
reduction of ~0.87%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D155018
This restores commit b4a82b62258c5f650a1cccf5b179933e6bae4867, reverted
in 3ab7ef28eebf9019eb3d3c4efd7ebfd160106bb1 because it was thought to
cause a bot failure, which ended up being unrelated to this patch set.
Differential Revision: https://reviews.llvm.org/D154856
Previously the MemProf profile was expected to be in the same profile
file as a normal PGO profile, passed via the usual -fprofile-use=
option, and was matched in the same pass. To simplify profile
preparation, since the raw MemProf profile requires the binary for
symbolization and may be simpler to index separately from the raw PGO
profile, and also to enable providing a MemProf profile for a SamplePGO
build, separate out the MemProf feedback option and matching pass.
This patch adds the -fmemory-profile-use=${file} option, and the
provided file is passed down to LLVM and ultimately used in a new
MemProfUsePass which performs the matching of just the memory profile
contents of that file.
Note that a single profile file containing both normal PGO and MemProf
profile data is still supported, and the relevant profile data is
matched by the appropriate matching pass(es) based on which option(s)
the profile is provided with (the same profile file can be supplied to
both feedback options).
Differential Revision: https://reviews.llvm.org/D154856
A follow-up to D135841. This patch returns the virtual path for directories from `RedirectingFileSystem`. This ensures the contents of `Path` are the same as the contents of `FS->getRealPath(Path)`. This also means we can drop the workaround in Clang's module map canonicalization, where we couldn't use the real path for a directory if it resolved to a different `DirectoryEntry`. In addition to that, we can also avoid introducing new workaround for a bug triggered by the newly introduced test case.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D135849
This implements the v1.0-rc1 draft extension.
amocas.d on RV32 and amocas.q have the restriction that rd and rs2 must
be even registers. I've opted to implement this restriction in
RISCVAsmParser::validateInstruction even though for codegen we'll need a
new register class and can then remove this validation. This also
sidesteps, for now, the issue of amocas.d being different on rv32 vs
rv64.
See <https://github.com/riscv-non-isa/riscv-c-api-doc/issues/37> for the
issue of needing an agreed asm register constraint for register pairs.
Differential Revision: https://reviews.llvm.org/D149248
Move the implementation of the `toString` function from
`llvm/Support/Error.h` to the source file, which allows us to move
`#include "llvm/ADT/StringExtras.h"` to the source file as well.
As `Error.h` is present in a large number of translation units this
means we are unnecessarily bringing in the contents of
`StringExtras.h` - itself a large file with lots of includes - and
slowing down compilation.
Also move the `#include "llvm/ADT/SmallVector.h"` directive to the
source file as it's no longer needed, but this does not give as much of
a benefit.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,920,413,050 to 1,903,629,230 - a
reduction of ~0.87%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D154775