5 Commits

Author SHA1 Message Date
Brendan Duke
f991ebbb46
[Support] Add llvm::xxh3_128bits (#95863)
Add a 128-bit xxhash function, following the existing
`llvm::xxh3_64bits` and `llvm::xxHash` implementations. Previously,
48e93f57f1ee914ca29aa31bf2ccd916565a3610 added support for
`llvm::xxh3_64bits`, which closely follows the upstream implementation
at https://github.com/Cyan4973/xxHash, with simplifications from Devin
Hussey's xxhash-clean.
However, it is desirable to have a larger 128-bit hash key for use cases
such as filesystem checksums where chance of collision needs to be
negligible.

So to that end this also ports over the 128-bit xxh3_128bits as
`llvm::xxh3_128bits`.

Testing:
- Add a test based on xsum_sanity_check.c in upstream xxhash.
2024-06-19 15:24:54 -04:00
Fangrui Song
48e93f57f1 [Support] Add llvm::xxh3_64bits
ld.lld SHF_MERGE|SHF_STRINGS duplicate elimination is computation heavy
and utilitizes llvm::xxHash64, a simplified version of XXH64.
Externally many sources confirm that a new variant XXH3 is much faster.

I have picked a few hash implementations and computed the
proportion of time spent on hashing in the overall link time (a debug
build of clang 16 on a machine using AMD Zen 2 architecture):

* llvm::xxHash64: 3.63%
* official XXH64 (`#define XXH_VECTOR XXH_SCALAR`): 3.53%
* official XXH3_64bits (`#define XXH_VECTOR XXH_SCALAR`): 1.21%
* official XXH3_64bits (default, essentially `XXH_SSE2`): 1.22%
* this patch llvm::xxh3_64bits: 1.19%

The remaining part of lld remains unchanged. Consequently, a lower ratio
indicates that hashing is faster. Therefore, it is evident that XXH3 from xxhash
is significantly faster than both the official version and our llvm::xxHash64.

(
string length: count
1-3: 393434
4-8: 2084056
9-16: 2846249
17-128: 5598928
129-240: 1317989
241-: 328058
)

This patch adds heavily simplified https://github.com/Cyan4973/xxHash,
taking account of many simplification ideas from Devin Hussey's xxhash-clean.

Important x86-64 optimization ideas:

* Make XXH3_len_129to240_64b and XXH3_hashLong_64b noinline
* Unroll XXH3_len_17to128_64b
* __restrict does not affect Clang code generation

Beside SHF_MERGE|SHF_STRINGS duplicate elimination, llvm/ADT/StringMap.h
StringMapImpl::LookupBucketFor and a few places in lld can potentially be
accelerated by switching to llvm::xxh3_64bits.

Link: https://github.com/llvm/llvm-project/issues/63750

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D154812
2023-07-18 13:36:11 -07:00
Benjamin Kramer
72eac42f21 [xxHash] Don't trigger UB on empty StringRef
This is quite silly, but casting to uintptr_t seems like the easiest
option to quiet ubsan.

llvm/lib/Support/xxhash.cpp:107:12: runtime error: applying non-zero offset 8 to null pointer
    #0 0x7fe3660404c0 in llvm::xxHash64(llvm::StringRef) llvm/lib/Support/xxhash.cpp:107:12
2023-02-08 12:53:54 +01:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Rafael Espindola
eaeb6d91a1 Add xxhash to llvm.
It will be used for fast fingerprinting in lld at least.

llvm-svn: 282493
2016-09-27 15:45:57 +00:00