344 Commits

Author SHA1 Message Date
Vinay Deshmukh
257e483715
[libc] Add -Wno-sign-conversion & re-attempt -Wconversion (#129811)
Relates to
https://github.com/llvm/llvm-project/issues/119281#issuecomment-2699470459
2025-03-10 11:57:09 -04:00
Augie Fackler
da61b0ddc5 Revert "[libc] Enable -Wconversion for tests. (#127523)"
This reverts commit 1e6e845d49a336e9da7ca6c576ec45c0b419b5f6 because it
changed the 1st parameter of adjust() to be unsigned, but libc itself
calls adjust() with a negative argument in align_backward() in
op_generic.h.
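
To illustrate why the offset parameter has to stay signed, here is a minimal sketch. `adjust_sketch` and `align_backward_sketch` are hypothetical stand-ins written for this note, not the code in op_generic.h; the only detail taken from the message above is that the backward-alignment path passes a negative first argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an adjust()-style helper: move the pointer by a
// signed number of bytes and update the remaining count to match.
inline void adjust_sketch(std::ptrdiff_t offset, const char *&ptr,
                          std::size_t &count) {
  ptr += offset;
  count = static_cast<std::size_t>(static_cast<std::ptrdiff_t>(count) - offset);
}

// Hypothetical backward alignment: step the pointer back to the previous
// multiple of `alignment`, which requires a negative offset.
inline void align_backward_sketch(const char *&ptr, std::size_t &count,
                                  std::size_t alignment) {
  const std::size_t misalignment =
      reinterpret_cast<std::uintptr_t>(ptr) % alignment;
  adjust_sketch(-static_cast<std::ptrdiff_t>(misalignment), ptr, count);
}

// Small self-check: stepping back 5 bytes grows the count by 5.
inline void check_align_backward_sketch() {
  alignas(16) static char buffer[32] = {};
  const char *ptr = buffer + 5;
  std::size_t count = 10;
  align_backward_sketch(ptr, count, 16);
  assert(ptr == buffer);
  assert(count == 15);
}
```

If the first parameter were an unsigned type, the negated value would be converted to a very large positive number before reaching the helper.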
2025-03-05 16:42:40 -05:00
Michael Jones
ed5cd8d464
[libc] Fix casts for arm32 after Wconversion (#129771)
Followup to #127523

Some tests failed on arm32 after enabling Wconversion because of missing
casts. I also changed BigInt's `safe_get_at` back to being signed since it
needed the ability to be negative.
2025-03-04 14:32:36 -08:00
Vinay Deshmukh
1e6e845d49
[libc] Enable -Wconversion for tests. (#127523)
Relates to: #119281
2025-03-04 10:24:35 -05:00
Krishna Pandey
19c3e2f7de
[libc] Fix all imports of src/string/memory_utils (#114939)
Fixed imports for all files *within* `libc/src/string/memory_utils`.
Note: This doesn't include **all** files that need to be fixed.

Fixes #86579
2025-02-05 09:24:44 -08:00
Nick Desaulniers
631a6e0004
[libc][wchar] implement wcslen (#124150)
Update string_utils' string_length to work with char* or wchar_t*, so that it
may be reusable when implementing wmemchr, wcspbrk, wcsrchr, wcsstr.
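
A minimal sketch of a length routine that works for both `char*` and `wchar_t*` follows. The name `string_length_sketch` and its exact signature are assumptions made for illustration, not the `string_utils` code itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: count elements before the terminator for any
// character type.
template <typename CharT>
std::size_t string_length_sketch(const CharT *src) {
  std::size_t length = 0;
  while (src[length] != CharT(0))
    ++length;
  return length;
}

// Small self-check covering a narrow and a wide string.
inline void check_string_length_sketch() {
  assert(string_length_sketch("abc") == 3);
  assert(string_length_sketch(L"wide") == 4);
}
```

Templating on the character type lets one loop serve both the narrow and the wide entry points.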

Link: #121183
Link: #124027

Co-authored-by: Nick Desaulniers <ndesaulniers@google.com>

---------

Co-authored-by: Tristan Ross <tristan.ross@midstall.com>
2025-01-23 13:33:04 -08:00
Nick Desaulniers
431ea2d076
[libc] move bcmp, bzero, bcopy, index, rindex, strcasecmp, strncasecmp to strings.h (#118899)
docgen relies on the convention that we have a file foo.cpp in
libc/src/\<header\>/. Because the above functions weren't in libc/src/strings/
but rather libc/src/string/, docgen could not find that we had implemented
these.

Rather than add special carve outs to docgen, let's fix up our sources for
these 7 functions to stick with the existing conventions the rest of the
codebase follows.

Link: #118860
Fixes: #118875
2024-12-10 08:58:45 -08:00
Schrodinger ZHU Yifan
c71418574f
[libc] suppress more clang-cl warnings (#117718)
- migrate more `-O3` to `${libc_opt_high_flag}`
- work around an issue with `LLP64` in a test. The overflow testing is
guarded by a constexpr, but the literal overflow itself still
triggers warnings.

Note that for the math smoke test, for some reason,
`${libc_opt_high_flag}` is passed to `lld-link`, which confuses
the linker, so some warnings are still left over there. I can
investigate more when I have time.
2024-11-26 15:15:58 -05:00
Schrodinger ZHU Yifan
1973270fc6
[libc] suppress string warning in case intrinsics are defined as macros (#117640) 2024-11-25 18:05:22 -05:00
Jay Foad
d6fc7d3ab1 Fix typo "intead" 2024-11-21 14:48:38 +00:00
Daniel Thornburgh
95b680e4c3
[libc] Rename libc/src/__support/endian.h to endian_internal.h (#115950)
This prevents a conflict with the Linux system endian.h when built in
overlay mode for CPP files in __support.

This issue appeared in PR #106259.
2024-11-13 10:28:07 -08:00
Job Henandez Lara
33bdb53d86
[libc] Remove the #include <stdlib.h> header (#114453) 2024-11-01 21:49:57 -07:00
George Burgess IV
50c44478fe
[libc] fix behavior of strrchr(x, '\0') (#112620)
`strrchr("foo", '\0')` is defined to point to the end of `foo`, rather
than returning NULL. This wasn't caught by tests, since llvm-libc's
`ASSERT_STREQ(nullptr, "");` is not an assertion error.

While I'm here, refactor the test slightly to check for NULL more
specifically. I considered adding fancier `ASSERT`s (and changing the
semantics of `ASSERT_STREQ`), but opted for a more local fix by fair
dice roll.
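
The behavior described above can be checked against any conforming `strrchr` with a small sketch like this one. The helper name `strrchr_finds_terminator` is made up for illustration and is not part of the test suite.

```cpp
#include <cassert>
#include <cstring>

// Returns true when strrchr(s, '\0') points at the terminator of s rather
// than returning a null pointer.
inline bool strrchr_finds_terminator(const char *s) {
  const char *found = std::strrchr(s, '\0');
  return found != nullptr && found == s + std::strlen(s);
}

// Small self-check using the string from the commit message.
inline void check_strrchr_terminator() {
  assert(strrchr_finds_terminator("foo"));
}
```

Checking for a non-null result separately from the pointer comparison mirrors the commit's point that a null result must be tested for explicitly.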
2024-10-30 15:08:03 -07:00
George Burgess IV
b03c8c4fdd
libc: strlcpy/strlcat shouldn't bzero the rest of buf (#114259)
When running Bionic's testsuite over llvm-libc, tests broke because
e.g.,

```
const char *str = "abc";
char buf[7]{"111111"};
strlcpy(buf, str, 7);
ASSERT_EQ(buf, {'1', '1', '1', '\0', '\0', '\0', '\0'});
```

On my machine (Debian w/ glibc and clang-16), a `printf` loop over `buf`
gets unrolled into a series of const `printf` at compile-time:
```
printf("%d\n", '1');
printf("%d\n", '1');
printf("%d\n", '1');
printf("%d\n", 0);
printf("%d\n", '1');
printf("%d\n", '1');
printf("%d\n", 0);
```

Seems best to match existing precedent here.
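
As a rough illustration of the semantics named in the title, the sketch below copies, writes a single terminator, and leaves the rest of the destination alone. `strlcpy_sketch` is a hypothetical function written for this note, not the llvm-libc implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of strlcpy semantics: copy at most size - 1 characters,
// write one terminator, and leave the remaining bytes of dst untouched.
inline std::size_t strlcpy_sketch(char *dst, const char *src,
                                  std::size_t size) {
  std::size_t src_len = 0;
  while (src[src_len] != '\0')
    ++src_len;
  if (size == 0)
    return src_len;
  const std::size_t to_copy = src_len < size - 1 ? src_len : size - 1;
  for (std::size_t i = 0; i < to_copy; ++i)
    dst[i] = src[i];
  dst[to_copy] = '\0';
  return src_len;
}

// Small self-check: bytes after the terminator keep their old contents.
inline void check_strlcpy_sketch() {
  char buf[7] = "111111";
  assert(strlcpy_sketch(buf, "abc", sizeof(buf)) == 3);
  assert(buf[3] == '\0');
  assert(buf[4] == '1' && buf[5] == '1');
}
```

The return value is the length of the source, so callers can detect truncation by comparing it with the buffer size.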
2024-10-30 12:28:32 -07:00
Guillaume Chatelet
2f58ac4a22
[libc][x86] copy one cache line at a time to prevent the use of rep;movsb (#113161)
When using `-mprefer-vector-width=128` with `-march=sandybridge`, copying
3 cache lines in one go (192B) gets converted into `rep;movsb`, which
translates into a 60% performance hit.

Consecutive calls to `__builtin_memcpy_inline` (implementation behind
`builtin::Memcpy::block_offset`) are not coalesced by the compiler and
so calling it three times in a row generates the desired assembly. It
only differs in the interleaving of the loads and stores and does not
affect performance.
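
The idea of issuing three fixed-size copies instead of one 192-byte copy can be sketched as follows. The helper names and the 64-byte `kCacheLine` constant are assumptions for illustration. `__builtin_memcpy_inline` is a Clang builtin, so the sketch falls back to `std::memcpy` elsewhere, and it makes no promise about the assembly a given compiler emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#ifndef __has_builtin
#define __has_builtin(x) 0 // compilers without __has_builtin take the fallback
#endif

constexpr std::size_t kCacheLine = 64; // assumed cache line size in bytes

// Copy exactly one cache line with a fixed-size copy.
inline void copy_cache_line(char *dst, const char *src) {
#if __has_builtin(__builtin_memcpy_inline)
  __builtin_memcpy_inline(dst, src, kCacheLine);
#else
  std::memcpy(dst, src, kCacheLine);
#endif
}

// Copy three cache lines (192 bytes) as three separate 64-byte copies.
inline void copy_three_cache_lines(char *dst, const char *src) {
  copy_cache_line(dst, src);
  copy_cache_line(dst + kCacheLine, src + kCacheLine);
  copy_cache_line(dst + 2 * kCacheLine, src + 2 * kCacheLine);
}

// Small self-check that all 192 bytes arrive intact.
inline void check_copy_three_cache_lines() {
  char src[3 * kCacheLine];
  char dst[3 * kCacheLine];
  for (std::size_t i = 0; i < sizeof(src); ++i)
    src[i] = static_cast<char>(i % 251);
  std::memset(dst, 0, sizeof(dst));
  copy_three_cache_lines(dst, src);
  assert(std::memcmp(dst, src, sizeof(src)) == 0);
}
```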

This is needed to reland
https://github.com/llvm/llvm-project/pull/108939.
2024-10-22 10:48:43 +02:00
Job Henandez Lara
2ce10f0491
[libc] Remove the <string.h> header in libc/src and libc/test (#113076) 2024-10-20 09:05:41 -07:00
c8ef
2cc9795140
[libc] Clean up some include in libc. (#110980)
The patch primarily cleans up some incorrect includes. The `LIBC_INLINE`
macro is defined in `attributes.h`, not `config.h`. There appears to be
no need to change the CMake and Bazel build files.
2024-10-06 11:08:34 -04:00
Vitaly Goldshteyn
66a03295de
[libc] Implement branchless head-tail comparison for bcmp (#107540)
Binary size changes:

| Bytes (cache lines) | before   | after   |
|---------------------|----------|---------|
| sse4                | 419 (7)  | 288 (5) |
| avx                 | 430 (7)  | 308 (5) |
| avx512f             | 589 (10) | 390 (7) |

Benchmarks for different CPUs using
https://github.com/google/fleetbench.

 - indus-cascadelake

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      1.96GB/s ± 1%        2.19GB/s ± 0%  +11.49%  (p=0.000 n=29+24)
BM_LIBC_Bcmp_Fleet_L2                                      1.90GB/s ± 1%        2.14GB/s ± 1%  +12.68%  (p=0.000 n=29+24)
BM_LIBC_Bcmp_Fleet_LLC                                      513MB/s ± 4%         531MB/s ± 4%   +3.53%  (p=0.000 n=24+24)
BM_LIBC_Bcmp_Fleet_Cold                                     452MB/s ± 3%         456MB/s ± 4%     ~     (p=0.103 n=30+30)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  2.98GB/s ± 1%        3.15GB/s ± 1%   +5.59%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  2.86GB/s ± 1%        3.07GB/s ± 1%   +7.21%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]   738MB/s ± 7%         751MB/s ± 3%   +1.68%  (p=0.000 n=24+25)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   643MB/s ± 3%         642MB/s ± 4%     ~     (p=0.522 n=29+30)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.08GB/s ± 0%        3.25GB/s ± 0%   +5.35%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  2.97GB/s ± 1%        3.17GB/s ± 1%   +6.65%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]   901MB/s ±59%         871MB/s ±36%     ~     (p=0.676 n=29+27)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   686MB/s ± 4%         686MB/s ± 3%     ~     (p=0.934 n=29+30)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.63GB/s ± 0%        1.80GB/s ± 1%  +10.19%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.57GB/s ± 1%        1.75GB/s ± 1%  +11.46%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]   451MB/s ±61%         427MB/s ±28%     ~     (p=0.469 n=29+25)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   353MB/s ± 4%         354MB/s ± 5%     ~     (p=0.467 n=30+30)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  1.91GB/s ± 1%        2.10GB/s ± 1%   +9.90%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  1.84GB/s ± 1%        2.03GB/s ± 1%  +10.63%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]   491MB/s ±24%         538MB/s ±24%   +9.66%  (p=0.000 n=24+27)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   417MB/s ± 4%         421MB/s ± 3%     ~     (p=0.063 n=30+29)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   761MB/s ± 1%         867MB/s ± 1%  +14.02%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   748MB/s ± 1%         860MB/s ± 1%  +15.04%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   227MB/s ±29%         260MB/s ±64%  +14.70%  (p=0.000 n=26+27)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   187MB/s ± 3%         191MB/s ± 5%   +2.26%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.48GB/s ± 1%        1.71GB/s ± 1%  +15.26%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.42GB/s ± 1%        1.67GB/s ± 1%  +17.68%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]   412MB/s ±34%         519MB/s ±80%  +25.87%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   336MB/s ± 4%         343MB/s ± 6%   +2.05%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  2.87GB/s ± 0%        3.24GB/s ± 1%  +12.88%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  2.78GB/s ± 1%        3.20GB/s ± 1%  +15.15%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]   926MB/s ±43%        1227MB/s ±76%  +32.53%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   716MB/s ± 4%         737MB/s ± 6%   +3.02%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.54GB/s ± 1%        1.56GB/s ± 0%   +1.40%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.47GB/s ± 1%        1.52GB/s ± 1%   +2.97%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]   351MB/s ±23%         436MB/s ±83%  +24.04%  (p=0.005 n=24+29)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   283MB/s ± 4%         282MB/s ± 4%     ~     (p=0.644 n=30+30)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   824MB/s ± 1%        1048MB/s ± 1%  +27.18%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   808MB/s ± 1%        1027MB/s ± 1%  +27.12%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   317MB/s ±79%         332MB/s ±74%     ~     (p=0.338 n=30+29)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   207MB/s ± 5%         212MB/s ± 5%   +2.27%  (p=0.000 n=30+30)
```

 - indus-skylake

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      2.06GB/s ± 2%        2.25GB/s ± 3%   +9.66%  (p=0.000 n=27+24)
BM_LIBC_Bcmp_Fleet_L2                                      1.96GB/s ± 2%        2.17GB/s ± 2%  +10.61%  (p=0.000 n=30+24)
BM_LIBC_Bcmp_Fleet_LLC                                     1.18GB/s ± 6%        1.32GB/s ± 5%  +12.27%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_Fleet_Cold                                     456MB/s ± 2%         466MB/s ± 2%   +2.22%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  3.08GB/s ± 2%        3.20GB/s ± 1%   +3.72%  (p=0.000 n=28+22)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  2.92GB/s ± 1%        3.05GB/s ± 2%   +4.49%  (p=0.000 n=23+23)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]  1.83GB/s ± 8%        1.94GB/s ± 4%   +6.24%  (p=0.000 n=25+27)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   654MB/s ± 2%         659MB/s ± 2%   +0.76%  (p=0.012 n=30+29)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.19GB/s ± 2%        3.34GB/s ± 2%   +4.41%  (p=0.000 n=26+23)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  3.05GB/s ± 2%        3.21GB/s ± 2%   +5.32%  (p=0.000 n=28+25)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]  1.95GB/s ± 4%        2.03GB/s ±10%   +3.61%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   700MB/s ± 2%         702MB/s ± 2%     ~     (p=0.150 n=30+30)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.69GB/s ± 2%        1.85GB/s ± 1%   +9.31%  (p=0.000 n=30+26)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.60GB/s ± 2%        1.78GB/s ± 2%  +10.90%  (p=0.000 n=26+27)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]  1.01GB/s ± 5%        1.12GB/s ± 5%  +11.40%  (p=0.000 n=27+28)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   355MB/s ± 3%         360MB/s ± 3%   +1.46%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  1.98GB/s ± 2%        2.15GB/s ± 2%   +8.89%  (p=0.000 n=29+27)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  1.87GB/s ± 3%        2.05GB/s ± 2%  +10.06%  (p=0.000 n=30+26)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]  1.19GB/s ± 4%        1.31GB/s ± 6%   +9.82%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   424MB/s ± 3%         431MB/s ± 3%   +1.58%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   849MB/s ± 2%         949MB/s ± 2%  +11.84%  (p=0.000 n=27+28)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   815MB/s ± 3%         913MB/s ± 3%  +12.06%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   512MB/s ± 9%         571MB/s ± 7%  +11.40%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   187MB/s ± 3%         192MB/s ± 2%   +2.56%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.55GB/s ± 2%        1.77GB/s ± 3%  +13.93%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.47GB/s ± 2%        1.70GB/s ± 2%  +15.96%  (p=0.000 n=27+26)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]   939MB/s ± 5%        1084MB/s ± 4%  +15.36%  (p=0.000 n=28+27)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   340MB/s ± 2%         347MB/s ± 3%   +1.93%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  3.06GB/s ± 3%        3.40GB/s ± 2%  +11.13%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  2.89GB/s ± 3%        3.24GB/s ± 2%  +12.20%  (p=0.000 n=29+26)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]  1.93GB/s ± 4%        2.09GB/s ±11%   +8.16%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   746MB/s ± 2%         762MB/s ± 2%   +2.11%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.59GB/s ± 2%        1.62GB/s ± 2%   +1.72%  (p=0.000 n=25+27)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.49GB/s ± 2%        1.53GB/s ± 2%   +2.62%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]   852MB/s ±10%         909MB/s ± 6%   +6.71%  (p=0.000 n=30+29)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   283MB/s ± 3%         283MB/s ± 2%     ~     (p=0.617 n=30+27)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   891MB/s ± 2%        1083MB/s ± 2%  +21.64%  (p=0.000 n=27+24)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   855MB/s ± 2%        1045MB/s ± 1%  +22.31%  (p=0.000 n=25+23)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   568MB/s ± 7%         659MB/s ± 8%  +16.04%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   207MB/s ± 2%         212MB/s ± 2%   +2.31%  (p=0.000 n=30+27)
```

 - arcadia-rome

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      2.16GB/s ± 2%        2.27GB/s ± 2%   +5.13%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_Fleet_L2                                      2.15GB/s ± 2%        2.25GB/s ± 2%   +4.64%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_Fleet_LLC                                     1.73GB/s ± 3%        1.81GB/s ± 3%   +4.66%  (p=0.000 n=25+28)
BM_LIBC_Bcmp_Fleet_Cold                                     494MB/s ± 1%         496MB/s ± 2%   +0.45%  (p=0.023 n=22+24)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  3.30GB/s ± 1%        3.24GB/s ± 2%   -1.70%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  3.23GB/s ± 2%        3.19GB/s ± 2%   -1.28%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]  2.59GB/s ± 3%        2.58GB/s ± 2%   -0.65%  (p=0.010 n=26+26)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   720MB/s ± 1%         707MB/s ± 3%   -1.75%  (p=0.000 n=22+25)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.37GB/s ± 1%        3.36GB/s ± 2%     ~     (p=0.102 n=28+29)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  3.32GB/s ± 2%        3.30GB/s ± 2%   -0.51%  (p=0.038 n=28+29)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]  2.67GB/s ± 4%        2.70GB/s ± 4%   +0.96%  (p=0.009 n=28+27)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   755MB/s ± 1%         751MB/s ± 2%   -0.57%  (p=0.000 n=22+25)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.79GB/s ± 1%        1.86GB/s ± 2%   +3.92%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.77GB/s ± 2%        1.82GB/s ± 2%   +2.99%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]  1.41GB/s ± 4%        1.47GB/s ± 3%   +3.97%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   386MB/s ± 1%         389MB/s ± 1%   +0.60%  (p=0.000 n=21+23)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  2.07GB/s ± 2%        2.17GB/s ± 2%   +4.87%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  2.07GB/s ± 2%        2.13GB/s ± 2%   +3.02%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]  1.66GB/s ± 2%        1.73GB/s ± 2%   +4.08%  (p=0.000 n=29+26)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   466MB/s ± 2%         469MB/s ± 3%   +0.66%  (p=0.001 n=22+25)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   861MB/s ± 1%         964MB/s ± 2%  +11.98%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   853MB/s ± 2%         935MB/s ± 2%   +9.54%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   707MB/s ± 3%         743MB/s ± 4%   +5.08%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   199MB/s ± 3%         199MB/s ± 2%     ~     (p=0.107 n=29+25)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.65GB/s ± 1%        1.75GB/s ± 2%   +6.15%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.64GB/s ± 3%        1.73GB/s ± 2%   +5.37%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]  1.32GB/s ± 2%        1.40GB/s ± 2%   +6.21%  (p=0.000 n=28+27)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   370MB/s ± 3%         371MB/s ± 2%   +0.16%  (p=0.008 n=29+25)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  3.25GB/s ± 2%        3.47GB/s ± 2%   +6.74%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  3.26GB/s ± 1%        3.44GB/s ± 1%   +5.43%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]  2.66GB/s ± 2%        2.79GB/s ± 3%   +4.90%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   812MB/s ± 3%         799MB/s ± 2%   -1.57%  (p=0.000 n=29+25)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.71GB/s ± 2%        1.66GB/s ± 2%   -3.14%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.63GB/s ± 2%        1.59GB/s ± 2%   -2.50%  (p=0.000 n=29+28)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]  1.25GB/s ± 4%        1.25GB/s ± 2%     ~     (p=0.530 n=28+26)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   311MB/s ± 3%         308MB/s ± 1%     ~     (p=0.127 n=29+24)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   869MB/s ± 2%        1098MB/s ± 2%  +26.28%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   873MB/s ± 2%        1075MB/s ± 1%  +23.06%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   743MB/s ± 4%         859MB/s ± 4%  +15.58%  (p=0.000 n=27+27)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   221MB/s ± 4%         221MB/s ± 3%   +0.14%  (p=0.034 n=29+25)
```

 - ixion-haswell

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      2.27GB/s ± 5%        2.41GB/s ± 6%   +6.10%  (p=0.000 n=29+28)
BM_LIBC_Bcmp_Fleet_L2                                      2.14GB/s ± 6%        2.33GB/s ± 5%   +9.21%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_Fleet_LLC                                     1.30GB/s ± 9%        1.43GB/s ± 8%   +9.85%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_Fleet_Cold                                     475MB/s ± 6%         475MB/s ± 5%     ~     (p=0.839 n=30+29)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  3.38GB/s ± 7%        3.46GB/s ± 6%   +2.35%  (p=0.009 n=30+29)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  3.20GB/s ± 5%        3.32GB/s ± 6%   +3.52%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]  1.88GB/s ± 9%        2.00GB/s ± 6%   +6.63%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   664MB/s ± 6%         655MB/s ± 6%   -1.32%  (p=0.025 n=30+30)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.50GB/s ± 8%        3.61GB/s ±10%   +3.09%  (p=0.001 n=29+30)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  3.32GB/s ± 7%        3.48GB/s ± 8%   +4.89%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]  2.02GB/s ± 7%        2.14GB/s ± 9%   +5.82%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   716MB/s ± 6%         709MB/s ± 5%   -0.97%  (p=0.040 n=30+28)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.83GB/s ± 7%        1.97GB/s ± 8%   +7.90%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.74GB/s ± 6%        1.92GB/s ± 6%  +10.29%  (p=0.000 n=30+29)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]  1.05GB/s ± 9%        1.15GB/s ± 9%   +9.73%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   379MB/s ± 6%         372MB/s ± 6%   -1.74%  (p=0.012 n=30+30)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  2.17GB/s ± 5%        2.29GB/s ± 6%   +5.61%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  2.02GB/s ± 6%        2.20GB/s ± 6%   +8.75%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]  1.22GB/s ± 8%        1.34GB/s ± 9%   +9.19%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   447MB/s ± 3%         441MB/s ± 7%   -1.40%  (p=0.033 n=30+30)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   902MB/s ± 6%         995MB/s ±10%  +10.37%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   863MB/s ± 5%         945MB/s ±11%   +9.50%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   528MB/s ±11%         559MB/s ±12%   +5.75%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   183MB/s ± 4%         181MB/s ± 7%     ~     (p=0.088 n=28+30)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.70GB/s ± 6%        1.87GB/s ± 8%  +10.14%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.60GB/s ± 5%        1.80GB/s ± 9%  +12.61%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]   994MB/s ±13%        1094MB/s ± 8%  +10.10%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   362MB/s ± 6%         358MB/s ± 7%     ~     (p=0.123 n=30+30)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  3.31GB/s ± 5%        3.67GB/s ± 6%  +10.90%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  3.11GB/s ± 5%        3.53GB/s ± 5%  +13.59%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]  1.98GB/s ± 9%        2.18GB/s ± 8%  +10.34%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   754MB/s ± 5%         752MB/s ± 5%     ~     (p=0.592 n=30+30)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.72GB/s ± 5%        1.72GB/s ± 6%     ~     (p=0.549 n=29+29)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.61GB/s ± 7%        1.63GB/s ± 8%     ~     (p=0.191 n=30+29)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]   913MB/s ± 8%         905MB/s ± 9%     ~     (p=0.423 n=30+30)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   304MB/s ± 6%         287MB/s ± 4%   -5.57%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   961MB/s ± 5%        1124MB/s ± 6%  +16.94%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   915MB/s ± 8%        1100MB/s ± 7%  +20.16%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   593MB/s ± 8%         669MB/s ± 8%  +12.92%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   220MB/s ± 4%         220MB/s ± 6%     ~     (p=0.572 n=30+30)
```
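
For readers unfamiliar with the technique named in the title, here is a minimal sketch of a branchless head-tail comparison for sizes between 8 and 16 bytes. It is a generic illustration under that size assumption, with hypothetical names `load_u64` and `head_tail_differs`, and is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Load 8 bytes from a possibly unaligned address.
inline std::uint64_t load_u64(const char *p) {
  std::uint64_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}

// Hypothetical sketch for 8 <= count <= 16: compare the first 8 and the last
// 8 bytes (the two windows may overlap) and merge the results with OR, so no
// branch is taken between the two comparisons.
inline bool head_tail_differs(const char *a, const char *b, std::size_t count) {
  const std::uint64_t head = load_u64(a) ^ load_u64(b);
  const std::uint64_t tail = load_u64(a + count - 8) ^ load_u64(b + count - 8);
  return (head | tail) != 0;
}

// Small self-check: a difference in the tail window is detected.
inline void check_head_tail_differs() {
  const char a[] = "0123456789ab";
  const char b[] = "012345678Xab";
  assert(head_tail_differs(a, b, 12));
  assert(!head_tail_differs(a, a, 12));
}
```

Because `bcmp` only reports whether the buffers differ, the two XOR results can be combined without deciding which window holds the first mismatch.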

Co-authored-by: goldvitaly@google.com <%username%@google.com>
2024-09-06 11:19:01 +02:00
Joseph Huber
5c019bdb7a
[libc] Add support for 'string.h' locale variants (#105719)
Summary:
This adds the locale variants of the string functions. As previously,
these do not use the locale information at all and simply copy the
non-locale version which expects the "C" locale.
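
A locale variant that ignores its locale argument might look like the sketch below. `LocaleSketch` and `strcoll_l_sketch` are hypothetical names used only for illustration; the real entry points take a `locale_t`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical placeholder for a locale handle; the sketch never inspects it.
struct LocaleSketch {};

// Locale variant that behaves as if the "C" locale were always selected, where
// collation is a plain byte-wise comparison.
inline int strcoll_l_sketch(const char *left, const char *right,
                            const LocaleSketch &) {
  return std::strcmp(left, right);
}

// Small self-check: the result matches plain strcmp ordering.
inline void check_strcoll_l_sketch() {
  LocaleSketch locale;
  assert(strcoll_l_sketch("abc", "abd", locale) < 0);
  assert(strcoll_l_sketch("abc", "abc", locale) == 0);
}
```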
2024-08-29 14:20:15 -05:00
Guillaume Chatelet
73ef397fcb
[libc][x86] Use prefetch for write for memcpy (#90450)
Currently when `LIBC_COPT_MEMCPY_X86_USE_SOFTWARE_PREFETCHING` is set we
prefetch memory for read on the source buffer. This patch adds prefetch
for write on the destination buffer.
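
The difference between prefetching for read and for write can be sketched with the GCC/Clang `__builtin_prefetch` builtin, whose second argument is 0 for a read and 1 for a write. The block size, prefetch distance, and function name below are assumptions chosen for illustration, not the values used by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kBlock = 64;             // assumed copy granularity
constexpr std::size_t kPrefetchDistance = 256; // assumed look-ahead in bytes

// Copy in 64-byte blocks, prefetching the source for read and the destination
// for write a fixed distance ahead while that address is still in bounds.
inline void copy_with_prefetch(char *dst, const char *src, std::size_t count) {
  std::size_t offset = 0;
  for (; offset + kBlock <= count; offset += kBlock) {
#if defined(__GNUC__)
    if (offset + kPrefetchDistance < count) {
      __builtin_prefetch(src + offset + kPrefetchDistance, 0, 3); // for read
      __builtin_prefetch(dst + offset + kPrefetchDistance, 1, 3); // for write
    }
#endif
    std::memcpy(dst + offset, src + offset, kBlock);
  }
  std::memcpy(dst + offset, src + offset, count - offset); // remaining tail
}

// Small self-check that the copy is complete, including the tail.
inline void check_copy_with_prefetch() {
  char src[300];
  char dst[300];
  for (std::size_t i = 0; i < sizeof(src); ++i)
    src[i] = static_cast<char>(i % 127);
  std::memset(dst, 0, sizeof(dst));
  copy_with_prefetch(dst, src, sizeof(src));
  assert(std::memcmp(dst, src, sizeof(src)) == 0);
}
```

Prefetching is only a hint, so the copy result is the same with or without it.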
2024-08-29 14:17:23 +02:00
Petr Hosek
5ff3ff33ff
[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98597)
This is a part of #97655.
2024-07-12 09:28:41 -07:00
Mehdi Amini
ce9035f5bd
Revert "[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration" (#98593)
Reverts llvm/llvm-project#98075

bots are broken
2024-07-12 09:12:13 +02:00
Petr Hosek
3f30effe1b
[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98075)
This is a part of #97655.
2024-07-11 12:35:22 -07:00
Guillaume Chatelet
48ba7da9c8
[libc][NFC] Allow compilation of memcpy with -m32 (#93790)
Needed to support i386 (#93709).
2024-05-31 10:48:38 +02:00
Guillaume Chatelet
292b300c51
[libc][bug] Fix out of bound write in memcpy w/ software prefetching (#90591)
This patch adds tests for `memcpy` and `memset` making sure that we
don't access buffers out of bounds. It relies on POSIX `mmap` /
`mprotect` and works only when FULL_BUILD_MODE is disabled.
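
One way to build such a bounds test is to place the buffer so that it ends right before an inaccessible page, which turns any access past the end into a fault. The sketch below assumes a POSIX system with `MAP_ANONYMOUS`. `GuardedBuffer` is a hypothetical helper written for this note and is not the test fixture from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: two mapped pages where the second one is made
// inaccessible, so any access past the first page faults immediately.
struct GuardedBuffer {
  char *mapping = nullptr;
  std::size_t page_size = 0;

  GuardedBuffer() {
    page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void *p = mmap(nullptr, 2 * page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    mapping = static_cast<char *>(p);
    const int rc = mprotect(mapping + page_size, page_size, PROT_NONE);
    assert(rc == 0);
    (void)rc;
  }
  ~GuardedBuffer() { munmap(mapping, 2 * page_size); }
  GuardedBuffer(const GuardedBuffer &) = delete;
  GuardedBuffer &operator=(const GuardedBuffer &) = delete;

  // Pointer to a `size`-byte region that ends exactly at the guard page.
  // `size` must not exceed the page size.
  char *end_aligned(std::size_t size) const {
    return mapping + page_size - size;
  }
};
```

A test would then run the function under test on `end_aligned(size)` pointers for a range of sizes; a stray write past the end crashes the test instead of silently corrupting memory.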

The bug showed up while enabling software prefetching.
`loop_and_tail_offset` always runs at least one iteration, but in
some configurations the loop-unrolled prefetching needed only
the tail operation and no loop iterations at all.
2024-05-14 13:55:24 +02:00
Marc Auberer
77118536b5
[libc] Remove obsolete LIBC_HAS_BUILTIN macro (#86554)
Fixes #86546 and removes the macro `LIBC_HAS_BUILTIN`. The macro was
needed to support older compilers that did not provide
`__has_builtin`. All of the compilers we support already have this
builtin.
See: https://libc.llvm.org/compiler_support.html
All uses now use `__has_builtin` directly
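
Using `__has_builtin` directly, without a wrapper macro, looks roughly like the sketch below. `likely_sketch` is a hypothetical helper for illustration, and the code assumes a compiler that provides `__has_builtin`.

```cpp
#include <cassert>

// Assumes the compiler provides __has_builtin, so no wrapper macro is needed.
inline bool likely_sketch(bool condition) {
#if __has_builtin(__builtin_expect)
  // Hint to the optimizer that the condition is usually true.
  return __builtin_expect(condition, true);
#else
  return condition;
#endif
}

// Small self-check: the hint never changes the value.
inline void check_likely_sketch() {
  assert(likely_sketch(true));
  assert(!likely_sketch(false));
}
```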

cc @nickdesaulniers
2024-03-27 17:22:41 +01:00
Guillaume Chatelet
a84e66a92d
[libc] Provide LIBC_TYPES_HAS_INT64 (#83441)
Umbrella bug #83182
2024-03-09 09:43:07 +01:00
Schrodinger ZHU Yifan
57a337378f
[libc][c23] add memset_explicit (#83577) 2024-03-07 14:57:35 -05:00
Nick Desaulniers
2aa22ca2ca
[libc] suppress readability-identifier-naming for std::numeric_limits interfaces (#83921)
These templates are made to match the ergonomics of std::numeric_limits.
Because our style for constexpr variables is ALL_CAPS, we must silence the
linter for these manually.

Link:
https://clang.llvm.org/extra/clang-tidy/#suppressing-undesired-diagnostics
2024-03-05 08:17:27 -08:00
Nick Desaulniers
640c85748e
[libc] fix readability-identifier-naming in memory_utils/utils.h (#83919)
Fixes:

    libc/src/string/memory_utils/utils.h:345:13: warning: invalid case style
    for member 'offset_' [readability-identifier-naming]

Having a trailing underscore for members is a google3 style, not LLVM style.
Removing the underscore is insufficient, as we would then have 2 members with
the same identifier which is not allowed (it is a compile time error). Remove
the getter, and just access the renamed member that's now made public.
2024-03-05 08:16:50 -08:00
Nick Desaulniers
88d82b747c
[libc] fix more readability-identifier-naming lints (#83914)
Found via:

    $ ninja -k2000 libc-lint 2>&1 | grep readability-identifier-naming

Auto fixed via:

    $ clang-tidy -p build/compile_commands.json \
      -checks="-*,readability-identifier-naming" \
      <filename> --fix

This doesn't fix all instances, just the obvious simple cases where it makes
sense to change the identifier names.  Subsequent PRs will fix up the
stragglers.
2024-03-05 08:15:56 -08:00
Nick Desaulniers
27352e600a
[libc] fix typo introduced in inline_bcmp_byte_per_byte (#83356)
My global find+replace was overzealous and broke post submit unit tests.

Link: #83345
2024-02-28 19:13:05 -06:00
Nick Desaulniers
6f8d826b74
[libc] fix readability-identifier-naming.ConstexprFunctionCase (#83345)
Codify that we use lower_case for
readability-identifier-naming.ConstexprFunctionCase and then fix the 11
violations (rather than codify UPPER_CASE and have to fix the 170 violations).
2024-02-28 14:52:02 -08:00
Nick Desaulniers
330793c91d
[libc] fix clang-tidy llvm-header-guard warnings (#82679)
Towards the goal of getting `ninja libc-lint` back to green, fix the numerous
instances of:

    warning: header guard does not follow preferred style [llvm-header-guard]

This is because many of our header guards start with `__LLVM` rather than
`LLVM`.

To filter just these warnings:

    $ ninja -k2000 libc-lint 2>&1 | grep llvm-header-guard

To automatically apply fixits:

    $ find libc/src libc/include libc/test -name \*.h | \
        xargs -n1 -I {} clang-tidy {} -p build/compile_commands.json \
        -checks='-*,llvm-header-guard' --fix --quiet

Some manual cleanup is still necessary as headers that were missing header
guards outright will have them inserted before the license block (we prefer
them after).
2024-02-28 12:53:56 -08:00
Joseph Huber
47b7c91abe
[libc] Rework the GPU build to be a regular target (#81921)
Summary:
This is a massive patch because it reworks the entire build and
everything that depends on it. This is not split up because various bots
would fail otherwise. I will attempt to describe the necessary changes
here.

This patch completely reworks how the GPU build is built and targeted.
Previously, we used a standard runtimes build and handled both NVPTX and
AMDGPU in a single build via multi-targeting. This added a lot of
divergence in the build system and prevented us from doing various
things like building for the CPU / GPU at the same time, or exporting
the startup libraries or running tests without a full rebuild.

The new approach is to handle the GPU builds as strict cross-compiling
runtimes. The first step required
https://github.com/llvm/llvm-project/pull/81557 to allow the `LIBC`
target to build for the GPU without touching the other targets. This
means that the GPU uses all the same handling as the other builds in
`libc`.

The new expected way to build the GPU libc is with
`LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`.

The second step was reworking how we generated the embedded GPU library
by moving it into the library install step. Where we previously had one
`libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This
patch includes the necessary clang / OpenMP changes to make that not
break the bots when this lands.

We unfortunately still require that the NVPTX target has an `internal`
target for tests. This is because the NVPTX target needs to do LTO for
the provided version (The offloading toolchain can handle it) but cannot
use it for the native toolchain which is used for making tests.

This approach is vastly superior in every way, allowing us to treat the
GPU as a standard cross-compiling target. We can now install the GPU
utilities to do things like use the offload tests and other fun things.

Certain utilities need to be built with `--target=${LLVM_HOST_TRIPLE}` as
well. I think this is a fine workaround, as we will always assume that the
GPU `libc` is a cross-build with a functioning host.

Depends on https://github.com/llvm/llvm-project/pull/81557
2024-02-22 15:29:29 -06:00
Guillaume Chatelet
bc4f3e31a9
[libc][NFC] Selectively disable GCC warnings (#78462)
2024-01-18 10:36:21 +01:00
AtariDreams
e06b5a2435
[libc] Give more functions restrict qualifiers (NFC) (#78061)
strsep, strtok_r, strlcpy, and strlcat take restrict-qualified pointers as
parameters. Add the restrict qualifiers to them.
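
As an illustration only, a minimal sketch of a strlcpy-like function with
restrict-qualified parameters. `copy_bounded` is a hypothetical name, and
`__restrict` is a compiler extension (GCC, Clang, MSVC) used here because
standard C++ has no `restrict` keyword; this is not the libc implementation.

```cpp
#include <cstddef>

// Hypothetical strlcpy-like helper. The __restrict qualifiers promise the
// compiler that dst and src do not overlap.
inline std::size_t copy_bounded(char *__restrict dst,
                                const char *__restrict src,
                                std::size_t size) {
  // Length of the source string, which is also the return value.
  std::size_t len = 0;
  while (src[len] != '\0')
    ++len;
  if (size != 0) {
    // Copy at most size - 1 characters and always NUL-terminate.
    const std::size_t n = len < size - 1 ? len : size - 1;
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = src[i];
    dst[n] = '\0';
  }
  return len;
}
```

A return value greater than or equal to `size` tells the caller that the
copy was truncated.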

Sources:
https://man7.org/linux/man-pages/man3/strsep.3.html
https://man7.org/linux/man-pages/man3/strtok_r.3.html
https://man.freebsd.org/cgi/man.cgi?strlcpy
2024-01-15 12:12:09 -06:00
Guillaume Chatelet
5794854213
[libc][NFC] Use 16-byte indices for _mmXXX_shuffle_epi8 (#77781)
This is less confusing since the implementation only cares about the 4
lower bits.
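
To make the "4 lower bits" point concrete, here is a scalar model of the
128-bit byte shuffle, written as a sketch rather than with the real
intrinsics. `shuffle_bytes_model` and `Bytes16` are hypothetical names; the
modeled behavior (bit 7 of the index zeroes the output byte, otherwise only
the low 4 bits select the source byte) is that of the SSSE3 `pshufb`
instruction within one 16-byte lane.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar model of a 16-byte shuffle: an output byte is zero when bit 7 of
// its index is set; otherwise it is src[index & 0x0F]. Bits 4..6 of the
// index are ignored.
inline Bytes16 shuffle_bytes_model(const Bytes16 &src, const Bytes16 &idx) {
  Bytes16 out{};
  for (std::size_t i = 0; i < 16; ++i) {
    const bool zero = (idx[i] & 0x80) != 0;
    out[i] = zero ? std::uint8_t{0} : src[idx[i] & 0x0F];
  }
  return out;
}
```

Under this model an index of 0x13 selects the same byte as 0x03, which is
why indices in the range 0..15 are the least surprising way to write them.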
2024-01-11 16:25:55 +01:00
Guillaume Chatelet
9ca6e5bb86
[libc] Fix buggy AVX2 / AVX512 memcmp (#77081)
Fixes #77080.
2024-01-11 11:45:37 +01:00
Nick Desaulniers
1689bbea17
[libc] fix up #77384
2024-01-08 16:18:31 -08:00
Nick Desaulniers
6958986f77
[libc] fix -Wconversion (#77384)
Fixes the following from GCC:

    llvm-project/libc/src/string/memory_utils/op_x86.h:236:24: error:
    conversion from ‘long unsigned int’ to ‘uint32_t’ {aka ‘unsigned int’}
    may change value [-Werror=conversion]
      236 |   return (xored >> 32) | (xored & 0xFFFFFFFF);
          |          ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
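
A minimal sketch of the kind of change that silences this diagnostic,
assuming the intent is to fold a 64-bit value into 32 bits.
`fold_u64_to_u32` is a hypothetical name and this is not necessarily the
exact code that landed in op_x86.h.

```cpp
#include <cstdint>

// Hypothetical helper: ORs the high and low halves of a 64-bit value.
// The explicit casts make each narrowing step intentional, so the
// implicit 64-bit to 32-bit conversion that -Wconversion flags is gone.
constexpr std::uint32_t fold_u64_to_u32(std::uint64_t xored) {
  const std::uint32_t hi = static_cast<std::uint32_t>(xored >> 32);
  const std::uint32_t lo = static_cast<std::uint32_t>(xored & 0xFFFFFFFF);
  return hi | lo;
}
```

The result is nonzero exactly when the 64-bit input is nonzero, which is
the property a comparison routine would rely on.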

Link:
https://lab.llvm.org/buildbot/#/builders/250/builds/16236/steps/8/logs/stdio
Link: https://github.com/llvm/llvm-project/pull/74506
2024-01-08 16:08:22 -08:00
Nick Desaulniers
5352ce32fc
[libc] fix -Warray-bounds in block_offset (#77001)
GCC reports an instance of -Warray-bounds in block_offset.  Reimplement
block_offset in terms of memcpy_inline which was created to avoid this
diagnostic. See the linked issue for the full trace of diagnostic.

Fixes: https://github.com/llvm/llvm-project/issues/76877
2024-01-05 08:19:04 -08:00
Guillaume Chatelet
64671dbebc
[libc] Remove unnecessary call in memfunction dispatchers (#75800)
Before this patch, the compiler could generate unnecessary calls to the
selected implementation. See the `flatten` attribute documentation:
https://clang.llvm.org/docs/AttributeReference.html#flatten
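
As a rough sketch of how the linked attribute is used, assuming a GCC or
Clang toolchain: `set_small`, `set_large`, and `set_bytes` are hypothetical
stand-ins, not the libc dispatchers.

```cpp
#include <cstddef>

// Hypothetical implementation for small sizes: plain byte loop.
static inline void set_small(unsigned char *dst, unsigned char value,
                             std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = value;
}

// Hypothetical implementation for larger sizes: four bytes per iteration,
// then a byte loop for the tail.
static inline void set_large(unsigned char *dst, unsigned char value,
                             std::size_t n) {
  std::size_t i = 0;
  for (; n - i >= 4; i += 4) {
    dst[i] = value;
    dst[i + 1] = value;
    dst[i + 2] = value;
    dst[i + 3] = value;
  }
  for (; i < n; ++i)
    dst[i] = value;
}

// [[gnu::flatten]] asks the compiler to inline, where possible, every call
// made in this function's body, so the dispatcher does not end in a call
// to the implementation it selected.
[[gnu::flatten]] void set_bytes(unsigned char *dst, unsigned char value,
                                std::size_t n) {
  if (n <= 16)
    return set_small(dst, value, n);
  return set_large(dst, value, n);
}
```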
2023-12-19 13:57:44 +01:00
Guillaume Chatelet
1d89478830
[reland][libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h instead (#73939) (#74446)
Same as #73939 but also fix `libc/src/string/memory_utils/op_aarch64.h`
that was still using `deferred_static_assert`.
2023-12-05 11:35:13 +01:00
Guillaume Chatelet
de7fdc5b54
Revert "[libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h instead" (#74444)
Reverts llvm/llvm-project#73939

This broke libc-aarch64-ubuntu build bot 
https://lab.llvm.org/buildbot/#/builders/138/builds/56186
2023-12-05 11:25:39 +01:00
Guillaume Chatelet
b140948850
[libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h instead (#73939)
2023-12-05 11:21:07 +01:00
Guillaume Chatelet
8628ca29aa
[libc] Fix UB in memory utils (#74295)
The [standard](https://eel.is/c++draft/expr.add#4.3) forbids forming
pointers to invalid objects even if the pointer is never read from or
written to. This patch makes sure that we don't do pointer arithmetic on
invalid pointers.
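
A minimal sketch of the general technique, not the code from this patch:
loop bounds are checked on integer offsets, so a pointer beyond
one-past-the-end of the buffer is never formed. `count_block_heads` and
`kBlock` are hypothetical names.

```cpp
#include <cstddef>

// Counts the complete kBlock-byte chunks of [data, data + size) whose first
// byte equals `value`.
inline std::size_t count_block_heads(const unsigned char *data,
                                     std::size_t size, unsigned char value) {
  constexpr std::size_t kBlock = 4;
  std::size_t count = 0;
  // The condition compares sizes, not pointers. A loop written as
  // `p + kBlock <= end` would compute `p + kBlock` even when fewer than
  // kBlock bytes remain, which can point past the end of the buffer.
  // Here `offset` never exceeds `size`, so `size - offset` cannot wrap.
  for (std::size_t offset = 0; size - offset >= kBlock; offset += kBlock) {
    if (data[offset] == value)
      ++count;
  }
  return count;
}
```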


Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2023-12-04 10:57:35 +01:00
Guillaume Chatelet
e2a37e5130
[libc][NFC] Fix missing LIBC_INLINE + style (#73659)
2023-11-29 10:37:54 +01:00
doshimili
3153aa4c95
[libc] Adding a version of memset with software prefetching (#70857)
Software prefetching helps recover performance when hardware prefetching
is disabled. The 'LIBC_COPT_MEMSET_X86_USE_SOFTWARE_PREFETCHING'
compile-time option enables this version.
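
For readers unfamiliar with the idea, a rough sketch of software
prefetching in a fill loop, assuming a GCC or Clang toolchain that provides
`__builtin_prefetch`. The function name, the 64-byte line size, and the
prefetch distance are illustrative assumptions, not the values or structure
used in this patch.

```cpp
#include <cstddef>

// Hypothetical fill routine that prefetches ahead of the stores.
inline void set_with_prefetch(unsigned char *dst, unsigned char value,
                              std::size_t size) {
  constexpr std::size_t kLine = 64;                    // assumed cache line
  constexpr std::size_t kPrefetchDistance = 4 * kLine; // assumed distance
  for (std::size_t offset = 0; offset < size; offset += kLine) {
    // Only prefetch addresses that lie inside the buffer, so no
    // out-of-bounds pointer is formed.
    if (size - offset > kPrefetchDistance)
      __builtin_prefetch(dst + offset + kPrefetchDistance, /*rw=*/1,
                         /*locality=*/3);
    // Store up to one line's worth of bytes.
    const std::size_t remaining = size - offset;
    const std::size_t n = remaining < kLine ? remaining : kLine;
    for (std::size_t i = 0; i < n; ++i)
      dst[offset + i] = value;
  }
}
```

Whether this helps depends on the hardware; with hardware prefetching
enabled, the extra instructions may simply be overhead.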
2023-11-10 10:56:16 +01:00
Dmitry Vyukov
d275277544
[libc] Optimize memcpy size thresholds (#70049)
Adjust boundary conditions for sizes = 16/32/64.
See the added comment for explanations.

Results on a machine with AVX2, so sizes 64/128 affected:
```
                │   baseline   │               adjusted               │
                │    sec/op    │   sec/op     vs base                 │
memcpy/Google_A   5.701n ±  0%   5.551n ± 1%   -2.63% (n=100)
memcpy/Google_B   3.817n ±  0%   3.776n ± 0%   -1.07% (p=0.000 n=100)
memcpy/Google_D   11.35n ±  1%   11.32n ± 0%        ~ (p=0.066 n=100)
memcpy/Google_U   3.874n ±  1%   3.821n ± 1%   -1.37% (p=0.001 n=100)
memcpy/64         3.843n ±  0%   3.105n ± 3%  -19.22% (n=50)
memcpy/128        4.842n ±  0%   3.818n ± 0%  -21.15% (p=0.000 n=50)
```
2023-11-07 08:37:19 +01:00