3550 Commits

Author SHA1 Message Date
Michael Jones
aeb18ebbe0
[libc] Add MSAN unpoison annotations to recv funcs (#109844)
Anywhere a struct is returned from the kernel, we need to explicitly
unpoison it for MSAN. This patch does that for the recv, recvfrom,
recvmsg, and socketpair functions.
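
For illustration only, a minimal C sketch of the idea, not the patch's code: after a call whose output buffer is filled by the kernel, the buffer is explicitly marked as initialized. `make_socket_pair` and the `UNPOISON` macro are hypothetical names; `__msan_unpoison` is the real MSAN interface and is only used when building with `-fsanitize=memory`. When going through the host libc as this sketch does, MSAN's own interceptors would normally cover this already; the annotation matters when the libc itself issues the raw syscall.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Use the real MSAN interface only when building with -fsanitize=memory. */
#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
#include <sanitizer/msan_interface.h>
#define UNPOISON(ptr, size) __msan_unpoison((ptr), (size))
#endif
#endif
#ifndef UNPOISON
/* Without MSAN the annotation compiles away. */
#define UNPOISON(ptr, size) ((void)(ptr), (void)(size))
#endif

/* Hypothetical wrapper: create a connected socket pair, then mark the two
   descriptors written by the kernel as initialized. */
int make_socket_pair(int sv[2]) {
  assert(sv != NULL);
  int ret = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
  if (ret == 0)
    UNPOISON(sv, 2 * sizeof(int));
  return ret;
}
```

The same pattern would apply to the buffers and address structs filled in by recv, recvfrom, and recvmsg.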
2024-09-24 14:54:02 -07:00
Joseph Huber
fe6a3d46aa
[libc] Implement the 'rename' function on the GPU (#109814)
Summary:
Straightforward implementation like the other `stdio.h` functions.
2024-09-24 09:32:42 -07:00
Joseph Huber
3bbe0f90f3
[libc] Add 'strings.h' header on the GPU (#109661)
Summary:
These are GNU extensions but still show up in practice. The entrypoints were
enabled, but we weren't emitting the header, so they couldn't be used.
2024-09-23 14:19:33 -07:00
Joseph Huber
16d11e26f3
[libc] Add GPU support for the 'system' function (#109687)
Summary:
This function can easily be implemented by forwarding it to the host
process. It shows up in a few places where we might want to test the
GPU, so it should be provided. Also, I find the idea of the GPU
offloading work to the CPU via `system` very funny.
2024-09-23 14:04:28 -07:00
OverMighty
6267f121f5
[libc] Fix missing LIBC_TYPES_HAS_FLOAT16 guard around DyadicFloat::generic_as() (#109697)
See Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/93/builds/6872.
2024-09-23 20:01:53 +02:00
OverMighty
127349fcba
[libc][math] Add floating-point cast independent of compiler runtime (#105152)
Fixes build and tests with compiler-rt on x86.
2024-09-23 19:35:39 +02:00
Shourya Goel
ba5e195809
[libc][math] Implement issubnormal macro. (#109572)
#109201
2024-09-23 00:50:46 -04:00
Petr Hosek
eaedbbc30d
[libc] Use yaml.safe_load rather than yaml.load (#109557)
`yaml.load` is considered unsafe; use `yaml.safe_load` instead.
2024-09-21 20:08:13 -07:00
Shourya Goel
aaa637d8d0
[libc][math] Implement isnormal macro. (#109547)
#109201
2024-09-21 22:26:56 -04:00
Shourya Goel
56124feeb8
[libc][math] Implement fpclassify macro. (#109519)
#109201
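
As a rough illustration of how the classification macros in this series relate (`fpclassify`, `isnormal`, and the C23 `iszero` and `issubnormal`), the sketch below expresses the three predicates through the standard `fpclassify` from `<math.h>`. The `*_sketch` function names are hypothetical, and the libc macros themselves may be built on compiler builtins instead.

```c
#include <float.h>
#include <math.h>

/* Nonzero if x is +0.0 or -0.0. */
int is_zero_sketch(double x) { return fpclassify(x) == FP_ZERO; }

/* Nonzero if x is nonzero but smaller in magnitude than DBL_MIN. */
int is_subnormal_sketch(double x) { return fpclassify(x) == FP_SUBNORMAL; }

/* Nonzero if x is neither zero, subnormal, infinite, nor NaN. */
int is_normal_sketch(double x) { return fpclassify(x) == FP_NORMAL; }
```

For example, `DBL_MIN / 2` is classified as subnormal, while `DBL_MIN` itself is the smallest positive normal value.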
2024-09-21 11:49:23 -04:00
Shourya Goel
739ede400b
[libc][math] Implement IsZero Macro (#109336)
#109201
2024-09-20 13:00:01 -04:00
lntue
95d4c97a20
[libc][wchar] Move wchar's types to proxy headers. (#109334)
Also protect against extern inline function definitions added when
building with gcc: https://github.com/llvm/llvm-project/issues/60481.
2024-09-19 22:23:51 -04:00
Michael Jones
13dd2fd1e0
[libc] Put bind back, fix gcc build (#109341)
Fixes #106467.
Bind was accidentally removed while trying to clean up functions that
didn't end up being needed. The GCC issue was just a warning treated as
an error.
2024-09-19 15:10:56 -07:00
Michael Jones
f6b4c34d4f
[libc] Add functions to send/recv messages (#106467)
This patch adds the necessary functions to send and receive messages
over a socket. Those functions are: recv, recvfrom, recvmsg, send,
sendto, sendmsg, and socketpair for testing.
2024-09-19 14:43:00 -07:00
Michael Jones
010c0d36e1
[libc][AMDGPU] Disable %m in RPC server (#109317)
The RPC server directly includes the printf code, but doesn't support
errno, so the %m conversion needs to be disabled there as well. This
patch does that.
2024-09-19 13:33:23 -05:00
Michael Jones
f009f72df5
[libc] Add printf strerror conversion (%m) (#105891)
This patch adds the %m conversion to printf, which prints
strerror(errno). The rationale is explained below. This patch also updates
the docs, tests, and build system to accommodate this.

The POSIX standard for syslog specifies that it uses the same format as
printf, but adds %m, which prints the error message string for the
current value of errno. For ease of implementation, it's standard
practice for libc implementers to just add %m to printf instead of
creating a separate parser for syslog.
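
A small usage sketch, assuming the C library's printf family supports the %m extension (glibc does; it is not part of ISO C). `format_error` is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write "<prefix>: <strerror(err)>" into buf using the %m conversion.
   %m consumes no argument; it reads errno at the time of the call,
   so errno is set just before formatting. */
int format_error(char *buf, size_t size, const char *prefix, int err) {
  assert(buf != NULL && prefix != NULL);
  errno = err;
  return snprintf(buf, size, "%s: %m", prefix);
}
```

Calling `format_error(buf, sizeof buf, "open", ENOENT)` should produce the same text as formatting `strerror(ENOENT)` with `%s`.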
2024-09-19 10:48:08 -07:00
Joseph Huber
ba8c96593c
[Clang] Do not implicitly link C libraries for the GPU targets (#109052)
Summary:
I initially thought that it would be convenient to automatically link
these libraries like they are for standard C/C++ targets. However, this
created issues when trying to use C++ as a GPU target. This patch moves
the logic so that the library is passed implicitly as part of the offloading
toolchain instead, if found. As a result, the user needs to set the target
toolchain for the link job for automatic detection to work, but linking can
still be done manually via `-Xoffload-linker -lc`.
2024-09-18 06:44:07 -07:00
Зишан Мирза
b9e13045ab
[libc] add ctime and ctime_r to date_and_time documentation (#108665)
closes #108664
2024-09-17 09:50:07 -07:00
Youngsuk Kim
c3d78a7af8
[libc][benchmarks] Tidy uses of raw_string_ostream (NFC)
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly

(See 65b13610a5226b84889b923bae884ba395ad084d for further reference.)

Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
2024-09-17 10:25:18 -05:00
Зишан Мирза
000a3f0a54
[libc][c11] implement ctime (#107285)
This is an implementation of `ctime` and includes `ctime_r`.

According to the documentation, `ctime` and `ctime_r` are declared as
follows:

```c
char *ctime(const time_t *timep);
char *ctime_r(const time_t *restrict timep, char buf[restrict 26]);
```
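
A usage sketch for the reentrant variant, assuming a POSIX environment where `ctime_r` is declared in `<time.h>`. `format_time` and `CTIME_BUF_SIZE` are hypothetical names; the 26-byte size comes from the declaration above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

enum { CTIME_BUF_SIZE = 26 };

/* Format t as local time into the caller's buffer, e.g.
   "Thu Jan  1 00:00:00 1970\n". Returns the string length,
   or 0 if ctime_r fails. */
size_t format_time(time_t t, char buf[CTIME_BUF_SIZE]) {
  assert(buf != NULL);
  if (ctime_r(&t, buf) == NULL)
    return 0;
  return strlen(buf);
}
```

Unlike `ctime`, which returns a pointer to static storage, `ctime_r` writes into the buffer supplied by the caller.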

closes #86567
2024-09-16 11:27:11 -07:00
Jeff Bailey
50985d23e5
[libc][nfc] Fix typo in header generation message. (#108813)
Fix a typo in the header generation message.

Before:
Generating header from
/home/vscode/llvm-project/llvm/../libc/newhdrgen/yaml/ctype.yaml and
/home/vscode/llvm-project/libc/include/ctype.h.def

After:
Generating header ctype.h from
/home/vscode/llvm-project/llvm/../libc/newhdrgen/yaml/ctype.yaml and
/home/vscode/llvm-project/libc/include/ctype.h.def
2024-09-16 16:53:43 +01:00
Job Henandez Lara
a205a854e0
[libc][math] Improve fmul performance by using double-double arithmetic. (#107517)
```
 Performance tests with inputs in denormal range:
-- My function --
     Total time      : 2731072304 ns 
     Average runtime : 68.2767 ns/op 
     Ops per second  : 14646276 op/s 
-- Other function --
     Total time      : 3259744268 ns 
     Average runtime : 81.4935 ns/op 
     Ops per second  : 12270913 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.837818 

 Performance tests with inputs in normal range:
-- My function --
     Total time      : 93467258 ns 
     Average runtime : 2.33668 ns/op 
     Ops per second  : 427957777 op/s 
-- Other function --
     Total time      : 637295452 ns 
     Average runtime : 15.9324 ns/op 
     Ops per second  : 62765299 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.146662 

 Performance tests with inputs in normal range with exponents close to each other:
-- My function --
     Total time      : 95764894 ns 
     Average runtime : 2.39412 ns/op 
     Ops per second  : 417690014 op/s 
-- Other function --
     Total time      : 639866770 ns 
     Average runtime : 15.9967 ns/op 
     Ops per second  : 62513075 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.149664 
```
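
For readers unfamiliar with double-double arithmetic, the sketch below shows its basic building block: an error-free product that returns a rounded result `hi` plus the exact rounding error `lo`, so that `hi + lo` equals `a * b` exactly (barring overflow and underflow). This is the classic Dekker/Veltkamp formulation written for illustration; it is not the code from this patch, which may well use FMA-based helpers instead. The `double_double` type and function names are hypothetical, and the code assumes IEEE 754 doubles evaluated without extra precision.

```c
/* A value represented as an unevaluated sum hi + lo. */
typedef struct {
  double hi;
  double lo;
} double_double;

/* Veltkamp split: a == hi + lo, each half fitting in 26 bits of mantissa. */
static double_double split(double a) {
  const double splitter = 134217729.0; /* 2^27 + 1 */
  double c = splitter * a;
  double hi = c - (c - a);
  return (double_double){hi, a - hi};
}

/* Dekker product: hi is the rounded product, lo the exact rounding error. */
double_double two_prod(double a, double b) {
  double p = a * b;
  double_double as = split(a);
  double_double bs = split(b);
  double err = ((as.hi * bs.hi - p) + as.hi * bs.lo + as.lo * bs.hi) +
               as.lo * bs.lo;
  return (double_double){p, err};
}
```

With the exact product available as a pair, the final rounding to `float` that `fmul` requires can be decided without a second rounding error creeping in.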

---------

Co-authored-by: Tue Ly <lntue@google.com>
2024-09-14 17:32:22 -04:00
Job Henandez Lara
c0b7f1bb58
[libc][math][c23] add darwin entrypoints for fmul (#108680) 2024-09-14 00:21:32 -04:00
lntue
b659abef48
[libc] Fix vdso VER_FLG_BASE redefinition in overlay mod. (#108628) 2024-09-13 15:06:20 -04:00
Schrodinger ZHU Yifan
82987bd9da
[libc] fix dependency path for vDSO (#108591) 2024-09-13 12:13:27 -04:00
Schrodinger ZHU Yifan
a6438360d4
[libc] fix build issue in overlay mode (#108583) 2024-09-13 11:10:10 -04:00
Schrodinger ZHU Yifan
99fe5954d2
[libc] implement clock_gettime using vDSO (#108458)
supersedes https://github.com/llvm/llvm-project/pull/91805
2024-09-13 10:58:39 -04:00
Sirui Mu
ded080152a
[libc] Add osutils for Windows and make libc and its tests build on Windows target (#104676)
This PR first adds osutils for Windows, and changes some libc code to
make libc and its tests build on the Windows target. It then temporarily
disables some libc tests that are currently problematic on Windows.

Specifically, the changes besides the addition of osutils include:

- Macro `LIBC_TYPES_HAS_FLOAT16` is disabled on Windows. `clang-cl`
generates calls to functions in `compiler-rt` to handle float16
arithmetic and these functions are currently not linked in on Windows.
- Macro `LIBC_TYPES_HAS_INT128` is disabled on Windows.
- The invocation to `::aligned_malloc` is changed to an invocation to
`::_aligned_malloc`.
- The following unit tests are temporarily disabled because they
currently fail on Windows:
  - `test.src.__support.big_int_test`
  - `test.src.__support.arg_list_test`
  - `test.src.fenv.getenv_and_setenv_test`
- Tests involving `__m128i`, `__m256i`, and `__m512i` in
`test.src.string.memory_utils.op_tests.cpp`
- `test_range_errors` in `libc/test/src/math/smoke/AddTest.h` and
`libc/test/src/math/smoke/SubTest.h`
2024-09-11 23:41:32 -04:00
Joseph Huber
666a3f4ed4
[libc] Stub TLS functions on the GPU temporarily (#108267)
Summary:
There's an extern weak symbol for this; we should just factor these into
a more common interface. Stub them temporarily to make the bots happy.
PTXAS does not handle extern weak.
2024-09-11 11:36:07 -07:00
lntue
1896ee3889
[libc] Fix undefined behavior for nan functions. (#106468)
Currently the nan* functions use nullptr dereferencing to crash with
SIGSEGV if the input is nullptr. Both `nan(nullptr)` and `nullptr`
dereferencing are undefined behaviors according to the C standard.
Employing `nullptr` dereference in the `nan` function implementation is
ok if users only linked against the pre-built library, but it might be
completely removed by the compilers' optimizations if it is built from
source together with the users' code.

See for instance:  https://godbolt.org/z/fd8KcM9bx

This PR uses a volatile load to prevent the undefined behavior if libc is
built without sanitizers, and leaves the current undefined behavior if
libc is built with sanitizers, so that the undefined behavior can be
caught in users' code.
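
A minimal sketch of the volatile-load idea, for illustration only. `first_byte_checked` is a hypothetical name, not the helper used in the patch.

```c
#include <stddef.h>

/* Read the first byte of arg through a volatile-qualified pointer.
   Accesses to volatile objects are observable behavior, so the compiler
   must emit the load instead of optimizing it away; with a null arg the
   load typically faults (SIGSEGV on common platforms). */
char first_byte_checked(const char *arg) {
  const volatile char *p = arg;
  return *p;
}
```

A plain, non-volatile dereference whose result is unused could be deleted by the optimizer, which is the problem described above.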
2024-09-11 14:13:31 -04:00
Schrodinger ZHU Yifan
d8e124dffa
[libc] implement vdso (#91572) 2024-09-11 12:51:11 -04:00
Schrodinger ZHU Yifan
779a444009
[libc] fix tls teardown while being used (#108229)
The call chain to `Mutex::lock` can be polluted by the stack protector. To
be completely safe, postpone the main TLS teardown to a separate
phase.

fix #108030
2024-09-11 12:22:35 -04:00
Schrodinger ZHU Yifan
ce9f987295
[libc] fix locale dependency for stdlib (#108042)
Address the following issue:
```
❯ ninja libc.test.src.__support.OSUtil.linux.vdso_test.__unit__
[91/127] Building CXX object libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o
FAILED: libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o 
sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -std=gnu++17 -fpie -DLIBC_FULL_BUILD -ffreestanding -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -MF libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o.d -o libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp
In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp:21:
In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/UnitTest/ErrnoSetterMatcher.h:13:
In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/FPUtil/fpbits_str.h:12:
In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/CPP/string.h:20:
/home/schrodingerzy/Documents/llvm-project/build/libc/include/stdlib.h:13:10: fatal error: 'llvm-libc-types/locale_t.h' file not found
   13 | #include "llvm-libc-types/locale_t.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
[123/127] Building CXX object libc/test/UnitTest/CMakeFiles/LibcTest.unit.dir/LibcTestMain.cpp.o
ninja: build stopped: subcommand failed.
```
2024-09-10 13:04:19 -04:00
lntue
277371943f
[libc][bazel] Update bazel overlay for math functions and their tests. (#107862) 2024-09-09 14:15:46 -04:00
wldfngrs
3d7af093f3
[libc] Add proxy header for the jmp_buf type (#107712)
Added a proxy header for the jmp_buf type and changed all uses
from `__jmp_buf *` to the typedef alias `jmp_buf`. Also fixed the link to
LLVM in the stack_t.h description.
2024-09-08 20:55:00 -04:00
Rahul Joshi
98563b19c2
[libc][TableGen] Migrate libc-hdrgen backend to use const RecordKeeper (#107542)
Migrate libc-hdrgen backend to use const RecordKeeper
2024-09-07 15:14:07 -07:00
wldfngrs
056a1676cb
[libc] Add proxy header for the stack_t type (#107559)
Added a proxy header for the stack_t type and modified the corresponding
CMakeLists.txt files.
2024-09-07 10:01:33 -04:00
lntue
fc7a893620
[libc] Remove -ffreestanding when building MPFR wrapper. (#107637)
MPFR/GMP headers do not work with -ffreestanding flags.
2024-09-06 16:54:36 -04:00
lntue
876b0e60fe
[libc] Fix signal's dependency on the proxy header sighandler_t. (#107605) 2024-09-06 16:23:53 -04:00
lntue
80cf21dad1
[libc] Fix unit test compile flags propagation. (#106128)
With this change, I was able to build and test for aarch64 & riscv64 on an
x86-64 host as follows:

Prerequisites:
- cross build toolchain for aarch64
```
$ sudo apt install binutils-aarch64-linux-gnu gcc-aarch64-linux-gnu g++-aarch64-linux-gnu
```
- cross build toolchain for riscv64
```
$ sudo apt install binutils-riscv64-linux-gnu gcc-riscv64-linux-gnu g++-riscv64-linux-gnu
```
- qemu user:
```
$ sudo apt install qemu qemu-user qemu-user-static
```

CMake invocation:
```
$ cmake ../runtimes -GNinja -DLLVM_ENABLE_RUNTIMES=libc -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLIBC_TARGET_TRIPLE=<aarch64-linux-gnu/riscv64-linux-gnu> -DCMAKE_BUILD_TYPE=Release -DLIBC_TEST_COMPILE_OPTIONS_DEFAULT="-static"
$ ninja libc
$ ninja check-libc
```
2024-09-06 11:56:07 -04:00
Vitaly Goldshteyn
66a03295de
[libc] Implement branchless head-tail comparison for bcmp (#107540)
Binary size changes:

| Bytes (cache lines) | before   | after   |
|---------------------|----------|---------|
| sse4                | 419 (7)  | 288 (5) |
| avx                 | 430 (7)  | 308 (5) |
| avx512f             | 589 (10) | 390 (7) |

Benchmarks for different CPUs using
https://github.com/google/fleetbench.
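
As a rough illustration of the head-tail technique named in the title (not the patch's code, which works on wider vector types): for a size between one and two load widths, compare the first block and the last block, which may overlap, and combine the results without branching. The sketch below does this for 8 to 16 bytes with 64-bit loads; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unaligned 64-bit load; memcpy keeps this well-defined in C. */
static uint64_t load64(const unsigned char *p) {
  uint64_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* bcmp-style comparison for 8 <= n <= 16: returns 0 if the buffers are
   equal and 1 otherwise. The head covers bytes [0, 8) and the tail covers
   bytes [n - 8, n); together they cover every byte, and the two XOR
   results are OR-ed so only one final test is needed. */
int bcmp_head_tail_8_to_16(const void *a, const void *b, size_t n) {
  assert(n >= 8 && n <= 16);
  const unsigned char *pa = a;
  const unsigned char *pb = b;
  uint64_t head = load64(pa) ^ load64(pb);
  uint64_t tail = load64(pa + n - 8) ^ load64(pb + n - 8);
  return (head | tail) != 0;
}
```

Because `bcmp` only has to report equal or not equal, no per-byte ordering is needed, which is what makes the single combined test sufficient.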

 - indus-cascadelake

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      1.96GB/s ± 1%        2.19GB/s ± 0%  +11.49%  (p=0.000 n=29+24)
BM_LIBC_Bcmp_Fleet_L2                                      1.90GB/s ± 1%        2.14GB/s ± 1%  +12.68%  (p=0.000 n=29+24)
BM_LIBC_Bcmp_Fleet_LLC                                      513MB/s ± 4%         531MB/s ± 4%   +3.53%  (p=0.000 n=24+24)
BM_LIBC_Bcmp_Fleet_Cold                                     452MB/s ± 3%         456MB/s ± 4%     ~     (p=0.103 n=30+30)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  2.98GB/s ± 1%        3.15GB/s ± 1%   +5.59%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  2.86GB/s ± 1%        3.07GB/s ± 1%   +7.21%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]   738MB/s ± 7%         751MB/s ± 3%   +1.68%  (p=0.000 n=24+25)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   643MB/s ± 3%         642MB/s ± 4%     ~     (p=0.522 n=29+30)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.08GB/s ± 0%        3.25GB/s ± 0%   +5.35%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  2.97GB/s ± 1%        3.17GB/s ± 1%   +6.65%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]   901MB/s ±59%         871MB/s ±36%     ~     (p=0.676 n=29+27)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   686MB/s ± 4%         686MB/s ± 3%     ~     (p=0.934 n=29+30)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.63GB/s ± 0%        1.80GB/s ± 1%  +10.19%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.57GB/s ± 1%        1.75GB/s ± 1%  +11.46%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]   451MB/s ±61%         427MB/s ±28%     ~     (p=0.469 n=29+25)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   353MB/s ± 4%         354MB/s ± 5%     ~     (p=0.467 n=30+30)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  1.91GB/s ± 1%        2.10GB/s ± 1%   +9.90%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  1.84GB/s ± 1%        2.03GB/s ± 1%  +10.63%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]   491MB/s ±24%         538MB/s ±24%   +9.66%  (p=0.000 n=24+27)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   417MB/s ± 4%         421MB/s ± 3%     ~     (p=0.063 n=30+29)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   761MB/s ± 1%         867MB/s ± 1%  +14.02%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   748MB/s ± 1%         860MB/s ± 1%  +15.04%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   227MB/s ±29%         260MB/s ±64%  +14.70%  (p=0.000 n=26+27)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   187MB/s ± 3%         191MB/s ± 5%   +2.26%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.48GB/s ± 1%        1.71GB/s ± 1%  +15.26%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.42GB/s ± 1%        1.67GB/s ± 1%  +17.68%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]   412MB/s ±34%         519MB/s ±80%  +25.87%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   336MB/s ± 4%         343MB/s ± 6%   +2.05%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  2.87GB/s ± 0%        3.24GB/s ± 1%  +12.88%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  2.78GB/s ± 1%        3.20GB/s ± 1%  +15.15%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]   926MB/s ±43%        1227MB/s ±76%  +32.53%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   716MB/s ± 4%         737MB/s ± 6%   +3.02%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.54GB/s ± 1%        1.56GB/s ± 0%   +1.40%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.47GB/s ± 1%        1.52GB/s ± 1%   +2.97%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]   351MB/s ±23%         436MB/s ±83%  +24.04%  (p=0.005 n=24+29)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   283MB/s ± 4%         282MB/s ± 4%     ~     (p=0.644 n=30+30)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   824MB/s ± 1%        1048MB/s ± 1%  +27.18%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   808MB/s ± 1%        1027MB/s ± 1%  +27.12%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   317MB/s ±79%         332MB/s ±74%     ~     (p=0.338 n=30+29)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   207MB/s ± 5%         212MB/s ± 5%   +2.27%  (p=0.000 n=30+30)
```

 - indus-skylake

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      2.06GB/s ± 2%        2.25GB/s ± 3%   +9.66%  (p=0.000 n=27+24)
BM_LIBC_Bcmp_Fleet_L2                                      1.96GB/s ± 2%        2.17GB/s ± 2%  +10.61%  (p=0.000 n=30+24)
BM_LIBC_Bcmp_Fleet_LLC                                     1.18GB/s ± 6%        1.32GB/s ± 5%  +12.27%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_Fleet_Cold                                     456MB/s ± 2%         466MB/s ± 2%   +2.22%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  3.08GB/s ± 2%        3.20GB/s ± 1%   +3.72%  (p=0.000 n=28+22)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  2.92GB/s ± 1%        3.05GB/s ± 2%   +4.49%  (p=0.000 n=23+23)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]  1.83GB/s ± 8%        1.94GB/s ± 4%   +6.24%  (p=0.000 n=25+27)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   654MB/s ± 2%         659MB/s ± 2%   +0.76%  (p=0.012 n=30+29)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.19GB/s ± 2%        3.34GB/s ± 2%   +4.41%  (p=0.000 n=26+23)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  3.05GB/s ± 2%        3.21GB/s ± 2%   +5.32%  (p=0.000 n=28+25)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]  1.95GB/s ± 4%        2.03GB/s ±10%   +3.61%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   700MB/s ± 2%         702MB/s ± 2%     ~     (p=0.150 n=30+30)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.69GB/s ± 2%        1.85GB/s ± 1%   +9.31%  (p=0.000 n=30+26)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.60GB/s ± 2%        1.78GB/s ± 2%  +10.90%  (p=0.000 n=26+27)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]  1.01GB/s ± 5%        1.12GB/s ± 5%  +11.40%  (p=0.000 n=27+28)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   355MB/s ± 3%         360MB/s ± 3%   +1.46%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  1.98GB/s ± 2%        2.15GB/s ± 2%   +8.89%  (p=0.000 n=29+27)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  1.87GB/s ± 3%        2.05GB/s ± 2%  +10.06%  (p=0.000 n=30+26)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]  1.19GB/s ± 4%        1.31GB/s ± 6%   +9.82%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   424MB/s ± 3%         431MB/s ± 3%   +1.58%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   849MB/s ± 2%         949MB/s ± 2%  +11.84%  (p=0.000 n=27+28)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   815MB/s ± 3%         913MB/s ± 3%  +12.06%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   512MB/s ± 9%         571MB/s ± 7%  +11.40%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   187MB/s ± 3%         192MB/s ± 2%   +2.56%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.55GB/s ± 2%        1.77GB/s ± 3%  +13.93%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.47GB/s ± 2%        1.70GB/s ± 2%  +15.96%  (p=0.000 n=27+26)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]   939MB/s ± 5%        1084MB/s ± 4%  +15.36%  (p=0.000 n=28+27)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   340MB/s ± 2%         347MB/s ± 3%   +1.93%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  3.06GB/s ± 3%        3.40GB/s ± 2%  +11.13%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  2.89GB/s ± 3%        3.24GB/s ± 2%  +12.20%  (p=0.000 n=29+26)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]  1.93GB/s ± 4%        2.09GB/s ±11%   +8.16%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   746MB/s ± 2%         762MB/s ± 2%   +2.11%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.59GB/s ± 2%        1.62GB/s ± 2%   +1.72%  (p=0.000 n=25+27)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.49GB/s ± 2%        1.53GB/s ± 2%   +2.62%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]   852MB/s ±10%         909MB/s ± 6%   +6.71%  (p=0.000 n=30+29)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   283MB/s ± 3%         283MB/s ± 2%     ~     (p=0.617 n=30+27)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   891MB/s ± 2%        1083MB/s ± 2%  +21.64%  (p=0.000 n=27+24)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   855MB/s ± 2%        1045MB/s ± 1%  +22.31%  (p=0.000 n=25+23)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   568MB/s ± 7%         659MB/s ± 8%  +16.04%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   207MB/s ± 2%         212MB/s ± 2%   +2.31%  (p=0.000 n=30+27)
```

 - arcadia-rome

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      2.16GB/s ± 2%        2.27GB/s ± 2%   +5.13%  (p=0.000 n=26+30)
BM_LIBC_Bcmp_Fleet_L2                                      2.15GB/s ± 2%        2.25GB/s ± 2%   +4.64%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_Fleet_LLC                                     1.73GB/s ± 3%        1.81GB/s ± 3%   +4.66%  (p=0.000 n=25+28)
BM_LIBC_Bcmp_Fleet_Cold                                     494MB/s ± 1%         496MB/s ± 2%   +0.45%  (p=0.023 n=22+24)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  3.30GB/s ± 1%        3.24GB/s ± 2%   -1.70%  (p=0.000 n=27+30)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  3.23GB/s ± 2%        3.19GB/s ± 2%   -1.28%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]  2.59GB/s ± 3%        2.58GB/s ± 2%   -0.65%  (p=0.010 n=26+26)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   720MB/s ± 1%         707MB/s ± 3%   -1.75%  (p=0.000 n=22+25)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.37GB/s ± 1%        3.36GB/s ± 2%     ~     (p=0.102 n=28+29)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  3.32GB/s ± 2%        3.30GB/s ± 2%   -0.51%  (p=0.038 n=28+29)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]  2.67GB/s ± 4%        2.70GB/s ± 4%   +0.96%  (p=0.009 n=28+27)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   755MB/s ± 1%         751MB/s ± 2%   -0.57%  (p=0.000 n=22+25)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.79GB/s ± 1%        1.86GB/s ± 2%   +3.92%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.77GB/s ± 2%        1.82GB/s ± 2%   +2.99%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]  1.41GB/s ± 4%        1.47GB/s ± 3%   +3.97%  (p=0.000 n=28+28)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   386MB/s ± 1%         389MB/s ± 1%   +0.60%  (p=0.000 n=21+23)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  2.07GB/s ± 2%        2.17GB/s ± 2%   +4.87%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  2.07GB/s ± 2%        2.13GB/s ± 2%   +3.02%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]  1.66GB/s ± 2%        1.73GB/s ± 2%   +4.08%  (p=0.000 n=29+26)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   466MB/s ± 2%         469MB/s ± 3%   +0.66%  (p=0.001 n=22+25)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   861MB/s ± 1%         964MB/s ± 2%  +11.98%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   853MB/s ± 2%         935MB/s ± 2%   +9.54%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   707MB/s ± 3%         743MB/s ± 4%   +5.08%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   199MB/s ± 3%         199MB/s ± 2%     ~     (p=0.107 n=29+25)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.65GB/s ± 1%        1.75GB/s ± 2%   +6.15%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.64GB/s ± 3%        1.73GB/s ± 2%   +5.37%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]  1.32GB/s ± 2%        1.40GB/s ± 2%   +6.21%  (p=0.000 n=28+27)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   370MB/s ± 3%         371MB/s ± 2%   +0.16%  (p=0.008 n=29+25)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  3.25GB/s ± 2%        3.47GB/s ± 2%   +6.74%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  3.26GB/s ± 1%        3.44GB/s ± 1%   +5.43%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]  2.66GB/s ± 2%        2.79GB/s ± 3%   +4.90%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   812MB/s ± 3%         799MB/s ± 2%   -1.57%  (p=0.000 n=29+25)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.71GB/s ± 2%        1.66GB/s ± 2%   -3.14%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.63GB/s ± 2%        1.59GB/s ± 2%   -2.50%  (p=0.000 n=29+28)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]  1.25GB/s ± 4%        1.25GB/s ± 2%     ~     (p=0.530 n=28+26)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   311MB/s ± 3%         308MB/s ± 1%     ~     (p=0.127 n=29+24)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   869MB/s ± 2%        1098MB/s ± 2%  +26.28%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   873MB/s ± 2%        1075MB/s ± 1%  +23.06%  (p=0.000 n=27+29)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   743MB/s ± 4%         859MB/s ± 4%  +15.58%  (p=0.000 n=27+27)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   221MB/s ± 4%         221MB/s ± 3%   +0.14%  (p=0.034 n=29+25)
```

 - ixion-haswell

```
name                                                       old speed            new speed            delta
BM_LIBC_Bcmp_Fleet_L1                                      2.27GB/s ± 5%        2.41GB/s ± 6%   +6.10%  (p=0.000 n=29+28)
BM_LIBC_Bcmp_Fleet_L2                                      2.14GB/s ± 6%        2.33GB/s ± 5%   +9.21%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_Fleet_LLC                                     1.30GB/s ± 9%        1.43GB/s ± 8%   +9.85%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_Fleet_Cold                                     475MB/s ± 6%         475MB/s ± 5%     ~     (p=0.839 n=30+29)
BM_LIBC_Bcmp_0_L1                                [Bcmp_0]  3.38GB/s ± 7%        3.46GB/s ± 6%   +2.35%  (p=0.009 n=30+29)
BM_LIBC_Bcmp_0_L2                                [Bcmp_0]  3.20GB/s ± 5%        3.32GB/s ± 6%   +3.52%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_0_LLC                               [Bcmp_0]  1.88GB/s ± 9%        2.00GB/s ± 6%   +6.63%  (p=0.000 n=30+28)
BM_LIBC_Bcmp_0_Cold                              [Bcmp_0]   664MB/s ± 6%         655MB/s ± 6%   -1.32%  (p=0.025 n=30+30)
BM_LIBC_Bcmp_1_L1                                [Bcmp_1]  3.50GB/s ± 8%        3.61GB/s ±10%   +3.09%  (p=0.001 n=29+30)
BM_LIBC_Bcmp_1_L2                                [Bcmp_1]  3.32GB/s ± 7%        3.48GB/s ± 8%   +4.89%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_1_LLC                               [Bcmp_1]  2.02GB/s ± 7%        2.14GB/s ± 9%   +5.82%  (p=0.000 n=28+29)
BM_LIBC_Bcmp_1_Cold                              [Bcmp_1]   716MB/s ± 6%         709MB/s ± 5%   -0.97%  (p=0.040 n=30+28)
BM_LIBC_Bcmp_2_L1                                [Bcmp_2]  1.83GB/s ± 7%        1.97GB/s ± 8%   +7.90%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_2_L2                                [Bcmp_2]  1.74GB/s ± 6%        1.92GB/s ± 6%  +10.29%  (p=0.000 n=30+29)
BM_LIBC_Bcmp_2_LLC                               [Bcmp_2]  1.05GB/s ± 9%        1.15GB/s ± 9%   +9.73%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_2_Cold                              [Bcmp_2]   379MB/s ± 6%         372MB/s ± 6%   -1.74%  (p=0.012 n=30+30)
BM_LIBC_Bcmp_3_L1                                [Bcmp_3]  2.17GB/s ± 5%        2.29GB/s ± 6%   +5.61%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_L2                                [Bcmp_3]  2.02GB/s ± 6%        2.20GB/s ± 6%   +8.75%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_3_LLC                               [Bcmp_3]  1.22GB/s ± 8%        1.34GB/s ± 9%   +9.19%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_3_Cold                              [Bcmp_3]   447MB/s ± 3%         441MB/s ± 7%   -1.40%  (p=0.033 n=30+30)
BM_LIBC_Bcmp_4_L1                                [Bcmp_4]   902MB/s ± 6%         995MB/s ±10%  +10.37%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_L2                                [Bcmp_4]   863MB/s ± 5%         945MB/s ±11%   +9.50%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_4_LLC                               [Bcmp_4]   528MB/s ±11%         559MB/s ±12%   +5.75%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_4_Cold                              [Bcmp_4]   183MB/s ± 4%         181MB/s ± 7%     ~     (p=0.088 n=28+30)
BM_LIBC_Bcmp_5_L1                                [Bcmp_5]  1.70GB/s ± 6%        1.87GB/s ± 8%  +10.14%  (p=0.000 n=29+29)
BM_LIBC_Bcmp_5_L2                                [Bcmp_5]  1.60GB/s ± 5%        1.80GB/s ± 9%  +12.61%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_5_LLC                               [Bcmp_5]   994MB/s ±13%        1094MB/s ± 8%  +10.10%  (p=0.000 n=29+30)
BM_LIBC_Bcmp_5_Cold                              [Bcmp_5]   362MB/s ± 6%         358MB/s ± 7%     ~     (p=0.123 n=30+30)
BM_LIBC_Bcmp_6_L1                                [Bcmp_6]  3.31GB/s ± 5%        3.67GB/s ± 6%  +10.90%  (p=0.000 n=28+30)
BM_LIBC_Bcmp_6_L2                                [Bcmp_6]  3.11GB/s ± 5%        3.53GB/s ± 5%  +13.59%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_LLC                               [Bcmp_6]  1.98GB/s ± 9%        2.18GB/s ± 8%  +10.34%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_6_Cold                              [Bcmp_6]   754MB/s ± 5%         752MB/s ± 5%     ~     (p=0.592 n=30+30)
BM_LIBC_Bcmp_7_L1                                [Bcmp_7]  1.72GB/s ± 5%        1.72GB/s ± 6%     ~     (p=0.549 n=29+29)
BM_LIBC_Bcmp_7_L2                                [Bcmp_7]  1.61GB/s ± 7%        1.63GB/s ± 8%     ~     (p=0.191 n=30+29)
BM_LIBC_Bcmp_7_LLC                               [Bcmp_7]   913MB/s ± 8%         905MB/s ± 9%     ~     (p=0.423 n=30+30)
BM_LIBC_Bcmp_7_Cold                              [Bcmp_7]   304MB/s ± 6%         287MB/s ± 4%   -5.57%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_L1                                [Bcmp_8]   961MB/s ± 5%        1124MB/s ± 6%  +16.94%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_L2                                [Bcmp_8]   915MB/s ± 8%        1100MB/s ± 7%  +20.16%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_LLC                               [Bcmp_8]   593MB/s ± 8%         669MB/s ± 8%  +12.92%  (p=0.000 n=30+30)
BM_LIBC_Bcmp_8_Cold                              [Bcmp_8]   220MB/s ± 4%         220MB/s ± 6%     ~     (p=0.572 n=30+30)
```

Co-authored-by: goldvitaly@google.com <%username%@google.com>
2024-09-06 11:19:01 +02:00
wldfngrs
73514f6831
[libc] Add proxy header for __sighandler_t type (#107354)
Added proxy headers for the __sighandler_t type and updated the
corresponding CMakeLists.txt files and test files.
2024-09-05 18:04:35 -04:00
Michael Jones
8e28f0471b
[libc] Correct the entrypoints list for ARM/darwin (#107331)
These entrypoints were added to every target without testing. They don't
work on ARM Macs.
2024-09-05 09:38:05 -07:00
Petr Hosek
8b77aa990b
[libc] Use correct names for locale variants in spec.td (#106806)
This addresses an issue introduced in #105718.
2024-08-30 15:13:23 -07:00
Joseph Huber
5c019bdb7a
[libc] Add support for 'string.h' locale variants (#105719)
Summary:
This adds the locale variants of the string functions. As previously,
these do not use the locale information at all and simply copy the
non-locale version which expects the "C" locale.
2024-08-29 14:20:15 -05:00
Joseph Huber
a87105121d
[libc] Implement locale variants for 'stdlib.h' functions (#105718)
Summary:
This provides the `_l` variants for the `stdlib.h` functions. These are
just copies of the same entrypoint and don't do anything with the locale
information.
2024-08-29 14:18:37 -05:00
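The `_l` variants described in the entry above can be pictured roughly as follows. This is only a sketch of the forwarding idea, not the LLVM libc source: `SketchLocale` and `sketch_strtol_l` are hypothetical names introduced here, and the real entrypoints take a `locale_t`.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for locale_t, used only for this illustration.
struct SketchLocale {};

// Hypothetical `_l` variant: accepts a locale argument but ignores it and
// forwards to the non-locale function, which behaves as in the "C" locale.
inline long sketch_strtol_l(const char *str, char **str_end, int base,
                            SketchLocale * /*unused locale*/) {
  assert(str != nullptr); // precondition assumed by this sketch
  return std::strtol(str, str_end, base);
}
```

Under this scheme a caller gets the same result whichever locale object it passes, which is what the commit message means by the variants not doing anything with the locale information.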
Job Henandez Lara
1ace91f925
[libc][math] Add performance tests for fmul and fmull. (#106262)
2024-08-29 14:14:18 -04:00
Guillaume Chatelet
73ef397fcb
[libc][x86] Use prefetch for write for memcpy (#90450)
Currently when `LIBC_COPT_MEMCPY_X86_USE_SOFTWARE_PREFETCHING` is set we
prefetch memory for read on the source buffer. This patch adds prefetch
for write on the destination buffer.
2024-08-29 14:17:23 +02:00
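The read and write prefetching described in the entry above can be sketched as below. This is a minimal illustration, not the LLVM libc `memcpy`: the function name, the 64-byte block size, and the prefetch distance are assumptions chosen for the example. It relies on the GCC/Clang builtin `__builtin_prefetch(addr, rw, locality)`, where `rw = 0` requests a prefetch for read and `rw = 1` a prefetch for write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: block-wise copy that prefetches ahead on both buffers.
inline void *copy_with_prefetch(void *dst, const void *src, std::size_t n) {
  assert((dst != nullptr && src != nullptr) || n == 0);
  constexpr std::size_t kBlock = 64;            // assumed cache-line size
  constexpr std::size_t kDistance = 4 * kBlock; // assumed prefetch distance
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  std::size_t i = 0;
  for (; i + kBlock <= n; i += kBlock) {
    if (i + kDistance < n) {
      // Source buffer: prefetch for read (what was already done).
      __builtin_prefetch(s + i + kDistance, /*rw=*/0, /*locality=*/3);
      // Destination buffer: prefetch for write (what the patch adds).
      __builtin_prefetch(d + i + kDistance, /*rw=*/1, /*locality=*/3);
    }
    std::memcpy(d + i, s + i, kBlock);
  }
  if (i < n)
    std::memcpy(d + i, s + i, n - i); // copy the remaining tail bytes
  return dst;
}
```

Prefetches are only hints, so the copied bytes are identical with or without them. Whether the write prefetch helps depends on the buffer sizes and the microarchitecture.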
Joseph Huber
439d7de14d
[libc] Disable failing scanf test on AMDGPU temporarily
Summary:
This test currently fails in the `amdgpu-attributor` pass. I haven't
figured out anything beyond that yet as it's difficult to reduce.
2024-08-28 07:04:15 -05:00
Joseph Huber
8fd9ec5817
[libc] Fix incorrect check for NVPTX backend
2024-08-28 07:04:15 -05:00