11 Commits

Author SHA1 Message Date
Peng Liu
d0cee6939a
[libc++] Optimize std::{,ranges}::{fill,fill_n} for segmented iterators (#132665)
This patch optimizes `std::fill`, `std::fill_n`, `std::ranges::fill`,
and `std::ranges::fill_n` for segmented iterators, achieving substantial
performance improvements. Specifically, for `deque<int>` iterators, the
performance improvements are above 10x for all these algorithms. The
optimization also enables filling segmented memory of `deque<int>` to
approach the performance of filling contiguous memory of `vector<int>`.


Benchmark results comparing the before and after implementations are
provided below. For additional context, we’ve included `vector<int>`
results, which remain unchanged, as this patch specifically targets
segmented iterators and leaves non-segmented iterator behavior
untouched.



Fixes two subtasks outlined in #102817.

#### `fill_n`

```
-----------------------------------------------------------------------------
Benchmark                                Before            After      Speedup
-----------------------------------------------------------------------------
std::fill_n(deque<int>)/32              11.4 ns          2.28 ns        5.0x
std::fill_n(deque<int>)/50              19.7 ns          3.40 ns        5.8x
std::fill_n(deque<int>)/1024             391 ns          37.3 ns       10.5x
std::fill_n(deque<int>)/8192            3174 ns           301 ns       10.5x
std::fill_n(deque<int>)/65536          26504 ns          2951 ns        9.0x
std::fill_n(deque<int>)/1048576       407960 ns         80658 ns        5.1x
rng::fill_n(deque<int>)/32              14.3 ns          2.15 ns        6.6x
rng::fill_n(deque<int>)/50              20.2 ns          3.22 ns        6.3x
rng::fill_n(deque<int>)/1024             381 ns          37.8 ns       10.1x
rng::fill_n(deque<int>)/8192            3101 ns           294 ns       10.5x
rng::fill_n(deque<int>)/65536          25098 ns          2926 ns        8.6x
rng::fill_n(deque<int>)/1048576       394342 ns         78874 ns        5.0x
std::fill_n(vector<int>)/32             1.76 ns          1.72 ns        1.0x
std::fill_n(vector<int>)/50             3.00 ns          2.73 ns        1.1x
std::fill_n(vector<int>)/1024           38.4 ns          37.9 ns        1.0x
std::fill_n(vector<int>)/8192            258 ns           252 ns        1.0x
std::fill_n(vector<int>)/65536          2993 ns          2889 ns        1.0x
std::fill_n(vector<int>)/1048576       80328 ns         80468 ns        1.0x
rng::fill_n(vector<int>)/32             1.99 ns          1.35 ns        1.5x
rng::fill_n(vector<int>)/50             2.66 ns          2.12 ns        1.3x
rng::fill_n(vector<int>)/1024           37.7 ns          35.8 ns        1.1x
rng::fill_n(vector<int>)/8192            253 ns           250 ns        1.0x
rng::fill_n(vector<int>)/65536          2922 ns          2930 ns        1.0x
rng::fill_n(vector<int>)/1048576       79739 ns         79742 ns        1.0x
```

#### `fill`

```
--------------------------------------------------------------------------
Benchmark                              Before            After     Speedup
--------------------------------------------------------------------------
std::fill(deque<int>)/32              13.7 ns          2.45 ns        5.6x
std::fill(deque<int>)/50              21.7 ns          4.57 ns        4.7x
std::fill(deque<int>)/1024             367 ns          38.5 ns        9.5x
std::fill(deque<int>)/8192            2896 ns           247 ns       11.7x
std::fill(deque<int>)/65536          23723 ns          2907 ns        8.2x
std::fill(deque<int>)/1048576       379043 ns         79885 ns        4.7x
rng::fill(deque<int>)/32              13.6 ns          2.70 ns        5.0x
rng::fill(deque<int>)/50              23.4 ns          3.94 ns        5.9x
rng::fill(deque<int>)/1024             377 ns          37.9 ns        9.9x
rng::fill(deque<int>)/8192            2914 ns           286 ns       10.2x
rng::fill(deque<int>)/65536          23612 ns          2939 ns        8.0x
rng::fill(deque<int>)/1048576       379841 ns         80079 ns        4.7x
std::fill(vector<int>)/32             1.99 ns          1.79 ns        1.1x
std::fill(vector<int>)/50             3.05 ns          3.06 ns        1.0x
std::fill(vector<int>)/1024           37.6 ns          38.0 ns        1.0x
std::fill(vector<int>)/8192            255 ns           257 ns        1.0x
std::fill(vector<int>)/65536          2966 ns          2981 ns        1.0x
std::fill(vector<int>)/1048576       78300 ns         80348 ns        1.0x
rng::fill(vector<int>)/32             1.77 ns          1.75 ns        1.0x
rng::fill(vector<int>)/50             4.85 ns          2.31 ns        2.1x
rng::fill(vector<int>)/1024           39.6 ns          36.1 ns        1.1x
rng::fill(vector<int>)/8192            238 ns           251 ns        0.9x
rng::fill(vector<int>)/65536          2941 ns          2918 ns        1.0x
rng::fill(vector<int>)/1048576       80497 ns         80442 ns        1.0x
```

---------

Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
Co-authored-by: A. Jiang <de34@live.cn>
2025-10-17 07:41:24 +08:00
Peng Liu
1d583ed2fb
[libc++][test] Augment ranges::{fill, fill_n, find} with missing tests (#121209)
libc++ currently has very limited test coverage for `std::ranges{fill, fill_n, find}`
with `vector<bool>::iterator` optimizations. Specifically, the existing tests for
`std::ranges::fill` only covers cases of 1 - 2 bytes, which is merely 1/8 to 1/4
of the `__storage_type` word size. This renders the tests insufficient to validate
functionality for whole words, with or without partial words (which necessitates at
least 8 bytes of data). Moreover, no tests were provided for `ranges::{find, fill_n}`
with `vector<bool>::iterator` optimizations. This PR fills in the gap.
2025-02-26 11:38:15 -05:00
Peng Liu
ad87d5f23d
[libc++][test] Refactor tests for std::{copy, move, fill} algorithms (#120909)
This refactor includes the following changes:
- Refactor similar tests using `types::for_each` to remove redundant code;
- Explicitly include the missing header `type_algorithms.h` in some test files;
- Some tests scattered in different test functions with ad-hoc names
  (e.g., `test5()`, `test6()`) but belong to the same kind are now grouped
  into one function (`test_struct_array()`).
2025-02-19 12:44:44 -05:00
Peng Liu
cf9806eb4d
[libc++] Fix UB in bitwise logic of {std, ranges}::{fill, fill_n} algorithms (#122410)
This PR addresses an undefined behavior that arises when using the
`std::fill` and `std::fill_n` algorithms, as well as their ranges
counterparts `ranges::fill` and `ranges::fill_n`, with `vector<bool, Alloc>`
that utilizes a custom-sized allocator with small integral types.
2025-02-05 11:39:49 -05:00
Nikolas Klauser
e99c4906e4
[libc++] Granularize <cstddef> includes (#108696) 2024-10-31 02:20:10 +01:00
Nikolas Klauser
07b18c5e1b
[libc++] Optimize ranges::fill{,_n} for vector<bool>::iterator (#84642)
```
------------------------------------------------------
Benchmark                          old             new
------------------------------------------------------
bm_ranges_fill_n/1             1.64 ns         3.06 ns
bm_ranges_fill_n/2             3.45 ns         3.06 ns
bm_ranges_fill_n/3             4.88 ns         3.06 ns
bm_ranges_fill_n/4             6.46 ns         3.06 ns
bm_ranges_fill_n/5             8.03 ns         3.06 ns
bm_ranges_fill_n/6             9.65 ns         3.07 ns
bm_ranges_fill_n/7             11.5 ns         3.06 ns
bm_ranges_fill_n/8             13.0 ns         3.06 ns
bm_ranges_fill_n/16            25.9 ns         3.06 ns
bm_ranges_fill_n/64             103 ns         4.62 ns
bm_ranges_fill_n/512            711 ns         4.40 ns
bm_ranges_fill_n/4096          5642 ns         9.86 ns
bm_ranges_fill_n/32768        45135 ns         33.6 ns
bm_ranges_fill_n/262144      360818 ns          243 ns
bm_ranges_fill_n/1048576    1442828 ns          982 ns
bm_ranges_fill/1               1.63 ns         3.17 ns
bm_ranges_fill/2               3.43 ns         3.28 ns
bm_ranges_fill/3               4.97 ns         3.31 ns
bm_ranges_fill/4               6.53 ns         3.27 ns
bm_ranges_fill/5               8.12 ns         3.33 ns
bm_ranges_fill/6               9.76 ns         3.32 ns
bm_ranges_fill/7               11.6 ns         3.29 ns
bm_ranges_fill/8               13.2 ns         3.26 ns
bm_ranges_fill/16              26.3 ns         3.26 ns
bm_ranges_fill/64               104 ns         4.92 ns
bm_ranges_fill/512              716 ns         4.47 ns
bm_ranges_fill/4096            5772 ns         8.21 ns
bm_ranges_fill/32768          45778 ns         33.1 ns
bm_ranges_fill/262144        351422 ns          241 ns
bm_ranges_fill/1048576      1404710 ns          965 ns
```
2024-03-17 20:00:54 +01:00
JF Bastien
2df59c5068 Support tests in freestanding
Summary:
Freestanding is *weird*. The standard allows it to differ in a bunch of odd
manners from regular C++, and the committee would like to improve that
situation. I'd like to make libc++ behave better with what freestanding should
be, so that it can be a tool we use in improving the standard. To do that we
need to try stuff out, both with "freestanding the language mode" and
"freestanding the library subset".

Let's start with the super basic: run the libc++ tests in freestanding, using
clang as the compiler, and see what works. The easiest hack to do this:

In utils/libcxx/test/config.py add:

  self.cxx.compile_flags += ['-ffreestanding']

Run the tests and they all fail.

Why? Because in freestanding `main` isn't special. This "not special" property
has two effects: main doesn't get mangled, and main isn't allowed to omit its
`return` statement. The first means main gets mangled and the linker can't
create a valid executable for us to test. The second means we spew out warnings
(ew) and the compiler doesn't insert the `return` we omitted, and main just
falls of the end and does whatever undefined behavior (if you're luck, ud2
leading to non-zero return code).

Let's start my work with the basics. This patch changes all libc++ tests to
declare `main` as `int main(int, char**` so it mangles consistently (enabling us
to declare another `extern "C"` main for freestanding which calls the mangled
one), and adds `return 0;` to all places where it was missing. This touches 6124
files, and I apologize.

The former was done with The Magic Of Sed.

The later was done with a (not quite correct but decent) clang tool:

  https://gist.github.com/jfbastien/793819ff360baa845483dde81170feed

This works for most tests, though I did have to adjust a few places when e.g.
the test runs with `-x c`, macros are used for main (such as for the filesystem
tests), etc.

Once this is in we can create a freestanding bot which will prevent further
regressions. After that, we can start the real work of supporting C++
freestanding fairly well in libc++.

<rdar://problem/47754795>

Reviewers: ldionne, mclow.lists, EricWF

Subscribers: christof, jkorous, dexonsmith, arphaman, miyuki, libcxx-commits

Differential Revision: https://reviews.llvm.org/D57624

llvm-svn: 353086
2019-02-04 20:31:13 +00:00
Chandler Carruth
57b08b0944 Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351648
2019-01-19 10:56:40 +00:00
Stephan T. Lavavej
6b1ae9b854 [libcxx] [test] Strip trailing whitespace, NFC.
llvm-svn: 324959
2018-02-12 22:54:35 +00:00
Marshall Clow
4bfb9313c1 More P0202 constexpr work. This commit adds fill/fill_n/generate/generate_n/unique/unique_copy. I removed a specialization of fill_n that recognized when we were dealing with raw pointers and 1 byte trivially-assignable types and did a memset, because the compiler will do that optimization for us.
llvm-svn: 323050
2018-01-20 20:14:32 +00:00
Eric Fiselier
5a83710e37 Move test into test/std subdirectory.
llvm-svn: 224658
2014-12-20 01:40:03 +00:00