Nikolas Klauser
1cac2be874
[libc++] Optimize copy construction and assignment of __tree ( #151304 )
...
```
----------------------------------------------------------------------------------------------------------
Benchmark old new
----------------------------------------------------------------------------------------------------------
std::map<int, int>::ctor(const&)/0 15.5 ns 14.9 ns
std::map<int, int>::ctor(const&)/32 474 ns 321 ns
std::map<int, int>::ctor(const&)/1024 24591 ns 11101 ns
std::map<int, int>::ctor(const&)/8192 236153 ns 98868 ns
std::map<std::string, int>::ctor(const&)/0 15.2 ns 14.9 ns
std::map<std::string, int>::ctor(const&)/32 2673 ns 2340 ns
std::map<std::string, int>::ctor(const&)/1024 115354 ns 86088 ns
std::map<std::string, int>::ctor(const&)/8192 1298510 ns 626876 ns
std::map<int, int>::operator=(const&) (into cleared Container)/0 16.5 ns 16.1 ns
std::map<int, int>::operator=(const&) (into cleared Container)/32 548 ns 323 ns
std::map<int, int>::operator=(const&) (into cleared Container)/1024 28418 ns 11026 ns
std::map<int, int>::operator=(const&) (into cleared Container)/8192 281827 ns 97113 ns
std::map<int, int>::operator=(const&) (into populated Container)/0 2.42 ns 1.85 ns
std::map<int, int>::operator=(const&) (into populated Container)/32 369 ns 73.0 ns
std::map<int, int>::operator=(const&) (into populated Container)/1024 24078 ns 2322 ns
std::map<int, int>::operator=(const&) (into populated Container)/8192 266537 ns 22963 ns
std::map<std::string, int>::operator=(const&) (into cleared Container)/0 16.6 ns 16.2 ns
std::map<std::string, int>::operator=(const&) (into cleared Container)/32 2614 ns 1622 ns
std::map<std::string, int>::operator=(const&) (into cleared Container)/1024 116826 ns 63281 ns
std::map<std::string, int>::operator=(const&) (into cleared Container)/8192 1316655 ns 649177 ns
std::map<std::string, int>::operator=(const&) (into populated Container)/0 2.42 ns 1.89 ns
std::map<std::string, int>::operator=(const&) (into populated Container)/32 1264 ns 581 ns
std::map<std::string, int>::operator=(const&) (into populated Container)/1024 238826 ns 39943 ns
std::map<std::string, int>::operator=(const&) (into populated Container)/8192 2412327 ns 379456 ns
```
Fixes #77658
Fixes #62571
2025-08-05 09:49:40 +02:00
xbcnn
3efa461d45
[libcxx] Avoid hash key in __hash_table::find() if it is empty. ( #126837 )
...
If the hash table is empty, with or without buckets, the find() can do
fast return. Then computing hash key is useless and avoidable, since it
could be expensive for some key types, such as long strings.
This is a small optimization but useful in cases like a checklist
(unordered_set/map) which is mostly empty.
```
For std::unordered_set<*>, `--benchmark_filter=find`
1. With the opt:
---------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
---------------------------------------------------------------------------------------------------------
std::unordered_set<int>::find(key) (existent)/0 0.118 ns 0.118 ns 5939922720
std::unordered_set<int>::find(key) (existent)/32 52.1 ns 52.1 ns 13287232
std::unordered_set<int>::find(key) (existent)/1024 51.1 ns 51.1 ns 13449472
std::unordered_set<int>::find(key) (existent)/8192 53.1 ns 53.1 ns 13420864
std::unordered_set<int>::find(key) (non-existent)/0 14.7 ns 14.7 ns 47725472
std::unordered_set<int>::find(key) (non-existent)/32 44.1 ns 44.1 ns 15478144
std::unordered_set<int>::find(key) (non-existent)/1024 41.2 ns 41.2 ns 15082464
std::unordered_set<int>::find(key) (non-existent)/8192 49.5 ns 49.5 ns 15233600
std::unordered_set<std::string>::find(key) (existent)/0 0.136 ns 0.136 ns 5157977920
std::unordered_set<std::string>::find(key) (existent)/32 739 ns 739 ns 1023744
std::unordered_set<std::string>::find(key) (existent)/1024 836 ns 836 ns 840448
std::unordered_set<std::string>::find(key) (existent)/8192 768 ns 768 ns 1085664
std::unordered_set<std::string>::find(key) (non-existent)/0 14.6 ns 14.6 ns 47844160
std::unordered_set<std::string>::find(key) (non-existent)/32 608 ns 608 ns 1106496
std::unordered_set<std::string>::find(key) (non-existent)/1024 646 ns 646 ns 986272
std::unordered_set<std::string>::find(key) (non-existent)/8192 669 ns 669 ns 1047584
2. Without the opt:
---------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
---------------------------------------------------------------------------------------------------------
std::unordered_set<int>::find(key) (existent)/0 0.135 ns 0.135 ns 5188502304
std::unordered_set<int>::find(key) (existent)/32 54.4 ns 54.4 ns 12954144
std::unordered_set<int>::find(key) (existent)/1024 57.7 ns 57.7 ns 13107008
std::unordered_set<int>::find(key) (existent)/8192 50.7 ns 50.7 ns 12953312
std::unordered_set<int>::find(key) (non-existent)/0 16.1 ns 16.1 ns 43460192
std::unordered_set<int>::find(key) (non-existent)/32 45.8 ns 45.8 ns 17139584
std::unordered_set<int>::find(key) (non-existent)/1024 44.6 ns 44.6 ns 16538048
std::unordered_set<int>::find(key) (non-existent)/8192 41.5 ns 41.5 ns 12850816
std::unordered_set<std::string>::find(key) (existent)/0 0.133 ns 0.133 ns 5214104992
std::unordered_set<std::string>::find(key) (existent)/32 731 ns 731 ns 1000576
std::unordered_set<std::string>::find(key) (existent)/1024 716 ns 716 ns 1131584
std::unordered_set<std::string>::find(key) (existent)/8192 745 ns 745 ns 909632
std::unordered_set<std::string>::find(key) (non-existent)/0 600 ns 600 ns 1089792
std::unordered_set<std::string>::find(key) (non-existent)/32 645 ns 645 ns 979232
std::unordered_set<std::string>::find(key) (non-existent)/1024 675 ns 675 ns 962240
std::unordered_set<std::string>::find(key) (non-existent)/8192 711 ns 711 ns 1054880
```
We can see the improvements when find() for non-existent
`std::string`(random size 1~1024) keys:
```
std::unordered_set<std::string>::find(key) (non-existent)/0 14.6 ns 14.6 ns 47844160
std::unordered_set<std::string>::find(key) (non-existent)/0 600 ns 600 ns 1089792
```
---------
Co-authored-by: yangxiaobing <yangxiaobing@jwzg.com>
2025-07-03 09:39:06 +02:00
Hui
34b2e934ea
[libc++] Introduce __product_iterator_traits
and optimise flat_map::insert
( #139454 )
...
Fixes #108624
This allows `flat_map::insert(Iter, Iter)` to directly forward to
underlying containers' `insert(Iter, Iter)`, instead of inserting one
element at a time, when input models "product iterator". atm,
`flat_map::iterator` and `zip_view::iterator` are "product iterator"s.
This gives about almost 10x speed up in my benchmark with -03 (for both
before and after)
```cpp
Benchmark Time CPU Time Old Time New CPU Old CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------
flat_map::insert_product_iterator_flat_map/32 -0.5028 -0.5320 149 74 149 70
flat_map::insert_product_iterator_flat_map/1024 -0.8617 -0.8618 3113 430 3112 430
flat_map::insert_product_iterator_flat_map/8192 -0.8877 -0.8877 26682 2995 26679 2995
flat_map::insert_product_iterator_flat_map/65536 -0.8769 -0.8769 226235 27844 226221 27841
flat_map::insert_product_iterator_zip/32 -0.5844 -0.5844 162 67 162 67
flat_map::insert_product_iterator_zip/1024 -0.8754 -0.8754 3427 427 3427 427
flat_map::insert_product_iterator_zip/8192 -0.8934 -0.8934 28134 3000 28132 3000
flat_map::insert_product_iterator_zip/65536 -0.8783 -0.8783 229783 27960 229767 27958
OVERALL_GEOMEAN -0.8319 -0.8332 0 0 0 0
```
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
2025-06-28 13:42:50 +01:00
Louis Dionne
1d319dfe7d
[libc++] Implement generic associative container benchmarks ( #123663 )
...
This patch implements generic associative container benchmarks for
containers with unique keys. In doing so, it replaces the existing
std::map benchmarks which were based on the cartesian product
infrastructure and were too slow to execute.
These new benchmarks aim to strike a balance between exhaustive coverage
of all operations in the most interesting case, while executing fairly
rapidly (~40s on my machine).
This bumps the requirement for the map benchmarks from C++17 to C++20
because the common header that provides associative container benchmarks
requires support for C++20 concepts.
2025-02-06 16:08:55 -05:00