This patch introduces a new trait to represent whether a type is trivially
relocatable, and uses that trait to optimize the growth of a std::vector
of trivially relocatable objects.
```
--------------------------------------------------
Benchmark                            old       new
--------------------------------------------------
bm_grow<int>                     1354 ns   1301 ns
bm_grow<std::string>             5584 ns   3370 ns
bm_grow<std::unique_ptr<int>>    3506 ns   1994 ns
bm_grow<std::deque<int>>        27114 ns  27209 ns
```
This also changes the order of moving and destroying the objects when
growing the vector. This should not affect our conformance.
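A minimal sketch of the idea, assuming C++17. The names
`is_trivially_relocatable_sketch` and `relocate_n` are hypothetical and are
not the trait or helper added by this patch. The byte-copy path relies on
common implementation behavior rather than on a guarantee of the standard,
and the generic path shows only one possible ordering of moves and
destructions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait (an assumption for illustration): conservatively treat
// only trivially copyable types as trivially relocatable unless specialized.
template <class T>
struct is_trivially_relocatable_sketch : std::is_trivially_copyable<T> {};

// Opt-in for a type where "move-construct, then destroy the source" is
// assumed to be equivalent to copying the bytes.
template <class T>
struct is_trivially_relocatable_sketch<std::unique_ptr<T>> : std::true_type {};

// Relocate n objects from src into uninitialized storage at dst.
// After the call, the objects at src must be treated as already destroyed.
template <class T>
void relocate_n(T* src, std::size_t n, T* dst) {
  assert(n == 0 || src != dst);
  if constexpr (is_trivially_relocatable_sketch<T>::value) {
    // Fast path: one bulk byte copy, no per-element moves or destructors.
    if (n != 0)
      std::memcpy(static_cast<void*>(dst), static_cast<const void*>(src),
                  n * sizeof(T));
  } else {
    // Generic path: move-construct every element, then destroy the sources.
    for (std::size_t i = 0; i != n; ++i)
      ::new (static_cast<void*>(dst + i)) T(std::move(src[i]));
    for (std::size_t i = 0; i != n; ++i)
      src[i].~T();
  }
}
```

A growth routine would allocate the larger buffer, call `relocate_n` on the
old elements, and then free the old buffer without running destructors.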
Credits: this change is based on analysis and a proof of concept by
gerbens@google.com.
Before this change, the compiler loses track of end because 'this' and
other references may escape beyond the compiler's scope. This can be seen
in the generated assembly:
16.28 │200c80: mov %r15d,(%rax)
60.87 │200c83: add $0x4,%rax
│200c87: mov %rax,-0x38(%rbp)
0.03 │200c8b: → jmpq 200d4e
...
...
1.69 │200d4e: cmp %r15d,%r12d
│200d51: → je 200c40
16.34 │200d57: inc %r15d
0.05 │200d5a: mov -0x38(%rbp),%rax
3.27 │200d5e: mov -0x30(%rbp),%r13
1.47 │200d62: cmp %r13,%rax
│200d65: → jne 200c80
We fix this by always explicitly storing the loaded local end pointer
back at the end of push_back. This generates some slight source 'noise',
but creates nice and compact fast path code, i.e.:
32.64 │200760: mov %r14d,(%r12)
9.97 │200764: add $0x4,%r12
6.97 │200768: mov %r12,-0x38(%rbp)
32.17 │20076c: add $0x1,%r14d
2.36 │200770: cmp %r14d,%ebx
│200773: → je 200730
8.98 │200775: mov -0x30(%rbp),%r13
6.75 │200779: cmp %r13,%r12
│20077c: → jne 200760
Now there is a single store for the push_back value (as before), and a
single store for the end without a reload (dependency).
For fully local vectors (i.e., not referenced elsewhere), the capacity
load and store inside the loop could also be removed, but this requires
more substantial refactoring inside vector.
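The pattern described above can be sketched roughly as follows. `IntVec`
and `grow_and_push` are hypothetical names for a toy container and are not
libc++'s implementation; the sketch only shows the end pointer being loaded
into a local once and stored back once per call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical minimal int container (an assumption for illustration).
struct IntVec {
  int* begin_ = nullptr;
  int* end_ = nullptr;
  int* cap_ = nullptr;

  IntVec() = default;
  IntVec(const IntVec&) = delete;
  IntVec& operator=(const IntVec&) = delete;
  ~IntVec() { std::free(begin_); }

  std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }

  void push_back(int v) {
    int* end = end_;           // load the member into a local once
    if (end != cap_) {
      *end = v;                // fast path: a single store for the value
      ++end;
    } else {
      end = grow_and_push(v);  // slow path returns the new end pointer
    }
    end_ = end;                // explicit store back, no reload needed
    assert(end_ <= cap_);
  }

  // Slow path: reallocate, append v, and return the new end pointer.
  int* grow_and_push(int v) {
    const std::size_t n = size();
    const std::size_t new_cap = n ? 2 * n : 4;
    int* p = static_cast<int*>(std::realloc(begin_, new_cap * sizeof(int)));
    if (p == nullptr) std::abort();
    begin_ = p;
    cap_ = p + new_cap;
    p[n] = v;
    return p + n + 1;
  }
};
```

Calling `push_back` in a loop on such a type is the shape of code the
assembly listings above were taken from.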
Differential Revision: https://reviews.llvm.org/D80588
This reverts commit 390ac823178fc1073612b4c8a38835f441138d9d.
It broke the sphinx publish bots for our documentation:
https://lab.llvm.org/buildbot/#/builders/242/builds/1130
because that machine has GCC 9.4.0, which does not know about C++23.
This commit does a pass of clang-format over files in libc++ that
don't require major changes to conform to our style guide, or for
which we're not overly concerned about conflicting with in-flight
patches or hindering the git blame.
This roughly covers:
- benchmarks
- range algorithms
- concepts
- type traits
I did a manual verification of all the changes, and in particular I
applied clang-format on/off annotations in a few places where the
result was less readable after than before. This was not necessary
in a lot of places, however I did find that clang-format had pretty
bad taste when it comes to formatting concepts.
Differential Revision: https://reviews.llvm.org/D153140
The optimizer is petulant and temperamental. In this case LLVM failed to lower
the "insert at end" loop used by `vector<unsigned char>` to a `memset` despite
`memset` being substantially faster over a range of bytes.
LLVM has the ability to lower loops to `memset` when appropriate, but the
odd nature of libc++'s loops prevented the optimization from taking place.
This patch addresses the issue by rewriting the loops from the form
`do { ... --__n; } while (__n > 0);` to instead use a for loop over a pointer
range (for example, `for (auto *__i = ...; __i < __e; ++__i)`).
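The two loop shapes can be sketched side by side. The function names are
hypothetical and the bodies are simplified to zero-filling bytes; whether a
given compiler version turns either loop into a `memset` call is not
asserted here.

```cpp
#include <cassert>
#include <cstddef>

// Old shape (hypothetical helper): count-driven do/while. Requires n > 0.
inline unsigned char* construct_at_end_counted(unsigned char* pos,
                                               std::size_t n) {
  assert(n > 0);
  do {
    *pos = 0;  // value-initialize one byte
    ++pos;
    --n;
  } while (n > 0);
  return pos;  // new end pointer
}

// New shape (hypothetical helper): for loop over a pointer range.
// Also handles n == 0 without a separate check.
inline unsigned char* construct_at_end_range(unsigned char* pos,
                                             std::size_t n) {
  unsigned char* const last = pos + n;
  for (; pos < last; ++pos)
    *pos = 0;  // value-initialize one byte
  return pos;  // new end pointer, equal to last
}
```

Both functions write the same bytes; only the loop structure presented to
the optimizer differs.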
This patch also rewrites the asan annotations to unpoison all additional memory
at the start of the loop instead of once per iteration. This could potentially
permit false negatives where the constructor of element N attempts to access
element N + 1 during its construction.
The before and after results for the `BM_ConstructSize/vector_byte/5140480_mean`
benchmark (run 5 times) are:
--------------------------------------------------------------------------------------------
Benchmark                                            Time            CPU   Iterations
--------------------------------------------------------------------------------------------
Before
------
BM_ConstructSize/vector_byte/5140480_mean     12530140 ns    12469693 ns          N/A
BM_ConstructSize/vector_byte/5140480_median   12512818 ns    12445571 ns          N/A
BM_ConstructSize/vector_byte/5140480_stddev     106224 ns      107907 ns            5
-----
After
-----
BM_ConstructSize/vector_byte/5140480_mean       167285 ns      166500 ns          N/A
BM_ConstructSize/vector_byte/5140480_median     166749 ns      166069 ns          N/A
BM_ConstructSize/vector_byte/5140480_stddev       3242 ns        3184 ns            5
llvm-svn: 367183