Tacet 9ed20568e7
[ASan][libc++] std::basic_string annotations (#72677)
This commit introduces basic annotations for `std::basic_string`,
mirroring the approach used in `std::vector` and `std::deque`.
Initially, only long strings with the default allocator will be
annotated. Short strings (_SSO - short string optimization_) and strings
with non-default allocators will be annotated in the near future, with
separate commits dedicated to enabling them. The process will be similar
to the workflow employed for enabling annotations in `std::deque`.

**Please note**: these annotations function effectively only when libc++
and libc++abi dylibs are instrumented (with ASan). This aligns with the
prevailing behavior of Memory Sanitizer.

To avoid breaking everything, this commit also appends
`_LIBCPP_INSTRUMENTED_WITH_ASAN` to `__config_site` whenever libc++ is
compiled with ASan. If this macro is not defined, string annotations are
not enabled. However, linking a binary that does **not** annotate
strings against a dynamic library that does annotate them is not permitted.
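
As a rough illustration of the gating described above, the following sketch shows how code could ask at compile time whether string annotations are expected to be active. The helper names (`tu_built_with_asan`, `libcxx_instrumented_with_asan`, `string_annotations_expected`) are hypothetical and not part of libc++; only the `_LIBCPP_INSTRUMENTED_WITH_ASAN` macro comes from this commit, and the exact condition libc++ uses internally may differ.

```cpp
#include <cassert>
#include <string> // on libc++, including any header also pulls in __config_site

// True when this translation unit is itself compiled with ASan.
// GCC defines __SANITIZE_ADDRESS__; Clang exposes __has_feature(address_sanitizer).
constexpr bool tu_built_with_asan() {
#if defined(__SANITIZE_ADDRESS__)
  return true;
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
  return true;
#  else
  return false;
#  endif
#else
  return false;
#endif
}

// True when the libc++ configuration reports that the library was built with ASan.
// On other standard libraries the macro is simply absent.
constexpr bool libcxx_instrumented_with_asan() {
#if defined(_LIBCPP_INSTRUMENTED_WITH_ASAN)
  return true;
#else
  return false;
#endif
}

// Assumption for this sketch: annotations need both conditions to hold.
constexpr bool string_annotations_expected() {
  return tu_built_with_asan() && libcxx_instrumented_with_asan();
}
```

A test helper could use such a check to skip annotation assertions when the library is not instrumented.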

Originally proposed here: https://reviews.llvm.org/D132769

Related patches on Phabricator:
- Turning on annotations for short strings:
https://reviews.llvm.org/D147680
- Turning on annotations for all allocators:
https://reviews.llvm.org/D146214

This PR is part of a series of patches extending AddressSanitizer's C++
container overflow detection by adding annotations similar to those
already present in `std::vector` and `std::deque`. The annotations let
ASan detect accesses to memory that lies inside a container's internal
allocation but is not currently in use. This covers accesses before or
after the stored elements in `std::deque`, and accesses between a
`std::basic_string`'s size (including the null terminator) and its
capacity.
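
The idea can be sketched with a small string-like buffer that marks everything past its null terminator as unaddressable. This is only an illustration under stated assumptions, not the libc++ implementation: `AnnotatedCharBuffer` and `SKETCH_HAS_ASAN` are hypothetical names, and the annotation call compiles to a no-op when the code is not built with ASan.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAS_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAS_ASAN 1
#  endif
#endif
#ifndef SKETCH_HAS_ASAN
#  define SKETCH_HAS_ASAN 0
#endif

#if SKETCH_HAS_ASAN
#  include <sanitizer/common_interface_defs.h>
#endif

// Hypothetical fixed-capacity character buffer (no reallocation in this sketch).
// Addressable part: [data_, data_ + size_ + 1), i.e. the characters plus the terminator.
// Poisoned part:    [data_ + size_ + 1, data_ + capacity_ + 1).
class AnnotatedCharBuffer {
public:
  explicit AnnotatedCharBuffer(std::size_t capacity)
      : data_(new char[capacity + 1]), size_(0), capacity_(capacity) {
    data_[0] = '\0';
    // A fresh allocation is fully addressable; shrink that to just the terminator.
    annotate(data_ + capacity_ + 1, data_ + size_ + 1);
  }

  ~AnnotatedCharBuffer() {
    // Make the whole allocation addressable again before handing it back.
    annotate(data_ + size_ + 1, data_ + capacity_ + 1);
    delete[] data_;
  }

  AnnotatedCharBuffer(const AnnotatedCharBuffer&)            = delete;
  AnnotatedCharBuffer& operator=(const AnnotatedCharBuffer&) = delete;

  // Appends c if there is room; returns false when the buffer is full.
  bool push_back(char c) {
    if (size_ == capacity_)
      return false;
    // Grow the addressable part by one byte before writing the new terminator.
    annotate(data_ + size_ + 1, data_ + size_ + 2);
    data_[size_] = c;
    ++size_;
    data_[size_] = '\0';
    return true;
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }
  const char* c_str() const { return data_; }

private:
  // Moves the addressable/poisoned boundary from old_mid to new_mid.
  void annotate(const char* old_mid, const char* new_mid) const {
#if SKETCH_HAS_ASAN
    __sanitizer_annotate_contiguous_container(data_, data_ + capacity_ + 1, old_mid, new_mid);
#else
    (void)old_mid;
    (void)new_mid;
#endif
  }

  char* data_;
  std::size_t size_;
  std::size_t capacity_;
};
```

In an ASan build, reading `c_str()[size() + 1]` while `size() < capacity()` would be expected to produce a container-overflow report, even though the byte belongs to the buffer's own allocation.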

The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equal` function. The call took three iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) and a custom comparison
function. When the `iter1` range was longer than the `iter2` string, an
out-of-bounds read could occur on the `iter2` object. With these
annotations enabled, container sanitization would flag this read.
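
The pattern behind that bug can be sketched as follows. This is a reconstruction from the description above, not the code from the affected project; `same_char_ignoring_case` and `equal_ignoring_case` are hypothetical names. The three-iterator overload appears only in a comment because calling it with a longer first range is undefined behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical custom comparison: characters match regardless of case.
inline bool same_char_ignoring_case(char a, char b) {
  return std::tolower(static_cast<unsigned char>(a)) ==
         std::tolower(static_cast<unsigned char>(b));
}

// Risky form (shown for reference only):
//
//   std::equal(lhs.begin(), lhs.end(), rhs.begin(), same_char_ignoring_case);
//
// It walks rhs for lhs.size() elements, so a longer lhs reads past the end of rhs.

// Bounded form: the four-iterator overload (C++14) also receives the end of the
// second range and returns false when the lengths differ.
inline bool equal_ignoring_case(const std::string& lhs, const std::string& rhs) {
  return std::equal(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(), same_char_ignoring_case);
}
```

When the stray read in the risky form lands between the string's size and its capacity, it stays inside the string's own heap allocation, which is the kind of access these annotations are meant to expose.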

This Pull Request introduces basic annotations for `std::basic_string`.
Long strings are structurally similar to `std::vector` and will
be annotated accordingly. Annotations for short strings are already
implemented but will be turned on separately in a forthcoming commit. See [this
comment](https://github.com/llvm/llvm-project/pull/72677#issuecomment-1850554465)
for the current SSO issues.

Due to the functionality introduced in
[D132522](dd1b7b797a),
the `__sanitizer_annotate_contiguous_container` function now offers
compatibility with all allocators. However, enabling this support will
be done in a subsequent commit. For the time being, only strings with
the default allocator will be annotated.

If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
2023-12-13 06:05:34 +01:00


//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
// <string>
// basic_string(const basic_string& str, const Allocator& alloc); // constexpr since C++20
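#include <string>
#include <cassert>
#include <cstddef>     // std::size_t
#include <cstdlib>     // std::malloc, std::free
#include <new>         // std::bad_alloc
#include <type_traits> // std::true_type
#include "test_macros.h"
#include "test_allocator.h"
#include "min_allocator.h"
#include "asan_testing.h"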
#ifndef TEST_HAS_NO_EXCEPTIONS
// Shared allocator state: allocation throws std::bad_alloc once deactivated.
struct alloc_imp {
  bool active;

  TEST_CONSTEXPR alloc_imp() : active(true) {}

  template <class T>
  T* allocate(std::size_t n) {
    if (active)
      return static_cast<T*>(std::malloc(n * sizeof(T)));
    else
      throw std::bad_alloc();
  }

  template <class T>
  void deallocate(T* p, std::size_t) {
    std::free(p);
  }
  void activate() { active = true; }
  void deactivate() { active = false; }
};
// Allocator that propagates on container copy assignment and forwards to an alloc_imp.
template <class T>
struct poca_alloc {
  typedef T value_type;
  typedef std::true_type propagate_on_container_copy_assignment;

  alloc_imp* imp;

  TEST_CONSTEXPR poca_alloc(alloc_imp* imp_) : imp(imp_) {}

  template <class U>
  TEST_CONSTEXPR poca_alloc(const poca_alloc<U>& other) : imp(other.imp) {}

  T* allocate(std::size_t n) { return imp->allocate<T>(n); }
  void deallocate(T* p, std::size_t n) { imp->deallocate(p, n); }
};

// Two poca_alloc instances compare equal when they share the same alloc_imp.
template <typename T, typename U>
bool operator==(const poca_alloc<T>& lhs, const poca_alloc<U>& rhs) {
  return lhs.imp == rhs.imp;
}

template <typename T, typename U>
bool operator!=(const poca_alloc<T>& lhs, const poca_alloc<U>& rhs) {
  return lhs.imp != rhs.imp;
}
// Expects the assignment to throw std::bad_alloc; completing it is a test failure.
template <class S>
TEST_CONSTEXPR_CXX20 void test_assign(S& s1, const S& s2) {
  try {
    s1 = s2;
  } catch (std::bad_alloc&) {
    return;
  }
  assert(false);
}
#endif
// Copy-constructs s2 from s1 with allocator a, then checks value, allocator,
// and ASan annotations of both strings.
template <class S>
TEST_CONSTEXPR_CXX20 void test(S s1, const typename S::allocator_type& a) {
  S s2(s1, a);
  LIBCPP_ASSERT(s2.__invariants());
  assert(s2 == s1);
  assert(s2.capacity() >= s2.size());
  assert(s2.get_allocator() == a);
  LIBCPP_ASSERT(is_string_asan_correct(s1));
  LIBCPP_ASSERT(is_string_asan_correct(s2));
}

// Runs the check on an empty string, a one-character string, and a 70-character string.
template <class Alloc>
TEST_CONSTEXPR_CXX20 void test_string(const Alloc& a) {
  typedef std::basic_string<char, std::char_traits<char>, Alloc> S;
  test(S(), Alloc(a));
  test(S("1"), Alloc(a));
  test(S("1234567890123456789012345678901234567890123456789012345678901234567890"), Alloc(a));
}
TEST_CONSTEXPR_CXX20 bool test() {
  test_string(std::allocator<char>());
  test_string(test_allocator<char>());
  test_string(test_allocator<char>(3));
#if TEST_STD_VER >= 11
  test_string(min_allocator<char>());
  test_string(safe_allocator<char>());
#endif

#if TEST_STD_VER >= 11
# ifndef TEST_HAS_NO_EXCEPTIONS
  if (!TEST_IS_CONSTANT_EVALUATED) {
    typedef poca_alloc<char> A;
    typedef std::basic_string<char, std::char_traits<char>, A> S;
    const char* p1 = "This is my first string";
    const char* p2 = "This is my second string";

    alloc_imp imp1;
    alloc_imp imp2;
    S s1(p1, A(&imp1));
    S s2(p2, A(&imp2));

    assert(s1 == p1);
    assert(s2 == p2);

    // With imp2 deactivated, s1 = s2 must throw and leave both strings unchanged.
    imp2.deactivate();
    test_assign(s1, s2);
    assert(s1 == p1);
    assert(s2 == p2);
  }
# endif
#endif

  return true;
}
int main(int, char**) {
  test();
#if TEST_STD_VER > 17
  static_assert(test());
#endif

  return 0;
}