Tacet 9ed20568e7
[ASan][libc++] std::basic_string annotations (#72677)
This commit introduces basic annotations for `std::basic_string`,
mirroring the approach used in `std::vector` and `std::deque`.
Initially, only long strings with the default allocator will be
annotated. Short strings (_SSO - short string optimization_) and strings
with non-default allocators will be annotated in the near future, with
separate commits dedicated to enabling them. The process will be similar
to the workflow employed for enabling annotations in `std::deque`.

**Please note**: these annotations function effectively only when libc++
and libc++abi dylibs are instrumented (with ASan). This aligns with the
prevailing behavior of Memory Sanitizer.

To avoid breaking everything, this commit also appends
`_LIBCPP_INSTRUMENTED_WITH_ASAN` to `__config_site` whenever libc++ is
compiled with ASan. If this macro is not defined, string annotations are
not enabled. However, linking a binary that does **not** annotate
strings against a dynamic library that does annotate them is not permitted.

Originally proposed here: https://reviews.llvm.org/D132769

Related patches on Phabricator:
- Turning on annotations for short strings:
https://reviews.llvm.org/D147680
- Turning on annotations for all allocators:
https://reviews.llvm.org/D146214

This PR is a part of a series of patches extending AddressSanitizer C++
container overflow detection capabilities by adding annotations, similar
to those existing in `std::vector` and `std::deque` collections. These
enhancements empower ASan to effectively detect instances where the
instrumented program attempts to access memory within a collection's
internal allocation that remains unused. This includes cases where
access occurs before or after the stored elements in `std::deque`, or
between the `std::basic_string`'s size (including the null terminator)
and capacity bounds.
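
As a rough illustration of the region described above, the sketch below computes how many bytes of a string's allocation lie past the null terminator. The helper name `unused_tail_size` is hypothetical and is not part of libc++; the sketch only assumes the standard guarantee that `capacity() >= size()`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a libc++ API): number of allocated characters that
// lie after the null terminator, i.e. between size() + 1 and capacity() + 1.
inline std::size_t unused_tail_size(const std::string& s) {
  return s.capacity() - s.size();
}

// An access such as the following stays inside the allocation, so plain ASan
// heap checks would likely not notice it. It is the kind of read these
// annotations are meant to flag once they cover the string in question:
//
//   std::string s = "hello";
//   s.reserve(64);
//   char c = s.data()[s.size() + 1]; // past the terminator, before capacity
```

The commented-out read is left disabled on purpose, since executing it is undefined behavior.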

The introduction of these annotations was spurred by a real-world
software bug discovered by Trail of Bits, involving an out-of-bounds
memory access during the comparison of two strings using the
`std::equal` function. This function took three iterators
(`iter1_begin`, `iter1_end`, `iter2_begin`) and a custom comparison
function. When the `iter1` range was longer than the `iter2` string, an
out-of-bounds read could occur on the `iter2` object. With these
annotations enabled, container sanitization would identify and flag
this potential vulnerability.
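
The pattern can be sketched as follows. This is not the code from the original report: the comparator `same_char_ignoring_case` and the wrapper `equal_ignoring_case` are hypothetical names chosen for illustration. The risky three-iterator call is shown only as a comment, next to the bounded four-iterator overload of `std::equal` (available since C++14) that avoids the over-read.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical custom comparison, standing in for the one in the report.
inline bool same_char_ignoring_case(char a, char b) {
  return std::tolower(static_cast<unsigned char>(a)) ==
         std::tolower(static_cast<unsigned char>(b));
}

// Risky shape: the three-iterator overload assumes the second range is at
// least as long as the first, so a longer lhs reads past the end of rhs.
//
//   std::equal(lhs.begin(), lhs.end(), rhs.begin(), same_char_ignoring_case);

// Bounded shape: passing rhs.end() lets std::equal stop at the shorter range.
inline bool equal_ignoring_case(const std::string& lhs, const std::string& rhs) {
  return std::equal(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                    same_char_ignoring_case);
}
```

With the bounded overload, strings of different lengths simply compare unequal, so no read outside either string takes place.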

This Pull Request introduces basic annotations for `std::basic_string`.
Long strings exhibit structural similarities to `std::vector` and will
be annotated accordingly. Annotations for short strings are already
implemented, but will be turned on separately in a forthcoming commit. See
[this comment](https://github.com/llvm/llvm-project/pull/72677#issuecomment-1850554465)
below for the SSO issues known at the moment.

Due to the functionality introduced in
[D132522](dd1b7b797a),
the `__sanitizer_annotate_contiguous_container` function now offers
compatibility with all allocators. However, enabling this support will
be done in a subsequent commit. For the time being, only strings with
the default allocator will be annotated.

If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
2023-12-13 06:05:34 +01:00


//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
// <string>
// basic_string substr(size_type pos = 0, size_type n = npos) const;  // constexpr since C++20, removed in C++23
// basic_string substr(size_type pos = 0, size_type n = npos) const&; // since C++23
#include <string>
#include <stdexcept>
#include <algorithm>
#include <cassert>
#include <utility>
#include "test_allocator.h"
#include "test_macros.h"
#include "min_allocator.h"
#include "asan_testing.h"
template <class S>
TEST_CONSTEXPR_CXX20 void test(const S& s, typename S::size_type pos, typename S::size_type n) {
  if (pos <= s.size()) {
    S str = s.substr(pos, n);
    LIBCPP_ASSERT(str.__invariants());
    assert(pos <= s.size());
    typename S::size_type rlen = std::min(n, s.size() - pos);
    assert(str.size() == rlen);
    assert(S::traits_type::compare(s.data() + pos, str.data(), rlen) == 0);
    LIBCPP_ASSERT(is_string_asan_correct(s));
    LIBCPP_ASSERT(is_string_asan_correct(str));
  }
#ifndef TEST_HAS_NO_EXCEPTIONS
  else if (!TEST_IS_CONSTANT_EVALUATED) {
    try {
      S str = s.substr(pos, n);
      assert(false);
    } catch (std::out_of_range&) {
      assert(pos > s.size());
    }
  }
#endif
}
template <class S>
TEST_CONSTEXPR_CXX20 void test_string() {
  test(S(""), 0, 0);
  test(S(""), 1, 0);
  test(S("pniot"), 0, 0);
  test(S("htaob"), 0, 1);
  test(S("fodgq"), 0, 2);
  test(S("hpqia"), 0, 4);
  test(S("qanej"), 0, 5);
  test(S("dfkap"), 1, 0);
  test(S("clbao"), 1, 1);
  test(S("ihqrf"), 1, 2);
  test(S("mekdn"), 1, 3);
  test(S("ngtjf"), 1, 4);
  test(S("srdfq"), 2, 0);
  test(S("qkdrs"), 2, 1);
  test(S("ikcrq"), 2, 2);
  test(S("cdaih"), 2, 3);
  test(S("dmajb"), 4, 0);
  test(S("karth"), 4, 1);
  test(S("lhcdo"), 5, 0);
  test(S("acbsj"), 6, 0);
  test(S("pbsjikaole"), 0, 0);
  test(S("pcbahntsje"), 0, 1);
  test(S("mprdjbeiak"), 0, 5);
  test(S("fhepcrntko"), 0, 9);
  test(S("eqmpaidtls"), 0, 10);
  test(S("joidhalcmq"), 1, 0);
  test(S("omigsphflj"), 1, 1);
  test(S("kocgbphfji"), 1, 4);
  test(S("onmjekafbi"), 1, 8);
  test(S("fbslrjiqkm"), 1, 9);
  test(S("oqmrjahnkg"), 5, 0);
  test(S("jeidpcmalh"), 5, 1);
  test(S("schfalibje"), 5, 2);
  test(S("crliponbqe"), 5, 4);
  test(S("igdscopqtm"), 5, 5);
  test(S("qngpdkimlc"), 9, 0);
  test(S("thdjgafrlb"), 9, 1);
  test(S("hcjitbfapl"), 10, 0);
  test(S("mgojkldsqh"), 11, 0);
  test(S("gfshlcmdjreqipbontak"), 0, 0);
  test(S("nadkhpfemgclosibtjrq"), 0, 1);
  test(S("nkodajteqplrbifhmcgs"), 0, 10);
  test(S("ofdrqmkeblthacpgijsn"), 0, 19);
  test(S("gbmetiprqdoasckjfhln"), 0, 20);
  test(S("bdfjqgatlksriohemnpc"), 1, 0);
  test(S("crnklpmegdqfiashtojb"), 1, 1);
  test(S("ejqcnahdrkfsmptilgbo"), 1, 9);
  test(S("jsbtafedocnirgpmkhql"), 1, 18);
  test(S("prqgnlbaejsmkhdctoif"), 1, 19);
  test(S("qnmodrtkebhpasifgcjl"), 10, 0);
  test(S("pejafmnokrqhtisbcdgl"), 10, 1);
  test(S("cpebqsfmnjdolhkratgi"), 10, 5);
  test(S("odnqkgijrhabfmcestlp"), 10, 9);
  test(S("lmofqdhpkibagnrcjste"), 10, 10);
  test(S("lgjqketopbfahrmnsicd"), 19, 0);
  test(S("ktsrmnqagdecfhijpobl"), 19, 1);
  test(S("lsaijeqhtrbgcdmpfkno"), 20, 0);
  test(S("dplqartnfgejichmoskb"), 21, 0);
  test(S("gbmetiprqdoasckjfhlnxx"), 0, 22);
  test(S("gbmetiprqdoasckjfhlnxa"), 0, 8);
  test(S("gbmetiprqdoasckjfhlnxb"), 1, 0);
  test(S("LONGtiprqdoasckjfhlnxxo"), 0, 23);
  test(S("LONGtiprqdoasckjfhlnxap"), 0, 8);
  test(S("LONGtiprqdoasckjfhlnxbl"), 1, 0);
  test(S("LONGtiprqdoasckjfhlnxxyy"), 0, 24);
  test(S("LONGtiprqdoasckjfhlnxxyr"), 0, 8);
  test(S("LONGtiprqdoasckjfhlnxxyz"), 1, 0);
}
TEST_CONSTEXPR_CXX20 bool test() {
  test_string<std::string>();
#if TEST_STD_VER >= 11
  test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
  test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
#endif
  return true;
}
TEST_CONSTEXPR_CXX20 bool test_alloc() {
  {
    using alloc  = test_allocator<char>;
    using string = std::basic_string<char, std::char_traits<char>, alloc>;
    test_allocator_statistics stats;
    {
      string str = string(alloc(&stats));
      stats      = test_allocator_statistics();
      (void)str.substr();
      assert(stats.moved == 0);
      assert(stats.copied == 0);
    }
    {
      string str = string(alloc(&stats));
      stats      = test_allocator_statistics();
      (void)std::move(str).substr();
      assert(stats.moved == 0);
      assert(stats.copied == 0);
    }
  }
  return true;
}
int main(int, char**) {
  test();
  test_alloc();
#if TEST_STD_VER > 17
  static_assert(test());
  static_assert(test_alloc());
#endif
  return 0;
}