139 Commits

Author SHA1 Message Date
Nick Desaulniers
8bb9414f14 [Demangle] use std::string_view::data rather than &*std::string_view::begin
To fix expensive check builds that were failing when using MSVC's
std::string_view::iterator::operator*, I added a few expressions like
&*std::string_view::begin.  @nico pointed out that this is literally the
same thing and more clearly expressed as std::string_view::data.

Link: https://github.com/llvm/llvm-project/issues/63740

Reviewed By: #libc_abi, ldionne, philnik, MaskRay

Differential Revision: https://reviews.llvm.org/D154876
2023-07-13 10:20:09 -07:00
Nick Desaulniers
6641d2b7df [MicrosoftDemangle] fix warn-trailing false positive
A follow up to commit 6bad76c7ae93 ("[Demangle] fix windows tests")
based on @thakis' report.

Fixes: #63740

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D154875
2023-07-11 09:25:52 -07:00
Nick Desaulniers
db98ac0827 [Demangle] convert microsoftDemangle to take a std::string_view
This should be last of the "bottom-up conversions" of various demanglers
to accept std::string_view.  After this, D149104 may be revisited.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D152176
2023-06-05 13:00:20 -07:00
Nick Desaulniers
c2709fcb0a [Demangle] remove unused params of microsoftDemangle
No call sites use these parameters, so drop them.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148940
2023-04-21 15:37:00 -07:00
Nick Desaulniers
6bad76c7ae [Demangle] fix windows tests
My reland of https://reviews.llvm.org/D148546 has caused a few windows
demangler tests to fail when run with -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON
on windows.

I have a sneaking suspicion that MSVC's
std::string_view::iterator::operator* may be missing a nullptr check.

Link: https://lab.llvm.org/buildbot/#/builders/42/builds/9723/steps/7/logs/stdio

Reviewed By: ayzhao

Differential Revision: https://reviews.llvm.org/D148852
2023-04-20 16:41:09 -07:00
Nick Desaulniers
7c59e8001a Reland: [Demangle] replace use of llvm::StringView w/ std::string_view
This reverts commit d81cdb49d74064e88843733e7da92db865943509.

This refactoring was waiting on converting LLVM to C++17.

Leave StringView.h and cleanup around for subsequent cleanup.

Additional fixes for missing std::string_view conversions for MSVC.

Reviewed By: MaskRay, DavidSpickett, ayzhao

Differential Revision: https://reviews.llvm.org/D148546
2023-04-20 11:22:20 -07:00
Fangrui Song
d81cdb49d7 Revert D148384 "[Demangle] replace use of llvm::StringView w/ std::string_view"
This reverts commit 3e559509b426b6aae735a7f57dbdaed1041d2622 and e0c4ffa796b553fa78c638a9584c05ac21fe07d5.

This still breaks Windows builds.

In addition, `#include <llvm/ADT/StringViewExtras.h>` in
llvm/include/llvm/Demangle/ItaniumDemangle.h is a library layering violation
(LLVMDemangle is the lowest LLVM library and cannot depend on LLVMSupport).
2023-04-14 18:42:11 -07:00
Nick Desaulniers
e0c4ffa796 [Demangle] fix windows build
Fixes diagnostics reported against
https://reviews.llvm.org/D148384

https://lab.llvm.org/buildbot/#/builders/127/builds/46749/steps/4/logs/stdio

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148392
2023-04-14 16:25:14 -07:00
Nick Desaulniers
3e559509b4 [Demangle] replace use of llvm::StringView w/ std::string_view
This refactoring was waiting on converting LLVM to C++17.

Leave StringView.h and cleanup around for subsequent cleanup.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148384
2023-04-14 15:48:38 -07:00
Nick Desaulniers
5e53e1bbc3 [StringView] remove consumeFront
Towards converting our use of llvm::StringView to std::string_view,
remove a method that std::string_view doesn't have.

This could be moved to the nascent llvm/ADT/StringViewExtras.h, but the
use is highly localized to one TU. Move this to be a static function
there.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148375
2023-04-14 15:34:24 -07:00
Nick Desaulniers
ee6abfc5ea [StringView] remove popFront
Towards converting our use of llvm::StringView to std::string_view,
remove a method that std::string_view doesn't have.

llvm::StringView::popFront is similar to std::string_view::remove_prefix
but with a reference to std::string_view::front taken first.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148363
2023-04-14 13:13:46 -07:00
Nick Desaulniers
0f9f03afca [StringView] remove ctor incompatible with std::string_view
Towards replacing llvm::StringView with std::string_view, remove ctor
that std::string_view doesn't have an analog for.

Reviewed By: erichkeane, MaskRay

Differential Revision: https://reviews.llvm.org/D148353
2023-04-14 11:16:11 -07:00
Nick Desaulniers
f198e0b594 [StringView] remove dropFront
Towards converting our use of llvm::StringView to std::string_view,
remove a method that std::string_view doesn't have.
llvm::StringView::dropFront is semantically similar to
std::string_view::substr but with the input clamped to the size. No code
was relying on clamping other than the rust demangler, which I fixed in
https://reviews.llvm.org/D148272.  Removing this method makes it easier
to switch over code later.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148348
2023-04-14 10:24:10 -07:00
Fangrui Song
3ece37b3fa [Demangle] Remove uses of llvm::itanium_demangle::StringView::{dropBack,dropFront}. NFC
Make it easier to migrate StringView to std::string_view.
2023-04-13 15:09:25 -07:00
Nathan Sidwell
d3b10150b6 [demangler] Simplify OutputBuffer initialization
Every non-testcase use of OutputBuffer contains code to allocate an
initial buffer (using either 128 or 1024 as initial guesses). There's
now no need to do that, given recent changes to the buffer extension
heuristics -- it allocates a 1k(ish) buffer on first need.

Just pass in a buffer (if any) to the constructor.  Thus the
OutputBuffer's ownership of the buffer starts at its own lifetime
start. We can reduce the lifetime of this object in several cases.

That new constructor takes a 'size_t *' for the size argument, as all
uses with a non-null buffer are passing through a malloc'd buffer from
their own caller in this manner.

The buffer reset member function is never used, and is deleted.

Some adjustment to a couple of uses is needed, due to the lazy buffer
creation of this patch.

a) the Microsoft demangler can demangle empty strings to nothing,
which it then memoizes.  We need to avoid the UB of passing nullptr to
memcpy.

b) a unit test checks insertion of no characters into an empty buffer.
We need to avoid UB when converting that to std::string.

The original buffer initialization code would return a failure code if
that first malloc failed.  Existing code either ignored that, called
std::terminate with a FIXME, or returned an error code.

But that's not foolproof anyway, as a subsequent buffer extension
failure ends up calling std::terminate. I am working on addressing
that unfortunate failure mode in a manner more consistent with the C++
ABI design.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122604
2022-10-17 04:23:16 -07:00
Zequan Wu
bf1e96d620 [MicrosoftDemangle] Set error to true when returning nullptr. 2022-06-08 17:18:09 -07:00
Kirill Stoimenov
aabeb5eb7f Revert "[demangler] Simplify OutputBuffer initialization"
Reverting due to a bot failure:
https://lab.llvm.org/buildbot/#/builders/5/builds/22738

This reverts commit 5b3ca24a35e91bf9c19af856e7f92c69b17f989e.
2022-04-26 20:24:06 +00:00
Nathan Sidwell
5b3ca24a35 [demangler] Simplify OutputBuffer initialization
Every non-testcase use of OutputBuffer contains code to allocate an
initial buffer (using either 128 or 1024 as initial guesses). There's
now no need to do that, given recent changes to the buffer extension
heuristics -- it allocates a 1k(ish) buffer on first need.

Just pass in a buffer (if any) to the constructor.  Thus the
OutputBuffer's ownership of the buffer starts at its own lifetime
start. We can reduce the lifetime of this object in several cases.

That new constructor takes a 'size_t *' for the size argument, as all
uses with a non-null buffer are passing through a malloc'd buffer from
their own caller in this manner.

The buffer reset member function is never used, and is deleted.

The original buffer initialization code would return a failure code if
that first malloc failed.  Existing code either ignored that, called
std::terminate with a FIXME, or returned an error code.

But that's not foolproof anyway, as a subsequent buffer extension
failure ends up calling std::terminate. I am working on addressing
that unfortunate failure mode in a manner more consistent with the C++
ABI design.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122604
2022-04-26 04:23:12 -07:00
Nathan Sidwell
1066e397fa [demangler] Add StringView conversion operator
The OutputBuffer class tries to present a NUL-terminated string API to
consumers.  But several of them would prefer a StringView.  In
particular the Microsoft demangler, juggles between NUL-terminated and
StringView, which is confusing.

This adds a StringView conversion, and adjusts the Demanglers that can
benefit from that.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D120990
2022-03-28 11:19:55 -07:00
Nathan Sidwell
d1587c38e6 [llvm] Fix string copy confusion
The microsoft demangler makes copies of the demangled strings, but has
some confusion between StringView representation (sans NUL), and
C-strings (with NUL).  Here we also have a use of strcpy, which
happens to work because the incoming string view happens to have a
trailing NUL.  But a simple memcpy excluding the NUL is sufficient.

Reviewed By: dblaikie, erichkeane

Differential Revision: https://reviews.llvm.org/D122391
2022-03-28 09:37:36 -07:00
Luís Ferreira
2e97236aac [Demangle] Rename OutputStream to OutputString
This patch is a refactor to implement prepend afterwards. Since this changes a lot of files and to conform with guidelines, I will separate this from the implementation of prepend. Related to the discussion in https://reviews.llvm.org/D111414 , so please read it for more context.

Reviewed By: #libc_abi, dblaikie, ldionne

Differential Revision: https://reviews.llvm.org/D111947
2021-10-21 17:34:57 -07:00
Lasse Folger
134e1817f6 [lldb] change name demangling to be consistent between windows and linx
When printing names in lldb on windows these names contain the full type information while on linux only the name is contained.

This change introduces a flag in the Microsoft demangler to control if the type information should be included.
With the flag enabled demangled name contains only the qualified name, e.g:
without flag -> with flag
int (*array2d)[10] -> array2d
int (*abc::array2d)[10] -> abc::array2d
const int *x -> x

For globals there is a second inconsistency which is not yet addressed by this change. On linux globals (in global namespace) are prefixed with :: while on windows they are not.

Reviewed By: teemperor, rnk

Differential Revision: https://reviews.llvm.org/D111715
2021-10-19 12:04:37 +02:00
Saleem Abdulrasool
9c2de23821 Demangle: correct swift_async demangling for Microsoft scheme
The emission was corrected for the swift_async calling convention but
the demangling support was not.  This repairs the demangling support as
well.
2021-07-14 11:43:44 -07:00
Varun Gandhi
92dcb1d2db [Clang] Introduce Swift async calling convention.
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D95561
2021-07-09 11:50:10 -07:00
Varun Gandhi
44f792966e [Demangle] Support demangling Swift calling convention in MS demangler.
Previously, Clang was able to mangle the Swift calling
convention but 'MicrosoftDemangle.cpp' was not able to demangle it.

Reviewed By: compnerd, rnk

Differential Revision: https://reviews.llvm.org/D95053
2021-01-27 13:24:54 -08:00
Nico Weber
bc1c3655bf Give microsoftDemangle() an outparam for how many input bytes were consumed.
Demangling Itanium symbols either consumes the whole input or fails,
but Microsoft symbols can be successfully demangled with just some
of the input.

Add an outparam that enables clients to know how much of the input was
consumed, and use this flag to give llvm-undname an opt-in warning
on partially consumed symbols.

Differential Revision: https://reviews.llvm.org/D80173
2020-05-20 16:17:31 -04:00
David Blaikie
75bbbeec74 Revert "Add some missing includes to MicrosoftDemangle.cpp (PR44217)"
This reverts commit 9b962d83ece841e43fd2823375dc6ddc94c1b178.

This didn't address the underlying issue (in MicrosoftDemangleNodes.h)
that was fixed 6 months ago anyway.
2019-12-04 11:10:07 -08:00
David Blaikie
9b962d83ec Add some missing includes to MicrosoftDemangle.cpp (PR44217) 2019-12-04 08:41:08 -08:00
Martin Storsjo
da92ed8365 [Demangle] Add a few more options to the microsoft demangler
This corresponds to commonly used options to UnDecorateSymbolName
within llvm.

Add them as hidden options in llvm-undname. MS undname.exe takes
numeric flags, corresponding to the UNDNAME_* constants, but instead
of hardcoding in mappings for those numbers, just add textual
options instead, as it the use of them here is primarily intended
for testing.

Differential Revision: https://reviews.llvm.org/D68917

llvm-svn: 374865
2019-10-15 08:29:56 +00:00
Simon Pilgrim
d2a3e89877 Fix uninitialized variable warning. NFCI.
llvm-svn: 373450
2019-10-02 11:48:45 +00:00
Nico Weber
da298aa913 llvm-undname: Add support for demangling typeinfo names
typeinfo names aren't symbols but string constant contents
stored in compiler-generated typeinfo objects, but llvm-cxxfilt
can demangle these for Itanium names.

In the MSVC ABI, these are just a '.' followed by a mangled
type -- this means they don't start with '?' like all MS-mangled
symbols do.

Differential Revision: https://reviews.llvm.org/D67851

llvm-svn: 372602
2019-09-23 13:13:37 +00:00
Nico Weber
1dce82636c llvm-undname: Correctly demangle vararg parameters
FunctionSignatureNode already had an IsVariadic field,
but it wasn't used anywhere yet. Set it and use it.

llvm-svn: 362541
2019-06-04 19:10:08 +00:00
Nico Weber
4638548468 llvm-undname: More coverage-related cleanups
- The loop in demangleFunctionParameterList() only exits
  on Error, @, and Z. All 3 cases were handled, so the
  rest of the function is DEMANGLE_UNREACHABLE.

- The loop in demangleTemplateParameterList() always returns
  on Error, so there's no need to check for that in the loop
  header and after the loop.

- Add test cases for invalid function parameter manglings.

- Add a (redundant) test case for a simple template parameter
  list mangling.

- Add a test case pointing out that varargs functions aren't
  demangled correctly.

llvm-svn: 362540
2019-06-04 18:49:05 +00:00
Nico Weber
878df1c2a9 llvm-undname: Add test coverage for demangleInitFiniStub()
llvm-svn: 362536
2019-06-04 18:06:28 +00:00
Nico Weber
d98a0a362f llvm-undname: Yet more coverage for error paths
- For error returns in demangleSpecialTableNode(),
  demangleLocalStaticGuard(), RTTITypeDescriptor,
  demangleRttiBaseClassDescriptorNode(), demangleUnsigned(),
  demangleUntypedVariable() (via RttiBaseClassArray)

- For ?_A and ?_P which are handled at early levels of the
  demangler but are not implemented in a later stage; this
  is now more obvious

- Replace a "default:" with an explicit list of cases, to
  get -Wswitch check we list all cases

llvm-svn: 362520
2019-06-04 16:25:28 +00:00
Nico Weber
c1a0e6fe6b llvm-undname: More no-op changes to increase test coverage
- Add test coverage around invalid anon namespaces and
  for error paths in demanglePrimitiveType() and in
  demangleFullyQualifiedTypeName()

- Use DEMANGLE_UNREACHABLE in two more unreachable places

llvm-svn: 362514
2019-06-04 15:38:00 +00:00
Nico Weber
880d21d3cb llvm-undname: Several behavior-preserving changes to increase coverage
- Replace `Error = true` in a few branches that are truly unreachable
  with DEMANGLE_UNREACHABLE

- Remove early return early in startsWithLocalScopePattern() because
  it's redundant with the next two early returns

- Remove unreachable `case '0'` (it's handled in the branch below)

- Remove an unused bool return

- Add test coverage for several early error returns, mostly in
  array type parsing

llvm-svn: 362506
2019-06-04 15:13:30 +00:00
Nico Weber
54362477c7 llvm-undname; Add more test coverage for demangleFunctionClass()
Also add two FC_Far that seem to be missing, by symmetry from
the public and protected cases. (But FC_Far isn't really a thing
anymore, so this doesn't really have an observable effect.)

llvm-svn: 362344
2019-06-02 23:26:57 +00:00
Nico Weber
b5cd6163f4 Remove code path that's dead after r358835
llvm-svn: 362333
2019-06-02 17:41:07 +00:00
Nico Weber
a2ca6e7803 llvm-undname: Support demangling char8_t
Ports clang's mangling support added in r354633 to llvm-undname.

llvm-svn: 361839
2019-05-28 15:30:04 +00:00
Nico Weber
88ab281b4d llvm-undname: Add support for local static thread guards
llvm-svn: 361835
2019-05-28 14:54:49 +00:00
Nico Weber
f83c39e53f llvm-undname: Remove unreachable statement
llvm-svn: 361786
2019-05-28 01:20:36 +00:00
Nico Weber
82dc06c340 llvm-undname: Extract demangleMD5Name() method; no behavior change
llvm-svn: 361783
2019-05-27 23:10:42 +00:00
Nico Weber
cfe08bc7d6 llvm-undname: Make demangling of MD5 names more robust
Demangler::parse() for MD5 names would:

1. Put all remaining text into the MD5 name sight unseen
2. Not modify MangledName

This meant that if the demangler recursively called parse() (e.g. in
demangleLocallyScopedNamePiece()), every recursive call that started on
an MD5 name would add all remaining bytes to the output buffer but
only advance the input by a byte.  For valid inputs, MD5 types are
never (well, see comments for 2 exceptions) nested, but for invalid
input this could cause memory use quadratic in the input size.

llvm-svn: 361744
2019-05-27 00:48:59 +00:00
Nico Weber
09fb2029e5 llvm-undname: Fix an assert-on-invalid, found by oss-fuzz
If a template parameter refers to a pointer to member, but the mangling
of that was a string literal instead of a real symbol, llvm-undname used
to crash instead of rejecting the input.

llvm-svn: 361402
2019-05-22 15:53:23 +00:00
Nico Weber
8d05eb8556 llvm-undname: Fix assert-on->4GiB-string-literal, found by oss-fuzz
llvm-svn: 359109
2019-04-24 16:09:38 +00:00
Nico Weber
e8f21b1a6b llvm-undname: Support demangling the spaceship operator
Also add a test for demanling the co_await operator.

llvm-svn: 359007
2019-04-23 16:20:27 +00:00
Nico Weber
f5c7f3ad33 llvm-undname: Fix an assert-on-invalid, found by oss-fuzz
llvm-svn: 358891
2019-04-22 15:05:18 +00:00
Nico Weber
ce67a41741 llvm-undname: Fix hex escapes in wchar_t, char16_t, char32_t strings
llvm-undname used to put '\x' in front of every pair of nibbles, but
u"\xD7\xFF" produces a string with 6 bytes: \xD7 \0 \xFF \0 (and \0\0). Correct
for a single character (plus terminating \0) is u\xD7FF instead.
Now, wchar_t, char16_t, and char32_t strings roundtrip from source to
clang-cl (and cl.exe) and then llvm-undname.

(...at least as long as it's not a string like L"\xD7FF" L"foo" which
gets demangled as L"\xD7FFfoo", where the compiler then considers the
"f" as part of the hex escape. That seems ok.)

Also add a comment saying that the "almost-valid" char32_t string I
added in my last commit is actually produced by compilers.

llvm-svn: 358857
2019-04-21 17:19:27 +00:00
Nico Weber
8fc9902bbb llvm-undname: Fix stack overflow on almost-valid
If a unsigned with all 4 bytes non-0 was passed to outputHex(), there
were two off-by-ones in it:

- Both MaxPos and Pos left space for the final \0, which left the buffer
  one byte to small. Set MaxPos to 16 instead of 15 to fix.

- The `assert(Pos >= 0);` was after a `Pos--`, move it up one line.

Since valid Unicode codepoints are <= 0x10ffff, this could never really
happen in practice.

Found by oss-fuzz.

llvm-svn: 358856
2019-04-21 16:58:25 +00:00