117 Commits

Author SHA1 Message Date
Congcong Cai
a80aad2812
[YAML] fix output incorrect format for block scalar string (#132897)
After outputting block scalar string, the indent will be wrong.
This patch fixes Padding after block scalar string to ensure the correct
format of yaml.

The new added ut will fail in main.
```diff
@@ -3,4 +3,4 @@
     Just a block
     scalar doc
-scalar:          a
+  scalar:          a
 ...\n
```
2025-03-27 02:16:27 +08:00
Vitaly Buka
4f26edd5e9
[NFC][YAML] Add IO::error() (#123475)
For #123280
2025-01-23 03:56:51 -08:00
NAKAMURA Takumi
39209724e6
[YAML] Fix incorrect dash output in nested sequences (#116488)
Nested sequences could be defined but the YAML output was incorrect.
`Output::newLineCheck()` was not able to emit multiple dashes `- ` and
YAML parser sometimes didn't accept its output as the result.

This fixes for emitting corresponding dashes for consecutive
`inSeqFirstElement`, but suppresses emission to the top
`inSeqFirstElement`.

This also fixes for emitting flow elements onto nested sequences.
2024-11-30 22:10:31 +09:00
Kazu Hirata
d44ea7186b
[Support] Remove unused includes (NFC) (#116752)
Identified with misc-include-cleaner.
2024-11-20 06:51:43 -08:00
Kazu Hirata
7ee6288312
[Support] Use StringRef::operator== instead of StringRef::equals (NFC) (#91042)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.

- StringRef::operator== outnumbers StringRef::equals by a factor of 25
  under llvm/ in terms of their usage.

- The elimination of StringRef::equals brings StringRef closer to
  std::string_view, which has operator== but not equals.

- S == "foo" is more readable than S.equals("foo"), especially for
  !Long.Expression.equals("str") vs Long.Expression != "str".
2024-05-04 08:46:48 -07:00
Jannik Silvanus
5f9ae61dee
[Support][YamlTraits] Add quoting for keys in textual YAML representation (#88763)
The support library contains helpers to parse and emit YAML documents.

In the textual YAML representation, some strings need to be quoted, e.g.
when containing unprintable characters.

We already have such quoting implemented for YAML values.

This patch applies the same quoting to YAML *keys*.

One affected case is output of control registers in AMDGPU Msgpack
metadata, which are printed in a format like this:

```
   0x2cca (SPI_SHADER_PGM_RSRC1_ES): 42
```

With this patch, the key is quoted:

```
   '0x2cca (SPI_SHADER_PGM_RSRC1_ES)': 42
```

Most test changes come from this pattern.
2024-04-29 15:37:42 +02:00
akirchhoff-modular
4bf10f3da7
[YAMLTraits] Fix std::optional input on empty documents (#68947)
When the input document is non-empty, `mapOptional` works as expected,
setting `std::optional` to `std::nullopt` when the field is not present.
When the input document is empty, we hit a special case inside of
`Input::preflightKey` that results in `UseDefault = false`, which
results in the `std::optional` erroneously being set to a non-nullopt
value. `preflightKey` is changed to set `UseDefault = true` in this case
to make the behavior consistent between empty and non-empty documents.
2023-10-16 15:14:17 -05:00
Amir Ayupov
ec129c28af [YAML][NFC] Use BumpPtrAllocator instead of unique_ptrs
Avoid both memory leaks and expensive dynamic allocations by using
SpecificBumpPtrAllocator for HNode types. It's expected they're not deallocated
until the `Input` class is destroyed, so deallocating all at once works well in
this case.

Reduces YAML profile pre-processing time in BOLT from
>   11.2067 (  3.2%)   1.6487 (  7.3%)  12.8554 (  3.5%)  12.8635 (  5.6%)  pre-process profile data
to
>   10.6613 (  3.1%)   1.6489 (  6.7%)  12.3102 (  3.3%)  12.3134 (  5.3%)  pre-process profile data

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D155006
2023-08-09 16:42:25 -07:00
Amir Ayupov
c2aa0616cd [YAML][NFC] Replace if-else with switch in createHNodes
BOLT YAML profile reading time gets marginally faster (14.1572->13.9207 s) for
a large YAML profile (121MB/31K functions). Not claiming stat significance
though.

Reviewed By: hintonda

Differential Revision: https://reviews.llvm.org/D154553
2023-07-06 10:18:13 -07:00
Anton Sidorenko
388d679c1d Recommit [YAML IO] Check that mapping doesn't contain duplicating keys
The revert reason is fixed in D143727 (test changes).

According to YAML specification keys must be unique for a mapping node:
"The content of a mapping node is an unordered set of key/value node pairs, with
the restriction that each of the keys is unique".

Differential Revision: https://reviews.llvm.org/D140474
2023-02-13 17:45:07 +03:00
Anton Sidorenko
ed66b81f7a Revert "[YAML IO] Check that mapping doesn't contain duplicating keys"
LLDB failures: https://lab.llvm.org/buildbot/#/builders/17/builds/33865

This reverts commit 4c228ee6d40a7cff256f1a680561b6c0155ad704.
2023-02-09 15:09:07 +03:00
Anton Sidorenko
4c228ee6d4 [YAML IO] Check that mapping doesn't contain duplicating keys
According to YAML specification keys must be unique for a mapping node:
"The content of a mapping node is an unordered set of key/value node pairs, with
the restriction that each of the keys is unique".

Differential Revision: https://reviews.llvm.org/D140474
2023-02-09 14:45:51 +03:00
Fangrui Song
8b61376dfa YAMLParser: llvm::Optional => std::optional 2022-12-15 09:34:31 +00:00
Jan Svoboda
aa33688cad [llvm][support] Replace std::vector<bool> use in YAMLTraits
LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.

This patch replaces the use of `std::vector` with `llvm::BitVector` in LLVM's YAML traits and replaces the call to `Vec.insert(Vec.begin(), N, false)` on empty `Vec` with `Vec.resize(N)`, which has the same semantics but avoids using `insert` and iterators, which `llvm::BitVector` doesn't possess.

Reviewed By: dexonsmith, dblaikie

Differential Revision: https://reviews.llvm.org/D118111
2022-01-26 11:20:18 +01:00
serge-sans-paille
66c602be25 [NFC] Additional header dependency cleanup LLVMSupport
A few more forward-declarations, a few less headers. the impact on number of
preprocessed lines for LLVMSupport is negligible (-3K lines) but it's always
good to remove dependencies.

Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
2022-01-26 11:16:15 +01:00
serge-sans-paille
75e164f61d [llvm] Cleanup header dependencies in ADT and Support
The cleanup was manual, but assisted by "include-what-you-use". It consists in

1. Removing unused forward declaration. No impact expected.
2. Removing unused headers in .cpp files. No impact expected.
3. Removing unused headers in .h files. This removes implicit dependencies and
   is generally considered a good thing, but this may break downstream builds.
   I've updated llvm, clang, lld, lldb and mlir deps, and included a list of the
   modification in the second part of the commit.
4. Replacing header inclusion by forward declaration. This has the same impact
   as 3.

Notable changes:

- llvm/Support/TargetParser.h no longer includes llvm/Support/AArch64TargetParser.h nor llvm/Support/ARMTargetParser.h
- llvm/Support/TypeSize.h no longer includes llvm/Support/WithColor.h
- llvm/Support/YAMLTraits.h no longer includes llvm/Support/Regex.h
- llvm/ADT/SmallVector.h no longer includes llvm/Support/MemAlloc.h nor llvm/Support/ErrorHandling.h

You may need to add some of these headers in your compilation units, if needs be.

As an hint to the impact of the cleanup, running

clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/Support/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l

before: 8000919 lines
after:  7917500 lines

Reduced dependencies also helps incremental rebuilds and is more ccache
friendly, something not shown by the above metric :-)

Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
2022-01-21 13:54:49 +01:00
Vitaly Buka
ee43259cbc Initialize output parameters
If the function returns true, it should
set all output paremeters, similar to Output::preflightElement, or we
have UB on code like:

```
void *SaveInfo;
if (io.preflightFlowElement(i, SaveInfo))
  io.postflightFlowElement(SaveInfo);
```

It's going to be detected by msan with:
-Xclang -enable-noundef-analysis -mllvm -msan-eager-checks=1

Differential Revision: https://reviews.llvm.org/D116826
2022-01-07 15:21:21 -08:00
Kevin Athey
4d06565bd8 Initialize SaveInfo in methods Output::preflightKey and Output::preflightElement.
When enabling MSAN eager mode with noundef analysis these variables were found to not be initialized in unit tests.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D116428
2022-01-05 13:13:48 -08:00
Jonas Devlieghere
f50b8ee71f [YAML I/O] Fix bug in emission of empty sequence
Don't emit an output dash for an empty sequence. Take emitting a vector
of strings for example:

  std::vector<std::string> Strings = {"foo", "bar"};
  LLVM_YAML_IS_SEQUENCE_VECTOR(std::string)
  yout << Strings;

This emits the following YAML document.

  ---
  - foo
  - bar
  ...

When the vector is empty, this generates the following result:

  ---
  - []
  ...

Although this is valid YAML, it does not match what we meant to emit.
The result is a one-element sequence consisting of an empty list.
Indeed, if we were to try to read this again we get an error:

  YAML:2:4: error: not a mapping
  - []

The problem is the output dash before the empty list. The correct output
would be:

  ---
  []
  ...

This patch fixes that by not emitting the output dash for an empty
sequence.

Differential revision: https://reviews.llvm.org/D95280
2021-01-25 13:35:36 -08:00
Nathan James
0e5bfffb13
[YAML] Support extended spellings when parsing bools.
Support all the spellings of boolean datatypes according to https://yaml.org/type/bool.html

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D92755
2020-12-12 12:50:34 +00:00
Nathan James
d380c38e34
[YAML] Use correct source location for unknown key errors.
Currently unknown keys when inputting mapping traits have the location set to the Value.
Example:
```
YAML:1:14: error: unknown key 'UnknownKey'
{UnknownKey: SomeValue}
             ^~~~~~~~~
```
This is unhelpful for a user as it draws them to fix the wrong item.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D93037
2020-12-11 16:34:06 +00:00
Georgii Rymar
9aa7898200 Reland "[lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types." (https://reviews.llvm.org/D90930).
This reverts reverting commit fc40a03323a4b265ccbed34a07e281b13c5e8367
and fixes LLD (MachO/wasm) tests that failed previously.
2020-11-18 13:08:46 +03:00
Georgii Rymar
fc40a03323 Revert "[lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types."
This reverts commit 65fd17c241e22e1671e81efdb683687369c2feb3.

It breaks LLD/MachO tests that seems use obj2yaml the check the output.
2020-11-18 11:55:03 +03:00
Georgii Rymar
65fd17c241 [lib/Support/YAMLTraits] - Don't print leading zeroes when dumping Hex8/Hex16/Hex32 types.
When we produce an YAML output, we also print leading zeroes currently.
An output might look like this:

```
- Name:    .dynsym
  Type:    SHT_DYNSYM
  Address: 0x0000000000001000
  EntSize: 0x0000000000000018
```

There are probably no reason to print leading zeroes.
It just makes harder to read values. This patch stops printing them.
The output becomes like:

```
- Name:    .dynsym
  Type:    SHT_DYNSYM
  Address: 0x1000
  EntSize: 0x18
```

This affects obj2yaml mostly, but also dsymutil and llvm-xray tools output.

Differential revision: https://reviews.llvm.org/D90930
2020-11-18 11:31:00 +03:00
Cyndy Ishida
acb33cba6d [llvm] Fix ODRViolations for VersionTuple YAML specializations NFC
It appears for Swift there was confusing errors when trying to parse APINotes, when libAPINotes and libInterfaceStub are linked, they both export symbol
`__ZN4llvm4yaml7yamlizeINS_12VersionTupleEEENSt3__19enable_ifIXsr16has_ScalarTraitsIT_EE5valueEvE4typeERNS0_2IOERS5_bRNS0_12EmptyContextE`, and discovered
same symbol defined within llvm-ifs.

This consolidates the boilerplate into YAMLTraits and defers the specific validation in reading the whole input.
fixes: rdar://problem/70450563

Reviewed By: phosek, dblaikie

Differential Revision: https://reviews.llvm.org/D89764
2020-10-20 18:29:15 -07:00
Joachim Meyer
f64903fd81 Add -Wno-error=unknown flag to clang-format.
Currently newer clang-format options cannot be included in .clang-format files, if not all users can be forced to use an updated version.
This patch tries to solve this by adding an option to clang-format, enabling to ignore unknown (newer) options.

Differential Revision: https://reviews.llvm.org/D86137
2020-09-19 10:17:57 +02:00
Dmitry Polukhin
9e7fddbd36 [yaml][clang-tidy] Fix multiline YAML serialization
Summary:
New line duplication logic introduced in https://reviews.llvm.org/D63482
has two issues: (1) there is no logic that removes duplicate newlines
when clang-apply-replacment reads YAML and (2) in general such logic
should be applied to all strings and should happen on string
serialization level instead in YAML parser.

This diff changes multiline strings quotation from single quote `'` to
double `"`. It solves problems with internal newlines because now they are
escaped. Also double quotation solves the problem with leading whitespace after
newline. In case of single quotation YAML parsers should remove leading
whitespace according to specification. In case of double quotation these
leading are internal space and they are preserved. There is no way to
instruct YAML parsers to preserve leading whitespaces after newline so
double quotation is the only viable option that solves all problems at
once.

Test Plan: check-all

Reviewers: gribozavr, mgehre, yvvan

Subscribers: xazax.hun, hiraditya, cfe-commits, llvm-commits

Tags: #clang-tools-extra, #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80301
2020-07-09 02:41:58 -07:00
Georgii Rymar
d95f8e7aef [yaml2obj][MachO] - Fix PubName/PubType handling.
`PubName` and `PubType` are optional fields since D80722.

They are defined as:
  Optional<PubSection> PubNames;
  Optional<PubSection> PubTypes;

And initialized in the following way:
  IO.mapOptional("debug_pubnames", DWARF.PubNames);
  IO.mapOptional("debug_pubtypes", DWARF.PubTypes);

But problem is that because of the issue in `YAMLTraits.cpp`,
when there are no `debug_pubnames`/`debug_pubtypes` keys in a YAML description,
they are not initialized to `Optional::None` as the code expects, but they
are initialized to default `PubSection()` instances.

Because of this, the `if` condition in the following code is always true:

if (Obj.DWARF.PubNames)
  Err = DWARFYAML::emitPubSection(OS, *Obj.DWARF.PubNames,
                                  Obj.IsLittleEndian);

What means `emitPubSection` is always called and it writes few values.

This patch fixes the issue. I've reduced `sizeofcmds` by size of data
previously written because of this bug.

Differential revision: https://reviews.llvm.org/D81686
2020-06-12 12:03:51 +03:00
Jonas Devlieghere
d1f0a76b21 [YAMLTraits] Remove char trait and serialize as uint8_t in lldb.
As discussed in https://reviews.llvm.org/D79745
2020-05-26 11:07:27 -07:00
Jonas Devlieghere
f6cc1c08f1 Revert "Revert "[YAMLTraits] Add trait for char""
Reverting this to unblock all the LLDB bots while we try to figure out a
solution for Solaris in https://reviews.llvm.org/D79745.
2020-05-21 10:33:09 -07:00
Rainer Orth
c4169a3efe Revert "[YAMLTraits] Add trait for char"
This reverts commit fab08bf4899e40d02d8bf394a63499ac679ac61c.  It has left
the Solaris buildbots broken for a week and a half as reported
in https://reviews.llvm.org/D79745.
2020-05-21 17:33:42 +02:00
Jonas Devlieghere
fab08bf489 [YAMLTraits] Add trait for char
Add a YAML trait for char.

Differential revision: https://reviews.llvm.org/D79745
2020-05-11 15:51:47 -07:00
Swiftfuchs
a24d46318f [NFC] Corrected a minor typo in a comment 2020-02-21 13:56:44 +01:00
Bill Wendling
c55cf4afa9 Revert "Remove redundant "std::move"s in return statements"
The build failed with

  error: call to deleted constructor of 'llvm::Error'

errors.

This reverts commit 1c2241a7936bf85aa68aef94bd40c3ba77d8ddf2.
2020-02-10 07:07:40 -08:00
Bill Wendling
1c2241a793 Remove redundant "std::move"s in return statements 2020-02-10 06:39:44 -08:00
Thomas Finch
092452d402 YAML parser robustness improvements
Summary: This patch fixes a number of bugs found in the YAML parser
through fuzzing. In general, this makes the parser more robust against
malformed inputs.

The fixes are mostly improved null checking and returning errors in
more cases. In some cases, asserts were changed to regular errors,
this provides the same robustness but also protects release builds
from the triggering conditions. This also improves the fuzzability of
the YAML parser since asserts can act as a roadblock to further
fuzzing once they're hit.

Each fix has a corresponding test case:
  - TestAnchorMapError - Added proper null pointer handling in
    `Stream::printError` if N is null and `KeyValueNode::getValue` if
    getKey returns null, `Input::createHNodes` `dyn_casts` changed to
    `dyn_cast_or_null` so the null pointer checks are actually able to
    fail
  - TestFlowSequenceTokenErrors - Added case in
    `Document::parseBlockNode` for FlowMappingEnd, FlowSequenceEnd, or
    FlowEntry tokens outside of mappings or sequences
  - TestDirectiveMappingNoValue - Changed assert to regular error
    return in `Scanner::scanValue`
  - TestUnescapeInfiniteLoop - Fixed infinite loop in
    `ScalarNode::unescapeDoubleQuoted` by returning an error for
    unrecognized escape codes
  - TestScannerUnexpectedCharacter - Changed asserts to regular error
    returns in `Scanner::consume`
  - TestUnknownDirective - For both of the inputs the stream doesn't
    fail and correctly returns TK_Error, but there is no valid root
    node for the document. There's no reasonable way to make the
    scanner fail for unknown directives without breaking the YAML spec
    (see spec-07-01.test). I think the assert is unnecessary given
    that an error is still generated for this case.

The `SimpleKeys.clear()` line fixes a bug found by AddressSanitizer
triggered by multiple test cases - when TokenQueue is cleared
SimpleKeys is still holding dangling pointers into it, so SimpleKeys
should be cleared as well.

Patch by Thomas Finch!

Reviewers: chandlerc, Bigcheese, hintonda

Reviewed By: Bigcheese, hintonda

Subscribers: hintonda, kristina, beanz, dexonsmith, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61608
2019-11-05 21:51:04 -08:00
George Rimar
4e71702cd4 [yaml2obj][obj2yaml] - Use a single "Other" field instead of "Other", "Visibility" and "StOther".
Currenly we can encode the 'st_other' field of symbol using 3 fields.
'Visibility' is used to encode STV_* values.
'Other' is used to encode everything except the visibility, but it can't handle arbitrary values.
'StOther' is used to encode arbitrary values when 'Visibility'/'Other' are not helpfull enough.

'st_other' field is used to encode symbol visibility and platform-dependent
flags and values. Problem to encode it is that it consists of Visibility part (STV_* values)
which are enumeration values and the Other part, which is different and inconsistent.

For MIPS the Other part contains flags for all STO_MIPS_* values except STO_MIPS_MIPS16.
(Like comment in ELFDumper says: "Someones in their infinite wisdom decided to make
STO_MIPS_MIPS16 flag overlapped with other ST_MIPS_xxx flags."...)

And for PPC64 the Other part might actually encode any value.

This patch implements custom logic for handling the st_other and removes
'Visibility' and 'StOther' fields.

Here is an example of a new YAML style this patch allows:

- Name:  foo
  Other: [ 0x4 ]
- Name:  bar
  Other: [ STV_PROTECTED, 4 ]
- Name:  zed
  Other: [ STV_PROTECTED, STO_MIPS_OPTIONAL, 0xf8 ]

Differential revision: https://reviews.llvm.org/D66886

llvm-svn: 370472
2019-08-30 13:39:22 +00:00
Jonas Devlieghere
0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Fangrui Song
27ed1c5bb8 [YAMLIO] Remove trailing spaces when outputting maps
llvm::yaml::Output::paddedKey unconditionally outputs spaces, which
are superfluous if the value to be dumped is a sequence or map.
Change `bool NeedsNewLine` to `StringRef Padding` so that it can be
overridden to `\n` if the value is a sequence or map.

An empty map/sequence is special. It is printed as `{}` or `[]` without
a newline, while a non-empty map/sequence follows a newline. To handle
this distinction, add another variable `PaddingBeforeContainer` and does
the special handling in endMapping/endSequence.

Reviewed By: grimar, jhenderson

Differential Revision: https://reviews.llvm.org/D64566

llvm-svn: 365869
2019-07-12 04:51:31 +00:00
George Rimar
45d042ed96 [yaml2obj] - Don't crash on invalid inputs.
yaml2obj might crash on invalid input when unable to parse the YAML.

Recently a crash with a very similar nature was fixed for an empty files. 
This patch revisits the fix and does it in yaml::Input instead.
It seems to be more correct way to handle such situation.

With that crash for invalid inputs is also fixed now.

Differential revision: https://reviews.llvm.org/D61059

llvm-svn: 359178
2019-04-25 09:59:55 +00:00
Pavel Labath
d7e12574c6 YAMLIO: Fix serialization of strings with embedded nuls
Summary:
A bug/typo in Output::scalarString caused us to round-trip a StringRef
through a const char *. This meant that any strings with embedded nuls
were unintentionally cut short at the first such character. (It also
could have caused accidental buffer overruns, but it seems that all
StringRefs coming into this functions were formed from null-terminated
strings.)

This patch fixes the bug and adds an appropriate test.

Reviewers: sammccall, jhenderson

Subscribers: kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60505

llvm-svn: 358176
2019-04-11 14:57:34 +00:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Scott Linder
c0830f5577 [Support] Teach YAMLIO about polymorphic types
Add support for "polymorphic" types to YAMLIO.

PolymorphicTraits can dynamically switch between other traits (Scalar, Map, or
Sequence). When inputting, the PolymorphicTraits type is told which type to
become, and when outputting the PolymorphicTraits type is asked which type it
currently is.

Also add support for TaggedScalarTraits to allow dynamically differentiating
between multiple scalar types using YAML tags.

Serialize empty maps as "{}" and empty sequences as "[]", so that types
are preserved when round-tripping PolymorphicTraits. This change has
equivalent semantics, but may break e.g. tests which compare output
verbatim.

Differential Revision: https://reviews.llvm.org/D48144

llvm-svn: 346884
2018-11-14 19:39:59 +00:00
Scott Linder
ad115b7832 [Support] Remove redundant qualifiers in YAMLTraits (NFC)
llvm-svn: 344166
2018-10-10 18:14:02 +00:00
Graydon Hoare
926cd9b837 [YAML] Escape non-printable multibyte UTF8 in Output::scalarString.
The existing YAML Output::scalarString code path includes a partial and
incorrect implementation of YAML escaping logic. In particular, the logic put
in place in rL321283 escapes non-printable bytes only if they are not part of a
multibyte UTF8 sequence; implicitly this means that all multibyte UTF8
sequences -- printable and non -- are passed through verbatim.

The simplest solution to this is to direct the Output::scalarString method to
use the standalone yaml::escape function, and this _almost_ works, except that
the existing code in that function _over_ escapes: any multibyte UTF8 sequence
is escaped, even printable ones. While this is permitted for YAML, it is also
more aggressive (and hard to read for non-English locales) than necessary,
and the entire point of rL321283 was to back off such aggressive over-escaping.

So in this change, I have both redirected Output::scalarString to use
yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to
non-printables. This preserves behaviour of any existing clients while giving
them a path to more moderate escaping should they desire.

Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun

Reviewed By: thegameg

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44863

llvm-svn: 328661
2018-03-27 19:52:45 +00:00
Francis Visoiu Mistrih
b2b961a3db [YAML] Fix UTF-8 handling
Previous YAML quoting patches broke UTF-8 printing in YAML: see https://reviews.llvm.org/D41290#961801.

Differential Revision: https://reviews.llvm.org/D41490

llvm-svn: 321283
2017-12-21 17:14:09 +00:00
Francis Visoiu Mistrih
b213b27ee3 [YAML] Add support for non-printable characters
LLVM IR function names which disable mangling start with '\01'
(https://www.llvm.org/docs/LangRef.html#identifiers).

When an identifier like "\01@abc@" gets dumped to MIR, it is quoted, but
only with single quotes.

http://www.yaml.org/spec/1.2/spec.html#id2770814:

"The allowed character range explicitly excludes the C0 control block
allowed), the surrogate block #xD800-#xDFFF, #xFFFE, and #xFFFF."

http://www.yaml.org/spec/1.2/spec.html#id2776092:

"All non-printable characters must be escaped.
[...]
Note that escape sequences are only interpreted in double-quoted scalars."

This patch adds support for printing escaped non-printable characters
between double quotes if needed.

Should also fix PR31743.

Differential Revision: https://reviews.llvm.org/D41290

llvm-svn: 320996
2017-12-18 17:38:03 +00:00
Dave Lee
c6f2e69695 Allow empty mappings for optional YAML input
Summary:
This change fixes a bug where `obj2yaml` can in some cases produce YAML that
causes `yaml2obj` to error.

The ELF YAML document structure has a `Sections` mapping, which contains three
mappings, all of which are optional: `Local`, `Global`, and `Weak.` Any one of
these can be missing, but if all three are missing, then `yaml2obj` errors. This
change allows YAML input for cases like this one.

I have tested this with check-llvm and check-lld, and all tests passed.

This change is the result of test failures while working on D39582, which
introduces a `DynamicSymbols` mapping, which will be empty at times.

Reviewers: compnerd, jakehehrlich, silvas, kledzik, mehdi_amini, pcc

Reviewed By: compnerd

Subscribers: silvas, llvm-commits

Differential Revision: https://reviews.llvm.org/D39908

llvm-svn: 318428
2017-11-16 17:46:11 +00:00
George Rimar
3674fb6f2c [yaml2obj] - Don't crash on one more invalid document.
This fixes one more crash I faced.
Testcase contains minimal reduced case.

Differential revision: https://reviews.llvm.org/D38082

llvm-svn: 313868
2017-09-21 08:25:59 +00:00
Alex Bradbury
1684317583 [YAMLTraits] Add filename support to yaml::Input
Summary:
The current yaml::Input constructor takes a StringRef of data as its
first parameter, discarding any filename information that may have been
present when a YAML file was opened. Add an alterate yaml::Input
constructor that takes a MemoryBufferRef, which can have a filename
associated with it. This leads to clearer diagnostic messages.

Sponsored By: DARPA, AFRL

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D35398

Patch by: Jonathan Anderson (trombonehero)

llvm-svn: 308172
2017-07-17 11:41:30 +00:00