Rename `ListInit::getValues()` to `getElements()` to better match with
other `ListInit` members like `getElement`. Keep `getValues()` for
existing downstream code but mark it deprecated.
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
Historically, the main example of *very* large string tables used the
`EmitCharArray` to work around MSVC limitations with string literals,
but that was switched (without removing the API) in order to consolidate
on a nicer emission primitive.
While this large string table in `IntrinsicsImpl.inc` seems to compile
correctly on MSVC without the work around in `EmitCharArray` (and that
this PR adds back to the nicer emission path), other users have
repeatedly hit this MSVC limitation as you can see in the discussion on
PR https://github.com/llvm/llvm-project/pull/120534. This PR teaches the
string offset table emission to look at
the size of the table and switch to the char array emission strategy
when the table becomes too large.
This work around does have the downside of making compile times worse
for large string tables, but that appears unavoidable until we can
identify known good MSVC versions and switch to requiring them for all
LLVM users. It also reduces searchability of the generated string table
-- I looked at emitting a comment with each string but it is tricky
because the escaping rules for an inline comment are different from
those of of a string literal, and there's no real way to turn the string
literal into a comment.
While improving the output in this way, also clean up the output to not
emit an extraneous empty string at the end of the string table, and
update the `StringTable` class to not look for that. It isn't actually
used by anything and is wasteful.
This PR also switches the `IntrinsicsImpl.inc` string tables over to the
new `StringTable` runtime abstraction. I didn't want to do this until
landing the MSVC workaround in case it caused even this example to start
hitting the MSVC bug, but I wanted to switch here so that I could
simplify the API for emitting the string table with the workaround
present. With the two different emission strategies, its important to
use a very exact syntax and that seems better encapsulated in the API.
Last but not least, the `SDNodeInfoEmitter` is updated, including its
tests to match the new output.
This PR should unblock landing
https://github.com/llvm/llvm-project/pull/120534 and letting us switch
all of
Clang's builtins to use string tables. That PR has all the details
motivating the overall effort.
Follow-up patches will try to consolidate the remaining users onto the
single interface, but those at least were easy to separate into
follow-ups and keep this PR somewhat smaller.
Now that we have a dedicated abstraction for string tables, switch the
option parser library's string table over to it rather than using a raw
`const char*`. Also try to use the `StringTable::Offset` type rather
than a raw `unsigned` where we can to avoid accidental increments or
other issues.
This is based on review feedback for the initial switch of options to a
string table. Happy to tweak or adjust if desired here.
Apologies for the large change, I looked for ways to break this up and
all of the ones I saw added real complexity. This change focuses on the
option's prefixed names and the array of prefixes. These are present in
every option and the dominant source of dynamic relocations for PIE or
PIC users of LLVM and Clang tooling. In some cases, 100s or 1000s of
them for the Clang driver which has a huge number of options.
This PR addresses this by building a string table and a prefixes table
that can be referenced with indices rather than pointers that require
dynamic relocations. This removes almost 7k dynmaic relocations from the
`clang` binary, roughly 8% of the remaining dynmaic relocations outside
of vtables. For busy-boxing use cases where many different option tables
are linked into the same binary, the savings add up a bit more.
The string table is a straightforward mechanism, but the prefixes
required some subtlety. They are encoded in a Pascal-string fashion with
a size followed by a sequence of offsets. This works relatively well for
the small realistic prefixes arrays in use.
Lots of code has to change in order to land this though: both all the
option library code has to be updated to use the string table and
prefixes table, and all the users of the options library have to be
updated to correctly instantiate the objects.
Some follow-up patches in the works to provide an abstraction for this
style of code, and to start using the same technique for some of the
other strings here now that the infrastructure is in place.
Code cleanups for TableGen files, changes includes function names,
variable names and unused imports.
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>