The splitMustacheString function previously used a loop of
StringRef::split and StringRef::trim. This was inefficient as
it scanned each segment of the accessor string multiple times.
This change introduces a custom splitAndTrim function that
performs both operations in a single pass over the string,
reducing redundant work and improving performance, most notably
in the number of CPU cycles executed.
| Metric | Baseline | Optimized | Change |
| --- | --- | --- | --- |
| Time (ms) | 35.57 | 35.36 | -0.59% |
| Cycles | 34.91M | 34.26M | -1.86% |
| Instructions | 85.54M | 85.24M | -0.35% |
| Branch Misses | 111.9K | 112.2K | +0.27% |
| Cache Misses | 242.1K | 239.9K | -0.91% |
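The single-pass idea can be illustrated with a minimal sketch. It is not the code from the patch: `splitAndTrimSketch` is a hypothetical name, `std::string_view` stands in for `llvm::StringRef`, and the `.` delimiter and the whitespace set are assumptions.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the patch's splitAndTrim. Each character is
// visited once; segment bounds are tracked while scanning, so no separate
// trim pass over each segment is needed.
std::vector<std::string_view> splitAndTrimSketch(std::string_view Str,
                                                 char Delim = '.') {
  std::vector<std::string_view> Parts;
  auto IsSpace = [](char C) {
    return C == ' ' || C == '\t' || C == '\n' || C == '\r';
  };
  std::size_t Start = 0; // index of the first non-space char in the segment
  std::size_t End = 0;   // one past the last non-space char in the segment
  bool Seen = false;     // whether the segment has any non-space char yet
  for (std::size_t I = 0; I <= Str.size(); ++I) {
    if (I == Str.size() || Str[I] == Delim) {
      // Close the current segment (empty if it held only whitespace).
      Parts.push_back(Seen ? Str.substr(Start, End - Start)
                           : std::string_view());
      Seen = false;
      continue;
    }
    if (IsSpace(Str[I]))
      continue;
    if (!Seen) {
      Start = I;
      Seen = true;
    }
    End = I + 1;
  }
  return Parts;
}
```

For example, `splitAndTrimSketch(" a . b.c ")` yields the three views `a`, `b`, and `c`, all pointing into the original buffer.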
The splitMustacheString function was saving StringRefs that
were already backed by an arena-allocated string. This was
unnecessary work. This change removes the redundant
Ctx.Saver.save() call.
This optimization provides a small but measurable performance
improvement on top of the single-pass tokenizer, most notably
reducing branch misses.
Metric | Baseline | Optimized | Change
-------------- | -------- | --------- | -------
Time (ms) | 35.77 | 35.57 | -0.56%
Cycles | 35.16M | 34.91M | -0.71%
Instructions | 85.77M | 85.54M | -0.27%
Branch Misses | 113.9K | 111.9K | -1.76%
Cache Misses | 237.7K | 242.1K | +1.85%
We make the Mustache ASTNodes usable in the arena by first removing all
of the memory-owning data structures, like std::vector, std::unique_ptr,
and SmallVector. We use standard LLVM list types to hold this data
instead, and make use of a UniqueStringSaver to hold the various
template strings.
Additionally, update clang-doc APIs to use the new interfaces.
Future work can make better use of Twine interfaces to help avoid any
intermediate copies or allocations.
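As a rough illustration of why the owning members had to go, here is a sketch of an arena-friendly node. Everything in it is an assumption made for the example: `ArenaSketch` uses `std::pmr::monotonic_buffer_resource` in place of LLVM's allocator, plain intrusive pointers replace the LLVM list types, and `save` copies strings without the de-duplication a UniqueStringSaver performs.

```cpp
#include <cstring>
#include <memory_resource>
#include <new>
#include <string_view>
#include <type_traits>

// Hypothetical node: no owning members, so it never needs a destructor.
struct AstNodeSketch {
  std::string_view Text; // points at arena-owned bytes
  AstNodeSketch *FirstChild = nullptr;
  AstNodeSketch *LastChild = nullptr;
  AstNodeSketch *NextSibling = nullptr;

  // Append to the intrusive child list without allocating.
  void addChild(AstNodeSketch *Child) {
    if (LastChild)
      LastChild->NextSibling = Child;
    else
      FirstChild = Child;
    LastChild = Child;
  }
};
static_assert(std::is_trivially_destructible_v<AstNodeSketch>);

// Hypothetical arena: memory is released all at once when it is destroyed.
class ArenaSketch {
  std::pmr::monotonic_buffer_resource Pool;

public:
  // Copy a string into the arena and return a view of the copy.
  std::string_view save(std::string_view S) {
    if (S.empty())
      return {};
    char *Mem = static_cast<char *>(Pool.allocate(S.size(), alignof(char)));
    std::memcpy(Mem, S.data(), S.size());
    return {Mem, S.size()};
  }

  AstNodeSketch *makeNode(std::string_view Text) {
    void *Mem = Pool.allocate(sizeof(AstNodeSketch), alignof(AstNodeSketch));
    return new (Mem) AstNodeSketch{save(Text)};
  }
};
```

Because the node is trivially destructible, dropping the arena without running any node destructors leaks nothing, which is the property the owning containers broke.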
This patch fixes:
llvm/lib/Support/Mustache.cpp:332:20: error: unused function
'tagKindToString' [-Werror,-Wunused-function]
llvm/lib/Support/Mustache.cpp:344:20: error: unused function
'jsonKindToString' [-Werror,-Wunused-function]
The existing logging was inconsistent, and we logged too many things.
This PR introduces a more principled schema and eliminates many
redundant log lines.
When rendering partials, we need to use an indentation stream,
but when part of the partial is an unescaped sequence, we cannot
indent it. To address this, we build a common MustacheStream
interface for all the output streams to use. This allows us to
further customize the AddIndentationStream implementation
and opt it out of indenting the UnescapeSequence.
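A minimal sketch of the shape such an interface could take follows. The class and method names (`OutputStreamSketch`, `writeRaw`, and so on) are hypothetical and do not mirror the patch. The choice to still indent the line on which a raw value starts, while leaving newlines inside the value alone, is an assumption based on the spec's standalone-indentation example for partials.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical common interface for all output streams.
class OutputStreamSketch {
public:
  virtual ~OutputStreamSketch() = default;
  virtual void write(std::string_view Text) = 0;
  // Text that must not be re-indented; plain streams treat it like write().
  virtual void writeRaw(std::string_view Text) { write(Text); }
};

// Appends everything to a caller-owned string.
class StringSinkSketch : public OutputStreamSketch {
  std::string &Buf;

public:
  explicit StringSinkSketch(std::string &B) : Buf(B) {}
  void write(std::string_view Text) override { Buf.append(Text); }
};

// Prepends Indent to every non-empty line passed through write().
class IndentingStreamSketch : public OutputStreamSketch {
  OutputStreamSketch &Inner;
  std::string Indent;
  bool AtLineStart = true;

public:
  IndentingStreamSketch(OutputStreamSketch &S, std::string I)
      : Inner(S), Indent(std::move(I)) {}

  void write(std::string_view Text) override {
    while (!Text.empty()) {
      if (AtLineStart && Text.front() != '\n') {
        Inner.write(Indent);
        AtLineStart = false;
      }
      std::size_t NL = Text.find('\n');
      std::size_t Len = NL == std::string_view::npos ? Text.size() : NL + 1;
      Inner.write(Text.substr(0, Len)); // one line (or the tail) at a time
      AtLineStart = NL != std::string_view::npos;
      Text.remove_prefix(Len);
    }
  }

  // Opt-out path: newlines inside Text do not trigger indentation.
  void writeRaw(std::string_view Text) override {
    if (Text.empty())
      return;
    if (AtLineStart && Text.front() != '\n')
      Inner.write(Indent);
    Inner.write(Text);
    AtLineStart = Text.back() == '\n';
  }
};
```

The virtual `writeRaw` hook is what lets the renderer stay unaware of which stream it is writing to: only the indenting stream overrides it.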
The base mustache spec allows setting custom delimiters, which slightly
change parsing of partials. This patch implements that feature by adding
a new token type, and changing the tokenizer's behavior to allow setting
custom delimiters.
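To make the delimiter switch concrete, here is a small tokenizer sketch. It is an illustration under assumptions, not the patch's tokenizer: `TagTokenSketch` and `tokenizeSketch` are hypothetical names, the set-delimiter tag is consumed rather than emitted as its own token type, and standalone-line handling is ignored.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

struct TagTokenSketch {
  bool IsTag;       // false for literal text between tags
  std::string Body; // tag contents without delimiters, or the literal text
};

static std::string_view trimSpacesSketch(std::string_view S) {
  while (!S.empty() && S.front() == ' ')
    S.remove_prefix(1);
  while (!S.empty() && S.back() == ' ')
    S.remove_suffix(1);
  return S;
}

// Splits a template into text and tag tokens. A tag of the form
// "=NEWOPEN NEWCLOSE=" changes the delimiters for the rest of the input.
std::vector<TagTokenSketch> tokenizeSketch(std::string_view Template) {
  constexpr std::size_t NPos = std::string_view::npos;
  std::string Open = "{{", Close = "}}";
  std::vector<TagTokenSketch> Tokens;
  std::size_t Pos = 0;
  while (Pos < Template.size()) {
    std::size_t TagStart = Template.find(Open, Pos);
    if (TagStart == NPos) {
      Tokens.push_back({false, std::string(Template.substr(Pos))});
      break;
    }
    if (TagStart > Pos)
      Tokens.push_back(
          {false, std::string(Template.substr(Pos, TagStart - Pos))});
    std::size_t BodyStart = TagStart + Open.size();
    std::size_t TagEnd = Template.find(Close, BodyStart);
    if (TagEnd == NPos) { // unterminated tag: keep the rest as text
      Tokens.push_back({false, std::string(Template.substr(TagStart))});
      break;
    }
    std::string_view Body = Template.substr(BodyStart, TagEnd - BodyStart);
    Pos = TagEnd + Close.size(); // advance using the delimiters still active
    if (Body.size() >= 2 && Body.front() == '=' && Body.back() == '=') {
      std::string_view Inner =
          trimSpacesSketch(Body.substr(1, Body.size() - 2));
      std::size_t Split = Inner.find(' ');
      if (Split != NPos) {
        std::string_view NewOpen = Inner.substr(0, Split);
        std::string_view NewClose = trimSpacesSketch(Inner.substr(Split));
        if (!NewOpen.empty() && !NewClose.empty()) {
          Open = std::string(NewOpen);
          Close = std::string(NewClose);
        }
      }
      continue;
    }
    Tokens.push_back({true, std::string(Body)});
  }
  return Tokens;
}
```

With the input `{{a}}{{=<% %>=}}<%b%>{{c}}`, the sketch produces a tag `a`, a tag `b`, and the literal text `{{c}}`, since `{{` no longer opens a tag after the switch.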
The current implementation did not correctly handle indentation for
standalone partial tags. It was only applied to lines following a
newline, instead of the first line of a partial's content. This was
fixed by updating the AddIndentation implementation to prepend the
indentation to the first line of the partial.
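The difference comes down to the initial state of a "start of line" flag. The helper below is a hypothetical, self-contained illustration (`indentLinesSketch` is not a function from the patch): starting the flag at `true` indents the first line as well, whereas starting it at `false` reproduces the old behavior of indenting only after a newline.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: prefix every non-empty line of Content with Indent,
// including the very first line.
std::string indentLinesSketch(std::string_view Content,
                              std::string_view Indent) {
  std::string Out;
  bool AtLineStart = true; // true up front, so line one is indented too
  for (char C : Content) {
    if (AtLineStart && C != '\n') {
      Out.append(Indent);
      AtLineStart = false;
    }
    Out.push_back(C);
    if (C == '\n')
      AtLineStart = true;
  }
  return Out;
}
```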
The naive char-by-char lookup performed OK, but we can skip ahead to the
next match, avoiding all the extra hash lookups in the key map. There is
likely a faster method than this, but it's already a 42% win in the
BM_Mustache_StringRendering/Escaped benchmark, and an order of magnitude
improvement for BM_Mustache_LargeOutputString.
| Benchmark | Before (ns) | After (ns) | Time reduction |
| :--- | ---: | ---: | ---: |
| `StringRendering/Escaped` | 29,440,922 | 16,583,603 | ~44% |
| `LargeOutputString` | 15,139,251 | 929,891 | ~94% |
| `HugeArrayIteration` | 102,148,245 | 95,943,960 | ~6% |
| `PartialsRendering` | 308,330,014 | 303,556,563 | ~1.6% |
Unreported benchmarks, like those for parsing, had no significant
change.
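The skip-ahead approach can be sketched with standard library calls alone. This is not the patch's code: `escapeHtmlSketch` is a hypothetical name, the escape set is an assumption, and `find_first_of` stands in for whatever search the real implementation uses.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical escaper: jump from one special character to the next and
// copy the untouched text in between as a single chunk.
std::string escapeHtmlSketch(std::string_view Text) {
  static constexpr std::string_view Special = "&<>\"'";
  std::string Out;
  Out.reserve(Text.size());
  std::size_t Pos = 0;
  while (Pos < Text.size()) {
    std::size_t Next = Text.find_first_of(Special, Pos);
    if (Next == std::string_view::npos) {
      Out.append(Text.substr(Pos)); // no more special chars: copy the tail
      break;
    }
    Out.append(Text.substr(Pos, Next - Pos));
    switch (Text[Next]) {
    case '&':
      Out += "&amp;";
      break;
    case '<':
      Out += "&lt;";
      break;
    case '>':
      Out += "&gt;";
      break;
    case '"':
      Out += "&quot;";
      break;
    case '\'':
      Out += "&#39;";
      break;
    }
    Pos = Next + 1;
  }
  return Out;
}
```

Input with few special characters benefits most, because long runs are appended in one call instead of one lookup per character.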
We extend the logic in tokenize() to treat the `{{{}}}` delimiters
like other unescaped HTML. We do this by updating the tokenizer to
treat the new tokens the same way we do for the `{{&variable}}`
syntax, which avoids the need to change the parser.
We also update the llvm-test-mustache-spec tool to no longer mark Triple
Mustache as XFAIL.
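A small sketch shows how both spellings can collapse into one token kind before the parser sees them. The names here (`VarKindSketch`, `classifyTagSketch`) are hypothetical, and the function assumes it is handed one complete tag using the default delimiters.

```cpp
#include <string_view>

enum class VarKindSketch { Escaped, Unescaped };

struct VarTagSketch {
  VarKindSketch Kind;
  std::string_view Name;
};

static std::string_view trimSketch(std::string_view S) {
  while (!S.empty() && S.front() == ' ')
    S.remove_prefix(1);
  while (!S.empty() && S.back() == ' ')
    S.remove_suffix(1);
  return S;
}

// Assumes Tag is a whole tag such as "{{name}}", "{{&name}}" or "{{{name}}}".
VarTagSketch classifyTagSketch(std::string_view Tag) {
  if (Tag.size() >= 6 && Tag.substr(0, 3) == "{{{" &&
      Tag.substr(Tag.size() - 3) == "}}}")
    return {VarKindSketch::Unescaped,
            trimSketch(Tag.substr(3, Tag.size() - 6))};
  if (Tag.size() < 4) // too short to be "{{" + "}}"
    return {VarKindSketch::Escaped, std::string_view()};
  std::string_view Body = trimSketch(Tag.substr(2, Tag.size() - 4));
  if (!Body.empty() && Body.front() == '&')
    return {VarKindSketch::Unescaped, trimSketch(Body.substr(1))};
  return {VarKindSketch::Escaped, Body};
}
```

Since `{{{name}}}` and `{{&name}}` produce the same result, any later stage that already handles the `&` form needs no changes.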
The current implementation set a reference to a nullptr, leading to all
kinds of problems. Instead, we can check the various uses to ensure we
don't deref invalid memory, and improve the logic for how contexts are
passed to children, since that was also subtly wrong in some cases.
The last version of this patch had memory leaks due to using the
BumpPtrAllocator for data types that required destructors to run to
release heap memory (e.g. via std::vector and std::string). This version
avoids that by using smart pointers, and dropping support for
BumpPtrAllocator.
We should refactor this code to use the BumpPtrAllocator again, but that
can be addressed in future patches, since those are more invasive
changes that need to refactor many of the core data types to avoid
owning allocations.
Adds Support for the Mustache Templating Language. See specs here:
https://mustache.github.io/mustache.5.html
This patch implements support+tests for the majority of the features of
the language including:
- Variables
- Comments
- Lambdas
- Sections
This is meant as a library to support places where we have to generate
HTML, such as in clang-doc.
Co-authored-by: Peter Chou <peter.chou@mail.utoronto.ca>
Reapply https://github.com/llvm/llvm-project/pull/130732
Fixes errors that broke a build bot that uses GCC as its compiler:
https://lab.llvm.org/buildbot/#/builders/66/builds/11049
GCC emitted a warning because std::move was applied to a temporary object,
which prevents copy elision. This patch fixes the issue by removing the
std::move.
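The pattern behind this kind of warning can be shown in a few lines. The types and functions below are hypothetical and are not the code that was changed; the flagged form appears only in a comment, and the diagnostic name given there (`-Wpessimizing-move`) is what recent GCC versions use for it, which may not match the bot's exact output.

```cpp
#include <string>
#include <utility>
#include <vector>

struct NodeSketch {
  std::string Name;
  std::vector<int> Children;
};

NodeSketch makeNodeSketch(std::string Name) {
  return NodeSketch{std::move(Name), {}};
}

NodeSketch buildRootSketch() {
  // Flagged form (e.g. by -Wpessimizing-move): wrapping the temporary in
  // std::move forces a move construction instead of letting the compiler
  // build the result in place.
  //   NodeSketch N = std::move(makeNodeSketch("root"));

  // Fixed form: the temporary initializes N directly.
  NodeSketch N = makeNodeSketch("root");
  return N;
}
```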
Adds Support for the Mustache Templating Language. See specs here:
https://mustache.github.io/mustache.5.html
This patch implements support+tests for the majority of the features of
the language including:
- Variables
- Comments
- Lambdas
- Sections
This is meant as a library to support places where we have to generate
HTML, such as in clang-doc.
Reapply https://github.com/llvm/llvm-project/pull/105893
Fixes errors that broke a build bot that uses GCC as its compiler:
https://lab.llvm.org/buildbot/#/builders/136/builds/3100
The issue was that Accessor, defined in the anonymous namespace, is a
type alias that was later redeclared as a member with the same name in
the classes Token and ASTNode, which is an error in GCC. This patch fixes
it by renaming Accessor to AccessorValue. It also fixes compiler warnings
related to initialization.
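A reduced, hypothetical reproduction of the name clash is sketched below. The rejected declaration is kept in a comment. Which side of the clash the patch renamed is not spelled out above, so renaming the member here is an assumption, and `TokenWithAccessorSketch` is an illustrative class rather than the real Token.

```cpp
#include <string>
#include <utility>
#include <vector>

namespace {
using Accessor = std::vector<std::string>;
} // namespace

class TokenWithAccessorSketch {
public:
  explicit TokenWithAccessorSketch(Accessor A) : AccessorValue(std::move(A)) {}
  const Accessor &getAccessor() const { return AccessorValue; }

private:
  // Rejected by GCC, because the member name changes what "Accessor" means
  // inside the class after it was already used as a type:
  //   Accessor Accessor;
  Accessor AccessorValue; // distinct name, so the alias stays unambiguous
};
```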
Adds Support for the Mustache Templating Language. See specs here:
https://mustache.github.io/mustache.5.html
This patch implements support+tests for the majority of the features of
the language including:
- Variables
- Comments
- Lambdas
- Sections
This is meant as a library to support places where we have to generate
HTML, such as in clang-doc.