For some ABIs `update_cc_test_checks.py` is unable to generate tests
because of the mismatch between the mangled function names reported by
clang's `-ast-dump` and the function names in LLVM IR.
This patch fixes it by stripping the leading underscore from the mangled
name for global functions if the data layout string says they have one.
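The underscore handling described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper names are hypothetical, and it assumes that the `m:o` (Mach-O) and `m:x` (Windows x86 COFF) mangling modes in the data layout string are the ones that prefix global symbols with an underscore.

```python
# Assumption: these data layout mangling modes add a leading '_' to
# global symbols ("o" = Mach-O, "x" = Windows x86 COFF).
UNDERSCORE_MANGLING_MODES = {"o", "x"}


def has_underscore_prefix(data_layout):
    """Return True if the layout's `m:<mode>` component implies a '_' prefix."""
    for spec in data_layout.split("-"):
        if spec.startswith("m:"):
            return spec[2:] in UNDERSCORE_MANGLING_MODES
    return False


def ir_function_name(mangled, data_layout):
    """Map a mangled name from the AST dump to the name expected in LLVM IR."""
    if has_underscore_prefix(data_layout) and mangled.startswith("_"):
        return mangled[1:]
    return mangled
```

With a Mach-O style layout, `ir_function_name("__Z3foov", "e-m:o-i64:64")` drops one underscore, while an ELF layout (`m:e`) leaves the name untouched.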
The update_mc_test_check script handles the "error case testline" wrong in
three cases:
1. When the user selects "--llvm-mc-binary" with a path, the script does not
add "not" in front of the "--llvm-mc-binary" path, so it gets a non-zero
exit code and fails.
2. When "not" is present in the runline but not all testlines are
expected to fail, the script needs to check whether "not" is needed when
it executes llvm-mc line by line. Otherwise the script will fail on a
testline which passes.
3. When there are multiple runlines, the error checkline needs to use the
correct line offset for "[[LINE-X]]".
This patch solves these three issues.
Add a unique and a sort option to the update_mc_test_check script.
These mc asm/dasm files usually have a large number of lines, and these
lines are mostly similar to each other. These options can be useful when
a maintainer is merging or resolving conflicts, by making the files
identical.
Also fix a small issue in asm/dasm where the auto-generated header
line used the wrong comment marker:
1. asm used ";" instead of "//" as the comment marker
2. dasm used ";" instead of "#" as the comment marker
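The effect of the unique and sort options can be sketched like this. The function names and keyword arguments are hypothetical and only illustrate the idea of normalizing test lines; they are not the script's actual interface.

```python
def unique_lines(lines):
    """Drop repeated lines, keeping the first occurrence of each."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result


def normalize_test_lines(lines, unique=False, sort=False):
    """Apply the optional de-duplication and sorting steps, in that order."""
    result = list(lines)
    if unique:
        result = unique_lines(result)
    if sort:
        result = sorted(result)
    return result
```

Two copies of a test file that contain the same test lines in a different order, or with repeats, normalize to the same list, which is what makes conflict resolution easier.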
Old versions of UTC produced function labels like:
; CHECK-LABEL: @func(
Fix the regular expression used when scanning for old check lines to
recognize this form of label.
This allows meta variable stability to apply when running UTC on tests
using this form of label.
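As a rough illustration, a pattern that accepts this label form next to a `define`-style label could look like the sketch below. The regular expression and the helper name are assumptions made for this example and are not the expression used in UTC.

```python
import re

# Hypothetical pattern: accepts "; CHECK-LABEL: @func(" as well as
# "; CHECK-LABEL: define ... @func(".
LABEL_RE = re.compile(
    r"^\s*;\s*(?P<prefix>[\w-]+)-LABEL:\s*(?:define\b.*?)?@(?P<func>[\w.$-]+)\("
)


def label_function(line):
    """Return the function name of a label check line, or None."""
    m = LABEL_RE.match(line)
    return m.group("func") if m else None
```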
Reported-by: Nikita Popov <npopov@redhat.com>
By default, UTC attempts to keep the produced diff small by keeping IR
value name variables stable. The old algorithm was roughly:
1. Compute a diff between the old and new check lines, where
"uncommitted" variable names are replaced by a wildcard.
This leads to a set of non-crossing "candidate" pairs of
(old line, new line) that we can try to make equal.
2. Greedily walk this list of candidates, committing to variable names
that make candidate lines equal if possible.
The greedy approach in the second step has the downside that committing
to a variable name greedily can sometimes prevent many subsequent
candidates from getting the variable name assignment that would make
them equal.
We keep the first step as-is, but replace the second one by an algorithm
that finds a large independent set of candidates, i.e. candidate pairs
of (old line, new line) which are non-conflicting in the sense that
their desired variable name mappings are not in conflict.
We find the large independent set by greedily assigning a coloring to
the conflict graph and taking the largest color class. We then commit to
all the variable name mappings which are desired by candidates in this
largest color class.
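The coloring step can be sketched as follows. This is a simplified illustration under stated assumptions: each candidate is modeled as a dict of desired variable name mappings, and two candidates conflict when they disagree on either side of a mapping. The function names and the data model are hypothetical.

```python
from collections import defaultdict


def conflicts(a, b):
    """Two mappings conflict if they disagree on a shared key or a shared value."""
    for key, value in a.items():
        if key in b and b[key] != value:
            return True
    b_inverse = {value: key for key, value in b.items()}
    for key, value in a.items():
        if value in b_inverse and b_inverse[value] != key:
            return True
    return False


def largest_color_class(candidates):
    """Greedily color the conflict graph; return indices of the largest class."""
    colors = []
    for i, cand in enumerate(candidates):
        # Colors already taken by earlier, conflicting candidates.
        used = {colors[j] for j in range(i) if conflicts(cand, candidates[j])}
        color = 0
        while color in used:
            color += 1
        colors.append(color)
    classes = defaultdict(list)
    for idx, color in enumerate(colors):
        classes[color].append(idx)
    return max(classes.values(), key=len) if classes else []
```

Candidates that share a color never conflict with each other, so every color class is an independent set and all of its mappings can be committed together.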
As before, we then recurse into regions between matching lines. This is
required in large cases. For example, running this algorithm at the
top-level of the new test case (stable_ir_values5.ll) matches up most of
the instructions, but not the names of the result values of all the
`load`s. This is because (unlike e.g. the getelementptrs) the load
instructions are all equal except for variable names, and so step 1 (the
diff algorithm) doesn't consider them as candidates. However, they are
trivially matched by recursion.
This also happens to fix a bug in tracking line indices that went
unnoticed previously...
As is usually the case with these changes, the quality improvement is
hard to see from the diff of this patch. However, it becomes obvious
when comparing the diff of stable_ir_values5.ll against
stable_ir_values5.ll.expected before and after this change.
Added a script to update the test file generated by llvm-mc binary. The
script accepts .s and .txt for asm and dasm.
For the mc tests I am targeting there is no function name which can be
used as a key, thus no clear mapping between input and output. The script
assumes the tests are always line-by-line and it updates the output marker
for each test line-by-line.
---------
Co-authored-by: Alexander Richardson <mail@alexrichardson.me>
There is no need to support Python 2.7 anymore, and Python 3.3+ has
`subprocess.DEVNULL`. Using it is good practice and also prevents file
handles from staying open unnecessarily.
Also remove a couple of unused or unneeded `__future__` imports.
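For reference, a minimal sketch of the pattern: stderr is discarded through `subprocess.DEVNULL` instead of a manually opened `os.devnull` handle. The helper name is hypothetical.

```python
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd, discard its stderr, and return its stdout as text."""
    return subprocess.check_output(cmd, stderr=subprocess.DEVNULL, text=True)
```

For example, `run_quietly([sys.executable, "-c", "print('ok')"])` returns the child's stdout without leaving a file handle behind.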
Tweak the LoopDistribute debug output to be prefixed with "LDist: ", get
it to be stable, and extend update_analyze_test_checks.py trivially to
support this output.
Labels are matched using a regexp of the form '^(pattern):', which
requires the addition of a "suffix" concept to NamelessValue.
Aside from that, the key challenge is that block labels are values, and
we typically capture values including the prefix '%'. However, when
labels appear at the start of a basic block, the prefix '%' is not
included, so we must capture block label values *without* the prefix
'%'.
We don't know ahead of time whether an IR value is a label or not. In
most cases, they are prefixed by the word "label" (their type), but this
isn't the case in phi nodes. We solve this issue by leveraging the
two-phase nature of variable generalization: the first pass finds all
occurrences of a variable and determines whether the '%' prefix can be
included or not. The second pass does the actual substitution.
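A heavily simplified sketch of the two-phase idea follows. It is not the NamelessValue machinery from the patch: the regular expressions, the function name, and the variable naming scheme are assumptions chosen for illustration. It only treats unindented `name:` lines as block labels.

```python
import re

USE_RE = re.compile(r"%([A-Za-z_][\w.]*)")
LABEL_DEF_RE = re.compile(r"^([A-Za-z_][\w.]*):")


def generalize(lines):
    # Pass 1: a name that appears as a block label has no '%' there, so
    # its capture must exclude the prefix at every occurrence.
    no_prefix = set()
    for line in lines:
        m = LABEL_DEF_RE.match(line)
        if m:
            no_prefix.add(m.group(1))

    defined = set()

    def ref(name, pattern):
        var = name.upper().replace(".", "_")
        if name in defined:
            return "[[%s]]" % var
        defined.add(name)
        return "[[%s:%s]]" % (var, pattern)

    def sub_use(m):
        name = m.group(1)
        if name in no_prefix:
            return "%" + ref(name, ".*")  # '%' stays outside the capture
        return ref(name, "%.*")  # '%' is part of the capture

    def sub_def(m):
        return ref(m.group(1), ".*") + ":"

    # Pass 2: perform the substitution with the decision from pass 1.
    out = []
    for line in lines:
        line = LABEL_DEF_RE.sub(sub_def, line)
        out.append(USE_RE.sub(sub_use, line))
    return out
```

In this sketch a phi operand such as `%entry` becomes `%[[ENTRY]]`, because `entry` was seen as a label in the first pass even though the phi itself does not say `label`.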
This change also unifies the generalization path for assembly with that
for IR and analysis, in the hope that future changes do not cause those
cases to diverge.
I also considered the alternative of trying to detect the phi node case
using more regular expression special cases but ultimately decided
against that because it seemed more fragile, and perhaps the approach of
keeping a tentative prefix that may later be discarded could also be
eventually applied to some metadata and attribute cases.
Note that an early version of this change was reviewed as
https://reviews.llvm.org/D142452, before version numbers were
introduced. This is a substantially updated version of that change.
As we've added new IR elements for the RemoveDIs project,
we need the update_test_checks script to understand them. For the
records themselves this is already done automatically, but their
metadata arguments are not recognized as such due to lacking the
`metadata` prefix, which means they won't be checked by the script. This
patch fixes this by adding a check for all `![0-9]+` patterns as long as
they are not at the start of a line (which avoids matching global
values).
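The matching rule can be sketched per line as below. The helper name and the sample input lines are illustrative assumptions, not taken from the patch.

```python
import re

METADATA_REF_RE = re.compile(r"![0-9]+")


def metadata_refs(line):
    """Return metadata ids such as '!12' that do not start the line."""
    return [m.group(0) for m in METADATA_REF_RE.finditer(line) if m.start() > 0]
```

A definition line such as `!12 = !DILocalVariable(...)` starts with the id, so the leading `!12` is skipped, while ids used as arguments inside a record are returned.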
Collect the original check lines in a manner that is independent of
where the check lines appear in the file. This is so that we keep
FileCheck variable names stable even when --include-generated-funcs is
used.
Reported-by: Ruiling Song <ruiling.song@amd.com>
Resubmitting this after previous revert with the following changes:
- Split table into table_rhs_idx and table_candidate_idx so that
bisect.bisect_left can be used without the `key` argument, which
was introduced in Python 3.10
- Remove a re.Pattern type annotation
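The table split can be illustrated as follows. The names table_rhs_idx and table_candidate_idx come from the change description; the table contents and the lookup helper are made up for this sketch.

```python
import bisect

# Hypothetical (rhs_idx, candidate_idx) pairs, sorted by rhs_idx.
table = [(1, 0), (4, 2), (7, 1), (9, 3)]

# Keeping the sort key in its own list avoids the `key` argument:
#   bisect.bisect_left(table, 7, key=lambda e: e[0])  # Python 3.10+ only
table_rhs_idx = [rhs for rhs, _ in table]
table_candidate_idx = [cand for _, cand in table]


def lookup(rhs_idx):
    """Candidate index of the first entry whose rhs_idx is >= the argument."""
    pos = bisect.bisect_left(table_rhs_idx, rhs_idx)
    if pos == len(table_rhs_idx):
        return None
    return table_candidate_idx[pos]
```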
Original commit message:
Prior to this change, running UTC on larger tests, especially tests
with unnamed IR values, often resulted in a spuriously large diff
because e.g. TMPnn variables in the CHECK lines were renumbered. This
change attempts to reduce the diff by keeping those variable names the
same.
There are cases in which this "drift" of variable names can end up being
more confusing. The old behavior can be re-enabled with the
--reset-variable-names command line argument.
The improvement may not be immediately apparent in the diff of this change.
The point is that the diff of stable_ir_values.ll against
stable_ir_values.ll.expected after this change is smaller.
Ideally, we'd also keep meta variables for "global" objects stable, e.g.
for attributes (#nn) and metadata (!nn). However, that would require a
much more substantial refactoring of how we generate check lines, so I
left it for future work.
When removing only lines that are global value CHECK lines, a related
CHECK-SAME line could be left dangling without a previous line to belong
to.
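One way to avoid the dangling line, sketched under the assumption that a CHECK-SAME line always belongs to the check line directly before it. The regular expression and the function name are hypothetical.

```python
import re

CHECK_SAME_RE = re.compile(r"^\s*(?:;|//)\s*[\w-]+-SAME:")


def remove_lines(lines, should_remove):
    """Drop lines selected by should_remove, together with any CHECK-SAME
    lines that directly follow a dropped line."""
    kept = []
    dropping = False
    for line in lines:
        if should_remove(line):
            dropping = True
            continue
        if dropping and CHECK_SAME_RE.match(line):
            continue  # continuation of a line that was just removed
        dropping = False
        kept.append(line)
    return kept
```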
Resolves #78517
- Change `BranchProbabilityPrinterPass` output to match expectations of `update_analyze_test_checks.py`.
- Add `Branch Probability Analysis` to list of supported analyses.
- Process `llvm/test/Analysis/BranchProbabilityInfo/basic.ll` with `update_analyze_test_checks.py` as proof of concept. Leaving the other tests unchanged to reduce the amount of churn.
Recommits the changes from https://reviews.llvm.org/D148216.
Explicitly named globals are now matched literally, instead of emitting
a capture group for the name. This resolves #70047.
Metadata and annotations, on the other hand, are captured and matched
against by default, since their identifiers are not stable.
The reasons for revert (#63746) have been fixed:
The first issue, that of duplicated checkers, has already been resolved
in #70050.
This PR resolves the second issue listed in #63746, regarding the order
of named and unnamed globals. This is fixed by recording the index of
substrings containing global values, and sorting the checks according to
that index before emitting them. This results in global value checks
being emitted in the order they were seen instead of being grouped
separately.
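The ordering fix amounts to a sort on the recorded index. A minimal sketch, assuming each check is stored as a hypothetical (index, check line) pair:

```python
def order_global_checks(checks):
    """Emit check lines in the order their global values were seen."""
    return [line for _, line in sorted(checks, key=lambda c: c[0])]


named = [(0, "; CHECK: @a = global i32 0"), (2, "; CHECK: @c = global i32 2")]
unnamed = [(1, "; CHECK: [[GLOB0:@.*]] = global i32 1")]
ordered = order_global_checks(named + unnamed)
```

Here `ordered` interleaves the unnamed global between `@a` and `@c` instead of emitting the two groups one after the other.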
SCEV expressions may contain multiple {{ or }} in the debug output,
which needs escaping.
See
llvm/test/Analysis/LoopAccessAnalysis/loops-with-indirect-reads-and-writes.ll
for a test that needs escaping.
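FileCheck treats `{{...}}` as a regular expression block, so a literal run of braces has to be rewritten. One possible escaping, shown only as a sketch and not necessarily the form the script emits, wraps each run of two or more braces in a regex block made of single-character classes. The names below are hypothetical.

```python
import re

# Runs of two or more '{' or two or more '}'.
BRACE_RUN_RE = re.compile(r"\{\{+|\}\}+")


def escape_braces(text):
    """Rewrite literal brace runs so FileCheck does not read them as regexes."""

    def repl(m):
        return "{{" + "".join("[%s]" % c for c in m.group(0)) + "}}"

    return BRACE_RUN_RE.sub(repl, text)
```

Single braces, as in `{0,+,1}<%loop>`, are left alone because they have no special meaning to FileCheck.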
update_analyze_test_checks.py is an invaluable tool in updating tests.
Unfortunately, it only supports output from the CostModel,
ScalarEvolution, and LoopVectorize analyses. Many LoopAccessAnalysis
tests use hand-crafted CHECK lines, and it is moreover tedious to
generate these CHECK lines, as the output from the analysis is not
stable, and requires the test-writer to hand-craft FileCheck matches.
Alleviate this pain, and support output from:
$ opt -passes='print<loop-accesses>'
This patch includes several non-trivial changes including:
- Preserving whitespace at the beginning of the line, so that the LAA
output can be properly indented.
- Regexes matching the unstable output, which is basically a pointer
address hex.
- Separating is_analyze from preserve_names clearly, as the former was
formerly used as an overload for the latter.
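The first two points can be illustrated with a small sketch that replaces hex addresses with a FileCheck regex block while leaving indentation untouched. The pattern, the function name, and the sample line are assumptions for this example.

```python
import re

HEX_ADDR_RE = re.compile(r"\b0x[0-9a-fA-F]+\b")


def scrub_addresses(line):
    """Replace pointer-like hex values with a FileCheck regex block."""
    return HEX_ADDR_RE.sub("{{0x[0-9a-fA-F]+}}", line)
```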
To demonstrate the utility of this patch, several tests in
LoopAccessAnalysis have been auto-generated by
update_analyze_test_checks.py.
Previously when using `-p` a.k.a. `--preserve-names` existing lines for
checking globals were not recognised as such, leading to the line being
kept while also being emitted again, resulting in duplicated CHECK
lines.
This resolves #70048.
update_analyze_test_checks.py currently outputs a warning when updating
a script with the run line:
$ opt -passes='print<scalar-evolution>'
saying that the script doesn't support its output, when it indeed does,
as evidenced by several tests in test/Analysis/ScalarEvolution generated
by this script. There is even a test for update_analyze_test_checks that
makes sure that SCEV output is supported. Hence, squelch the warning.
While at it, rename the update_analyze_test_checks test from basic.ll to
a more explicit scev.ll.
If the function argument block contains patterns, we split argument
matching into a separate SAME line, because LABEL labels may not contain
pattern matches.
Until now, in this case we moved the parenthesis opening the argument block
into the second line.
This generates incorrect labels in case function names are not prefix-free.
For example, for a function `foo` we generated:
CHECK-LABEL: foo
CHECK-SAME: (<args of foo>)
If the output also contains a function `foo.specialized`, then the label for
`foo` can match `foo.specialized`, depending on output order.
This patch moves opening parenthesis to the first line, breaking common prefixes:
CHECK-LABEL: foo(
CHECK-SAME: <args of foo>)
Bump the UTC version to 3, and only move the parenthesis for version 3 and later.
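The version-dependent placement of the parenthesis can be sketched as follows. The function name and its parameters are hypothetical; only the two output shapes are taken from the description above.

```python
def label_lines(prefix, func, args, version):
    """Build the LABEL/SAME pair for a function whose arguments contain patterns."""
    if version >= 3:
        # Parenthesis on the label line, so 'foo(' cannot match 'foo.specialized('.
        return ["%s-LABEL: %s(" % (prefix, func), "%s-SAME: %s)" % (prefix, args)]
    return ["%s-LABEL: %s" % (prefix, func), "%s-SAME: (%s)" % (prefix, args)]
```

The old label text `foo` is a substring of `foo.specialized(`, while the new label text `foo(` is not.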
Differential Revision: https://reviews.llvm.org/D158497
Without this we cannot update various clang OpenMP tests as the UTC_ARGS
version of -global-value-regex is simply ignored. The handling of the
flag should be changed to be in line with others, I left TODOs for now.
Both the pattern for finding the clang version metadata, and the emitted
checker, are now more robust, to handle a vendor prefix.
Differential Revision: https://reviews.llvm.org/D154520
This prevents update_cc_test_checks.py from emitting hard-coded identifiers
for metadata (global variable checkers still check hard-coded
identifiers). Instead it emits regex checkers that match even if the
identifiers change. Also adds a new mode for --check-globals: instead of
simply being on or off, it now has the options 'none', 'smart' and
'all', with 'none' and 'all' corresponding to the previous modes.
The 'smart' mode only emits checks for global definitions referenced
in the IR or other metadata that itself has a definition checker
emitted, making the rule transitive. It does not emit checks for
attribute sets, since that is better checked by --check-attributes. This
mode is made the new default. To make the change in default mode
backwards compatible a version bump is introduced (to v3), and the
default remains 'none' in v1 & v2.
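The transitive rule of the 'smart' mode can be sketched with a worklist. The data model (a dict from each definition to the names it references) and the function name are assumptions for this illustration.

```python
def smart_checked_globals(ir_refs, definitions):
    """Return the definitions that 'smart' mode would check.

    ir_refs: names referenced directly from the function IR.
    definitions: name -> set of names referenced by that definition.
    """

    def wanted(name):
        # Attribute sets (#n) are left to --check-attributes.
        return name in definitions and not name.startswith("#")

    checked = set()
    worklist = [n for n in ir_refs if wanted(n)]
    while worklist:
        name = worklist.pop()
        if name in checked:
            continue
        checked.add(name)
        worklist.extend(r for r in definitions[name] if wanted(r))
    return checked
```

A definition that nothing checked refers to, directly or indirectly, is never added.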
This will result in metadata checks being emitted more often, so filters
are added to not check absolute file paths and compiler version git
hashes.
rdar://105239218
This patch enables --function-signature by default under --version 2
and makes --version 2 the default. This means that all newly created
tests will check the function signature, while leaving old tests alone.
There are two motivations for this change:
* Without --function-signature, the generated check lines may fail
in a very hard to understand way if the test both includes a
function definition and a call to that function. (Though we could
address this by making the CHECK-LABEL stricter, without checking
the full signature.)
* This actually checks that uses of the arguments in the function
body use the correct argument, instead of matching against any
variable.
This is a replacement for D139006 and D140212 based on the
--version mechanism.
I did not include an opt-out flag --no-function-signature because
I'm not sure we need it. Would be happy to include it though,
if desired.
Differential Revision: https://reviews.llvm.org/D145149
If --function-signature is used with --version 2, then also include
the return type/attributes in the check lines. This is the
implementation of D133943 rebased on the --version mechanism from
D142473.
This doesn't bump the default version yet, because I'd like to do
that together with D140212 (which enables --function-signature by
default), as these changes seem closely related. For now this
functionality can be accessed by explicitly passing --version 2
to UTC.
Fixes https://github.com/llvm/llvm-project/issues/61058.
Differential Revision: https://reviews.llvm.org/D144963
We have a number of pending changes to update_test_checks.py
(and friends) that are essentially blocked on test churn:
If the output of UTC for an existing flag combination changes,
then the next time a test is regenerated, it will contain many
spurious changes. This makes changes to UTC default
behavior essentially impossible.
Examples of such changes are:
* D133943/D142373 want --function-signature to also check the
return type/attributes.
* D139006/D140212 want to make --function-signature the default
behavior.
* D142452 wants to add wildcards for block labels.
This patch tries to resolve this issue by adding a --version
argument, which works as follows:
* When regenerating an old test, the default version is 1.
* When generating a new test, the default version is the newest.
When an explicit version is specified, that of course wins.
This means that any currently existing tests will keep using
--version 1 format, while any new tests will automatically embed
--version N where N is the latest version, and then keep using
that test format from then on.
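The selection rule can be sketched as below. The function name, the way the existing header is passed in, and the value of LATEST_VERSION are assumptions for this example; this patch itself does not bump the default version.

```python
import re

LATEST_VERSION = 3  # hypothetical value, used only for this sketch


def effective_version(explicit, existing_header):
    """Pick the UTC version for a test.

    explicit: value of --version on the command line, or None.
    existing_header: the autogenerated NOTE line of an old test, or None
    if the test is new.
    """
    if explicit is not None:
        return explicit  # an explicit version always wins
    if existing_header is None:
        return LATEST_VERSION  # new tests get the newest format
    m = re.search(r"--version (\d+)", existing_header)
    return int(m.group(1)) if m else 1  # old tests without a version stay on 1
```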
This patch only implements the --version flag without bumping
the default version, so it does not have any visible behavior
change by itself.
Differential Revision: https://reviews.llvm.org/D142473
Remove global_ir_{prefix,prefix_regexp} (one of which is misnamed),
since they are really quite redundant with ir_{prefix,regexp} and
default the is_before_functions argument, which basically just adds
noise to the table of NamelessValues.
Differential Revision: https://reviews.llvm.org/D142451