update_mc_test_check script handle the "error case testline" wrong in
three cases:
1. when user select "--llvm-mc-binary" with a path, the script does not
add "not" on top of the "--llvm-mc-binary" and thus getting non-zero
exit code and failed.
2. When "not" is presented in runline while not all testlines are
expected to fail, the script need to check if the "not" is needed when
it execute llvm-mc line by line. Otherwise the script will fail on
testline which is passing.
3. When there are multiple runlines, the error checkline need to use
correct line offset for "[[LINE-X]]"
This patch solve these three issues
add a unique and a sort option to the update_mc_test_check script.
These mc asm/dasm files are usually large in number of lines, and these
lines are mostly similar to each other. These options can be useful when
maintainer is merging or resolving conflicts by making the file
identifical
Also fixed a small issue in asm/dasm such that the auto generated header
line is
1. asm using ";" instead of "//" as comment marker
2. dasm using ";" instead of "#" as comment marker
Ensures that struct padding is not skipped, as it may contain actual
data if the struct is really a union.
The patch originated from a discussion on #53710Fixes#53710
The previous error test line is using a 16bit instruction to indicate an
error. However this is a poor pick.
The 16bit instructions on AMDGPU is under development and thus, some
downstream branches are not showing this exact error message. Changing
it to another error dasm code.
Old versions of UTC produced function labels like:
; CHECK-LABEL: @func(
Fix the regular expression used when scanning for old check lines to
recognize this form of label.
This allows meta variable stability to apply when running UTC on tests
using this form of label.
Reported-by: Nikita Popov <npopov@redhat.com>
By default, UTC attempts to keep the produced diff small by keeping IR
value name variables stable. The old algorithm was roughly:
1. Compute a diff between the old and new check lines, where
"uncommitted" variable names are replaced by a wildcard.
This leads to a set of non-crossing "candidate" pairs of
(old line, new line) that we can try to make equal.
2. Greedily walk this list of candidates, committing to variable names
that make candidate lines equal if possible.
The greedy approach in the second step has the downside that committing
to a variable name greedily can sometimes prevent many subsequent
candidates from getting the variable name assignment that would make
them equal.
We keep the first step as-is, but replace the second one by an algorithm
that finds a large independent set of candidates, i.e. candidate pairs
of (old line, new line) which are non-conflicting in the sense that
their desired variable name mappings are not in conflict.
We find the large independent set by greedily assigning a coloring to
the conflict graph and taking the largest color class. We then commit to
all the variable name mappings which are desired by candidates in this
largest color class.
As before, we then recurse into regions between matching lines. This is
required in large cases. For example, running this algorithm at the
top-level of the new test case (stable_ir_values5.ll) matches up most of
the instructions, but not the names of the result values of all the
`load`s. This is because (unlike e.g. the getelementptrs) the load
instructions are all equal except for variable names, and so step 1 (the
diff algorithm) doesn't consider them as candidates. However, they are
trivially matched by recursion.
This also happens to fix a bug in tracking line indices that went
unnoticed previously...
As is usually the case with these changes, the quality improvement is
hard to see from the diff of this patch. However, it becomes obvious
when
comparing the diff of stable_ir_values5.ll against
stable_ir_value5.ll.expected
before and after this change.
Added a script to update the test file generated by llvm-mc binary. The
script accepts .s and .txt for asm and dasm.
For mc test I am targetting there is no function name which can be used
as a key, thus no clear mapping between input and output. The script
assumes the test are always line-by-line and it update the output marker
for each test line-by-line.
---------
Co-authored-by: Alexander Richardson <mail@alexrichardson.me>
The Hi result is sometimes calculated a different way and this
node goes unused. Defer creation until we know for sure it is neeeded.
The test changes is because the node creation order changed the names
in the debug output.
Use getBestVF to select VF up-front and only use
selectVectorizationFactor to get the VF legacy VF to check the
vectorization decision matches the VPlan-based cost model.
PR: https://github.com/llvm/llvm-project/pull/103033
This fixes DAG divergence mishandling inline asm.
This was considering the glue nodes for divergence, when the
divergence should only come from the individual CopyFromRegs
that are glued. As a result, having any VGPR CopyFromRegs would
taint any uniform SGPR copies as divergent, resulting in SGPR
copies to VGPR virtual registers later.
This reverts commit adaff46d087799072438dd744b038e6fd50a2d78.
Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.
Removing it from the codegen pipeline induces a lot of test churn
because llc is no longer optimizing out implicit arguments to kernels.
Mostly mechanical, but there are some creative test updates. I preferred
to take the changes as-is in tests where the ABI isn't relevant. In
cases where it's more relevant, or the optimize out logic was too
ingrained in the test, I pre-run the optimization. Some cases manually
add attributes to disable inputs.
This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92.
A number of crashes have been fixed by separate fixes, including
ttps://github.com/llvm/llvm-project/pull/96622. This version of the
PR also pre-computes the costs for branches (except the latch) instead
of computing their costs as part of costing of replicate regions, as
there may not be a direct correspondence between original branches and
number of replicate regions.
Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.
It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.
The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.
Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.
PR: https://github.com/llvm/llvm-project/pull/92555
Tweak the LoopDistribute debug output to be prefixed with "LDist: ", get
it to be stable, and extend update_analyze_test_checks.py trivially to
support this output.
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
A full downstream fork can already hack up update_llc_test_checks.py to
support custom tools that output assembly.
An out-of-tree frontend which is meant to build against upstream
llvm-project cannot do this, and so providing additional arguments to
support a non-standard tool is useful.
This also makes a minor adjustment to the regular expression for
matching AMDGPU functions when fewer comments are enabled, which happens
to be the case for our out-of-tree shader compiler (which motivated this
change).
Labels are matched using a regexp of the form '^(pattern):', which
requires the addition of a "suffix" concept to NamelessValue.
Aside from that, the key challenge is that block labels are values, and
we typically capture values including the prefix '%'. However, when
labels appear at the start of a basic block, the prefix '%' is not
included, so we must capture block label values *without* the prefix
'%'.
We don't know ahead of time whether an IR value is a label or not. In
most cases, they are prefixed by the word "label" (their type), but this
isn't the case in phi nodes. We solve this issue by leveraging the
two-phase nature of variable generalization: the first pass finds all
occurences of a variable and determines whether the '%' prefix can be
included or not. The second pass does the actual substitution.
This change also unifies the generalization path for assembly with that
for IR and analysis, in the hope that any future changes avoid diverging
those cases future.
I also considered the alternative of trying to detect the phi node case
using more regular expression special cases but ultimately decided
against that because it seemed more fragile, and perhaps the approach of
keeping a tentative prefix that may later be discarded could also be
eventually applied to some metadata and attribute cases.
Note that an early version of this change was reviewed as
https://reviews.llvm.org/D142452, before version numbers were
introduced. This is a substantially updated version of that change.
Generally, IR and assembly test files benefit from being cleaned to
remove unnecessary details. However, for tests requiring elaborate
IR or assembly files where cleanup is less practical (e.g., large amount
of debug information output from Clang), the current practice is to
include the C/C++ source file and the generation instructions as
comments.
This is inconvenient when regeneration is needed. This patch adds
`llvm/utils/update_test_body.py` to allow easier regeneration.
`ld.lld --debug-names` tests (#86508) utilize this script for
Clang-generated assembly tests.
Note: `-o pipefail` is standard (since
https://www.austingroupbugs.net/view.php?id=789) but not supported by
dash.
Link:
https://discourse.llvm.org/t/utility-to-generate-elaborated-assembly-ir-tests/78408
As we've added new IR elements for the RemoveDIs project,
we need the update_test_checks script to understand them. For the
records themselves this is already done automatically, but their
metadata arguments are not recognized as such due to lacking the
`metadata` prefix, which means they won't be checked by the script. This
patch fixes this by adding a check for all `![0-9]+` patterns as long as
they are not at the start of a line (which avoids matching global
values).
Collect the original check lines in a manner that is independent of
where the check lines appear in the file. This is so that we keep
FileCheck variable names stable even when --include-generated-funcs is
used.
Reported-by: Ruiling Song <ruiling.song@amd.com>
We check DAG.haveNoCommonBitsSet so the operands will be known to be
disjoint.
I couldn't think of a codegen test case since most targets aren't
checking hasDisjoint yet, apart from RISCV in the or_is_add pattern, but
it also falls back to computeKnownBits.
Since some of the users of `CodeExtractor` like `HotColdSplitting` run
late in the pipeline, returns are not cleaned to `unreachable`. So,
just emit `unreachable` directly if the function is `noreturn`.
Closes#84682
Resubmitting this after previous revert with the following changes:
- Split table into table_rhs_idx and table_candidate_idx so that
bisect.bisect_left can be used without the `key` argument, which
was introduced in Python 3.10
- Remove a re.Pattern type annotation
Original commit message:
Prior to this change, running UTC on larger tests, especially tests
with unnamed IR values, often resulted in a spuriously large diff
because e.g. TMPnn variables in the CHECK lines were renumbered. This
change attempts to reduce the diff by keeping those variable names the
same.
There are cases in which this "drift" of variable names can end up being
more confusing. The old behavior can be re-enabled with the
--reset-variable-names command line argument.
The improvement may not be immediately apparent in the diff of this change.
The point is that the diff of stable_ir_values.ll against
stable_ir_values.ll.expected after this change is smaller.
Ideally, we'd also keep meta variables for "global" objects stable, e.g.
for attributes (#nn) and metadata (!nn). However, that would require a
much more substantial refactoring of how we generate check lines, so I
left it for future work.
Prior to this change, running UTC on larger tests, especially tests
with unnamed IR values, often resulted in a spuriously large diff
because e.g. TMPnn variables in the CHECK lines were renumbered. This
change attempts to reduce the diff by keeping those variable names the
same.
There are cases in which this "drift" of variable names can end up being
more confusing. The old behavior can be re-enabled with the
--reset-variable-names command line argument.
The improvement may not be immediately apparent in the diff of this change.
The point is that the diff of stable_ir_values.ll against
stable_ir_values.ll.expected after this change is smaller.
Ideally, we'd also keep meta variables for "global" objects stable, e.g.
for attributes (#nn) and metadata (!nn). However, that would require a
much more substantial refactoring of how we generate check lines, so I
left it for future work.
The SelectionDAG scheduling preference now becomes source order
scheduling (machine scheduler generates better code -- even without
there being a machine model defined for LoongArch yet).
Most of the test changes are trivial instruction reorderings and
differing register allocations, without any obvious performance impact.
This is similar to commit: 3d0fbafd0bce43bb9106230a45d1130f7a40e5ec
When removing only lines that are global value CHECK lines, a related
CHECK-SAME line could be left dangling without a previous line to belong
to.
Resolves#78517
We already have the PtrOff factored into MachinePointerInfo. Any calls
to getAlign on the new load with do commonAlignment with the
MachinePointerInfo offset and the base alignment.
This patch sorts stack objects by their alignment value from the largest
to the smallest. If two objects have the same alignment, then they are
sorted by their size from the largest to the smallest. This minimizes
padding and reduces run time stack size.
This should be the final portion of shaping-up the test suite to be
ready for turning on non-intrinsic debug-info:
* Pin CostModel tests that expect to see intrinsics in their -debug
output to not use RemoveDIs. This is a spurious test output difference.
* Add 'tail' to a bunch of intrinsics in UpdateTestChecks. We're
cannonicalising intrinsics to be printed with "tail" in RemoveDI
conversion as dbg.values usually pick that up while being optimised.
This is another spurious output difference.
* The "DebugInfoDrop" pass used in the debugify unit-tests happens to
operate inside the pass manager, thus it sees non-intrinsic debug-info.
Update it to correctly drop it.
When using `BranchProbabilityPrinterPass`, if a BB has no name, we get pretty unusable information like `edge -> has probability...` (i.e. we have no idea what the vertices of that edge are).
This patch uses `printAsOperand`, which uses the same naming scheme as `Function::dump`, so for example during debugging sessions, the IR obtained from a function and the names used by `BranchProbabilityPrinterPass` will match.
A shortcoming is that `printAsOperand` will result in the numbering algorithm re-running for every edge and every vertex (when `BranchProbabilityPrinterPass` is run on a function). If, for the given scenario, this is a problem, we can revisit this subsequently.
Another nuance is that the entry basic block will be numbered, which may be slightly confusing when it's anonymous, but it's easily identifiable - the first edge would have it as source (and the number should be easily recognizable)
- Change `BranchProbabilityPrinterPass` output to match expectations of `update_analyze_test_checks.py`.
- Add `Branch Probability Analysis` to list of supported analyses.
- Process `llvm/test/Analysis/BranchProbabilityInfo/basic.ll` with `update_analyze_test_checks.py` as proof of concept. Leaving the other tests unchanged to reduce the amount of churn.
Recommits the changes from https://reviews.llvm.org/D148216.
Explicitly named globals are now matched literally, instead of emitting
a capture group for the name. This resolves#70047.
Metadata and annotations, on the other hand, are captured and matched
against by default, since their identifiers are not stable.
The reasons for revert (#63746) have been fixed:
The first issue, that of duplicated checkers, has already been resolved
in #70050.
This PR resolves the second issue listed in #63746, regarding the order
of named and unnamed globals. This is fixed by recording the index of
substrings containing global values, and sorting the checks according to
that index before emitting them. This results in global value checks
being emitted in the order they were seen instead of being grouped
separately.
SCEV expressions may contain multiple {{ or }} in the debug output,
which needs escaping.
See
llvm/test/Analysis/LoopAccessAnalysis/loops-with-indirect-reads-and-writes.ll
for a test that needs escaping.
Previously the kill flags of the source register were unconditionally
cleared when a `str` pair was merged, which results in suboptimal
register allocation and inhibits some renaming opportunities which may
allow further merging `str`.
update_analyze_test_checks.py is an invaluable tool in updating tests.
Unfortunately, it only supports output from the CostModel,
ScalarEvolution, and LoopVectorize analyses. Many LoopAccessAnalysis
tests use hand-crafted CHECK lines, and it is moreover tedious to
generate these CHECK lines, as the output fom the analysis is not
stable, and requires the test-writer to hand-craft FileCheck matches.
Alleviate this pain, and support output from:
$ opt -passes='print<loop-accesses>'
This patch includes several non-trivial changes including:
- Preserving whitespace at the beginning of the line, so that the LAA
output can be properly indented.
- Regexes matching the unstable output, which is basically a pointer
address hex.
- Separating is_analyze from preserve_names clearly, as the former was
formerly used as an overload for the latter.
To demonstate the utility of this patch, several tests in
LoopAccessAnalysis have been auto-generated by
update_analyze_test_checks.py.
Previously when using `-p` a.k.a. `--preserve-names` existing lines for
checking globals were not recognised as such, leading to the line being
kept while also being emitted again, resulting in duplicated CHECK
lines.
This resolves#70048.