There was no real benefit to disallowing this, and it sometimes caused
unnecessary churn in the RUN lines of tests which were updated from
single-to-multiple or multiple-to-single prefixes.
This effectively makes -check-prefixes the primary option and
-check-prefix just an alias of it. The documentation is upated
accordingly.
Support `\g{n}`-style back references in `regcomp` as well by increasing
the limit from 9 to 20 and adding additional parsing. Update the limit
checks in FileCheck. The limit can theoretically be removed by
reallocating the regex-matchers internal arrays but I don't find a use
case for that as of now.
Update a test that now should pass when using more than 9
back-references.
Add a new test that tests for the error message explicitly..
Define lit testsuite for FileCheck and TableGen with smaller set of
dependencies. This uses the new `SKIP` argument to `add_lit_testsuites`
that was added in https://github.com/llvm/llvm-project/pull/157176/.
* When `--enable-var-scope` is active,
`lib/FileCheck.cpp#clearLocalVars` gets called.
* That function loops through `GlobalNumericVariableTable` and then
calls `NumericVariable::clear` on most items. It also removes them from
`GlobalNumericVariableTable`.
* When reassigning an already cleared variable, `Pattern::match` calls
`NumericVariable::setValue`, but it doesn't reinsert it into
`GlobalNumericVariableTable`. Therefore, the next time `clearLocalVars`
is called, it won't be able to loop through the variables.
Fix it by reinserting them in `GlobalNumericVariableTable` inside
`Pattern::match`.
Co-authored-by: Thomas Preud'homme <thomas.preudhomme@arm.com>
Firstly fix FileCheck printing string variables
double-escaped (first regex, then C-style).
This is confusing because it is not clear if the printed
value is the literal value or exactly how it is escaped, without
looking at FileCheck's source code.
Secondly, only escape when doing so makes it easier to read the value
(when the string contains tabs, newlines or non-printable characters).
When the variable value is escaped, make a note of it in the output too,
in order to avoid confusion.
The common case that is motivating this change is variables that contain
windows style paths with backslashes. These were printed as
`"C:\\\\Program Files\\\\MyApp\\\\file.txt"`.
Now prefer to print them as `"C:\Program Files\MyApp\file.txt"`.
Printing the value literally also makes it easier to search for
variables in the output, since the user can just copy-paste it.
Line ending policies were changed in the parent, dccebddb3b80. To make
it easier to resolve downstream merge conflicts after line-ending
policies are adjusted this is a separate whitespace-only commit. If you
have merge conflicts as a result, you can simply `git add --renormalize
-u && git merge --continue` or `git add --renormalize -u && git rebase
--continue` - depending on your workflow.
Historically, we've not automatically enforced how git tracks line
endings, but there are many, many commits that "undo" unintended CRLFs
getting into history.
`git log --pretty=oneline --grep=CRLF` shows nearly 100 commits
involving reverts of CRLF making its way into the index and then
history. As far as I can tell, there are none the other way round except
for specific cases like `.bat` files or tests for parsers that need to
accept such sequences.
Of note, one of the earliest of those listed in that output is:
```
commit 9795860250734e5c2a879546c534e35d9edd5944
Author: NAKAMURA Takumi <geek4civic@gmail.com>
Date: Thu Feb 3 11:41:27 2011 +0000
cmake/*: Add svn:eol-style=native and fix CRLF.
llvm-svn: 124793
```
...which introduced such a defacto policy for subversion.
With old versions of git, it's been a bit of a crap-shoot whether
enforcing storing line endings in the history will upset checkouts on
machines where such line endings are the norm. Indeed many users have
enforced that git checks out the working copy according to a global or
per-user config via core crlf, or core autocrlf.
For ~8 years now[1], however, git has supported the ability to "do as
the Romans do" on checkout, but internally store subsets of text files
with line-endings specified via a system of patterns in the
`.gitattributes` file. Since we now have this ability, and we've been
specifying attributes for various binary files, I think it makes sense
to rid us of all that work converting things "back", and just let git
handle the local checkout. Thus the new toplevel policy here is
* text=auto
In simple terms this means "unless otherwise specified, convert all
files considered "text" files to LF in the project history, but check
them out as expected on the local machine. What is "expected on the
local machine" is dependent on configuration and default.
For those files in the repository that *do* need CRLF endings, I've
adopted a policy of `eol=crlf` which means that git will store them in
history with LF, but regardless of user config, they'll be checked out
in tree with CRLF.
Finally, existing files have been "corrected" in history via `git add
--renormalize .`
End users should *not* need to adjust their local git config or
workflow.
[1]: git 2.10 was released with fixed support for fine-grained
line-ending tracking that respects user-config *and* repo policy. This
can be considered the point at which git will respect both the user's
local working tree preference *and* the history as specified by the
maintainers. See
https://github.com/git/git/blob/master/Documentation/RelNotes/2.10.0.txt#L248
for the release note.
Reland #82595 with fixes of build failures related to colored output.
See https://lab.llvm.org/buildbot/#/builders/139/builds/60549
Use `%ProtectFileCheckOutput` to avoid colored output.
Original commit message below.
In `Pattern::parseVariable`, for global variables (those starting with '$')
and for pseudo variables (those starting with '@') the first character is
consumed before actual variable name parsing. If the name is empty, it
leads to out-of-bound access to the corresponding `StringRef`.
This patch adds an if statement against the case described.
In `Pattern::parseVariable`, for global variables (those starting with
'$') and for pseudo variables (those starting with '@') the first
character is consumed before actual variable name parsing. If the name
is empty, it leads to out-of-bound access to the corresponding
`StringRef`.
This patch adds an if statement against the case described.
Fixes#70221
Fix a bug in FileCheck that corrects the error message when multiple
prefixes are provided
through --check-prefixes and one of them is a PREFIX-NOT.
Earlier, only the first of the provided prefixes was displayed as the
erroneous prefix, while the
actual error might be on the prefix that occurred at the end of the
prefix list in the input file.
Now, the right NOT prefix is shown in the error message.
Use APInt to represent numeric variables and expressions, therefore
removing overflow concerns. Only remains underflow when the format of an
expression is unsigned (incl. hex values) but the result is negative.
Note that this can only happen when substituting an expression, not when
capturing since the regex used to capture unsigned value will not include
minus sign, hence all the code removal for match propagation testing.
This is what this patch implement.
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D150880
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762
Change FileCheck to accept patterns like "[[[var...]]" and treat the
excess open brackets at the start as literals.
This makes the patterns for matching assembler output with literal
brackets much cleaner. For example an AMDGPU pattern that used to be
written like:
buffer_store_dwordx2 v{{\[}}[[LO]]:[[HI]]{{\]}}
can now be:
buffer_store_dwordx2 v[[[LO]]:[[HI]]]
(Even before this patch the final close bracket did not need to be
wrapped in {{}}, but people tended to do it anyway for symmetry.)
This does not introduce any ambiguity since "[[" was always followed by
an identifier or '@' or '#', so "[[[" was always an error.
I've included a few test updates in this patch just for illustration and
testing. There are a couple of hundred tests that could be updated as a
follow up, mostly in test/CodeGen/.
Differential Revision: https://reviews.llvm.org/D117117
Change-Id: Ia6bc6f65cb69734821c911f54a43fe1c673bcca7
If MatchRegexp is an invalid regex, an error message will be printed
using SourceManager::PrintMessage via AddRegExToRegEx.
PrintMessage relies on the input being a StringRef into a string managed
by SourceManager. At the moment, a StringRef to a std::string
allocated in the caller of AddRegExToRegEx is passed. If the regex is
invalid, this StringRef is passed to PrintMessage, where it will crash,
because it does not point to a string managed via SourceMgr.
This patch fixes the crash by turning MatchRegexp into a StringRef If
we use MatchStr, we directly use that StringRef, which points into a
string from SourceMgr. Otherwise, MatchRegexp gets assigned
Format.getWildcardRegex(), which returns a std::string. To extend the
lifetime, assign it to a std::string variable WildcardRegexp and assign
MatchRegexp to a stringref to WildcardRegexp. WildcardRegexp should
always be valid, so we should never have to print an error message
via the SoureMgr I think.
Fixes PR49319.
Reviewed By: thopre
Differential Revision: https://reviews.llvm.org/D109050
Currently a CHECK-NOT directive succeeds whenever the corresponding
match fails. However match can fail due to an error rather than a lack
of match, for instance if a variable is undefined. This commit makes match
error a failure for CHECK-NOT.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D86222
In input dump annotations, `check:2'1` indicates diagnostic 1 for the
`CHECK` directive on check file line 2. Without this patch,
`-dump-input` computes the diagnostic index with the assumption that
FileCheck *consecutively* produces all diagnostics for the same
pattern. Already, that can be a false assumption, as in the examples
below. Moreover, it seems like a brittle assumption as FileCheck
evolves. Finally, it actually complicates the implementation even if
it makes it slightly more efficient.
This patch avoids that assumption. Examples below show results after
applying this patch. Before applying this patch, `'N` is omitted
throughout these examples because the implementation doesn't notice
there's more than one diagnostic per pattern.
First, `CHECK-LABEL` violates the assumption because `CHECK-LABEL`
tries to match twice, and other directives can match in between:
```
$ cat check
CHECK: foobar
CHECK-LABEL: foobar
$ FileCheck -vv check < input |& tail -8
<<<<<<
1: text
2: foobar
label:2'0 ^~~~~~
check:1 ^~~~~~
label:2'1 X error: no match found
3: text
>>>>>>
```
Second, `--implicit-check-not` is obviously processed many times among
other directives:
```
$ cat check
CHECK: foo
CHECK: foo
$ FileCheck -vv -dump-input=always -implicit-check-not=foo \
check < input |& tail -16
<<<<<<
1: text
not:imp1'0 X~~~~
2: foo
check:1 ^~~
not:imp1'1 X
3: text
not:imp1'1 ~~~~~
4: foo
check:2 ^~~
not:imp1'2 X
5: text
not:imp1'2 ~~~~~
6:
eof:2 ^
>>>>>>
```
Reviewed By: thopre, jhenderson
Differential Revision: https://reviews.llvm.org/D97813
FileCheck string substitution block parsing code only report an invalid
variable name in a string variable use if it starts with a forbidden
character. It does not report anything if there are unparsed characters
after the variable name, i.e. [[X-Y]] is parsed as [[X]] and no error is
returned. This commit fixes that.
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D98691
Fixed substitution printing not to produce an empty diagnostic for
errors handled elsewhere.
Reviewed By: thopre
Differential Revision: https://reviews.llvm.org/D98088
A more general name might be match-time error propagation. That is,
it's conceivable we'll one day have non-numeric errors that require
the handling fixed by this patch.
Without this patch, FileCheck behaves as follows:
```
$ cat check
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
$ FileCheck -vv -dump-input=never check < input
check:1:54: remark: implicit EOF: expected string found in input
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
^
<stdin>:2:1: note: found here
^
check:1:15: error: unable to substitute variable or numeric expression: overflow error
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
^
$ echo $?
0
```
Notice that the exit status is 0 even though there's an error.
Moreover, FileCheck doesn't print the error diagnostic unless both
`-dump-input=never` and `-vv` are specified.
The same problem occurs when `CHECK-NOT` does have a match but a
capture fails due to overflow: exit status is 0, and no diagnostic is
printed unless both `-dump-input=never` and `-vv` are specified. The
usefulness of capturing from `CHECK-NOT` is questionable, but this
case should certainly produce an error.
With this patch, FileCheck always includes the error diagnostic and
has non-zero exit status for the above examples. It's conceivable
that this change will cause some existing tests to fail, but my
assumption is that they should fail. Moreover, with nearly every
project enabled, this patch didn't produce additional `check-all`
failures for me.
This patch also extends input dumps to include such numeric error
diagnostics for both expected and excluded patterns.
As noted in fixmes in some of the tests added by this patch, this
patch worsens an existing issue with redundant diagnostics. I'll fix
that bug in a subsequent patch.
Reviewed By: thopre, jhenderson
Differential Revision: https://reviews.llvm.org/D98086
Add printf-style alternate form flag to prefix hex number with 0x when
present. This works on both empty numeric expression (e.g. variable
definition from input) and when matching a numeric expression. The
syntax is as follows:
[[#%#<precision specifier><format specifier>, ...]
where <precision specifier> and <format specifier> are optional and ...
can be a variable definition or not with an empty expression or not.
This feature was requested in https://reviews.llvm.org/D81144#2075532
for llvm/test/MC/ELF/gen-dwarf64.s
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D97845
When commit da108b4ed4e6e7267701e76d5fd3b87609c9ab77 introduced
the CHECK-NEXT directive, it added logic to skip to the next line when
printing a diagnostic if the current matching position is at the end of
a line. This was fine while FileCheck did not support regular expression
but since it does now it can be confusing when the pattern to match
starts with the expectation of a newline (e.g. CHECK-NEXT: {{\n}}foo).
It is also inconsistent with the column information in the diagnostic
which does point to the end of line.
This commit removes this logic altogether, such that failure to match
diagnostic for such cases would show the end of line and be consistent
with the column information. The commit also adapts all existing
testcases accordingly.
Note to reviewers: An alternative approach would be to restrict the code
to only skip to the next line if the first character of the pattern is
known not to match a whitespace-like character. This would respect the
original intent but keep the inconsistency in terms of column info and
requires more code. I've only chosen this current approach by laziness
and would be happy to restrict the logic instead.
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D93341
Link: https://lists.llvm.org/pipermail/llvm-dev/2020-October/146162.html "[RFC] FileCheck: (dis)allowing unused prefixes"
If a downstream project using lit needs time for transition,
add the following to `lit.local.cfg`:
```
from lit.llvm.subst import ToolSubst
fc = ToolSubst('FileCheck', unresolved='fatal')
config.substitutions.insert(0, (fc.regex, 'FileCheck --allow-unused-prefixes'))
```
Differential Revision: https://reviews.llvm.org/D95849
cl::ZeroOrMore allows the option to be specified multiple times, which makes
downstream projects possible to specify a default value in lit configuration
while some tests can override the value.
This patch sets the default for llvm tests, with the exception of tests
under Reduce, because quite a few of them use 'FileCheck' as parameter
to a tool, and including a flag as that parameter would complicate
matters.
The rest of the patch undo-es the lit.local.cfg changes we progressively
introduced as temporary measure to avoid regressions under various
directories.
Differential Revision: https://reviews.llvm.org/D95111
The test related to directives with prefix NUMEXPR-CONSTRAINT-NOMATCH
refers to the numeric variable UNSI which is defined by a directive with
prefix CHECK not enabled on the lit command-line. Rather than adding the
prefix CHECK to the lit command-line, this commit defined variable UNSI
in a new NUMEXPR-CONSTRAINT-NOMATCH prefixed directive. The directive
needs to contain more than just the variable otherwise the error message
from FileCheck is about the wrong line being matched.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D93346
Introduce CHECK modifiers that change the behavior of the CHECK
directive. Also add a LITERAL modifier for cases where matching could
end requiring escaping strings interpreted as regex where only
literal/fixed string matching is desired (making the CHECK's more
difficult to write/fragile and difficult to interpret).
If more than a prefix is provided - e.g. --check-prefixes=CHECK,FOO - we
don't report if (say) FOO is never used. This may lead to a gap in our
test coverage.
This patch introduces a new option, --allow-unused-prefixes. It
currently is set to true, keeping today's behavior. After we explicitly
set it in tests where this behavior was actually intentional, we will
switch it to false by default.
Differential Revision: https://reviews.llvm.org/D90281
Add printf-style precision specifier to pad numbers to a given number of
digits when matching them if the value is smaller than the given
precision. This works on both empty numeric expression (e.g. variable
definition from input) and when matching a numeric expression. The
syntax is as follows:
[[#%.<precision><format specifier>, ...]
where <format specifier> is optional and ... can be a variable
definition or not with an empty expression or not. In the absence of a
precision specifier, a variable definition will accept leading zeros.
Reviewed By: jhenderson, grimar
Differential Revision: https://reviews.llvm.org/D81667
This commit makes FileCheck print all CHECK-NOT directive failure in a
CHECK-NOT block even if one fails. Prior to that, it would stop trying
to match CHECK-NOT directive as soon as one in the block fails.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D86315
Substitutions are already reported in the diagnostics appearing before
the input dump in the case of failed directives, and they're reported
in traces (produced by `-vv -dump-input=never`) in the case of
successful directives. However, those reports are not always
convenient to view while investigating the input dump, so this patch
adds the substitution report to the input dump too. For example:
```
$ cat check
CHECK: hello [[WHAT:[a-z]+]]
CHECK: [[VERB]] [[WHAT]]
$ FileCheck -vv -DVERB=goodbye check < input |& tail -8
<<<<<<
1: hello world
check:1 ^~~~~~~~~~~
2: goodbye word
check:2'0 X~~~~~~~~~~~ error: no match found
check:2'1 with "VERB" equal to "goodbye"
check:2'2 with "WHAT" equal to "world"
>>>>>>
```
Without this patch, the location reported for a substitution for a
directive match is the directive's full match range. This location is
misleading as it implies the substitution itself matches that range.
This patch changes the reported location to just the match range start
to suggest the substitution is known at the start of the match. (As
in the above example, input dumps don't mark any range for
substitutions. The location info in that case simply identifies the
right line for the annotation.)
Reviewed By: mehdi_amini, thopre
Differential Revision: https://reviews.llvm.org/D83650