[NFC][analyzer] Document configuration options (#135169)
This commit documents how to specify values for the analyzer options and checker options of the static analyzer, and adds a script that embeds the documentation of the analyzer options (previously available only through a command-line flag) into the RST-based web documentation.
Parent: 5fb9dca14a
Commit: 899f26315c
@@ -134,6 +134,34 @@ if (LLVM_ENABLE_SPHINX)
  gen_rst_file_from_td(DiagnosticsReference.rst -gen-diag-docs ../include/clang/Basic/Diagnostic.td "${docs_targets}")
  gen_rst_file_from_td(ClangCommandLineReference.rst -gen-opt-docs ../include/clang/Driver/ClangOptionDocs.td "${docs_targets}")

  # Another generated file from a different source
  set(docs_tools_dir ${CMAKE_CURRENT_SOURCE_DIR}/tools)
  set(aopts_rst_rel_path analyzer/user-docs/Options.rst)
  set(aopts_rst "${CMAKE_CURRENT_BINARY_DIR}/${aopts_rst_rel_path}")
  set(analyzeroptions_def "${CMAKE_CURRENT_SOURCE_DIR}/../include/clang/StaticAnalyzer/Core/AnalyzerOptions.def")
  set(aopts_rst_in "${CMAKE_CURRENT_SOURCE_DIR}/${aopts_rst_rel_path}.in")
  add_custom_command(
    OUTPUT ${aopts_rst}
    COMMAND ${Python3_EXECUTABLE} generate_analyzer_options_docs.py
            --options-def "${analyzeroptions_def}"
            --template "${aopts_rst_in}"
            --out "${aopts_rst}"
    WORKING_DIRECTORY ${docs_tools_dir}
    VERBATIM
    COMMENT "Generating ${aopts_rst}"
    DEPENDS ${docs_tools_dir}/${generate_aopts_docs}
            ${aopts_rst_in}
            copy-clang-rst-docs
  )
  add_custom_target(generate-analyzer-options-rst DEPENDS ${aopts_rst})
  foreach(target ${docs_targets})
    add_dependencies(${target} generate-analyzer-options-rst)
  endforeach()

  # Technically this is redundant because generate-analyzer-options-rst
  # depends on the copy operation (because it wants to drop a generated file
  # into a subdirectory of the copied tree), but I'm leaving it here for the
  # sake of clarity.
  foreach(target ${docs_targets})
    add_dependencies(${target} copy-clang-rst-docs)
  endforeach()

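For orientation, the custom command above amounts to running the generator by hand from clang/docs/tools (a sketch with illustrative paths; in the real build CMake supplies absolute paths and writes the output into the copied docs tree in the build directory):

    python3 generate_analyzer_options_docs.py \
      --options-def ../../include/clang/StaticAnalyzer/Core/AnalyzerOptions.def \
      --template ../analyzer/user-docs/Options.rst.in \
      --out /tmp/Options.rst
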
@@ -8,6 +8,7 @@ Contents:

   user-docs/Installation
   user-docs/CommandLineUsage
   user-docs/Options
   user-docs/UsingWithXCode
   user-docs/FilingBugs
   user-docs/CrossTranslationUnit

@@ -194,6 +194,8 @@ When compiling your application to run on the simulator, it is important that **

If you aren't certain which compiler Xcode uses to build your project, try just running ``xcodebuild`` (without **scan-build**). You should see the full path to the compiler that Xcode is using, and use that as an argument to ``--use-cc``.

.. _command-line-usage-CodeChecker:

CodeChecker
-----------

clang/docs/analyzer/user-docs/Options.rst.in (new file, 114 lines)
@@ -0,0 +1,114 @@
========================
Configuring the Analyzer
========================

The clang static analyzer supports two kinds of options:

1. Global **analyzer options** influence the behavior of the analyzer engine.
   They are documented on this page, in the section :ref:`List of analyzer
   options<list-of-analyzer-options>`.
2. The **checker options** belong to individual checkers (e.g.
   ``core.BitwiseShift:Pedantic`` and ``unix.Stream:Pedantic`` are completely
   separate options) and customize the behavior of that particular checker.
   These are documented within the documentation of each individual checker at
   :doc:`../checkers`.

Assigning values to options
===========================

With the compiler frontend
--------------------------

All options can be configured by using the ``-analyzer-config`` flag of
``clang -cc1`` (the so-called *compiler frontend* part of clang). The values of
the options are specified with the syntax ``-analyzer-config
OPT=VAL,OPT2=VAL2,...`` which supports specifying multiple options, but
separate flags like ``-analyzer-config OPT=VAL -analyzer-config OPT2=VAL2`` are
also accepted (with equivalent behavior). Analyzer options and checker options
can be freely intermixed here because it's easy to recognize that checker
option names are always prefixed with ``some.groups.NameOfChecker:``.

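As an illustration, both kinds of options can appear in one flag (a minimal
sketch, not a complete command: a real ``clang -cc1`` invocation needs many
additional flags that the driver normally supplies, and the checker option
assumes that ``unix.Stream`` is enabled)::

  clang -cc1 -analyze -analyzer-checker=unix.Stream -analyzer-config max-nodes=100000,unix.Stream:Pedantic=true foo.c
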
.. warning::
   This is an internal interface; prefer ``clang --analyze ...`` for regular
   use. Clang does not intend to preserve backwards compatibility or announce
   breaking changes within the flags accepted by ``clang -cc1`` (but
   ``-analyzer-config`` has survived many years without major changes).

With the clang driver
---------------------

In a conventional workflow ``clang -cc1`` (which is a low-level internal
interface) is invoked indirectly by the clang *driver* (i.e. plain ``clang``
without the ``-cc1`` flag), which acts as an "even more frontend" wrapper layer
around the ``clang -cc1`` *compiler frontend*. In this situation **each**
command line argument intended for the *compiler frontend* must be prefixed
with ``-Xclang``.

For example the following command analyzes ``foo.c`` in :ref:`shallow mode
<analyzer-option-mode>` with :ref:`loop unrolling
<analyzer-option-unroll-loops>`:

::

  clang --analyze -Xclang -analyzer-config -Xclang mode=shallow,unroll-loops=true foo.c

When this is executed, the *driver* will compose and execute the following
``clang -cc1`` command (which can be inspected by passing the ``-v`` flag to
the *driver*):

::

  clang -cc1 -analyze [...] -analyzer-config mode=shallow,unroll-loops=true foo.c

Here ``[...]`` stands for dozens of low-level flags which ensure that
``clang -cc1`` does the right thing (e.g. ``-fcolor-diagnostics`` when it's
suitable; ``-analyzer-checker`` flags to enable the default set of checkers).
Also note the distinction that the ``clang`` *driver* requires ``--analyze``
(double dashes) while the ``clang -cc1`` *compiler frontend* requires
``-analyze`` (single dash).

.. note::
   The flag ``-Xanalyzer`` is equivalent to ``-Xclang`` in these situations
   (but doesn't forward other options of the clang frontend).

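For instance, the driver command shown above could also be spelled with
``-Xanalyzer`` (a sketch relying on the equivalence stated in the note)::

  clang --analyze -Xanalyzer -analyzer-config -Xanalyzer mode=shallow,unroll-loops=true foo.c
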
With CodeChecker
----------------

If the analysis is performed through :ref:`CodeChecker
<command-line-usage-CodeChecker>` (which e.g. supports the analysis of a whole
project instead of a single file) then it will act as another indirection
layer. CodeChecker provides separate command-line flags called
``--analyzer-config`` (for analyzer options) and ``--checker-config`` (for
checker options):

::

  CodeChecker analyze -o outdir --checker-config clangsa:unix.Stream:Pedantic=true \
    --analyzer-config clangsa:mode=shallow clangsa:unroll-loops=true \
    -- compile_commands.json

These CodeChecker flags may be followed by multiple ``OPT=VAL`` pairs as
separate arguments (and this is why the example needs to use ``--`` before
``compile_commands.json``). The option names are all prefixed with ``clangsa:``
to ensure that they are passed to the clang static analyzer (and not other
analyzer tools that are also supported by CodeChecker).

.. _list-of-analyzer-options:

List of analyzer options
========================

.. warning::
   These options are primarily intended for development purposes and
   non-default values are usually unsupported. Changing their values may
   drastically alter the behavior of the analyzer, and may even result in
   instabilities or crashes! Crash reports are welcome and depending on the
   severity they may be fixed.

..
   The contents of this section are automatically generated by the script
   clang/docs/tools/generate_analyzer_options_docs.py from the header file
   AnalyzerOptions.def to ensure that the RST/web documentation is synchronized
   with the command line help options.

.. OPTIONS_LIST_PLACEHOLDER
clang/docs/tools/generate_analyzer_options_docs.py (new file, 293 lines)
@@ -0,0 +1,293 @@
#!/usr/bin/env python3
# A tool to automatically generate documentation for the config options of the
# clang static analyzer by reading `AnalyzerOptions.def`.

import argparse
from collections import namedtuple
from enum import Enum, auto
import re
import sys
import textwrap


# The following code implements a trivial parser for the narrow subset of C++
# which is used in AnalyzerOptions.def. This supports the following features:
# - ignores preprocessor directives, even if they are continued with \ at EOL
# - ignores comments: both /* ... */ and // ...
# - parses string literals (even if they contain \" escapes)
# - concatenates adjacent string literals
# - parses numbers even if they contain ' as a thousands separator
# - recognizes MACRO(arg1, arg2, ..., argN) calls


class TT(Enum):
    "Token type enum."

    number = auto()
    ident = auto()
    string = auto()
    punct = auto()


TOKENS = [
    (re.compile(r"-?[0-9']+"), TT.number),
    (re.compile(r"\w+"), TT.ident),
    (re.compile(r'"([^\\"]|\\.)*"'), TT.string),
    (re.compile(r"[(),]"), TT.punct),
    (re.compile(r"/\*((?!\*/).)*\*/", re.S), None),  # C-style comment
    (re.compile(r"//.*\n"), None),  # C++ style oneline comment
    (re.compile(r"#.*(\\\n.*)*(?<!\\)\n"), None),  # preprocessor directive
    (re.compile(r"\s+"), None),  # whitespace
]

Token = namedtuple("Token", "kind code")


class ErrorHandler:
    def __init__(self):
        self.seen_errors = False

        # This script uses some heuristical tweaks to modify the documentation
        # of some analyzer options. As this code is fragile, we record the use
        # of these tweaks and report them if they become obsolete:
        self.unused_tweaks = [
            "escape star",
            "escape underline",
            "accepted values",
            "example file content",
        ]

    def record_use_of_tweak(self, tweak_name):
        try:
            self.unused_tweaks.remove(tweak_name)
        except ValueError:
            pass

    def replace_as_tweak(self, string, pattern, repl, tweak_name):
        res = string.replace(pattern, repl)
        if res != string:
            self.record_use_of_tweak(tweak_name)
        return res

    def report_error(self, msg):
        print("Error:", msg, file=sys.stderr)
        self.seen_errors = True

    def report_unexpected_char(self, s, pos):
        lines = (s[:pos] + "X").split("\n")
        lineno, col = (len(lines), len(lines[-1]))
        self.report_error(
            "unexpected character %r in AnalyzerOptions.def at line %d column %d"
            % (s[pos], lineno, col),
        )

    def report_unused_tweaks(self):
        if not self.unused_tweaks:
            return
        _is = " is" if len(self.unused_tweaks) == 1 else "s are"
        names = ", ".join(self.unused_tweaks)
        self.report_error(f"textual tweak{_is} unused in script: {names}")


err_handler = ErrorHandler()


def tokenize(s):
    result = []
    pos = 0
    while pos < len(s):
        for regex, kind in TOKENS:
            if m := regex.match(s, pos):
                if kind is not None:
                    result.append(Token(kind, m.group(0)))
                pos = m.end()
                break
        else:
            err_handler.report_unexpected_char(s, pos)
            pos += 1
    return result


def join_strings(tokens):
    result = []
    for tok in tokens:
        if tok.kind == TT.string and result and result[-1].kind == TT.string:
            # If this token is a string, and the previous non-ignored token is
            # also a string, then merge them into a single token. We need to
            # discard the closing " of the previous string and the opening " of
            # this string.
            prev = result.pop()
            result.append(Token(TT.string, prev.code[:-1] + tok.code[1:]))
        else:
            result.append(tok)
    return result


MacroCall = namedtuple("MacroCall", "name args")


class State(Enum):
    "States of the state machine used for parsing the macro calls."

    init = auto()
    after_ident = auto()
    before_arg = auto()
    after_arg = auto()


def get_calls(tokens, macro_names):
    state = State.init
    result = []
    current = None
    for tok in tokens:
        if state == State.init and tok.kind == TT.ident and tok.code in macro_names:
            current = MacroCall(tok.code, [])
            state = State.after_ident
        elif state == State.after_ident and tok == Token(TT.punct, "("):
            state = State.before_arg
        elif state == State.before_arg:
            if current is not None:
                current.args.append(tok)
            state = State.after_arg
        elif state == State.after_arg and tok.kind == TT.punct:
            if tok.code == ")":
                result.append(current)
                current = None
                state = State.init
            elif tok.code == ",":
                state = State.before_arg
        else:
            current = None
            state = State.init
    return result

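As a quick illustration of how these pieces fit together (a sketch that assumes
tokenize, join_strings and get_calls from this script are available in the
current scope, e.g. in a REPL after pasting the definitions above; the option
entry itself is made up):

    sample = 'ANALYZER_OPTION(bool, InlineFoo, "inline-foo", "Enable " "foo.", false)'
    toks = join_strings(tokenize(sample))
    calls = get_calls(toks, {"ANALYZER_OPTION"})
    # `calls` now holds a single MacroCall whose five args are the tokens
    # bool, InlineFoo, "inline-foo", "Enable foo." and false -- note that the
    # two adjacent string literals were concatenated by join_strings.
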
# The information will be extracted from calls to these two macros:
# #define ANALYZER_OPTION(TYPE, NAME, CMDFLAG, DESC, DEFAULT_VAL)
# #define ANALYZER_OPTION_DEPENDS_ON_USER_MODE(TYPE, NAME, CMDFLAG, DESC,
#                                              SHALLOW_VAL, DEEP_VAL)

MACRO_NAMES_PARAMCOUNTS = {
    "ANALYZER_OPTION": 5,
    "ANALYZER_OPTION_DEPENDS_ON_USER_MODE": 6,
}


def string_value(tok):
    if tok.kind != TT.string:
        raise ValueError(f"expected a string token, got {tok.kind.name}")
    text = tok.code[1:-1]  # Remove quotes
    text = re.sub(r"\\(.)", r"\1", text)  # Resolve backslash escapes
    return text


def cmdflag_to_rst_title(cmdflag_tok):
    text = string_value(cmdflag_tok)
    underline = "-" * len(text)
    ref = f".. _analyzer-option-{text}:"

    return f"{ref}\n\n{text}\n{underline}\n\n"


def desc_to_rst_paragraphs(tok):
    desc = string_value(tok)

    # Escape some characters that have special meaning in RST:
    desc = err_handler.replace_as_tweak(desc, "*", r"\*", "escape star")
    desc = err_handler.replace_as_tweak(desc, "_", r"\_", "escape underline")

    # Many descriptions end with "Value: <list of accepted values>", which is
    # OK for a terse command line printout, but should be prettified for web
    # documentation.
    # Moreover, the option ctu-invocation-list shows some example file content
    # which is formatted as a preformatted block.
    paragraphs = [desc]
    extra = ""
    if m := re.search(r"(^|\s)Value:", desc):
        err_handler.record_use_of_tweak("accepted values")
        paragraphs = [desc[: m.start()], "Accepted values:" + desc[m.end() :]]
    elif m := re.search(r"\s*Example file.content:", desc):
        err_handler.record_use_of_tweak("example file content")
        paragraphs = [desc[: m.start()]]
        extra = "Example file content::\n\n " + desc[m.end() :] + "\n\n"

    wrapped = [textwrap.fill(p, width=80) for p in paragraphs if p.strip()]

    return "\n\n".join(wrapped + [""]) + extra


def default_to_rst(tok):
    if tok.kind == TT.string:
        if tok.code == '""':
            return "(empty string)"
        return tok.code
    if tok.kind == TT.ident:
        return tok.code
    if tok.kind == TT.number:
        return tok.code.replace("'", "")
    raise ValueError(f"unexpected token as default value: {tok.kind.name}")


def defaults_to_rst_paragraph(defaults):
    strs = [default_to_rst(d) for d in defaults]

    if len(strs) == 1:
        return f"Default value: {strs[0]}\n\n"
    if len(strs) == 2:
        return (
            f"Default value: {strs[0]} (in shallow mode) / {strs[1]} (in deep mode)\n\n"
        )
    raise ValueError("unexpected count of default values: %d" % len(defaults))


def macro_call_to_rst_paragraphs(macro_call):
    try:
        arg_count = len(macro_call.args)
        param_count = MACRO_NAMES_PARAMCOUNTS[macro_call.name]
        if arg_count != param_count:
            raise ValueError(
                f"expected {param_count} arguments for {macro_call.name}, found {arg_count}"
            )

        _, _, cmdflag, desc, *defaults = macro_call.args

        return (
            cmdflag_to_rst_title(cmdflag)
            + desc_to_rst_paragraphs(desc)
            + defaults_to_rst_paragraph(defaults)
        )
    except ValueError as ve:
        err_handler.report_error(ve.args[0])
        return ""

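To make the transformation concrete, a hypothetical entry (not one taken from
the real AnalyzerOptions.def) such as

    ANALYZER_OPTION(bool, InlineFoo, "inline-foo",
                    "Whether to inline foo. Value: \"true\" or \"false\".", false)

would be rendered by macro_call_to_rst_paragraphs roughly as

    .. _analyzer-option-inline-foo:

    inline-foo
    ----------

    Whether to inline foo.

    Accepted values: "true" or "false".

    Default value: false

with the description split at "Value:" and re-wrapped, and the default-value
token printed as-is.
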
def get_option_list(input_file):
    with open(input_file, encoding="utf-8") as f:
        contents = f.read()
    tokens = join_strings(tokenize(contents))
    macro_calls = get_calls(tokens, MACRO_NAMES_PARAMCOUNTS)

    result = ""
    for mc in macro_calls:
        result += macro_call_to_rst_paragraphs(mc)
    return result


p = argparse.ArgumentParser()
p.add_argument("--options-def", help="path to AnalyzerOptions.def")
p.add_argument("--template", help="template file")
p.add_argument("--out", help="output file")
opts = p.parse_args()

with open(opts.template, encoding="utf-8") as f:
    doc_template = f.read()

PLACEHOLDER = ".. OPTIONS_LIST_PLACEHOLDER\n"

rst_output = doc_template.replace(PLACEHOLDER, get_option_list(opts.options_def))

err_handler.report_unused_tweaks()

with open(opts.out, "w", newline="", encoding="utf-8") as f:
    f.write(rst_output)

if err_handler.seen_errors:
    sys.exit(1)
@@ -7,6 +7,9 @@
//===----------------------------------------------------------------------===//
//
// This file defines the analyzer options available with -analyzer-config.
// Note that clang/docs/tools/generate_analyzer_options_docs.py relies on the
// structure of this file, so if this file is refactored, then make sure to
// update that script as well.
//
//===----------------------------------------------------------------------===//

clang/test/Analysis/generate_analyzer_options_docs.test (new file, 14 lines)
@@ -0,0 +1,14 @@
The documentation of analyzer options is generated by a script that parses
AnalyzerOptions.def. The following command validates that this script
"understands" everything in its input files:

RUN: %python %src_dir/docs/tools/generate_analyzer_options_docs.py \
RUN:   --options-def %src_include_dir/clang/StaticAnalyzer/Core/AnalyzerOptions.def \
RUN:   --template %src_dir/docs/analyzer/user-docs/Options.rst.in \
RUN:   --out %t.rst

Moreover, verify that the documentation (e.g. this fragment of the
documentation of the "mode" option) can be found in the output file:

RUN: FileCheck --input-file=%t.rst %s
CHECK: Controls the high-level analyzer mode

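For reference, a test like this can be run on its own with the llvm-lit wrapper
from a build tree (a sketch; the binary and source locations depend on your
checkout and build setup):

    /path/to/build/bin/llvm-lit -v clang/test/Analysis/generate_analyzer_options_docs.test
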
@@ -70,6 +70,8 @@ llvm_config.use_default_substitutions()

llvm_config.use_clang()

config.substitutions.append(("%src_dir", config.clang_src_dir))

config.substitutions.append(("%src_include_dir", config.clang_src_dir + "/include"))

config.substitutions.append(("%target_triple", config.target_triple))