* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Strip unneeded calls to raw_string_ostream::str(), to avoid extra indirection.
For `snprintf(a, sizeof a, ...)`, the first two arguments form a safe
pattern if `a` is a constant array. In such a case, this commit will
suppress the warning.
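As a rough illustration of the pattern this commit treats as safe, the sketch below formats into a constant-size array and passes `sizeof` of that same array as the size. The helper name `formatId` and the format string are assumptions made for the example, not part of the commit.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats an id into a fixed-size stack buffer.
std::string formatId(int Id) {
  char Buf[16]; // constant array, so `sizeof Buf` is exactly its capacity
  // Destination and size both name the same constant array, which is the
  // pattern the commit message describes as safe.
  std::snprintf(Buf, sizeof Buf, "id-%d", Id);
  return std::string(Buf);
}
```

If `Buf` were a pointer instead of an array, `sizeof Buf` would be the pointer size and the pattern would no longer be safe.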
(rdar://117182250)
The commit d7dd2c468fecae871ba67e891a3519c758c94b63 crashes on an
example such as:
```
void printf() { printf(); }
```
The crash occurs because that commit assumes `printf` must have
arguments. This commit fixes the issue.
(rdar://117182250)
Revert commit 23457964392d00fc872fa6021763859024fb38da, and re-land
with a new flag "-Wunsafe-buffer-usage-in-libc-call" for the new
warning.
(rdar://117182250)
[-Wunsafe-buffer-usage] Add warn on unsafe calls to libc functions
Warn about calls to libc functions involving buffer access. The warned
functions are hardcoded by name.
(rdar://117182250)
This fixes false positives related to returning a scoped lockable
object. At the end of a function, we check managed locks instead of
scoped locks.
At real join points, we skip checking managed locks because we assume
that the scope keeps track of its underlying mutexes and will release
them at its destruction. So, checking for the scopes is sufficient.
However, at the end of a function, we aim at comparing the expected and
the actual lock sets. There, we skip checking scoped locks to avoid
duplicate warnings for the same lock.
…n/statement.
We don't need these for the same in-tree purposes as the other sets,
i.e. for making sure we model these Decls that are declared outside the
function, but we have an out-of-tree use for these sets that would
benefit from this simple addition and would avoid duplicating so much of
this code.
`QualType::isConstantArrayType()` checks the canonical type, so a
subsequent cast should be applied to the canonical type as well:
```
if (Ty->isConstantArrayType())
cast<ConstantArrayType>(Ty.getCanonicalType()); // cast<ConstantArrayType>(Ty) is incorrect
```
Extend the unsafe_buffer_usage attribute so that it can also be added
to struct fields. The compiler then warns about the unsafe field at its
access sites.
Co-authored-by: MalavikaSamak <malavika2@apple.com>
`getDirectCallee()` may return a null pointer if the callee is not a
`FunctionDecl` (for example, when calling through a function pointer),
so `dyn_cast_or_null` must be used instead of `dyn_cast`.
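The difference between the two casts can be sketched without LLVM. Everything below is a stand-in built on `dynamic_cast`: the node types and the helpers `dynCast` / `dynCastOrNull` are hypothetical names that only mimic the null-handling contract of `dyn_cast` and `dyn_cast_or_null`.

```cpp
#include <cassert>

// Hypothetical miniature node hierarchy, not the Clang AST.
struct FunctionNode { virtual ~FunctionNode() = default; };
struct MethodNode : FunctionNode {};

// Mimics dyn_cast: the input is required to be non-null.
template <typename To, typename From> const To *dynCast(const From *P) {
  assert(P && "dynCast on a null pointer");
  return dynamic_cast<const To *>(P);
}

// Mimics dyn_cast_or_null: a null input simply yields a null result.
template <typename To, typename From> const To *dynCastOrNull(const From *P) {
  return P ? dynamic_cast<const To *>(P) : nullptr;
}

// Callee may be null, e.g. for a call through a function pointer.
bool isMethodCall(const FunctionNode *Callee) {
  return dynCastOrNull<MethodNode>(Callee) != nullptr;
}
```

With `dynCast` in place of `dynCastOrNull`, the null callee would trip the assertion instead of returning `false`.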
This was missing a call to `ignoreCFGOmittedNodes()`. As a result, the
function would erroneously conclude that a block did not contain an
expression consumed in a different block if the expression in question
was surrounded by a `ParenExpr` in the consuming block. The patch adds a
test that triggers this scenario (and fails without the fix).
To prevent this kind of bug in the future, the patch also adds a new
method `blockForStmt()` to `AdornedCFG` that calls
`ignoreCFGOmittedNodes()` and is preferred over accessing
`getStmtToBlock()` directly.
Fix the false negative caused by state merging in the evaluation of a
short-circuiting expression inside the condition of a ternary operator.
The fixed symptom is that CSA always evaluates `(x || x) ? n : m` to
`m`.
This change forces the analyzer to consider all logical expressions
prone to short-circuiting alive until the entire conditional expression
is evaluated. Here is why.
By default, LiveVariables analysis marks only direct subexpressions as
live relative to any expression. So for `a ? b : c` it will consider
`a`, `b`, and `c` alive when evaluating the ternary operator expression.
To explore both possibilities opened by a ternary operator, it is
important to keep something different about the exploded nodes created
after the evaluation of its branches. These two nodes come to the same
location, so they must have different states. Otherwise, they will be
considered identical and can engender only one outcome.
`ExprEngine::visitGuardedExpr` chooses the first predecessor exploded
node to carry the value of the conditional expression. It works well in
the case of a simple condition, because when `a ? b : c` is evaluated,
`a` is kept alive, so the two branches differ in the value of `a`.
However, before this patch is applied, this strategy breaks for `(x ||
x) ? n : m`. `x` is not a direct child of the ternary expression. Due to
short-circuiting, once `x` is assumed to be `true`, evaluation jumps
directly to `n` and then to the result of the entire ternary expression.
Given that the result of the entire condition `(x || x)` is not
constructed, and `x` is not kept alive, the difference between the path
coming through `n` and through `m` disappears. As a result, exploded
nodes coming from the "true expression" and the "false expression"
engender identical successors and merge the execution paths.
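A minimal function with the shape discussed above might look as follows; the name `pick` is an assumption for the example. At run time both results are reachable, so an analysis that merges the two paths and only ever reports `M` is missing the `N` outcome.

```cpp
// Hypothetical reproducer shape: a short-circuiting condition inside a
// ternary operator. X is not a direct child of the ternary expression.
int pick(bool X, int N, int M) {
  return (X || X) ? N : M;
}
```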
`CXXInheritedCtorInitExpr` is another of the node kinds that should be
considered an "original initializer". An assertion failure in
`assert(Children.size() == 1)` happens without this fix.
---------
Co-authored-by: martinboehme <mboehme@google.com>
Summary:
If a `CallExpr` is type-dependent, its individual arguments cannot be
analyzed until the template is specialized. Before this diff, only
calls with type-dependent callees were skipped, so
unnecessary-value-param still processed arguments of non-dependent
type. That produced false positives, because the call is not fully
resolved until specialization. Now the whole call expression, not just
the callee, is checked for type dependence.
Test Plan: check-clang-tools
This PR reverts #95290 and the one-liner followup PR #96494.
I received some substantial feedback on #95290, which I plan to address
in a future PR.
I've also received feedback that because the change emits errors where
they were not emitted before, we should at least have a flag to disable
the stricter warnings.
We definitely know that these operations change the value of their
operand, so clear out any value associated with it. We don't create a
new value, instead leaving it to the analysis to do this if desired.
With this change, Clang will generate errors when trylock functions have
improper return types. Today, it silently fails to apply the trylock
attribute to these functions which may incorrectly lead users to believe
they have correctly acquired locks before accessing guarded data.
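For orientation, a trylock function with a conventional `bool` return type could be declared as in the sketch below. The class `SpinLock` and the macro names are assumptions for the example; the attributes are the ones from Clang's thread safety analysis and expand to nothing on other compilers.

```cpp
#include <atomic>

#if defined(__clang__)
#define CAPABILITY(x) __attribute__((capability(x)))
#define TRY_ACQUIRE(...) __attribute__((try_acquire_capability(__VA_ARGS__)))
#define RELEASE(...) __attribute__((release_capability(__VA_ARGS__)))
#else
#define CAPABILITY(x)
#define TRY_ACQUIRE(...)
#define RELEASE(...)
#endif

// Hypothetical lock type used only for illustration.
class CAPABILITY("mutex") SpinLock {
public:
  // Returns true on success, matching the `true` success argument.
  bool tryLock() TRY_ACQUIRE(true);
  void unlock() RELEASE();

private:
  std::atomic<bool> Locked{false};
};

// exchange() returns the previous value: false means we just took the lock.
inline bool SpinLock::tryLock() { return !Locked.exchange(true); }
inline void SpinLock::unlock() { Locked.store(false); }
```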
As a side effect of explicitly checking the success argument type, I
seem to have fixed a false negative in the analysis that could occur
when a trylock's success argument is an enumerator. I've added a
regression test to warn-thread-safety-analysis.cpp named
`TrylockSuccessEnumFalseNegative`.
This change also improves the documentation with descriptions of the
subtle gotchas that arise from the analysis interpreting the success arg
as a boolean.
Issue #92408
At the same time, rename `PostVisitCFG` to the more descriptive
`PostAnalysisCallbacks` (which emphasizes the fact that these callbacks
are run after the dataflow analysis itself has converged).
Before this patch, it was only possible to run a callback on the state
_after_ the transfer function had been applied, but for many analyses,
it's more natural to check the state _before_ the transfer function has
been applied, because we are usually checking the preconditions for some
operation. Some checks are impossible to perform on the "after" state
because we can no longer check the precondition; for example, the `++` /
`--` operators on raw pointers require the operand to be nonnull, but
after the transfer function for the operator has been applied, the
original value of the pointer can no longer be accessed.
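The "before" versus "after" distinction can be sketched with a toy state. Nothing here is the dataflow framework's API: `State`, `Callbacks`, `transferIncrement`, and `diagnoseIncrement` are hypothetical names, and the state only records whether a pointer variable is known to be non-null.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical analysis state: variable name -> "known to be non-null".
using State = std::unordered_map<std::string, bool>;
using Callback = std::function<void(const std::string &, const State &)>;

struct Callbacks {
  Callback Before; // sees the state prior to the transfer function
  Callback After;  // sees the state once the transfer function has run
};

// Toy transfer function for `++Var`: the old pointer value is gone
// afterwards, so forget what was known about it.
void transferIncrement(const std::string &Var, State &S) { S.erase(Var); }

void analyzeIncrement(const std::string &Var, State &S, const Callbacks &CBs) {
  if (CBs.Before) CBs.Before(Var, S);
  transferIncrement(Var, S);
  if (CBs.After) CBs.After(Var, S);
}

// Checks the precondition of `++Var` on the "before" state.
std::vector<std::string> diagnoseIncrement(const std::string &Var, State S) {
  std::vector<std::string> Diags;
  Callbacks CBs;
  CBs.Before = [&Diags](const std::string &V, const State &St) {
    auto It = St.find(V);
    if (It == St.end() || !It->second)
      Diags.push_back("increment of possibly null pointer " + V);
  };
  analyzeIncrement(Var, S, CBs);
  return Diags;
}
```

Running the same check in `After` would always report a problem in this toy model, because the transfer function has already erased the fact it needs.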
`UncheckedOptionalAccessModelTest` has been modified to run the
diagnosis callback on the "before" state. In this particular case,
diagnosis can be run unchanged on either the "before" or "after" state,
but we want this test to demonstrate that running diagnosis on the
"before" state is usually the preferred approach.
This change is backwards-compatible; all existing analyses will continue
to run the callback on the "after" state.
The patch includes a repro for a case where we were returning a null
`FieldDecl` when calling `getReferencedDecls()` on the `InitListExpr`
for a union.
Also, I noticed while working on this that `RecordInitListHelper` has a
bug where it doesn't work correctly for empty unions. This patch also
includes a repro and fix for this bug.
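The two source-level shapes involved, an initializer list for a union and an initializer list for an empty union, look roughly like this. The type and function names are assumptions for the example and are not taken from the patch's tests.

```cpp
// Hypothetical types illustrating the two cases mentioned above.
union Number {
  int I;
  float F;
};

union Empty {};

// An initializer list for a union initializes its first member.
Number makeNumber() { return Number{1}; }

// An empty union has no field for the initializer list to refer to.
Empty makeEmpty() { return Empty{}; }
```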
This is one of the node kinds that should be considered an "original
initializer". The patch adds a test that was causing an assertion
failure in `assert(Children.size() == 1)` without the fix.
We previously had a hand-rolled recursive traversal here that was
exactly what `RecursiveASTVisitor` does anyway. Using the visitor not
only eliminates the explicit traversal logic but also allows us to
introduce a common visitor base class for `getReferencedDecls()` and
`ResultObjectVisitor`, ensuring that the two are consistent in terms of
the nodes they visit. Inconsistency between these two has caused crashes
in the past when `ResultObjectVisitor` tried to propagate result object
locations to entities that weren't modeled because
`getReferencedDecls()` didn't visit them.
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders in which targets
are organized in Visual Studio and Xcode (`set_property(TARGET <target>
PROPERTY FOLDER "<title>")`) when using the respective CMake IDE
generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, which are still necessary
when `add_custom_target`, `add_executable`, `add_library`, etc. are
used. An LLVM_SUBPROJECT_TITLE definition is used for that in each
subproject's root CMakeLists.txt.
Depends on https://github.com/llvm/llvm-project/pull/92527
Clang now supports the following:
- Extending the lifetime of objects bound to reference members of
aggregates that are created from a default member initializer.
- Rebuilding `CXXDefaultArgExpr` and `CXXDefaultInitExpr` as needed
where called or constructed.
However, the CFG and ExprEngine need to be updated to account for this
change. This PR adds `CXXDefaultArgExpr` and `CXXDefaultInitExpr` to the
CFG and handles these expressions correctly in ExprEngine.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
This component can be useful when creating implementations of `Solver`,
as some SAT solvers require the input to be in 3-CNF.
As part of making `CNFFormula` externally accessible, I have moved some
member variables out of it that aren't really part of the representation
of a 3-CNF formula and thus live better elsewhere:
* `WatchedHead` and `NextWatched` have been moved to
`WatchedLiteralsSolverImpl`, as they're part of the specific algorithm
used by that SAT solver.
* `Atomics` has become an output parameter of `buildCNF()` because it
has to do with the relationship between a `CNFFormula` and the set of
`Formula`s it is derived from rather than being an integral part of the
representation of a 3-CNF formula.
I have also made all member variables private and added appropriate
accessors.
We routinely rely on implicit conversions of string literals to
StringRef so that we can use operator==(StringRef, StringRef).
The LHS here are all known to be of StringRef.
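The same idiom can be shown with the standard library. In this sketch `std::string_view` stands in for `StringRef` (an assumption made so the example compiles without LLVM), and `isMainFunction` is a hypothetical helper.

```cpp
#include <string_view>

// The left-hand side is already a string view, so the string literal on
// the right converts implicitly and the view-to-view comparison is used.
bool isMainFunction(std::string_view Name) {
  return Name == "main";
}
```

No explicit wrapper around the literal is needed for the comparison to pick the string-view overload.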
The -Wunsafe-buffer-usage warning should fire on any call to a function
annotated with [[clang::unsafe_buffer_usage]], however it omitted calls
to constructors, since the expression is a CXXConstructExpr which does
not subclass CallExpr. Thus the matcher on callExpr() does not find
these expressions.
Add a new WarningGadget that matches cxxConstructExpr nodes that call a
CXXConstructorDecl annotated with [[clang::unsafe_buffer_usage]] and
fires the warning. The new UnsafeBufferUsageCtorAttrGadget gadget
explicitly avoids matching against the std::span(ptr, size) constructor
because that is handled by SpanTwoParamConstructorGadget and we never
want two gadgets to match the same thing (and this is guarded by
asserts).
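A constructor carrying the annotation might look like the sketch below. The class `IntView`, the function `firstOf`, and the `UNSAFE_BUFFER_USAGE` macro are assumptions for the example; the macro falls back to nothing when the compiler does not know the attribute.

```cpp
#include <cstddef>

#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::unsafe_buffer_usage)
#define UNSAFE_BUFFER_USAGE [[clang::unsafe_buffer_usage]]
#endif
#endif
#ifndef UNSAFE_BUFFER_USAGE
#define UNSAFE_BUFFER_USAGE
#endif

// Hypothetical view type whose constructor trusts the caller's size.
class IntView {
public:
  UNSAFE_BUFFER_USAGE IntView(const int *Data, std::size_t Size)
      : Data(Data), Size(Size) {}

  std::size_t size() const { return Size; }
  int front() const { return *Data; }

private:
  const int *Data;
  std::size_t Size;
};

int firstOf(const int *P, std::size_t N) {
  // This construction is a constructor call, not a CallExpr, which is the
  // kind of expression the commit message says was previously missed.
  IntView V(P, N);
  return V.front();
}
```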
The gadgets themselves do not report the warnings; instead, each
gadget's Stmt is passed to the UnsafeBufferUsageHandler (implemented by
UnsafeBufferUsageReporter). The Reporter was previously hardcoded to
assume that a CXXConstructExpr statement must be a match for
std::span(ptr, size), but that is no longer the case. We want the
Reporter to generate different warnings (in the
-Wunsafe-buffer-usage-in-container subgroup) for the span constructor,
and we will want it to report more warnings for other
std-container-specific gadgets in the future. To handle this, we allow
the gadget to control whether the warning is general (it calls
handleUnsafeBufferUsage()) or std-container-specific (it calls
handleUnsafeOperationInContainer()).
Then the WarningGadget grows a virtual method to dispatch to the
appropriate path in the UnsafeBufferUsageHandler. By doing so, we no
longer need getBaseStmt in the Gadget interface. The only use of it for
FixableGadgets was to get the SourceLocation, so we make an explicit
virtual method for that on Gadget. Then the handleUnsafeOperation()
dispatcher can be a virtual method that is only in WarningGadget.
The SpanTwoParamConstructorGadget gadget dispatches to
handleUnsafeOperationInContainer() while the other WarningGadgets all
dispatch to the original handleUnsafeBufferUsage().
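The dispatch described above can be modeled in a few lines. The class and method names follow the commit message, but the signatures are simplified assumptions (strings instead of AST nodes, a log instead of diagnostics), and `runDemo` is a hypothetical driver.

```cpp
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for the handler: records which path was taken.
struct UnsafeBufferUsageHandler {
  std::vector<std::string> Log;
  void handleUnsafeBufferUsage(const std::string &What) {
    Log.push_back("general: " + What);
  }
  void handleUnsafeOperationInContainer(const std::string &What) {
    Log.push_back("container: " + What);
  }
};

// Each warning gadget decides which handler path it dispatches to.
struct WarningGadget {
  virtual ~WarningGadget() = default;
  virtual void handleUnsafeOperation(UnsafeBufferUsageHandler &H) const = 0;
};

struct UnsafeBufferUsageCtorAttrGadget : WarningGadget {
  void handleUnsafeOperation(UnsafeBufferUsageHandler &H) const override {
    H.handleUnsafeBufferUsage("annotated constructor");
  }
};

struct SpanTwoParamConstructorGadget : WarningGadget {
  void handleUnsafeOperation(UnsafeBufferUsageHandler &H) const override {
    H.handleUnsafeOperationInContainer("std::span(ptr, size)");
  }
};

// Hypothetical driver: runs both gadgets and returns what was reported.
std::vector<std::string> runDemo() {
  std::vector<std::unique_ptr<WarningGadget>> Gadgets;
  Gadgets.push_back(std::make_unique<UnsafeBufferUsageCtorAttrGadget>());
  Gadgets.push_back(std::make_unique<SpanTwoParamConstructorGadget>());
  UnsafeBufferUsageHandler Handler;
  for (const auto &G : Gadgets)
    G->handleUnsafeOperation(Handler);
  return Handler.Log;
}
```

The reporter no longer needs to inspect the statement kind to choose a warning, because the choice lives in each gadget's override.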
Tests are added for annotated constructors, conversion operators, call
operators, fold expressions, and regular methods.
Issue #80482
Assume in fewer places that the analysis is of a `FunctionDecl`, and
initialize the `Environment` properly for `Stmt`s.
Moves constructors for `Environment` to header to make it more obvious
that there are only minor differences between them and very little
initialization in the constructors.
Tested with check-clang-tooling.
- Instead of comparing the identity of the `PointerValue`s, compare the
underlying `StorageLocation`s.
- If the `StorageLocation`s are the same, return a definite "true" as
the result of the comparison. Before, if the `PointerValue`s were
different, we would return an atom, even if the storage locations
themselves were the same.
- If the `StorageLocation`s are different, return an atom (as before).
Pointers that have different storage locations may still alias, so we
can't return a definite "false" in this case.
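The comparison rule in the list above can be written down as a three-valued function. The types here are toy stand-ins with the same names as in the commit message, `comparePointers` is a hypothetical function, and an empty `std::optional` plays the role of the atom (result unknown).

```cpp
#include <optional>

// Toy stand-ins; these are not the dataflow framework's classes.
struct StorageLocation {};
struct PointerValue {
  const StorageLocation *Loc;
};

// Returns true when both pointers share a storage location, and
// std::nullopt (the "atom") otherwise. A definite false is never
// returned, because different locations may still alias.
std::optional<bool> comparePointers(const PointerValue &A,
                                    const PointerValue &B) {
  if (A.Loc == B.Loc)
    return true;
  return std::nullopt;
}
```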
The application-level gains from this are relatively modest. For the
Crubit nullability check running on an internal codebase, this change
reduces the number of functions on which the SAT solver times out from
223 to 221; the number of "pointer expression not modeled" errors
reduces from 3815 to 3778.
Still, it seems that the gain in precision is generally worthwhile.
@Xazax-hun inspired me to think about this with his
[comments](https://github.com/llvm/llvm-project/pull/73860#pullrequestreview-1761484615)
on a different PR.
The existing code was full of comments about how we assume this is
always the case, but it's not mandated by the standard, and there is
code out there that returns a different type. So check that the result
type is in fact the same as the destination type before attempting to
copy to the result.
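Reading the surrounding text as being about assignment operators (an assumption, since the commit title is not part of this excerpt), code of the following shape is legal C++ even though the operator does not return the destination object. The names `Counter` and `copyValue` are hypothetical.

```cpp
// A copy assignment operator may have any return type; `void` is legal.
struct Counter {
  int Value = 0;
  void operator=(const Counter &Other) { Value = Other.Value; }
};

// The assignment still copies the state; it just yields no result object.
int copyValue(int V) {
  Counter Source;
  Source.Value = V;
  Counter Dest;
  Dest = Source;
  return Dest.Value;
}
```

An analysis that unconditionally copies the destination into the result of `Dest = Source` has nothing of the right type to copy into here.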
To make sure that we don't bail out in more cases than intended, I've
extended existing tests to verify that in the common case, we do return
the destination object (by reference or value, as the case may be).
Trying to do so can cause crashes -- see newly added test and the
comments in the fix.
We're starting to see a repeating pattern here: We're getting crashes
because `ResultObjectVisitor` and `getReferencedDecls()` don't agree on
which parts of the AST to visit and, hence, which fields should be
modeled.
I think we should ensure consistency between these two parts of the code
by using a `RecursiveASTVisitor` in `getReferencedDecls()`[^1]; the
`Traverse...()` functions that control which parts of the AST we visit
would go in a common base class that would be used for both
`ResultObjectVisitor` and `getReferencedDecls()`.
I'd like to focus this PR, however, on a targeted fix for the current
crash and postpone the refactoring to a later PR (which will be easier
to revert if there are unintended side-effects).
[^1]: As an added bonus, this would make the code better structured and
more efficient than the current sequence of `if (dyn_cast<T>(...))`
statements.
`ConstantExpr` does not appear as a `CFGStmt` in the CFG, so
`StmtToEnvMap::getEnvironment()` was not finding an entry for it in the
map, causing a crash when we tried to access the iterator resulting from
the map lookup.
The fix is to make `ignoreCFGOmittedNodes()` ignore `ConstantExpr`, but
in addition, I'm hardening `StmtToEnvMap::getEnvironment()` to make sure
release builds don't crash in similar situations in the future.
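One common way to harden a lookup like this is to check the iterator against `end()` and return null instead of dereferencing it. The sketch below shows that pattern on a plain `std::unordered_map`; the `Environment` struct, the string keys, and the free function are assumptions for the example, not the actual `StmtToEnvMap` code.

```cpp
#include <string>
#include <unordered_map>

// Toy stand-in for an analysis environment.
struct Environment {
  int Id;
};

using EnvMap = std::unordered_map<std::string, Environment>;

// Returns the environment for Stmt, or nullptr if there is no entry.
// The iterator is only dereferenced after comparing it with end().
const Environment *getEnvironment(const EnvMap &Map, const std::string &Stmt) {
  auto It = Map.find(Stmt);
  if (It == Map.end())
    return nullptr;
  return &It->second;
}
```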
We used to crash if the previous iteration contained a `BoolValue` and
the current iteration contained an `IntegerValue`. The accompanying test
sets up this situation -- see comments there for details.
While I'm here, clean up the tests for integral casts to use the test
helpers we have available now. I was looking at these tests to
understand how we handle integral casts, and the test helpers make the
tests easier to read.