In PR #79382, I need to add a new type that derives from
ConstantArrayType. This means that ConstantArrayType can no longer use
`llvm::TrailingObjects` to store the trailing optional Expr*.
This change refactors ConstantArrayType to store a 60-bit integer and
4-bits for the integer size in bytes. This replaces the APInt field
previously in the type but preserves enough information to recreate it
where needed.
To reduce the number of places where the APInt is re-constructed I've
also added some helper methods to the ConstantArrayType to allow some
common use cases that operate on either the stored small integer or the
APInt as appropriate.
Resolves#85124.
Without the fix gcc warned like
../../clang/lib/Analysis/UnsafeBufferUsage.cpp:2203:26: warning: unused variable 'CArrTy' [-Wunused-variable]
2203 | } else if (const auto *CArrTy = Ctx.getAsConstantArrayType(
| ^~~~~~
Example:
int * const my_var = my_initializer;
Currently when transforming my_var to std::span the fixits:
- replace "int * const my_var = " with "std::span<int> const my_var {"
- add ", SIZE}" after "my_initializer" where SIZE is either inferred or
a placeholder
This patch makes that behavior less intrusive by not modifying variable
cv-qualifiers and initialization syntax.
The new behavior is:
- replace "int *" with "std::span<int>"
- add "{" before "my_initializer"
- add ", SIZE}" after "my_initializer"
This is an improvement on its own - since we don't touch the identifier,
we automatically can handle macros in them.
It also simplifies future work on initializer fixits.
Example:
int arr[10];
int * ptr = arr;
If ptr is unsafe and we transform it to std::span then the fixit we'd
currently provide transforms the code to:
std::span<int> ptr{arr, 10};
That's suboptimal as that repeats the size of the array in the code.
The idiomatic transformation should rely on the span constructor
that takes just the array argument and relies on template
parameter autodeduction to set the span size.
The transformed code should look like:
std::span<int> ptr = arr;
Note that it just should not change the initializer at all and that also
works for other forms of initialization like:
int * ptr {arr};
becoming:
std::span<int> ptr{arr};
This patch changes the initializer handling to the desired (empty)
fixit.
Introducing CArrayToPtrAssignment gadget and implementing fixits for some cases
of array being assigned to pointer.
Key observations:
- const size array can be assigned to std::span and bounds are propagated
- const size array can't be on LHS of assignment
This means array to pointer assignment has no strategy implications.
Fixits are implemented for cases where one of the variables in the assignment is
safe. For assignment of a safe array to unsafe pointer we know that the RHS will
never be transformed since it's safe and can immediately emit the optimal fixit.
Similarly for assignment of unsafe array to safe pointer.
(Obviously this is not and can't be future-proof in regards to what
variables we consider unsafe and that is fine.)
Fixits for assignment from unsafe array to unsafe pointer (from Array to Span
strategy) are not implemented in this patch as that needs to be properly designed
first - we might possibly implement optimal fixits for partially transformed
cases, put both variables in a single fixit group or do something else.
[-Wunsafe-buffer-usage] Ignore safe array subscripts
Don't emit warnings for array subscripts on constant size arrays where the index is constant and within bounds.
Example:
int arr[10];
arr[5] = 0; //safe, no warning
This patch recognizes only array indices that are integer literals - it doesn't understand more complex expressions (arithmetic on constants, etc.).
-Warray-bounds implemented in Sema::CheckArrayAccess() already solves a similar
(opposite) problem, handles complex expressions and is battle-tested.
Adding -Wunsafe-buffer-usage diagnostics to Sema is a non-starter as we need to emit
both the warnings and fixits and the performance impact of the fixit machine is
unacceptable for Sema.
CheckArrayAccess() as is doesn't distinguish between "safe" and "unknown" array
accesses. It also mixes the analysis that decides if an index is out of bounds
with crafting the diagnostics.
A refactor of CheckArrayAccess() might serve both the original purpose
and help us avoid false-positive with -Wunsafe-buffer-usage on constant
size arrrays.
Currently we ignore calls on function pointers (unlike direct calls of
functions and class methods). This patch adds support for function pointers as
well.
The change is to simply replace use of forEachArgumentWithParam matcher in UPC
gadget with forEachArgumentWithParamType.
from the documentation of forEachArgumentWithParamType:
/// Matches all arguments and their respective types for a \c CallExpr or
/// \c CXXConstructExpr. It is very similar to \c forEachArgumentWithParam but
/// it works on calls through function pointers as well.
Currently the matcher also uses hasPointerType() which checks that the
canonical type of an argument is pointer and won't match on arrays decayed to
pointer. Replacing hasPointerType() with isAnyPointerType() which allows
implicit casts allows for the arrays to be matched as well and this way we get
fixits for array arguments to function pointer calls too.
Covers cases where DeclRefExpr referring to a const-size array decays to a
pointer and is used "as a pointer" (e. g. passed to a pointer type
parameter).
Since std::array<T, N> doesn't implicitly convert to pointer to its element
type T* the cast needs to be done explicitly as part of the fixit
when we retrofit std::array to code that previously worked with constant
size array. std::array::data() method is used for the explicit
cast.
In terms of the fixit machine this covers the UPC(DRE) case for Array fixit strategy.
The emitted fixit inserts call to std::array::data() method similarly to
analogous fixit for Span strategy.
Array subscript on a const size array is not bounds-checked. The idiomatic
replacement is std::array which is bounds-safe in hardened mode of libc++.
This commit extends the fixit-producing machine to consider std::array as a
transformation target type and teaches it to handle the array subscript on const
size arrays with a trivial (empty) fixit.
Debug notes for unclaimed DeclRefExpr should report any DRE of an unsafe
variable that is not covered by a Fixable (i. e. fixit for the
particular AST pattern isn't implemented for whatever reason). Currently
not all unclaimed DeclRefExpr-s are reported which is a bug. The debug
notes report only those DREs where the referred VarDecl has at least one
other DeclRefExpr which is claimed (covered by a fixit). If there is an
unsafe VarDecl that has exactly one DRE and the DRE isn't claimed then
the debug note about missing fixit won't be emitted. That is because the
debug note is emitted from within a loop over set of successfully
matched FixableGadgets which by-definition is missing those DRE that are
not matched at all.
The new code simply iterates over all unsafe VarDecls and all of their
unclaimed DREs.
Constructing `std::span` objects with the two parameter constructors
could introduce mismatched bounds information, which defeats the
purpose of using `std::span`. Therefore, we warn every use of such
constructors.
rdar://115817781
The patch fixes the crash introduced by the DataInvocation warning
gadget designed to warn against unsafe invocations of span::data method.
It also now considers the invocation of span::data method inside
parenthesis.
Radar: 121223051
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
…-Wunsafe-buffer-usage,
there maybe accidental re-introduction of new OutOfBound accesses into
the code bases. One such case is invoking span::data() method on a span
variable to retrieve a pointer, which is then cast to a larger type and
dereferenced. Such dereferences can introduce OutOfBound accesses.
To address this, a new WarningGadget is being introduced to warn against
such invocations.
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Reported by Static Analyzer tool:
In getSourceRangeToTokenEnd(clang::Decl const *, clang::SourceManager
const &, clang::LangOptions): A very large function call parameter
exceeding the high threshold is passed by value
pass_by_value: Passing parameter LangOpts of type clang::LangOptions
(size 1784 bytes) by value, which exceeds the high threshold of 512
bytes
- For a better understand of what the unsupported cases are, we add
more information to the debug note---a string of ancestor AST nodes
of the unclaimed DRE. For example, an unclaimed DRE p in an
expression `*(p++)` will result in a string starting with
`DRE ==> UnaryOperator(++) ==> Paren ==> UnaryOperator(*)`.
- To find out the most common patterns of those unsupported use cases,
we add a simple script to build a prefix tree over those strings and
count each prefix. The script reads input line by line, assumes a
line is a list of words separated by `==>`s, and builds a prefix tree
over those lists.
Reviewed by: t-rasmud (Rashmi Mudduluru), NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D158561
- Use Strategy to determine whether to fix a parameter
- Fix the `Strategy` construction so that only variables on the graph
are assigned the `std::span` strategy
Reviewed by: t-rasmud (Rashmi Mudduluru), NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D157441
For a function `F` whose parameters need to be fixed, we group fix-its
of F's parameters together so that either all of the parameters get
fixed or none of them gets fixed.
Reviewed by: NoQ (Artem Dergachev), t-rasmud (Rashmi Mudduluru), jkorous (Jan Korous)
Differential revision: https://reviews.llvm.org/D153059
We have to give up on fixing a variable declaration if it has
specifiers that are not supported yet. We could support these
specifiers incrementally using the same approach as how we deal with
cv-qualifiers. If a fixing variable declaration has a storage
specifier, instead of trying to find out the source location of the
specifier or to avoid touching it, we add the keyword to a
canonicalized place in the fix-it text that replaces the whole
declaration.
Reviewed by: NoQ (Artem Dergachev), jkorous (Jan Korous)
Differential revision: https://reviews.llvm.org/D156192
Refactor the code for local variable fix-its so that it reuses the
code for parameter fix-its, which is in general better. For example,
cv-qualifiers are supported.
Reviewed by: NoQ (Artem Dergachev), t-rasmud (Rashmi Mudduluru)
Differential revision: https://reviews.llvm.org/D156189
This breaks the clang check-format on CI.
+ grep -rnI '[[:blank:]]$' clang/lib clang/include clang/docs
clang/lib/Analysis/UnsafeBufferUsage.cpp:2277:#endif
This reverts commit ac9a76d7487b9af1ace626eb90064194cb12c53d.
Previously an abstract class has no pure virtual function. It causes build error on some bots.
Slightly refactor and optimize the code in preparation for
implementing grouping parameters for a single fix-it.
Reviewed by: NoQ (Artem Dergachev), t-rasmud (Rashmi Mudduluru)
Differential revision: https://reviews.llvm.org/D156474
- Factor out the code that will be shared by both parameter and local variable fix-its
- Add a check to ensure that a TypeLoc::isNull is false before using the TypeLoc
- Remove the special check for whether a fixing variable involves unnamed types. This check is unnecessary now.
- Move tests for cv-qualified parameters and unnamed types out of the "...-unsupported.cpp" test file.
Reviewed by: NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D156188
Fix the linting problems which causes `clang/utils/ci/run-buildbot check-format` to return 1.
Also make a correction for the email address of the author of
0fd4175907b40fe63131482c162d7e0f76000521:
The correct email address is "ar.ashouri999@gmail.com", not "ar.ashouri999@google.com".
Reviewed by: ziqingluo-90 (Ziqing Luo)
Differential revision: https://reviews.llvm.org/D155814
`FixableGadget`s are not always associated with variables that are unsafe
(warned). For example, they could be associated with variables whose
unsafe operations are suppressed or that are not used in any unsafe
operation. Such `FixableGadget`s will not be fixed. Removing these
`FixableGadget` as early as possible helps improve the performance
and stability of the analysis.
Reviewed by: NoQ (Artem Dergachev), t-rasmud (Rashmi Mudduluru)
Differential revision: https://reviews.llvm.org/D155524
The safe-buffer analysis analyzes TypeLocs of types of variable
declarations in order to get source locations of them.
However, in some cases, the source locations of a TypeLoc are not
valid. Using invalid source locations results in assertion violation
or incorrect analysis or fix-its.
It is still not clear to me in what circumstances a TypeLoc does not
have valid source locations (it looks like a bug in Clang to me, but
it is not our responsibility to fix it). So we will conservatively
give up the analysis when required source locations are not valid.
Reviewed By: NoQ (Artem Dergachev)
Differential Revision: https://reviews.llvm.org/D155667
It is possible that a function parameter does not have a name even in
a function definition. This patch deals with such cases in generating
function overload fix-its for safe buffers.
Reviewed by: NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D155641
For a fix-it that inserts the `[[clang::unsafe_buffer_usage]]`
attribute, it will lookup existing macros defined for the attribute
and use the (last defined such) macro directly. Fix-its will use raw
`[[clang::unsafe_buffer_usage]]` if no such macro is defined.
The implementation mimics how a similar machine for the
`[[fallthrough]]` attribute was implemented.
Reviewed by: NoQ (Artem Dergachev)
Differential revision: https://reviews.llvm.org/D150338