In rare cases `SplitBlockAndInsertSimpleForLoop` in `paintOrigin`
invalidates outside iterators and crashes.
Somehow the existing `SplitBlockAndInsertIfThen` does not invalidate
iterators.
Msan uses `__msan_param_tls` to pass the shadow of
arguments. The position of each argument is expected to be
known at compile time, which requires the size of the
argument to be known. This is not true for vscale.
As a workaround we require that vscale parameters
are always initialized, so we don't need to pass their
shadow.
Return values should work out of the box, as we don't
need to know their size at compile time.
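As a standalone illustration (plain C++ model with assumed constants, not the actual pass code), this is why the per-argument offsets into `__msan_param_tls` need compile-time sizes:
```cpp
#include <cstdint>
#include <vector>

constexpr uint64_t kParamTLSSize = 800;            // assumed size of __msan_param_tls

// Byte offset of each argument's shadow inside __msan_param_tls. The
// accumulation only works because every size is a compile-time constant; a
// scalable (vscale) argument has no fixed size, so no offset can be computed
// for it, hence the workaround of requiring such parameters to be
// initialized and not passing their shadow at all.
std::vector<uint64_t> paramShadowOffsets(const std::vector<uint64_t> &FixedArgSizes) {
  std::vector<uint64_t> Offsets;
  uint64_t Offset = 0;
  for (uint64_t Size : FixedArgSizes) {
    Offsets.push_back(Offset);                     // usable only while Offset < kParamTLSSize
    Offset += (Size + 7) & ~uint64_t(7);           // assumed 8-byte slot alignment
  }
  return Offsets;
}
```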
The code expects a compile-time `VectorOrPrimitiveTypeSizeInBits` value,
which is not available for vscale.
In the trivial case of identical types we need to do nothing.
Almost NFC; the instrumentation is as correct as it was before.
We need InstrumentationList grouped by origin instruction,
so we used stable_sort. However, these objects are already grouped,
because we never interleave sequences of `insertShadowCheck` calls
for different instructions.
Sorting by pointer had the artifact that the order depended on allocator
behavior, so we could insert checks in a different order.
There is no test, as I failed to reproduce this with `opt`. My guess
is that a reproducer would need to increase fragmentation in the
allocator.
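For illustration, a standalone sketch (hypothetical types, not the pass code) of consuming the already-grouped list with a single linear scan instead of a pointer-keyed sort:
```cpp
#include <cstddef>
#include <vector>

struct ShadowCheck { const void *OriginIns; /* shadow, origin, ... */ };

// Visit [Begin, End) ranges of consecutive checks that share an origin
// instruction. Because checks for different instructions are never
// interleaved, one pass recovers the groups in insertion order, with no
// dependence on allocator-assigned pointer values.
template <class Visitor>
void forEachCheckGroup(const std::vector<ShadowCheck> &List, Visitor Visit) {
  for (size_t Begin = 0, End = 0; Begin < List.size(); Begin = End) {
    for (End = Begin + 1;
         End < List.size() && List[End].OriginIns == List[Begin].OriginIns; ++End)
      ;
    Visit(Begin, End);
  }
}
```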
We don't need to create these instances of ArrayRef because
ConstantDataVector::get takes ArrayRef, and ArrayRef can be implicitly
constructed from C arrays.
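As a small hedged example (the helper name and values are made up), the resulting call-site pattern looks like this:
```cpp
#include "llvm/IR/Constants.h"
#include <cstdint>
using namespace llvm;

// Hypothetical helper: build a constant <4 x i32> vector of indices.
Constant *makeIndexVector(LLVMContext &Ctx) {
  uint32_t Idxs[4] = {0, 1, 2, 3};
  // ConstantDataVector::get takes ArrayRef<uint32_t>; the C array converts
  // implicitly, so no explicit ArrayRef temporary is needed.
  return ConstantDataVector::get(Ctx, Idxs);
}
```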
We have a lot of repeated code with arbitrary constants.
The particular values are not important; one just needs to be
bigger than the other.
UR_NONTAKEN_WEIGHT is selected as it's the most common one.
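As a hedged illustration only (the constant names and values below are placeholders, not the real ones), the repeated pattern being unified is building branch weights where only the relative magnitude matters:
```cpp
#include "llvm/IR/MDBuilder.h"
using namespace llvm;

// Placeholder weights: the exact numbers are irrelevant, the "likely"
// weight only needs to be much larger than the "unlikely" one.
static constexpr uint32_t kUnlikelyWeight = 1;
static constexpr uint32_t kLikelyWeight = 100000;

// Metadata marking the error/report branch as cold.
MDNode *coldBranchWeights(LLVMContext &Ctx) {
  return MDBuilder(Ctx).createBranchWeights(kUnlikelyWeight, kLikelyWeight);
}
```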
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
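A hedged sketch of the intended call-site shape, assuming the signature `isKnownNonZero(const Value *, const SimplifyQuery &, unsigned Depth = 0)`:
```cpp
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/DataLayout.h"
using namespace llvm;

// Hypothetical caller: with SimplifyQuery implicitly constructible from
// DataLayout and Depth defaulted after it, a caller that only has a
// DataLayout stays as simple as it was before #85863.
bool pointerKnownNonNull(const Value *Ptr, const DataLayout &DL) {
  return isKnownNonZero(Ptr, DL); // Depth defaults to 0
}
```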
Modify #77393 to clear shadow memory using `llvm.memset.*` when the size
is large, similar to `shouldUseBZeroPlusStoresToInitialize` in clang for
`-ftrivial-auto-var-init=`. The intrinsic, if lowered to libcall, will
use the msan interceptor.
The instruction selector lowers a `StoreInst` to multiple stores, not
utilizing `memset`. When the size is large (e.g.
`store { [100 x i32] } zeroinitializer, ptr %12, align 1`), the
generated code will be long (and `CodeGenPrepare::optimizeInst` will
even crash for a huge size).
```
#include <array>

// Test stack size.
template <class T>
void DoNotOptimize(const T &var) { // deprecated by https://github.com/google/benchmark/pull/1493
  asm volatile("" : "+m"(const_cast<T &>(var)));
}

int main() {
  using LargeArray = std::array<int, 1000000>;
  auto large_stack = []() { DoNotOptimize(LargeArray()); };
  // CodeGenPrepare::optimizeInst triggers an assertion failure when creating
  // an integer type with a bit width > 2**23.
  large_stack();
}
```
KMSAN defaults to `msan-handle-asm-conservative`, which inserts
`__msan_instrument_asm_store` calls to unpoison indirect outputs in
inline assembly (e.g. `=m` constraints in source).
```c
unsigned f() {
  unsigned v;
  // __msan_instrument_asm_store unpoisons v before invoking the asm.
  asm("movl $1,%0" : "=m"(v));
  return v;
}
```
Extend the mechanism to userspace, but for now require an explicit
`-mllvm -msan-handle-asm-conservative` to enable it for experiments.
As
https://docs.kernel.org/dev-tools/kmsan.html#inline-assembly-instrumentation
says, this approach may mask certain errors (an indirect output may not
actually be initialized), but it also helps to avoid a lot of false
positives.
Link: https://github.com/google/sanitizers/issues/192
The caller puts argument shadow one by one into __msan_va_arg_tls until it
reaches kParamTLSSize. After that it still increments OverflowOffset but
does not store the shadow.
The callee needs OverflowOffset to prepare shadow for the entire overflow
area. This is done by creating a "varargs shadow copy" for the complete
list of args, copying the available shadow from __msan_va_arg_tls, and
clearing the rest.
However, the callee does not know whether the tail of __msan_va_arg_tls was
unable to fit an argument; it copies the tail shadow into the "varargs
shadow copy" anyway, and that stale data is later used as the shadow of the
omitted argument.
Therefore the tail of __msan_va_arg_tls must be cleared if it is left
unused.
This allows us to enable compiler-rt/test/msan/vararg_shadow.cpp for
x86.
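For illustration, a standalone model (plain C++; the buffer size, 8-byte slots, and helper are assumptions) of the caller-side protocol and the tail clearing:
```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr size_t kParamTLSSize = 800;              // assumed size of __msan_va_arg_tls

struct ArgShadow { std::vector<uint8_t> Bytes; };  // per-argument shadow (model only)

// Caller-side model: store each argument's shadow while it fits, but keep
// growing OverflowOffset regardless so the callee can size the whole
// overflow area. Returns OverflowOffset.
size_t fillVaArgTLS(uint8_t (&VaArgTLS)[kParamTLSSize],
                    const std::vector<ArgShadow> &Args) {
  size_t OverflowOffset = 0;
  size_t StoredEnd = 0;                              // bytes actually written
  for (const ArgShadow &A : Args) {
    size_t Slot = (A.Bytes.size() + 7) & ~size_t(7); // assumed 8-byte slots
    if (OverflowOffset + Slot <= kParamTLSSize) {
      std::memcpy(VaArgTLS + OverflowOffset, A.Bytes.data(), A.Bytes.size());
      StoredEnd = OverflowOffset + Slot;
    }
    OverflowOffset += Slot;                          // grows even when not stored
  }
  // The fix: clear the unused tail so the callee, which copies the whole
  // buffer into its "varargs shadow copy", never picks up stale shadow for
  // an argument whose shadow was omitted.
  if (StoredEnd < kParamTLSSize)
    std::memset(VaArgTLS + StoredEnd, 0, kParamTLSSize - StoredEnd);
  return OverflowOffset;
}
```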
Reviewers: kstoimenov, thurstond
Reviewed By: thurstond
Pull Request: https://github.com/llvm/llvm-project/pull/72707
3bc439bdff8bb5518098bd9ef52c56ac071276bc implemented overflow copying in a different way.
It happens to pass this test by luck, but may fail in a different way.
Reviewers: thurstond, iii-i
Reviewed By: thurstond
Pull Request: https://github.com/llvm/llvm-project/pull/72710
Cast StackSaveAreaPtr, GrRegSaveAreaPtr, VrRegSaveAreaPtr to pointers to
fix assertions in getShadowOriginPtrKernel().
Fixes: https://github.com/llvm/llvm-project/issues/69738
Patch by Mark Johnston.
Partial progress towards removing in-tree uses of `getPointerTo()`,
by employing the following options:
* Drop the call entirely if the sole purpose of it is to support a no-op
bitcast (remove the no-op bitcast as well).
* Replace with `PointerType::get()`/`PointerType::getUnqual()` (see the
  sketch below)
This is an NFC cleanup effort.
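A hedged illustration of the second option (hypothetical helper; with opaque pointers only the address space matters):
```cpp
#include "llvm/IR/DerivedTypes.h"
using namespace llvm;

// Hypothetical helper. Before the cleanup this would have been
// ElemTy->getPointerTo(AddrSpace); with opaque pointers the element type is
// irrelevant, so the pointer type is built from the context alone.
PointerType *pointerTypeFor(Type *ElemTy, unsigned AddrSpace = 0) {
  return PointerType::get(ElemTy->getContext(), AddrSpace);
}
```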
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D155232
This patch adds support for variadic arguments for loongarch64,
based on the MIPS64 implementation. All `check-msan` tests pass.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D158587
Stop using Type::getPointerTo as we do not need typed pointers
nowadays.
Remove no longer used arguments to getOriginPtrForArgument,
getShadowPtrForRetval and getOriginPtrForRetval.
Since we no longer support typed LLVM IR pointer types, the code can
be simplified, for example by using PointerType::get directly instead
of Type::getInt8PtrTy, Type::getInt32PtrTy, etc.
Differential Revision: https://reviews.llvm.org/D156733
This patch enables msan in LLVM and fixes all failing tests in
check-msan.
It does not add a VarArgHelper implementation for LoongArch, which
will be done separately later. It also adds a test for VarArgNoOpHelper,
based on the X86 one.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D152692
Move `AttributeMask` out of `llvm/IR/Attributes.h` to a new file
`llvm/IR/AttributeMask.h`. After doing this we can remove the
`#include <bitset>` and `#include <set>` directives from `Attributes.h`.
Since there are many headers including `Attributes.h`, but not needing
the definition of `AttributeMask`, this causes unnecessary bloating of
the translation units and slows down compilation.
This commit adds in the include directive for `llvm/IR/AttributeMask.h`
to the handful of source files that need to see the definition.
This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,917,509,187 to 1,902,982,273 - a
reduction of ~0.76%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D153728
c55fffe replaced fcmp with fpclass.
```
declare i1 @llvm.is.fpclass(<fptype> <op>, i32 <test>)
declare <N x i1> @llvm.is.fpclass(<vector-fptype> <op>, i32 <test>)
```
A perfect fix would require checking the bits of <op> corresponding to the
<test> argument. For now, just propagate the shadow without reporting
before the intrinsic. The existing handling of fcmp is also a simple OR,
so this does not make things worse.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D149491
Enable -fsanitize=kernel-memory support in Clang.
The x86_64 ABI requires that shadow_origin_ptr_t must be returned via a
register pair, and the s390x ABI requires that it must be returned via
memory pointed to by a hidden parameter. Normally Clang takes care of
the ABI, but the sanitizers run long after it, so unfortunately they
have to duplicate the ABI logic.
Therefore add a special case for SystemZ and manually emit the
s390x-ABI-compliant calling sequences. Since it's only 2 architectures,
do not create a VarArgHelper-like abstraction layer.
The kernel functions are compiled with the "packed-stack" and
"use-soft-float" attributes. For the "packed-stack" functions, it's not
correct for copyRegSaveArea() to copy 160 bytes of shadow and origins,
since the save area is dynamically sized. Things are greatly simplified
by the fact that the vararg "use-soft-float" functions use precisely
56 bytes in order to save the argument registers to where va_arg() can
find them.
Make copyRegSaveArea() copy only 56 bytes in the "use-soft-float" case.
The "packed-stack" && !"use-soft-float" case has no practical uses at
the moment, so leave it for the future.
Add tests.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D148596
Ironically, MSan copies uninitialized data off the stack into
VAArgTLSCopy in the callee-side handling of va_start. Clamp the copy
size to the actual length of the buffer, and zero-initialize the
remainder.
Differential Revision: https://reviews.llvm.org/D146858
This adds support for scalable vector types - at least far enough to get basic load and store cases working. It turns out that load/store without origin tracking already worked; I apparently got that working with one of the preparatory patches that switched to the TypeSize utilities and didn't notice. The code changes here are required to enable origin tracking.
For origin tracking, a 4 byte value - the origin - is broadcast into a shadow region whose size exactly matches the type being accessed. This origin is only written if the shadow value is non-zero. The details of how shadow is computed from the original value being stored aren't relevant for this patch.
The code changes involve two related primitives.
First, we need to be able to perform that broadcast into a scalable-sized memory region. This requires the use of a loop and an appropriate bound. The fixed-size case optimizes with larger stores and alignment; I did not bother with that for the scalable case for now. We can optimize this codepath later if desired.
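A standalone model (plain C++, not the instrumentation code) of that broadcast; with a scalable type the region size is a runtime value, hence the loop instead of a fixed sequence of wide stores:
```cpp
#include <cstdint>
#include <cstring>

// Broadcast the 4-byte origin across a region whose size is only known at
// run time (vscale-dependent), one origin-sized store per loop iteration.
void paintOriginModel(uint8_t *OriginPtr, uint64_t RegionSize, uint32_t Origin) {
  for (uint64_t Off = 0; Off + sizeof(Origin) <= RegionSize; Off += sizeof(Origin))
    std::memcpy(OriginPtr + Off, &Origin, sizeof(Origin));
}
```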
Second, we need a way to test if the shadow is zero. The mechanism for this in the code is to convert the shadow value into a scalar, and then zero check that. There's an assumption that this scalar is zero exactly when all elements of the shadow value are zero. As a result, we use an OR reduction on the scalable vector. This is analogous to how e.g. an array is handled. I landed a bunch of cleanup changes to remove other direct uses of the scalar conversion to convince myself there were no other undocumented invariants.
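A hedged IRBuilder-style sketch of that zero check (the helper name is made up; the point is the OR reduction, which is legal for scalable vectors):
```cpp
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Hypothetical helper: collapse a (possibly scalable) vector shadow to one
// scalar that is zero exactly when every element is zero, then test it.
Value *shadowAllZero(IRBuilder<> &IRB, Value *Shadow) {
  Value *Scalar = Shadow;
  if (isa<VectorType>(Shadow->getType()))
    Scalar = IRB.CreateOrReduce(Shadow); // llvm.vector.reduce.or handles scalable vectors
  return IRB.CreateIsNull(Scalar);
}
```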
Differential Revision: https://reviews.llvm.org/D146157
This is an implementation detail of the flattening scheme, so hide it in the implementation thereof. This does require one caller to go through the appropriate utility, but doing that makes the code cleaner anyways.
This is part of prework for supporting scalable vector types. This isn't NFC because it shifts the point of failure (i.e. which assert triggers first), but should be NFC for all non-scalable vector inputs.