519 Commits

Author SHA1 Message Date
Kamil Kashapov
b94a24e5dd [nfc][msan] Reorder ifs in CreateVarArgHelper
Part of #109284
2024-11-12 00:26:35 -08:00
Vitaly Buka
adb476b012
[nfc][msan] Clang-format MemorySanitizer.cpp (#115828)
Extracted from #109284

Co-authored-by: Kamil Kashapov <kashapov@ispras.ru>
2024-11-11 23:17:05 -08:00
Thurston Dang
e549ec529c
[msan] Add handleIntrinsicByApplyingToShadow; support NEON tbl/tbx (#114490)
This adds a general function that handles intrinsics by applying the
intrinsic to the shadows, and applies it to the specific case of Arm
NEON TBL/TBX intrinsics.

This also updates the tests from
https://github.com/llvm/llvm-project/pull/114462
2024-11-01 14:58:45 -07:00
Vitaly Buka
cf8d24531e
[msan] Reduces overhead of #113200, by 10% (#113201)
CTMark #113200 size overhead was 5.3%, now it's 4.7%.

The patch affects only signed integers.

https://alive2.llvm.org/ce/z/Lv5hyi

* The patch replaces code which extracted sign bit,
maximized/minimized it, then packed it back, with
simple sign bit flip. The another way to think about
transformation is as a subtraction of MIN_SINT from
A/B. Then we map MIN_SINT to 0, 0 to -MIN_SINT, and
MAX_SINT to MAX_UINT.

* Then to maximize/minimize A/B we don't need
to extract sign bit, we can apply shadow the
same way as to other bits.

* After sign bit flip, we had to switch to unsigned
version of the predicates.

* After change above  getHighestPossibleValue/getLowestPossibleValue
became very similar, so we can combine into a single function.

* Because the function does sign bit flip and
requires unsigned predicates used for returned values,
there is no point in keeping it as a member of class,
to hide, we switch to function local lambda.
2024-10-24 20:46:49 -07:00
Vitaly Buka
c77d8edf80
Revert "Revert "[msan] Switch to -msan-handle-icmp-exact my default"" (#113379)
Reverts llvm/llvm-project#113376

Fixed with #113378
2024-10-22 14:05:35 -07:00
Vitaly Buka
71792dc570
[NFC][msan] Workaround arg evaluation order diff GCC vs Clang (#113378) 2024-10-22 13:31:46 -07:00
Vitaly Buka
c3aa8b7dd6
Revert "[msan] Switch to -msan-handle-icmp-exact my default" (#113376)
Reverts llvm/llvm-project#113200

Breaks bots, see llvm/llvm-project#113200
2024-10-22 13:05:59 -07:00
Vitaly Buka
395093ec15
[msan] Switch to -msan-handle-icmp-exact my default (#113200)
Fixes #111212.

This grows .text by 5.3% on CTMark, (or 2.6% large internal binary)
Perf regressed by 1.6%. We will try to improve in follow up patches.

It worth to pay some performance regression to fix
correctness to avoid stuff like #111212.
2024-10-22 12:35:18 -07:00
Jay Foad
85c17e4092
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
2024-10-17 16:20:43 +01:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Antonio Frighetto
2ae968a0d9
[Instrumentation] Move out to Utils (NFC) (#108532)
Utility functions have been moved out to Utils. Minor opportunity to
drop the header where not needed.
2024-09-15 21:07:40 -07:00
Nikita Popov
03d5b7ca3d [MemorySanitizer] Don't create types pointers (NFC)
Everything in this pass uses a single addrspace 0 pointer type.
Don't try to create it using the typed pointer ctor.

This allows removing the type argument from
getShadowPtrForVAArgument().
2024-09-05 11:54:56 +02:00
Chaitanya
62ced8116b
[Sanitizer] Make sanitizer passes idempotent (#99439)
This PR changes the sanitizer passes to be idempotent. 
When any sanitizer pass is run after it has already been run before,
double instrumentation is seen in the resulting IR. This happens because
there is no check in the pass, to verify if IR has been instrumented
before.

This PR checks if "nosanitize_*" module flag is already present and if
true, return early without running the pass again.
2024-08-12 11:16:44 +05:30
Thurston Dang
cb5ec3796a
[msan] Support vst{2,3,4}_lane instructions (#101215)
This generalizes MSan's Arm NEON vst support, to include the
lane-specific variants.

This also updates the test from
https://github.com/llvm/llvm-project/pull/100645.
2024-08-09 10:16:38 -07:00
Thurston Dang
4ce559d059
[msan] Support most Arm NEON vector shift instructions (#102507)
This adds support for the Arm NEON vector shift instructions that follow
the same pattern as x86 (handleVectorShiftIntrinsic).

VSLI is not supported because it does not follow the 2-argument pattern
expected by handleVectorShiftIntrinsic.

This patch also updates the arm64-vshift.ll MSan test that was
introduced in
5d0a12d3e9
2024-08-08 17:02:04 -07:00
Thurston Dang
bbde3f6e9d
[msan] Support vst1x_{2,3,4} and vst_{2,3,4} with floating-point parameters (#100644)
Cloning the vst_ intrinsics to apply them to the shadows did not work if
the arguments were floating-point, since the shadows are integers. This
patch changes MSan to create an intrinsic of the correct integer types.

Additionally, this patch adds support for vst1x_{2,3,4}; these can be
handled similarly to vst_{2,3,4}, since in all cases we are adapting the
corresponding intrinsic.
    
This also updates the tests.
2024-07-29 20:57:28 -07:00
James Y Knight
dfeb3991fb
Remove the x86_mmx IR type. (#98505)
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.

This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.

This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.

Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)

Works towards issue #98272.
2024-07-25 09:19:22 -04:00
Thurston Dang
54dab7dfcf
[msan] Implement support for Arm NEON vst{2,3,4} instructions (#99360)
This adds support for vst{2,3,4}, which are not correctly handled by
handleUnknownIntrinsic/handleVector{Load,Store}Intrinsic.

This patch also updates the tests introduced in
https://github.com/llvm/llvm-project/pull/98247 and
https://github.com/llvm/llvm-project/pull/99555

---------

Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
2024-07-19 11:02:57 -07:00
Sam James
996d31c7ba
[msan] Fix goo.gl link in comment for Valgrind paper
goo.gl is going away: https://developers.googleblog.com/en/google-url-shortener-links-will-no-longer-be-available/

Fix goo.gl link from:
- http://goo.gl/QKbem
+ https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html
and reflow the comment a bit to make it look a bit better after the URL change,
although it's not perfect now.

Committed as obvious.

Bug: https://github.com/llvm/llvm-project/issues/99586
2024-07-19 00:54:24 +01:00
Thurston Dang
7002ecb4c6
[msan] Convert vector shadow to scalar before zext (#96722)
zext does not allow converting vector shadow to scalar, so we must
manually convert it prior to calling zext in materializeOneCheck, for
which the 'ConvertedShadow' parameter isn't actually guaranteed to be
scalar (1). Note that it is safe/no-op to call convertShadowToScalar on
a shadow that is already scalar.

In contrast, the storeOrigin function already converts the (potentially
vector) shadow to scalar; we add a comment to note why it is load
bearing.

(1) In materializeInstructionChecks():
"// Disable combining in some cases. TrackOrigins checks each shadow to
pick
 // correct origin.
 bool Combine = !MS.TrackOrigins;
 ...
       if (!Combine) {
        materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
        continue;
      }"
2024-07-03 12:40:12 -07:00
Kazu Hirata
4b28b3fae4
[Transforms] Use range-based for loops (NFC) (#97195) 2024-07-02 16:20:44 -07:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Vitaly Buka
34aa6c5d9a [msan] Handle blendv intrinsics (#94882)
blendvs are very similar to select, so we adjust
arguments and forward them into select handler.
2024-06-12 19:44:18 -07:00
Vitaly Buka
3bd9d4dedf
[msan] Implement shadow propagation for _mm_dp_pd, _mm_dp_ps, _mm256_dp_ps (#94875)
Default intrinsic handling was to report any
uninitialized part of argument. However intrinsics
use mask which allow to ignore parts of input, so
it's OK to have vectors partially initialized.
2024-06-11 22:48:40 -07:00
NMiehlbradt
1b66306c9c
[KMSAN] Enable on PowerPC64 (#73611)
Enable -fsanitize=kernel-memory support in Clang.

Add tests.

---------

Co-authored-by: Nicholas Miehlbradt <nicholas@linux.ibm.com>
2024-06-12 13:32:39 +08:00
Vitaly Buka
983bf65794
[NFC][msan] Extract handleSelectLikeInst (#94881)
`blendv` instructions are very similar to `select`.
We will add support for them in followup patches.
2024-06-10 13:12:00 -07:00
Vitaly Buka
4f416989d8
[NFC][msan] Prepare function to extract main logic (#94880) 2024-06-10 13:05:05 -07:00
Vitaly Buka
3016c0636f
[NFCI][msan] Use IntPtr for vscales origin for consistency (#90920) 2024-05-02 17:23:57 -07:00
Vitaly Buka
83fdcf234f
[msan] Fix vscale alloca poisoning (#90912) 2024-05-02 16:44:10 -07:00
Vitaly Buka
a2be1b8d03
[msan] Don't modify CFG iterating it (#90691)
In rare cases `SplitBlockAndInsertSimpleForLoop` in `paintOrigin`
crashes outsize iterators.

Somehow existing `SplitBlockAndInsertIfThen` do not invalidate
iterators.
2024-05-01 14:47:00 -07:00
Vitaly Buka
8cf0f9ab2f
[msan] Add conservative handling of vscale params (#90167)
Msan uses `__msan_param_tls` to pass shadow of
arguments. Position of arguments is expected to be
available during compile time, if size of the
argument is know. This is not true for vscale.

As work around we require that vscale parameters
are always initialized, then we don't need to pass
shadow.

Ret val should work out of the box as we don't
need to know size compile time.
2024-04-26 15:26:57 -07:00
Vitaly Buka
21b84928f9
[msan] Don't crash in CreateShadowCast on vscale (#90126)
Code expects `VectorOrPrimitiveTypeSizeInBits` compile time value,
which is not available for vscale.
    
In trivial case of the same type we need to do nothing.
2024-04-25 21:29:24 -07:00
Vitaly Buka
4f4ebee10e
[msan] Eliminate non-deterministic behavior in the pass (#89831)
Almost NFC, instrumentation is as correct as it was before.

We need InstrumentationList grouped by origin instruction,
so we used stable_sort. However these objects already grouped
because we never interleave sequences of `insertShadowCheck`
of different instrunction.

Pointer sort has artifact that it was deppendent on allocator behavior,
so we could inserted checks in a different order.

There is no test, as I failed to reproduce this with `opt`. My guess
is that for reproducer we need to increase fragmentation in the
allocator.
2024-04-23 16:19:47 -07:00
Vitaly Buka
1d14034873 [NFC][msan] Add DebugInstrumentInstruction DEBUG_COUNTER 2024-04-23 11:20:50 -07:00
Vitaly Buka
06cc1754f8 [NFC][msan] Fix typo in comment 2024-04-23 11:20:50 -07:00
Kazu Hirata
d674f45d51
[Transforms] Remove extraneous ArrayRef (NFC) (#89535)
We don't need to create these instances of ArrayRef because
ConstantDataVector::get takes ArrayRef, and ArrayRef can be implicitly
constructed from C arrays.
2024-04-21 08:21:23 -07:00
Vitaly Buka
c60aa430dc
[NFCI][sanitizers][metadata] Exctract create{Unlikely,Likely}BranchWeights (#89464)
We have a lot of repeated code with random constants.
Particular values are not important, the one just needs to be
bigger then another.

UR_NONTAKEN_WEIGHT is selected as it's the most common one.
2024-04-19 17:03:23 -07:00
Harald van Dijk
60de56c743
[ValueTracking] Restore isKnownNonZero parameter order. (#88873)
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
2024-04-16 15:21:09 +01:00
Yingwei Zheng
e0a628715a
[ValueTracking] Convert isKnownNonZero to use SimplifyQuery (#85863)
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can
use the context information from `DomCondCache`.

Fixes https://github.com/llvm/llvm-project/issues/85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj
2024-04-12 23:47:20 +08:00
Evgenii Stepanov
e72c949c15
[msan] Overflow intrinsics. (#88210) 2024-04-10 09:12:25 -07:00
Evgenii Stepanov
5bc87dac75 Revert "Overflow and saturating intrinsics (#88068)"
This reverts commit 118a5d8236d8a483dd401fa35c8b1fcd058eacc1.
2024-04-08 17:02:21 -07:00
Evgenii Stepanov
118a5d8236
Overflow and saturating intrinsics (#88068) 2024-04-08 16:33:45 -07:00
Fangrui Song
9b91c54d9b
[msan] Unpoison indirect outputs for userspace using memset for large operands (#79924)
Modify #77393 to clear shadow memory using `llvm.memset.*` when the size
is large, similar to `shouldUseBZeroPlusStoresToInitialize` in clang for
`-ftrivial-auto-var-init=`. The intrinsic, if lowered to libcall, will
use the msan interceptor.

The instruction selector lowers a `StoreInst` to multiple stores, not
utilizing `memset`. When the size is large (e.g.
`store { [100 x i32] } zeroinitializer, ptr %12, align 1`), the
generated code will be long (and `CodeGenPrepare::optimizeInst` will
even crash for a huge size).

```
// Test stack size
template <class T>
void DoNotOptimize(const T& var) { // deprecated by https://github.com/google/benchmark/pull/1493
  asm volatile("" : "+m"(const_cast<T&>(var)));
}

int main() {
  using LargeArray = std::array<int, 1000000>;
  auto large_stack = []() { DoNotOptimize(LargeArray()); };
  /////// CodeGenPrepare::optimizeInst triggers an assertion failure when creating an integer type with a bit width>2**23
  large_stack();
}
```
2024-01-30 13:45:47 -08:00
Fangrui Song
1ae0448ed3
[msan] Enable msan-handle-asm-conservative for userspace by default (#79251)
msan-handle-asm-conservative is enabled by KMSAN by default.
Enable the userspace by default as well after #77393.
2024-01-24 15:31:43 -08:00
Fangrui Song
c71a5bf940
[msan] Unpoison indirect outputs for userspace when -msan-handle-asm-conservative is specified (#77393)
KMSAN defaults to `msan-handle-asm-conservative`, which inserts
`__msan_instrument_asm_store` calls to unpoison indirect outputs in
inline assembly (e.g. `=m` constraints in source).

```c
unsigned f() {
  unsigned v;
  // __msan_instrument_asm_store unpoisons v before invoking the asm.
  asm("movl $1,%0" : "=m"(v));
  return v;
}
```

Extend the mechanism to userspace, but require explicit
`-mllvm -msan-handle-asm-conservative` for experiments for now.

As

https://docs.kernel.org/dev-tools/kmsan.html#inline-assembly-instrumentation
says, this approach may mask certain errors (an indirect output may not
actually be initialized), but it also helps to avoid a lot of false
positives.

Link: https://github.com/google/sanitizers/issues/192
2024-01-19 16:18:28 -08:00
Nikita Popov
6c2fbc3a68
[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582)
This abstracts over the common pattern of creating a gep with i8 element
type.
2024-01-12 14:21:21 +01:00
Vitaly Buka
66e9429e75
[msan][aarch64] Improve argument classification
Arm64 use multiple registers (varg slots) to pass arrays.

Reviewers: kstoimenov, thurstond

Reviewed By: thurstond

Pull Request: https://github.com/llvm/llvm-project/pull/72728
2023-11-17 17:01:34 -08:00
Vitaly Buka
e7f350951b
[msan][aarch64] Fix cleanup of unused part of overflow area
Similar to a05e736d288a7f2009ee9d057e78713d9adeeb5f.

Reviewers: thurstond, kstoimenov

Reviewed By: thurstond

Pull Request: https://github.com/llvm/llvm-project/pull/72722
2023-11-17 16:48:05 -08:00
Vitaly Buka
a05e736d28
[msan][x86] Fix shadow if vararg overflow beyond kParamTLSSize
Caller puts argument shadow one by one into __msan_va_arg_tls, until it
reaches kParamTLSSize. After that it still increment OverflowOffset but
does not store the shadow.

Callee needs OverflowOffset to prepare a shadow for the entire overflow
area. It's done by creating "varargs shadow copy" for complete list of
args, copying available shadow from __msan_va_arg_tls, and clearing the
rest.

However callee does not know if the tail of __msan_va_arg_tls was not
able to fit an argument, and callee will copy tail shadow into "varargs
shadow copy", and later used as a shadow for an omitted argument.

So that unused tail of the __msan_va_arg_tls must be cleared if left
unused.

This allows us to enable compiler-rt/test/msan/vararg_shadow.cpp for
x86.

Reviewers: kstoimenov, thurstond

Reviewed By: thurstond

Pull Request: https://github.com/llvm/llvm-project/pull/72707
2023-11-17 15:13:11 -08:00
Vitaly Buka
a30e9a1a57 [NFC][msan] Fix formating 2023-11-17 14:31:44 -08:00