601 Commits

Author SHA1 Message Date
Thurston Dang
ef5022745c
[NFCI][msan] Refactor into 'horizontalReduce' (#152961)
The functionality is used by two helper functions, and will be used even more in the future (e.g.,
https://github.com/llvm/llvm-project/pull/152941).
2025-08-11 15:48:20 -07:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Nikita Popov
86727fe9a1
[IR] Allow poison argument to lifetime markers (#151148)
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.

It's worth noting that this does not require any conservative
assumptions, lifetimes with poison arguments can simply be skipped.

Fixes https://github.com/llvm/llvm-project/issues/151119.
2025-08-04 10:02:04 +02:00
Thurston Dang
56944e606a
[msan] Approximately handle AVX Galois Field Affine Transformation (#150794)
e.g.,
      <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8)
      <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8)
      <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8)
       Out                                    A          x          b
where A and x are packed matrices, b is a vector, Out = A * x + b in
GF(2)

Multiplication in GF(2) is equivalent to bitwise AND. However, the
matrix computation also includes a parity calculation.

For the bitwise AND of bits V1 and V2, the exact shadow is:
Out_Shadow = (V1_Shadow & V2_Shadow) | (V1 & V2_Shadow) | (V1_Shadow &
V2)

We approximate the shadow of gf2p8affine using:
  Out_Shadow =   _mm512_gf2p8affine_epi64_epi8(x_Shadow, A_shadow, 0)
               | _mm512_gf2p8affine_epi64_epi8(x, A_shadow, 0)
               | _mm512_gf2p8affine_epi64_epi8(x_Shadow, A, 0)
               | _mm512_set1_epi8(b_Shadow)

This approximation has false negatives: if an intermediate dot-product
contains an even number of 1's, the parity is 0.

It has no false positives.

Updates the test from https://github.com/llvm/llvm-project/pull/149258
2025-07-30 08:06:50 -07:00
Kazu Hirata
3e53d4d386
[llvm] Remove unused includes (NFC) (#150265)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:46 -07:00
Nikita Popov
b59aaf7da7
[Sanitizers] Remove handling for lifetimes on non-alloca insts (NFC) (#149994)
After #149310 the pointer argument of lifetime.start/lifetime.end is
guaranteed to be an alloca, so we don't need to go through
findAllocaForValue() anymore, and don't have to have special handling
for the case where it fails.
2025-07-23 09:48:32 +02:00
Thurston Dang
5c4877ee0d
[msan] Re-fix disjoint OR instrumentation from #145990 (#148760)
When disjoint OR was specified and a bit position contained a 1 in both
operands, #145990 would set the corresponding shadow bit to
uninitialized. However, the output of the operation is (LLVM) 'poison'
for the entire result, hence the entire shadow ought to be
uninitialized. This patch corrects the issue.
2025-07-15 15:32:15 -07:00
Thurston Dang
66850d0c06
[msan] Fix 'Simplify 'maskedCheckAVXIndexShadow' #147839' (#148785)
https://github.com/llvm/llvm-project/pull/147839/ incorrectly checked
the (lower bits of the) concrete value rather than the shadow.
2025-07-15 10:36:27 -07:00
Thurston Dang
6fc3b40b2c
[msan] Model is_int_min_poison to avoid false negative in abs (#148069)
Note: since this patch reduces false negatives, buggy code that formerly
passed with MSan may start failing.

The current MSan handler for abs, like Hercules' in New York, ignores
is_int_min_poison. This patch fixes the issue by poisoning the shadow
corresponding to each int_min input value if is_int_min_poison.
2025-07-10 16:47:53 -07:00
Fangrui Song
1ae99f5894 [msan] Fix -Wunused-but-set-variable after #147839 2025-07-09 18:14:19 -07:00
Thurston Dang
7c66099545
[msan] Simplify 'maskedCheckAVXIndexShadow' (#147839)
The current instrumentation has more or and element extraction than a
coal mine:

```
[[TMP10:%.*]] = extractelement <16 x i32> [[TMP9]], i64 0
[[TMP11:%.*]] = and i32 [[TMP10]], 15
[[TMP43:%.*]] = or i32 [[TMP10]], [[TMP11]]
[[TMP12:%.*]] = extractelement <16 x i32> [[TMP9]], i64 1
[[TMP13:%.*]] = and i32 [[TMP12]], 15
[[TMP44:%.*]] = or i32 [[TMP12]], [[TMP13]]
    ...
[[TMP40:%.*]] = extractelement <16 x i32> [[TMP9]], i64 15
[[TMP41:%.*]] = and i32 [[TMP40]], 15
[[TMP57:%.*]] = or i32 [[TMP40]], [[TMP41]]
[[_MSCMP:%.*]] = icmp ne i32 [[TMP57]], 0
br i1 [[_MSCMP]], label [[TMP102:%.*]], label [[TMP103:%.*]], !prof [[PROF1]]
```

Simplify it to:

```
[[TMP10:%.*]] = trunc <16 x i32> [[T]] to <16 x i4>
[[TMP12:%.*]] = bitcast <16 x i4> [[TMP10]] to i64
[[_MSCMP:%.*]] = icmp ne i64 [[TMP12]], 0
br i1 [[_MSCMP]], label %[[BB13:.*]], label %[[BB14:.*]], !prof [[PROF1]]
```
2025-07-09 17:56:16 -07:00
Thurston Dang
702784ca76
[msan] Check mask and rounding mode in handleAVX512VectorConvertFPToInt (#147782)
The checks were missing in "Add handler for
llvm.x86.avx512.mask.cvtps2dq.512
(https://github.com/llvm/llvm-project/pull/147377)
2025-07-09 13:06:45 -07:00
Thurston Dang
61d52ea764
[NFCI][msan] Refactor to use 'isFixedIntVector' etc. (#147789)
Inspired by a suggestion from Florian Google in
https://github.com/llvm/llvm-project/pull/147606#discussion_r2193548994
2025-07-09 12:07:19 -07:00
Thurston Dang
cc95e4039b
[msan] Handle AVX512 vector down convert (non-mem) intrinsics (#147606)
This handles `llvm.x86.avx512.mask.pmov{,s,us}.*.512` using
`handleIntrinsicByApplyingToShadow()` where possible, otherwise using a
customized slow-path handler, `handleAVX512VectorDownConvert()`.

Note that shadow propagation of `pmov{s,us}` (signed/unsigned
saturation) are approximated using truncation. Future work could extend
`handleAVX512VectorDownConvert()` to use `GetMinMaxUnsigned()` to handle
saturation precisely.
2025-07-08 20:51:19 -07:00
Florian Mayer
36dbe517a0
[NFC] [MSAN] disambiguate insertShadowCheck (#146616)
One of them operates on values, the other on shadows. It is confusing
for both of them to have the same name but only different number of
parameters.
2025-07-08 09:54:35 -07:00
Thurston Dang
3528e16ff8
[NFCI][msan] Extract 'maybeShrinkVectorShadow' and 'maybeExtendVectorShadowWithZeros' into helper functions (#147415)
These functions will be useful in other intrinsic handlers.
2025-07-07 17:59:37 -07:00
Florian Mayer
a3afbd33d8
[MSAN] only require needed bits to be initialized for permilvar (#147407) 2025-07-07 16:21:55 -07:00
Thurston Dang
556c8467d1
[msan] Add handler for llvm.x86.avx512.mask.cvtps2dq.512 (#147377)
Propagate the shadow according to the writemask, instead of using the
default strict handler.

Updates the test added in
https://github.com/llvm/llvm-project/pull/123980
2025-07-07 14:49:36 -07:00
Florian Mayer
0032148ea6
[MSAN] handle permi2var (#146437) 2025-07-07 11:24:17 -07:00
Thurston Dang
4cf53cd266
[msan] Fix "Add optional flag to improve instrumentation of disjoint OR (#145990)" (#146799)
The "V1" and "V2" values were already NOT'ed, hence the calculation of disjoint OR in #145990 was incorrect. This patch fixes the issue, with some refactoring and renaming of variables.
2025-07-02 20:15:58 -07:00
Florian Mayer
1f7ba23422
[NFC] [MSAN] replace (void) with [[maybe_unused]] (#146617)
The latter is preferred in the LLVM style guide.
2025-07-02 12:14:45 -07:00
Thurston Dang
afe6af14ff
[msan] Add optional flag to improve instrumentation of disjoint OR (#145990)
The disjoint OR (https://github.com/llvm/llvm-project/pull/72583) of two '1's is poison, hence the MSan ought to consider the result uninitialized (rather than initialized - i.e. a false negative - as per the existing instrumentation which ignores disjointedness). This patch adds a flag, `-msan-precise-disjoint-or`, which defaults to false (the legacy behavior). A future patch will default this flag to true.

Updates the test from https://github.com/llvm/llvm-project/pull/145982
2025-06-26 22:55:55 -07:00
Thurston Dang
5a194c1fd9
[msan] Sharpen instrumentation of Intrinsic::{ctlz,cttz} (#145609)
The current instrumentation of Intrinsic::{ctlz,cttz} has false positives. For example, consider `ctlz(0001 11??)` whereby `0` and `1` denotes initialized bits (with concrete values of 0 and 1 respectively) and `?` denotes an uninitialized bit. The result (of 3) is well-defined and the shadow ought to be fully initialized, but the current instrumentation marks it as fully uninitialized.

This patch improves the fidelity of the instrumentation by comparing the number of leading (for ctlz; trailing for cttz) zeros in the concrete value and the shadow.

This patch also renames the function from 'handleCountZeroes' to 'handleLeadingTrailingCountZeros', to clarify that the intrinsics handled do not count all the zeros (unlike `llvm.ctpop`, which counts all the 1s).
2025-06-25 09:29:59 -07:00
Thurston Dang
c85466dcd4
Reapply "[msan] Automatically print shadow for failing outlined checks" (#145611) (#145615)
This reverts commit 5eb5f0d8760c6b757c1da22682b5cf722efee489 i.e.,
relands 1b71ea411a9d36705663b1724ececbdfec7cc98c.

Test case was failing on aarch64 because the long double type is
implemented differently on x86 vs aarch64. This reland restricts the
test to x86.

----

Original CL description:
    
A commonly used aid for debugging MSan reports is
`__msan_print_shadow()`, which requires manual app code annotations
(typically of the variable in the UUM report or nearby). This is in
contrast to ASan, which automatically prints out the shadow map when a
check fails.
    
This patch changes MSan to print the shadow that failed an outlined
check (checks are outlined per function after the
`-msan-instrumentation-with-call-threshold` is exceeded) if verbosity >=
1. Note that we do not print out the shadow map of "neighboring"
variables because this is technically infeasible; see "Caveat" below.
    
This patch can be easier to use than `__msan_print_shadow()` because
this does not require manual app code annotations. Additionally, due to
optimizations, `__msan_print_shadow()` calls can sometimes spuriously
affect whether a variable is initialized.
    
As a side effect, this patch also enables outlined checks for
arbitrary-sized shadows (vs. the current hardcoded handlers for
{1,2,4,8}-byte shadows).
    
Caveat: the shadow does not necessarily correspond to an individual user
variable, because MSan instrumentation may combine and/or truncate
multiple shadows prior to emitting a check that the mangled shadow is
zero (e.g., `convertShadowToScalar()`,
`handleSSEVectorConvertIntrinsic()`, `materializeInstructionChecks()`).
OTOH it is arguably a strength that this feature emit the shadow that
directly matters for the MSan check, but which cannot be obtained using
the MSan API.
2025-06-24 20:33:11 -07:00
Thurston Dang
5eb5f0d876
Revert "[msan] Automatically print shadow for failing outlined checks" (#145611)
Reverts llvm/llvm-project#145107

Reason: buildbot breakage
(https://lab.llvm.org/buildbot/#/builders/51/builds/18512)
2025-06-24 15:53:19 -07:00
Thurston Dang
1b71ea411a
[msan] Automatically print shadow for failing outlined checks (#145107)
A commonly used aid for debugging MSan reports is `__msan_print_shadow()`, which requires manual app code annotations (typically of the variable in the UUM report or nearby). This is in contrast to ASan, which automatically prints out the shadow map when a check fails.

This patch changes MSan to print the shadow that failed an outlined check (checks are outlined per function after the `-msan-instrumentation-with-call-threshold` is exceeded) if verbosity >= 1. Note that we do not print out the shadow map of "neighboring" variables because this is technically infeasible; see "Caveat" below.

This patch can be easier to use than `__msan_print_shadow()` because this does not require manual app code annotations. Additionally, due to optimizations, `__msan_print_shadow()` calls can sometimes spuriously affect whether a variable is initialized.

As a side effect, this patch also enables outlined checks for arbitrary-sized shadows (vs. the current hardcoded handlers for {1,2,4,8}-byte shadows).

Caveat: the shadow does not necessarily correspond to an individual user variable, because MSan instrumentation may combine and/or truncate multiple shadows prior to emitting a check that the mangled shadow is zero (e.g., `convertShadowToScalar()`, `handleSSEVectorConvertIntrinsic()`, `materializeInstructionChecks()`). OTOH it is arguably a strength that this feature emit the shadow that directly matters for the MSan check, but which cannot be obtained using the MSan API.
2025-06-24 15:09:44 -07:00
Florian Mayer
61a969b867
Revert "[MSAN] handle assorted AVX permutations" (#145404)
Rolling back while investigating an issue that might be caused by this.
2025-06-23 14:15:39 -07:00
Thurston Dang
33a92af1b2
[msan] Add off-by-default flag to fix false negatives from partially undefined constant fixed-length vectors (#143837)
This patch adds an off-by-default flag which, when enabled via `-mllvm -msan-poison-undef-vectors=true`, fixes a false negative in MSan (partially-undefined constant fixed-length vectors). It is currently off by default since, by fixing the false positive, code/tests that previously passed MSan may start failing. The default will be changed in a future patch.

Prior to this patch, MSan computes that partially-undefined constant fixed-length vectors are fully initialized, which leads to false negatives; moreover, benign vector rewriting could theoretically flip MSan's shadow computation from initialized to uninitialized or vice-versa (*). `-msan-poison-undef-vectors=true` calculates the shadow precisely: for each element of the vector, the corresponding shadow is fully uninitialized if the element is undefined/poisoned, otherwise it is fully initialized.

Updates the test from https://github.com/llvm/llvm-project/pull/143823

(*) For example:
  ```
  %x = insertelement <2 x i64> <i64 0, i64 poison>, i64 42, i64 0
  %y = insertelement <2 x i64> <i64 poison, i64 poison>, i64 42, i64 0
  ```
%x and %y are equivalent but, prior to this patch, MSan incorrectly computes the shadow of %x as <0, 0> rather than <0, -1>.
2025-06-20 10:11:12 -07:00
Florian Mayer
d7e64d9594
[MSAN] handle assorted AVX permutations (#143462) 2025-06-13 15:48:46 -07:00
Thurston Dang
2efff47363
[NFCI][msan] Show that shadow for partially undefined constant vectors is computed as fully initialized (#143823)
This happens because `getShadow(Value *V)` has a special case for fully undefined/poisoned values, but partially undefined values fall-through and are given a clean shadow. This leads to false negatives (no false positives).

Note: MSan correctly handles InsertElementInst, but the shadow of the initial constant vector may still be wrong and be propagated.

Showing that the same approximation happens for other composite types is left as an exercise for the reader.
2025-06-11 22:43:06 -07:00
Florian Mayer
23fd60d996
[MSAN] support vpermilvar AVX instructions (#143053) 2025-06-09 17:57:19 -07:00
Thurston Dang
d398f476c5
[msan] Rename '-msan-dump-strict-intrinsics' to '-msan-dump-heuristic-instructions' (#143186)
This updates the flag from https://github.com/llvm/llvm-project/pull/123381

Also expands the description of msan-dump-strict-*instructions*
2025-06-06 13:06:00 -07:00
Kazu Hirata
aa15596b5f
[llvm] Remove unused local variables (NFC) (#138478) 2025-05-04 21:33:54 -07:00
Thurston Dang
d913ea307e
[msan] Implement support for avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round (#137441)
This adds a handler, visitGenericScalarHalfwordInst, which works for
mask.{add/sub/mul/div/max/min}.sh.round.

Updates the tests in https://github.com/llvm/llvm-project/pull/136260
2025-04-28 18:41:23 -07:00
Thurston Dang
d1f4f52aa6
[msan] Handle x86.avx512fp16.{add,sub.mul,div,min,max}.ph.512 (#136619)
These are handled similarly to x86_avx512_(min|max)_p[sd]_512 intrinsics
(https://github.com/llvm/llvm-project/pull/124421) i.e., using
maybeHandleSimpleNomemIntrinsic, with the last parameter being the
rounding method.

Updates the test from https://github.com/llvm/llvm-project/pull/136260
2025-04-21 16:45:19 -07:00
k-kashapov
7b4b43bd15
[MSan] Separated PPC32 va_arg helper from PPC64 (#131827)
With more understanding of PowerPC32 ABI we've rewritten the
`VarArgPowerPC32Helper`.
New implementation fills shadow for both `reg_save_area` and
`overflow_arg_area`.
It does not copy shadow for floating-point arguments, as they are stored
in a separate space.
This implementation does not fully support passing arguments `byVal`.
This will be fixed in future PRs.
Tests were also updated via `llvm/utils/update_test_checks.py`.
2025-04-09 12:36:17 -07:00
k-kashapov
f76d9da6d8
[MSan] Update type for MsanMetadataPtrForLoadN and MsanMetadataPtrForStoreN (#135040)
Changed last parameter type for `MsanMetadataPtrForLoadN` and
`MsanMetadataPtrForStoreN` from `i64` to `IntptrTy` to keep it
consistent with call in `getShadowOriginPtrKernelNoVec`

Co-authored-by: Kamil Kashapov <kashapov@ispras.ru>
2025-04-09 10:30:24 -07:00
k-kashapov
b712068af2
[nfc][Msan] Split PPC VarArg Helper into PPC32 and PPC64 (#134860)
As discussed in https://github.com/llvm/llvm-project/pull/131827, copied
ppc32 helper from ppc64. No functional changes have been made.
2025-04-08 15:19:21 -07:00
k-kashapov
271399831b
[MSan] Change overflow_size_tls type to IntPtrTy (#117689)
As discussed in
https://github.com/llvm/llvm-project/pull/109284#discussion_r1838819987:
Changed `__msan_va_arg_overflow_size_tls` type from `Int64Ty` to
`IntPtrTy`.
2025-04-08 09:51:13 -07:00
Rahul Joshi
74b7abf154
[IRBuilder] Add new overload for CreateIntrinsic (#131942)
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
2025-03-31 08:10:34 -07:00
Thurston Dang
8726e97345
[msan] Handle SSE2 cvt(t?)ps2dq/cvt(t?)pd2dq and cvtpd2ps using handleSSEVectorConvertIntrinsicByProp (#132815)
cvt(t?)ps2dq/cvt(t?)pd2dq and cvtpd2ps are currently handled strictly.
This patch handles them using handleSSEVectorConvertIntrinsicByProp
(from https://github.com/llvm/llvm-project/pull/130705), generalized to
handle SSE intrinsics that do not have a rounding mode parameter.
2025-03-28 17:59:59 -07:00
Thurston Dang
5946696d67
[msan] Handle NEON vector load (#130457)
This adds an explicit handler for:
- llvm.aarch64.neon.ld1x2, llvm.aarch64.neon.ld1x3,
llvm.aarch64.neon.ld1x4
- llvm.aarch64.neon.ld2, llvm.aarch64.neon.ld3, llvm.aarch64.neon.ld4
- llvm.aarch64.neon.ld2lane, llvm.aarch64.neon.ld3lane,
llvm.aarch64.neon.ld4lane
- llvm.aarch64.neon.ld2r, llvm.aarch64.neon.ld3r, llvm.aarch64.neon.ld4r
instead of relying on the default strict handler.

Updates the tests from https://github.com/llvm/llvm-project/pull/125267
2025-03-19 20:46:14 -07:00
Thurston Dang
c30ff922ca
[msan] Handle llvm.x86.vcvtps2ph.128/256 explicitly (#130705)
Check whether each lane is fully initialized, and propagate the shadow
per lane instead of using the strict handling of visitInstruction.

Updates the tests from https://github.com/llvm/llvm-project/pull/129807
2025-03-13 16:15:37 -04:00
Thurston Dang
667bbd2ecc
[msan] Apply handleVectorReduceIntrinsic to max/min vector instructions (#129819)
Changes the handling of:
- llvm.aarch64.neon.smaxv
- llvm.aarch64.neon.sminv
- llvm.aarch64.neon.umaxv
- llvm.aarch64.neon.uminv
- llvm.vector.reduce.smax
- llvm.vector.reduce.smin
- llvm.vector.reduce.umax
- llvm.vector.reduce.umin
- llvm.vector.reduce.fmax
- llvm.vector.reduce.fmin
from the default strict handling (visitInstruction) to
handleVectorReduceIntrinsic.

Also adds a parameter to handleVectorReduceIntrinsic to specify whether
the return type must match the elements of the vector.

Updates the tests from https://github.com/llvm/llvm-project/pull/129741,
https://github.com/llvm/llvm-project/pull/129810,
https://github.com/llvm/llvm-project/pull/129768
2025-03-08 19:31:48 -08:00
Thurston Dang
3a0c33afd1
[msan] Handle Arm NEON pairwise min/max instructions (#129824)
Change the handling of:
- llvm.aarch64.neon.fmaxp
- llvm.aarch64.neon.fminp
- llvm.aarch64.neon.fmaxnmp
- llvm.aarch64.neon.fminnmp
- llvm.aarch64.neon.smaxp
- llvm.aarch64.neon.sminp
- llvm.aarch64.neon.umaxp
- llvm.aarch64.neon.uminp
from the incorrect heuristic handler (maybeHandleSimpleNomemIntrinsic)
to handlePairwiseShadowOrIntrinsic.

Updates the tests from https://github.com/llvm/llvm-project/pull/129760

Adds a note that maybeHandleSimpleNomemIntrinsic may incorrectly match
horizontal/pairwise intrinsics.
2025-03-08 19:08:08 -08:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
Thurston Dang
5d404d75cf
[msan] Generalize handlePairwiseShadowOrIntrinsic, and handle x86 pairwise add/sub (#127567)
x86 pairwise add and sub are currently handled by applying the pairwise add intrinsic to the shadow (https://github.com/llvm/llvm-project/pull/124835), due to the lack of an x86 pairwise OR intrinsic. handlePairwiseShadowOrIntrinsic was added (https://github.com/llvm/llvm-project/pull/126008) to handle Arm
pairwise add, but assumes that the intrinsic operates on each pair of elements as defined by the LLVM type. In contrast, x86 pairwise add/sub may sometimes have e.g., <1 x i64> as a parameter but actually be operating on <2 x i32>.

This patch generalizes handlePairwiseShadowOrIntrinsic, to allow reinterpreting the parameters to be a vector of specified element size, and then uses this function to handle x86 pairwise add/sub.
2025-02-26 21:57:02 -08:00
Thurston Dang
51d8255203
[msan] Handle Arm NEON saturating extract and narrow (#125742)
This handles NEON saturating extract and narrow (Intrinsic::aarch64_neon_{sqxtn, sqxtun, uqxtn}) by (ab)using handleShadowOr() to perform the shadow cast. Previously, these were unknown intrinsics handled suboptimally by visitInstruction.

Updates the tests from https://github.com/llvm/llvm-project/pull/125288 and https://github.com/llvm/llvm-project/pull/125140
2025-02-12 16:22:49 -08:00
Thurston Dang
0d95631a3a
[msan] Handle llvm.[us]cmp (starship operator) (#125804)
Apply handleShadowOr to llvm.[us]cmp. Previously, llvm.[su]cmp was correctly handled heuristically when each parameter type is the same as the return type (e.g., `call i8 @llvm.ucmp.i8.i8(i8 %x, i8 %y)`) but handled incorrectly by visitInstruction when the return type is different e.g., (`call i8 @llvm.ucmp.i8.i62(i62 %x, i62 %y)`, `call <4 x i8> @llvm.ucmp.v4i8.v4i32(<4 x i32> %x, <4 x i32> %y)`).

Updates the tests from https://github.com/llvm/llvm-project/pull/125790
2025-02-12 13:38:45 -08:00
Thurston Dang
e9e6ba6a5e
[msan] Handle single-parameter Arm NEON vector convert intrinsics (#126136)
This handles the following llvm.aarch64.neon intrinsics, which were suboptimally handled by visitInstruction:
- fcvtas, fcvtau
- fcvtms, fcvtmu
- fcvtns, fcvtnu
- fcvtps, fcvtpu
- fcvtzs, fcvtzu

The old instrumentation checked that the shadow of every element of the input vector was fully initialized, and aborted otherwise. The new instrumentation propagates the shadow: for each element of the output, the shadow is initialized iff the corresponding element of the input is *fully* initialized (since these are floating-point to integer conversions).

Updates the tests from https://github.com/llvm/llvm-project/pull/126095
2025-02-12 13:20:22 -08:00