Thurston Dang 638bd11c13
[msan] Handle SSE/AVX pshuf intrinsic by applying to shadow (#153895)
llvm.x86.sse.pshuf.w(<1 x i64>, i8) and llvm.x86.avx512.pshuf.b.512(<64
x i8>, <64 x i8>) are currently handled strictly, which is suboptimal.

llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>),
llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>) and
llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>) are currently heuristically
handled using maybeHandleSimpleNomemIntrinsic, which is incorrect.

Since the second argument is the shuffle order, we instrument all these
intrinsics using `handleIntrinsicByApplyingToShadow(...,
/*trailingVerbatimArgs=*/1)`
(https://github.com/llvm/llvm-project/pull/114490).
2025-08-15 20:28:30 -07:00


//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
/// \file
/// This file is a part of MemorySanitizer, a detector of uninitialized
/// reads.
///
/// The algorithm of the tool is similar to Memcheck
/// (https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html)
/// We associate a few shadow bits with every byte of the application memory,
/// poison the shadow of the malloc-ed or alloca-ed memory, load the shadow
/// bits on every memory read, propagate the shadow bits through some of the
/// arithmetic instructions (including MOV), store the shadow bits on every
/// memory write, and report a bug on some other instructions (e.g. JMP) if
/// the associated shadow is poisoned.
///
/// But there are differences too. The first and major one is
/// compiler instrumentation instead of binary instrumentation. This
/// gives us much better register allocation, possible compiler
/// optimizations and a fast start-up. But it also brings a major
/// requirement: msan needs to see all program events, including system
/// calls and reads/writes in system libraries, so we either need to
/// compile *everything* with msan or use a binary translation
/// component (e.g. DynamoRIO) to instrument pre-built libraries.
/// Another difference from Memcheck is that we use 8 shadow bits per
/// byte of application memory and use a direct shadow mapping. This
/// greatly simplifies the instrumentation code and avoids races on
/// shadow updates (Memcheck is single-threaded so races are not a
/// concern there. Memcheck uses 2 shadow bits per byte with a slow
/// path storage that uses 8 bits per byte).
///
/// The default value of shadow is 0, which means "clean" (not poisoned).
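///
/// For example, with 8 shadow bits per application byte, the 32-bit shadow
/// 0x0000ff00 of a 4-byte value means that byte 1 of the value is fully
/// uninitialized while bytes 0, 2 and 3 are initialized.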
///
/// Every module initializer should call __msan_init to ensure that the
/// shadow memory is ready. On error, __msan_warning is called. Since
/// parameters and return values may be passed via registers, we have a
/// specialized thread-local shadow for return values
/// (__msan_retval_tls) and parameters (__msan_param_tls).
///
/// Origin tracking.
///
/// MemorySanitizer can track origins (allocation points) of all uninitialized
/// values. This behavior is controlled with a flag (msan-track-origins) and is
/// disabled by default.
///
/// Origins are 4-byte values created and interpreted by the runtime library.
/// They are stored in a second shadow mapping, one 4-byte value for 4 bytes
/// of application memory. Propagation of origins is basically a bunch of
/// "select" instructions that pick the origin of a dirty argument, if an
/// instruction has one.
///
/// Every aligned group of 4 consecutive bytes of application memory has one
/// origin value associated with it. If these bytes contain uninitialized data
/// coming from 2 different allocations, the last store wins. Because of this,
/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
/// practice.
///
/// Origins are meaningless for fully initialized values, so MemorySanitizer
/// avoids storing origin to memory when a fully initialized value is stored.
/// This way it avoids needlessly overwriting the origin of a 4-byte region
/// on a short (i.e. 1 byte) clean store, and it is also good for performance.
///
/// Atomic handling.
///
/// Ideally, every atomic store of application value should update the
/// corresponding shadow location in an atomic way. Unfortunately, an atomic
/// store to two disjoint locations cannot be done without severe slowdown.
///
/// Therefore, we implement an approximation that may err on the safe side.
/// In this implementation, every atomically accessed location in the program
/// may only change from (partially) uninitialized to fully initialized, but
/// not the other way around. We load the shadow _after_ the application load,
/// and we store the shadow _before_ the app store. Also, we always store clean
/// shadow (if the application store is atomic). This way, if the store-load
/// pair constitutes a happens-before arc, shadow store and load are correctly
/// ordered such that the load will get either the value that was stored, or
/// some later value (which is always clean).
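///
/// For example, with a release store in thread T1 and a matching acquire
/// load in thread T2, the instrumentation produces:
///   T1: store clean shadow; then atomic store of the app value
///   T2: atomic load of the app value; then load shadow
/// If T2's load observes T1's store, the happens-before edge guarantees that
/// T2 reads the clean shadow (or a later, also clean, value) rather than a
/// stale poisoned one.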
///
/// This does not work very well with Compare-And-Swap (CAS) and
/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
/// must store the new shadow before the app operation, and load the shadow
/// after the app operation. Computers don't work this way. Current
/// implementation ignores the load aspect of CAS/RMW, always returning a clean
/// value. It implements the store part as a simple atomic store by storing a
/// clean shadow.
///
/// Instrumenting inline assembly.
///
/// For inline assembly code LLVM has little idea about which memory locations
/// become initialized depending on the arguments. It may be possible to figure
/// out which arguments are meant to point to inputs and outputs, but the
/// actual semantics can only be visible at runtime. In the Linux kernel it's
/// also possible that the arguments only indicate the offset for a base taken
/// from a segment register, so it's dangerous to treat any asm() arguments as
/// pointers. We take a conservative approach generating calls to
///   __msan_instrument_asm_store(ptr, size)
/// which defer the memory unpoisoning to the runtime library.
/// The latter can perform more complex address checks to figure out whether
/// it's safe to touch the shadow memory.
/// Like with atomic operations, we call __msan_instrument_asm_store() before
/// the assembly call, so that changes to the shadow memory will be seen by
/// other threads together with main memory initialization.
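///
/// For example, for an asm statement with a memory output operand "=m"(v),
/// the conservative instrumentation inserts, before the statement, a call
/// along the lines of
///   __msan_instrument_asm_store(&v, sizeof(v));
/// (see also the -msan-handle-asm-conservative flag below).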
///
/// KernelMemorySanitizer (KMSAN) implementation.
///
/// The major differences between KMSAN and MSan instrumentation are:
/// - KMSAN always tracks the origins and implies msan-keep-going=true;
/// - KMSAN allocates shadow and origin memory for each page separately, so
/// there are no explicit accesses to shadow and origin in the
/// instrumentation.
/// Shadow and origin values for a particular X-byte memory location
/// (X=1,2,4,8) are accessed through pointers obtained via the
/// __msan_metadata_ptr_for_load_X(ptr)
/// __msan_metadata_ptr_for_store_X(ptr)
/// functions. The corresponding functions check that the X-byte accesses
/// are possible and return the pointers to shadow and origin memory.
/// Arbitrary sized accesses are handled with:
/// __msan_metadata_ptr_for_load_n(ptr, size)
/// __msan_metadata_ptr_for_store_n(ptr, size);
/// Note that the sanitizer code has to deal with how shadow/origin pairs
/// returned by these functions are represented in different ABIs. In
/// the X86_64 ABI they are returned in RDX:RAX, in PowerPC64 they are
/// returned in r3 and r4, and in the SystemZ ABI they are written to memory
/// pointed to by a hidden parameter.
/// - TLS variables are stored in a single per-task struct. A call to a
/// function __msan_get_context_state() returning a pointer to that struct
/// is inserted into every instrumented function before the entry block;
/// - __msan_warning() takes a 32-bit origin parameter;
/// - local variables are poisoned with __msan_poison_alloca() upon function
/// entry and unpoisoned with __msan_unpoison_alloca() before leaving the
/// function;
/// - the pass doesn't declare any global variables or add global constructors
/// to the translation unit.
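///
/// For example, an instrumented 8-byte aligned store is preceded by a call
/// like
///   __msan_metadata_ptr_for_store_8(ptr)
/// whose result (a shadow/origin pointer pair) is then used for the regular
/// shadow and origin stores.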
///
/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
/// calls, making sure we're on the safe side wrt. possible false positives.
///
/// KernelMemorySanitizer only supports X86_64, SystemZ and PowerPC64 at the
/// moment.
///
//
// FIXME: This sanitizer does not yet handle scalable vectors
//
//===----------------------------------------------------------------------===//
#include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DepthFirstIterator.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Argument.h"
#include "llvm/IR/AttributeMask.h"
#include "llvm/IR/Attributes.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/InstVisitor.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/IntrinsicsAArch64.h"
#include "llvm/IR/IntrinsicsX86.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Value.h"
#include "llvm/IR/ValueMap.h"
#include "llvm/Support/Alignment.h"
#include "llvm/Support/AtomicOrdering.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/DebugCounter.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/TargetParser/Triple.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Instrumentation.h"
#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <numeric>
#include <string>
#include <tuple>
using namespace llvm;
#define DEBUG_TYPE "msan"
DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
"Controls which checks to insert");
DEBUG_COUNTER(DebugInstrumentInstruction, "msan-instrument-instruction",
"Controls which instruction to instrument");
static const unsigned kOriginSize = 4;
static const Align kMinOriginAlignment = Align(4);
static const Align kShadowTLSAlignment = Align(8);
// These constants must be kept in sync with the ones in msan.h.
static const unsigned kParamTLSSize = 800;
static const unsigned kRetvalTLSSize = 800;
// Access sizes are powers of two: 1, 2, 4, 8.
static const size_t kNumberOfAccessSizes = 4;
/// Track origins of uninitialized values.
///
/// Adds a section to MemorySanitizer report that points to the allocation
/// (stack or heap) the uninitialized bits came from originally.
static cl::opt<int> ClTrackOrigins(
"msan-track-origins",
cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
cl::init(0));
static cl::opt<bool> ClKeepGoing("msan-keep-going",
cl::desc("keep going after reporting a UMR"),
cl::Hidden, cl::init(false));
static cl::opt<bool>
ClPoisonStack("msan-poison-stack",
cl::desc("poison uninitialized stack variables"), cl::Hidden,
cl::init(true));
static cl::opt<bool> ClPoisonStackWithCall(
"msan-poison-stack-with-call",
cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
cl::init(false));
static cl::opt<int> ClPoisonStackPattern(
"msan-poison-stack-pattern",
cl::desc("poison uninitialized stack variables with the given pattern"),
cl::Hidden, cl::init(0xff));
static cl::opt<bool>
ClPrintStackNames("msan-print-stack-names",
cl::desc("Print name of local stack variable"),
cl::Hidden, cl::init(true));
static cl::opt<bool>
ClPoisonUndef("msan-poison-undef",
cl::desc("Poison fully undef temporary values. "
"Partially undefined constant vectors "
"are unaffected by this flag (see "
"-msan-poison-undef-vectors)."),
cl::Hidden, cl::init(true));
static cl::opt<bool> ClPoisonUndefVectors(
"msan-poison-undef-vectors",
cl::desc("Precisely poison partially undefined constant vectors. "
"If false (legacy behavior), the entire vector is "
"considered fully initialized, which may lead to false "
"negatives. Fully undefined constant vectors are "
"unaffected by this flag (see -msan-poison-undef)."),
cl::Hidden, cl::init(false));
static cl::opt<bool> ClPreciseDisjointOr(
"msan-precise-disjoint-or",
cl::desc("Precisely poison disjoint OR. If false (legacy behavior), "
"disjointedness is ignored (i.e., 1|1 is initialized)."),
cl::Hidden, cl::init(false));
static cl::opt<bool>
ClHandleICmp("msan-handle-icmp",
cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
cl::Hidden, cl::init(true));
static cl::opt<bool>
ClHandleICmpExact("msan-handle-icmp-exact",
cl::desc("exact handling of relational integer ICmp"),
cl::Hidden, cl::init(true));
static cl::opt<bool> ClHandleLifetimeIntrinsics(
"msan-handle-lifetime-intrinsics",
cl::desc(
"when possible, poison scoped variables at the beginning of the scope "
"(slower, but more precise)"),
cl::Hidden, cl::init(true));
// When compiling the Linux kernel, we sometimes see false positives related to
// MSan being unable to understand that inline assembly calls may initialize
// local variables.
// This flag makes the compiler conservatively unpoison every memory location
// passed into an assembly call. Note that this may cause false negatives.
// Because it's impossible to figure out the array sizes, we can only unpoison
// the first sizeof(type) bytes for each type* pointer.
static cl::opt<bool> ClHandleAsmConservative(
"msan-handle-asm-conservative",
cl::desc("conservative handling of inline assembly"), cl::Hidden,
cl::init(true));
// This flag controls whether we check the shadow of the address
// operand of load or store. Such bugs are very rare, since a load from
// a garbage address typically results in SEGV, but they still happen
// (e.g. when only the lower bits of the address are garbage, or when the
// access happens early at program startup, where malloc-ed memory is more
// likely to be zeroed). As of 2012-08-28 this flag adds 20% slowdown.
static cl::opt<bool> ClCheckAccessAddress(
"msan-check-access-address",
cl::desc("report accesses through a pointer which has poisoned shadow"),
cl::Hidden, cl::init(true));
static cl::opt<bool> ClEagerChecks(
"msan-eager-checks",
cl::desc("check arguments and return values at function call boundaries"),
cl::Hidden, cl::init(false));
static cl::opt<bool> ClDumpStrictInstructions(
"msan-dump-strict-instructions",
cl::desc("print out instructions with default strict semantics, i.e., "
"check that all the inputs are fully initialized, and mark "
"the output as fully initialized. These semantics are applied "
"to instructions that could not be handled explicitly nor "
"heuristically."),
cl::Hidden, cl::init(false));
// Currently, all the heuristically handled instructions are specifically
// IntrinsicInst. However, we use the broader "HeuristicInstructions" name
// to parallel 'msan-dump-strict-instructions', and to keep the door open to
// handling non-intrinsic instructions heuristically.
static cl::opt<bool> ClDumpHeuristicInstructions(
"msan-dump-heuristic-instructions",
cl::desc("Prints 'unknown' instructions that were handled heuristically. "
"Use -msan-dump-strict-instructions to print instructions that "
"could not be handled explicitly nor heuristically."),
cl::Hidden, cl::init(false));
static cl::opt<int> ClInstrumentationWithCallThreshold(
"msan-instrumentation-with-call-threshold",
cl::desc(
"If the function being instrumented requires more than "
"this number of checks and origin stores, use callbacks instead of "
"inline checks (-1 means never use callbacks)."),
cl::Hidden, cl::init(3500));
static cl::opt<bool>
ClEnableKmsan("msan-kernel",
cl::desc("Enable KernelMemorySanitizer instrumentation"),
cl::Hidden, cl::init(false));
static cl::opt<bool>
ClDisableChecks("msan-disable-checks",
cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
cl::init(false));
static cl::opt<bool>
ClCheckConstantShadow("msan-check-constant-shadow",
cl::desc("Insert checks for constant shadow values"),
cl::Hidden, cl::init(true));
// This is off by default because of a bug in gold:
// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
static cl::opt<bool>
ClWithComdat("msan-with-comdat",
cl::desc("Place MSan constructors in comdat sections"),
cl::Hidden, cl::init(false));
// These options allow specifying custom memory map parameters.
// See MemoryMapParams for details.
static cl::opt<uint64_t> ClAndMask("msan-and-mask",
cl::desc("Define custom MSan AndMask"),
cl::Hidden, cl::init(0));
static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
cl::desc("Define custom MSan XorMask"),
cl::Hidden, cl::init(0));
static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
cl::desc("Define custom MSan ShadowBase"),
cl::Hidden, cl::init(0));
static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
cl::desc("Define custom MSan OriginBase"),
cl::Hidden, cl::init(0));
static cl::opt<int>
ClDisambiguateWarning("msan-disambiguate-warning-threshold",
cl::desc("Define threshold for number of checks per "
"debug location to force origin update."),
cl::Hidden, cl::init(3));
const char kMsanModuleCtorName[] = "msan.module_ctor";
const char kMsanInitName[] = "__msan_init";
namespace {
// Memory map parameters used in application-to-shadow address calculation.
// Offset = (Addr & ~AndMask) ^ XorMask
// Shadow = ShadowBase + Offset
// Origin = OriginBase + Offset
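//
// For example, with the x86_64 Linux parameters defined below
// (AndMask = 0, XorMask = 0x500000000000, ShadowBase = 0,
// OriginBase = 0x100000000000) and Addr = 0x700000001000:
//   Offset = (Addr & ~0) ^ 0x500000000000 = 0x200000001000
//   Shadow = 0x200000001000
//   Origin = 0x300000001000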
struct MemoryMapParams {
uint64_t AndMask;
uint64_t XorMask;
uint64_t ShadowBase;
uint64_t OriginBase;
};
struct PlatformMemoryMapParams {
const MemoryMapParams *bits32;
const MemoryMapParams *bits64;
};
} // end anonymous namespace
// i386 Linux
static const MemoryMapParams Linux_I386_MemoryMapParams = {
0x000080000000, // AndMask
0, // XorMask (not used)
0, // ShadowBase (not used)
0x000040000000, // OriginBase
};
// x86_64 Linux
static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
0, // AndMask (not used)
0x500000000000, // XorMask
0, // ShadowBase (not used)
0x100000000000, // OriginBase
};
// mips32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants
// mips64 Linux
static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
0, // AndMask (not used)
0x008000000000, // XorMask
0, // ShadowBase (not used)
0x002000000000, // OriginBase
};
// ppc32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants
// ppc64 Linux
static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
0xE00000000000, // AndMask
0x100000000000, // XorMask
0x080000000000, // ShadowBase
0x1C0000000000, // OriginBase
};
// s390x Linux
static const MemoryMapParams Linux_S390X_MemoryMapParams = {
0xC00000000000, // AndMask
0, // XorMask (not used)
0x080000000000, // ShadowBase
0x1C0000000000, // OriginBase
};
// arm32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants
// aarch64 Linux
static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
0, // AndMask (not used)
0x0B00000000000, // XorMask
0, // ShadowBase (not used)
0x0200000000000, // OriginBase
};
// loongarch64 Linux
static const MemoryMapParams Linux_LoongArch64_MemoryMapParams = {
0, // AndMask (not used)
0x500000000000, // XorMask
0, // ShadowBase (not used)
0x100000000000, // OriginBase
};
// riscv32 Linux
// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
// after picking good constants
// aarch64 FreeBSD
static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
0x1800000000000, // AndMask
0x0400000000000, // XorMask
0x0200000000000, // ShadowBase
0x0700000000000, // OriginBase
};
// i386 FreeBSD
static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
0x000180000000, // AndMask
0x000040000000, // XorMask
0x000020000000, // ShadowBase
0x000700000000, // OriginBase
};
// x86_64 FreeBSD
static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
0xc00000000000, // AndMask
0x200000000000, // XorMask
0x100000000000, // ShadowBase
0x380000000000, // OriginBase
};
// x86_64 NetBSD
static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
0, // AndMask
0x500000000000, // XorMask
0, // ShadowBase
0x100000000000, // OriginBase
};
static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
&Linux_I386_MemoryMapParams,
&Linux_X86_64_MemoryMapParams,
};
static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
nullptr,
&Linux_MIPS64_MemoryMapParams,
};
static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
nullptr,
&Linux_PowerPC64_MemoryMapParams,
};
static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
nullptr,
&Linux_S390X_MemoryMapParams,
};
static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
nullptr,
&Linux_AArch64_MemoryMapParams,
};
static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams = {
nullptr,
&Linux_LoongArch64_MemoryMapParams,
};
static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
nullptr,
&FreeBSD_AArch64_MemoryMapParams,
};
static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
&FreeBSD_I386_MemoryMapParams,
&FreeBSD_X86_64_MemoryMapParams,
};
static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
nullptr,
&NetBSD_X86_64_MemoryMapParams,
};
namespace {
/// Instrument functions of a module to detect uninitialized reads.
///
/// Instantiating MemorySanitizer inserts the msan runtime library API function
/// declarations into the module if they don't exist already. Instantiating
/// ensures the __msan_init function is in the list of global constructors for
/// the module.
class MemorySanitizer {
public:
MemorySanitizer(Module &M, MemorySanitizerOptions Options)
: CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
initializeModule(M);
}
// MSan cannot be moved or copied because of MapParams.
MemorySanitizer(MemorySanitizer &&) = delete;
MemorySanitizer &operator=(MemorySanitizer &&) = delete;
MemorySanitizer(const MemorySanitizer &) = delete;
MemorySanitizer &operator=(const MemorySanitizer &) = delete;
bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);
private:
friend struct MemorySanitizerVisitor;
friend struct VarArgHelperBase;
friend struct VarArgAMD64Helper;
friend struct VarArgAArch64Helper;
friend struct VarArgPowerPC64Helper;
friend struct VarArgPowerPC32Helper;
friend struct VarArgSystemZHelper;
friend struct VarArgI386Helper;
friend struct VarArgGenericHelper;
void initializeModule(Module &M);
void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);
template <typename... ArgsTy>
FunctionCallee getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
ArgsTy... Args);
/// True if we're compiling the Linux kernel.
bool CompileKernel;
/// Track origins (allocation points) of uninitialized values.
int TrackOrigins;
bool Recover;
bool EagerChecks;
Triple TargetTriple;
LLVMContext *C;
Type *IntptrTy; ///< Integer type with the size of a ptr in default AS.
Type *OriginTy;
PointerType *PtrTy; ///< Pointer type in the default address space.
// XxxTLS variables represent the per-thread state in MSan and per-task state
// in KMSAN.
// For the userspace these point to thread-local globals. In the kernel land
// they point to the members of a per-task struct obtained via a call to
// __msan_get_context_state().
/// Thread-local shadow storage for function parameters.
Value *ParamTLS;
/// Thread-local origin storage for function parameters.
Value *ParamOriginTLS;
/// Thread-local shadow storage for function return value.
Value *RetvalTLS;
/// Thread-local origin storage for function return value.
Value *RetvalOriginTLS;
/// Thread-local shadow storage for in-register va_arg function arguments.
Value *VAArgTLS;
/// Thread-local origin storage for in-register va_arg function arguments.
Value *VAArgOriginTLS;
/// Thread-local storage for the size of the va_arg overflow area.
Value *VAArgOverflowSizeTLS;
/// Are the instrumentation callbacks set up?
bool CallbacksInitialized = false;
/// The run-time callback to print a warning.
FunctionCallee WarningFn;
// These arrays are indexed by log2(AccessSize).
FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
FunctionCallee MaybeWarningVarSizeFn;
FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];
/// Run-time helper that generates a new origin value for a stack
/// allocation.
FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
// No description version
FunctionCallee MsanSetAllocaOriginNoDescriptionFn;
/// Run-time helper that poisons stack on function entry.
FunctionCallee MsanPoisonStackFn;
/// Run-time helper that records a store (or any event) of an
/// uninitialized value and returns an updated origin id encoding this info.
FunctionCallee MsanChainOriginFn;
/// Run-time helper that paints an origin over a region.
FunctionCallee MsanSetOriginFn;
/// MSan runtime replacements for memmove, memcpy and memset.
FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
/// Type of the per-task context state used by KMSAN.
StructType *MsanContextStateTy;
/// KMSAN callback returning a pointer to the per-task context state.
FunctionCallee MsanGetContextStateFn;
/// Functions for poisoning/unpoisoning local variables
FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;
/// Pair of shadow/origin pointers.
Type *MsanMetadata;
/// Each of the MsanMetadataPtrXxx functions returns a MsanMetadata.
FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
FunctionCallee MsanMetadataPtrForLoad_1_8[4];
FunctionCallee MsanMetadataPtrForStore_1_8[4];
FunctionCallee MsanInstrumentAsmStoreFn;
/// Storage for return values of the MsanMetadataPtrXxx functions.
Value *MsanMetadataAlloca;
/// Helper to choose between different MsanMetadataPtrXxx().
FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);
/// Memory map parameters used in application-to-shadow calculation.
const MemoryMapParams *MapParams;
/// Custom memory map parameters used when -msan-shadow-base or
/// -msan-origin-base is provided.
MemoryMapParams CustomMapParams;
MDNode *ColdCallWeights;
/// Branch weights for origin store.
MDNode *OriginStoreWeights;
};
void insertModuleCtor(Module &M) {
getOrCreateSanitizerCtorAndInitFunctions(
M, kMsanModuleCtorName, kMsanInitName,
/*InitArgTypes=*/{},
/*InitArgs=*/{},
// This callback is invoked when the functions are created the first
// time. Hook them into the global ctors list in that case:
[&](Function *Ctor, FunctionCallee) {
if (!ClWithComdat) {
appendToGlobalCtors(M, Ctor, 0);
return;
}
Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
Ctor->setComdat(MsanCtorComdat);
appendToGlobalCtors(M, Ctor, 0, Ctor);
});
}
template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
return (Opt.getNumOccurrences() > 0) ? Opt : Default;
}
} // end anonymous namespace
MemorySanitizerOptions::MemorySanitizerOptions(int TO, bool R, bool K,
bool EagerChecks)
: Kernel(getOptOrDefault(ClEnableKmsan, K)),
TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
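// For example, a plain "-mllvm -msan-kernel" (with no other overrides)
// yields Kernel = true, TrackOrigins = 2 and Recover = true, matching the
// KMSAN defaults described in the file comment above.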
PreservedAnalyses MemorySanitizerPass::run(Module &M,
ModuleAnalysisManager &AM) {
// Return early if nosanitize_memory module flag is present for the module.
if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
return PreservedAnalyses::all();
bool Modified = false;
if (!Options.Kernel) {
insertModuleCtor(M);
Modified = true;
}
auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
for (Function &F : M) {
if (F.empty())
continue;
MemorySanitizer Msan(*F.getParent(), Options);
Modified |=
Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
}
if (!Modified)
return PreservedAnalyses::all();
PreservedAnalyses PA = PreservedAnalyses::none();
// GlobalsAA is considered stateless and does not get invalidated unless
// explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
// make changes that require GlobalsAA to be invalidated.
PA.abandon<GlobalsAA>();
return PA;
}
void MemorySanitizerPass::printPipeline(
raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
static_cast<PassInfoMixin<MemorySanitizerPass> *>(this)->printPipeline(
OS, MapClassName2PassName);
OS << '<';
if (Options.Recover)
OS << "recover;";
if (Options.Kernel)
OS << "kernel;";
if (Options.EagerChecks)
OS << "eager-checks;";
OS << "track-origins=" << Options.TrackOrigins;
OS << '>';
}
/// Create a private, constant global initialized with the given string.
///
/// Creates a global for Str so that we can pass it to the run-time library.
static GlobalVariable *createPrivateConstGlobalForString(Module &M,
StringRef Str) {
Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
GlobalValue::PrivateLinkage, StrConst, "");
}
template <typename... ArgsTy>
FunctionCallee
MemorySanitizer::getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
ArgsTy... Args) {
if (TargetTriple.getArch() == Triple::systemz) {
// SystemZ ABI: shadow/origin pair is returned via a hidden parameter.
return M.getOrInsertFunction(Name, Type::getVoidTy(*C), PtrTy,
std::forward<ArgsTy>(Args)...);
}
return M.getOrInsertFunction(Name, MsanMetadata,
std::forward<ArgsTy>(Args)...);
}
/// Create KMSAN API callbacks.
void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
IRBuilder<> IRB(*C);
// These will be initialized in insertKmsanPrologue().
RetvalTLS = nullptr;
RetvalOriginTLS = nullptr;
ParamTLS = nullptr;
ParamOriginTLS = nullptr;
VAArgTLS = nullptr;
VAArgOriginTLS = nullptr;
VAArgOverflowSizeTLS = nullptr;
WarningFn = M.getOrInsertFunction("__msan_warning",
TLI.getAttrList(C, {0}, /*Signed=*/false),
IRB.getVoidTy(), IRB.getInt32Ty());
// Requests the per-task context state (kmsan_context_state*) from the
// runtime library.
MsanContextStateTy = StructType::get(
ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
OriginTy);
MsanGetContextStateFn =
M.getOrInsertFunction("__msan_get_context_state", PtrTy);
MsanMetadata = StructType::get(PtrTy, PtrTy);
for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
std::string name_load =
"__msan_metadata_ptr_for_load_" + std::to_string(size);
std::string name_store =
"__msan_metadata_ptr_for_store_" + std::to_string(size);
MsanMetadataPtrForLoad_1_8[ind] =
getOrInsertMsanMetadataFunction(M, name_load, PtrTy);
MsanMetadataPtrForStore_1_8[ind] =
getOrInsertMsanMetadataFunction(M, name_store, PtrTy);
}
MsanMetadataPtrForLoadN = getOrInsertMsanMetadataFunction(
M, "__msan_metadata_ptr_for_load_n", PtrTy, IntptrTy);
MsanMetadataPtrForStoreN = getOrInsertMsanMetadataFunction(
M, "__msan_metadata_ptr_for_store_n", PtrTy, IntptrTy);
// Functions for poisoning and unpoisoning memory.
MsanPoisonAllocaFn = M.getOrInsertFunction(
"__msan_poison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
MsanUnpoisonAllocaFn = M.getOrInsertFunction(
"__msan_unpoison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy);
}
static Constant *getOrInsertGlobal(Module &M, StringRef Name, Type *Ty) {
return M.getOrInsertGlobal(Name, Ty, [&] {
return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
nullptr, Name, nullptr,
GlobalVariable::InitialExecTLSModel);
});
}
/// Insert declarations for userspace-specific functions and globals.
void MemorySanitizer::createUserspaceApi(Module &M,
const TargetLibraryInfo &TLI) {
IRBuilder<> IRB(*C);
// Create the callback.
// FIXME: this function should have "Cold" calling conv,
// which is not yet implemented.
if (TrackOrigins) {
StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
: "__msan_warning_with_origin_noreturn";
WarningFn = M.getOrInsertFunction(WarningFnName,
TLI.getAttrList(C, {0}, /*Signed=*/false),
IRB.getVoidTy(), IRB.getInt32Ty());
} else {
StringRef WarningFnName =
Recover ? "__msan_warning" : "__msan_warning_noreturn";
WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
}
// Create the global TLS variables.
RetvalTLS =
getOrInsertGlobal(M, "__msan_retval_tls",
ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));
RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);
ParamTLS =
getOrInsertGlobal(M, "__msan_param_tls",
ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
ParamOriginTLS =
getOrInsertGlobal(M, "__msan_param_origin_tls",
ArrayType::get(OriginTy, kParamTLSSize / 4));
VAArgTLS =
getOrInsertGlobal(M, "__msan_va_arg_tls",
ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
VAArgOriginTLS =
getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
ArrayType::get(OriginTy, kParamTLSSize / 4));
VAArgOverflowSizeTLS = getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls",
IRB.getIntPtrTy(M.getDataLayout()));
for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
AccessSizeIndex++) {
unsigned AccessSize = 1 << AccessSizeIndex;
std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
MaybeWarningVarSizeFn = M.getOrInsertFunction(
"__msan_maybe_warning_N", TLI.getAttrList(C, {}, /*Signed=*/false),
IRB.getVoidTy(), PtrTy, IRB.getInt64Ty(), IRB.getInt32Ty());
FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), PtrTy,
IRB.getInt32Ty());
}
MsanSetAllocaOriginWithDescriptionFn =
M.getOrInsertFunction("__msan_set_alloca_origin_with_descr",
IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy, PtrTy);
MsanSetAllocaOriginNoDescriptionFn =
M.getOrInsertFunction("__msan_set_alloca_origin_no_descr",
IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
MsanPoisonStackFn = M.getOrInsertFunction("__msan_poison_stack",
IRB.getVoidTy(), PtrTy, IntptrTy);
}
/// Insert extern declaration of runtime-provided functions and globals.
void MemorySanitizer::initializeCallbacks(Module &M,
const TargetLibraryInfo &TLI) {
// Only do this once.
if (CallbacksInitialized)
return;
IRBuilder<> IRB(*C);
// Initialize callbacks that are common for kernel and userspace
// instrumentation.
MsanChainOriginFn = M.getOrInsertFunction(
"__msan_chain_origin",
TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
IRB.getInt32Ty());
MsanSetOriginFn = M.getOrInsertFunction(
"__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
IRB.getVoidTy(), PtrTy, IntptrTy, IRB.getInt32Ty());
MemmoveFn =
M.getOrInsertFunction("__msan_memmove", PtrTy, PtrTy, PtrTy, IntptrTy);
MemcpyFn =
M.getOrInsertFunction("__msan_memcpy", PtrTy, PtrTy, PtrTy, IntptrTy);
MemsetFn = M.getOrInsertFunction("__msan_memset",
TLI.getAttrList(C, {1}, /*Signed=*/true),
PtrTy, PtrTy, IRB.getInt32Ty(), IntptrTy);
MsanInstrumentAsmStoreFn = M.getOrInsertFunction(
"__msan_instrument_asm_store", IRB.getVoidTy(), PtrTy, IntptrTy);
if (CompileKernel) {
createKernelApi(M, TLI);
} else {
createUserspaceApi(M, TLI);
}
CallbacksInitialized = true;
}
FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
int size) {
FunctionCallee *Fns =
isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
switch (size) {
case 1:
return Fns[0];
case 2:
return Fns[1];
case 4:
return Fns[2];
case 8:
return Fns[3];
default:
return nullptr;
}
}
/// Module-level initialization.
///
/// Inserts a call to __msan_init into the module's constructor list.
void MemorySanitizer::initializeModule(Module &M) {
auto &DL = M.getDataLayout();
TargetTriple = M.getTargetTriple();
bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
// Check the overrides first
if (ShadowPassed || OriginPassed) {
CustomMapParams.AndMask = ClAndMask;
CustomMapParams.XorMask = ClXorMask;
CustomMapParams.ShadowBase = ClShadowBase;
CustomMapParams.OriginBase = ClOriginBase;
MapParams = &CustomMapParams;
} else {
switch (TargetTriple.getOS()) {
case Triple::FreeBSD:
switch (TargetTriple.getArch()) {
case Triple::aarch64:
MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
break;
case Triple::x86_64:
MapParams = FreeBSD_X86_MemoryMapParams.bits64;
break;
case Triple::x86:
MapParams = FreeBSD_X86_MemoryMapParams.bits32;
break;
default:
report_fatal_error("unsupported architecture");
}
break;
case Triple::NetBSD:
switch (TargetTriple.getArch()) {
case Triple::x86_64:
MapParams = NetBSD_X86_MemoryMapParams.bits64;
break;
default:
report_fatal_error("unsupported architecture");
}
break;
case Triple::Linux:
switch (TargetTriple.getArch()) {
case Triple::x86_64:
MapParams = Linux_X86_MemoryMapParams.bits64;
break;
case Triple::x86:
MapParams = Linux_X86_MemoryMapParams.bits32;
break;
case Triple::mips64:
case Triple::mips64el:
MapParams = Linux_MIPS_MemoryMapParams.bits64;
break;
case Triple::ppc64:
case Triple::ppc64le:
MapParams = Linux_PowerPC_MemoryMapParams.bits64;
break;
case Triple::systemz:
MapParams = Linux_S390_MemoryMapParams.bits64;
break;
case Triple::aarch64:
case Triple::aarch64_be:
MapParams = Linux_ARM_MemoryMapParams.bits64;
break;
case Triple::loongarch64:
MapParams = Linux_LoongArch_MemoryMapParams.bits64;
break;
default:
report_fatal_error("unsupported architecture");
}
break;
default:
report_fatal_error("unsupported operating system");
}
}
C = &(M.getContext());
IRBuilder<> IRB(*C);
IntptrTy = IRB.getIntPtrTy(DL);
OriginTy = IRB.getInt32Ty();
PtrTy = IRB.getPtrTy();
ColdCallWeights = MDBuilder(*C).createUnlikelyBranchWeights();
OriginStoreWeights = MDBuilder(*C).createUnlikelyBranchWeights();
if (!CompileKernel) {
if (TrackOrigins)
M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
return new GlobalVariable(
M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
IRB.getInt32(TrackOrigins), "__msan_track_origins");
});
if (Recover)
M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
return new GlobalVariable(M, IRB.getInt32Ty(), true,
GlobalValue::WeakODRLinkage,
IRB.getInt32(Recover), "__msan_keep_going");
});
}
}
namespace {
/// A helper class that handles instrumentation of VarArg
/// functions on a particular platform.
///
/// Implementations are expected to insert the instrumentation
/// necessary to propagate argument shadow through VarArg function
/// calls. Visit* methods are called during an InstVisitor pass over
/// the function, and should avoid creating new basic blocks. A new
/// instance of this class is created for each instrumented function.
struct VarArgHelper {
virtual ~VarArgHelper() = default;
/// Visit a CallBase.
virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;
/// Visit a va_start call.
virtual void visitVAStartInst(VAStartInst &I) = 0;
/// Visit a va_copy call.
virtual void visitVACopyInst(VACopyInst &I) = 0;
/// Finalize function instrumentation.
///
/// This method is called after visiting all interesting (see above)
/// instructions in a function.
virtual void finalizeInstrumentation() = 0;
};
struct MemorySanitizerVisitor;
} // end anonymous namespace
static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
MemorySanitizerVisitor &Visitor);
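// Maps an access size to an index into the size-indexed callback arrays
// (MaybeWarningFn, MaybeStoreOriginFn): 1-, 2-, 4- and 8-byte accesses map to
// indices 0..3, while wider or scalable sizes map to kNumberOfAccessSizes,
// which callers treat as "take the slow path".
// For example: i8 -> 0, i16 -> 1, i32 -> 2, i64 -> 3, i128 -> 4.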
static unsigned TypeSizeToSizeIndex(TypeSize TS) {
if (TS.isScalable())
// Scalable types unconditionally take slowpaths.
return kNumberOfAccessSizes;
unsigned TypeSizeFixed = TS.getFixedValue();
if (TypeSizeFixed <= 8)
return 0;
return Log2_32_Ceil((TypeSizeFixed + 7) / 8);
}
namespace {
/// Helper class to attach debug information of the given instruction onto new
/// instructions inserted after.
class NextNodeIRBuilder : public IRBuilder<> {
public:
explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
SetCurrentDebugLocation(IP->getDebugLoc());
}
};
/// This class does all the work for a given function. Store and Load
/// instructions store and load corresponding shadow and origin
/// values. Most instructions propagate shadow from arguments to their
/// return values. Certain instructions (most importantly, BranchInst)
/// test their argument shadow and print reports (with a runtime call) if it's
/// non-zero.
struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
Function &F;
MemorySanitizer &MS;
SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
ValueMap<Value *, Value *> ShadowMap, OriginMap;
std::unique_ptr<VarArgHelper> VAHelper;
const TargetLibraryInfo *TLI;
Instruction *FnPrologueEnd;
SmallVector<Instruction *, 16> Instructions;
// The following flags disable parts of MSan instrumentation based on
// exclusion list contents and command-line options.
bool InsertChecks;
bool PropagateShadow;
bool PoisonStack;
bool PoisonUndef;
bool PoisonUndefVectors;
struct ShadowOriginAndInsertPoint {
Value *Shadow;
Value *Origin;
Instruction *OrigIns;
ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
: Shadow(S), Origin(O), OrigIns(I) {}
};
SmallVector<ShadowOriginAndInsertPoint, 16> InstrumentationList;
DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
SmallSetVector<AllocaInst *, 16> AllocaSet;
SmallVector<std::pair<IntrinsicInst *, AllocaInst *>, 16> LifetimeStartList;
SmallVector<StoreInst *, 16> StoreList;
int64_t SplittableBlocksCount = 0;
MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
const TargetLibraryInfo &TLI)
: F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
bool SanitizeFunction =
F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
InsertChecks = SanitizeFunction;
PropagateShadow = SanitizeFunction;
PoisonStack = SanitizeFunction && ClPoisonStack;
PoisonUndef = SanitizeFunction && ClPoisonUndef;
PoisonUndefVectors = SanitizeFunction && ClPoisonUndefVectors;
// In the presence of unreachable blocks, we may see Phi nodes with
// incoming nodes from such blocks. Since InstVisitor skips unreachable
// blocks, such nodes will not have any shadow value associated with them.
// It's easier to remove unreachable blocks than deal with missing shadow.
removeUnreachableBlocks(F);
MS.initializeCallbacks(*F.getParent(), TLI);
FnPrologueEnd =
IRBuilder<>(&F.getEntryBlock(), F.getEntryBlock().getFirstNonPHIIt())
.CreateIntrinsic(Intrinsic::donothing, {});
if (MS.CompileKernel) {
IRBuilder<> IRB(FnPrologueEnd);
insertKmsanPrologue(IRB);
}
LLVM_DEBUG(if (!InsertChecks) dbgs()
<< "MemorySanitizer is not inserting checks into '"
<< F.getName() << "'\n");
}
bool instrumentWithCalls(Value *V) {
// Constants likely will be eliminated by follow-up passes.
if (isa<Constant>(V))
return false;
++SplittableBlocksCount;
return ClInstrumentationWithCallThreshold >= 0 &&
SplittableBlocksCount > ClInstrumentationWithCallThreshold;
}
bool isInPrologue(Instruction &I) {
return I.getParent() == FnPrologueEnd->getParent() &&
(&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
}
// Creates a new origin and records the stack trace. In general we can call
// this function for any origin manipulation we like; however, it costs
// runtime resources, so use it only where it provides additional
// information helpful to the user.
Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
if (MS.TrackOrigins <= 1)
return V;
return IRB.CreateCall(MS.MsanChainOriginFn, V);
}
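// Widens a 32-bit origin to an intptr-sized value by replicating it, so that
// a single intptr-sized store paints several origin slots at once. For
// example, on a 64-bit target an origin of 0xABCD1234 becomes
// 0xABCD1234ABCD1234, i.e. Origin | (Origin << 32).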
Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
if (IntptrSize == kOriginSize)
return Origin;
assert(IntptrSize == kOriginSize * 2);
Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
}
/// Fill memory range with the given origin value.
void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
TypeSize TS, Align Alignment) {
const DataLayout &DL = F.getDataLayout();
const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
assert(IntptrAlignment >= kMinOriginAlignment);
assert(IntptrSize >= kOriginSize);
// Note: The loop-based form also works for fixed-length vectors; however,
// we prefer the unrolled, alignment-specialized code below.
if (TS.isScalable()) {
Value *Size = IRB.CreateTypeSize(MS.IntptrTy, TS);
Value *RoundUp =
IRB.CreateAdd(Size, ConstantInt::get(MS.IntptrTy, kOriginSize - 1));
Value *End =
IRB.CreateUDiv(RoundUp, ConstantInt::get(MS.IntptrTy, kOriginSize));
auto [InsertPt, Index] =
SplitBlockAndInsertSimpleForLoop(End, IRB.GetInsertPoint());
IRB.SetInsertPoint(InsertPt);
Value *GEP = IRB.CreateGEP(MS.OriginTy, OriginPtr, Index);
IRB.CreateAlignedStore(Origin, GEP, kMinOriginAlignment);
return;
}
unsigned Size = TS.getFixedValue();
unsigned Ofs = 0;
Align CurrentAlignment = Alignment;
if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
Value *IntptrOrigin = originToIntptr(IRB, Origin);
Value *IntptrOriginPtr = IRB.CreatePointerCast(OriginPtr, MS.PtrTy);
for (unsigned i = 0; i < Size / IntptrSize; ++i) {
Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
: IntptrOriginPtr;
IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
Ofs += IntptrSize / kOriginSize;
CurrentAlignment = IntptrAlignment;
}
}
for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
Value *GEP =
i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
CurrentAlignment = kMinOriginAlignment;
}
}
void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
Value *OriginPtr, Align Alignment) {
const DataLayout &DL = F.getDataLayout();
const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
// ZExt cannot convert between vector and scalar
Value *ConvertedShadow = convertShadowToScalar(Shadow, IRB);
if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
// Origin is not needed: value is initialized or const shadow is
// ignored.
return;
}
if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
// Copy origin as the value is definitely uninitialized.
paintOrigin(IRB, updateOrigin(Origin, IRB), OriginPtr, StoreSize,
OriginAlignment);
return;
}
// Fallback to runtime check, which still can be optimized out later.
}
TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
if (instrumentWithCalls(ConvertedShadow) &&
SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
Value *ConvertedShadow2 =
IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
CallBase *CB = IRB.CreateCall(Fn, {ConvertedShadow2, Addr, Origin});
CB->addParamAttr(0, Attribute::ZExt);
CB->addParamAttr(2, Attribute::ZExt);
} else {
Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
Instruction *CheckTerm = SplitBlockAndInsertIfThen(
Cmp, &*IRB.GetInsertPoint(), false, MS.OriginStoreWeights);
IRBuilder<> IRBNew(CheckTerm);
paintOrigin(IRBNew, updateOrigin(Origin, IRBNew), OriginPtr, StoreSize,
OriginAlignment);
}
}
void materializeStores() {
for (StoreInst *SI : StoreList) {
IRBuilder<> IRB(SI);
Value *Val = SI->getValueOperand();
Value *Addr = SI->getPointerOperand();
Value *Shadow = SI->isAtomic() ? getCleanShadow(Val) : getShadow(Val);
Value *ShadowPtr, *OriginPtr;
Type *ShadowTy = Shadow->getType();
const Align Alignment = SI->getAlign();
const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
std::tie(ShadowPtr, OriginPtr) =
getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
[[maybe_unused]] StoreInst *NewSI =
IRB.CreateAlignedStore(Shadow, ShadowPtr, Alignment);
LLVM_DEBUG(dbgs() << " STORE: " << *NewSI << "\n");
if (SI->isAtomic())
SI->setOrdering(addReleaseOrdering(SI->getOrdering()));
if (MS.TrackOrigins && !SI->isAtomic())
storeOrigin(IRB, Addr, Shadow, getOrigin(Val), OriginPtr,
OriginAlignment);
}
}
// Returns true if Debug Location corresponds to multiple warnings.
bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
if (MS.TrackOrigins < 2)
return false;
if (LazyWarningDebugLocationCount.empty())
for (const auto &I : InstrumentationList)
++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
}
/// Helper function to insert a warning at IRB's current insert point.
void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
if (!Origin)
Origin = (Value *)IRB.getInt32(0);
assert(Origin->getType()->isIntegerTy());
if (shouldDisambiguateWarningLocation(IRB.getCurrentDebugLocation())) {
// Try to create additional origin with debug info of the last origin
// instruction. It may provide additional information to the user.
if (Instruction *OI = dyn_cast_or_null<Instruction>(Origin)) {
assert(MS.TrackOrigins);
auto NewDebugLoc = OI->getDebugLoc();
// An origin update with a missing or identical debug location provides no
// additional value.
if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
// Insert the update just before the check, so the runtime is called only
// immediately before the report.
IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
Origin = updateOrigin(Origin, IRBOrigin);
}
}
}
if (MS.CompileKernel || MS.TrackOrigins)
IRB.CreateCall(MS.WarningFn, Origin)->setCannotMerge();
else
IRB.CreateCall(MS.WarningFn)->setCannotMerge();
// FIXME: Insert UnreachableInst if !MS.Recover?
// This may invalidate some of the following checks and needs to be done
// at the very end.
}
void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
Value *Origin) {
const DataLayout &DL = F.getDataLayout();
TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
if (instrumentWithCalls(ConvertedShadow) && !MS.CompileKernel) {
// ZExt cannot convert between vector and scalar
ConvertedShadow = convertShadowToScalar(ConvertedShadow, IRB);
Value *ConvertedShadow2 =
IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
if (SizeIndex < kNumberOfAccessSizes) {
FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
CallBase *CB = IRB.CreateCall(
Fn,
{ConvertedShadow2,
MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
CB->addParamAttr(0, Attribute::ZExt);
CB->addParamAttr(1, Attribute::ZExt);
} else {
FunctionCallee Fn = MS.MaybeWarningVarSizeFn;
Value *ShadowAlloca = IRB.CreateAlloca(ConvertedShadow2->getType(), 0u);
IRB.CreateStore(ConvertedShadow2, ShadowAlloca);
unsigned ShadowSize = DL.getTypeAllocSize(ConvertedShadow2->getType());
CallBase *CB = IRB.CreateCall(
Fn,
{ShadowAlloca, ConstantInt::get(IRB.getInt64Ty(), ShadowSize),
MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
CB->addParamAttr(1, Attribute::ZExt);
CB->addParamAttr(2, Attribute::ZExt);
}
} else {
Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
Instruction *CheckTerm = SplitBlockAndInsertIfThen(
Cmp, &*IRB.GetInsertPoint(),
/* Unreachable */ !MS.Recover, MS.ColdCallWeights);
IRB.SetInsertPoint(CheckTerm);
insertWarningFn(IRB, Origin);
LLVM_DEBUG(dbgs() << " CHECK: " << *Cmp << "\n");
}
}
void materializeInstructionChecks(
ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
const DataLayout &DL = F.getDataLayout();
// Disable combining in some cases. TrackOrigins checks each shadow to pick
// correct origin.
bool Combine = !MS.TrackOrigins;
Instruction *Instruction = InstructionChecks.front().OrigIns;
Value *Shadow = nullptr;
for (const auto &ShadowData : InstructionChecks) {
assert(ShadowData.OrigIns == Instruction);
IRBuilder<> IRB(Instruction);
Value *ConvertedShadow = ShadowData.Shadow;
if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
// Skip, value is initialized or const shadow is ignored.
continue;
}
if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
// Report as the value is definitely uninitialized.
insertWarningFn(IRB, ShadowData.Origin);
if (!MS.Recover)
return; // Always fail and stop here; no need to check the rest.
// Skip entire instruction.
continue;
}
// Fallback to runtime check, which still can be optimized out later.
}
if (!Combine) {
materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
continue;
}
if (!Shadow) {
Shadow = ConvertedShadow;
continue;
}
Shadow = convertToBool(Shadow, IRB, "_mscmp");
ConvertedShadow = convertToBool(ConvertedShadow, IRB, "_mscmp");
Shadow = IRB.CreateOr(Shadow, ConvertedShadow, "_msor");
}
if (Shadow) {
assert(Combine);
IRBuilder<> IRB(Instruction);
materializeOneCheck(IRB, Shadow, nullptr);
}
}
void materializeChecks() {
#ifndef NDEBUG
// For assert below.
SmallPtrSet<Instruction *, 16> Done;
#endif
for (auto I = InstrumentationList.begin();
I != InstrumentationList.end();) {
auto OrigIns = I->OrigIns;
// Checks are grouped by the original instruction. We materialize all
// checks registered via `insertShadowCheck` for an instruction at once.
assert(Done.insert(OrigIns).second);
auto J = std::find_if(I + 1, InstrumentationList.end(),
[OrigIns](const ShadowOriginAndInsertPoint &R) {
return OrigIns != R.OrigIns;
});
// Process all checks of instruction at once.
materializeInstructionChecks(ArrayRef<ShadowOriginAndInsertPoint>(I, J));
I = J;
}
LLVM_DEBUG(dbgs() << "DONE:\n" << F);
}
// Inserts the KMSAN prologue: fetches the per-task context state and points
// the TLS members of MS at its fields.
void insertKmsanPrologue(IRBuilder<> &IRB) {
Value *ContextState = IRB.CreateCall(MS.MsanGetContextStateFn, {});
Constant *Zero = IRB.getInt32(0);
MS.ParamTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
{Zero, IRB.getInt32(0)}, "param_shadow");
MS.RetvalTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
{Zero, IRB.getInt32(1)}, "retval_shadow");
MS.VAArgTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
{Zero, IRB.getInt32(2)}, "va_arg_shadow");
MS.VAArgOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
{Zero, IRB.getInt32(3)}, "va_arg_origin");
MS.VAArgOverflowSizeTLS =
IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
{Zero, IRB.getInt32(4)}, "va_arg_overflow_size");
MS.ParamOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
{Zero, IRB.getInt32(5)}, "param_origin");
MS.RetvalOriginTLS =
IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
{Zero, IRB.getInt32(6)}, "retval_origin");
if (MS.TargetTriple.getArch() == Triple::systemz)
MS.MsanMetadataAlloca = IRB.CreateAlloca(MS.MsanMetadata, 0u);
}
/// Add MemorySanitizer instrumentation to a function.
bool runOnFunction() {
// Iterate all BBs in depth-first order and create shadow instructions
// for all instructions (where applicable).
// For PHI nodes we create dummy shadow PHIs which will be finalized later.
for (BasicBlock *BB : depth_first(FnPrologueEnd->getParent()))
visit(*BB);
// `visit` above only collects instructions. Process them after iterating
// over the CFG so that instrumentation, which may transform the CFG, does
// not interfere with the traversal.
for (Instruction *I : Instructions)
InstVisitor<MemorySanitizerVisitor>::visit(*I);
// Finalize PHI nodes.
for (PHINode *PN : ShadowPHINodes) {
PHINode *PNS = cast<PHINode>(getShadow(PN));
PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(getOrigin(PN)) : nullptr;
size_t NumValues = PN->getNumIncomingValues();
for (size_t v = 0; v < NumValues; v++) {
PNS->addIncoming(getShadow(PN, v), PN->getIncomingBlock(v));
if (PNO)
PNO->addIncoming(getOrigin(PN, v), PN->getIncomingBlock(v));
}
}
VAHelper->finalizeInstrumentation();
// Poison allocas at their llvm.lifetime.start intrinsics, unless we have
// fallen back to instrumenting only the allocas themselves.
if (ClHandleLifetimeIntrinsics) {
for (auto Item : LifetimeStartList) {
instrumentAlloca(*Item.second, Item.first);
AllocaSet.remove(Item.second);
}
}
// Poison the allocas for which we didn't instrument the corresponding
// lifetime intrinsics.
for (AllocaInst *AI : AllocaSet)
instrumentAlloca(*AI);
// Insert shadow value checks.
materializeChecks();
// Delayed instrumentation of StoreInst.
// This must not add new address checks.
materializeStores();
return true;
}
/// Compute the shadow type that corresponds to a given Value.
Type *getShadowTy(Value *V) { return getShadowTy(V->getType()); }
/// Compute the shadow type that corresponds to a given Type.
Type *getShadowTy(Type *OrigTy) {
if (!OrigTy->isSized()) {
return nullptr;
}
// For integer type, shadow is the same as the original type.
// This may return weird-sized types like i1.
if (IntegerType *IT = dyn_cast<IntegerType>(OrigTy))
return IT;
const DataLayout &DL = F.getDataLayout();
if (VectorType *VT = dyn_cast<VectorType>(OrigTy)) {
uint32_t EltSize = DL.getTypeSizeInBits(VT->getElementType());
return VectorType::get(IntegerType::get(*MS.C, EltSize),
VT->getElementCount());
}
if (ArrayType *AT = dyn_cast<ArrayType>(OrigTy)) {
return ArrayType::get(getShadowTy(AT->getElementType()),
AT->getNumElements());
}
if (StructType *ST = dyn_cast<StructType>(OrigTy)) {
SmallVector<Type *, 4> Elements;
for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
Elements.push_back(getShadowTy(ST->getElementType(i)));
StructType *Res = StructType::get(*MS.C, Elements, ST->isPacked());
LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
return Res;
}
uint32_t TypeSize = DL.getTypeSizeInBits(OrigTy);
return IntegerType::get(*MS.C, TypeSize);
}
/// Extract combined shadow of struct elements as a bool
Value *collapseStructShadow(StructType *Struct, Value *Shadow,
IRBuilder<> &IRB) {
Value *FalseVal = IRB.getIntN(/* width */ 1, /* value */ 0);
Value *Aggregator = FalseVal;
for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
// Combine by ORing together each element's bool shadow
Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
Value *ShadowBool = convertToBool(ShadowItem, IRB);
if (Aggregator != FalseVal)
Aggregator = IRB.CreateOr(Aggregator, ShadowBool);
else
Aggregator = ShadowBool;
}
return Aggregator;
}
// Extract combined shadow of array elements
Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
IRBuilder<> &IRB) {
if (!Array->getNumElements())
return IRB.getIntN(/* width */ 1, /* value */ 0);
Value *FirstItem = IRB.CreateExtractValue(Shadow, 0);
Value *Aggregator = convertShadowToScalar(FirstItem, IRB);
for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
}
return Aggregator;
}
/// Convert a shadow value to its flattened variant. The resulting
/// shadow may not necessarily have the same bit width as the input
/// value, but it will always be comparable to zero.
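/// For example, a fixed <4 x i32> shadow is bitcast to a single i128, a
/// scalable vector is first OR-reduced to one element, and aggregates are
/// collapsed element by element via collapseStructShadow and
/// collapseArrayShadow.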
Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
if (StructType *Struct = dyn_cast<StructType>(V->getType()))
return collapseStructShadow(Struct, V, IRB);
if (ArrayType *Array = dyn_cast<ArrayType>(V->getType()))
return collapseArrayShadow(Array, V, IRB);
if (isa<VectorType>(V->getType())) {
if (isa<ScalableVectorType>(V->getType()))
return convertShadowToScalar(IRB.CreateOrReduce(V), IRB);
unsigned BitWidth =
V->getType()->getPrimitiveSizeInBits().getFixedValue();
return IRB.CreateBitCast(V, IntegerType::get(*MS.C, BitWidth));
}
return V;
}
// Convert a scalar value to an i1 by comparing with 0
Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
Type *VTy = V->getType();
if (!VTy->isIntegerTy())
return convertToBool(convertShadowToScalar(V, IRB), IRB, name);
if (VTy->getIntegerBitWidth() == 1)
// Just converting a bool to a bool, so do nothing.
return V;
return IRB.CreateICmpNE(V, ConstantInt::get(VTy, 0), name);
}
Type *ptrToIntPtrType(Type *PtrTy) const {
if (VectorType *VectTy = dyn_cast<VectorType>(PtrTy)) {
return VectorType::get(ptrToIntPtrType(VectTy->getElementType()),
VectTy->getElementCount());
}
assert(PtrTy->isIntOrPtrTy());
return MS.IntptrTy;
}
Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
return VectorType::get(
getPtrToShadowPtrType(VectTy->getElementType(), ShadowTy),
VectTy->getElementCount());
}
assert(IntPtrTy == MS.IntptrTy);
return MS.PtrTy;
}
Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
return ConstantVector::getSplat(
VectTy->getElementCount(),
constToIntPtr(VectTy->getElementType(), C));
}
assert(IntPtrTy == MS.IntptrTy);
return ConstantInt::get(MS.IntptrTy, C);
}
/// Returns the integer shadow offset that corresponds to a given
/// application address, whereby:
///
/// Offset = (Addr & ~AndMask) ^ XorMask
/// Shadow = ShadowBase + Offset
/// Origin = (OriginBase + Offset) & ~Alignment
///
/// Note: for efficiency, many shadow mappings only use the XorMask
/// and OriginBase; the AndMask and ShadowBase are often zero.
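/// For example, the Linux/x86_64 mapping uses AndMask == 0 and
/// XorMask == 0x500000000000, so the offset computation reduces to a single
/// XOR: Offset = Addr ^ 0x500000000000.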
Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
Type *IntptrTy = ptrToIntPtrType(Addr->getType());
Value *OffsetLong = IRB.CreatePointerCast(Addr, IntptrTy);
if (uint64_t AndMask = MS.MapParams->AndMask)
OffsetLong = IRB.CreateAnd(OffsetLong, constToIntPtr(IntptrTy, ~AndMask));
if (uint64_t XorMask = MS.MapParams->XorMask)
OffsetLong = IRB.CreateXor(OffsetLong, constToIntPtr(IntptrTy, XorMask));
return OffsetLong;
}
/// Compute the shadow and origin addresses corresponding to a given
/// application address.
///
/// Shadow = ShadowBase + Offset
/// Origin = (OriginBase + Offset) & ~3ULL
/// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow
/// type of a single pointee.
/// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
std::pair<Value *, Value *>
getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
MaybeAlign Alignment) {
VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
if (!VectTy) {
assert(Addr->getType()->isPointerTy());
} else {
assert(VectTy->getElementType()->isPointerTy());
}
Type *IntptrTy = ptrToIntPtrType(Addr->getType());
Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
Value *ShadowLong = ShadowOffset;
if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
ShadowLong =
IRB.CreateAdd(ShadowLong, constToIntPtr(IntptrTy, ShadowBase));
}
Value *ShadowPtr = IRB.CreateIntToPtr(
ShadowLong, getPtrToShadowPtrType(IntptrTy, ShadowTy));
Value *OriginPtr = nullptr;
if (MS.TrackOrigins) {
Value *OriginLong = ShadowOffset;
uint64_t OriginBase = MS.MapParams->OriginBase;
if (OriginBase != 0)
OriginLong =
IRB.CreateAdd(OriginLong, constToIntPtr(IntptrTy, OriginBase));
if (!Alignment || *Alignment < kMinOriginAlignment) {
uint64_t Mask = kMinOriginAlignment.value() - 1;
OriginLong = IRB.CreateAnd(OriginLong, constToIntPtr(IntptrTy, ~Mask));
}
OriginPtr = IRB.CreateIntToPtr(
OriginLong, getPtrToShadowPtrType(IntptrTy, MS.OriginTy));
}
return std::make_pair(ShadowPtr, OriginPtr);
}
template <typename... ArgsTy>
Value *createMetadataCall(IRBuilder<> &IRB, FunctionCallee Callee,
ArgsTy... Args) {
if (MS.TargetTriple.getArch() == Triple::systemz) {
IRB.CreateCall(Callee,
{MS.MsanMetadataAlloca, std::forward<ArgsTy>(Args)...});
return IRB.CreateLoad(MS.MsanMetadata, MS.MsanMetadataAlloca);
}
return IRB.CreateCall(Callee, {std::forward<ArgsTy>(Args)...});
}
std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
IRBuilder<> &IRB,
Type *ShadowTy,
bool isStore) {
Value *ShadowOriginPtrs;
const DataLayout &DL = F.getDataLayout();
TypeSize Size = DL.getTypeStoreSize(ShadowTy);
FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, Size);
Value *AddrCast = IRB.CreatePointerCast(Addr, MS.PtrTy);
if (Getter) {
ShadowOriginPtrs = createMetadataCall(IRB, Getter, AddrCast);
} else {
Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
ShadowOriginPtrs = createMetadataCall(
IRB,
isStore ? MS.MsanMetadataPtrForStoreN : MS.MsanMetadataPtrForLoadN,
AddrCast, SizeVal);
}
Value *ShadowPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 0);
ShadowPtr = IRB.CreatePointerCast(ShadowPtr, MS.PtrTy);
Value *OriginPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 1);
return std::make_pair(ShadowPtr, OriginPtr);
}
/// Addr can be a ptr or <N x ptr>. In both cases, ShadowTy is the shadow
/// type of a single pointee.
/// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
IRBuilder<> &IRB,
Type *ShadowTy,
bool isStore) {
VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
if (!VectTy) {
assert(Addr->getType()->isPointerTy());
return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
}
// TODO: Support callbacks with vectors of addresses.
unsigned NumElements = cast<FixedVectorType>(VectTy)->getNumElements();
Value *ShadowPtrs = ConstantInt::getNullValue(
FixedVectorType::get(IRB.getPtrTy(), NumElements));
Value *OriginPtrs = nullptr;
if (MS.TrackOrigins)
OriginPtrs = ConstantInt::getNullValue(
FixedVectorType::get(IRB.getPtrTy(), NumElements));
for (unsigned i = 0; i < NumElements; ++i) {
Value *OneAddr =
IRB.CreateExtractElement(Addr, ConstantInt::get(IRB.getInt32Ty(), i));
auto [ShadowPtr, OriginPtr] =
getShadowOriginPtrKernelNoVec(OneAddr, IRB, ShadowTy, isStore);
ShadowPtrs = IRB.CreateInsertElement(
ShadowPtrs, ShadowPtr, ConstantInt::get(IRB.getInt32Ty(), i));
if (MS.TrackOrigins)
OriginPtrs = IRB.CreateInsertElement(
OriginPtrs, OriginPtr, ConstantInt::get(IRB.getInt32Ty(), i));
}
return {ShadowPtrs, OriginPtrs};
}
std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
Type *ShadowTy,
MaybeAlign Alignment,
bool isStore) {
if (MS.CompileKernel)
return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
}
/// Compute the shadow address for a given function argument.
///
/// Shadow = ParamTLS+ArgOffset.
Value *getShadowPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
Value *Base = IRB.CreatePointerCast(MS.ParamTLS, MS.IntptrTy);
if (ArgOffset)
Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg");
}
/// Compute the origin address for a given function argument.
Value *getOriginPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
if (!MS.TrackOrigins)
return nullptr;
Value *Base = IRB.CreatePointerCast(MS.ParamOriginTLS, MS.IntptrTy);
if (ArgOffset)
Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg_o");
}
/// Compute the shadow address for a retval.
Value *getShadowPtrForRetval(IRBuilder<> &IRB) {
return IRB.CreatePointerCast(MS.RetvalTLS, IRB.getPtrTy(0), "_msret");
}
/// Compute the origin address for a retval.
Value *getOriginPtrForRetval() {
// We keep a single origin for the entire retval. Might be too optimistic.
return MS.RetvalOriginTLS;
}
/// Set SV to be the shadow value for V.
void setShadow(Value *V, Value *SV) {
assert(!ShadowMap.count(V) && "Values may only have one shadow");
ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
}
/// Set Origin to be the origin value for V.
void setOrigin(Value *V, Value *Origin) {
if (!MS.TrackOrigins)
return;
assert(!OriginMap.count(V) && "Values may only have one origin");
LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << " ==> " << *Origin << "\n");
OriginMap[V] = Origin;
}
Constant *getCleanShadow(Type *OrigTy) {
Type *ShadowTy = getShadowTy(OrigTy);
if (!ShadowTy)
return nullptr;
return Constant::getNullValue(ShadowTy);
}
/// Create a clean shadow value for a given value.
///
/// Clean shadow (all zeroes) means all bits of the value are defined
/// (initialized).
Constant *getCleanShadow(Value *V) { return getCleanShadow(V->getType()); }
/// Create a dirty shadow of a given shadow type.
Constant *getPoisonedShadow(Type *ShadowTy) {
assert(ShadowTy);
if (isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy))
return Constant::getAllOnesValue(ShadowTy);
if (ArrayType *AT = dyn_cast<ArrayType>(ShadowTy)) {
SmallVector<Constant *, 4> Vals(AT->getNumElements(),
getPoisonedShadow(AT->getElementType()));
return ConstantArray::get(AT, Vals);
}
if (StructType *ST = dyn_cast<StructType>(ShadowTy)) {
SmallVector<Constant *, 4> Vals;
for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
Vals.push_back(getPoisonedShadow(ST->getElementType(i)));
return ConstantStruct::get(ST, Vals);
}
llvm_unreachable("Unexpected shadow type");
}
/// Create a dirty shadow for a given value.
Constant *getPoisonedShadow(Value *V) {
Type *ShadowTy = getShadowTy(V);
if (!ShadowTy)
return nullptr;
return getPoisonedShadow(ShadowTy);
}
/// Create a clean (zero) origin.
Value *getCleanOrigin() { return Constant::getNullValue(MS.OriginTy); }
/// Get the shadow value for a given Value.
///
/// This function either returns the value set earlier with setShadow,
/// or extracts it from ParamTLS (for function arguments).
Value *getShadow(Value *V) {
if (Instruction *I = dyn_cast<Instruction>(V)) {
if (!PropagateShadow || I->getMetadata(LLVMContext::MD_nosanitize))
return getCleanShadow(V);
// For instructions the shadow is already stored in the map.
Value *Shadow = ShadowMap[V];
if (!Shadow) {
LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
assert(Shadow && "No shadow for a value");
}
return Shadow;
}
// Handle fully undefined values
// (partially undefined constant vectors are handled later)
if ([[maybe_unused]] UndefValue *U = dyn_cast<UndefValue>(V)) {
Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
: getCleanShadow(V);
LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
return AllOnes;
}
if (Argument *A = dyn_cast<Argument>(V)) {
// For arguments we compute the shadow on demand and store it in the map.
Value *&ShadowPtr = ShadowMap[V];
if (ShadowPtr)
return ShadowPtr;
Function *F = A->getParent();
IRBuilder<> EntryIRB(FnPrologueEnd);
unsigned ArgOffset = 0;
const DataLayout &DL = F->getDataLayout();
for (auto &FArg : F->args()) {
if (!FArg.getType()->isSized() || FArg.getType()->isScalableTy()) {
LLVM_DEBUG(dbgs() << (FArg.getType()->isScalableTy()
? "vscale not fully supported\n"
: "Arg is not sized\n"));
if (A == &FArg) {
ShadowPtr = getCleanShadow(V);
setOrigin(A, getCleanOrigin());
break;
}
continue;
}
unsigned Size = FArg.hasByValAttr()
? DL.getTypeAllocSize(FArg.getParamByValType())
: DL.getTypeAllocSize(FArg.getType());
if (A == &FArg) {
bool Overflow = ArgOffset + Size > kParamTLSSize;
if (FArg.hasByValAttr()) {
// The ByVal pointer itself has a clean shadow. We copy the actual
// argument shadow to the underlying memory.
// Figure out maximal valid memcpy alignment.
const Align ArgAlign = DL.getValueOrABITypeAlignment(
FArg.getParamAlign(), FArg.getParamByValType());
Value *CpShadowPtr, *CpOriginPtr;
std::tie(CpShadowPtr, CpOriginPtr) =
getShadowOriginPtr(V, EntryIRB, EntryIRB.getInt8Ty(), ArgAlign,
/*isStore*/ true);
if (!PropagateShadow || Overflow) {
// ParamTLS overflow.
EntryIRB.CreateMemSet(
CpShadowPtr, Constant::getNullValue(EntryIRB.getInt8Ty()),
Size, ArgAlign);
} else {
Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
const Align CopyAlign = std::min(ArgAlign, kShadowTLSAlignment);
[[maybe_unused]] Value *Cpy = EntryIRB.CreateMemCpy(
CpShadowPtr, CopyAlign, Base, CopyAlign, Size);
LLVM_DEBUG(dbgs() << " ByValCpy: " << *Cpy << "\n");
if (MS.TrackOrigins) {
Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
// FIXME: OriginSize should be:
// alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
EntryIRB.CreateMemCpy(
CpOriginPtr,
/* by getShadowOriginPtr */ kMinOriginAlignment, OriginPtr,
/* by origin_tls[ArgOffset] */ kMinOriginAlignment,
OriginSize);
}
}
}
if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
(MS.EagerChecks && FArg.hasAttribute(Attribute::NoUndef))) {
ShadowPtr = getCleanShadow(V);
setOrigin(A, getCleanOrigin());
} else {
// Shadow over TLS
Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
ShadowPtr = EntryIRB.CreateAlignedLoad(getShadowTy(&FArg), Base,
kShadowTLSAlignment);
if (MS.TrackOrigins) {
Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
setOrigin(A, EntryIRB.CreateLoad(MS.OriginTy, OriginPtr));
}
}
LLVM_DEBUG(dbgs()
<< " ARG: " << FArg << " ==> " << *ShadowPtr << "\n");
break;
}
ArgOffset += alignTo(Size, kShadowTLSAlignment);
}
assert(ShadowPtr && "Could not find shadow for an argument");
return ShadowPtr;
}
// Check for partially-undefined constant vectors
// TODO: scalable vectors (this is hard because we do not have IRBuilder)
if (isa<FixedVectorType>(V->getType()) && isa<Constant>(V) &&
cast<Constant>(V)->containsUndefOrPoisonElement() && PropagateShadow &&
PoisonUndefVectors) {
unsigned NumElems = cast<FixedVectorType>(V->getType())->getNumElements();
SmallVector<Constant *, 32> ShadowVector(NumElems);
for (unsigned i = 0; i != NumElems; ++i) {
Constant *Elem = cast<Constant>(V)->getAggregateElement(i);
ShadowVector[i] = isa<UndefValue>(Elem) ? getPoisonedShadow(Elem)
: getCleanShadow(Elem);
}
Value *ShadowConstant = ConstantVector::get(ShadowVector);
LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> "
<< *ShadowConstant << "\n");
return ShadowConstant;
}
// TODO: partially-undefined constant arrays, structures, and nested types
// For everything else the shadow is zero.
return getCleanShadow(V);
}
/// Get the shadow for i-th argument of the instruction I.
Value *getShadow(Instruction *I, int i) {
return getShadow(I->getOperand(i));
}
/// Get the origin for a value.
Value *getOrigin(Value *V) {
if (!MS.TrackOrigins)
return nullptr;
if (!PropagateShadow || isa<Constant>(V) || isa<InlineAsm>(V))
return getCleanOrigin();
assert((isa<Instruction>(V) || isa<Argument>(V)) &&
"Unexpected value type in getOrigin()");
if (Instruction *I = dyn_cast<Instruction>(V)) {
if (I->getMetadata(LLVMContext::MD_nosanitize))
return getCleanOrigin();
}
Value *Origin = OriginMap[V];
assert(Origin && "Missing origin");
return Origin;
}
/// Get the origin for i-th argument of the instruction I.
Value *getOrigin(Instruction *I, int i) {
return getOrigin(I->getOperand(i));
}
/// Remember the place where a shadow check should be inserted.
///
/// This location will be later instrumented with a check that will print a
/// UMR warning in runtime if the shadow value is not 0.
void insertCheckShadow(Value *Shadow, Value *Origin, Instruction *OrigIns) {
assert(Shadow);
if (!InsertChecks)
return;
if (!DebugCounter::shouldExecute(DebugInsertCheck)) {
LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
<< *OrigIns << "\n");
return;
}
#ifndef NDEBUG
Type *ShadowTy = Shadow->getType();
assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
"Can only insert checks for integer, vector, and aggregate shadow "
"types");
#endif
InstrumentationList.push_back(
ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
}
/// Get shadow for value, and remember the place where a shadow check should
/// be inserted.
///
/// This location will be later instrumented with a check that will print a
/// UMR warning in runtime if the value is not fully defined.
void insertCheckShadowOf(Value *Val, Instruction *OrigIns) {
assert(Val);
Value *Shadow, *Origin;
if (ClCheckConstantShadow) {
Shadow = getShadow(Val);
if (!Shadow)
return;
Origin = getOrigin(Val);
} else {
Shadow = dyn_cast_or_null<Instruction>(getShadow(Val));
if (!Shadow)
return;
Origin = dyn_cast_or_null<Instruction>(getOrigin(Val));
}
insertCheckShadow(Shadow, Origin, OrigIns);
}
AtomicOrdering addReleaseOrdering(AtomicOrdering a) {
switch (a) {
case AtomicOrdering::NotAtomic:
return AtomicOrdering::NotAtomic;
case AtomicOrdering::Unordered:
case AtomicOrdering::Monotonic:
case AtomicOrdering::Release:
return AtomicOrdering::Release;
case AtomicOrdering::Acquire:
case AtomicOrdering::AcquireRelease:
return AtomicOrdering::AcquireRelease;
case AtomicOrdering::SequentiallyConsistent:
return AtomicOrdering::SequentiallyConsistent;
}
llvm_unreachable("Unknown ordering");
}
Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
uint32_t OrderingTable[NumOrderings] = {};
OrderingTable[(int)AtomicOrderingCABI::relaxed] =
OrderingTable[(int)AtomicOrderingCABI::release] =
(int)AtomicOrderingCABI::release;
OrderingTable[(int)AtomicOrderingCABI::consume] =
OrderingTable[(int)AtomicOrderingCABI::acquire] =
OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
(int)AtomicOrderingCABI::acq_rel;
OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
(int)AtomicOrderingCABI::seq_cst;
return ConstantDataVector::get(IRB.getContext(), OrderingTable);
}
AtomicOrdering addAcquireOrdering(AtomicOrdering a) {
switch (a) {
case AtomicOrdering::NotAtomic:
return AtomicOrdering::NotAtomic;
case AtomicOrdering::Unordered:
case AtomicOrdering::Monotonic:
case AtomicOrdering::Acquire:
return AtomicOrdering::Acquire;
case AtomicOrdering::Release:
case AtomicOrdering::AcquireRelease:
return AtomicOrdering::AcquireRelease;
case AtomicOrdering::SequentiallyConsistent:
return AtomicOrdering::SequentiallyConsistent;
}
llvm_unreachable("Unknown ordering");
}
Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
uint32_t OrderingTable[NumOrderings] = {};
OrderingTable[(int)AtomicOrderingCABI::relaxed] =
OrderingTable[(int)AtomicOrderingCABI::acquire] =
OrderingTable[(int)AtomicOrderingCABI::consume] =
(int)AtomicOrderingCABI::acquire;
OrderingTable[(int)AtomicOrderingCABI::release] =
OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
(int)AtomicOrderingCABI::acq_rel;
OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
(int)AtomicOrderingCABI::seq_cst;
return ConstantDataVector::get(IRB.getContext(), OrderingTable);
}
// ------------------- Visitors.
using InstVisitor<MemorySanitizerVisitor>::visit;
void visit(Instruction &I) {
if (I.getMetadata(LLVMContext::MD_nosanitize))
return;
// Don't want to visit if we're in the prologue
if (isInPrologue(I))
return;
if (!DebugCounter::shouldExecute(DebugInstrumentInstruction)) {
LLVM_DEBUG(dbgs() << "Skipping instruction: " << I << "\n");
// We still need to set the shadow and origin to clean values.
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
return;
}
Instructions.push_back(&I);
}
/// Instrument LoadInst
///
/// Loads the corresponding shadow and (optionally) origin.
/// Optionally, checks that the load address is fully defined.
void visitLoadInst(LoadInst &I) {
assert(I.getType()->isSized() && "Load type must have size");
assert(!I.getMetadata(LLVMContext::MD_nosanitize));
NextNodeIRBuilder IRB(&I);
Type *ShadowTy = getShadowTy(&I);
Value *Addr = I.getPointerOperand();
Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
const Align Alignment = I.getAlign();
if (PropagateShadow) {
std::tie(ShadowPtr, OriginPtr) =
getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
setShadow(&I,
IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
} else {
setShadow(&I, getCleanShadow(&I));
}
if (ClCheckAccessAddress)
insertCheckShadowOf(I.getPointerOperand(), &I);
if (I.isAtomic())
I.setOrdering(addAcquireOrdering(I.getOrdering()));
if (MS.TrackOrigins) {
if (PropagateShadow) {
const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
setOrigin(
&I, IRB.CreateAlignedLoad(MS.OriginTy, OriginPtr, OriginAlignment));
} else {
setOrigin(&I, getCleanOrigin());
}
}
}
/// Instrument StoreInst
///
/// Stores the corresponding shadow and (optionally) origin.
/// Optionally, checks that the store address is fully defined.
void visitStoreInst(StoreInst &I) {
StoreList.push_back(&I);
if (ClCheckAccessAddress)
insertCheckShadowOf(I.getPointerOperand(), &I);
}
void handleCASOrRMW(Instruction &I) {
assert(isa<AtomicRMWInst>(I) || isa<AtomicCmpXchgInst>(I));
IRBuilder<> IRB(&I);
Value *Addr = I.getOperand(0);
Value *Val = I.getOperand(1);
Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, getShadowTy(Val), Align(1),
/*isStore*/ true)
.first;
if (ClCheckAccessAddress)
insertCheckShadowOf(Addr, &I);
// Only test the conditional argument of the cmpxchg instruction.
// The other argument can potentially be uninitialized, but we cannot
// detect this situation reliably without possible false positives.
if (isa<AtomicCmpXchgInst>(I))
insertCheckShadowOf(Val, &I);
IRB.CreateStore(getCleanShadow(Val), ShadowPtr);
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
void visitAtomicRMWInst(AtomicRMWInst &I) {
handleCASOrRMW(I);
I.setOrdering(addReleaseOrdering(I.getOrdering()));
}
void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
handleCASOrRMW(I);
I.setSuccessOrdering(addReleaseOrdering(I.getSuccessOrdering()));
}
// Vector manipulation.
void visitExtractElementInst(ExtractElementInst &I) {
insertCheckShadowOf(I.getOperand(1), &I);
IRBuilder<> IRB(&I);
setShadow(&I, IRB.CreateExtractElement(getShadow(&I, 0), I.getOperand(1),
"_msprop"));
setOrigin(&I, getOrigin(&I, 0));
}
void visitInsertElementInst(InsertElementInst &I) {
insertCheckShadowOf(I.getOperand(2), &I);
IRBuilder<> IRB(&I);
auto *Shadow0 = getShadow(&I, 0);
auto *Shadow1 = getShadow(&I, 1);
setShadow(&I, IRB.CreateInsertElement(Shadow0, Shadow1, I.getOperand(2),
"_msprop"));
setOriginForNaryOp(I);
}
void visitShuffleVectorInst(ShuffleVectorInst &I) {
IRBuilder<> IRB(&I);
auto *Shadow0 = getShadow(&I, 0);
auto *Shadow1 = getShadow(&I, 1);
setShadow(&I, IRB.CreateShuffleVector(Shadow0, Shadow1, I.getShuffleMask(),
"_msprop"));
setOriginForNaryOp(I);
}
// Casts.
void visitSExtInst(SExtInst &I) {
IRBuilder<> IRB(&I);
setShadow(&I, IRB.CreateSExt(getShadow(&I, 0), I.getType(), "_msprop"));
setOrigin(&I, getOrigin(&I, 0));
}
void visitZExtInst(ZExtInst &I) {
IRBuilder<> IRB(&I);
setShadow(&I, IRB.CreateZExt(getShadow(&I, 0), I.getType(), "_msprop"));
setOrigin(&I, getOrigin(&I, 0));
}
void visitTruncInst(TruncInst &I) {
IRBuilder<> IRB(&I);
setShadow(&I, IRB.CreateTrunc(getShadow(&I, 0), I.getType(), "_msprop"));
setOrigin(&I, getOrigin(&I, 0));
}
void visitBitCastInst(BitCastInst &I) {
// Special case: if this is the bitcast (there is exactly 1 allowed) between
// a musttail call and a ret, don't instrument. New instructions are not
// allowed after a musttail call.
if (auto *CI = dyn_cast<CallInst>(I.getOperand(0)))
if (CI->isMustTailCall())
return;
IRBuilder<> IRB(&I);
setShadow(&I, IRB.CreateBitCast(getShadow(&I, 0), getShadowTy(&I)));
setOrigin(&I, getOrigin(&I, 0));
}
void visitPtrToIntInst(PtrToIntInst &I) {
IRBuilder<> IRB(&I);
setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
"_msprop_ptrtoint"));
setOrigin(&I, getOrigin(&I, 0));
}
void visitIntToPtrInst(IntToPtrInst &I) {
IRBuilder<> IRB(&I);
setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
"_msprop_inttoptr"));
setOrigin(&I, getOrigin(&I, 0));
}
void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
/// Propagate shadow for bitwise AND.
///
/// This code is exact, i.e. if, for example, a bit in the left argument
/// is defined and 0, then neither the value nor the definedness of the
/// corresponding bit in the right argument affects the resulting shadow.
void visitAnd(BinaryOperator &I) {
IRBuilder<> IRB(&I);
// "And" of 0 and a poisoned value results in an unpoisoned value.
// 1&1 => 1; 0&1 => 0; p&1 => p;
// 1&0 => 0; 0&0 => 0; p&0 => 0;
// 1&p => p; 0&p => 0; p&p => p;
// S = (S1 & S2) | (V1 & S2) | (S1 & V2)
Value *S1 = getShadow(&I, 0);
Value *S2 = getShadow(&I, 1);
Value *V1 = I.getOperand(0);
Value *V2 = I.getOperand(1);
if (V1->getType() != S1->getType()) {
V1 = IRB.CreateIntCast(V1, S1->getType(), false);
V2 = IRB.CreateIntCast(V2, S2->getType(), false);
}
Value *S1S2 = IRB.CreateAnd(S1, S2);
Value *V1S2 = IRB.CreateAnd(V1, S2);
Value *S1V2 = IRB.CreateAnd(S1, V2);
setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
setOriginForNaryOp(I);
}
void visitOr(BinaryOperator &I) {
IRBuilder<> IRB(&I);
// "Or" of 1 and a poisoned value results in an unpoisoned value:
// 1|1 => 1; 0|1 => 1; p|1 => 1;
// 1|0 => 1; 0|0 => 0; p|0 => p;
// 1|p => 1; 0|p => p; p|p => p;
//
// S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
//
// If the "disjoint OR" property is violated, the result is poison, and
// hence the entire shadow is uninitialized:
// S = S | SignExt(V1 & V2 != 0)
Value *S1 = getShadow(&I, 0);
Value *S2 = getShadow(&I, 1);
Value *V1 = I.getOperand(0);
Value *V2 = I.getOperand(1);
if (V1->getType() != S1->getType()) {
V1 = IRB.CreateIntCast(V1, S1->getType(), false);
V2 = IRB.CreateIntCast(V2, S2->getType(), false);
}
Value *NotV1 = IRB.CreateNot(V1);
Value *NotV2 = IRB.CreateNot(V2);
Value *S1S2 = IRB.CreateAnd(S1, S2);
Value *S2NotV1 = IRB.CreateAnd(NotV1, S2);
Value *S1NotV2 = IRB.CreateAnd(S1, NotV2);
Value *S = IRB.CreateOr({S1S2, S2NotV1, S1NotV2});
if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(&I)->isDisjoint()) {
Value *V1V2 = IRB.CreateAnd(V1, V2);
Value *DisjointOrShadow = IRB.CreateSExt(
IRB.CreateICmpNE(V1V2, getCleanShadow(V1V2)), V1V2->getType());
S = IRB.CreateOr(S, DisjointOrShadow, "_ms_disjoint");
}
setShadow(&I, S);
setOriginForNaryOp(I);
}
/// Default propagation of shadow and/or origin.
///
/// This class implements the general case of shadow propagation, used in all
/// cases where we don't know and/or don't care about what the operation
/// actually does. It converts all input shadow values to a common type
/// (extending or truncating as necessary), and bitwise OR's them.
///
/// This is much cheaper than inserting checks (i.e. requiring inputs to be
/// fully initialized), and less prone to false positives.
///
/// This class also implements the general case of origin propagation. For a
/// Nary operation, result origin is set to the origin of an argument that is
/// not entirely initialized. If there is more than one such argument, the
/// rightmost of them is picked. It does not matter which one is picked if all
/// arguments are initialized.
template <bool CombineShadow> class Combiner {
Value *Shadow = nullptr;
Value *Origin = nullptr;
IRBuilder<> &IRB;
MemorySanitizerVisitor *MSV;
public:
Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
: IRB(IRB), MSV(MSV) {}
/// Add a pair of shadow and origin values to the mix.
Combiner &Add(Value *OpShadow, Value *OpOrigin) {
if (CombineShadow) {
assert(OpShadow);
if (!Shadow)
Shadow = OpShadow;
else {
OpShadow = MSV->CreateShadowCast(IRB, OpShadow, Shadow->getType());
Shadow = IRB.CreateOr(Shadow, OpShadow, "_msprop");
}
}
if (MSV->MS.TrackOrigins) {
assert(OpOrigin);
if (!Origin) {
Origin = OpOrigin;
} else {
Constant *ConstOrigin = dyn_cast<Constant>(OpOrigin);
// No point in adding something that might result in a 0 origin value.
if (!ConstOrigin || !ConstOrigin->isNullValue()) {
Value *Cond = MSV->convertToBool(OpShadow, IRB);
Origin = IRB.CreateSelect(Cond, OpOrigin, Origin);
}
}
}
return *this;
}
/// Add an application value to the mix.
Combiner &Add(Value *V) {
Value *OpShadow = MSV->getShadow(V);
Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
return Add(OpShadow, OpOrigin);
}
/// Set the current combined values as the given instruction's shadow
/// and origin.
void Done(Instruction *I) {
if (CombineShadow) {
assert(Shadow);
Shadow = MSV->CreateShadowCast(IRB, Shadow, MSV->getShadowTy(I));
MSV->setShadow(I, Shadow);
}
if (MSV->MS.TrackOrigins) {
assert(Origin);
MSV->setOrigin(I, Origin);
}
}
/// Store the current combined value at the specified origin
/// location.
void DoneAndStoreOrigin(TypeSize TS, Value *OriginPtr) {
if (MSV->MS.TrackOrigins) {
assert(Origin);
MSV->paintOrigin(IRB, Origin, OriginPtr, TS, kMinOriginAlignment);
}
}
};
using ShadowAndOriginCombiner = Combiner<true>;
using OriginCombiner = Combiner<false>;
/// Propagate origin for arbitrary operation.
void setOriginForNaryOp(Instruction &I) {
if (!MS.TrackOrigins)
return;
IRBuilder<> IRB(&I);
OriginCombiner OC(this, IRB);
for (Use &Op : I.operands())
OC.Add(Op.get());
OC.Done(&I);
}
size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
"Vector of pointers is not a valid shadow type");
return Ty->isVectorTy() ? cast<FixedVectorType>(Ty)->getNumElements() *
Ty->getScalarSizeInBits()
: Ty->getPrimitiveSizeInBits();
}
/// Cast between two shadow types, extending or truncating as
/// necessary.
Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
bool Signed = false) {
Type *srcTy = V->getType();
if (srcTy == dstTy)
return V;
size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(srcTy);
size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(dstTy);
if (srcSizeInBits > 1 && dstSizeInBits == 1)
return IRB.CreateICmpNE(V, getCleanShadow(V));
if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
return IRB.CreateIntCast(V, dstTy, Signed);
if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
cast<VectorType>(dstTy)->getElementCount() ==
cast<VectorType>(srcTy)->getElementCount())
return IRB.CreateIntCast(V, dstTy, Signed);
Value *V1 = IRB.CreateBitCast(V, Type::getIntNTy(*MS.C, srcSizeInBits));
Value *V2 =
IRB.CreateIntCast(V1, Type::getIntNTy(*MS.C, dstSizeInBits), Signed);
return IRB.CreateBitCast(V2, dstTy);
// TODO: handle struct types.
}
/// Cast an application value to the type of its own shadow.
Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
Type *ShadowTy = getShadowTy(V);
if (V->getType() == ShadowTy)
return V;
if (V->getType()->isPtrOrPtrVectorTy())
return IRB.CreatePtrToInt(V, ShadowTy);
else
return IRB.CreateBitCast(V, ShadowTy);
}
/// Propagate shadow for arbitrary operation.
void handleShadowOr(Instruction &I) {
IRBuilder<> IRB(&I);
ShadowAndOriginCombiner SC(this, IRB);
for (Use &Op : I.operands())
SC.Add(Op.get());
SC.Done(&I);
}
// Perform a bitwise OR on the horizontal pairs (or other specified grouping)
// of elements.
//
// For example, suppose we have:
// VectorA: <a1, a2, a3, a4, a5, a6>
// VectorB: <b1, b2, b3, b4, b5, b6>
// ReductionFactor: 3.
// The output would be:
// <a1|a2|a3, a4|a5|a6, b1|b2|b3, b4|b5|b6>
//
// This is convenient for instrumenting horizontal add/sub.
// For bitwise OR on "vertical" pairs, see maybeHandleSimpleNomemIntrinsic().
Value *horizontalReduce(IntrinsicInst &I, unsigned ReductionFactor,
Value *VectorA, Value *VectorB) {
assert(isa<FixedVectorType>(VectorA->getType()));
unsigned TotalNumElems =
cast<FixedVectorType>(VectorA->getType())->getNumElements();
if (VectorB) {
assert(VectorA->getType() == VectorB->getType());
TotalNumElems = TotalNumElems * 2;
}
assert(TotalNumElems % ReductionFactor == 0);
Value *Or = nullptr;
IRBuilder<> IRB(&I);
for (unsigned i = 0; i < ReductionFactor; i++) {
SmallVector<int, 16> Mask;
for (unsigned X = 0; X < TotalNumElems; X += ReductionFactor)
Mask.push_back(X + i);
Value *Masked;
if (VectorB)
Masked = IRB.CreateShuffleVector(VectorA, VectorB, Mask);
else
Masked = IRB.CreateShuffleVector(VectorA, Mask);
if (Or)
Or = IRB.CreateOr(Or, Masked);
else
Or = Masked;
}
return Or;
}
/// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
/// fields.
///
/// e.g., <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16(<4 x i16>)
/// <16 x i8> @llvm.aarch64.neon.addp.v16i8(<16 x i8>, <16 x i8>)
void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 1 || I.arg_size() == 2);
assert(I.getType()->isVectorTy());
assert(I.getArgOperand(0)->getType()->isVectorTy());
[[maybe_unused]] FixedVectorType *ParamType =
cast<FixedVectorType>(I.getArgOperand(0)->getType());
assert((I.arg_size() != 2) ||
(ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
[[maybe_unused]] FixedVectorType *ReturnType =
cast<FixedVectorType>(I.getType());
assert(ParamType->getNumElements() * I.arg_size() ==
2 * ReturnType->getNumElements());
IRBuilder<> IRB(&I);
// Horizontal OR of shadow
Value *FirstArgShadow = getShadow(&I, 0);
Value *SecondArgShadow = nullptr;
if (I.arg_size() == 2)
SecondArgShadow = getShadow(&I, 1);
Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
SecondArgShadow);
OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
setShadow(&I, OrShadow);
setOriginForNaryOp(I);
}
/// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
/// fields, with the parameters reinterpreted to have elements of a specified
/// width. For example:
/// @llvm.x86.ssse3.phadd.w(<1 x i64> [[VAR1]], <1 x i64> [[VAR2]])
/// conceptually operates on
/// (<4 x i16> [[VAR1]], <4 x i16> [[VAR2]])
/// and can be handled with ReinterpretElemWidth == 16.
void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I,
int ReinterpretElemWidth) {
assert(I.arg_size() == 1 || I.arg_size() == 2);
assert(I.getType()->isVectorTy());
assert(I.getArgOperand(0)->getType()->isVectorTy());
FixedVectorType *ParamType =
cast<FixedVectorType>(I.getArgOperand(0)->getType());
assert((I.arg_size() != 2) ||
(ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
[[maybe_unused]] FixedVectorType *ReturnType =
cast<FixedVectorType>(I.getType());
assert(ParamType->getNumElements() * I.arg_size() ==
2 * ReturnType->getNumElements());
IRBuilder<> IRB(&I);
FixedVectorType *ReinterpretShadowTy = nullptr;
assert(isAligned(Align(ReinterpretElemWidth),
ParamType->getPrimitiveSizeInBits()));
ReinterpretShadowTy = FixedVectorType::get(
IRB.getIntNTy(ReinterpretElemWidth),
ParamType->getPrimitiveSizeInBits() / ReinterpretElemWidth);
// Horizontal OR of shadow
Value *FirstArgShadow = getShadow(&I, 0);
FirstArgShadow = IRB.CreateBitCast(FirstArgShadow, ReinterpretShadowTy);
// If we had two parameters each with an odd number of elements, the total
// number of elements is even, but we have never seen this in extant
// instruction sets, so we enforce that each parameter must have an even
// number of elements.
assert(isAligned(
Align(2),
cast<FixedVectorType>(FirstArgShadow->getType())->getNumElements()));
Value *SecondArgShadow = nullptr;
if (I.arg_size() == 2) {
SecondArgShadow = getShadow(&I, 1);
SecondArgShadow = IRB.CreateBitCast(SecondArgShadow, ReinterpretShadowTy);
}
Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
SecondArgShadow);
OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
setShadow(&I, OrShadow);
setOriginForNaryOp(I);
}
void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
// Handle multiplication by constant.
//
// Handle a special case of multiplication by constant that may have one or
// more zeros in the lower bits. This makes the corresponding number of lower
// bits of the result zero as well. We model it by shifting the other operand
// shadow left by the required number of bits. Effectively, we transform
// (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
// We use multiplication by 2**N instead of shift to cover the case of
// multiplication by 0, which may occur in some elements of a vector operand.
void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
Value *OtherArg) {
Constant *ShadowMul;
Type *Ty = ConstArg->getType();
if (auto *VTy = dyn_cast<VectorType>(Ty)) {
unsigned NumElements = cast<FixedVectorType>(VTy)->getNumElements();
Type *EltTy = VTy->getElementType();
SmallVector<Constant *, 16> Elements;
for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
if (ConstantInt *Elt =
dyn_cast<ConstantInt>(ConstArg->getAggregateElement(Idx))) {
const APInt &V = Elt->getValue();
APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
Elements.push_back(ConstantInt::get(EltTy, V2));
} else {
Elements.push_back(ConstantInt::get(EltTy, 1));
}
}
ShadowMul = ConstantVector::get(Elements);
} else {
if (ConstantInt *Elt = dyn_cast<ConstantInt>(ConstArg)) {
const APInt &V = Elt->getValue();
APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
ShadowMul = ConstantInt::get(Ty, V2);
} else {
ShadowMul = ConstantInt::get(Ty, 1);
}
}
IRBuilder<> IRB(&I);
setShadow(&I,
IRB.CreateMul(getShadow(OtherArg), ShadowMul, "msprop_mul_cst"));
setOrigin(&I, getOrigin(OtherArg));
}
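As an illustrative aside (not part of the instrumentation itself), the shadow rule above can be modeled at the bit level with plain integers; `ctz8` and `mulConstShadow` are hypothetical names used only for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros of an 8-bit value; returns 8 for 0.
unsigned ctz8(uint8_t X) {
  if (X == 0)
    return 8;
  unsigned N = 0;
  while (!(X & 1)) {
    X >>= 1;
    ++N;
  }
  return N;
}

// Model of handleMulByConstant: multiplying by a constant C with B trailing
// zero bits forces the low B bits of the result to zero, so the other
// operand's shadow is multiplied by 2**B. For C == 0, B saturates at the bit
// width and the shadow becomes fully clean (2**8 wraps to 0 in 8 bits, which
// is why the real code uses a multiplication rather than a shift).
uint8_t mulConstShadow(uint8_t OtherShadow, uint8_t C) {
  unsigned B = ctz8(C);
  return B >= 8 ? 0 : static_cast<uint8_t>(OtherShadow << B);
}
```

For example, multiplying by 4 (two trailing zeros) shifts a fully-poisoned shadow from 0xFF to 0xFC: the two low result bits are known to be zero.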
void visitMul(BinaryOperator &I) {
Constant *constOp0 = dyn_cast<Constant>(I.getOperand(0));
Constant *constOp1 = dyn_cast<Constant>(I.getOperand(1));
if (constOp0 && !constOp1)
handleMulByConstant(I, constOp0, I.getOperand(1));
else if (constOp1 && !constOp0)
handleMulByConstant(I, constOp1, I.getOperand(0));
else
handleShadowOr(I);
}
void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
void visitSub(BinaryOperator &I) { handleShadowOr(I); }
void visitXor(BinaryOperator &I) { handleShadowOr(I); }
void handleIntegerDiv(Instruction &I) {
IRBuilder<> IRB(&I);
// Strict on the second argument.
insertCheckShadowOf(I.getOperand(1), &I);
setShadow(&I, getShadow(&I, 0));
setOrigin(&I, getOrigin(&I, 0));
}
void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
// Floating-point division is side-effect free, so we cannot require that the
// divisor be fully initialized; instead, we propagate shadow. See PR37523.
void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
/// Instrument == and != comparisons.
///
/// Sometimes the comparison result is known even if some of the bits of the
/// arguments are not.
void handleEqualityComparison(ICmpInst &I) {
IRBuilder<> IRB(&I);
Value *A = I.getOperand(0);
Value *B = I.getOperand(1);
Value *Sa = getShadow(A);
Value *Sb = getShadow(B);
// Get rid of pointers and vectors of pointers.
// For ints (and vectors of ints), types of A and Sa match,
// and this is a no-op.
A = IRB.CreatePointerCast(A, Sa->getType());
B = IRB.CreatePointerCast(B, Sb->getType());
// A == B <==> (C = A^B) == 0
// A != B <==> (C = A^B) != 0
// Sc = Sa | Sb
Value *C = IRB.CreateXor(A, B);
Value *Sc = IRB.CreateOr(Sa, Sb);
// Now dealing with i = (C == 0) comparison (or C != 0, does not matter now)
// Result is defined if one of the following is true
// * there is a defined 1 bit in C
// * C is fully defined
// Si = !(C & ~Sc) && Sc
Value *Zero = Constant::getNullValue(Sc->getType());
Value *MinusOne = Constant::getAllOnesValue(Sc->getType());
Value *LHS = IRB.CreateICmpNE(Sc, Zero);
Value *RHS =
IRB.CreateICmpEQ(IRB.CreateAnd(IRB.CreateXor(Sc, MinusOne), C), Zero);
Value *Si = IRB.CreateAnd(LHS, RHS);
Si->setName("_msprop_icmp");
setShadow(&I, Si);
setOriginForNaryOp(I);
}
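To make the derivation above concrete, here is a small bit-level model of the `Si = !(C & ~Sc) && Sc` formula (illustrative only; `eqCmpShadow` is a hypothetical name):

```cpp
#include <cassert>
#include <cstdint>

// Model of handleEqualityComparison: the result of A == B is undefined iff
// C = A^B has some undefined bits (Sc != 0) and none of its *defined* bits
// is 1 (a single defined 1 bit in C already proves A != B).
bool eqCmpShadow(uint8_t A, uint8_t Sa, uint8_t B, uint8_t Sb) {
  uint8_t C = A ^ B;    // bits where A and B differ
  uint8_t Sc = Sa | Sb; // undefined bits of C
  return (Sc != 0) && ((C & static_cast<uint8_t>(~Sc)) == 0);
}
```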
/// Instrument relational comparisons.
///
/// This function does exact shadow propagation for all relational
/// comparisons of integers, pointers and vectors of those.
/// FIXME: output seems suboptimal when one of the operands is a constant
void handleRelationalComparisonExact(ICmpInst &I) {
IRBuilder<> IRB(&I);
Value *A = I.getOperand(0);
Value *B = I.getOperand(1);
Value *Sa = getShadow(A);
Value *Sb = getShadow(B);
// Get rid of pointers and vectors of pointers.
// For ints (and vectors of ints), types of A and Sa match,
// and this is a no-op.
A = IRB.CreatePointerCast(A, Sa->getType());
B = IRB.CreatePointerCast(B, Sb->getType());
// Let [a0, a1] be the interval of possible values of A, taking into account
// its undefined bits. Let [b0, b1] be the interval of possible values of B.
// Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
bool IsSigned = I.isSigned();
auto GetMinMaxUnsigned = [&](Value *V, Value *S) {
if (IsSigned) {
// Sign-flip to map from signed range to unsigned range. Relation A vs B
// should be preserved, if checked with `getUnsignedPredicate()`.
// The relationship between Amin, Amax, Bmin and Bmax is likewise unaffected,
// since they are created by effectively adding to / subtracting from A (or
// B) a value derived from the shadow, with no overflow, either before or
// after the sign flip.
APInt MinVal =
APInt::getSignedMinValue(V->getType()->getScalarSizeInBits());
V = IRB.CreateXor(V, ConstantInt::get(V->getType(), MinVal));
}
// Minimize undefined bits.
Value *Min = IRB.CreateAnd(V, IRB.CreateNot(S));
Value *Max = IRB.CreateOr(V, S);
return std::make_pair(Min, Max);
};
auto [Amin, Amax] = GetMinMaxUnsigned(A, Sa);
auto [Bmin, Bmax] = GetMinMaxUnsigned(B, Sb);
Value *S1 = IRB.CreateICmp(I.getUnsignedPredicate(), Amin, Bmax);
Value *S2 = IRB.CreateICmp(I.getUnsignedPredicate(), Amax, Bmin);
Value *Si = IRB.CreateXor(S1, S2);
setShadow(&I, Si);
setOriginForNaryOp(I);
}
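The interval reasoning above can be sketched for the unsigned case (the signed case only adds the sign flip) with a small model; `ultCmpShadow` is a hypothetical name for this illustration:

```cpp
#include <cassert>
#include <cstdint>

// Model of handleRelationalComparisonExact for an unsigned '<' compare: with
// undefined (shadow) bits S, an operand V can take any value in the interval
// [V & ~S, V | S]. The comparison is defined iff both interval extremes
// agree, so the 1-bit shadow is S1 xor S2.
bool ultCmpShadow(uint8_t A, uint8_t Sa, uint8_t B, uint8_t Sb) {
  uint8_t Amin = A & static_cast<uint8_t>(~Sa), Amax = A | Sa;
  uint8_t Bmin = B & static_cast<uint8_t>(~Sb), Bmax = B | Sb;
  bool S1 = Amin < Bmax;
  bool S2 = Amax < Bmin;
  return S1 != S2;
}
```

For instance, a fully-undefined A compared against a defined 0x80 is undefined (A could land on either side), while A in [0x00, 0x0F] is always below 0x80, so the result is defined despite A's poisoned low bits.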
/// Instrument signed relational comparisons.
///
/// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
/// bit of the shadow. Everything else is delegated to handleShadowOr().
void handleSignedRelationalComparison(ICmpInst &I) {
Constant *constOp;
Value *op = nullptr;
CmpInst::Predicate pre;
if ((constOp = dyn_cast<Constant>(I.getOperand(1)))) {
op = I.getOperand(0);
pre = I.getPredicate();
} else if ((constOp = dyn_cast<Constant>(I.getOperand(0)))) {
op = I.getOperand(1);
pre = I.getSwappedPredicate();
} else {
handleShadowOr(I);
return;
}
if ((constOp->isNullValue() &&
(pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
(constOp->isAllOnesValue() &&
(pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
IRBuilder<> IRB(&I);
Value *Shadow = IRB.CreateICmpSLT(getShadow(op), getCleanShadow(op),
"_msprop_icmp_s");
setShadow(&I, Shadow);
setOrigin(&I, getOrigin(op));
} else {
handleShadowOr(I);
}
}
void visitICmpInst(ICmpInst &I) {
if (!ClHandleICmp) {
handleShadowOr(I);
return;
}
if (I.isEquality()) {
handleEqualityComparison(I);
return;
}
assert(I.isRelational());
if (ClHandleICmpExact) {
handleRelationalComparisonExact(I);
return;
}
if (I.isSigned()) {
handleSignedRelationalComparison(I);
return;
}
assert(I.isUnsigned());
if ((isa<Constant>(I.getOperand(0)) || isa<Constant>(I.getOperand(1)))) {
handleRelationalComparisonExact(I);
return;
}
handleShadowOr(I);
}
void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
void handleShift(BinaryOperator &I) {
IRBuilder<> IRB(&I);
// If any of the S2 bits are poisoned, the whole thing is poisoned.
// Otherwise perform the same shift on S1.
Value *S1 = getShadow(&I, 0);
Value *S2 = getShadow(&I, 1);
Value *S2Conv =
IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
Value *V2 = I.getOperand(1);
Value *Shift = IRB.CreateBinOp(I.getOpcode(), S1, V2);
setShadow(&I, IRB.CreateOr(Shift, S2Conv));
setOriginForNaryOp(I);
}
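The shift rule above can likewise be sketched with plain integers (illustrative only; `shlShadow` is a hypothetical name, and the shift amount is assumed to be in range):

```cpp
#include <cassert>
#include <cstdint>

// Model of handleShift for 'shl': if any bit of the shift amount is
// undefined (S2 != 0), every result bit is undefined; otherwise the value's
// shadow is shifted by the same, fully-defined amount. V2 must be < 8 here,
// mirroring that only in-range shifts are well defined.
uint8_t shlShadow(uint8_t S1, uint8_t V2, uint8_t S2) {
  uint8_t S2Conv = (S2 != 0) ? 0xFF : 0x00;
  return static_cast<uint8_t>(S1 << V2) | S2Conv;
}
```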
void visitShl(BinaryOperator &I) { handleShift(I); }
void visitAShr(BinaryOperator &I) { handleShift(I); }
void visitLShr(BinaryOperator &I) { handleShift(I); }
void handleFunnelShift(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
// If any of the S2 bits are poisoned, the whole thing is poisoned.
// Otherwise perform the same shift on S0 and S1.
Value *S0 = getShadow(&I, 0);
Value *S1 = getShadow(&I, 1);
Value *S2 = getShadow(&I, 2);
Value *S2Conv =
IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
Value *V2 = I.getOperand(2);
Value *Shift = IRB.CreateIntrinsic(I.getIntrinsicID(), S2Conv->getType(),
{S0, S1, V2});
setShadow(&I, IRB.CreateOr(Shift, S2Conv));
setOriginForNaryOp(I);
}
/// Instrument llvm.memmove
///
/// At this point we don't know if llvm.memmove will be inlined or not.
/// If we don't instrument it and it gets inlined,
/// our interceptor will not kick in and we will lose the memmove.
/// If we instrument the call here, but it does not get inlined,
/// we will memmove the shadow twice, which is bad in case
/// of overlapping regions. So, we simply lower the intrinsic to a call.
///
/// Similar situation exists for memcpy and memset.
void visitMemMoveInst(MemMoveInst &I) {
getShadow(I.getArgOperand(1)); // Ensure shadow initialized
IRBuilder<> IRB(&I);
IRB.CreateCall(MS.MemmoveFn,
{I.getArgOperand(0), I.getArgOperand(1),
IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
I.eraseFromParent();
}
/// Instrument memcpy
///
/// Similar to memmove: avoid copying shadow twice. This is somewhat
/// unfortunate as it may slow down small constant memcpys.
/// FIXME: consider doing manual inline for small constant sizes and proper
/// alignment.
///
/// Note: This also handles memcpy.inline, which promises no calls to external
/// functions as an optimization. However, with instrumentation enabled this
/// is difficult to promise; additionally, we know that the MSan runtime
/// exists and provides __msan_memcpy(). Therefore, we assume that with
/// instrumentation it's safe to turn memcpy.inline into a call to
/// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
/// itself, instrumentation should be disabled with the no_sanitize attribute.
void visitMemCpyInst(MemCpyInst &I) {
getShadow(I.getArgOperand(1)); // Ensure shadow initialized
IRBuilder<> IRB(&I);
IRB.CreateCall(MS.MemcpyFn,
{I.getArgOperand(0), I.getArgOperand(1),
IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
I.eraseFromParent();
}
// Same as memcpy.
void visitMemSetInst(MemSetInst &I) {
IRBuilder<> IRB(&I);
IRB.CreateCall(
MS.MemsetFn,
{I.getArgOperand(0),
IRB.CreateIntCast(I.getArgOperand(1), IRB.getInt32Ty(), false),
IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
I.eraseFromParent();
}
void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
/// Handle vector store-like intrinsics.
///
/// Instrument intrinsics that look like a simple SIMD store: writes memory,
/// has 1 pointer argument and 1 vector argument, returns void.
bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 2);
IRBuilder<> IRB(&I);
Value *Addr = I.getArgOperand(0);
Value *Shadow = getShadow(&I, 1);
Value *ShadowPtr, *OriginPtr;
// We don't know the pointer alignment (could be unaligned SSE store!).
// Have to assume the worst case.
std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
Addr, IRB, Shadow->getType(), Align(1), /*isStore*/ true);
IRB.CreateAlignedStore(Shadow, ShadowPtr, Align(1));
if (ClCheckAccessAddress)
insertCheckShadowOf(Addr, &I);
// FIXME: factor out common code from materializeStores
if (MS.TrackOrigins)
IRB.CreateStore(getOrigin(&I, 1), OriginPtr);
return true;
}
/// Handle vector load-like intrinsics.
///
/// Instrument intrinsics that look like a simple SIMD load: reads memory,
/// has 1 pointer argument, returns a vector.
bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 1);
IRBuilder<> IRB(&I);
Value *Addr = I.getArgOperand(0);
Type *ShadowTy = getShadowTy(&I);
Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
if (PropagateShadow) {
// We don't know the pointer alignment (could be unaligned SSE load!).
// Have to assume the worst case.
const Align Alignment = Align(1);
std::tie(ShadowPtr, OriginPtr) =
getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
setShadow(&I,
IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
} else {
setShadow(&I, getCleanShadow(&I));
}
if (ClCheckAccessAddress)
insertCheckShadowOf(Addr, &I);
if (MS.TrackOrigins) {
if (PropagateShadow)
setOrigin(&I, IRB.CreateLoad(MS.OriginTy, OriginPtr));
else
setOrigin(&I, getCleanOrigin());
}
return true;
}
/// Handle (SIMD arithmetic)-like intrinsics.
///
/// Instrument intrinsics with any number of arguments of the same type [*],
/// equal to the return type, plus a specified number of trailing flags of
/// any type.
///
/// [*] The type should be simple (no aggregates or pointers; vectors are
/// fine).
///
/// Caller guarantees that this intrinsic does not access memory.
///
/// TODO: "horizontal"/"pairwise" intrinsics are often incorrectly matched
/// by this handler. See horizontalReduce().
///
/// TODO: permutation intrinsics are also often incorrectly matched.
[[maybe_unused]] bool
maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I,
unsigned int trailingFlags) {
Type *RetTy = I.getType();
if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy()))
return false;
unsigned NumArgOperands = I.arg_size();
assert(NumArgOperands >= trailingFlags);
for (unsigned i = 0; i < NumArgOperands - trailingFlags; ++i) {
Type *Ty = I.getArgOperand(i)->getType();
if (Ty != RetTy)
return false;
}
IRBuilder<> IRB(&I);
ShadowAndOriginCombiner SC(this, IRB);
for (unsigned i = 0; i < NumArgOperands; ++i)
SC.Add(I.getArgOperand(i));
SC.Done(&I);
return true;
}
/// Heuristically instrument unknown intrinsics.
///
/// The main purpose of this code is to do something reasonable with all
/// random intrinsics we might encounter, most importantly - SIMD intrinsics.
/// We recognize several classes of intrinsics by their argument types and
/// ModRefBehaviour and apply special instrumentation when we are reasonably
/// sure that we know what the intrinsic does.
///
/// We special-case intrinsics where this approach fails. See llvm.bswap
/// handling as an example of that.
bool handleUnknownIntrinsicUnlogged(IntrinsicInst &I) {
unsigned NumArgOperands = I.arg_size();
if (NumArgOperands == 0)
return false;
if (NumArgOperands == 2 && I.getArgOperand(0)->getType()->isPointerTy() &&
I.getArgOperand(1)->getType()->isVectorTy() &&
I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
// This looks like a vector store.
return handleVectorStoreIntrinsic(I);
}
if (NumArgOperands == 1 && I.getArgOperand(0)->getType()->isPointerTy() &&
I.getType()->isVectorTy() && I.onlyReadsMemory()) {
// This looks like a vector load.
return handleVectorLoadIntrinsic(I);
}
if (I.doesNotAccessMemory())
if (maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/0))
return true;
// FIXME: detect and handle SSE maskstore/maskload?
// Some cases are now handled in handleAVXMasked{Load,Store}.
return false;
}
bool handleUnknownIntrinsic(IntrinsicInst &I) {
if (handleUnknownIntrinsicUnlogged(I)) {
if (ClDumpHeuristicInstructions)
dumpInst(I);
LLVM_DEBUG(dbgs() << "UNKNOWN INSTRUCTION HANDLED HEURISTICALLY: " << I
<< "\n");
return true;
} else
return false;
}
void handleInvariantGroup(IntrinsicInst &I) {
setShadow(&I, getShadow(&I, 0));
setOrigin(&I, getOrigin(&I, 0));
}
void handleLifetimeStart(IntrinsicInst &I) {
if (!PoisonStack)
return;
AllocaInst *AI = dyn_cast<AllocaInst>(I.getArgOperand(0));
if (AI)
LifetimeStartList.push_back(std::make_pair(&I, AI));
}
void handleBswap(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Op = I.getArgOperand(0);
Type *OpType = Op->getType();
setShadow(&I, IRB.CreateIntrinsic(Intrinsic::bswap, ArrayRef(&OpType, 1),
getShadow(Op)));
setOrigin(&I, getOrigin(Op));
}
// Uninitialized bits are ok if they appear after the leading/trailing 0's
// and a 1. If the input is all zero, it is fully initialized iff
// !is_zero_poison.
//
// e.g., for ctlz, with little-endian, if 0/1 are initialized bits with
// concrete value 0/1, and ? is an uninitialized bit:
// - 0001 0??? is fully initialized
// - 000? ???? is fully uninitialized (*)
// - ???? ???? is fully uninitialized
// - 0000 0000 is fully uninitialized if is_zero_poison,
// fully initialized otherwise
//
// (*) TODO: arguably, since the number of zeros is in the range [3, 8], we
// only need to poison 4 bits.
//
// OutputShadow =
// ((ConcreteZerosCount >= ShadowZerosCount) && !AllZeroShadow)
// || (is_zero_poison && AllZeroSrc)
void handleCountLeadingTrailingZeros(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Src = I.getArgOperand(0);
Value *SrcShadow = getShadow(Src);
Value *False = IRB.getInt1(false);
Value *ConcreteZerosCount = IRB.CreateIntrinsic(
I.getType(), I.getIntrinsicID(), {Src, /*is_zero_poison=*/False});
Value *ShadowZerosCount = IRB.CreateIntrinsic(
I.getType(), I.getIntrinsicID(), {SrcShadow, /*is_zero_poison=*/False});
Value *CompareConcreteZeros = IRB.CreateICmpUGE(
ConcreteZerosCount, ShadowZerosCount, "_mscz_cmp_zeros");
Value *NotAllZeroShadow =
IRB.CreateIsNotNull(SrcShadow, "_mscz_shadow_not_null");
Value *OutputShadow =
IRB.CreateAnd(CompareConcreteZeros, NotAllZeroShadow, "_mscz_main");
// If zero poison is requested, mix in with the shadow
Constant *IsZeroPoison = cast<Constant>(I.getOperand(1));
if (!IsZeroPoison->isZeroValue()) {
Value *BoolZeroPoison = IRB.CreateIsNull(Src, "_mscz_bzp");
OutputShadow = IRB.CreateOr(OutputShadow, BoolZeroPoison, "_mscz_bs");
}
OutputShadow = IRB.CreateSExt(OutputShadow, getShadowTy(Src), "_mscz_os");
setShadow(&I, OutputShadow);
setOriginForNaryOp(I);
}
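The rule in the comment above can be checked against its own examples with a small model (illustrative only; `clz8` and `ctlzShadow` are hypothetical names):

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros of an 8-bit value; returns 8 for 0.
unsigned clz8(uint8_t X) {
  unsigned N = 0;
  for (int I = 7; I >= 0 && !((X >> I) & 1); --I)
    ++N;
  return N;
}

// Model of handleCountLeadingTrailingZeros for ctlz: the count is defined
// iff a defined 1 bit appears before the first undefined bit, i.e. the
// concrete value has strictly fewer leading zeros than the shadow has.
bool ctlzShadow(uint8_t Src, uint8_t Shadow, bool IsZeroPoison) {
  bool Poisoned = (clz8(Src) >= clz8(Shadow)) && Shadow != 0;
  if (IsZeroPoison)
    Poisoned = Poisoned || (Src == 0);
  return Poisoned;
}
```

The assertions below reuse the comment's four examples, giving the `?` bits arbitrary concrete values where needed.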
/// Handle Arm NEON vector convert intrinsics.
///
/// e.g., <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32(<4 x float>)
/// i32 @llvm.aarch64.neon.fcvtms.i32.f64(double)
///
/// For x86 SSE vector convert intrinsics, see
/// handleSSEVectorConvertIntrinsic().
void handleNEONVectorConvertIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 1);
IRBuilder<> IRB(&I);
Value *S0 = getShadow(&I, 0);
/// For scalars:
/// Since they are converting from floating-point to integer, the output is
/// - fully uninitialized if *any* bit of the input is uninitialized
/// - fully initialized if all bits of the input are initialized
/// We apply the same principle on a per-field basis for vectors.
Value *OutShadow = IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)),
getShadowTy(&I));
setShadow(&I, OutShadow);
setOriginForNaryOp(I);
}
/// Some instructions have additional zero-elements in the return type
/// e.g., <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512(<8 x i64>, ...)
///
/// This function will return a vector type with the same number of elements
/// as the input, but the same per-element width as the return value, e.g.,
/// <8 x i8>.
FixedVectorType *maybeShrinkVectorShadowType(Value *Src, IntrinsicInst &I) {
assert(isa<FixedVectorType>(getShadowTy(&I)));
FixedVectorType *ShadowType = cast<FixedVectorType>(getShadowTy(&I));
// TODO: generalize beyond 2x?
if (ShadowType->getElementCount() ==
cast<VectorType>(Src->getType())->getElementCount() * 2)
ShadowType = FixedVectorType::getHalfElementsVectorType(ShadowType);
assert(ShadowType->getElementCount() ==
cast<VectorType>(Src->getType())->getElementCount());
return ShadowType;
}
/// Doubles the length of a vector shadow (filled with zeros) if necessary to
/// match the length of the shadow for the instruction.
/// This is more type-safe than CreateShadowCast().
Value *maybeExtendVectorShadowWithZeros(Value *Shadow, IntrinsicInst &I) {
IRBuilder<> IRB(&I);
assert(isa<FixedVectorType>(Shadow->getType()));
assert(isa<FixedVectorType>(I.getType()));
Value *FullShadow = getCleanShadow(&I);
assert(cast<FixedVectorType>(Shadow->getType())->getNumElements() <=
cast<FixedVectorType>(FullShadow->getType())->getNumElements());
assert(cast<FixedVectorType>(Shadow->getType())->getScalarType() ==
cast<FixedVectorType>(FullShadow->getType())->getScalarType());
if (Shadow->getType() == FullShadow->getType()) {
FullShadow = Shadow;
} else {
// TODO: generalize beyond 2x?
SmallVector<int, 32> ShadowMask(
cast<FixedVectorType>(FullShadow->getType())->getNumElements());
std::iota(ShadowMask.begin(), ShadowMask.end(), 0);
// Append zeros
FullShadow =
IRB.CreateShuffleVector(Shadow, getCleanShadow(Shadow), ShadowMask);
}
return FullShadow;
}
/// Handle x86 SSE vector conversion.
///
/// e.g., single-precision to half-precision conversion:
/// <8 x i16> @llvm.x86.vcvtps2ph.256(<8 x float> %a0, i32 0)
/// <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float> %a0, i32 0)
///
/// floating-point to integer:
/// <4 x i32> @llvm.x86.sse2.cvtps2dq(<4 x float>)
/// <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>)
///
/// Note: if the output has more elements, they are zero-initialized (and
/// therefore the shadow will also be initialized).
///
/// This differs from handleSSEVectorConvertIntrinsic() because it
/// propagates uninitialized shadow (instead of checking the shadow).
void handleSSEVectorConvertIntrinsicByProp(IntrinsicInst &I,
bool HasRoundingMode) {
if (HasRoundingMode) {
assert(I.arg_size() == 2);
[[maybe_unused]] Value *RoundingMode = I.getArgOperand(1);
assert(RoundingMode->getType()->isIntegerTy());
} else {
assert(I.arg_size() == 1);
}
Value *Src = I.getArgOperand(0);
assert(Src->getType()->isVectorTy());
// The return type might have more elements than the input.
// Temporarily shrink the return type's number of elements.
VectorType *ShadowType = maybeShrinkVectorShadowType(Src, I);
IRBuilder<> IRB(&I);
Value *S0 = getShadow(&I, 0);
/// For scalars:
/// Since they are converting to and/or from floating-point, the output is:
/// - fully uninitialized if *any* bit of the input is uninitialized
/// - fully initialized if all bits of the input are initialized
/// We apply the same principle on a per-field basis for vectors.
Value *Shadow =
IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)), ShadowType);
// The return type might have more elements than the input.
// Extend the return type back to its original width if necessary.
Value *FullShadow = maybeExtendVectorShadowWithZeros(Shadow, I);
setShadow(&I, FullShadow);
setOriginForNaryOp(I);
}
// Instrument x86 SSE vector convert intrinsic.
//
// This function instruments intrinsics like cvtsi2ss:
// %Out = int_xxx_cvtyyy(%ConvertOp)
// or
// %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
// The intrinsic converts \p NumUsedElements elements of \p ConvertOp to the
// same number of \p Out elements, and (if it has 2 arguments) copies the rest
// of the elements from \p CopyOp.
// In most cases the conversion involves a floating-point value, which may
// trigger a hardware exception when not fully initialized. For this reason we
// require \p ConvertOp[0:NumUsedElements] to be fully initialized and trap
// otherwise.
// We copy the shadow of \p CopyOp[NumUsedElements:] to \p
// Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
// return a fully initialized value.
//
// For Arm NEON vector convert intrinsics, see
// handleNEONVectorConvertIntrinsic().
void handleSSEVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
bool HasRoundingMode = false) {
IRBuilder<> IRB(&I);
Value *CopyOp, *ConvertOp;
assert((!HasRoundingMode ||
isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
"Invalid rounding mode");
switch (I.arg_size() - HasRoundingMode) {
case 2:
CopyOp = I.getArgOperand(0);
ConvertOp = I.getArgOperand(1);
break;
case 1:
ConvertOp = I.getArgOperand(0);
CopyOp = nullptr;
break;
default:
llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
}
// The first *NumUsedElements* elements of ConvertOp are converted to the
// same number of output elements. The rest of the output is copied from
// CopyOp, or (if not available) filled with zeroes.
// Combine shadow for elements of ConvertOp that are used in this operation,
// and insert a check.
// FIXME: consider propagating shadow of ConvertOp, at least in the case of
// int->any conversion.
Value *ConvertShadow = getShadow(ConvertOp);
Value *AggShadow = nullptr;
if (ConvertOp->getType()->isVectorTy()) {
AggShadow = IRB.CreateExtractElement(
ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), 0));
for (int i = 1; i < NumUsedElements; ++i) {
Value *MoreShadow = IRB.CreateExtractElement(
ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), i));
AggShadow = IRB.CreateOr(AggShadow, MoreShadow);
}
} else {
AggShadow = ConvertShadow;
}
assert(AggShadow->getType()->isIntegerTy());
insertCheckShadow(AggShadow, getOrigin(ConvertOp), &I);
// Build result shadow by zero-filling parts of CopyOp shadow that come from
// ConvertOp.
if (CopyOp) {
assert(CopyOp->getType() == I.getType());
assert(CopyOp->getType()->isVectorTy());
Value *ResultShadow = getShadow(CopyOp);
Type *EltTy = cast<VectorType>(ResultShadow->getType())->getElementType();
for (int i = 0; i < NumUsedElements; ++i) {
ResultShadow = IRB.CreateInsertElement(
ResultShadow, ConstantInt::getNullValue(EltTy),
ConstantInt::get(IRB.getInt32Ty(), i));
}
setShadow(&I, ResultShadow);
setOrigin(&I, getOrigin(CopyOp));
} else {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
}
// Given a scalar or vector, extract lower 64 bits (or less), and return all
// zeroes if it is zero, and all ones otherwise.
Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
if (S->getType()->isVectorTy())
S = CreateShadowCast(IRB, S, IRB.getInt64Ty(), /* Signed */ true);
assert(S->getType()->getPrimitiveSizeInBits() <= 64);
Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
return CreateShadowCast(IRB, S2, T, /* Signed */ true);
}
// Given a vector, extract its first element, and return all
// zeroes if it is zero, and all ones otherwise.
Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
Value *S1 = IRB.CreateExtractElement(S, (uint64_t)0);
Value *S2 = IRB.CreateICmpNE(S1, getCleanShadow(S1));
return CreateShadowCast(IRB, S2, T, /* Signed */ true);
}
Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
Type *T = S->getType();
assert(T->isVectorTy());
Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
return IRB.CreateSExt(S2, T);
}
// Instrument vector shift intrinsic.
//
// This function instruments intrinsics like int_x86_avx2_psll_w.
// Intrinsic shifts %In by %ShiftSize bits.
// %ShiftSize may be a vector. In that case the lower 64 bits determine shift
// size, and the rest is ignored. Behavior is defined even if shift size is
// greater than register (or field) width.
void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
assert(I.arg_size() == 2);
IRBuilder<> IRB(&I);
// If any of the S2 bits are poisoned, the whole thing is poisoned.
// Otherwise perform the same shift on S1.
Value *S1 = getShadow(&I, 0);
Value *S2 = getShadow(&I, 1);
Value *S2Conv = Variable ? VariableShadowExtend(IRB, S2)
: Lower64ShadowExtend(IRB, S2, getShadowTy(&I));
Value *V1 = I.getOperand(0);
Value *V2 = I.getOperand(1);
Value *Shift = IRB.CreateCall(I.getFunctionType(), I.getCalledOperand(),
{IRB.CreateBitCast(S1, V1->getType()), V2});
Shift = IRB.CreateBitCast(Shift, getShadowTy(&I));
setShadow(&I, IRB.CreateOr(Shift, S2Conv));
setOriginForNaryOp(I);
}
// Get an MMX-sized (64-bit) vector type, or optionally, other sized
// vectors.
Type *getMMXVectorTy(unsigned EltSizeInBits,
unsigned X86_MMXSizeInBits = 64) {
assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
"Illegal MMX vector element size");
return FixedVectorType::get(IntegerType::get(*MS.C, EltSizeInBits),
X86_MMXSizeInBits / EltSizeInBits);
}
// Returns a signed counterpart for an (un)signed-saturate-and-pack
// intrinsic.
Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
switch (id) {
case Intrinsic::x86_sse2_packsswb_128:
case Intrinsic::x86_sse2_packuswb_128:
return Intrinsic::x86_sse2_packsswb_128;
case Intrinsic::x86_sse2_packssdw_128:
case Intrinsic::x86_sse41_packusdw:
return Intrinsic::x86_sse2_packssdw_128;
case Intrinsic::x86_avx2_packsswb:
case Intrinsic::x86_avx2_packuswb:
return Intrinsic::x86_avx2_packsswb;
case Intrinsic::x86_avx2_packssdw:
case Intrinsic::x86_avx2_packusdw:
return Intrinsic::x86_avx2_packssdw;
case Intrinsic::x86_mmx_packsswb:
case Intrinsic::x86_mmx_packuswb:
return Intrinsic::x86_mmx_packsswb;
case Intrinsic::x86_mmx_packssdw:
return Intrinsic::x86_mmx_packssdw;
default:
llvm_unreachable("unexpected intrinsic id");
}
}
// Instrument vector pack intrinsic.
//
// This function instruments intrinsics like x86_mmx_packsswb, that
// packs elements of 2 input vectors into half as many bits with saturation.
// Shadow is propagated with the signed variant of the same intrinsic applied
// to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
// MMXEltSizeInBits is used only for x86mmx arguments.
void handleVectorPackIntrinsic(IntrinsicInst &I,
unsigned MMXEltSizeInBits = 0) {
assert(I.arg_size() == 2);
IRBuilder<> IRB(&I);
Value *S1 = getShadow(&I, 0);
Value *S2 = getShadow(&I, 1);
assert(S1->getType()->isVectorTy());
// SExt and ICmpNE below must apply to individual elements of input vectors.
// In case of x86mmx arguments, cast them to appropriate vector types and
// back.
Type *T =
MMXEltSizeInBits ? getMMXVectorTy(MMXEltSizeInBits) : S1->getType();
if (MMXEltSizeInBits) {
S1 = IRB.CreateBitCast(S1, T);
S2 = IRB.CreateBitCast(S2, T);
}
Value *S1_ext =
IRB.CreateSExt(IRB.CreateICmpNE(S1, Constant::getNullValue(T)), T);
Value *S2_ext =
IRB.CreateSExt(IRB.CreateICmpNE(S2, Constant::getNullValue(T)), T);
if (MMXEltSizeInBits) {
S1_ext = IRB.CreateBitCast(S1_ext, getMMXVectorTy(64));
S2_ext = IRB.CreateBitCast(S2_ext, getMMXVectorTy(64));
}
Value *S = IRB.CreateIntrinsic(getSignedPackIntrinsic(I.getIntrinsicID()),
{S1_ext, S2_ext}, /*FMFSource=*/nullptr,
"_msprop_vector_pack");
if (MMXEltSizeInBits)
S = IRB.CreateBitCast(S, getShadowTy(&I));
setShadow(&I, S);
setOriginForNaryOp(I);
}
// Convert `Mask` into `<n x i1>`.
Constant *createDppMask(unsigned Width, unsigned Mask) {
SmallVector<Constant *, 4> R(Width);
for (auto &M : R) {
M = ConstantInt::getBool(F.getContext(), Mask & 1);
Mask >>= 1;
}
return ConstantVector::get(R);
}
// Calculate output shadow as array of booleans `<n x i1>`, assuming if any
// arg is poisoned, entire dot product is poisoned.
Value *findDppPoisonedOutput(IRBuilder<> &IRB, Value *S, unsigned SrcMask,
unsigned DstMask) {
const unsigned Width =
cast<FixedVectorType>(S->getType())->getNumElements();
S = IRB.CreateSelect(createDppMask(Width, SrcMask), S,
Constant::getNullValue(S->getType()));
Value *SElem = IRB.CreateOrReduce(S);
Value *IsClean = IRB.CreateIsNull(SElem, "_msdpp");
Value *DstMaskV = createDppMask(Width, DstMask);
return IRB.CreateSelect(
IsClean, Constant::getNullValue(DstMaskV->getType()), DstMaskV);
}
// See `Intel Intrinsics Guide` for `_dp_p*` instructions.
//
// The 2 and 4 element versions produce a single scalar dot product and then
// put it into the elements of the output vector selected by the 4 lowest
// bits of the mask. The top 4 bits of the mask control which elements of the
// input to use for the dot product.
//
// The 8 element version's mask still has only 4 bits for input and 4 bits
// for the output mask. According to the spec, it operates as the 4 element
// version on the first 4 elements of the inputs and output, and then on the
// last 4 elements of the inputs and output.
void handleDppIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *S0 = getShadow(&I, 0);
Value *S1 = getShadow(&I, 1);
Value *S = IRB.CreateOr(S0, S1);
const unsigned Width =
cast<FixedVectorType>(S->getType())->getNumElements();
assert(Width == 2 || Width == 4 || Width == 8);
const unsigned Mask = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
const unsigned SrcMask = Mask >> 4;
const unsigned DstMask = Mask & 0xf;
// Calculate shadow as `<n x i1>`.
Value *SI1 = findDppPoisonedOutput(IRB, S, SrcMask, DstMask);
if (Width == 8) {
// The first 4 elements of the shadow are already calculated.
// `findDppPoisonedOutput` operates on 32-bit masks, so we can just shift
// the masks and repeat.
SI1 = IRB.CreateOr(
SI1, findDppPoisonedOutput(IRB, S, SrcMask << 4, DstMask << 4));
}
// Extend to the real size of the shadow, poisoning either all or none of
// the bits of an element.
S = IRB.CreateSExt(SI1, S->getType(), "_msdpp");
setShadow(&I, S);
setOriginForNaryOp(I);
}
Value *convertBlendvToSelectMask(IRBuilder<> &IRB, Value *C) {
C = CreateAppToShadowCast(IRB, C);
FixedVectorType *FVT = cast<FixedVectorType>(C->getType());
unsigned ElSize = FVT->getElementType()->getPrimitiveSizeInBits();
C = IRB.CreateAShr(C, ElSize - 1);
FVT = FixedVectorType::get(IRB.getInt1Ty(), FVT->getNumElements());
return IRB.CreateTrunc(C, FVT);
}
// `blendv(f, t, c)` is effectively `select(c[top_bit], t, f)`.
void handleBlendvIntrinsic(IntrinsicInst &I) {
Value *C = I.getOperand(2);
Value *T = I.getOperand(1);
Value *F = I.getOperand(0);
Value *Sc = getShadow(&I, 2);
Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
{
IRBuilder<> IRB(&I);
// Extract top bit from condition and its shadow.
C = convertBlendvToSelectMask(IRB, C);
Sc = convertBlendvToSelectMask(IRB, Sc);
setShadow(C, Sc);
setOrigin(C, Oc);
}
handleSelectLikeInst(I, C, T, F);
}
// Instrument sum-of-absolute-differences intrinsic.
void handleVectorSadIntrinsic(IntrinsicInst &I, bool IsMMX = false) {
const unsigned SignificantBitsPerResultElement = 16;
Type *ResTy = IsMMX ? IntegerType::get(*MS.C, 64) : I.getType();
unsigned ZeroBitsPerResultElement =
ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
IRBuilder<> IRB(&I);
auto *Shadow0 = getShadow(&I, 0);
auto *Shadow1 = getShadow(&I, 1);
Value *S = IRB.CreateOr(Shadow0, Shadow1);
S = IRB.CreateBitCast(S, ResTy);
S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
ResTy);
S = IRB.CreateLShr(S, ZeroBitsPerResultElement);
S = IRB.CreateBitCast(S, getShadowTy(&I));
setShadow(&I, S);
setOriginForNaryOp(I);
}
// Instrument multiply-add intrinsics.
//
// e.g., Two operands:
// <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a, <8 x i16> %b)
//
// Two operands which require an EltSizeInBits override:
// <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64> %a, <1 x i64> %b)
//
// Three operands are not implemented yet:
// <4 x i32> @llvm.x86.avx512.vpdpbusd.128
// (<4 x i32> %s, <4 x i32> %a, <4 x i32> %b)
// (the result of multiply-add'ing %a and %b is accumulated with %s)
void handleVectorPmaddIntrinsic(IntrinsicInst &I, unsigned ReductionFactor,
unsigned EltSizeInBits = 0) {
IRBuilder<> IRB(&I);
[[maybe_unused]] FixedVectorType *ReturnType =
cast<FixedVectorType>(I.getType());
assert(isa<FixedVectorType>(ReturnType));
assert(I.arg_size() == 2);
// Vectors A and B, and shadows
Value *Va = I.getOperand(0);
Value *Vb = I.getOperand(1);
Value *Sa = getShadow(&I, 0);
Value *Sb = getShadow(&I, 1);
FixedVectorType *ParamType =
cast<FixedVectorType>(I.getArgOperand(0)->getType());
assert(ParamType == I.getArgOperand(1)->getType());
assert(ParamType->getPrimitiveSizeInBits() ==
ReturnType->getPrimitiveSizeInBits());
FixedVectorType *ImplicitReturnType = ReturnType;
// Step 1: instrument multiplication of corresponding vector elements
if (EltSizeInBits) {
ImplicitReturnType = cast<FixedVectorType>(getMMXVectorTy(
EltSizeInBits * 2, ParamType->getPrimitiveSizeInBits()));
ParamType = cast<FixedVectorType>(
getMMXVectorTy(EltSizeInBits, ParamType->getPrimitiveSizeInBits()));
Va = IRB.CreateBitCast(Va, ParamType);
Vb = IRB.CreateBitCast(Vb, ParamType);
Sa = IRB.CreateBitCast(Sa, getShadowTy(ParamType));
Sb = IRB.CreateBitCast(Sb, getShadowTy(ParamType));
} else {
assert(ParamType->getNumElements() ==
ReturnType->getNumElements() * ReductionFactor);
}
// Multiplying an *initialized* zero by an uninitialized element results in
// an initialized zero element.
//
// This is analogous to bitwise AND, where "AND" of 0 and a poisoned value
// results in an unpoisoned value. We can therefore adapt the visitAnd()
// instrumentation:
// OutShadow = (SaNonZero & SbNonZero)
// | (VaNonZero & SbNonZero)
// | (SaNonZero & VbNonZero)
// where non-zero is checked on a per-element basis (not per bit).
Value *SZero = Constant::getNullValue(Sa->getType());
Value *VZero = Constant::getNullValue(Va->getType());
Value *SaNonZero = IRB.CreateICmpNE(Sa, SZero);
Value *SbNonZero = IRB.CreateICmpNE(Sb, SZero);
Value *VaNonZero = IRB.CreateICmpNE(Va, VZero);
Value *VbNonZero = IRB.CreateICmpNE(Vb, VZero);
Value *SaAndSbNonZero = IRB.CreateAnd(SaNonZero, SbNonZero);
Value *VaAndSbNonZero = IRB.CreateAnd(VaNonZero, SbNonZero);
Value *SaAndVbNonZero = IRB.CreateAnd(SaNonZero, VbNonZero);
// Each element of the vector is represented by a single bit (poisoned or
// not) e.g., <8 x i1>.
Value *And = IRB.CreateOr({SaAndSbNonZero, VaAndSbNonZero, SaAndVbNonZero});
// Extend <8 x i1> to <8 x i16>.
// (The real pmadd intrinsic would have computed intermediate values of
// <8 x i32>, but that is irrelevant for our shadow purposes because we
// consider each element to be either fully initialized or fully
// uninitialized.)
And = IRB.CreateSExt(And, Sa->getType());
// Step 2: instrument horizontal add
// We don't need bit-precise horizontalReduce because we only want to check
// if each pair of elements is fully zero.
// Cast to <4 x i32>.
Value *Horizontal = IRB.CreateBitCast(And, ImplicitReturnType);
// Compute <4 x i1>, then extend back to <4 x i32>.
Value *OutShadow = IRB.CreateSExt(
IRB.CreateICmpNE(Horizontal,
Constant::getNullValue(Horizontal->getType())),
ImplicitReturnType);
// For MMX, cast it back to the required fake return type (<1 x i64>).
if (EltSizeInBits)
OutShadow = CreateShadowCast(IRB, OutShadow, getShadowTy(&I));
setShadow(&I, OutShadow);
setOriginForNaryOp(I);
}
// Instrument compare-packed intrinsic.
// Basically, an or followed by sext(icmp ne 0) to end up with all-zeros or
// all-ones shadow.
void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Type *ResTy = getShadowTy(&I);
auto *Shadow0 = getShadow(&I, 0);
auto *Shadow1 = getShadow(&I, 1);
Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
Value *S = IRB.CreateSExt(
IRB.CreateICmpNE(S0, Constant::getNullValue(ResTy)), ResTy);
setShadow(&I, S);
setOriginForNaryOp(I);
}
// Instrument compare-scalar intrinsic.
// This handles both cmp* intrinsics which return the result in the first
// element of a vector, and comi* which return the result as i32.
void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
auto *Shadow0 = getShadow(&I, 0);
auto *Shadow1 = getShadow(&I, 1);
Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
Value *S = LowerElementShadowExtend(IRB, S0, getShadowTy(&I));
setShadow(&I, S);
setOriginForNaryOp(I);
}
// Instrument generic vector reduction intrinsics
// by ORing together all their fields.
//
// If AllowShadowCast is true, the return type does not need to be the same
// type as the fields,
// e.g., declare i32 @llvm.aarch64.neon.uaddv.i32.v16i8(<16 x i8>)
void handleVectorReduceIntrinsic(IntrinsicInst &I, bool AllowShadowCast) {
assert(I.arg_size() == 1);
IRBuilder<> IRB(&I);
Value *S = IRB.CreateOrReduce(getShadow(&I, 0));
if (AllowShadowCast)
S = CreateShadowCast(IRB, S, getShadowTy(&I));
else
assert(S->getType() == getShadowTy(&I));
setShadow(&I, S);
setOriginForNaryOp(I);
}
// Similar to handleVectorReduceIntrinsic but with an initial starting value.
// e.g., call float @llvm.vector.reduce.fadd.f32.v2f32(float %a0, <2 x float>
// %a1)
// shadow = shadow[a0] | shadow[a1.0] | shadow[a1.1]
//
// The type of the return value, initial starting value, and elements of the
// vector must be identical.
void handleVectorReduceWithStarterIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 2);
IRBuilder<> IRB(&I);
Value *Shadow0 = getShadow(&I, 0);
Value *Shadow1 = IRB.CreateOrReduce(getShadow(&I, 1));
assert(Shadow0->getType() == Shadow1->getType());
Value *S = IRB.CreateOr(Shadow0, Shadow1);
assert(S->getType() == getShadowTy(&I));
setShadow(&I, S);
setOriginForNaryOp(I);
}
// Instrument vector.reduce.or intrinsic.
// Valid (non-poisoned) set bits in the operand pull low the
// corresponding shadow bits.
void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 1);
IRBuilder<> IRB(&I);
Value *OperandShadow = getShadow(&I, 0);
Value *OperandUnsetBits = IRB.CreateNot(I.getOperand(0));
Value *OperandUnsetOrPoison = IRB.CreateOr(OperandUnsetBits, OperandShadow);
// Bit N is clean if any field's bit N is 1 and unpoisoned
Value *OutShadowMask = IRB.CreateAndReduce(OperandUnsetOrPoison);
// Otherwise, it is clean if every field's bit N is unpoisoned
Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
setShadow(&I, S);
setOrigin(&I, getOrigin(&I, 0));
}
// Instrument vector.reduce.and intrinsic.
// Valid (non-poisoned) unset bits in the operand pull down the
// corresponding shadow bits.
void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 1);
IRBuilder<> IRB(&I);
Value *OperandShadow = getShadow(&I, 0);
Value *OperandSetOrPoison = IRB.CreateOr(I.getOperand(0), OperandShadow);
// Bit N is clean if any field's bit N is 0 and unpoisoned
Value *OutShadowMask = IRB.CreateAndReduce(OperandSetOrPoison);
// Otherwise, it is clean if every field's bit N is unpoisoned
Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
setShadow(&I, S);
setOrigin(&I, getOrigin(&I, 0));
}
void handleStmxcsr(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Addr = I.getArgOperand(0);
Type *Ty = IRB.getInt32Ty();
Value *ShadowPtr =
getShadowOriginPtr(Addr, IRB, Ty, Align(1), /*isStore*/ true).first;
IRB.CreateStore(getCleanShadow(Ty), ShadowPtr);
if (ClCheckAccessAddress)
insertCheckShadowOf(Addr, &I);
}
void handleLdmxcsr(IntrinsicInst &I) {
if (!InsertChecks)
return;
IRBuilder<> IRB(&I);
Value *Addr = I.getArgOperand(0);
Type *Ty = IRB.getInt32Ty();
const Align Alignment = Align(1);
Value *ShadowPtr, *OriginPtr;
std::tie(ShadowPtr, OriginPtr) =
getShadowOriginPtr(Addr, IRB, Ty, Alignment, /*isStore*/ false);
if (ClCheckAccessAddress)
insertCheckShadowOf(Addr, &I);
Value *Shadow = IRB.CreateAlignedLoad(Ty, ShadowPtr, Alignment, "_ldmxcsr");
Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(MS.OriginTy, OriginPtr)
: getCleanOrigin();
insertCheckShadow(Shadow, Origin, &I);
}
void handleMaskedExpandLoad(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Ptr = I.getArgOperand(0);
MaybeAlign Align = I.getParamAlign(0);
Value *Mask = I.getArgOperand(1);
Value *PassThru = I.getArgOperand(2);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Ptr, &I);
insertCheckShadowOf(Mask, &I);
}
if (!PropagateShadow) {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
return;
}
Type *ShadowTy = getShadowTy(&I);
Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
auto [ShadowPtr, OriginPtr] =
getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ false);
Value *Shadow =
IRB.CreateMaskedExpandLoad(ShadowTy, ShadowPtr, Align, Mask,
getShadow(PassThru), "_msmaskedexpload");
setShadow(&I, Shadow);
// TODO: Store origins.
setOrigin(&I, getCleanOrigin());
}
void handleMaskedCompressStore(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Values = I.getArgOperand(0);
Value *Ptr = I.getArgOperand(1);
MaybeAlign Align = I.getParamAlign(1);
Value *Mask = I.getArgOperand(2);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Ptr, &I);
insertCheckShadowOf(Mask, &I);
}
Value *Shadow = getShadow(Values);
Type *ElementShadowTy =
getShadowTy(cast<VectorType>(Values->getType())->getElementType());
auto [ShadowPtr, OriginPtrs] =
getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ true);
IRB.CreateMaskedCompressStore(Shadow, ShadowPtr, Align, Mask);
// TODO: Store origins.
}
void handleMaskedGather(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Ptrs = I.getArgOperand(0);
const Align Alignment(
cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
Value *Mask = I.getArgOperand(2);
Value *PassThru = I.getArgOperand(3);
Type *PtrsShadowTy = getShadowTy(Ptrs);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Mask, &I);
Value *MaskedPtrShadow = IRB.CreateSelect(
Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
"_msmaskedptrs");
insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
}
if (!PropagateShadow) {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
return;
}
Type *ShadowTy = getShadowTy(&I);
Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ false);
Value *Shadow =
IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
getShadow(PassThru), "_msmaskedgather");
setShadow(&I, Shadow);
// TODO: Store origins.
setOrigin(&I, getCleanOrigin());
}
void handleMaskedScatter(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Values = I.getArgOperand(0);
Value *Ptrs = I.getArgOperand(1);
const Align Alignment(
cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
Value *Mask = I.getArgOperand(3);
Type *PtrsShadowTy = getShadowTy(Ptrs);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Mask, &I);
Value *MaskedPtrShadow = IRB.CreateSelect(
Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
"_msmaskedptrs");
insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
}
Value *Shadow = getShadow(Values);
Type *ElementShadowTy =
getShadowTy(cast<VectorType>(Values->getType())->getElementType());
auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ true);
IRB.CreateMaskedScatter(Shadow, ShadowPtrs, Alignment, Mask);
// TODO: Store origin.
}
// Intrinsic::masked_store
//
// Note: handleAVXMaskedStore handles AVX/AVX2 variants, though AVX512 masked
// stores are lowered to Intrinsic::masked_store.
void handleMaskedStore(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *V = I.getArgOperand(0);
Value *Ptr = I.getArgOperand(1);
const Align Alignment(
cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
Value *Mask = I.getArgOperand(3);
Value *Shadow = getShadow(V);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Ptr, &I);
insertCheckShadowOf(Mask, &I);
}
Value *ShadowPtr;
Value *OriginPtr;
std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
Ptr, IRB, Shadow->getType(), Alignment, /*isStore*/ true);
IRB.CreateMaskedStore(Shadow, ShadowPtr, Alignment, Mask);
if (!MS.TrackOrigins)
return;
auto &DL = F.getDataLayout();
paintOrigin(IRB, getOrigin(V), OriginPtr,
DL.getTypeStoreSize(Shadow->getType()),
std::max(Alignment, kMinOriginAlignment));
}
// Intrinsic::masked_load
//
// Note: handleAVXMaskedLoad handles AVX/AVX2 variants, though AVX512 masked
// loads are lowered to Intrinsic::masked_load.
void handleMaskedLoad(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Ptr = I.getArgOperand(0);
const Align Alignment(
cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
Value *Mask = I.getArgOperand(2);
Value *PassThru = I.getArgOperand(3);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Ptr, &I);
insertCheckShadowOf(Mask, &I);
}
if (!PropagateShadow) {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
return;
}
Type *ShadowTy = getShadowTy(&I);
Value *ShadowPtr, *OriginPtr;
std::tie(ShadowPtr, OriginPtr) =
getShadowOriginPtr(Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
setShadow(&I, IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
getShadow(PassThru), "_msmaskedld"));
if (!MS.TrackOrigins)
return;
// Choose between PassThru's and the loaded value's origins.
Value *MaskedPassThruShadow = IRB.CreateAnd(
getShadow(PassThru), IRB.CreateSExt(IRB.CreateNeg(Mask), ShadowTy));
Value *NotNull = convertToBool(MaskedPassThruShadow, IRB, "_mscmp");
Value *PtrOrigin = IRB.CreateLoad(MS.OriginTy, OriginPtr);
Value *Origin = IRB.CreateSelect(NotNull, getOrigin(PassThru), PtrOrigin);
setOrigin(&I, Origin);
}
// e.g., void @llvm.x86.avx.maskstore.ps.256(ptr, <8 x i32>, <8 x float>)
// dst mask src
//
// AVX512 masked stores are lowered to Intrinsic::masked_store and are
// handled by handleMaskedStore.
//
// This function handles AVX and AVX2 masked stores; these use the MSBs of a
// vector of integers, unlike the LLVM masked intrinsics, which require a
// vector of booleans. X86InstCombineIntrinsic.cpp::simplifyX86MaskedLoad
// mentions that the x86 backend does not know how to efficiently convert
// from a vector of booleans back into the AVX mask format; therefore, they
// (and we) do not reduce AVX/AVX2 masked intrinsics into LLVM masked
// intrinsics.
void handleAVXMaskedStore(IntrinsicInst &I) {
assert(I.arg_size() == 3);
IRBuilder<> IRB(&I);
Value *Dst = I.getArgOperand(0);
assert(Dst->getType()->isPointerTy() && "Destination is not a pointer!");
Value *Mask = I.getArgOperand(1);
assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
Value *Src = I.getArgOperand(2);
assert(isa<VectorType>(Src->getType()) && "Source is not a vector!");
const Align Alignment = Align(1);
Value *SrcShadow = getShadow(Src);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Dst, &I);
insertCheckShadowOf(Mask, &I);
}
Value *DstShadowPtr;
Value *DstOriginPtr;
std::tie(DstShadowPtr, DstOriginPtr) = getShadowOriginPtr(
Dst, IRB, SrcShadow->getType(), Alignment, /*isStore*/ true);
SmallVector<Value *, 2> ShadowArgs;
ShadowArgs.push_back(DstShadowPtr);
ShadowArgs.push_back(Mask);
// The intrinsic may require floating-point but shadows can be arbitrary
// bit patterns, of which some would be interpreted as "invalid"
// floating-point values (NaN etc.); we assume the intrinsic will happily
// copy them.
ShadowArgs.push_back(IRB.CreateBitCast(SrcShadow, Src->getType()));
CallInst *CI =
IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
setShadow(&I, CI);
if (!MS.TrackOrigins)
return;
// Approximation only
auto &DL = F.getDataLayout();
paintOrigin(IRB, getOrigin(Src), DstOriginPtr,
DL.getTypeStoreSize(SrcShadow->getType()),
std::max(Alignment, kMinOriginAlignment));
}
// e.g., <8 x float> @llvm.x86.avx.maskload.ps.256(ptr, <8 x i32>)
// return src mask
//
// Masked-off values are replaced with 0, which conveniently also represents
// initialized memory.
//
// AVX512 masked loads are lowered to Intrinsic::masked_load and are handled
// by handleMaskedLoad.
//
// We do not combine this with handleMaskedLoad; see comment in
// handleAVXMaskedStore for the rationale.
//
// This is subtly different than handleIntrinsicByApplyingToShadow(I, 1)
// because we need to apply getShadowOriginPtr, not getShadow, to the first
// parameter.
void handleAVXMaskedLoad(IntrinsicInst &I) {
assert(I.arg_size() == 2);
IRBuilder<> IRB(&I);
Value *Src = I.getArgOperand(0);
assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
Value *Mask = I.getArgOperand(1);
assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
const Align Alignment = Align(1);
if (ClCheckAccessAddress) {
insertCheckShadowOf(Mask, &I);
}
Type *SrcShadowTy = getShadowTy(Src);
Value *SrcShadowPtr, *SrcOriginPtr;
std::tie(SrcShadowPtr, SrcOriginPtr) =
getShadowOriginPtr(Src, IRB, SrcShadowTy, Alignment, /*isStore*/ false);
SmallVector<Value *, 2> ShadowArgs;
ShadowArgs.push_back(SrcShadowPtr);
ShadowArgs.push_back(Mask);
CallInst *CI =
IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(), ShadowArgs);
// The AVX masked load intrinsics do not have integer variants. We use the
// floating-point variants, which will happily copy the shadows even if
// they are interpreted as "invalid" floating-point values (NaN etc.).
setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
if (!MS.TrackOrigins)
return;
// The "pass-through" value is always zero (initialized). To the extent that
// this results in initialized aligned 4-byte chunks, the origin value is
// ignored. It is therefore correct to simply copy the origin from src.
Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
setOrigin(&I, PtrSrcOrigin);
}
// Test whether the mask indices are initialized, only checking the bits that
// are actually used.
//
// e.g., if Idx is <32 x i16>, only (log2(32) == 5) bits of each index are
// used/checked.
void maskedCheckAVXIndexShadow(IRBuilder<> &IRB, Value *Idx, Instruction *I) {
assert(isFixedIntVector(Idx));
auto IdxVectorSize =
cast<FixedVectorType>(Idx->getType())->getNumElements();
assert(isPowerOf2_64(IdxVectorSize));
// Compiler isn't smart enough, let's help it
if (isa<Constant>(Idx))
return;
auto *IdxShadow = getShadow(Idx);
Value *Truncated = IRB.CreateTrunc(
IdxShadow,
FixedVectorType::get(Type::getIntNTy(*MS.C, Log2_64(IdxVectorSize)),
IdxVectorSize));
insertCheckShadow(Truncated, getOrigin(Idx), I);
}
// Instrument AVX permutation intrinsic.
// We apply the same permutation (argument index 1) to the shadow.
void handleAVXVpermilvar(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Shadow = getShadow(&I, 0);
maskedCheckAVXIndexShadow(IRB, I.getArgOperand(1), &I);
// Shadows are integer-ish types but some intrinsics require a
// different (e.g., floating-point) type.
Shadow = IRB.CreateBitCast(Shadow, I.getArgOperand(0)->getType());
CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
{Shadow, I.getArgOperand(1)});
setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
setOriginForNaryOp(I);
}
// Instrument AVX permutation intrinsic.
// We apply the same permutation (argument index 1) to the shadows.
void handleAVXVpermi2var(IntrinsicInst &I) {
assert(I.arg_size() == 3);
assert(isa<FixedVectorType>(I.getArgOperand(0)->getType()));
assert(isa<FixedVectorType>(I.getArgOperand(1)->getType()));
assert(isa<FixedVectorType>(I.getArgOperand(2)->getType()));
[[maybe_unused]] auto ArgVectorSize =
cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
assert(cast<FixedVectorType>(I.getArgOperand(1)->getType())
->getNumElements() == ArgVectorSize);
assert(cast<FixedVectorType>(I.getArgOperand(2)->getType())
->getNumElements() == ArgVectorSize);
assert(I.getArgOperand(0)->getType() == I.getArgOperand(2)->getType());
assert(I.getType() == I.getArgOperand(0)->getType());
assert(I.getArgOperand(1)->getType()->isIntOrIntVectorTy());
IRBuilder<> IRB(&I);
Value *AShadow = getShadow(&I, 0);
Value *Idx = I.getArgOperand(1);
Value *BShadow = getShadow(&I, 2);
maskedCheckAVXIndexShadow(IRB, Idx, &I);
// Shadows are integer-ish types but some intrinsics require a
// different (e.g., floating-point) type.
AShadow = IRB.CreateBitCast(AShadow, I.getArgOperand(0)->getType());
BShadow = IRB.CreateBitCast(BShadow, I.getArgOperand(2)->getType());
CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
{AShadow, Idx, BShadow});
setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
setOriginForNaryOp(I);
}
[[maybe_unused]] static bool isFixedIntVectorTy(const Type *T) {
return isa<FixedVectorType>(T) && T->isIntOrIntVectorTy();
}
[[maybe_unused]] static bool isFixedFPVectorTy(const Type *T) {
return isa<FixedVectorType>(T) && T->isFPOrFPVectorTy();
}
[[maybe_unused]] static bool isFixedIntVector(const Value *V) {
return isFixedIntVectorTy(V->getType());
}
[[maybe_unused]] static bool isFixedFPVector(const Value *V) {
return isFixedFPVectorTy(V->getType());
}
// e.g., call <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
// (<16 x float> a, <16 x i32> writethru, i16 mask,
// i32 rounding)
//
// dst[i] = mask[i] ? convert(a[i]) : writethru[i]
// dst_shadow[i] = mask[i] ? all_or_nothing(a_shadow[i]) : writethru_shadow[i]
// where all_or_nothing(x) is fully uninitialized if x has any
// uninitialized bits
void handleAVX512VectorConvertFPToInt(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
assert(I.arg_size() == 4);
Value *A = I.getOperand(0);
Value *WriteThrough = I.getOperand(1);
Value *Mask = I.getOperand(2);
Value *RoundingMode = I.getOperand(3);
assert(isFixedFPVector(A));
assert(isFixedIntVector(WriteThrough));
unsigned ANumElements =
cast<FixedVectorType>(A->getType())->getNumElements();
assert(ANumElements ==
cast<FixedVectorType>(WriteThrough->getType())->getNumElements());
assert(Mask->getType()->isIntegerTy());
assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
insertCheckShadowOf(Mask, &I);
assert(RoundingMode->getType()->isIntegerTy());
// Only four bits of the rounding mode are used, though it's very
// unusual to have uninitialized bits there (more commonly, it's a
// constant).
insertCheckShadowOf(RoundingMode, &I);
assert(I.getType() == WriteThrough->getType());
// Convert i16 mask to <16 x i1>
Mask = IRB.CreateBitCast(
Mask, FixedVectorType::get(IRB.getInt1Ty(), ANumElements));
Value *AShadow = getShadow(A);
/// For scalars:
/// Since they are converting from floating-point, the output is:
/// - fully uninitialized if *any* bit of the input is uninitialized
/// - fully initialized if all bits of the input are initialized
/// We apply the same principle on a per-element basis for vectors.
AShadow = IRB.CreateSExt(IRB.CreateICmpNE(AShadow, getCleanShadow(A)),
getShadowTy(A));
Value *WriteThroughShadow = getShadow(WriteThrough);
Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
setShadow(&I, Shadow);
setOriginForNaryOp(I);
}
// Instrument BMI / BMI2 intrinsics.
// All of these intrinsics are Z = I(X, Y)
// where the types of all operands and the result match, and are either i32 or
// i64. The following instrumentation happens to work for all of them:
// Sz = I(Sx, Y) | (sext (Sy != 0))
void handleBmiIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Type *ShadowTy = getShadowTy(&I);
// If any bit of the mask operand is poisoned, then the whole thing is.
Value *SMask = getShadow(&I, 1);
SMask = IRB.CreateSExt(IRB.CreateICmpNE(SMask, getCleanShadow(ShadowTy)),
ShadowTy);
// Apply the same intrinsic to the shadow of the first operand.
Value *S = IRB.CreateCall(I.getCalledFunction(),
{getShadow(&I, 0), I.getOperand(1)});
S = IRB.CreateOr(SMask, S);
setShadow(&I, S);
setOriginForNaryOp(I);
}
static SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
SmallVector<int, 8> Mask;
for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
Mask.append(2, X);
}
return Mask;
}
// Instrument pclmul intrinsics.
// These intrinsics operate either on odd or on even elements of the input
// vectors, depending on the constant in the 3rd argument, ignoring the rest.
// Replace the unused elements with copies of the used ones, e.g.:
// (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
// or
// (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
// and then apply the usual shadow combining logic.
void handlePclmulIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
unsigned Width =
cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
assert(isa<ConstantInt>(I.getArgOperand(2)) &&
"pclmul 3rd operand must be a constant");
unsigned Imm = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
Value *Shuf0 = IRB.CreateShuffleVector(getShadow(&I, 0),
getPclmulMask(Width, Imm & 0x01));
Value *Shuf1 = IRB.CreateShuffleVector(getShadow(&I, 1),
getPclmulMask(Width, Imm & 0x10));
ShadowAndOriginCombiner SOC(this, IRB);
SOC.Add(Shuf0, getOrigin(&I, 0));
SOC.Add(Shuf1, getOrigin(&I, 1));
SOC.Done(&I);
}
// Instrument _mm_*_sd|ss intrinsics
void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
unsigned Width =
cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
Value *First = getShadow(&I, 0);
Value *Second = getShadow(&I, 1);
// First element of second operand, remaining elements of first operand
SmallVector<int, 16> Mask;
Mask.push_back(Width);
for (unsigned i = 1; i < Width; i++)
Mask.push_back(i);
Value *Shadow = IRB.CreateShuffleVector(First, Second, Mask);
setShadow(&I, Shadow);
setOriginForNaryOp(I);
}
void handleVtestIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Shadow0 = getShadow(&I, 0);
Value *Shadow1 = getShadow(&I, 1);
Value *Or = IRB.CreateOr(Shadow0, Shadow1);
Value *NZ = IRB.CreateICmpNE(Or, Constant::getNullValue(Or->getType()));
Value *Scalar = convertShadowToScalar(NZ, IRB);
Value *Shadow = IRB.CreateZExt(Scalar, getShadowTy(&I));
setShadow(&I, Shadow);
setOriginForNaryOp(I);
}
void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
unsigned Width =
cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
Value *First = getShadow(&I, 0);
Value *Second = getShadow(&I, 1);
Value *OrShadow = IRB.CreateOr(First, Second);
// First element of both OR'd together, remaining elements of first operand
SmallVector<int, 16> Mask;
Mask.push_back(Width);
for (unsigned i = 1; i < Width; i++)
Mask.push_back(i);
Value *Shadow = IRB.CreateShuffleVector(First, OrShadow, Mask);
setShadow(&I, Shadow);
setOriginForNaryOp(I);
}
// _mm_round_pd / _mm_round_ps.
// Similar to maybeHandleSimpleNomemIntrinsic, except the second argument is
// guaranteed to be a constant integer.
void handleRoundPdPsIntrinsic(IntrinsicInst &I) {
assert(I.getArgOperand(0)->getType() == I.getType());
assert(I.arg_size() == 2);
assert(isa<ConstantInt>(I.getArgOperand(1)));
IRBuilder<> IRB(&I);
ShadowAndOriginCombiner SC(this, IRB);
SC.Add(I.getArgOperand(0));
SC.Done(&I);
}
// Instrument @llvm.abs intrinsic.
//
// e.g., i32 @llvm.abs.i32 (i32 <Src>, i1 <is_int_min_poison>)
// <4 x i32> @llvm.abs.v4i32(<4 x i32> <Src>, i1 <is_int_min_poison>)
void handleAbsIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 2);
Value *Src = I.getArgOperand(0);
Value *IsIntMinPoison = I.getArgOperand(1);
assert(I.getType()->isIntOrIntVectorTy());
assert(Src->getType() == I.getType());
assert(IsIntMinPoison->getType()->isIntegerTy());
assert(IsIntMinPoison->getType()->getIntegerBitWidth() == 1);
IRBuilder<> IRB(&I);
Value *SrcShadow = getShadow(Src);
APInt MinVal =
APInt::getSignedMinValue(Src->getType()->getScalarSizeInBits());
Value *MinValVec = ConstantInt::get(Src->getType(), MinVal);
Value *SrcIsMin = IRB.CreateICmp(CmpInst::ICMP_EQ, Src, MinValVec);
Value *PoisonedShadow = getPoisonedShadow(Src);
Value *PoisonedIfIntMinShadow =
IRB.CreateSelect(SrcIsMin, PoisonedShadow, SrcShadow);
Value *Shadow =
IRB.CreateSelect(IsIntMinPoison, PoisonedIfIntMinShadow, SrcShadow);
setShadow(&I, Shadow);
setOrigin(&I, getOrigin(&I, 0));
}
void handleIsFpClass(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Shadow = getShadow(&I, 0);
setShadow(&I, IRB.CreateICmpNE(Shadow, getCleanShadow(Shadow)));
setOrigin(&I, getOrigin(&I, 0));
}
void handleArithmeticWithOverflow(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *Shadow0 = getShadow(&I, 0);
Value *Shadow1 = getShadow(&I, 1);
Value *ShadowElt0 = IRB.CreateOr(Shadow0, Shadow1);
Value *ShadowElt1 =
IRB.CreateICmpNE(ShadowElt0, getCleanShadow(ShadowElt0));
Value *Shadow = PoisonValue::get(getShadowTy(&I));
Shadow = IRB.CreateInsertValue(Shadow, ShadowElt0, 0);
Shadow = IRB.CreateInsertValue(Shadow, ShadowElt1, 1);
setShadow(&I, Shadow);
setOriginForNaryOp(I);
}
Value *extractLowerShadow(IRBuilder<> &IRB, Value *V) {
assert(isa<FixedVectorType>(V->getType()));
assert(cast<FixedVectorType>(V->getType())->getNumElements() > 0);
Value *Shadow = getShadow(V);
return IRB.CreateExtractElement(Shadow,
ConstantInt::get(IRB.getInt32Ty(), 0));
}
// Handle llvm.x86.avx512.mask.pmov{,s,us}.*.512
//
// e.g., call <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512
// (<8 x i64>, <16 x i8>, i8)
// A WriteThru Mask
//
// call <16 x i8> @llvm.x86.avx512.mask.pmovs.db.512
// (<16 x i32>, <16 x i8>, i16)
//
// Dst[i] = Mask[i] ? truncate_or_saturate(A[i]) : WriteThru[i]
// Dst_shadow[i] = Mask[i] ? truncate(A_shadow[i]) : WriteThru_shadow[i]
//
// If Dst has more elements than A, the excess elements are zeroed (and the
// corresponding shadow is initialized).
//
// Note: for PMOV (truncation), handleIntrinsicByApplyingToShadow is precise
// and is much faster than this handler.
void handleAVX512VectorDownConvert(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
assert(I.arg_size() == 3);
Value *A = I.getOperand(0);
Value *WriteThrough = I.getOperand(1);
Value *Mask = I.getOperand(2);
assert(isFixedIntVector(A));
assert(isFixedIntVector(WriteThrough));
unsigned ANumElements =
cast<FixedVectorType>(A->getType())->getNumElements();
unsigned OutputNumElements =
cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
assert(ANumElements == OutputNumElements ||
ANumElements * 2 == OutputNumElements);
assert(Mask->getType()->isIntegerTy());
assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
insertCheckShadowOf(Mask, &I);
assert(I.getType() == WriteThrough->getType());
// Widen the mask, if necessary, to have one bit per element of the output
// vector.
// We want the extra bits to have '1's, so that the CreateSelect will
// select the values from AShadow instead of WriteThroughShadow ("maskless"
// versions of the intrinsics are sometimes implemented using an all-1's
// mask and an undefined value for WriteThroughShadow). We accomplish this
// by using bitwise NOT before and after the ZExt.
if (ANumElements != OutputNumElements) {
Mask = IRB.CreateNot(Mask);
Mask = IRB.CreateZExt(Mask, Type::getIntNTy(*MS.C, OutputNumElements),
"_ms_widen_mask");
Mask = IRB.CreateNot(Mask);
}
Mask = IRB.CreateBitCast(
Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
Value *AShadow = getShadow(A);
// The return type might have more elements than the input.
// Temporarily shrink the return type's number of elements.
VectorType *ShadowType = maybeShrinkVectorShadowType(A, I);
// PMOV truncates; PMOVS/PMOVUS use signed/unsigned saturation.
// This handler treats them all as truncation, which leads to some rare
// false positives in the cases where the truncated bytes could
// unambiguously saturate the value e.g., if A = ??????10 ????????
// (big-endian), the unsigned saturated byte conversion is 11111111 i.e.,
// fully defined, but the truncated byte is ????????.
//
// TODO: use GetMinMaxUnsigned() to handle saturation precisely.
AShadow = IRB.CreateTrunc(AShadow, ShadowType, "_ms_trunc_shadow");
AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
Value *WriteThroughShadow = getShadow(WriteThrough);
Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
setShadow(&I, Shadow);
setOriginForNaryOp(I);
}
// For sh.* compiler intrinsics:
// llvm.x86.avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round
// (<8 x half>, <8 x half>, <8 x half>, i8, i32)
// A B WriteThru Mask RoundingMode
//
// DstShadow[0] = Mask[0] ? (AShadow[0] | BShadow[0]) : WriteThruShadow[0]
// DstShadow[1..7] = AShadow[1..7]
void visitGenericScalarHalfwordInst(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
assert(I.arg_size() == 5);
Value *A = I.getOperand(0);
Value *B = I.getOperand(1);
Value *WriteThrough = I.getOperand(2);
Value *Mask = I.getOperand(3);
Value *RoundingMode = I.getOperand(4);
// Technically, we could probably just check whether the LSB is
// initialized, but intuitively it feels like a partly uninitialized mask
// is unintended, and we should warn the user immediately.
insertCheckShadowOf(Mask, &I);
insertCheckShadowOf(RoundingMode, &I);
assert(isa<FixedVectorType>(A->getType()));
unsigned NumElements =
cast<FixedVectorType>(A->getType())->getNumElements();
assert(NumElements == 8);
assert(A->getType() == B->getType());
assert(B->getType() == WriteThrough->getType());
assert(Mask->getType()->getPrimitiveSizeInBits() == NumElements);
assert(RoundingMode->getType()->isIntegerTy());
Value *ALowerShadow = extractLowerShadow(IRB, A);
Value *BLowerShadow = extractLowerShadow(IRB, B);
Value *ABLowerShadow = IRB.CreateOr(ALowerShadow, BLowerShadow);
Value *WriteThroughLowerShadow = extractLowerShadow(IRB, WriteThrough);
Mask = IRB.CreateBitCast(
Mask, FixedVectorType::get(IRB.getInt1Ty(), NumElements));
Value *MaskLower =
IRB.CreateExtractElement(Mask, ConstantInt::get(IRB.getInt32Ty(), 0));
Value *AShadow = getShadow(A);
Value *DstLowerShadow =
IRB.CreateSelect(MaskLower, ABLowerShadow, WriteThroughLowerShadow);
Value *DstShadow = IRB.CreateInsertElement(
AShadow, DstLowerShadow, ConstantInt::get(IRB.getInt32Ty(), 0),
"_msprop");
setShadow(&I, DstShadow);
setOriginForNaryOp(I);
}
// Approximately handle AVX Galois Field Affine Transformation
//
// e.g.,
// <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8)
// <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8)
// <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8)
// Out A x b
// where A and x are packed matrices, b is a vector,
// Out = A * x + b in GF(2)
//
// Multiplication in GF(2) is equivalent to bitwise AND. However, the matrix
// computation also includes a parity calculation.
//
// For the bitwise AND of bits V1 and V2, the exact shadow is:
// Out_Shadow = (V1_Shadow & V2_Shadow)
// | (V1 & V2_Shadow)
// | (V1_Shadow & V2 )
//
// We approximate the shadow of gf2p8affineqb using:
// Out_Shadow = gf2p8affineqb(x_Shadow, A_shadow, 0)
// | gf2p8affineqb(x, A_shadow, 0)
// | gf2p8affineqb(x_Shadow, A, 0)
// | set1_epi8(b_Shadow)
//
// This approximation has false negatives: if an intermediate dot-product
// contains an even number of 1's, the parity is 0.
// It has no false positives.
void handleAVXGF2P8Affine(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
assert(I.arg_size() == 3);
Value *A = I.getOperand(0);
Value *X = I.getOperand(1);
Value *B = I.getOperand(2);
assert(isFixedIntVector(A));
assert(cast<VectorType>(A->getType())
->getElementType()
->getScalarSizeInBits() == 8);
assert(A->getType() == X->getType());
assert(B->getType()->isIntegerTy());
assert(B->getType()->getScalarSizeInBits() == 8);
assert(I.getType() == A->getType());
Value *AShadow = getShadow(A);
Value *XShadow = getShadow(X);
Value *BZeroShadow = getCleanShadow(B);
CallInst *AShadowXShadow = IRB.CreateIntrinsic(
I.getType(), I.getIntrinsicID(), {XShadow, AShadow, BZeroShadow});
CallInst *AShadowX = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
{X, AShadow, BZeroShadow});
CallInst *XShadowA = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
{XShadow, A, BZeroShadow});
unsigned NumElements = cast<FixedVectorType>(I.getType())->getNumElements();
Value *BShadow = getShadow(B);
Value *BBroadcastShadow = getCleanShadow(AShadow);
// There is no LLVM IR intrinsic for _mm512_set1_epi8.
// This loop generates a lot of LLVM IR, which we expect CodeGen will
// lower appropriately (e.g., VPBROADCASTB).
// Besides, b is often a constant, in which case it is fully initialized.
for (unsigned i = 0; i < NumElements; i++)
BBroadcastShadow = IRB.CreateInsertElement(BBroadcastShadow, BShadow, i);
setShadow(&I, IRB.CreateOr(
{AShadowXShadow, AShadowX, XShadowA, BBroadcastShadow}));
setOriginForNaryOp(I);
}
// Handle Arm NEON vector load intrinsics (vld*).
//
// The WithLane instructions (ld[234]lane) are similar to:
// call {<4 x i32>, <4 x i32>, <4 x i32>}
// @llvm.aarch64.neon.ld3lane.v4i32.p0
// (<4 x i32> %L1, <4 x i32> %L2, <4 x i32> %L3, i64 %lane, ptr
// %A)
//
// The non-WithLane instructions (ld[234], ld1x[234], ld[234]r) are similar
// to:
// call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
void handleNEONVectorLoad(IntrinsicInst &I, bool WithLane) {
unsigned int numArgs = I.arg_size();
// Return type is a struct of vectors of integers or floating-point
assert(I.getType()->isStructTy());
[[maybe_unused]] StructType *RetTy = cast<StructType>(I.getType());
assert(RetTy->getNumElements() > 0);
assert(RetTy->getElementType(0)->isIntOrIntVectorTy() ||
RetTy->getElementType(0)->isFPOrFPVectorTy());
for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
assert(RetTy->getElementType(i) == RetTy->getElementType(0));
if (WithLane) {
// 2, 3 or 4 vectors, plus lane number, plus input pointer
assert(4 <= numArgs && numArgs <= 6);
// Return type is a struct of the input vectors
assert(RetTy->getNumElements() + 2 == numArgs);
for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
assert(I.getArgOperand(i)->getType() == RetTy->getElementType(0));
} else {
assert(numArgs == 1);
}
IRBuilder<> IRB(&I);
SmallVector<Value *, 6> ShadowArgs;
if (WithLane) {
for (unsigned int i = 0; i < numArgs - 2; i++)
ShadowArgs.push_back(getShadow(I.getArgOperand(i)));
// Lane number, passed verbatim
Value *LaneNumber = I.getArgOperand(numArgs - 2);
ShadowArgs.push_back(LaneNumber);
// TODO: blend shadow of lane number into output shadow?
insertCheckShadowOf(LaneNumber, &I);
}
Value *Src = I.getArgOperand(numArgs - 1);
assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
Type *SrcShadowTy = getShadowTy(Src);
auto [SrcShadowPtr, SrcOriginPtr] =
getShadowOriginPtr(Src, IRB, SrcShadowTy, Align(1), /*isStore*/ false);
ShadowArgs.push_back(SrcShadowPtr);
// The NEON vector load instructions handled by this function all have
// integer variants. It is easier to use those rather than trying to cast
// a struct of vectors of floats into a struct of vectors of integers.
CallInst *CI =
IRB.CreateIntrinsic(getShadowTy(&I), I.getIntrinsicID(), ShadowArgs);
setShadow(&I, CI);
if (!MS.TrackOrigins)
return;
Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
setOrigin(&I, PtrSrcOrigin);
}
/// Handle Arm NEON vector store intrinsics (vst{2,3,4}, vst1x_{2,3,4},
/// and vst{2,3,4}lane).
///
/// Arm NEON vector store intrinsics have the output address (pointer) as the
/// last argument, with the initial arguments being the inputs (and lane
/// number for vst{2,3,4}lane). They return void.
///
/// - st4 interleaves the output e.g., st4 (inA, inB, inC, inD, outP) writes
/// abcdabcdabcdabcd... into *outP
/// - st1_x4 is non-interleaved e.g., st1_x4 (inA, inB, inC, inD, outP)
/// writes aaaa...bbbb...cccc...dddd... into *outP
/// - st4lane has arguments of (inA, inB, inC, inD, lane, outP)
/// These instructions can all be instrumented with essentially the same
/// MSan logic, simply by applying the corresponding intrinsic to the shadow.
void handleNEONVectorStoreIntrinsic(IntrinsicInst &I, bool useLane) {
IRBuilder<> IRB(&I);
// Don't use getNumOperands() because it includes the callee
int numArgOperands = I.arg_size();
// The last arg operand is the output (pointer)
assert(numArgOperands >= 1);
Value *Addr = I.getArgOperand(numArgOperands - 1);
assert(Addr->getType()->isPointerTy());
int skipTrailingOperands = 1;
if (ClCheckAccessAddress)
insertCheckShadowOf(Addr, &I);
// Second-last operand is the lane number (for vst{2,3,4}lane)
if (useLane) {
skipTrailingOperands++;
assert(numArgOperands >= static_cast<int>(skipTrailingOperands));
assert(isa<IntegerType>(
I.getArgOperand(numArgOperands - skipTrailingOperands)->getType()));
}
SmallVector<Value *, 8> ShadowArgs;
// All the initial operands are the inputs
for (int i = 0; i < numArgOperands - skipTrailingOperands; i++) {
assert(isa<FixedVectorType>(I.getArgOperand(i)->getType()));
Value *Shadow = getShadow(&I, i);
ShadowArgs.append(1, Shadow);
}
// MSan's getShadowTy assumes the LHS is the type we want the shadow for
// e.g., for:
// [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to i128
// we know the type of the output (and its shadow) is <16 x i8>.
//
// Arm NEON VST is unusual because the last argument is the output address:
// define void @st2_16b(<16 x i8> %A, <16 x i8> %B, ptr %P) {
// call void @llvm.aarch64.neon.st2.v16i8.p0
// (<16 x i8> [[A]], <16 x i8> [[B]], ptr [[P]])
// and we have no type information about P's operand. We must manually
// compute the type (<16 x i8> x 2).
FixedVectorType *OutputVectorTy = FixedVectorType::get(
cast<FixedVectorType>(I.getArgOperand(0)->getType())->getElementType(),
cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements() *
(numArgOperands - skipTrailingOperands));
Type *OutputShadowTy = getShadowTy(OutputVectorTy);
if (useLane)
ShadowArgs.append(1,
I.getArgOperand(numArgOperands - skipTrailingOperands));
Value *OutputShadowPtr, *OutputOriginPtr;
// AArch64 NEON does not need alignment (unless OS requires it)
std::tie(OutputShadowPtr, OutputOriginPtr) = getShadowOriginPtr(
Addr, IRB, OutputShadowTy, Align(1), /*isStore*/ true);
ShadowArgs.append(1, OutputShadowPtr);
CallInst *CI =
IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
setShadow(&I, CI);
if (MS.TrackOrigins) {
// TODO: if we modelled the vst* instruction more precisely, we could
// more accurately track the origins (e.g., if both inputs are
// uninitialized for vst2, we currently blame the second input, even
// though part of the output depends only on the first input).
//
// This is particularly imprecise for vst{2,3,4}lane, since only one
// lane of each input is actually copied to the output.
OriginCombiner OC(this, IRB);
for (int i = 0; i < numArgOperands - skipTrailingOperands; i++)
OC.Add(I.getArgOperand(i));
const DataLayout &DL = F.getDataLayout();
OC.DoneAndStoreOrigin(DL.getTypeStoreSize(OutputVectorTy),
OutputOriginPtr);
}
}
/// Handle intrinsics by applying the intrinsic to the shadows.
///
/// The trailing arguments are passed verbatim to the intrinsic, though any
/// uninitialized trailing arguments can also taint the shadow e.g., for an
/// intrinsic with one trailing verbatim argument:
/// out = intrinsic(var1, var2, opType)
/// we compute:
/// shadow[out] =
/// intrinsic(shadow[var1], shadow[var2], opType) | shadow[opType]
///
/// Typically, shadowIntrinsicID will be specified by the caller to be
/// I.getIntrinsicID(), but the caller can choose to replace it with another
/// intrinsic of the same type.
///
/// CAUTION: this assumes that the intrinsic will handle arbitrary
/// bit-patterns (for example, if the intrinsic accepts floats for
/// var1, we require that it doesn't care if inputs are NaNs).
///
/// For example, this can be applied to the Arm NEON vector table intrinsics
/// (tbl{1,2,3,4}).
///
/// The origin is approximated using setOriginForNaryOp.
void handleIntrinsicByApplyingToShadow(IntrinsicInst &I,
Intrinsic::ID shadowIntrinsicID,
unsigned int trailingVerbatimArgs) {
IRBuilder<> IRB(&I);
assert(trailingVerbatimArgs < I.arg_size());
SmallVector<Value *, 8> ShadowArgs;
// Don't use getNumOperands() because it includes the callee
for (unsigned int i = 0; i < I.arg_size() - trailingVerbatimArgs; i++) {
Value *Shadow = getShadow(&I, i);
// Shadows are integer-ish types but some intrinsics require a
// different (e.g., floating-point) type.
ShadowArgs.push_back(
IRB.CreateBitCast(Shadow, I.getArgOperand(i)->getType()));
}
for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
i++) {
Value *Arg = I.getArgOperand(i);
ShadowArgs.push_back(Arg);
}
CallInst *CI =
IRB.CreateIntrinsic(I.getType(), shadowIntrinsicID, ShadowArgs);
Value *CombinedShadow = CI;
// Combine the computed shadow with the shadow of trailing args
for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
i++) {
Value *Shadow =
CreateShadowCast(IRB, getShadow(&I, i), CombinedShadow->getType());
CombinedShadow = IRB.CreateOr(Shadow, CombinedShadow, "_msprop");
}
setShadow(&I, IRB.CreateBitCast(CombinedShadow, getShadowTy(&I)));
setOriginForNaryOp(I);
}
// Approximation only
//
// e.g., <16 x i8> @llvm.aarch64.neon.pmull64(i64, i64)
void handleNEONVectorMultiplyIntrinsic(IntrinsicInst &I) {
assert(I.arg_size() == 2);
handleShadowOr(I);
}
void visitIntrinsicInst(IntrinsicInst &I) {
switch (I.getIntrinsicID()) {
case Intrinsic::uadd_with_overflow:
case Intrinsic::sadd_with_overflow:
case Intrinsic::usub_with_overflow:
case Intrinsic::ssub_with_overflow:
case Intrinsic::umul_with_overflow:
case Intrinsic::smul_with_overflow:
handleArithmeticWithOverflow(I);
break;
case Intrinsic::abs:
handleAbsIntrinsic(I);
break;
case Intrinsic::bitreverse:
handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
/*trailingVerbatimArgs*/ 0);
break;
case Intrinsic::is_fpclass:
handleIsFpClass(I);
break;
case Intrinsic::lifetime_start:
handleLifetimeStart(I);
break;
case Intrinsic::launder_invariant_group:
case Intrinsic::strip_invariant_group:
handleInvariantGroup(I);
break;
case Intrinsic::bswap:
handleBswap(I);
break;
case Intrinsic::ctlz:
case Intrinsic::cttz:
handleCountLeadingTrailingZeros(I);
break;
case Intrinsic::masked_compressstore:
handleMaskedCompressStore(I);
break;
case Intrinsic::masked_expandload:
handleMaskedExpandLoad(I);
break;
case Intrinsic::masked_gather:
handleMaskedGather(I);
break;
case Intrinsic::masked_scatter:
handleMaskedScatter(I);
break;
case Intrinsic::masked_store:
handleMaskedStore(I);
break;
case Intrinsic::masked_load:
handleMaskedLoad(I);
break;
case Intrinsic::vector_reduce_and:
handleVectorReduceAndIntrinsic(I);
break;
case Intrinsic::vector_reduce_or:
handleVectorReduceOrIntrinsic(I);
break;
case Intrinsic::vector_reduce_add:
case Intrinsic::vector_reduce_xor:
case Intrinsic::vector_reduce_mul:
// Signed/Unsigned Min/Max
// TODO: handling similarly to AND/OR may be more precise.
case Intrinsic::vector_reduce_smax:
case Intrinsic::vector_reduce_smin:
case Intrinsic::vector_reduce_umax:
case Intrinsic::vector_reduce_umin:
// TODO: this has no false positives, but arguably we should check that all
// the bits are initialized.
case Intrinsic::vector_reduce_fmax:
case Intrinsic::vector_reduce_fmin:
handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/false);
break;
case Intrinsic::vector_reduce_fadd:
case Intrinsic::vector_reduce_fmul:
handleVectorReduceWithStarterIntrinsic(I);
break;
case Intrinsic::x86_sse_stmxcsr:
handleStmxcsr(I);
break;
case Intrinsic::x86_sse_ldmxcsr:
handleLdmxcsr(I);
break;
case Intrinsic::x86_avx512_vcvtsd2usi64:
case Intrinsic::x86_avx512_vcvtsd2usi32:
case Intrinsic::x86_avx512_vcvtss2usi64:
case Intrinsic::x86_avx512_vcvtss2usi32:
case Intrinsic::x86_avx512_cvttss2usi64:
case Intrinsic::x86_avx512_cvttss2usi:
case Intrinsic::x86_avx512_cvttsd2usi64:
case Intrinsic::x86_avx512_cvttsd2usi:
case Intrinsic::x86_avx512_cvtusi2ss:
case Intrinsic::x86_avx512_cvtusi642sd:
case Intrinsic::x86_avx512_cvtusi642ss:
handleSSEVectorConvertIntrinsic(I, 1, true);
break;
case Intrinsic::x86_sse2_cvtsd2si64:
case Intrinsic::x86_sse2_cvtsd2si:
case Intrinsic::x86_sse2_cvtsd2ss:
case Intrinsic::x86_sse2_cvttsd2si64:
case Intrinsic::x86_sse2_cvttsd2si:
case Intrinsic::x86_sse_cvtss2si64:
case Intrinsic::x86_sse_cvtss2si:
case Intrinsic::x86_sse_cvttss2si64:
case Intrinsic::x86_sse_cvttss2si:
handleSSEVectorConvertIntrinsic(I, 1);
break;
case Intrinsic::x86_sse_cvtps2pi:
case Intrinsic::x86_sse_cvttps2pi:
handleSSEVectorConvertIntrinsic(I, 2);
break;
// TODO:
// <1 x i64> @llvm.x86.sse.cvtpd2pi(<2 x double>)
// <2 x double> @llvm.x86.sse.cvtpi2pd(<1 x i64>)
// <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
case Intrinsic::x86_vcvtps2ph_128:
case Intrinsic::x86_vcvtps2ph_256: {
handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/true);
break;
}
case Intrinsic::x86_sse2_cvtpd2ps:
case Intrinsic::x86_sse2_cvtps2dq:
case Intrinsic::x86_sse2_cvtpd2dq:
case Intrinsic::x86_sse2_cvttps2dq:
case Intrinsic::x86_sse2_cvttpd2dq:
case Intrinsic::x86_avx_cvt_pd2_ps_256:
case Intrinsic::x86_avx_cvt_ps2dq_256:
case Intrinsic::x86_avx_cvt_pd2dq_256:
case Intrinsic::x86_avx_cvtt_ps2dq_256:
case Intrinsic::x86_avx_cvtt_pd2dq_256: {
handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/false);
break;
}
case Intrinsic::x86_avx512_psll_w_512:
case Intrinsic::x86_avx512_psll_d_512:
case Intrinsic::x86_avx512_psll_q_512:
case Intrinsic::x86_avx512_pslli_w_512:
case Intrinsic::x86_avx512_pslli_d_512:
case Intrinsic::x86_avx512_pslli_q_512:
case Intrinsic::x86_avx512_psrl_w_512:
case Intrinsic::x86_avx512_psrl_d_512:
case Intrinsic::x86_avx512_psrl_q_512:
case Intrinsic::x86_avx512_psra_w_512:
case Intrinsic::x86_avx512_psra_d_512:
case Intrinsic::x86_avx512_psra_q_512:
case Intrinsic::x86_avx512_psrli_w_512:
case Intrinsic::x86_avx512_psrli_d_512:
case Intrinsic::x86_avx512_psrli_q_512:
case Intrinsic::x86_avx512_psrai_w_512:
case Intrinsic::x86_avx512_psrai_d_512:
case Intrinsic::x86_avx512_psrai_q_512:
case Intrinsic::x86_avx512_psra_q_256:
case Intrinsic::x86_avx512_psra_q_128:
case Intrinsic::x86_avx512_psrai_q_256:
case Intrinsic::x86_avx512_psrai_q_128:
case Intrinsic::x86_avx2_psll_w:
case Intrinsic::x86_avx2_psll_d:
case Intrinsic::x86_avx2_psll_q:
case Intrinsic::x86_avx2_pslli_w:
case Intrinsic::x86_avx2_pslli_d:
case Intrinsic::x86_avx2_pslli_q:
case Intrinsic::x86_avx2_psrl_w:
case Intrinsic::x86_avx2_psrl_d:
case Intrinsic::x86_avx2_psrl_q:
case Intrinsic::x86_avx2_psra_w:
case Intrinsic::x86_avx2_psra_d:
case Intrinsic::x86_avx2_psrli_w:
case Intrinsic::x86_avx2_psrli_d:
case Intrinsic::x86_avx2_psrli_q:
case Intrinsic::x86_avx2_psrai_w:
case Intrinsic::x86_avx2_psrai_d:
case Intrinsic::x86_sse2_psll_w:
case Intrinsic::x86_sse2_psll_d:
case Intrinsic::x86_sse2_psll_q:
case Intrinsic::x86_sse2_pslli_w:
case Intrinsic::x86_sse2_pslli_d:
case Intrinsic::x86_sse2_pslli_q:
case Intrinsic::x86_sse2_psrl_w:
case Intrinsic::x86_sse2_psrl_d:
case Intrinsic::x86_sse2_psrl_q:
case Intrinsic::x86_sse2_psra_w:
case Intrinsic::x86_sse2_psra_d:
case Intrinsic::x86_sse2_psrli_w:
case Intrinsic::x86_sse2_psrli_d:
case Intrinsic::x86_sse2_psrli_q:
case Intrinsic::x86_sse2_psrai_w:
case Intrinsic::x86_sse2_psrai_d:
case Intrinsic::x86_mmx_psll_w:
case Intrinsic::x86_mmx_psll_d:
case Intrinsic::x86_mmx_psll_q:
case Intrinsic::x86_mmx_pslli_w:
case Intrinsic::x86_mmx_pslli_d:
case Intrinsic::x86_mmx_pslli_q:
case Intrinsic::x86_mmx_psrl_w:
case Intrinsic::x86_mmx_psrl_d:
case Intrinsic::x86_mmx_psrl_q:
case Intrinsic::x86_mmx_psra_w:
case Intrinsic::x86_mmx_psra_d:
case Intrinsic::x86_mmx_psrli_w:
case Intrinsic::x86_mmx_psrli_d:
case Intrinsic::x86_mmx_psrli_q:
case Intrinsic::x86_mmx_psrai_w:
case Intrinsic::x86_mmx_psrai_d:
case Intrinsic::aarch64_neon_rshrn:
case Intrinsic::aarch64_neon_sqrshl:
case Intrinsic::aarch64_neon_sqrshrn:
case Intrinsic::aarch64_neon_sqrshrun:
case Intrinsic::aarch64_neon_sqshl:
case Intrinsic::aarch64_neon_sqshlu:
case Intrinsic::aarch64_neon_sqshrn:
case Intrinsic::aarch64_neon_sqshrun:
case Intrinsic::aarch64_neon_srshl:
case Intrinsic::aarch64_neon_sshl:
case Intrinsic::aarch64_neon_uqrshl:
case Intrinsic::aarch64_neon_uqrshrn:
case Intrinsic::aarch64_neon_uqshl:
case Intrinsic::aarch64_neon_uqshrn:
case Intrinsic::aarch64_neon_urshl:
case Intrinsic::aarch64_neon_ushl:
// Not handled here: aarch64_neon_vsli (vector shift left and insert)
handleVectorShiftIntrinsic(I, /* Variable */ false);
break;
case Intrinsic::x86_avx2_psllv_d:
case Intrinsic::x86_avx2_psllv_d_256:
case Intrinsic::x86_avx512_psllv_d_512:
case Intrinsic::x86_avx2_psllv_q:
case Intrinsic::x86_avx2_psllv_q_256:
case Intrinsic::x86_avx512_psllv_q_512:
case Intrinsic::x86_avx2_psrlv_d:
case Intrinsic::x86_avx2_psrlv_d_256:
case Intrinsic::x86_avx512_psrlv_d_512:
case Intrinsic::x86_avx2_psrlv_q:
case Intrinsic::x86_avx2_psrlv_q_256:
case Intrinsic::x86_avx512_psrlv_q_512:
case Intrinsic::x86_avx2_psrav_d:
case Intrinsic::x86_avx2_psrav_d_256:
case Intrinsic::x86_avx512_psrav_d_512:
case Intrinsic::x86_avx512_psrav_q_128:
case Intrinsic::x86_avx512_psrav_q_256:
case Intrinsic::x86_avx512_psrav_q_512:
handleVectorShiftIntrinsic(I, /* Variable */ true);
break;
case Intrinsic::x86_sse2_packsswb_128:
case Intrinsic::x86_sse2_packssdw_128:
case Intrinsic::x86_sse2_packuswb_128:
case Intrinsic::x86_sse41_packusdw:
case Intrinsic::x86_avx2_packsswb:
case Intrinsic::x86_avx2_packssdw:
case Intrinsic::x86_avx2_packuswb:
case Intrinsic::x86_avx2_packusdw:
handleVectorPackIntrinsic(I);
break;
case Intrinsic::x86_sse41_pblendvb:
case Intrinsic::x86_sse41_blendvpd:
case Intrinsic::x86_sse41_blendvps:
case Intrinsic::x86_avx_blendv_pd_256:
case Intrinsic::x86_avx_blendv_ps_256:
case Intrinsic::x86_avx2_pblendvb:
handleBlendvIntrinsic(I);
break;
case Intrinsic::x86_avx_dp_ps_256:
case Intrinsic::x86_sse41_dppd:
case Intrinsic::x86_sse41_dpps:
handleDppIntrinsic(I);
break;
case Intrinsic::x86_mmx_packsswb:
case Intrinsic::x86_mmx_packuswb:
handleVectorPackIntrinsic(I, 16);
break;
case Intrinsic::x86_mmx_packssdw:
handleVectorPackIntrinsic(I, 32);
break;
case Intrinsic::x86_mmx_psad_bw:
handleVectorSadIntrinsic(I, true);
break;
case Intrinsic::x86_sse2_psad_bw:
case Intrinsic::x86_avx2_psad_bw:
handleVectorSadIntrinsic(I);
break;
// Multiply and Add Packed Words
// < 4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>)
// < 8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>)
//
// Multiply and Add Packed Signed and Unsigned Bytes
// < 8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>)
// <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>)
case Intrinsic::x86_sse2_pmadd_wd:
case Intrinsic::x86_avx2_pmadd_wd:
case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
case Intrinsic::x86_avx2_pmadd_ub_sw:
handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2);
break;
// <1 x i64> @llvm.x86.ssse3.pmadd.ub.sw(<1 x i64>, <1 x i64>)
case Intrinsic::x86_ssse3_pmadd_ub_sw:
handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/8);
break;
// <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64>, <1 x i64>)
case Intrinsic::x86_mmx_pmadd_wd:
handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/16);
break;
case Intrinsic::x86_sse_cmp_ss:
case Intrinsic::x86_sse2_cmp_sd:
case Intrinsic::x86_sse_comieq_ss:
case Intrinsic::x86_sse_comilt_ss:
case Intrinsic::x86_sse_comile_ss:
case Intrinsic::x86_sse_comigt_ss:
case Intrinsic::x86_sse_comige_ss:
case Intrinsic::x86_sse_comineq_ss:
case Intrinsic::x86_sse_ucomieq_ss:
case Intrinsic::x86_sse_ucomilt_ss:
case Intrinsic::x86_sse_ucomile_ss:
case Intrinsic::x86_sse_ucomigt_ss:
case Intrinsic::x86_sse_ucomige_ss:
case Intrinsic::x86_sse_ucomineq_ss:
case Intrinsic::x86_sse2_comieq_sd:
case Intrinsic::x86_sse2_comilt_sd:
case Intrinsic::x86_sse2_comile_sd:
case Intrinsic::x86_sse2_comigt_sd:
case Intrinsic::x86_sse2_comige_sd:
case Intrinsic::x86_sse2_comineq_sd:
case Intrinsic::x86_sse2_ucomieq_sd:
case Intrinsic::x86_sse2_ucomilt_sd:
case Intrinsic::x86_sse2_ucomile_sd:
case Intrinsic::x86_sse2_ucomigt_sd:
case Intrinsic::x86_sse2_ucomige_sd:
case Intrinsic::x86_sse2_ucomineq_sd:
handleVectorCompareScalarIntrinsic(I);
break;
case Intrinsic::x86_avx_cmp_pd_256:
case Intrinsic::x86_avx_cmp_ps_256:
case Intrinsic::x86_sse2_cmp_pd:
case Intrinsic::x86_sse_cmp_ps:
handleVectorComparePackedIntrinsic(I);
break;
case Intrinsic::x86_bmi_bextr_32:
case Intrinsic::x86_bmi_bextr_64:
case Intrinsic::x86_bmi_bzhi_32:
case Intrinsic::x86_bmi_bzhi_64:
case Intrinsic::x86_bmi_pdep_32:
case Intrinsic::x86_bmi_pdep_64:
case Intrinsic::x86_bmi_pext_32:
case Intrinsic::x86_bmi_pext_64:
handleBmiIntrinsic(I);
break;
case Intrinsic::x86_pclmulqdq:
case Intrinsic::x86_pclmulqdq_256:
case Intrinsic::x86_pclmulqdq_512:
handlePclmulIntrinsic(I);
break;
case Intrinsic::x86_avx_round_pd_256:
case Intrinsic::x86_avx_round_ps_256:
case Intrinsic::x86_sse41_round_pd:
case Intrinsic::x86_sse41_round_ps:
handleRoundPdPsIntrinsic(I);
break;
case Intrinsic::x86_sse41_round_sd:
case Intrinsic::x86_sse41_round_ss:
handleUnarySdSsIntrinsic(I);
break;
case Intrinsic::x86_sse2_max_sd:
case Intrinsic::x86_sse_max_ss:
case Intrinsic::x86_sse2_min_sd:
case Intrinsic::x86_sse_min_ss:
handleBinarySdSsIntrinsic(I);
break;
case Intrinsic::x86_avx_vtestc_pd:
case Intrinsic::x86_avx_vtestc_pd_256:
case Intrinsic::x86_avx_vtestc_ps:
case Intrinsic::x86_avx_vtestc_ps_256:
case Intrinsic::x86_avx_vtestnzc_pd:
case Intrinsic::x86_avx_vtestnzc_pd_256:
case Intrinsic::x86_avx_vtestnzc_ps:
case Intrinsic::x86_avx_vtestnzc_ps_256:
case Intrinsic::x86_avx_vtestz_pd:
case Intrinsic::x86_avx_vtestz_pd_256:
case Intrinsic::x86_avx_vtestz_ps:
case Intrinsic::x86_avx_vtestz_ps_256:
case Intrinsic::x86_avx_ptestc_256:
case Intrinsic::x86_avx_ptestnzc_256:
case Intrinsic::x86_avx_ptestz_256:
case Intrinsic::x86_sse41_ptestc:
case Intrinsic::x86_sse41_ptestnzc:
case Intrinsic::x86_sse41_ptestz:
handleVtestIntrinsic(I);
break;
// Packed Horizontal Add/Subtract
case Intrinsic::x86_ssse3_phadd_w:
case Intrinsic::x86_ssse3_phadd_w_128:
case Intrinsic::x86_avx2_phadd_w:
case Intrinsic::x86_ssse3_phsub_w:
case Intrinsic::x86_ssse3_phsub_w_128:
case Intrinsic::x86_avx2_phsub_w: {
handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
break;
}
// Packed Horizontal Add/Subtract
case Intrinsic::x86_ssse3_phadd_d:
case Intrinsic::x86_ssse3_phadd_d_128:
case Intrinsic::x86_avx2_phadd_d:
case Intrinsic::x86_ssse3_phsub_d:
case Intrinsic::x86_ssse3_phsub_d_128:
case Intrinsic::x86_avx2_phsub_d: {
handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/32);
break;
}
// Packed Horizontal Add/Subtract and Saturate
case Intrinsic::x86_ssse3_phadd_sw:
case Intrinsic::x86_ssse3_phadd_sw_128:
case Intrinsic::x86_avx2_phadd_sw:
case Intrinsic::x86_ssse3_phsub_sw:
case Intrinsic::x86_ssse3_phsub_sw_128:
case Intrinsic::x86_avx2_phsub_sw: {
handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
break;
}
// Packed Single/Double Precision Floating-Point Horizontal Add
case Intrinsic::x86_sse3_hadd_ps:
case Intrinsic::x86_sse3_hadd_pd:
case Intrinsic::x86_avx_hadd_pd_256:
case Intrinsic::x86_avx_hadd_ps_256:
case Intrinsic::x86_sse3_hsub_ps:
case Intrinsic::x86_sse3_hsub_pd:
case Intrinsic::x86_avx_hsub_pd_256:
case Intrinsic::x86_avx_hsub_ps_256: {
handlePairwiseShadowOrIntrinsic(I);
break;
}
case Intrinsic::x86_avx_maskstore_ps:
case Intrinsic::x86_avx_maskstore_pd:
case Intrinsic::x86_avx_maskstore_ps_256:
case Intrinsic::x86_avx_maskstore_pd_256:
case Intrinsic::x86_avx2_maskstore_d:
case Intrinsic::x86_avx2_maskstore_q:
case Intrinsic::x86_avx2_maskstore_d_256:
case Intrinsic::x86_avx2_maskstore_q_256: {
handleAVXMaskedStore(I);
break;
}
case Intrinsic::x86_avx_maskload_ps:
case Intrinsic::x86_avx_maskload_pd:
case Intrinsic::x86_avx_maskload_ps_256:
case Intrinsic::x86_avx_maskload_pd_256:
case Intrinsic::x86_avx2_maskload_d:
case Intrinsic::x86_avx2_maskload_q:
case Intrinsic::x86_avx2_maskload_d_256:
case Intrinsic::x86_avx2_maskload_q_256: {
handleAVXMaskedLoad(I);
break;
}
// Packed
case Intrinsic::x86_avx512fp16_add_ph_512:
case Intrinsic::x86_avx512fp16_sub_ph_512:
case Intrinsic::x86_avx512fp16_mul_ph_512:
case Intrinsic::x86_avx512fp16_div_ph_512:
case Intrinsic::x86_avx512fp16_max_ph_512:
case Intrinsic::x86_avx512fp16_min_ph_512:
case Intrinsic::x86_avx512_min_ps_512:
case Intrinsic::x86_avx512_min_pd_512:
case Intrinsic::x86_avx512_max_ps_512:
case Intrinsic::x86_avx512_max_pd_512: {
// These AVX512 variants contain the rounding mode as a trailing flag.
// Earlier variants do not have a trailing flag and are already handled
// by maybeHandleSimpleNomemIntrinsic(I, 0) via handleUnknownIntrinsic.
[[maybe_unused]] bool Success =
maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/1);
assert(Success);
break;
}
case Intrinsic::x86_avx_vpermilvar_pd:
case Intrinsic::x86_avx_vpermilvar_pd_256:
case Intrinsic::x86_avx512_vpermilvar_pd_512:
case Intrinsic::x86_avx_vpermilvar_ps:
case Intrinsic::x86_avx_vpermilvar_ps_256:
case Intrinsic::x86_avx512_vpermilvar_ps_512: {
handleAVXVpermilvar(I);
break;
}
case Intrinsic::x86_avx512_vpermi2var_d_128:
case Intrinsic::x86_avx512_vpermi2var_d_256:
case Intrinsic::x86_avx512_vpermi2var_d_512:
case Intrinsic::x86_avx512_vpermi2var_hi_128:
case Intrinsic::x86_avx512_vpermi2var_hi_256:
case Intrinsic::x86_avx512_vpermi2var_hi_512:
case Intrinsic::x86_avx512_vpermi2var_pd_128:
case Intrinsic::x86_avx512_vpermi2var_pd_256:
case Intrinsic::x86_avx512_vpermi2var_pd_512:
case Intrinsic::x86_avx512_vpermi2var_ps_128:
case Intrinsic::x86_avx512_vpermi2var_ps_256:
case Intrinsic::x86_avx512_vpermi2var_ps_512:
case Intrinsic::x86_avx512_vpermi2var_q_128:
case Intrinsic::x86_avx512_vpermi2var_q_256:
case Intrinsic::x86_avx512_vpermi2var_q_512:
case Intrinsic::x86_avx512_vpermi2var_qi_128:
case Intrinsic::x86_avx512_vpermi2var_qi_256:
case Intrinsic::x86_avx512_vpermi2var_qi_512:
handleAVXVpermi2var(I);
break;
// Packed Shuffle
// llvm.x86.sse.pshuf.w(<1 x i64>, i8)
// llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>)
// llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>)
// llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>)
// llvm.x86.avx512.pshuf.b.512(<64 x i8>, <64 x i8>)
//
// The following intrinsics are auto-upgraded:
// llvm.x86.sse2.pshuf.d(<4 x i32>, i8)
    // llvm.x86.sse2.pshufh.w(<8 x i16>, i8)
// llvm.x86.sse2.pshufl.w(<8 x i16>, i8)
case Intrinsic::x86_avx2_pshuf_b:
case Intrinsic::x86_sse_pshuf_w:
case Intrinsic::x86_ssse3_pshuf_b_128:
case Intrinsic::x86_ssse3_pshuf_b:
case Intrinsic::x86_avx512_pshuf_b_512:
handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
/*trailingVerbatimArgs=*/1);
break;
case Intrinsic::x86_avx512_mask_cvtps2dq_512: {
handleAVX512VectorConvertFPToInt(I);
break;
}
// AVX512 PMOV: Packed MOV, with truncation
// Precisely handled by applying the same intrinsic to the shadow
case Intrinsic::x86_avx512_mask_pmov_dw_512:
case Intrinsic::x86_avx512_mask_pmov_db_512:
case Intrinsic::x86_avx512_mask_pmov_qb_512:
case Intrinsic::x86_avx512_mask_pmov_qw_512: {
// Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 were removed in
// f608dc1f5775ee880e8ea30e2d06ab5a4a935c22
handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
/*trailingVerbatimArgs=*/1);
break;
}
    // AVX512 PMOV{S,US}: Packed MOV, with signed/unsigned saturation
// Approximately handled using the corresponding truncation intrinsic
// TODO: improve handleAVX512VectorDownConvert to precisely model saturation
case Intrinsic::x86_avx512_mask_pmovs_dw_512:
case Intrinsic::x86_avx512_mask_pmovus_dw_512: {
handleIntrinsicByApplyingToShadow(I,
Intrinsic::x86_avx512_mask_pmov_dw_512,
                                        /*trailingVerbatimArgs=*/1);
break;
}
case Intrinsic::x86_avx512_mask_pmovs_db_512:
case Intrinsic::x86_avx512_mask_pmovus_db_512: {
handleIntrinsicByApplyingToShadow(I,
Intrinsic::x86_avx512_mask_pmov_db_512,
                                        /*trailingVerbatimArgs=*/1);
break;
}
case Intrinsic::x86_avx512_mask_pmovs_qb_512:
case Intrinsic::x86_avx512_mask_pmovus_qb_512: {
handleIntrinsicByApplyingToShadow(I,
Intrinsic::x86_avx512_mask_pmov_qb_512,
                                        /*trailingVerbatimArgs=*/1);
break;
}
case Intrinsic::x86_avx512_mask_pmovs_qw_512:
case Intrinsic::x86_avx512_mask_pmovus_qw_512: {
handleIntrinsicByApplyingToShadow(I,
Intrinsic::x86_avx512_mask_pmov_qw_512,
                                        /*trailingVerbatimArgs=*/1);
break;
}
case Intrinsic::x86_avx512_mask_pmovs_qd_512:
case Intrinsic::x86_avx512_mask_pmovus_qd_512:
case Intrinsic::x86_avx512_mask_pmovs_wb_512:
case Intrinsic::x86_avx512_mask_pmovus_wb_512: {
// Since Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 do not exist, we
// cannot use handleIntrinsicByApplyingToShadow. Instead, we call the
// slow-path handler.
handleAVX512VectorDownConvert(I);
break;
}
// AVX512 FP16 Arithmetic
case Intrinsic::x86_avx512fp16_mask_add_sh_round:
case Intrinsic::x86_avx512fp16_mask_sub_sh_round:
case Intrinsic::x86_avx512fp16_mask_mul_sh_round:
case Intrinsic::x86_avx512fp16_mask_div_sh_round:
case Intrinsic::x86_avx512fp16_mask_max_sh_round:
case Intrinsic::x86_avx512fp16_mask_min_sh_round: {
visitGenericScalarHalfwordInst(I);
break;
}
// AVX Galois Field New Instructions
case Intrinsic::x86_vgf2p8affineqb_128:
case Intrinsic::x86_vgf2p8affineqb_256:
case Intrinsic::x86_vgf2p8affineqb_512:
handleAVXGF2P8Affine(I);
break;
case Intrinsic::fshl:
case Intrinsic::fshr:
handleFunnelShift(I);
break;
case Intrinsic::is_constant:
// The result of llvm.is.constant() is always defined.
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
break;
// TODO: handling max/min similarly to AND/OR may be more precise
// Floating-Point Maximum/Minimum Pairwise
case Intrinsic::aarch64_neon_fmaxp:
case Intrinsic::aarch64_neon_fminp:
// Floating-Point Maximum/Minimum Number Pairwise
case Intrinsic::aarch64_neon_fmaxnmp:
case Intrinsic::aarch64_neon_fminnmp:
// Signed/Unsigned Maximum/Minimum Pairwise
case Intrinsic::aarch64_neon_smaxp:
case Intrinsic::aarch64_neon_sminp:
case Intrinsic::aarch64_neon_umaxp:
case Intrinsic::aarch64_neon_uminp:
// Add Pairwise
case Intrinsic::aarch64_neon_addp:
// Floating-point Add Pairwise
case Intrinsic::aarch64_neon_faddp:
// Add Long Pairwise
case Intrinsic::aarch64_neon_saddlp:
case Intrinsic::aarch64_neon_uaddlp: {
handlePairwiseShadowOrIntrinsic(I);
break;
}
// Floating-point Convert to integer, rounding to nearest with ties to Away
case Intrinsic::aarch64_neon_fcvtas:
case Intrinsic::aarch64_neon_fcvtau:
// Floating-point convert to integer, rounding toward minus infinity
case Intrinsic::aarch64_neon_fcvtms:
case Intrinsic::aarch64_neon_fcvtmu:
// Floating-point convert to integer, rounding to nearest with ties to even
case Intrinsic::aarch64_neon_fcvtns:
case Intrinsic::aarch64_neon_fcvtnu:
// Floating-point convert to integer, rounding toward plus infinity
case Intrinsic::aarch64_neon_fcvtps:
case Intrinsic::aarch64_neon_fcvtpu:
// Floating-point Convert to integer, rounding toward Zero
case Intrinsic::aarch64_neon_fcvtzs:
case Intrinsic::aarch64_neon_fcvtzu:
// Floating-point convert to lower precision narrow, rounding to odd
case Intrinsic::aarch64_neon_fcvtxn: {
handleNEONVectorConvertIntrinsic(I);
break;
}
// Add reduction to scalar
case Intrinsic::aarch64_neon_faddv:
case Intrinsic::aarch64_neon_saddv:
case Intrinsic::aarch64_neon_uaddv:
// Signed/Unsigned min/max (Vector)
// TODO: handling similarly to AND/OR may be more precise.
case Intrinsic::aarch64_neon_smaxv:
case Intrinsic::aarch64_neon_sminv:
case Intrinsic::aarch64_neon_umaxv:
case Intrinsic::aarch64_neon_uminv:
// Floating-point min/max (vector)
// The f{min,max}"nm"v variants handle NaN differently than f{min,max}v,
// but our shadow propagation is the same.
case Intrinsic::aarch64_neon_fmaxv:
case Intrinsic::aarch64_neon_fminv:
case Intrinsic::aarch64_neon_fmaxnmv:
case Intrinsic::aarch64_neon_fminnmv:
// Sum long across vector
case Intrinsic::aarch64_neon_saddlv:
case Intrinsic::aarch64_neon_uaddlv:
handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/true);
break;
case Intrinsic::aarch64_neon_ld1x2:
case Intrinsic::aarch64_neon_ld1x3:
case Intrinsic::aarch64_neon_ld1x4:
case Intrinsic::aarch64_neon_ld2:
case Intrinsic::aarch64_neon_ld3:
case Intrinsic::aarch64_neon_ld4:
case Intrinsic::aarch64_neon_ld2r:
case Intrinsic::aarch64_neon_ld3r:
case Intrinsic::aarch64_neon_ld4r: {
handleNEONVectorLoad(I, /*WithLane=*/false);
break;
}
case Intrinsic::aarch64_neon_ld2lane:
case Intrinsic::aarch64_neon_ld3lane:
case Intrinsic::aarch64_neon_ld4lane: {
handleNEONVectorLoad(I, /*WithLane=*/true);
break;
}
// Saturating extract narrow
case Intrinsic::aarch64_neon_sqxtn:
case Intrinsic::aarch64_neon_sqxtun:
case Intrinsic::aarch64_neon_uqxtn:
    // These only have one argument, but we (ab)use handleShadowOr because it
    // does work on single-argument intrinsics and will typecast the shadow
    // (and update the origin).
handleShadowOr(I);
break;
case Intrinsic::aarch64_neon_st1x2:
case Intrinsic::aarch64_neon_st1x3:
case Intrinsic::aarch64_neon_st1x4:
case Intrinsic::aarch64_neon_st2:
case Intrinsic::aarch64_neon_st3:
case Intrinsic::aarch64_neon_st4: {
handleNEONVectorStoreIntrinsic(I, false);
break;
}
case Intrinsic::aarch64_neon_st2lane:
case Intrinsic::aarch64_neon_st3lane:
case Intrinsic::aarch64_neon_st4lane: {
handleNEONVectorStoreIntrinsic(I, true);
break;
}
// Arm NEON vector table intrinsics have the source/table register(s) as
// arguments, followed by the index register. They return the output.
//
// 'TBL writes a zero if an index is out-of-range, while TBX leaves the
// original value unchanged in the destination register.'
// Conveniently, zero denotes a clean shadow, which means out-of-range
// indices for TBL will initialize the user data with zero and also clean
// the shadow. (For TBX, neither the user data nor the shadow will be
// updated, which is also correct.)
case Intrinsic::aarch64_neon_tbl1:
case Intrinsic::aarch64_neon_tbl2:
case Intrinsic::aarch64_neon_tbl3:
case Intrinsic::aarch64_neon_tbl4:
case Intrinsic::aarch64_neon_tbx1:
case Intrinsic::aarch64_neon_tbx2:
case Intrinsic::aarch64_neon_tbx3:
case Intrinsic::aarch64_neon_tbx4: {
      // The last trailing argument (index register) should be handled verbatim.
      handleIntrinsicByApplyingToShadow(
          I, /*shadowIntrinsicID=*/I.getIntrinsicID(),
          /*trailingVerbatimArgs=*/1);
break;
}
case Intrinsic::aarch64_neon_fmulx:
case Intrinsic::aarch64_neon_pmul:
case Intrinsic::aarch64_neon_pmull:
case Intrinsic::aarch64_neon_smull:
case Intrinsic::aarch64_neon_pmull64:
case Intrinsic::aarch64_neon_umull: {
handleNEONVectorMultiplyIntrinsic(I);
break;
}
case Intrinsic::scmp:
case Intrinsic::ucmp: {
handleShadowOr(I);
break;
}
default:
if (!handleUnknownIntrinsic(I))
visitInstruction(I);
break;
}
}
void visitLibAtomicLoad(CallBase &CB) {
// Since we use getNextNode here, we can't have CB terminate the BB.
assert(isa<CallInst>(CB));
IRBuilder<> IRB(&CB);
Value *Size = CB.getArgOperand(0);
Value *SrcPtr = CB.getArgOperand(1);
Value *DstPtr = CB.getArgOperand(2);
Value *Ordering = CB.getArgOperand(3);
// Convert the call to have at least Acquire ordering to make sure
// the shadow operations aren't reordered before it.
Value *NewOrdering =
IRB.CreateExtractElement(makeAddAcquireOrderingTable(IRB), Ordering);
CB.setArgOperand(3, NewOrdering);
NextNodeIRBuilder NextIRB(&CB);
Value *SrcShadowPtr, *SrcOriginPtr;
std::tie(SrcShadowPtr, SrcOriginPtr) =
getShadowOriginPtr(SrcPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
/*isStore*/ false);
Value *DstShadowPtr =
getShadowOriginPtr(DstPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
/*isStore*/ true)
.first;
NextIRB.CreateMemCpy(DstShadowPtr, Align(1), SrcShadowPtr, Align(1), Size);
if (MS.TrackOrigins) {
Value *SrcOrigin = NextIRB.CreateAlignedLoad(MS.OriginTy, SrcOriginPtr,
kMinOriginAlignment);
Value *NewOrigin = updateOrigin(SrcOrigin, NextIRB);
NextIRB.CreateCall(MS.MsanSetOriginFn, {DstPtr, Size, NewOrigin});
}
}
void visitLibAtomicStore(CallBase &CB) {
IRBuilder<> IRB(&CB);
Value *Size = CB.getArgOperand(0);
Value *DstPtr = CB.getArgOperand(2);
Value *Ordering = CB.getArgOperand(3);
// Convert the call to have at least Release ordering to make sure
// the shadow operations aren't reordered after it.
Value *NewOrdering =
IRB.CreateExtractElement(makeAddReleaseOrderingTable(IRB), Ordering);
CB.setArgOperand(3, NewOrdering);
Value *DstShadowPtr =
getShadowOriginPtr(DstPtr, IRB, IRB.getInt8Ty(), Align(1),
/*isStore*/ true)
.first;
// Atomic store always paints clean shadow/origin. See file header.
IRB.CreateMemSet(DstShadowPtr, getCleanShadow(IRB.getInt8Ty()), Size,
Align(1));
}
void visitCallBase(CallBase &CB) {
assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
if (CB.isInlineAsm()) {
// For inline asm (either a call to asm function, or callbr instruction),
// do the usual thing: check argument shadow and mark all outputs as
// clean. Note that any side effects of the inline asm that are not
// immediately visible in its constraints are not handled.
if (ClHandleAsmConservative)
visitAsmInstruction(CB);
else
visitInstruction(CB);
return;
}
LibFunc LF;
if (TLI->getLibFunc(CB, LF)) {
// libatomic.a functions need to have special handling because there isn't
// a good way to intercept them or compile the library with
// instrumentation.
switch (LF) {
case LibFunc_atomic_load:
if (!isa<CallInst>(CB)) {
          llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load. "
                          "Ignoring!\n";
break;
}
visitLibAtomicLoad(CB);
return;
case LibFunc_atomic_store:
visitLibAtomicStore(CB);
return;
default:
break;
}
}
if (auto *Call = dyn_cast<CallInst>(&CB)) {
assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
// We are going to insert code that relies on the fact that the callee
// will become a non-readonly function after it is instrumented by us. To
// prevent this code from being optimized out, mark that function
// non-readonly in advance.
// TODO: We can likely do better than dropping memory() completely here.
AttributeMask B;
B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
Call->removeFnAttrs(B);
if (Function *Func = Call->getCalledFunction()) {
Func->removeFnAttrs(B);
}
maybeMarkSanitizerLibraryCallNoBuiltin(Call, TLI);
}
IRBuilder<> IRB(&CB);
bool MayCheckCall = MS.EagerChecks;
if (Function *Func = CB.getCalledFunction()) {
      // __sanitizer_unaligned_{load,store} functions may be called by users
      // and always expect shadows in the TLS. So don't check them.
MayCheckCall &= !Func->getName().starts_with("__sanitizer_unaligned_");
}
unsigned ArgOffset = 0;
LLVM_DEBUG(dbgs() << " CallSite: " << CB << "\n");
for (const auto &[i, A] : llvm::enumerate(CB.args())) {
if (!A->getType()->isSized()) {
LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
continue;
}
if (A->getType()->isScalableTy()) {
LLVM_DEBUG(dbgs() << "Arg " << i << " is vscale: " << CB << "\n");
// Handle as noundef, but don't reserve tls slots.
insertCheckShadowOf(A, &CB);
continue;
}
unsigned Size = 0;
const DataLayout &DL = F.getDataLayout();
bool ByVal = CB.paramHasAttr(i, Attribute::ByVal);
bool NoUndef = CB.paramHasAttr(i, Attribute::NoUndef);
bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
if (EagerCheck) {
insertCheckShadowOf(A, &CB);
Size = DL.getTypeAllocSize(A->getType());
} else {
[[maybe_unused]] Value *Store = nullptr;
// Compute the Shadow for arg even if it is ByVal, because
// in that case getShadow() will copy the actual arg shadow to
// __msan_param_tls.
Value *ArgShadow = getShadow(A);
Value *ArgShadowBase = getShadowPtrForArgument(IRB, ArgOffset);
LLVM_DEBUG(dbgs() << " Arg#" << i << ": " << *A
<< " Shadow: " << *ArgShadow << "\n");
if (ByVal) {
// ByVal requires some special handling as it's too big for a single
// load
assert(A->getType()->isPointerTy() &&
"ByVal argument is not a pointer!");
Size = DL.getTypeAllocSize(CB.getParamByValType(i));
if (ArgOffset + Size > kParamTLSSize)
break;
const MaybeAlign ParamAlignment(CB.getParamAlign(i));
MaybeAlign Alignment = std::nullopt;
if (ParamAlignment)
Alignment = std::min(*ParamAlignment, kShadowTLSAlignment);
Value *AShadowPtr, *AOriginPtr;
std::tie(AShadowPtr, AOriginPtr) =
getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), Alignment,
/*isStore*/ false);
if (!PropagateShadow) {
Store = IRB.CreateMemSet(ArgShadowBase,
Constant::getNullValue(IRB.getInt8Ty()),
Size, Alignment);
} else {
Store = IRB.CreateMemCpy(ArgShadowBase, Alignment, AShadowPtr,
Alignment, Size);
if (MS.TrackOrigins) {
Value *ArgOriginBase = getOriginPtrForArgument(IRB, ArgOffset);
// FIXME: OriginSize should be:
// alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
IRB.CreateMemCpy(
ArgOriginBase,
/* by origin_tls[ArgOffset] */ kMinOriginAlignment,
AOriginPtr,
/* by getShadowOriginPtr */ kMinOriginAlignment, OriginSize);
}
}
} else {
// Any other parameters mean we need bit-grained tracking of uninit
// data
Size = DL.getTypeAllocSize(A->getType());
if (ArgOffset + Size > kParamTLSSize)
break;
Store = IRB.CreateAlignedStore(ArgShadow, ArgShadowBase,
kShadowTLSAlignment);
Constant *Cst = dyn_cast<Constant>(ArgShadow);
if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
IRB.CreateStore(getOrigin(A),
getOriginPtrForArgument(IRB, ArgOffset));
}
}
assert(Store != nullptr);
LLVM_DEBUG(dbgs() << " Param:" << *Store << "\n");
}
assert(Size != 0);
ArgOffset += alignTo(Size, kShadowTLSAlignment);
}
LLVM_DEBUG(dbgs() << " done with call args\n");
FunctionType *FT = CB.getFunctionType();
if (FT->isVarArg()) {
VAHelper->visitCallBase(CB, IRB);
}
// Now, get the shadow for the RetVal.
if (!CB.getType()->isSized())
return;
// Don't emit the epilogue for musttail call returns.
if (isa<CallInst>(CB) && cast<CallInst>(CB).isMustTailCall())
return;
if (MayCheckCall && CB.hasRetAttr(Attribute::NoUndef)) {
setShadow(&CB, getCleanShadow(&CB));
setOrigin(&CB, getCleanOrigin());
return;
}
IRBuilder<> IRBBefore(&CB);
// Until we have full dynamic coverage, make sure the retval shadow is 0.
Value *Base = getShadowPtrForRetval(IRBBefore);
IRBBefore.CreateAlignedStore(getCleanShadow(&CB), Base,
kShadowTLSAlignment);
BasicBlock::iterator NextInsn;
if (isa<CallInst>(CB)) {
NextInsn = ++CB.getIterator();
assert(NextInsn != CB.getParent()->end());
} else {
BasicBlock *NormalDest = cast<InvokeInst>(CB).getNormalDest();
if (!NormalDest->getSinglePredecessor()) {
// FIXME: this case is tricky, so we are just conservative here.
// Perhaps we need to split the edge between this BB and NormalDest,
// but a naive attempt to use SplitEdge leads to a crash.
setShadow(&CB, getCleanShadow(&CB));
setOrigin(&CB, getCleanOrigin());
return;
}
// FIXME: NextInsn is likely in a basic block that has not been visited
// yet. Anything inserted there will be instrumented by MSan later!
NextInsn = NormalDest->getFirstInsertionPt();
assert(NextInsn != NormalDest->end() &&
"Could not find insertion point for retval shadow load");
}
IRBuilder<> IRBAfter(&*NextInsn);
Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
getShadowTy(&CB), getShadowPtrForRetval(IRBAfter), kShadowTLSAlignment,
"_msret");
setShadow(&CB, RetvalShadow);
if (MS.TrackOrigins)
setOrigin(&CB, IRBAfter.CreateLoad(MS.OriginTy, getOriginPtrForRetval()));
}
bool isAMustTailRetVal(Value *RetVal) {
if (auto *I = dyn_cast<BitCastInst>(RetVal)) {
RetVal = I->getOperand(0);
}
if (auto *I = dyn_cast<CallInst>(RetVal)) {
return I->isMustTailCall();
}
return false;
}
void visitReturnInst(ReturnInst &I) {
IRBuilder<> IRB(&I);
Value *RetVal = I.getReturnValue();
if (!RetVal)
return;
// Don't emit the epilogue for musttail call returns.
if (isAMustTailRetVal(RetVal))
return;
Value *ShadowPtr = getShadowPtrForRetval(IRB);
bool HasNoUndef = F.hasRetAttribute(Attribute::NoUndef);
bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
// FIXME: Consider using SpecialCaseList to specify a list of functions that
// must always return fully initialized values. For now, we hardcode "main".
bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
Value *Shadow = getShadow(RetVal);
bool StoreOrigin = true;
if (EagerCheck) {
insertCheckShadowOf(RetVal, &I);
Shadow = getCleanShadow(RetVal);
StoreOrigin = false;
}
// The caller may still expect information passed over TLS if we pass our
// check
if (StoreShadow) {
IRB.CreateAlignedStore(Shadow, ShadowPtr, kShadowTLSAlignment);
if (MS.TrackOrigins && StoreOrigin)
IRB.CreateStore(getOrigin(RetVal), getOriginPtrForRetval());
}
}
void visitPHINode(PHINode &I) {
IRBuilder<> IRB(&I);
if (!PropagateShadow) {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
return;
}
ShadowPHINodes.push_back(&I);
setShadow(&I, IRB.CreatePHI(getShadowTy(&I), I.getNumIncomingValues(),
"_msphi_s"));
if (MS.TrackOrigins)
setOrigin(
&I, IRB.CreatePHI(MS.OriginTy, I.getNumIncomingValues(), "_msphi_o"));
}
Value *getLocalVarIdptr(AllocaInst &I) {
ConstantInt *IntConst =
ConstantInt::get(Type::getInt32Ty((*F.getParent()).getContext()), 0);
return new GlobalVariable(*F.getParent(), IntConst->getType(),
/*isConstant=*/false, GlobalValue::PrivateLinkage,
IntConst);
}
Value *getLocalVarDescription(AllocaInst &I) {
return createPrivateConstGlobalForString(*F.getParent(), I.getName());
}
void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
if (PoisonStack && ClPoisonStackWithCall) {
IRB.CreateCall(MS.MsanPoisonStackFn, {&I, Len});
} else {
Value *ShadowBase, *OriginBase;
std::tie(ShadowBase, OriginBase) = getShadowOriginPtr(
&I, IRB, IRB.getInt8Ty(), Align(1), /*isStore*/ true);
Value *PoisonValue = IRB.getInt8(PoisonStack ? ClPoisonStackPattern : 0);
IRB.CreateMemSet(ShadowBase, PoisonValue, Len, I.getAlign());
}
if (PoisonStack && MS.TrackOrigins) {
Value *Idptr = getLocalVarIdptr(I);
if (ClPrintStackNames) {
Value *Descr = getLocalVarDescription(I);
IRB.CreateCall(MS.MsanSetAllocaOriginWithDescriptionFn,
{&I, Len, Idptr, Descr});
} else {
IRB.CreateCall(MS.MsanSetAllocaOriginNoDescriptionFn, {&I, Len, Idptr});
}
}
}
void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
Value *Descr = getLocalVarDescription(I);
if (PoisonStack) {
IRB.CreateCall(MS.MsanPoisonAllocaFn, {&I, Len, Descr});
} else {
IRB.CreateCall(MS.MsanUnpoisonAllocaFn, {&I, Len});
}
}
void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
if (!InsPoint)
InsPoint = &I;
NextNodeIRBuilder IRB(InsPoint);
const DataLayout &DL = F.getDataLayout();
TypeSize TS = DL.getTypeAllocSize(I.getAllocatedType());
Value *Len = IRB.CreateTypeSize(MS.IntptrTy, TS);
if (I.isArrayAllocation())
Len = IRB.CreateMul(Len,
IRB.CreateZExtOrTrunc(I.getArraySize(), MS.IntptrTy));
if (MS.CompileKernel)
poisonAllocaKmsan(I, IRB, Len);
else
poisonAllocaUserspace(I, IRB, Len);
}
void visitAllocaInst(AllocaInst &I) {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
// We'll get to this alloca later unless it's poisoned at the corresponding
// llvm.lifetime.start.
AllocaSet.insert(&I);
}
void visitSelectInst(SelectInst &I) {
// a = select b, c, d
Value *B = I.getCondition();
Value *C = I.getTrueValue();
Value *D = I.getFalseValue();
handleSelectLikeInst(I, B, C, D);
}
void handleSelectLikeInst(Instruction &I, Value *B, Value *C, Value *D) {
IRBuilder<> IRB(&I);
Value *Sb = getShadow(B);
Value *Sc = getShadow(C);
Value *Sd = getShadow(D);
Value *Ob = MS.TrackOrigins ? getOrigin(B) : nullptr;
Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
Value *Od = MS.TrackOrigins ? getOrigin(D) : nullptr;
// Result shadow if condition shadow is 0.
Value *Sa0 = IRB.CreateSelect(B, Sc, Sd);
Value *Sa1;
if (I.getType()->isAggregateType()) {
// To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
// an extra "select". This results in much more compact IR.
// Sa = select Sb, poisoned, (select b, Sc, Sd)
Sa1 = getPoisonedShadow(getShadowTy(I.getType()));
} else {
// Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
// If Sb (condition is poisoned), look for bits in c and d that are equal
// and both unpoisoned.
// If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
// Cast arguments to shadow-compatible type.
C = CreateAppToShadowCast(IRB, C);
D = CreateAppToShadowCast(IRB, D);
// Result shadow if condition shadow is 1.
Sa1 = IRB.CreateOr({IRB.CreateXor(C, D), Sc, Sd});
}
Value *Sa = IRB.CreateSelect(Sb, Sa1, Sa0, "_msprop_select");
setShadow(&I, Sa);
if (MS.TrackOrigins) {
// Origins are always i32, so any vector conditions must be flattened.
// FIXME: consider tracking vector origins for app vectors?
if (B->getType()->isVectorTy()) {
B = convertToBool(B, IRB);
Sb = convertToBool(Sb, IRB);
}
// a = select b, c, d
// Oa = Sb ? Ob : (b ? Oc : Od)
setOrigin(&I, IRB.CreateSelect(Sb, Ob, IRB.CreateSelect(B, Oc, Od)));
}
}
void visitLandingPadInst(LandingPadInst &I) {
// Do nothing.
// See https://github.com/google/sanitizers/issues/504
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
void visitCatchSwitchInst(CatchSwitchInst &I) {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
void visitFuncletPadInst(FuncletPadInst &I) {
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
void visitExtractValueInst(ExtractValueInst &I) {
IRBuilder<> IRB(&I);
Value *Agg = I.getAggregateOperand();
LLVM_DEBUG(dbgs() << "ExtractValue: " << I << "\n");
Value *AggShadow = getShadow(Agg);
LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
Value *ResShadow = IRB.CreateExtractValue(AggShadow, I.getIndices());
LLVM_DEBUG(dbgs() << " ResShadow: " << *ResShadow << "\n");
setShadow(&I, ResShadow);
setOriginForNaryOp(I);
}
void visitInsertValueInst(InsertValueInst &I) {
IRBuilder<> IRB(&I);
LLVM_DEBUG(dbgs() << "InsertValue: " << I << "\n");
Value *AggShadow = getShadow(I.getAggregateOperand());
Value *InsShadow = getShadow(I.getInsertedValueOperand());
LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
LLVM_DEBUG(dbgs() << " InsShadow: " << *InsShadow << "\n");
Value *Res = IRB.CreateInsertValue(AggShadow, InsShadow, I.getIndices());
LLVM_DEBUG(dbgs() << " Res: " << *Res << "\n");
setShadow(&I, Res);
setOriginForNaryOp(I);
}
void dumpInst(Instruction &I) {
if (CallInst *CI = dyn_cast<CallInst>(&I)) {
errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
} else {
errs() << "ZZZ " << I.getOpcodeName() << "\n";
}
errs() << "QQQ " << I << "\n";
}
void visitResumeInst(ResumeInst &I) {
LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
// Nothing to do here.
}
void visitCleanupReturnInst(CleanupReturnInst &CRI) {
LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
// Nothing to do here.
}
void visitCatchReturnInst(CatchReturnInst &CRI) {
LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
// Nothing to do here.
}
void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
IRBuilder<> &IRB, const DataLayout &DL,
bool isOutput) {
// For each assembly argument, we check its value for being initialized.
// If the argument is a pointer, we assume it points to a single element
    // of the corresponding type (or to an 8-byte word, if the type is
    // unsized). Each such pointer is instrumented with a call to the runtime
    // library.
Type *OpType = Operand->getType();
// Check the operand value itself.
insertCheckShadowOf(Operand, &I);
if (!OpType->isPointerTy() || !isOutput) {
assert(!isOutput);
return;
}
if (!ElemTy->isSized())
return;
auto Size = DL.getTypeStoreSize(ElemTy);
Value *SizeVal = IRB.CreateTypeSize(MS.IntptrTy, Size);
if (MS.CompileKernel) {
IRB.CreateCall(MS.MsanInstrumentAsmStoreFn, {Operand, SizeVal});
} else {
// ElemTy, derived from elementtype(), does not encode the alignment of
// the pointer. Conservatively assume that the shadow memory is unaligned.
// When Size is large, avoid StoreInst as it would expand to many
// instructions.
auto [ShadowPtr, _] =
getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
if (Size <= 32)
IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
else
IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
SizeVal, Align(1));
}
}
/// Get the number of output arguments returned by pointers.
int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
int NumRetOutputs = 0;
int NumOutputs = 0;
Type *RetTy = cast<Value>(CB)->getType();
if (!RetTy->isVoidTy()) {
// Register outputs are returned via the CallInst return value.
auto *ST = dyn_cast<StructType>(RetTy);
if (ST)
NumRetOutputs = ST->getNumElements();
else
NumRetOutputs = 1;
}
InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
for (const InlineAsm::ConstraintInfo &Info : Constraints) {
switch (Info.Type) {
case InlineAsm::isOutput:
NumOutputs++;
break;
default:
break;
}
}
return NumOutputs - NumRetOutputs;
}
void visitAsmInstruction(Instruction &I) {
// Conservative inline assembly handling: check for poisoned shadow of
// asm() arguments, then unpoison the result and all the memory locations
// pointed to by those arguments.
// An inline asm() statement in C++ contains lists of input and output
// arguments used by the assembly code. These are mapped to operands of the
// CallInst as follows:
    // - nR register outputs ("=r") are returned by value in a single structure
// (SSA value of the CallInst);
// - nO other outputs ("=m" and others) are returned by pointer as first
// nO operands of the CallInst;
// - nI inputs ("r", "m" and others) are passed to CallInst as the
// remaining nI operands.
// The total number of asm() arguments in the source is nR+nO+nI, and the
// corresponding CallInst has nO+nI+1 operands (the last operand is the
// function to be called).
const DataLayout &DL = F.getDataLayout();
CallBase *CB = cast<CallBase>(&I);
IRBuilder<> IRB(&I);
InlineAsm *IA = cast<InlineAsm>(CB->getCalledOperand());
int OutputArgs = getNumOutputArgs(IA, CB);
// The last operand of a CallInst is the function itself.
int NumOperands = CB->getNumOperands() - 1;
// Check input arguments. Doing so before unpoisoning output arguments, so
// that we won't overwrite uninit values before checking them.
for (int i = OutputArgs; i < NumOperands; i++) {
Value *Operand = CB->getOperand(i);
instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
/*isOutput*/ false);
}
// Unpoison output arguments. This must happen before the actual InlineAsm
// call, so that the shadow for memory published in the asm() statement
// remains valid.
for (int i = 0; i < OutputArgs; i++) {
Value *Operand = CB->getOperand(i);
instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
/*isOutput*/ true);
}
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
void visitFreezeInst(FreezeInst &I) {
// Freeze always returns a fully defined value.
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
void visitInstruction(Instruction &I) {
// Everything else: stop propagating and check for poisoned shadow.
if (ClDumpStrictInstructions)
dumpInst(I);
LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
Value *Operand = I.getOperand(i);
if (Operand->getType()->isSized())
insertCheckShadowOf(Operand, &I);
}
setShadow(&I, getCleanShadow(&I));
setOrigin(&I, getCleanOrigin());
}
};
struct VarArgHelperBase : public VarArgHelper {
Function &F;
MemorySanitizer &MS;
MemorySanitizerVisitor &MSV;
SmallVector<CallInst *, 16> VAStartInstrumentationList;
const unsigned VAListTagSize;
VarArgHelperBase(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV, unsigned VAListTagSize)
: F(F), MS(MS), MSV(MSV), VAListTagSize(VAListTagSize) {}
Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
return IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
}
/// Compute the shadow address for a given va_arg.
Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_s");
}
/// Compute the shadow address for a given va_arg.
Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset,
unsigned ArgSize) {
// Make sure we don't overflow __msan_va_arg_tls.
if (ArgOffset + ArgSize > kParamTLSSize)
return nullptr;
return getShadowPtrForVAArgument(IRB, ArgOffset);
}
/// Compute the origin address for a given va_arg.
Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
Value *Base = IRB.CreatePointerCast(MS.VAArgOriginTLS, MS.IntptrTy);
// getOriginPtrForVAArgument() is always called after
// getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
// overflow.
Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_o");
}
void CleanUnusedTLS(IRBuilder<> &IRB, Value *ShadowBase,
unsigned BaseOffset) {
    // The tail of __msan_va_arg_tls is not large enough to fit the full
    // value shadow, but it will be copied to the backup anyway. Make it
    // clean.
if (BaseOffset >= kParamTLSSize)
return;
Value *TailSize =
ConstantInt::getSigned(IRB.getInt32Ty(), kParamTLSSize - BaseOffset);
IRB.CreateMemSet(ShadowBase, ConstantInt::getNullValue(IRB.getInt8Ty()),
TailSize, Align(8));
}
void unpoisonVAListTagForInst(IntrinsicInst &I) {
IRBuilder<> IRB(&I);
Value *VAListTag = I.getArgOperand(0);
const Align Alignment = Align(8);
auto [ShadowPtr, OriginPtr] = MSV.getShadowOriginPtr(
VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
// Unpoison the whole __va_list_tag.
IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
VAListTagSize, Alignment, false);
}
void visitVAStartInst(VAStartInst &I) override {
if (F.getCallingConv() == CallingConv::Win64)
return;
VAStartInstrumentationList.push_back(&I);
unpoisonVAListTagForInst(I);
}
void visitVACopyInst(VACopyInst &I) override {
if (F.getCallingConv() == CallingConv::Win64)
return;
unpoisonVAListTagForInst(I);
}
};
/// AMD64-specific implementation of VarArgHelper.
struct VarArgAMD64Helper : public VarArgHelperBase {
// An unfortunate workaround for asymmetric lowering of va_arg stuff.
// See a comment in visitCallBase for more details.
static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
static const unsigned AMD64FpEndOffsetSSE = 176;
// If SSE is disabled, fp_offset in va_list is zero.
static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
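// A sketch of where these constants come from in the SysV AMD64 ABI:
// the six GP argument registers (rdi, rsi, rdx, rcx, r8, r9) occupy the
// first 6 * 8 = 48 bytes of the register save area, followed by eight
// SSE registers (xmm0-xmm7) at 16 bytes each, ending at 48 + 128 = 176.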
unsigned AMD64FpEndOffset;
AllocaInst *VAArgTLSCopy = nullptr;
AllocaInst *VAArgTLSOriginCopy = nullptr;
Value *VAArgOverflowSize = nullptr;
enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV)
: VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/24) {
AMD64FpEndOffset = AMD64FpEndOffsetSSE;
for (const auto &Attr : F.getAttributes().getFnAttrs()) {
if (Attr.isStringAttribute() &&
(Attr.getKindAsString() == "target-features")) {
if (Attr.getValueAsString().contains("-sse"))
AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
break;
}
}
}
ArgKind classifyArgument(Value *arg) {
// A very rough approximation of X86_64 argument classification rules.
Type *T = arg->getType();
if (T->isX86_FP80Ty())
return AK_Memory;
if (T->isFPOrFPVectorTy())
return AK_FloatingPoint;
if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
return AK_GeneralPurpose;
if (T->isPointerTy())
return AK_GeneralPurpose;
return AK_Memory;
}
// For VarArg functions, store the argument shadow in an ABI-specific format
// that corresponds to va_list layout.
// We do this because Clang lowers va_arg in the frontend, and this pass
// only sees the low level code that deals with va_list internals.
// A much easier alternative (provided that Clang emits va_arg instructions)
// would have been to associate each live instance of va_list with a copy of
// MSanParamTLS, and extract shadow on va_arg() call in the argument list
// order.
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
unsigned GpOffset = 0;
unsigned FpOffset = AMD64GpEndOffset;
unsigned OverflowOffset = AMD64FpEndOffset;
const DataLayout &DL = F.getDataLayout();
for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
if (IsByVal) {
// ByVal arguments always go to the overflow area.
// Fixed arguments passed through the overflow area will be stepped
// over by va_start, so don't count them towards the offset.
if (IsFixed)
continue;
assert(A->getType()->isPointerTy());
Type *RealTy = CB.getParamByValType(ArgNo);
uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
uint64_t AlignedSize = alignTo(ArgSize, 8);
unsigned BaseOffset = OverflowOffset;
Value *ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
Value *OriginBase = nullptr;
if (MS.TrackOrigins)
OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
OverflowOffset += AlignedSize;
if (OverflowOffset > kParamTLSSize) {
CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
continue; // We have no space to copy shadow there.
}
Value *ShadowPtr, *OriginPtr;
std::tie(ShadowPtr, OriginPtr) =
MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), kShadowTLSAlignment,
/*isStore*/ false);
IRB.CreateMemCpy(ShadowBase, kShadowTLSAlignment, ShadowPtr,
kShadowTLSAlignment, ArgSize);
if (MS.TrackOrigins)
IRB.CreateMemCpy(OriginBase, kShadowTLSAlignment, OriginPtr,
kShadowTLSAlignment, ArgSize);
} else {
ArgKind AK = classifyArgument(A);
if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
AK = AK_Memory;
if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
AK = AK_Memory;
Value *ShadowBase, *OriginBase = nullptr;
switch (AK) {
case AK_GeneralPurpose:
ShadowBase = getShadowPtrForVAArgument(IRB, GpOffset);
if (MS.TrackOrigins)
OriginBase = getOriginPtrForVAArgument(IRB, GpOffset);
GpOffset += 8;
assert(GpOffset <= kParamTLSSize);
break;
case AK_FloatingPoint:
ShadowBase = getShadowPtrForVAArgument(IRB, FpOffset);
if (MS.TrackOrigins)
OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
FpOffset += 16;
assert(FpOffset <= kParamTLSSize);
break;
case AK_Memory:
if (IsFixed)
continue;
uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
uint64_t AlignedSize = alignTo(ArgSize, 8);
unsigned BaseOffset = OverflowOffset;
ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
if (MS.TrackOrigins) {
OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
}
OverflowOffset += AlignedSize;
if (OverflowOffset > kParamTLSSize) {
// We have no space to copy shadow there.
CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
continue;
}
}
// Take fixed arguments into account for GpOffset and FpOffset,
// but don't actually store shadows for them.
// TODO(glider): don't call get*PtrForVAArgument() for them.
if (IsFixed)
continue;
Value *Shadow = MSV.getShadow(A);
IRB.CreateAlignedStore(Shadow, ShadowBase, kShadowTLSAlignment);
if (MS.TrackOrigins) {
Value *Origin = MSV.getOrigin(A);
TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
std::max(kShadowTLSAlignment, kMinOriginAlignment));
}
}
}
Constant *OverflowSize =
ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AMD64FpEndOffset);
IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
}
void finalizeInstrumentation() override {
assert(!VAArgOverflowSize && !VAArgTLSCopy &&
"finalizeInstrumentation called twice");
if (!VAStartInstrumentationList.empty()) {
// If there is a va_start in this function, make a backup copy of
// va_arg_tls somewhere in the function entry block.
IRBuilder<> IRB(MSV.FnPrologueEnd);
VAArgOverflowSize =
IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
Value *CopySize = IRB.CreateAdd(
ConstantInt::get(MS.IntptrTy, AMD64FpEndOffset), VAArgOverflowSize);
VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
CopySize, kShadowTLSAlignment, false);
Value *SrcSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize,
ConstantInt::get(MS.IntptrTy, kParamTLSSize));
IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
kShadowTLSAlignment, SrcSize);
if (MS.TrackOrigins) {
VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
}
}
// Instrument va_start.
// Copy va_list shadow from the backup copy of the TLS contents.
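// For reference, the SysV AMD64 ABI defines va_list roughly as below;
// the field offsets are what the constant 8- and 16-byte adjustments in
// this loop assume:
//
//   typedef struct {
//     unsigned int gp_offset;   // offset 0
//     unsigned int fp_offset;   // offset 4
//     void *overflow_arg_area;  // offset 8
//     void *reg_save_area;      // offset 16
//   } va_list[1];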
for (CallInst *OrigInst : VAStartInstrumentationList) {
NextNodeIRBuilder IRB(OrigInst);
Value *VAListTag = OrigInst->getArgOperand(0);
Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
ConstantInt::get(MS.IntptrTy, 16)),
MS.PtrTy);
Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
const Align Alignment = Align(16);
std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
AMD64FpEndOffset);
if (MS.TrackOrigins)
IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
Alignment, AMD64FpEndOffset);
Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
ConstantInt::get(MS.IntptrTy, 8)),
MS.PtrTy);
Value *OverflowArgAreaPtr =
IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
AMD64FpEndOffset);
IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
VAArgOverflowSize);
if (MS.TrackOrigins) {
SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
AMD64FpEndOffset);
IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
VAArgOverflowSize);
}
}
}
};
/// AArch64-specific implementation of VarArgHelper.
struct VarArgAArch64Helper : public VarArgHelperBase {
static const unsigned kAArch64GrArgSize = 64;
static const unsigned kAArch64VrArgSize = 128;
static const unsigned AArch64GrBegOffset = 0;
static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
// Make VR space aligned to 16 bytes.
static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
static const unsigned AArch64VrEndOffset =
AArch64VrBegOffset + kAArch64VrArgSize;
static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
AllocaInst *VAArgTLSCopy = nullptr;
Value *VAArgOverflowSize = nullptr;
enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV)
: VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/32) {}
// A very rough approximation of aarch64 argument classification rules.
std::pair<ArgKind, uint64_t> classifyArgument(Type *T) {
if (T->isIntOrPtrTy() && T->getPrimitiveSizeInBits() <= 64)
return {AK_GeneralPurpose, 1};
if (T->isFloatingPointTy() && T->getPrimitiveSizeInBits() <= 128)
return {AK_FloatingPoint, 1};
if (T->isArrayTy()) {
auto R = classifyArgument(T->getArrayElementType());
R.second *= T->getScalarType()->getArrayNumElements();
return R;
}
if (const FixedVectorType *FV = dyn_cast<FixedVectorType>(T)) {
auto R = classifyArgument(FV->getScalarType());
R.second *= FV->getNumElements();
return R;
}
LLVM_DEBUG(errs() << "Unknown vararg type: " << *T << "\n");
return {AK_Memory, 0};
}
// The instrumentation stores the argument shadow in a non-ABI-specific
// format because it does not know which arguments are named (since Clang,
// as in the x86_64 case, lowers va_arg in the frontend, and this pass only
// sees the low-level code that deals with va_list internals).
// The first eight GR registers are saved in the first 64 bytes of the
// va_arg TLS array, followed by the first eight FP/SIMD registers, and
// then the remaining arguments.
// Using constant offsets within the va_arg TLS array allows fast copying
// in finalizeInstrumentation().
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
unsigned GrOffset = AArch64GrBegOffset;
unsigned VrOffset = AArch64VrBegOffset;
unsigned OverflowOffset = AArch64VAEndOffset;
const DataLayout &DL = F.getDataLayout();
for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
auto [AK, RegNum] = classifyArgument(A->getType());
if (AK == AK_GeneralPurpose &&
(GrOffset + RegNum * 8) > AArch64GrEndOffset)
AK = AK_Memory;
if (AK == AK_FloatingPoint &&
(VrOffset + RegNum * 16) > AArch64VrEndOffset)
AK = AK_Memory;
Value *Base;
switch (AK) {
case AK_GeneralPurpose:
Base = getShadowPtrForVAArgument(IRB, GrOffset);
GrOffset += 8 * RegNum;
break;
case AK_FloatingPoint:
Base = getShadowPtrForVAArgument(IRB, VrOffset);
VrOffset += 16 * RegNum;
break;
case AK_Memory:
// Don't count fixed arguments in the overflow area - va_start will
// skip right over them.
if (IsFixed)
continue;
uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
uint64_t AlignedSize = alignTo(ArgSize, 8);
unsigned BaseOffset = OverflowOffset;
Base = getShadowPtrForVAArgument(IRB, BaseOffset);
OverflowOffset += AlignedSize;
if (OverflowOffset > kParamTLSSize) {
// We have no space to copy shadow there.
CleanUnusedTLS(IRB, Base, BaseOffset);
continue;
}
break;
}
// Count Gp/Vr fixed arguments to their respective offsets, but don't
// bother to actually store a shadow.
if (IsFixed)
continue;
IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
}
Constant *OverflowSize =
ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AArch64VAEndOffset);
IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
}
// Retrieve a va_list field of 'void*' size.
Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
Value *SaveAreaPtrPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
ConstantInt::get(MS.IntptrTy, offset)),
MS.PtrTy);
return IRB.CreateLoad(Type::getInt64Ty(*MS.C), SaveAreaPtrPtr);
}
// Retrieve a va_list field of 'int' size.
Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
Value *SaveAreaPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
ConstantInt::get(MS.IntptrTy, offset)),
MS.PtrTy);
Value *SaveArea32 = IRB.CreateLoad(IRB.getInt32Ty(), SaveAreaPtr);
return IRB.CreateSExt(SaveArea32, MS.IntptrTy);
}
void finalizeInstrumentation() override {
assert(!VAArgOverflowSize && !VAArgTLSCopy &&
"finalizeInstrumentation called twice");
if (!VAStartInstrumentationList.empty()) {
// If there is a va_start in this function, make a backup copy of
// va_arg_tls somewhere in the function entry block.
IRBuilder<> IRB(MSV.FnPrologueEnd);
VAArgOverflowSize =
IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
Value *CopySize = IRB.CreateAdd(
ConstantInt::get(MS.IntptrTy, AArch64VAEndOffset), VAArgOverflowSize);
VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
CopySize, kShadowTLSAlignment, false);
Value *SrcSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize,
ConstantInt::get(MS.IntptrTy, kParamTLSSize));
IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
kShadowTLSAlignment, SrcSize);
}
Value *GrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64GrArgSize);
Value *VrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64VrArgSize);
// Instrument va_start, copy va_list shadow from the backup copy of
// the TLS contents.
for (CallInst *OrigInst : VAStartInstrumentationList) {
NextNodeIRBuilder IRB(OrigInst);
Value *VAListTag = OrigInst->getArgOperand(0);
// The variadic ABI for AArch64 creates two areas to save the incoming
// argument registers (one for the 64-bit general registers x0-x7 and
// another for the 128-bit FP/SIMD registers v0-v7).
// We then need to propagate the shadow arguments over both regions,
// 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs'.
// The remaining arguments are saved in the shadow for 'va::stack'.
// One caveat is that only the non-named arguments need to be propagated;
// however, the call-site instrumentation saves 'all' the arguments. So to
// copy the shadow values from the va_arg TLS array we need to adjust the
// offset for both the GR and VR fields based on the __{gr,vr}_offs value
// (since those offsets are set based on the incoming named arguments).
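// As a sketch (per the AAPCS64), the va_list this code walks is laid out
// as the offsets used by getVAField64/getVAField32 below assume:
//
//   struct __va_list {
//     void *__stack;    // offset 0:  next stack (overflow) argument
//     void *__gr_top;   // offset 8:  end of the GP register save area
//     void *__vr_top;   // offset 16: end of the FP/SIMD register save area
//     int __gr_offs;    // offset 24: negative offset from __gr_top
//     int __vr_offs;    // offset 28: negative offset from __vr_top
//   };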
Type *RegSaveAreaPtrTy = IRB.getPtrTy();
// Read the stack pointer from the va_list.
Value *StackSaveAreaPtr =
IRB.CreateIntToPtr(getVAField64(IRB, VAListTag, 0), RegSaveAreaPtrTy);
// Read both the __gr_top and __gr_off and add them up.
Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 8);
Value *GrOffSaveArea = getVAField32(IRB, VAListTag, 24);
Value *GrRegSaveAreaPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(GrTopSaveAreaPtr, GrOffSaveArea), RegSaveAreaPtrTy);
// Read both the __vr_top and __vr_off and add them up.
Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 16);
Value *VrOffSaveArea = getVAField32(IRB, VAListTag, 28);
Value *VrRegSaveAreaPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(VrTopSaveAreaPtr, VrOffSaveArea), RegSaveAreaPtrTy);
// We do not know how many named arguments are being used; at the call
// site, all the arguments were saved. Since __gr_offs is defined as
// '0 - ((8 - named_gr) * 8)', the idea is to propagate only the variadic
// arguments by skipping the bytes of shadow from the named arguments.
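// A worked example of the arithmetic below (a sketch, assuming two named
// GP arguments):
//   __gr_offs = 0 - ((8 - 2) * 8) = -48, so
//   GrRegSaveAreaShadowPtrOff = 64 + (-48) = 16, i.e. the first 16 bytes
//   of TLS shadow (the named arguments) are skipped, and
//   GrCopySize = 64 - 16 = 48 bytes of variadic GP shadow are copied.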
Value *GrRegSaveAreaShadowPtrOff =
IRB.CreateAdd(GrArgSize, GrOffSaveArea);
Value *GrRegSaveAreaShadowPtr =
MSV.getShadowOriginPtr(GrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
Align(8), /*isStore*/ true)
.first;
Value *GrSrcPtr =
IRB.CreateInBoundsPtrAdd(VAArgTLSCopy, GrRegSaveAreaShadowPtrOff);
Value *GrCopySize = IRB.CreateSub(GrArgSize, GrRegSaveAreaShadowPtrOff);
IRB.CreateMemCpy(GrRegSaveAreaShadowPtr, Align(8), GrSrcPtr, Align(8),
GrCopySize);
// Again, but for FP/SIMD values.
Value *VrRegSaveAreaShadowPtrOff =
IRB.CreateAdd(VrArgSize, VrOffSaveArea);
Value *VrRegSaveAreaShadowPtr =
MSV.getShadowOriginPtr(VrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
Align(8), /*isStore*/ true)
.first;
Value *VrSrcPtr = IRB.CreateInBoundsPtrAdd(
IRB.CreateInBoundsPtrAdd(VAArgTLSCopy,
IRB.getInt32(AArch64VrBegOffset)),
VrRegSaveAreaShadowPtrOff);
Value *VrCopySize = IRB.CreateSub(VrArgSize, VrRegSaveAreaShadowPtrOff);
IRB.CreateMemCpy(VrRegSaveAreaShadowPtr, Align(8), VrSrcPtr, Align(8),
VrCopySize);
// And finally for remaining arguments.
Value *StackSaveAreaShadowPtr =
MSV.getShadowOriginPtr(StackSaveAreaPtr, IRB, IRB.getInt8Ty(),
Align(16), /*isStore*/ true)
.first;
Value *StackSrcPtr = IRB.CreateInBoundsPtrAdd(
VAArgTLSCopy, IRB.getInt32(AArch64VAEndOffset));
IRB.CreateMemCpy(StackSaveAreaShadowPtr, Align(16), StackSrcPtr,
Align(16), VAArgOverflowSize);
}
}
};
/// PowerPC64-specific implementation of VarArgHelper.
struct VarArgPowerPC64Helper : public VarArgHelperBase {
AllocaInst *VAArgTLSCopy = nullptr;
Value *VAArgSize = nullptr;
VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV)
: VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/8) {}
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
// For PowerPC, we need to deal with the alignment of stack arguments -
// they are mostly aligned to 8 bytes, but vectors and i128 arrays
// are aligned to 16 bytes, and byvals can be aligned to 8 or 16 bytes.
// For that reason, we compute the current offset from the stack pointer
// (which is always properly aligned) and the offset of the first vararg,
// then subtract them.
unsigned VAArgBase;
Triple TargetTriple(F.getParent()->getTargetTriple());
// The parameter save area starts at 48 bytes from the frame pointer for
// ABIv1, and at 32 bytes for ABIv2. This is usually determined by the
// target endianness, but in theory it could be overridden by a function
// attribute.
if (TargetTriple.isPPC64ELFv2ABI())
VAArgBase = 32;
else
VAArgBase = 48;
unsigned VAArgOffset = VAArgBase;
const DataLayout &DL = F.getDataLayout();
for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
if (IsByVal) {
assert(A->getType()->isPointerTy());
Type *RealTy = CB.getParamByValType(ArgNo);
uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(8));
if (ArgAlign < 8)
ArgAlign = Align(8);
VAArgOffset = alignTo(VAArgOffset, ArgAlign);
if (!IsFixed) {
Value *Base =
getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
if (Base) {
Value *AShadowPtr, *AOriginPtr;
std::tie(AShadowPtr, AOriginPtr) =
MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
kShadowTLSAlignment, /*isStore*/ false);
IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
kShadowTLSAlignment, ArgSize);
}
}
VAArgOffset += alignTo(ArgSize, Align(8));
} else {
Value *Base;
uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
Align ArgAlign = Align(8);
if (A->getType()->isArrayTy()) {
// Arrays are aligned to element size, except for long double
// arrays, which are aligned to 8 bytes.
Type *ElementTy = A->getType()->getArrayElementType();
if (!ElementTy->isPPC_FP128Ty())
ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
} else if (A->getType()->isVectorTy()) {
// Vectors are naturally aligned.
ArgAlign = Align(ArgSize);
}
if (ArgAlign < 8)
ArgAlign = Align(8);
VAArgOffset = alignTo(VAArgOffset, ArgAlign);
if (DL.isBigEndian()) {
// Adjust the shadow for arguments with size < 8 to match the
// placement of bits on a big-endian system.
if (ArgSize < 8)
VAArgOffset += (8 - ArgSize);
}
if (!IsFixed) {
Base =
getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
if (Base)
IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
}
VAArgOffset += ArgSize;
VAArgOffset = alignTo(VAArgOffset, Align(8));
}
if (IsFixed)
VAArgBase = VAArgOffset;
}
Constant *TotalVAArgSize =
ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
// We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a
// new class member; i.e. it holds the total size of all VarArgs.
IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
}
void finalizeInstrumentation() override {
assert(!VAArgSize && !VAArgTLSCopy &&
"finalizeInstrumentation called twice");
IRBuilder<> IRB(MSV.FnPrologueEnd);
VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
Value *CopySize = VAArgSize;
if (!VAStartInstrumentationList.empty()) {
// If there is a va_start in this function, make a backup copy of
// va_arg_tls somewhere in the function entry block.
VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
CopySize, kShadowTLSAlignment, false);
Value *SrcSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize,
ConstantInt::get(IRB.getInt64Ty(), kParamTLSSize));
IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
kShadowTLSAlignment, SrcSize);
}
// Instrument va_start.
// Copy va_list shadow from the backup copy of the TLS contents.
for (CallInst *OrigInst : VAStartInstrumentationList) {
NextNodeIRBuilder IRB(OrigInst);
Value *VAListTag = OrigInst->getArgOperand(0);
Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
const Align Alignment = Align(IntptrSize);
std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
CopySize);
}
}
};
/// PowerPC32-specific implementation of VarArgHelper.
struct VarArgPowerPC32Helper : public VarArgHelperBase {
AllocaInst *VAArgTLSCopy = nullptr;
Value *VAArgSize = nullptr;
VarArgPowerPC32Helper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV)
: VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/12) {}
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
unsigned VAArgBase;
// The parameter save area is 8 bytes from the frame pointer on PPC32.
VAArgBase = 8;
unsigned VAArgOffset = VAArgBase;
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
if (IsByVal) {
assert(A->getType()->isPointerTy());
Type *RealTy = CB.getParamByValType(ArgNo);
uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
if (ArgAlign < IntptrSize)
ArgAlign = Align(IntptrSize);
VAArgOffset = alignTo(VAArgOffset, ArgAlign);
if (!IsFixed) {
Value *Base =
getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
if (Base) {
Value *AShadowPtr, *AOriginPtr;
std::tie(AShadowPtr, AOriginPtr) =
MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
kShadowTLSAlignment, /*isStore*/ false);
IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
kShadowTLSAlignment, ArgSize);
}
}
VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
} else {
Value *Base;
Type *ArgTy = A->getType();
// On PPC32, floating-point variable arguments are stored in a separate
// area: fp_save_area = reg_save_area + 4*8. We do not copy shadow for
// them, as they will be found when checking the call arguments.
if (!ArgTy->isFloatingPointTy()) {
uint64_t ArgSize = DL.getTypeAllocSize(ArgTy);
Align ArgAlign = Align(IntptrSize);
if (ArgTy->isArrayTy()) {
// Arrays are aligned to element size, except for long double
// arrays, which are aligned to 8 bytes.
Type *ElementTy = ArgTy->getArrayElementType();
if (!ElementTy->isPPC_FP128Ty())
ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
} else if (ArgTy->isVectorTy()) {
// Vectors are naturally aligned.
ArgAlign = Align(ArgSize);
}
if (ArgAlign < IntptrSize)
ArgAlign = Align(IntptrSize);
VAArgOffset = alignTo(VAArgOffset, ArgAlign);
if (DL.isBigEndian()) {
// Adjust the shadow for arguments with size < IntptrSize to match
// the placement of bits on a big-endian system.
if (ArgSize < IntptrSize)
VAArgOffset += (IntptrSize - ArgSize);
}
if (!IsFixed) {
Base = getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase,
ArgSize);
if (Base)
IRB.CreateAlignedStore(MSV.getShadow(A), Base,
kShadowTLSAlignment);
}
VAArgOffset += ArgSize;
VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
}
}
}
Constant *TotalVAArgSize =
ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
// We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a
// new class member; i.e. it holds the total size of all VarArgs.
IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
}
void finalizeInstrumentation() override {
assert(!VAArgSize && !VAArgTLSCopy &&
"finalizeInstrumentation called twice");
IRBuilder<> IRB(MSV.FnPrologueEnd);
VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
Value *CopySize = VAArgSize;
if (!VAStartInstrumentationList.empty()) {
// If there is a va_start in this function, make a backup copy of
// va_arg_tls somewhere in the function entry block.
VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
CopySize, kShadowTLSAlignment, false);
Value *SrcSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize,
ConstantInt::get(MS.IntptrTy, kParamTLSSize));
IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
kShadowTLSAlignment, SrcSize);
}
// Instrument va_start.
// Copy va_list shadow from the backup copy of the TLS contents.
for (CallInst *OrigInst : VAStartInstrumentationList) {
NextNodeIRBuilder IRB(OrigInst);
Value *VAListTag = OrigInst->getArgOperand(0);
Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
Value *RegSaveAreaSize = CopySize;
// On PPC32, va_list_tag is a struct.
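// Roughly (per the SVR4 PPC ABI), the 12-byte va_list_tag read here is
// laid out as the +4 and +8 adjustments below assume:
//
//   typedef struct {
//     unsigned char gpr;        // offset 0
//     unsigned char fpr;        // offset 1
//     unsigned short reserved;  // offset 2
//     void *overflow_arg_area;  // offset 4
//     void *reg_save_area;      // offset 8
//   } va_list[1];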
RegSaveAreaPtrPtr =
IRB.CreateAdd(RegSaveAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 8));
// On PPC32, reg_save_area can only hold 32 bytes of data.
RegSaveAreaSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize, ConstantInt::get(MS.IntptrTy, 32));
RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
const Align Alignment = Align(IntptrSize);
{ // Copy reg save area
Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy,
Alignment, RegSaveAreaSize);
RegSaveAreaShadowPtr =
IRB.CreatePtrToInt(RegSaveAreaShadowPtr, MS.IntptrTy);
Value *FPSaveArea = IRB.CreateAdd(RegSaveAreaShadowPtr,
ConstantInt::get(MS.IntptrTy, 32));
FPSaveArea = IRB.CreateIntToPtr(FPSaveArea, MS.PtrTy);
// We fill the FP shadow with zeroes, as uninitialized FP args should
// have been caught during the call-base check.
IRB.CreateMemSet(FPSaveArea, ConstantInt::getNullValue(IRB.getInt8Ty()),
ConstantInt::get(MS.IntptrTy, 32), Alignment);
}
{ // Copy overflow area
// RegSaveAreaSize is min(CopySize, 32) -> no overflow can occur
Value *OverflowAreaSize = IRB.CreateSub(CopySize, RegSaveAreaSize);
Value *OverflowAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
OverflowAreaPtrPtr =
IRB.CreateAdd(OverflowAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 4));
OverflowAreaPtrPtr = IRB.CreateIntToPtr(OverflowAreaPtrPtr, MS.PtrTy);
Value *OverflowAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowAreaPtrPtr);
Value *OverflowAreaShadowPtr, *OverflowAreaOriginPtr;
std::tie(OverflowAreaShadowPtr, OverflowAreaOriginPtr) =
MSV.getShadowOriginPtr(OverflowAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
Value *OverflowVAArgTLSCopyPtr =
IRB.CreatePtrToInt(VAArgTLSCopy, MS.IntptrTy);
OverflowVAArgTLSCopyPtr =
IRB.CreateAdd(OverflowVAArgTLSCopyPtr, RegSaveAreaSize);
OverflowVAArgTLSCopyPtr =
IRB.CreateIntToPtr(OverflowVAArgTLSCopyPtr, MS.PtrTy);
IRB.CreateMemCpy(OverflowAreaShadowPtr, Alignment,
OverflowVAArgTLSCopyPtr, Alignment, OverflowAreaSize);
}
}
}
};
/// SystemZ-specific implementation of VarArgHelper.
struct VarArgSystemZHelper : public VarArgHelperBase {
static const unsigned SystemZGpOffset = 16;
static const unsigned SystemZGpEndOffset = 56;
static const unsigned SystemZFpOffset = 128;
static const unsigned SystemZFpEndOffset = 160;
static const unsigned SystemZMaxVrArgs = 8;
static const unsigned SystemZRegSaveAreaSize = 160;
static const unsigned SystemZOverflowOffset = 160;
static const unsigned SystemZVAListTagSize = 32;
static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
static const unsigned SystemZRegSaveAreaPtrOffset = 24;
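// A sketch of where these constants come from in the s390x ELF ABI:
// within the 160-byte register save area, the GPR argument registers
// r2-r6 are saved at offsets 16-56 (8 bytes each, starting at r0), the
// FPR argument registers f0/f2/f4/f6 at offsets 128-160, and overflow
// arguments start at offset 160.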
bool IsSoftFloatABI;
AllocaInst *VAArgTLSCopy = nullptr;
AllocaInst *VAArgTLSOriginCopy = nullptr;
Value *VAArgOverflowSize = nullptr;
enum class ArgKind {
GeneralPurpose,
FloatingPoint,
Vector,
Memory,
Indirect,
};
enum class ShadowExtension { None, Zero, Sign };
VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV)
: VarArgHelperBase(F, MS, MSV, SystemZVAListTagSize),
IsSoftFloatABI(F.getFnAttribute("use-soft-float").getValueAsBool()) {}
ArgKind classifyArgument(Type *T) {
// T is a SystemZABIInfo::classifyArgumentType() output, and there are
// only a few possibilities of what it can be. In particular, enums, single
// element structs and large types have already been taken care of.
// Some i128 and fp128 arguments are converted to pointers only in the
// back end.
if (T->isIntegerTy(128) || T->isFP128Ty())
return ArgKind::Indirect;
if (T->isFloatingPointTy())
return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
if (T->isIntegerTy() || T->isPointerTy())
return ArgKind::GeneralPurpose;
if (T->isVectorTy())
return ArgKind::Vector;
return ArgKind::Memory;
}
ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
// ABI says: "One of the simple integer types no more than 64 bits wide.
// ... If such an argument is shorter than 64 bits, replace it by a full
// 64-bit integer representing the same number, using sign or zero
// extension". Shadow for an integer argument has the same type as the
// argument itself, so it can be sign or zero extended as well.
bool ZExt = CB.paramHasAttr(ArgNo, Attribute::ZExt);
bool SExt = CB.paramHasAttr(ArgNo, Attribute::SExt);
if (ZExt) {
assert(!SExt);
return ShadowExtension::Zero;
}
if (SExt) {
assert(!ZExt);
return ShadowExtension::Sign;
}
return ShadowExtension::None;
}
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
unsigned GpOffset = SystemZGpOffset;
unsigned FpOffset = SystemZFpOffset;
unsigned VrIndex = 0;
unsigned OverflowOffset = SystemZOverflowOffset;
const DataLayout &DL = F.getDataLayout();
for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
// SystemZABIInfo does not produce ByVal parameters.
assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
Type *T = A->getType();
ArgKind AK = classifyArgument(T);
if (AK == ArgKind::Indirect) {
T = MS.PtrTy;
AK = ArgKind::GeneralPurpose;
}
if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
AK = ArgKind::Memory;
if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
AK = ArgKind::Memory;
if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
AK = ArgKind::Memory;
Value *ShadowBase = nullptr;
Value *OriginBase = nullptr;
ShadowExtension SE = ShadowExtension::None;
switch (AK) {
case ArgKind::GeneralPurpose: {
// Always keep track of GpOffset, but store shadow only for varargs.
uint64_t ArgSize = 8;
if (GpOffset + ArgSize <= kParamTLSSize) {
if (!IsFixed) {
SE = getShadowExtension(CB, ArgNo);
uint64_t GapSize = 0;
if (SE == ShadowExtension::None) {
uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
assert(ArgAllocSize <= ArgSize);
GapSize = ArgSize - ArgAllocSize;
}
ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
if (MS.TrackOrigins)
OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
}
GpOffset += ArgSize;
} else {
GpOffset = kParamTLSSize;
}
break;
}
case ArgKind::FloatingPoint: {
// Always keep track of FpOffset, but store shadow only for varargs.
uint64_t ArgSize = 8;
if (FpOffset + ArgSize <= kParamTLSSize) {
if (!IsFixed) {
// PoP says: "A short floating-point datum requires only the
// left-most 32 bit positions of a floating-point register".
// Therefore, in contrast to AK_GeneralPurpose and AK_Memory,
// don't extend shadow and don't mind the gap.
ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
if (MS.TrackOrigins)
OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
}
FpOffset += ArgSize;
} else {
FpOffset = kParamTLSSize;
}
break;
}
case ArgKind::Vector: {
// Keep track of VrIndex. No need to store shadow, since vector varargs
// go through AK_Memory.
assert(IsFixed);
VrIndex++;
break;
}
case ArgKind::Memory: {
// Keep track of OverflowOffset and store shadow only for varargs.
// Ignore fixed args, since we need to copy only the vararg portion of
// the overflow area shadow.
if (!IsFixed) {
uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
uint64_t ArgSize = alignTo(ArgAllocSize, 8);
if (OverflowOffset + ArgSize <= kParamTLSSize) {
SE = getShadowExtension(CB, ArgNo);
uint64_t GapSize =
SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
ShadowBase =
getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
if (MS.TrackOrigins)
OriginBase =
getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
OverflowOffset += ArgSize;
} else {
OverflowOffset = kParamTLSSize;
}
}
break;
}
case ArgKind::Indirect:
llvm_unreachable("Indirect must be converted to GeneralPurpose");
}
if (ShadowBase == nullptr)
continue;
Value *Shadow = MSV.getShadow(A);
if (SE != ShadowExtension::None)
Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
/*Signed*/ SE == ShadowExtension::Sign);
ShadowBase = IRB.CreateIntToPtr(ShadowBase, MS.PtrTy, "_msarg_va_s");
IRB.CreateStore(Shadow, ShadowBase);
if (MS.TrackOrigins) {
Value *Origin = MSV.getOrigin(A);
TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
kMinOriginAlignment);
}
}
Constant *OverflowSize = ConstantInt::get(
IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
}
void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(
IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
MS.PtrTy);
Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
const Align Alignment = Align(8);
std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
/*isStore*/ true);
// TODO(iii): copy only fragments filled by visitCallBase()
// TODO(iii): support packed-stack && !use-soft-float
// For use-soft-float functions, it is enough to copy just the GPRs.
unsigned RegSaveAreaSize =
IsSoftFloatABI ? SystemZGpEndOffset : SystemZRegSaveAreaSize;
IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
RegSaveAreaSize);
if (MS.TrackOrigins)
IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
Alignment, RegSaveAreaSize);
}
  // FIXME: This implementation limits OverflowOffset to kParamTLSSize, so we
  // don't know the real overflow size and can't clear shadow beyond
  // kParamTLSSize.
void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
IRB.CreateAdd(
IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
MS.PtrTy);
Value *OverflowArgAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
const Align Alignment = Align(8);
std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
SystemZOverflowOffset);
IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
VAArgOverflowSize);
if (MS.TrackOrigins) {
SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
SystemZOverflowOffset);
IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
VAArgOverflowSize);
}
}
void finalizeInstrumentation() override {
assert(!VAArgOverflowSize && !VAArgTLSCopy &&
"finalizeInstrumentation called twice");
if (!VAStartInstrumentationList.empty()) {
// If there is a va_start in this function, make a backup copy of
// va_arg_tls somewhere in the function entry block.
IRBuilder<> IRB(MSV.FnPrologueEnd);
VAArgOverflowSize =
IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
Value *CopySize =
IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
VAArgOverflowSize);
VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
CopySize, kShadowTLSAlignment, false);
Value *SrcSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize,
ConstantInt::get(MS.IntptrTy, kParamTLSSize));
IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
kShadowTLSAlignment, SrcSize);
if (MS.TrackOrigins) {
VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
}
}
// Instrument va_start.
// Copy va_list shadow from the backup copy of the TLS contents.
for (CallInst *OrigInst : VAStartInstrumentationList) {
NextNodeIRBuilder IRB(OrigInst);
Value *VAListTag = OrigInst->getArgOperand(0);
copyRegSaveArea(IRB, VAListTag);
copyOverflowArea(IRB, VAListTag);
}
}
};
/// i386-specific implementation of VarArgHelper.
struct VarArgI386Helper : public VarArgHelperBase {
AllocaInst *VAArgTLSCopy = nullptr;
Value *VAArgSize = nullptr;
VarArgI386Helper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV)
: VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/4) {}
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
unsigned VAArgOffset = 0;
for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
if (IsByVal) {
assert(A->getType()->isPointerTy());
Type *RealTy = CB.getParamByValType(ArgNo);
uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
if (ArgAlign < IntptrSize)
ArgAlign = Align(IntptrSize);
VAArgOffset = alignTo(VAArgOffset, ArgAlign);
if (!IsFixed) {
Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
if (Base) {
Value *AShadowPtr, *AOriginPtr;
std::tie(AShadowPtr, AOriginPtr) =
MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
kShadowTLSAlignment, /*isStore*/ false);
IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
kShadowTLSAlignment, ArgSize);
}
VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
}
} else {
Value *Base;
uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
Align ArgAlign = Align(IntptrSize);
VAArgOffset = alignTo(VAArgOffset, ArgAlign);
if (DL.isBigEndian()) {
        // Adjust the shadow for arguments with size < IntptrSize to match
        // the placement of bits on big-endian systems.
if (ArgSize < IntptrSize)
VAArgOffset += (IntptrSize - ArgSize);
}
if (!IsFixed) {
Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
if (Base)
IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
VAArgOffset += ArgSize;
VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
}
}
}
Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
    // Here we reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating
    // a new class member, i.e. it stores the total size of all VarArgs.
IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
}
void finalizeInstrumentation() override {
assert(!VAArgSize && !VAArgTLSCopy &&
"finalizeInstrumentation called twice");
IRBuilder<> IRB(MSV.FnPrologueEnd);
VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
Value *CopySize = VAArgSize;
if (!VAStartInstrumentationList.empty()) {
// If there is a va_start in this function, make a backup copy of
// va_arg_tls somewhere in the function entry block.
VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
CopySize, kShadowTLSAlignment, false);
Value *SrcSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize,
ConstantInt::get(MS.IntptrTy, kParamTLSSize));
IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
kShadowTLSAlignment, SrcSize);
}
// Instrument va_start.
// Copy va_list shadow from the backup copy of the TLS contents.
for (CallInst *OrigInst : VAStartInstrumentationList) {
NextNodeIRBuilder IRB(OrigInst);
Value *VAListTag = OrigInst->getArgOperand(0);
Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
      Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
          IRB.CreatePtrToInt(VAListTag, MS.IntptrTy), RegSaveAreaPtrTy);
Value *RegSaveAreaPtr =
IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
const Align Alignment = Align(IntptrSize);
std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
CopySize);
}
}
};
/// Implementation of VarArgHelper that is used for ARM32, MIPS, RISCV,
/// LoongArch64.
struct VarArgGenericHelper : public VarArgHelperBase {
AllocaInst *VAArgTLSCopy = nullptr;
Value *VAArgSize = nullptr;
VarArgGenericHelper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV, const unsigned VAListTagSize)
: VarArgHelperBase(F, MS, MSV, VAListTagSize) {}
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
unsigned VAArgOffset = 0;
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
if (IsFixed)
continue;
uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
if (DL.isBigEndian()) {
        // Adjust the shadow for arguments with size < IntptrSize to match the
        // placement of bits on big-endian systems.
if (ArgSize < IntptrSize)
VAArgOffset += (IntptrSize - ArgSize);
}
Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
VAArgOffset += ArgSize;
VAArgOffset = alignTo(VAArgOffset, IntptrSize);
if (!Base)
continue;
IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
}
Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
    // Here we reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating
    // a new class member, i.e. it stores the total size of all VarArgs.
IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
}
void finalizeInstrumentation() override {
assert(!VAArgSize && !VAArgTLSCopy &&
"finalizeInstrumentation called twice");
IRBuilder<> IRB(MSV.FnPrologueEnd);
VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
Value *CopySize = VAArgSize;
if (!VAStartInstrumentationList.empty()) {
// If there is a va_start in this function, make a backup copy of
// va_arg_tls somewhere in the function entry block.
VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
CopySize, kShadowTLSAlignment, false);
Value *SrcSize = IRB.CreateBinaryIntrinsic(
Intrinsic::umin, CopySize,
ConstantInt::get(MS.IntptrTy, kParamTLSSize));
IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
kShadowTLSAlignment, SrcSize);
}
// Instrument va_start.
// Copy va_list shadow from the backup copy of the TLS contents.
for (CallInst *OrigInst : VAStartInstrumentationList) {
NextNodeIRBuilder IRB(OrigInst);
Value *VAListTag = OrigInst->getArgOperand(0);
Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
      Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
          IRB.CreatePtrToInt(VAListTag, MS.IntptrTy), RegSaveAreaPtrTy);
Value *RegSaveAreaPtr =
IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
const DataLayout &DL = F.getDataLayout();
unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
const Align Alignment = Align(IntptrSize);
std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
Alignment, /*isStore*/ true);
IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
CopySize);
}
}
};
// ARM32, LoongArch64, MIPS and RISCV share the same calling conventions
// regarding VAArgs.
using VarArgARM32Helper = VarArgGenericHelper;
using VarArgRISCVHelper = VarArgGenericHelper;
using VarArgMIPSHelper = VarArgGenericHelper;
using VarArgLoongArch64Helper = VarArgGenericHelper;
/// A no-op implementation of VarArgHelper.
struct VarArgNoOpHelper : public VarArgHelper {
VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
MemorySanitizerVisitor &MSV) {}
void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}
void visitVAStartInst(VAStartInst &I) override {}
void visitVACopyInst(VACopyInst &I) override {}
void finalizeInstrumentation() override {}
};
} // end anonymous namespace
static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
MemorySanitizerVisitor &Visitor) {
  // VarArg handling is implemented only for the targets below; on other
  // platforms the no-op helper is used, and false positives are possible.
Triple TargetTriple(Func.getParent()->getTargetTriple());
if (TargetTriple.getArch() == Triple::x86)
return new VarArgI386Helper(Func, Msan, Visitor);
if (TargetTriple.getArch() == Triple::x86_64)
return new VarArgAMD64Helper(Func, Msan, Visitor);
if (TargetTriple.isARM())
return new VarArgARM32Helper(Func, Msan, Visitor, /*VAListTagSize=*/4);
if (TargetTriple.isAArch64())
return new VarArgAArch64Helper(Func, Msan, Visitor);
if (TargetTriple.isSystemZ())
return new VarArgSystemZHelper(Func, Msan, Visitor);
// On PowerPC32 VAListTag is a struct
// {char, char, i16 padding, char *, char *}
if (TargetTriple.isPPC32())
return new VarArgPowerPC32Helper(Func, Msan, Visitor);
if (TargetTriple.isPPC64())
return new VarArgPowerPC64Helper(Func, Msan, Visitor);
if (TargetTriple.isRISCV32())
return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
if (TargetTriple.isRISCV64())
return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
if (TargetTriple.isMIPS32())
return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
if (TargetTriple.isMIPS64())
return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
if (TargetTriple.isLoongArch64())
return new VarArgLoongArch64Helper(Func, Msan, Visitor,
/*VAListTagSize=*/8);
return new VarArgNoOpHelper(Func, Msan, Visitor);
}
bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
if (!CompileKernel && F.getName() == kMsanModuleCtorName)
return false;
if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
return false;
MemorySanitizerVisitor Visitor(F, *this, TLI);
// Clear out memory attributes.
AttributeMask B;
B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
F.removeFnAttrs(B);
return Visitor.runOnFunction();
}