llvm-project/llvm/lib/Target/X86/X86AsmPrinter.h
sivadeilra b933f0c376
Fix Windows EH IP2State tables (remove +1 bias) (#144745)
This changes how LLVM constructs certain data structures that relate to
exception handling (EH) on Windows. Specifically this changes how
IP2State tables for functions are constructed. The purpose of this
change is to align LLVM with the requirements of the Windows AMD64 ABI, which
requires that the IP2State table entries point to the boundaries between
instructions.

On most Windows platforms (AMD64, ARM64, ARM32, IA64, but *not* x86-32),
exception handling works by looking up instruction pointers in lookup
tables. These lookup tables are stored in `.xdata` sections in
executables. One component of these lookup tables is the `IP2State` table
(Instruction Pointer to State).

If a function has any instructions that require cleanup during exception
unwinding, then it will have an IP2State table. Each entry in the
IP2State table describes a range of bytes in the function's instruction
stream, and associates an "EH state number" with that range of
instructions. A value of -1 means "the null state", which does not
require any code to execute. A value other than -1 is an index into the
State table.
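
The lookup described above can be sketched as follows. This is only an illustration: `Ip2StateEntry`, `lookupState`, and `exampleTable` are hypothetical names, the offsets are made up, and the real `.xdata` encoding differs from this in-memory form. The sketch assumes that entries are sorted by offset and that each entry covers the bytes up to the start of the next entry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical in-memory form of one IP2State entry (not the .xdata layout).
struct Ip2StateEntry {
  uint32_t StartOffset; // byte offset from the function start
  int32_t State;        // -1 is the null state, otherwise a State table index
};

constexpr int32_t NullState = -1;

// Returns the EH state that covers Offset. Assumes Table is sorted by
// StartOffset and that an entry covers [StartOffset, next StartOffset).
int32_t lookupState(const std::vector<Ip2StateEntry> &Table, uint32_t Offset) {
  assert(std::is_sorted(Table.begin(), Table.end(),
                        [](const Ip2StateEntry &A, const Ip2StateEntry &B) {
                          return A.StartOffset < B.StartOffset;
                        }));
  // Find the first entry that starts after Offset, then step back one.
  auto It = std::upper_bound(
      Table.begin(), Table.end(), Offset,
      [](uint32_t Off, const Ip2StateEntry &E) { return Off < E.StartOffset; });
  if (It == Table.begin())
    return NullState; // Offset lies before the first entry
  return std::prev(It)->State;
}

// Made-up table: null state, then state 0, then state 1, then null again.
std::vector<Ip2StateEntry> exampleTable() {
  return {{0x00, NullState}, {0x10, 0}, {0x24, 1}, {0x30, NullState}};
}
```

With this table, `lookupState(exampleTable(), 0x23)` yields state 0 and `lookupState(exampleTable(), 0x24)` yields state 1, so a one-byte difference in the looked-up offset selects a different cleanup.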

The entries in the IP2State table contain byte offsets within the
instruction stream of the function. The Windows ABI requires that these
offsets are aligned to instruction boundaries; they are not permitted to
point to a byte that is not the first byte of an instruction.

Unfortunately, CALL instructions present a problem during unwinding.
CALL instructions push the address of the instruction after the CALL
instruction, so that execution can resume after the CALL. If the CALL is
the last instruction within an IP2State region, then the return address
(on the stack) points to the *next* IP2State region. This means that the
unwinder will use the wrong cleanup funclet during unwinding.
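
A small worked example may make the problem concrete. Everything here is hypothetical (the `Entry` and `CallSite` types, the offsets, and the 5-byte CALL size are chosen for illustration); the point is only that the return address of a CALL that ends a region equals the start offset of the next region.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Entry {
  uint32_t Start; // byte offset where the region begins
  int State;      // EH state for the region, -1 = null state
};

// Linear lookup: the last entry whose Start is <= Off wins.
int stateAt(const std::vector<Entry> &Table, uint32_t Off) {
  int State = -1;
  for (const Entry &E : Table) {
    if (E.Start > Off)
      break;
    State = E.State;
  }
  return State;
}

struct CallSite {
  uint32_t Offset; // offset of the first byte of the CALL
  uint32_t Size;   // encoded size of the CALL in bytes
};

// The address a CALL pushes is the address of the following instruction.
uint32_t returnAddress(CallSite C) { return C.Offset + C.Size; }

// Made-up function: state 0 covers [0x10, 0x24), state 1 starts at 0x24.
std::vector<Entry> table() { return {{0x00, -1}, {0x10, 0}, {0x24, 1}}; }

// A 5-byte CALL at 0x1F is the last instruction of the state-0 region.
CallSite lastCallInState0() { return {0x1F, 5}; }
```

Here the CALL itself lies in state 0 (`stateAt(table(), 0x1F) == 0`), but the pushed return address is 0x24, and `stateAt(table(), 0x24)` is 1, which is the state of the following region.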

To fix this problem, compilers should insert a NOP after a CALL
instruction, if the CALL instruction is the last instruction within an
IP2State region. The NOP is placed within the same IP2State region as
the CALL, so that the return address points to the NOP and the unwinder
will locate the correct region.

This PR modifies LLVM so that it inserts NOP instructions after CALL
instructions, when needed. In performance tests, the NOP has no
detectable impact. The NOP is rarely inserted, since it is only
inserted when the CALL is the last instruction before an IP2State
transition or the CALL is the last instruction before the function
epilogue.
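
The insertion rule can be modelled with a toy layout pass. This is a sketch under stated assumptions, not the code in this PR: `Insn`, `Layout`, `layOut`, and `stateAt` are hypothetical names, `Body` stands for the instructions that precede the epilogue, and the instruction sizes are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Insn {
  unsigned Size; // encoded size in bytes
  bool IsCall;   // true for CALL instructions
  int State;     // EH state active for this instruction, -1 = null state
};

struct Entry {
  uint32_t Start;
  int State;
};

struct Layout {
  std::vector<Entry> Ip2State;
  uint32_t CodeSize = 0;
  unsigned NopsInserted = 0;
};

// Lays out Body and records a table entry at each state change. A one-byte
// NOP is added after a CALL only when the CALL is the last instruction of its
// region or the last instruction of Body (i.e. just before the epilogue).
Layout layOut(const std::vector<Insn> &Body) {
  Layout L;
  int Current = -1; // assume the function starts in the null state
  for (std::size_t I = 0; I < Body.size(); ++I) {
    const Insn &In = Body[I];
    if (In.State != Current) {
      L.Ip2State.push_back({L.CodeSize, In.State});
      Current = In.State;
    }
    L.CodeSize += In.Size;
    bool LastInRegion = I + 1 == Body.size() || Body[I + 1].State != In.State;
    if (In.IsCall && LastInRegion) {
      L.CodeSize += 1; // the NOP belongs to the same region as the CALL
      ++L.NopsInserted;
    }
  }
  return L;
}

int stateAt(const std::vector<Entry> &Table, uint32_t Off) {
  int State = -1;
  for (const Entry &E : Table) {
    if (E.Start > Off)
      break;
    State = E.State;
  }
  return State;
}

// Made-up body: the CALL in state 0 ends its region; the final CALL ends Body.
std::vector<Insn> exampleBody() {
  return {{4, false, -1}, {5, true, 0}, {3, false, 1}, {5, true, 1}};
}
```

In `exampleBody()`, the first CALL occupies offsets 4 to 8, so its return address is 9. Because the NOP at offset 9 stays in the state-0 region, `stateAt(layOut(exampleBody()).Ip2State, 9)` is 0, and the state-1 region starts at offset 10. A CALL that is followed by another instruction in the same state gets no NOP in this model.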

NOP padding is only necessary on Windows AMD64 targets. On ARM64 and
ARM32, instructions have a fixed size so the unwinder knows how to "back
up" by one instruction.

Interaction with Import Call Optimization (ICO):

Import Call Optimization (ICO) is a compiler + OS feature on Windows
which improves the performance and security of DLL imports. ICO relies
on using a specific CALL idiom that can be replaced by the OS DLL
loader. This removes a load and indirect CALL and replaces it with a
single direct CALL.

To achieve this, ICO also inserts NOPs after the CALL instruction. If
the end of the CALL is aligned with an EH state transition, we *also*
insert a single-byte NOP. **Both forms of NOPs must be preserved.** They
cannot be combined into a single larger NOP; nor can the second NOP be
removed.

This is necessary because, if ICO is active and the call site is
modified by the loader, the loader will end up overwriting the NOPs that
were inserted for ICO. That means that those NOPs cannot be used for the
correct termination of the exception handling region (the IP2State
transition), so we still need an additional NOP instruction. The NOPs
cannot be combined into a longer NOP (which is ordinarily desirable)
because then ICO would split one instruction, producing a malformed
instruction after the ICO call.
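
The reason the two NOPs must stay separate can be shown with a toy byte-layout model. The sizes below are assumptions chosen for illustration (a 6-byte CALL, a 5-byte ICO padding NOP, a 1-byte EH NOP), and `Insn`, `separateNops`, `mergedNop`, and `splitsInstruction` are hypothetical names, not part of LLVM or the loader.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Insn {
  std::string Text;
  unsigned Size; // encoded size in bytes
};

// Assumed sizes, for illustration only.
constexpr unsigned CallSize = 6;   // e.g. an indirect CALL through an import slot
constexpr unsigned IcoNopSize = 5; // hypothetical length of the ICO padding
constexpr unsigned EhNopSize = 1;  // single-byte NOP that ends the EH region

// Number of bytes the loader is assumed to rewrite at the call site.
constexpr unsigned PatchedBytes = CallSize + IcoNopSize;

// Layout with the two NOPs kept as separate instructions.
std::vector<Insn> separateNops() {
  return {{"call (import)", CallSize},
          {"nop (ICO padding)", IcoNopSize},
          {"nop (EH region end)", EhNopSize}};
}

// Layout in which both NOPs were combined into one longer NOP.
std::vector<Insn> mergedNop() {
  return {{"call (import)", CallSize},
          {"nop (combined)", IcoNopSize + EhNopSize}};
}

// True if byte offset Off, measured from the start of the sequence, falls
// strictly inside an instruction rather than on an instruction boundary.
bool splitsInstruction(const std::vector<Insn> &Seq, unsigned Off) {
  unsigned Pos = 0;
  for (const Insn &I : Seq) {
    if (Off > Pos && Off < Pos + I.Size)
      return true;
    Pos += I.Size;
  }
  return false;
}
```

With separate NOPs, the rewritten range ends exactly on the boundary before the EH NOP, so `splitsInstruction(separateNops(), PatchedBytes)` is false and the EH NOP survives intact. With a combined NOP, `splitsInstruction(mergedNop(), PatchedBytes)` is true: the rewrite would end in the middle of the longer NOP and leave a stray trailing byte.
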
2025-07-22 09:18:13 -07:00

202 lines
7.8 KiB
C++

//===-- X86AsmPrinter.h - X86 implementation of AsmPrinter ------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_LIB_TARGET_X86_X86ASMPRINTER_H
#define LLVM_LIB_TARGET_X86_X86ASMPRINTER_H

#include "llvm/CodeGen/AsmPrinter.h"
#include "llvm/CodeGen/FaultMaps.h"
#include "llvm/CodeGen/StackMaps.h"

// Implemented in X86MCInstLower.cpp
namespace {
  class X86MCInstLower;
}

namespace llvm {
class MCCodeEmitter;
class MCStreamer;
class X86Subtarget;
class TargetMachine;

class LLVM_LIBRARY_VISIBILITY X86AsmPrinter : public AsmPrinter {
public:
  static char ID;

private:
  const X86Subtarget *Subtarget = nullptr;
  FaultMaps FM;
  std::unique_ptr<MCCodeEmitter> CodeEmitter;
  bool EmitFPOData = false;
  bool ShouldEmitWeakSwiftAsyncExtendedFramePointerFlags = false;
  bool IndCSPrefix = false;
  bool EnableImportCallOptimization = false;

  enum ImportCallKind : unsigned {
    IMAGE_RETPOLINE_AMD64_IMPORT_BR = 0x02,
    IMAGE_RETPOLINE_AMD64_IMPORT_CALL = 0x03,
    IMAGE_RETPOLINE_AMD64_INDIR_BR = 0x04,
    IMAGE_RETPOLINE_AMD64_INDIR_CALL = 0x05,
    IMAGE_RETPOLINE_AMD64_INDIR_BR_REX = 0x06,
    IMAGE_RETPOLINE_AMD64_CFG_BR = 0x08,
    IMAGE_RETPOLINE_AMD64_CFG_CALL = 0x09,
    IMAGE_RETPOLINE_AMD64_CFG_BR_REX = 0x0A,
    IMAGE_RETPOLINE_AMD64_SWITCHTABLE_FIRST = 0x010,
    IMAGE_RETPOLINE_AMD64_SWITCHTABLE_LAST = 0x01F,
  };

  struct ImportCallInfo {
    MCSymbol *CalleeSymbol;
    ImportCallKind Kind;
  };
  DenseMap<MCSection *, std::vector<ImportCallInfo>>
      SectionToImportedFunctionCalls;

  // This utility class tracks the length of a stackmap instruction's 'shadow'.
  // It is used by the X86AsmPrinter to ensure that the stackmap shadow
  // invariants (i.e. no other stackmaps, patchpoints, or control flow within
  // the shadow) are met, while outputting a minimal number of NOPs for padding.
  //
  // To minimise the number of NOPs used, the shadow tracker counts the number
  // of instruction bytes output since the last stackmap. Only if there are too
  // few instruction bytes to cover the shadow are NOPs used for padding.
  class StackMapShadowTracker {
  public:
    void startFunction(MachineFunction &MF) {
      this->MF = &MF;
    }
    void count(const MCInst &Inst, const MCSubtargetInfo &STI,
               MCCodeEmitter *CodeEmitter);

    // Called to signal the start of a shadow of RequiredSize bytes.
    void reset(unsigned RequiredSize) {
      RequiredShadowSize = RequiredSize;
      CurrentShadowSize = 0;
      InShadow = true;
    }

    // Called before every stackmap/patchpoint, and at the end of basic blocks,
    // to emit any necessary padding-NOPs.
    void emitShadowPadding(MCStreamer &OutStreamer, const MCSubtargetInfo &STI);

  private:
    const MachineFunction *MF = nullptr;
    bool InShadow = false;

    // RequiredShadowSize holds the length of the shadow specified in the most
    // recently encountered STACKMAP instruction.
    // CurrentShadowSize counts the number of bytes encoded since the most
    // recently encountered STACKMAP, stopping when that number is greater than
    // or equal to RequiredShadowSize.
    unsigned RequiredShadowSize = 0, CurrentShadowSize = 0;
  };

  StackMapShadowTracker SMShadowTracker;

  // All instructions emitted by the X86AsmPrinter should use this helper
  // method.
  //
  // This helper function invokes the SMShadowTracker on each instruction before
  // outputting it to the OutStream. This allows the shadow tracker to minimise
  // the number of NOPs used for stackmap padding.
  void EmitAndCountInstruction(MCInst &Inst);
  void LowerSTACKMAP(const MachineInstr &MI);
  void LowerPATCHPOINT(const MachineInstr &MI, X86MCInstLower &MCIL);
  void LowerSTATEPOINT(const MachineInstr &MI, X86MCInstLower &MCIL);
  void LowerFAULTING_OP(const MachineInstr &MI, X86MCInstLower &MCIL);
  void LowerPATCHABLE_OP(const MachineInstr &MI, X86MCInstLower &MCIL);

  void LowerTlsAddr(X86MCInstLower &MCInstLowering, const MachineInstr &MI);

  // XRay-specific lowering for X86.
  void LowerPATCHABLE_FUNCTION_ENTER(const MachineInstr &MI,
                                     X86MCInstLower &MCIL);
  void LowerPATCHABLE_RET(const MachineInstr &MI, X86MCInstLower &MCIL);
  void LowerPATCHABLE_TAIL_CALL(const MachineInstr &MI, X86MCInstLower &MCIL);
  void LowerPATCHABLE_EVENT_CALL(const MachineInstr &MI, X86MCInstLower &MCIL);
  void LowerPATCHABLE_TYPED_EVENT_CALL(const MachineInstr &MI,
                                       X86MCInstLower &MCIL);

  void LowerFENTRY_CALL(const MachineInstr &MI, X86MCInstLower &MCIL);

  // KCFI specific lowering for X86.
  uint32_t MaskKCFIType(uint32_t Value);
  void EmitKCFITypePadding(const MachineFunction &MF, bool HasType = true);
  void LowerKCFI_CHECK(const MachineInstr &MI);

  // Address sanitizer specific lowering for X86.
  void LowerASAN_CHECK_MEMACCESS(const MachineInstr &MI);

  // Choose between emitting .seh_ directives and .cv_fpo_ directives.
  void EmitSEHInstruction(const MachineInstr *MI);

  void PrintSymbolOperand(const MachineOperand &MO, raw_ostream &O) override;
  void PrintOperand(const MachineInstr *MI, unsigned OpNo, raw_ostream &O);
  void PrintModifiedOperand(const MachineInstr *MI, unsigned OpNo,
                            raw_ostream &O, StringRef Modifier = {});
  void PrintPCRelImm(const MachineInstr *MI, unsigned OpNo, raw_ostream &O);
  void PrintLeaMemReference(const MachineInstr *MI, unsigned OpNo,
                            raw_ostream &O, StringRef Modifier = {});
  void PrintMemReference(const MachineInstr *MI, unsigned OpNo, raw_ostream &O,
                         StringRef Modifier = {});
  void PrintIntelMemReference(const MachineInstr *MI, unsigned OpNo,
                              raw_ostream &O, StringRef Modifier = {});
  const MCSubtargetInfo *getIFuncMCSubtargetInfo() const override;
  void emitMachOIFuncStubBody(Module &M, const GlobalIFunc &GI,
                              MCSymbol *LazyPointer) override;
  void emitMachOIFuncStubHelperBody(Module &M, const GlobalIFunc &GI,
                                    MCSymbol *LazyPointer) override;

  void emitCallInstruction(const llvm::MCInst &MCI);
  void maybeEmitNopAfterCallForWindowsEH(const MachineInstr *MI);

  // Emits a label to mark the next instruction as being relevant to Import Call
  // Optimization.
  void emitLabelAndRecordForImportCallOptimization(ImportCallKind Kind);

public:
  X86AsmPrinter(TargetMachine &TM, std::unique_ptr<MCStreamer> Streamer);

  StringRef getPassName() const override {
    return "X86 Assembly Printer";
  }

  const X86Subtarget &getSubtarget() const { return *Subtarget; }

  void emitStartOfAsmFile(Module &M) override;

  void emitEndOfAsmFile(Module &M) override;

  void emitInstruction(const MachineInstr *MI) override;

  void emitBasicBlockEnd(const MachineBasicBlock &MBB) override;

  bool PrintAsmOperand(const MachineInstr *MI, unsigned OpNo,
                       const char *ExtraCode, raw_ostream &O) override;
  bool PrintAsmMemoryOperand(const MachineInstr *MI, unsigned OpNo,
                             const char *ExtraCode, raw_ostream &O) override;

  bool doInitialization(Module &M) override {
    SMShadowTracker.reset(0);
    SM.reset();
    FM.reset();
    return AsmPrinter::doInitialization(M);
  }

  bool runOnMachineFunction(MachineFunction &MF) override;
  void emitFunctionBodyStart() override;
  void emitFunctionBodyEnd() override;
  void emitKCFITypeId(const MachineFunction &MF) override;

  bool shouldEmitWeakSwiftAsyncExtendedFramePointerFlags() const override {
    return ShouldEmitWeakSwiftAsyncExtendedFramePointerFlags;
  }
};

} // end namespace llvm

#endif