Flang currently lowers internal procedures passed as actual arguments using LLVM's `llvm.init.trampoline` / `llvm.adjust.trampoline` intrinsics, which require an executable stack. On modern Linux toolchains and security-hardened kernels that enforce W^X (Write XOR Execute), this causes link-time failures (`ld.lld: error: ... requires an executable stack`) or runtime `SEGV` from NX violations.

This patch introduces a runtime trampoline pool that allocates trampolines from a dedicated `mmap`'d region instead of the stack. The pool toggles page permissions between writable (for patching) and executable (for dispatch), so the stack stays non-executable throughout. On macOS, `MAP_JIT` and `pthread_jit_write_protect_np` are used for the same effect. An i-cache flush (`__builtin___clear_cache` on Linux, `sys_icache_invalidate` on macOS) is performed after each write→exec transition.

The feature is gated behind a new driver flag, `-fsafe-trampoline` (off by default), which threads through the frontend into the `BoxedProcedurePass`. When enabled, the pass emits calls to `_FortranATrampolineInit`, `_FortranATrampolineAdjust`, and `_FortranATrampolineFree` instead of the legacy intrinsics. The legacy path is completely untouched when the flag is off.

The pool is a singleton with a fixed capacity (default 1024 slots, overridable via `FLANG_TRAMPOLINE_POOL_SIZE`). Slot size varies by target (32 bytes on x86-64/AArch64, 48 on PPC64, 64 fallback). Each slot holds a small architecture-specific stub, currently x86-64 (17 bytes, using `r10` as the nest/static-chain register) and AArch64 (24 bytes, using `x15`). The implementation compiles on all architectures but will crash at runtime with a clear diagnostic if trampoline emission is actually attempted on an unsupported target. This avoids breaking the flang-rt build on e.g. RISC-V or PPC64.
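To make the write-then-execute toggle described above concrete, here is a minimal sketch for Linux. It is not the flang-rt code: `WxRegion`, `mapRegion`, `patchAndSeal`, and `unmapRegion` are hypothetical names, and the sketch assumes a POSIX system with `MAP_ANONYMOUS` and a GCC- or Clang-compatible compiler for `__builtin___clear_cache`. It only shows the ordering of steps: map writable, patch, flip to executable, flush the instruction cache.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper type for illustration; not the flang-rt pool.
struct WxRegion {
  void *base = nullptr;
  std::size_t size = 0;
};

// Map an anonymous region that starts writable and non-executable.
inline bool mapRegion(WxRegion &r, std::size_t bytes) {
  long page = sysconf(_SC_PAGESIZE);
  if (page <= 0 || bytes == 0)
    return false;
  std::size_t pageSize = static_cast<std::size_t>(page);
  std::size_t rounded = (bytes + pageSize - 1) / pageSize * pageSize;
  void *p = mmap(nullptr, rounded, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return false;
  r.base = p;
  r.size = rounded;
  return true;
}

// Copy `len` bytes of stub code to `offset`, then make the region
// executable again. The region is never writable and executable at once.
inline bool patchAndSeal(WxRegion &r, std::size_t offset, const void *code,
                         std::size_t len) {
  assert(r.base != nullptr && "region must be mapped first");
  if (offset > r.size || len > r.size - offset)
    return false;
  if (mprotect(r.base, r.size, PROT_READ | PROT_WRITE) != 0)
    return false;
  char *dst = static_cast<char *>(r.base) + offset;
  std::memcpy(dst, code, len);
  if (mprotect(r.base, r.size, PROT_READ | PROT_EXEC) != 0)
    return false;
  // Flush the instruction cache after the write-to-exec transition.
  __builtin___clear_cache(dst, dst + len);
  return true;
}

// Release the mapping and reset the handle.
inline void unmapRegion(WxRegion &r) {
  if (r.base != nullptr)
    munmap(r.base, r.size);
  r.base = nullptr;
  r.size = 0;
}
```

A caller would map the region once, then call `patchAndSeal` each time a slot's stub is written. Toggling the whole region, as this sketch does, is a simplification; a real pool might change permissions on only the affected page.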
Freed slots are poisoned (the callee pointer is overwritten with a sentinel) and recycled into a freelist, so the pool can sustain long-running programs that repeatedly create and destroy closures.

A few design choices worth calling out:

- The runtime avoids all C++ runtime dependencies: no `std::mutex`, no `operator new`, and no function-local statics with hidden guard variables. Locking is via flang-rt's own `Lock` / `CriticalSection`, memory is via `AllocateMemoryOrCrash` / `FreeMemory`, and the singleton uses explicit double-checked locking with a raw pointer. This was done so the trampoline pool links cleanly in minimal / freestanding flang-rt configurations.
- `_FortranATrampolineFree` calls are inserted immediately before every `func.return` in the enclosing host function. This is a conservative but correct strategy: the trampoline handle cannot outlive the host's stack frame, since the closure captures the host's local variables by reference.
- The GNU_STACK note is verified via a dedicated integration test (`safe-trampoline-gnustack.f90`) that compiles and links a Fortran program using the runtime path, then inspects the ELF with `llvm-readelf` to confirm the stack segment is `RW` (not `RWE`).

**Test coverage:**

- `flang/test/Driver/fsafe-trampoline.f90`: flag forwarding (on, off, default)
- `flang/test/Fir/boxproc-safe-trampoline.fir`: FIR-level FileCheck for emitted runtime calls
- `flang/test/Lower/safe-trampoline.f90`: end-to-end lowering
- `flang-rt/test/Driver/safe-trampoline-gnustack.f90`: GNU_STACK ELF verification

Closes #182813

Co-authored-by: Sairudra More <moresair@pe31.hpc.amslabs.hpecorp.net>
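The slot recycling described in the commit message (poison on free, then push onto a freelist) can be sketched roughly as follows. This is a single-threaded illustration only: `SlotPool`, `Slot`, `kCapacity`, `kPoison`, and `kNone` are hypothetical names, the sentinel value is an assumption, and the locking and executable-memory handling of the real pool are left out.

```cpp
#include <cstddef>
#include <cstdint>

// All names and values here are hypothetical, chosen for illustration.
constexpr std::size_t kCapacity = 4; // tiny on purpose; the patch defaults to 1024
constexpr std::uintptr_t kPoison = ~std::uintptr_t{0}; // assumed sentinel
constexpr std::size_t kNone = static_cast<std::size_t>(-1);

struct Slot {
  std::uintptr_t callee;  // kPoison while the slot is free
  std::uintptr_t closure; // static-chain / host-frame pointer
  std::size_t nextFree;   // intrusive freelist link
};

class SlotPool {
public:
  SlotPool() : freeHead_(0) {
    // Thread every slot onto the freelist, all poisoned.
    for (std::size_t i = 0; i < kCapacity; ++i) {
      slots_[i].callee = kPoison;
      slots_[i].closure = 0;
      slots_[i].nextFree = (i + 1 < kCapacity) ? i + 1 : kNone;
    }
  }

  // Pop a slot from the freelist; returns kNone when the pool is exhausted.
  std::size_t acquire(std::uintptr_t callee, std::uintptr_t closure) {
    if (freeHead_ == kNone)
      return kNone;
    std::size_t i = freeHead_;
    freeHead_ = slots_[i].nextFree;
    slots_[i].callee = callee;
    slots_[i].closure = closure;
    slots_[i].nextFree = kNone;
    return i;
  }

  // Poison the slot and push it back; invalid or repeated frees are ignored.
  void release(std::size_t i) {
    if (i >= kCapacity || slots_[i].callee == kPoison)
      return;
    slots_[i].callee = kPoison;
    slots_[i].closure = 0;
    slots_[i].nextFree = freeHead_;
    freeHead_ = i;
  }

  bool isLive(std::size_t i) const {
    return i < kCapacity && slots_[i].callee != kPoison;
  }

private:
  Slot slots_[kCapacity];
  std::size_t freeHead_;
};
```

Because the freelist is threaded through the slots themselves, recycling needs no extra allocation, which fits the stated goal of avoiding `operator new`. Poisoning the callee also lets a stale handle be detected instead of silently dispatching to an old target.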
75 lines
3.4 KiB
C++
//===-- CommandLineOpts.h -- shared command line options --------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

/// This file declares some shared command-line options that can be used when
/// debugging the test tools.

#ifndef FORTRAN_OPTIMIZER_PASSES_COMMANDLINEOPTS_H
#define FORTRAN_OPTIMIZER_PASSES_COMMANDLINEOPTS_H

#include "llvm/Frontend/Debug/Options.h"
#include "llvm/Passes/OptimizationLevel.h"
#include "llvm/Support/CommandLine.h"
#include <cstddef>

/// Shared option in tools to control whether dynamically sized array
/// allocations should always be on the heap.
extern llvm::cl::opt<bool> dynamicArrayStackToHeapAllocation;

/// Shared option in tools to set a maximum value for the number of elements in
/// a compile-time sized array that can be allocated on the stack.
extern llvm::cl::opt<std::size_t> arrayStackAllocationThreshold;

/// Shared option in tools to ignore missing runtime type descriptor objects
/// when translating FIR to LLVM. The resulting program will crash if the
/// runtime needs the derived type descriptors, this is only a debug option to
/// allow compiling manually written FIR programs involving derived types
/// without having to write the derived type descriptors which are normally
/// generated by the frontend.
extern llvm::cl::opt<bool> ignoreMissingTypeDescriptors;

/// Shared option in tools to only generate rtti static object definitions for
/// derived types defined in the current compilation unit. Derived type
/// descriptor object for types defined in other objects will only be declared
/// as external. This also changes the linkage of rtti objects defined in the
/// current compilation unit from linkonce_odr to external so that unused rtti
/// objects are retained and can be accessed from other compilation units. This
/// is an experimental option to explore compilation speed improvements and is
/// an ABI breaking change because of the linkage change.
/// It will also require linking against module file objects of modules defining
/// only types (even for trivial types without type bound procedures, which
/// differs from most compilers).
extern llvm::cl::opt<bool> skipExternalRttiDefinition;

/// Default optimization level used to create Flang pass pipeline is O0.
extern llvm::OptimizationLevel defaultOptLevel;

extern llvm::codegenoptions::DebugInfoKind noDebugInfo;

/// Optimizer Passes
extern llvm::cl::opt<bool> disableCfgConversion;
extern llvm::cl::opt<bool> disableFirAliasTags;
extern llvm::cl::opt<bool> disableFirAvc;
extern llvm::cl::opt<bool> disableFirMao;
extern llvm::cl::opt<bool> enableFirLICM;
extern llvm::cl::opt<bool> useOldAliasTags;

/// CodeGen Passes
extern llvm::cl::opt<bool> disableCodeGenRewrite;
extern llvm::cl::opt<bool> disableTargetRewrite;
extern llvm::cl::opt<bool> disableDebugInfo;
extern llvm::cl::opt<bool> disableFirToLlvmIr;
extern llvm::cl::opt<bool> disableLlvmIrToLlvm;
extern llvm::cl::opt<bool> disableBoxedProcedureRewrite;
/// Make the boxed procedure pass emit calls to the runtime trampoline pool
/// instead of the legacy stack trampoline intrinsics (see -fsafe-trampoline).
extern llvm::cl::opt<bool> enableSafeTrampoline;

extern llvm::cl::opt<bool> disableExternalNameConversion;
extern llvm::cl::opt<bool> enableConstantArgumentGlobalisation;
extern llvm::cl::opt<bool> disableCompilerGeneratedNamesConversion;

#endif // FORTRAN_OPTIMIZER_PASSES_COMMANDLINEOPTS_H