This introduces a new `ptrtoaddr` instruction which is similar to
`ptrtoint` but has two differences:
1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low
index-width bits of the pointer
For most architectures, difference 2) does not matter since index (address)
width and pointer representation width are the same, but this does make a
difference for architectures that have pointers that aren't just plain
integer addresses such as AMDGPU fat pointers or CHERI capabilities.
This commit introduces textual and bitcode IR support as well as basic code
generation, but optimization passes do not handle the new instruction yet
so it may result in worse code than using ptrtoint. Follow-up changes will
update capture tracking, etc. for the new instruction.
RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/139357
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/SandboxIR` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
The bulk of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.
The following manual adjustments were also applied after running IDS on
Linux:
- Remove explicit `GlobalWithNodeAPI::LLVMGVToGV::operator()` template
function instantiations that were previously added for the dylib build.
Instead, directly annotate the `LLVMGVToGV::operator()` method with
`LLVM_ABI`. This is done so the DLL build works with both MSVC and
clang-cl.
- Explicitly `#include "llvm/SandboxIR/Value.h"` in `Tracker.h` so that
the symbol is available for exported templates in this file. These
templates get fully instantiated on DLL export, so they require the full
definition of `Value`.
- Add extern template instantiation declarations for `GlobalWithNodeAPI`
template types in `Constants.h` and annotate them with
`LLVM_TEMPLATE_ABI`.
- Add `LLVM_EXPORT_TEMPLATE` to `GlobalWithNodeAPI` template
instantiations in `Constants.cpp`.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
Currently, GlobalObject has an "alignment" property... but it's
basically nonsense: alignment doesn't mean the same thing for variables
and functions, and it's completely meaningless for ifuncs.
This "removes" (actually marking protected) the methods from
GlobalObject, adds the relevant methods to Function and GlobalVariable,
and adjusts the code appropriately.
This should make future alignment-related cleanups easier.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
The auxiliary vector is represented by the !sandboxaux metadata. But
until now adding an instruction to the aux vector would not
automatically add it to the region (i.e., it would not mark it with
!sandboxvec). This behavior was problematic when generating regions from
metadata, because you wouldn't know which region owns the auxiliary
vector.
This patch fixes this issue: We now require that an instruction with
!sandboxaux also defines its region with !sandboxvec.
This fixes several of these:
```
[1151/3381] Building CXX object lib\SandboxIR\CMakeFiles\LLVMSandboxIR.dir\Constant.cpp.obj
C:\git\llvm-project\llvm\lib\SandboxIR\Constant.cpp(331,52): warning: duplicate explicit instantiation of 'operator()' ignored as a Microsoft extension [-Wmicrosoft-template]
331 | llvm::GlobalObject>::LLVMGVToGV::operator()(llvm::GlobalIFunc
| ^
C:\git\llvm-project\llvm\include\llvm/SandboxIR/Constant.h(816,52): note: previous explicit instantiation is here
816 | llvm::GlobalObject>::LLVMGVToGV::operator()(llvm::GlobalIFunc
| ^
```
Move the most common if statement to the top and the least common ones
to the bottom. This should save CPU cycles during compilation.
This patch also prefixes the llvm variables with the LLVM prefix to make
the naming convention in this function more uniform. For example `C` to
`LLVMC`.
This patch implements a new subclass of the Value class used for Sandbox
IR Values that we don't support, like metadata or inline asm. The goal
is to never have null sandboxir::Value objects, because this is not the
expected behavior.
This patch moves the seed collection logic from the BottomUpVec pass
into a new Sandbox IR Function pass. The new "seed-collection" pass
collects the seeds, builds a region and runs the region pass pipeline.
This patch implements a callback mechanism similar to the existing ones,
but for getting notified whenever a Use edge gets updated. This is going
to be used in a follow up patch by the Dependency Graph.
This patch adds additional functionality to the sandboxir Region. The
Region is used as a way of passing a set of Instructions across region
passes in a way that can be represented in the IR with metadata. This is
a design choice that allows us to test region passes in isolation with
lit tests.
Up until now the region was only used to tag the instructions generated
by the passes. There is a need to represent an ordered set of
instructions, which can be used as a way to represent the initial seeds
to the first vectorization pass. This patch implements this auxiliary
vector that can be used to convey such information.
The TransactionAcceptOrRevert pass is the final pass in the Sandbox
Vectorizer's default pass pipeline. It's job is to check the cost
before/after vectorization and accept or revert the IR to its original
state.
Since we are now starting the transaction in BottomUpVec, tests that run
a custom pipeline need to accept the transaction. This is done with the
help of the TransactionAlwaysAccept pass (tr-accept).
`sandboxir::Context` is defined at a pass-level scope with the
`SandboxVectorizerPass` class because the function pass manager `FPM`
object depends on it, and that is in pass-level scope to avoid
recreating the pass pipeline every single time `runOnFunction()` is
called.
This means that the Context's state lives on across function passes. The
problem is twofold:
(i) the LLVM IR to Sandbox IR map can grow very large including objects
from different functions, which is of no use to the vectorizer, as it's
a function-level pass.
(ii) this can result in stale data in the LLVM IR to Sandbox IR object
map, as other passes may delete LLVM IR objects.
To fix both issues this patch introduces a `Context::clear()` function
that clears the `LLVMValueToValueMap`.
This patch removes the assertion that checks for an existing function.
If one exists it will remove it and create a new one. This helps remove
a crash when a function declaration object already exists and we are
about to create a SandboxIR object for the definition.
This patch implements cost modeling for Region. All instructions that
are added or removed get their cost counted in the Scoreboard. This is
used for checking if the region before or after a transformation is more
profitable.
If the operands of a CmpInst are constants then it gets folded into a
constant. Therefore CmpInst::create() should return a Value*, not a
Constant* and should handle the creation of the constant correctly.
The `EraseCallbackID` field is not always initialized in the ctor for
SeedCollector; if not, it will be used uninitialized by its dtor. This
could potentially lead to the erasure of a random callback, leading to a
bug.
Fixed by making `CallbackID` an opaque type, which is always
default-initialized to an invalid ID.
A few recent changes are causing build breaks when
`-DLLVM_ENABLE_MODULES=ON` (such as
834dfd23155351c9885eddf7b9664f7697326946 and
7dfdca1961aadc75ca397818bfb9bd32f1879248).
This PR makes the required updates so that clang/llvm builds when
`-DLLVM_ENABLE_MODULES=ON`.
rdar://140803058
This PR re-lands https://github.com/llvm/llvm-project/pull/115968 with a
fix for a buildbot failure.
The `IRSnapshotChecker` class is only defined in debug mode, so its unit
tests must also be inside `#ifndef NDEBUG`.
Preserving the case order is not strictly necessary to preserve
semantics (for example, operations like SwitchInst::removeCase will
happily swap cases around). However, I'm planning to introduce an
optional verification step for SandboxIR that will use StructuralHash to
compare IR after a revert to the original IR to help catch tracker bugs,
and the order difference triggers a difference there.
Note that PointerUnion::{is,get,dyn_cast} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
`insertBefore` can be called on a detached instruction, and we can't
check that the underlying instructions are ordered because instructions
without BB parents have no order.
This problem showed up as a different assertion failure in
`llvm::Instruction::comesBefore` in one of the unit tests when
`EXPENSIVE_CHECKS` are enabled.
These symbols will need to be explicitly exported for SandboxIRTests
when LLVM is built as shared library on window with explicitly
visibility macros are enabled.
This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and plugins on
windows.
https://github.com/llvm/llvm-project/pull/111223 was reverted because of
a build failure with `-DBUILD_SHARED_LIBS=on`.
The Passes component depends on Vectorizer (because PassBuilder needs to
be able to instantiate SandboxVectorizerPass). This resulted in CMake
doing this
1. when it builds lib/libLLVMVectorize.so.20.0git it adds
lib/libLLVMSandboxIR.so.20.0git to the command line, because it's listed
as a dependency (as expected)
2. when it's trying to build lib/libLLVMPasses.so.20.0git it adds
lib/libLLVMVectorize.so.20.0git to the command line, because it's listed
as a dependency (also as expected). But not libLLVMSandboxIR.so.
When SandboxVectorizerPass has its ctors/dtors defined inline, this
caused "undefined reference to vtable" linker errors. This change works
around that by moving ctors/dtors out of line.
Also fix a bazel build problem by adding the new
`llvm/lib/Transforms/Vectorize/SandboxVectorizer/Passes/PassRegistry.def`
as a textual header in the Vectorizer target.
The main change is that the main SandboxVectorizer pass no longer has a
pipeline of function passes. Now it is a wrapper that creates sandbox IR
from functions before calling BottomUpVec.
BottomUpVec now builds its own RegionPassManager from the `sbvec-passes`
flag, using a PassRegistry.def file. For now, these region passes are
not run (BottomUpVec doesn't create Regions yet), and only a null pass
for testing exists.
This commit also changes the ownership model for sandboxir::PassManager:
instead of having a PassRegistry that owns passes, and PassManagers that
contain non-owning pointers to the passes, now PassManager owns (via
unique pointers) the passes it contains.
PassRegistry is now deleted, and the logic to parse and create a pass
pipeline is now in PassManager::setPassPipeline.
This patch implements the InsertPosition class that is used to specify
where an instruction should be placed.
It also switches a couple of create() functions from the old API to the
new one that uses InsertPosition.