`sandboxir::Context` is defined at a pass-level scope with the
`SandboxVectorizerPass` class because the function pass manager `FPM`
object depends on it, and that is in pass-level scope to avoid
recreating the pass pipeline every single time `runOnFunction()` is
called.
This means that the Context's state lives on across function passes. The
problem is twofold:
(i) the LLVM IR to Sandbox IR map can grow very large including objects
from different functions, which is of no use to the vectorizer, as it's
a function-level pass.
(ii) this can result in stale data in the LLVM IR to Sandbox IR object
map, as other passes may delete LLVM IR objects.
To fix both issues this patch introduces a `Context::clear()` function
that clears the `LLVMValueToValueMap`.
This patch removes the assertion that checks for an existing function.
If one exists it will remove it and create a new one. This helps remove
a crash when a function declaration object already exists and we are
about to create a SandboxIR object for the definition.
This patch changes the functionality of `VecUtils::getLowest(Vals, BB)`
such that it filters out any instructions in `Vals` that are not in BB.
This is useful when Vals contains instructions from different BBs,
because in that case we are only interested in one BB.
This patch implements a wrapper function for the LLVM IR verifier for
functions, and calls it (flag-guarded) within the bottom-up-vectorizer
for finding IR bugs as soon as they happen.
This patch implements cost modeling for Region. All instructions that
are added or removed get their cost counted in the Scoreboard. This is
used for checking if the region before or after a transformation is more
profitable.
This patch fixes a bug in the maintenance of the MemDGNode chain of the
DAG. Whenever we move a memory instruction, the DAG gets notified about
the move and maintains the chain of memory nodes. The bug was that if
the destination of the move was not a memory instruction, then the
memory node's next node would end up pointing to itself.
Crossing BBs is not currently supported by the structures of the
vectorizer. This patch fixes instances where this was happening,
including:
- a walk of use-def operands that updates the UnscheduledSuccs counter,
- the dead instruction removal is now done per BB,
- the scheduler, which will reject bundles that cross BBs.
VecUtils::getLowest(Valse) returns the lowest instruction in the BB among Vals.
If the instructions are not in the same BB, or if none of them is an
instruction it returns nullptr.
This patch implements the diamond pattern where we are vectorizing
toward the top of the diamond from both edges, but the second edge may
use elements from a different vector or just scalar values. This
requires some additional packing code (see lit test).
InstrMaps is a helper data structure that maps scalars to vectors and
the reverse. This is used by the vectorizer to figure out which vectors
it can extract scalar values from.
When we vectorize loads or stores we only keep the address of the first
lane. The rest may become dead. This patch adds the address operands of
vectorized loads or stores to the dead candidates set, such that they
get erased if dead.
With this patch we switch from the temporary dummy seeds to actual seeds
provided by the seed collector.
The seeds get sliced and each slice is used as the starting point for
vectorization.
The DAG maintains a chain of MemDGNodes that links together all the
nodes that may touch memroy.
Whenever a new instruction gets created we need to make sure that this
chain gets updated. If the new instruction touches memory then its
corresponding MemDGNode should be inserted into the chain.
Load/Store isSimple is a necessary condition for VectorSeeds, but not
sufficient, so reverse the condition and return value, and continue the
check. Add relevant tests.
This patch adds the callback registration logic in the DAG's constructor
and the corresponding deregistration logic in the destructor. It also
implements the code that makes sure that SchedBundle and DGNodes can be
safely destroyed in any order.
Up until now we could only support packing of scalar elements. This
patch fixes this by implementing packing of vector elements, by
generating extractelement and insertelement instruction pairs.
This patch implements packing of scalar operands when the vectorizer
decides to stop vectorizing. Packing is implemented with a sequence of
InsertElement instructions.
Packing vectors requires different instructions so it's implemented in a
follow-up patch.
This patch adds support for re-scheduling already scheduled
instructions. For now this will clear and rebuild the DAG, and will
reschedule the code using the new DAG.
This patch registers the "createInstr" callback that notifies the
scheduler about newly created instructions. This guarantees that all
newly created instructions have a corresponding DAG node associated with
them. Without this the pass crashes when the scheduler encounters the
newly created vector instructions.
This patch also changes the lifetime of the sandboxir Ctx variable in
the SandboxVectorizer pass. It needs to be destroyed after the passes
get destroyed. Without this change when components like the Scheduler
get destroyed Ctx will have already been freed, which is not legal.