Crossing BBs is not currently supported by the structures of the
vectorizer. This patch fixes instances where this was happening,
including:
- a walk of use-def operands that updates the UnscheduledSuccs counter,
- the dead instruction removal is now done per BB,
- the scheduler, which will reject bundles that cross BBs.
VecUtils::getLowest(Valse) returns the lowest instruction in the BB among Vals.
If the instructions are not in the same BB, or if none of them is an
instruction it returns nullptr.
This patch implements the diamond pattern where we are vectorizing
toward the top of the diamond from both edges, but the second edge may
use elements from a different vector or just scalar values. This
requires some additional packing code (see lit test).
InstrMaps is a helper data structure that maps scalars to vectors and
the reverse. This is used by the vectorizer to figure out which vectors
it can extract scalar values from.
When we vectorize loads or stores we only keep the address of the first
lane. The rest may become dead. This patch adds the address operands of
vectorized loads or stores to the dead candidates set, such that they
get erased if dead.
With this patch we switch from the temporary dummy seeds to actual seeds
provided by the seed collector.
The seeds get sliced and each slice is used as the starting point for
vectorization.
The DAG maintains a chain of MemDGNodes that links together all the
nodes that may touch memroy.
Whenever a new instruction gets created we need to make sure that this
chain gets updated. If the new instruction touches memory then its
corresponding MemDGNode should be inserted into the chain.
Load/Store isSimple is a necessary condition for VectorSeeds, but not
sufficient, so reverse the condition and return value, and continue the
check. Add relevant tests.
This patch adds the callback registration logic in the DAG's constructor
and the corresponding deregistration logic in the destructor. It also
implements the code that makes sure that SchedBundle and DGNodes can be
safely destroyed in any order.
Up until now we could only support packing of scalar elements. This
patch fixes this by implementing packing of vector elements, by
generating extractelement and insertelement instruction pairs.
This patch implements packing of scalar operands when the vectorizer
decides to stop vectorizing. Packing is implemented with a sequence of
InsertElement instructions.
Packing vectors requires different instructions so it's implemented in a
follow-up patch.
This patch adds support for re-scheduling already scheduled
instructions. For now this will clear and rebuild the DAG, and will
reschedule the code using the new DAG.
This patch registers the "createInstr" callback that notifies the
scheduler about newly created instructions. This guarantees that all
newly created instructions have a corresponding DAG node associated with
them. Without this the pass crashes when the scheduler encounters the
newly created vector instructions.
This patch also changes the lifetime of the sandboxir Ctx variable in
the SandboxVectorizer pass. It needs to be destroyed after the passes
get destroyed. Without this change when components like the Scheduler
get destroyed Ctx will have already been freed, which is not legal.
This patch implements a ready-list-based scheduler that operates on
DependencyGraph.
It is used by the sandbox vectorizer to test the legality of vectorizing
a group of instrs.
SchedBundle is a helper container, containing all DGNodes that
correspond to the instructions that we are attempting to schedule with
trySchedule(Instrs).
My previous attempt (#111904) hacked creation of Regions from metadata
into the bottom-up vectorizer. I got some feedback that it should be its
own pass. So now we have two SandboxIR function passes (`BottomUpVec`
and `RegionsFromMetadata`) that are interchangeable, and we could have
other SandboxIR function passes doing other kinds of transforms, so this
commit revamps pipeline creation and parsing.
First, `sandboxir::PassManager::setPassPipeline` now accepts pass
arguments in angle brackets. Pass arguments are arbitrary strings that
must be parsed by each pass, the only requirement is that nested angle
bracket pairs must be balanced, to allow for nested pipelines with more
arguments. For example:
```
bottom-up-vec<region-pass-1,region-pass-2<arg>,region-pass-3>
```
This has complicated the parser a little bit (the loop over pipeline
characters now contains a small state machine), and we now have some new
test cases to exercise the new features.
The main SandboxVectorizerPass now contains a customizable pipeline of
SandboxIR function passes, defined by the `sbvec-passes` flag. Region
passes for the bottom-up vectorizer pass are now in pass arguments (like
in the example above).
Because we have now several classes that can build sub-pass pipelines,
I've moved the logic that interacts with PassRegistry.def into its own
files (PassBuilder.{h,cpp} so it can be easily reused.
Finally, I've added a `RegionsFromMetadata` function pass, which will
allow us to run region passes in isolation from lit tests without
relying on the bottom-up vectorizer, and a new lit test that does
exactly this.
Note that the new pipeline parser now allows empty pipelines. This is
useful for testing. For example, if we use
```
-sbvec-passes="bottom-up-vec<>"
```
SandboxVectorizer converts LLVM IR to SandboxIR and runs the bottom-up
vectorizer, but no region passes afterwards.
```
-sbvec-passes=""
```
SandboxVectorizer converts LLVM IR to SandboxIR and runs no passes on
it. This is useful to exercise SandboxIR conversion on its own.
This patch implements the UnscheduledSuccs counter in DGNode. It counts
the number of unscheduled successors and is used by the scheduler to
determine when a node is ready.