This patch implements the check for not allowing re-scheduling of
instructions that have already been scheduled in a scheduling bundle.
Rescheduling should only happen if the instructions were temporarily
scheduled in singleton bundles during a previous call to
`trySchedule()`.
Up until now the generation of vector instructions was taking place
during the top-down post-order traversal of vectorizeRec(). The issue
with this approach is that the vector instructions emitted during the
traversal can be reordered by the scheduler, making it challenging to
place them without breaking the def-before-uses rule.
With this patch we separate the vectorization decisions (done in
`vectorizeRec()`) from the code generation phase (`emitVectors()`). The
vectorization decisions are stored in the `Actions` vector and are used
by `emitVectors()` to drive code generation.
This patch implements a small region pass that saves the context's
state. The patch is now used in the default pipeline to save the context
state instead of the hard-coded call to Context::save().
The concept behind this is that the passes themselves should not have to
do the actual saving/restoring of the IR state, because that would make
it challenging to reorder them in the pipeline. Having separate
save/restore passes makes the transformation passes more composable as
parts of arbitrary pipelines.
This patch moves the seed collection logic from the BottomUpVec pass
into a new Sandbox IR Function pass. The new "seed-collection" pass
collects the seeds, builds a region and runs the region pass pipeline.
When the operand comes from multiple inputs then we need additional
packing code. When the operands are scalar then we can use a single
InsertElementInst. But when the operands are vectors then we need a
chain of ExtractElementInst and InsertElementInst instructions to insert
the vector value into the destination vector. This is what this patch
implements.
This patch fixes a bug in the creation of shuffle masks when vectorizing
vectors in case of a diamond reuse with shuffle. The mask needs to
enumerate all elements of a vector, not treat the original vector value
as a single element. That is: if vectorizing two <2 x float> vectors
into a <4 x float> the mask needs to have 4 indices, not just 2.
This patch fixes the way the top-of-schedule variable gets set and
updated. Before this patch it used to get updated whenever we scheduled
a bundle, which is wrong, as the top-of-schedule needs to be maintained
across scheduling attempts.
It should get reset only when we clear the schedule or when we destroy
the current schedule and re-schedule.
The TransactionAcceptOrRevert pass is the final pass in the Sandbox
Vectorizer's default pass pipeline. It's job is to check the cost
before/after vectorization and accept or revert the IR to its original
state.
Since we are now starting the transaction in BottomUpVec, tests that run
a custom pipeline need to accept the transaction. This is done with the
help of the TransactionAlwaysAccept pass (tr-accept).
This patch implements the diamond pattern where we are vectorizing
toward the top of the diamond from both edges, but the second edge may
use elements from a different vector or just scalar values. This
requires some additional packing code (see lit test).
InstrMaps is a helper data structure that maps scalars to vectors and
the reverse. This is used by the vectorizer to figure out which vectors
it can extract scalar values from.
When we vectorize loads or stores we only keep the address of the first
lane. The rest may become dead. This patch adds the address operands of
vectorized loads or stores to the dead candidates set, such that they
get erased if dead.
With this patch we switch from the temporary dummy seeds to actual seeds
provided by the seed collector.
The seeds get sliced and each slice is used as the starting point for
vectorization.
Up until now we could only support packing of scalar elements. This
patch fixes this by implementing packing of vector elements, by
generating extractelement and insertelement instruction pairs.
This patch implements packing of scalar operands when the vectorizer
decides to stop vectorizing. Packing is implemented with a sequence of
InsertElement instructions.
Packing vectors requires different instructions so it's implemented in a
follow-up patch.
This patch adds support for re-scheduling already scheduled
instructions. For now this will clear and rebuild the DAG, and will
reschedule the code using the new DAG.
This patch registers the "createInstr" callback that notifies the
scheduler about newly created instructions. This guarantees that all
newly created instructions have a corresponding DAG node associated with
them. Without this the pass crashes when the scheduler encounters the
newly created vector instructions.
This patch also changes the lifetime of the sandboxir Ctx variable in
the SandboxVectorizer pass. It needs to be destroyed after the passes
get destroyed. Without this change when components like the Scheduler
get destroyed Ctx will have already been freed, which is not legal.