Should add some minor type safety to the use of this information, since
there's quite a bit of metadata being laundered through an `unsigned`.
I'm looking to potentially add more bitfields to that `unsigned`, but I
find InlineAsm's big ol' bag of enum values and usage of `unsigned`
confusing, type-unsafe, and un-ergonomic. These can probably be better
abstracted.
I think the lack of static_cast outside of InlineAsm indicates the prior
code smell fixed here.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D159242
When SelectiondDAG converts dbg.value intrinsics, it first ensures we have
already generated code for the value operator of the intrinsic. The rationale
being that if we haven't had the need to generate code for this value, it won't
be a debug value that causes the generation.
For example, if the first use the physical register of an argument is a
dbg.value, we are going to hit this code path. However, this is irrelevant for
entry value expressions: by definition we are not interested in the _current_
value of the physical register, but rather on its value at the start of the
function. To deal with this, this patch changes lowering to handle this case as
early as possible.
Differential Revision: https://reviews.llvm.org/D158649
This is rework of;
- rG13e77db2df94 (r328395; MVT)
Since `LowLevelType.h` has been restored to `CodeGen`, `MachinveValueType.h`
can be restored as well.
Depends on D148767
Differential Revision: https://reviews.llvm.org/D149024
This patch replaces the uses of PointerUnion.is function by llvm::isa,
PointerUnion.get function by llvm::cast, and PointerUnion.dyn_cast by
llvm::dyn_cast_if_present. This is according to the FIXME in
the definition of the class PointerUnion.
This patch does not remove them as they are being used in other
subprojects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D148449
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148303
Use RawLocationWrapper rather than a Value to represent the location operand(s)
so that it's possible to represent multiple location
operands. AssignmentTrackingAnalysis still converts variadic debug intrinsics
to kill locations so this patch is NFC.
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D145911
This removes the unused `FuncArgumentDbgValueKind::Addr` value originally added
by e24f5348798605a799c63ff09169d177d262cd37. The intent was to signal the
original intrinsic that marked a function argument, but the `Addr` part was
never used.
Part of `dbg.addr` removal
Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898
Differential Revision: https://reviews.llvm.org/D144794
This patch adds 2 new intrinsics:
; Interleave two vectors into a wider vector
<vscale x 4 x i64> @llvm.vector.interleave2.nxv2i64(<vscale x 2 x i64> %even, <vscale x 2 x i64> %odd)
; Deinterleave the odd and even lanes from a wider vector
{<vscale x 2 x i64>, <vscale x 2 x i64>} @llvm.vector.deinterleave2.nxv2i64(<vscale x 4 x i64> %vec)
The main motivator for adding these intrinsics is to support vectorization of
complex types using scalable vectors.
The intrinsics are kept simple by only supporting a stride of 2, which makes
them easy to lower and type-legalize. A stride of 2 is sufficient to handle
complex types which only have a real/imaginary component.
The format of the intrinsics matches how `shufflevector` is used in
LoopVectorize. For example:
using cf = std::complex<float>;
void foo(cf * dst, int N) {
for (int i=0; i<N; ++i)
dst[i] += cf(1.f, 2.f);
}
For this loop, LoopVectorize:
(1) Loads a wide vector (e.g. <8 x float>)
(2) Extracts odd lanes using shufflevector (leading to <4 x float>)
(3) Extracts even lanes using shufflevector (leading to <4 x float>)
(4) Performs the addition
(5) Interleaves the two <4 x float> vectors into a single <8 x float> using
shufflevector
(6) Stores the wide vector.
In this example, we can 1-1 replace shufflevector in (2) and (3) with the
deinterleave intrinsic, and replace the shufflevector in (5) with the
interleave intrinsic.
The SelectionDAG nodes might be extended to support higher strides (3, 4, etc)
as well in the future.
Similar to what was done for vector.splice and vector.reverse, the intrinsic
is lowered to a shufflevector when the type is fixed width, so to benefit from
existing code that was written to recognize/optimize shufflevector patterns.
Note that this approach does not prevent us from adding new intrinsics for other
strides, or adding a more generic shuffle intrinsic in the future. It just solves
the immediate problem of being able to vectorize loops with complex math.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D141924
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
Add initial revision of assignment tracking analysis pass
---------------------------------------------------------
This patch squashes five individually reviewed patches into one:
#1https://reviews.llvm.org/D136320#2https://reviews.llvm.org/D136321#3https://reviews.llvm.org/D136325#4https://reviews.llvm.org/D136331#5https://reviews.llvm.org/D136335
Patch #1 introduces 2 new files: AssignmentTrackingAnalysis.h and .cpp. The
two subsequent patches modify those files only. Patch #4 plumbs the analysis
into SelectionDAG, and patch #5 is a collection of tests for the analysis as
a whole.
The analysis was broken up into smaller chunks for review purposes but for the
most part the tests were written using the whole analysis. It would be possible
to break up the tests for patches #1 through #3 for the purpose of landing the
patches seperately. However, most them would require an update for each
patch. In addition, patch #4 - which connects the analysis to SelectionDAG - is
required by all of the tests.
If there is build-bot trouble, we might try a different landing sequence.
Analysis problem and goal
-------------------------
Variables values can be stored in memory, or available as SSA values, or both.
Using the Assignment Tracking metadata, it's not possible to determine a
variable location just by looking at a debug intrinsic in
isolation. Instructions without any metadata can change the location of a
variable. The meaning of dbg.assign intrinsics changes depending on whether
there are linked instructions, and where they are relative to those
instructions. So we need to analyse the IR and convert the embedded information
into a form that SelectionDAG can consume to produce debug variable locations
in MIR.
The solution is a dataflow analysis which, aiming to maximise the memory
location coverage for variables, outputs a mapping of instruction positions to
variable location definitions.
API usage
---------
The analysis is named `AssignmentTrackingAnalysis`. It is added as a required
pass for SelectionDAGISel when assignment tracking is enabled.
The results of the analysis are exposed via `getResults` using the returned
`const FunctionVarLocs *`'s const methods:
const VarLocInfo *single_locs_begin() const;
const VarLocInfo *single_locs_end() const;
const VarLocInfo *locs_begin(const Instruction *Before) const;
const VarLocInfo *locs_end(const Instruction *Before) const;
void print(raw_ostream &OS, const Function &Fn) const;
Debug intrinsics can be ignored after running the analysis. Instead, variable
location definitions that occur between an instruction `Inst` and its
predecessor (or block start) can be found by looping over the range:
locs_begin(Inst), locs_end(Inst)
Similarly, variables with a memory location that is valid for their lifetime
can be iterated over using the range:
single_locs_begin(), single_locs_end()
Further detail
--------------
For an explanation of the dataflow implementation and the integration with
SelectionDAG, please see the reviews linked at the top of this commit message.
Reviewed By: jmorse
Hide the underlying DbgValueInst by adding methods to extract the necessary
information and by adding a raw_ostream &operator<< overload to print it.
Remove the DebugLoc field as this is always the same as the DbgValueInst's
DebugLoc (see D136247).
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D136249
handleDebugValue has two DebugLoc parameters that appear to always take the
same value. Remove one of the duplicate parameters. See phabricator review for
more detail.
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D136247
This removes the insertvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
This is very similar to the extractvalue removal from D125795.
insertvalue is also not supported in bitcode, so no auto-ugprade
is necessary.
ConstantExpr::getInsertValue() can be replaced with
IRBuilder::CreateInsertValue() or ConstantFoldInsertValueInstruction(),
depending on whether a constant result is required (with the latter
being fallible).
The ConstantExpr::hasIndices() and ConstantExpr::getIndices()
methods also go away here, because there are no longer any constant
expressions with indices.
Differential Revision: https://reviews.llvm.org/D128719
This patch adds the minimum required to successfully lower vp.icmp via
the new ISD::VP_SETCC node to RVV instructions.
Regular ISD::SETCC goes through a lot of canonicalization which targets
may rely on which has not hereto been ported to VP_SETCC. It also
supports expansion of individual condition codes and a non-boolean
return type. Support for all of that will follow in later patches.
In the case of RVV this largely isn't a problem as the vector integer
comparison instructions are plentiful enough that it can lower all
VP_SETCC nodes on legal integer vectors except for boolean vectors,
which regular SETCC folds away immediately into logical operations.
Floating-point VP_SETCC operations aren't as well supported in RVV and
the backend relies on condition code expansion, so support for those
operations will come in later patches.
Portions of this code were taken from the VP reference patches.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122743
The reason why I am making this change is that before this commit,
EmitFuncArgumentDbgValue relied on a boolean flag IsDbgDeclare both to signal
that a DBG_VALUE should be made to be indirect /and/ that the original intrinsic
was a dbg.declare. This is no longer always true if we add support for handling
dbg.addr since we will have an indirect DBG_VALUE that is a different intrinsic
from dbg.declare.
With that in mind, in this NFC patch, we prepare for future fixes by introducing
a 3 case-enum argument to EmitFuncArgumentDbgValue that allows the caller to
explicitly specify how the argument's DBG_VALUE should be emitted. This then
allows us to turn the indirect checks into a != FuncArgumentDbgValueKind::Value
and prepare us for a future where we add support here for llvm.dbg.addr
directly.
rdar://83957028
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D122945
D122053 set the ExtendType for ConstantSDNodes in getCopyToRegs to
ZERO_EXTEND to match assumptions in ComputePHILiveOutRegInfo. PHIs
are probably not the only way ConstantSDNodeNodes can get to
getCopyToRegs.
This patch adds an ExtendType parameter to CopyValueToVirtualRegister and
has HandlePHINodesInSuccessorBlocks pass ISD::ZERO_EXTEND for ConstantInts.
This way we only affect ConstantSDNodes for PHIs.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122171
This patch introduces two new experimental IR intrinsics and SDAG nodes
to represent vector strided loads and stores.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D114884
There is a pointer to the DataLayout in SelectionDAGBuilder called
'DL' that is hardly ever used. In most cases the code seems to just
use `DAG.getDataLayout()` instead. Given that DL is also often used
as a shadowed variable for the debug location it seems sensible to
just kill off the few remaining uses and be consistent with the rest
of the code.
Differential Revision: https://reviews.llvm.org/D114451
This patch fixes a case where the 'align' parameter attribute on the
pointer operands to llvm.vp.load and llvm.vp.store was being dropped
during the conversion to the SelectionDAG. The default alignment
equal to the ABI type alignment of the vector type was kept. It also
updates the documentation to reflect the fact that the parameter
attribute is now properly supported.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D114422
This is a port of the feature that allows the StackProtector pass to omit
checking code for stack canary checks, and rely on SelectionDAG to do it at a
later stage. The reasoning behind this seems to be to prevent the IR checking
instructions from hindering tail-call optimizations during codegen.
Here we allow GlobalISel to also use that scheme. Doing so requires that we
do some analysis using some factored-out code to determine where to generate
code for the epilogs.
Not every case is handled in this patch since we don't have support for all
targets that exercise different stack protector schemes.
Differential Revision: https://reviews.llvm.org/D98200
Followup to D99355: SDAG support for vector-predicated load/store/gather/scatter.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D105871
SwiftTailCC has a different set of requirements than the C calling convention
for a tail call. The exact argument sequence doesn't have to match, but fewer
ABI-affecting attributes are allowed.
Also make sure the musttail diagnostic triggers if a musttail call isn't
actually a tail call.
I've taken the following steps to add unwinding support from inline assembly:
1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:
```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
to label %exit unwind label %uexit
```
2.) Add Bitcode writing/reading support + LLVM-IR parsing.
3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.
4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.
5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.
6.) Don't allow unwinding callbr.
Reviewed By: Amanieu
Differential Revision: https://reviews.llvm.org/D95745
This patch adds a new llvm.experimental.stepvector intrinsic,
which takes no arguments and returns a linear integer sequence of
values of the form <0, 1, ...>. It is primarily intended for
scalable vectors, although it will work for fixed width vectors
too. It is intended that later patches will make use of this
new intrinsic when vectorising induction variables, currently only
supported for fixed width. I've added a new CreateStepVector
method to the IRBuilder, which will generate a call to this
intrinsic for scalable vectors and fall back on creating a
ConstantVector for fixed width.
For scalable vectors this intrinsic is lowered to a new ISD node
called STEP_VECTOR, which takes a single constant integer argument
as the step. During lowering this argument is set to a value of 1.
The reason for this additional argument at the codegen level is
because in future patches we will introduce various generic DAG
combines such as
mul step_vector(1), 2 -> step_vector(2)
add step_vector(1), step_vector(1) -> step_vector(2)
shl step_vector(1), 1 -> step_vector(2)
etc.
that encourage a canonical format for all targets. This hopefully
means all other targets supporting scalable vectors can benefit
from this too.
I've added cost model tests for both fixed width and scalable
vectors:
llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll
llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll
as well as codegen lowering tests for fixed width and scalable
vectors:
llvm/test/CodeGen/AArch64/neon-stepvector.ll
llvm/test/CodeGen/AArch64/sve-stepvector.ll
See this thread for discussion of the intrinsic:
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html
This patch introduces a new intrinsic @llvm.experimental.vector.splice
that constructs a vector of the same type as the two input vectors,
based on a immediate where the sign of the immediate distinguishes two
variants. A positive immediate specifies an index into the first vector
and a negative immediate specifies the number of trailing elements to
extract from the first vector.
For example:
@llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> ; index
@llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing element count
These intrinsics support both fixed and scalable vectors, where the
former is lowered to a shufflevector to maintain existing behaviour,
although while marked as experimental the recommended way to express
this operation for fixed-width vectors is to use shufflevector. For
scalable vectors where it is not possible to express a shufflevector
mask for this operation, a new ISD node has been implemented.
This is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].
Patch by Paul Walker and Cullen Rhodes.
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D94708
This patch adds partial support in Instruction Selection for dbg.values that use
a DIArgList. This patch does not add support for producing DBG_VALUE_LIST, but
adds the logic for processing DIArgLists within the ISel pass. This change is
largely focused on handleDebugValue and some of the functions that it calls.
Outside of this, salvageDebugInfo and transferDbgValues have been modified to
replace individual operands instead of the entire value; dangling debug info for
variadic debug values is not currently supported (but may be added later).
Differential Revision: https://reviews.llvm.org/D88589
This patch adds a new intrinsic experimental.vector.reduce that takes a single
vector and returns a vector of matching type but with the original lane order
reversed. For example:
```
vector.reverse(<A,B,C,D>) ==> <D,C,B,A>
```
The new intrinsic supports fixed and scalable vectors types.
The fixed-width vector relies on shufflevector to maintain existing behaviour.
Scalable vector uses the new ISD node - VECTOR_REVERSE.
This new intrinsic is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].
Patch by Paul Walker (@paulwalker-arm).
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html
Differential Revision: https://reviews.llvm.org/D94883
The mapping between registers and relative size has been updated to
use TypeSize to account for the size of scalable EVTs.
The patch is a NFCI, if not for the fact that with this change the
function `getUnderlyingArgRegs` does not raise a warning for implicit
conversion of `TypeSize` to `unsigned` when generating machine code
from the test added to the patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D92096
When processing PHI nodes after a callbr, we need to make sure that the
PHI nodes on the default branch are resolved after the callbr
(inserted after INLINEASM_BR). The PHI node values on the indirect
branches are processed before the INLINEASM_BR.
Differential Revision: https://reviews.llvm.org/D86260
This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130.
This required adding a SDNodeFlags to SelectionDAG::getSetCC.
Now we manage to contant fold some stuff undefs during the
initial getNode that we don't do in later DAG combines.
Differential Revision: https://reviews.llvm.org/D87200
This patch stops unconditionally transforming FSUB(-0,X) into an FNEG(X) while building the DAG. There is also one small change to handle the new FSUB(-0,X) similarly to FNEG(X) in the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D84056
I've always found the "findValue" a little odd and
inconsistent with other things in SDB.
This simplfifies the code in SDB to just handle a splat constant
address or a 2 operand GEP in the same BB. This removes the
need for "findValue" since the operands to the GEP are
guaranteed to be available. The splat constant handling is
new, but was needed to avoid regressions due to constant
folding combining GEPs created in CGP.
CGP is now responsible for canonicalizing gather/scatters into
this form. The pattern I'm using for scalarizing, a scalar GEP
followed by a GEP with an all zeroes index, seems to be subject
to constant folding that the insertelement+shufflevector was not.
Differential Revision: https://reviews.llvm.org/D76947