Vectorized ADDRSPACECASTs were not supported by the type legalizer.
This patch adds the support for:
- splitting the vector result: <2 x ptr> => 2 x <1 x ptr>
- scalarization: <1 x ptr> => ptr
- widening: <3 x ptr> => <4 x ptr>
This is all exercised by the added NVPTX tests.
Reverts llvm/llvm-project#90982
NIter was only declared in !NDEBUG, and only used for assertions - so it
was correct that it was incremented inside the assertion. (& in fact now
the non-asserts build fails, because the variable is incremented even
though it isn't declared)
If the shuffle mask contains no undef elements, then we can move the freeze through a shuffle node.
This requires special case handling to create a new ShuffleVectorSDNode.
Includes VECTOR_SHUFFLE support for isGuaranteedNotToBeUndefOrPoison / canCreateUndefOrPoison.
When looking through a right shift, we need to make sure that all of
the bits we are using from the shift come from the shift input and
not the sign or zero bits that are shifted in.
Fixes#90936.
This is the mirror to the recent atomic load change. The same
bitcast-back-to-integer case is a small code quality regression for the
same reason. This would disappear with a bitcastable legal 128-bit type.
In case of functions without a stack frame no "stack" field is
serialized into MIR which leads to isCalleeSavedInfoValid being false
when reading a MIR file back in. To fix this we should serialize
MachineFrameInfo::isCalleeSavedInfoValid() into MIR.
If `LLVM_APPEND_VC_REV` is on, add the git revision to the `.file`
string. The revision can be set with `LLVM_FORCE_VC_REVISION`.
Before:
`.file "git_revision.cpp",,"LLVM version 19.0.0git"`
After:
`.file "git_revision.cpp",,"LLVM version 19.0.0git (LLVM_REVISION)"`
If this flag is set, Xor will not be considered AddLike. If an Xor were
treated as an Add it may wrap. If we can prove there would be no carry out and
thus no wrap, the Xor would be turned into a disjoint Or by DAGCombine.
Use this new flag to fix a bug in X86 where an Xor is incorrectly being treated
as an NUWAdd.
Fixes#90668.
During analysis, we incorrectly leave the offset part of an address info
struct
as zero, when in actual fact we failed to decompose it into base +
offset.
This results in incorrectly assuming that the address is adjacent to
another store
addr. To fix this we wrap the offset in an optional<> so we can
distinguish between
real zero and unknown.
Fixes issue #90242
In #70452 DAGCombiner::mayAlias was taught to handle scalable sizes, but
when it checks via AA->isNoAlias it didn't take into account the case
where the size is scalable but there was an offset too.
For the fixed length case the offset was just accounted for by adding to
the LocationSize, but for the scalable case there doesn't seem to be a
way to represent both a scalable and fixed part in it. So this patch
works around it by bailing if there is an offset.
Fixes#90559
LegalizeVectorType is responsible for legalizing nodes that perform an
operation on each element may need to scalarize.
This is not true for nodes like VP_REDUCE.*, BUILD_VECTOR,
SHUFFLE_VECTOR, EXTRACT_SUBVECTOR, etc.
This patch drops any nodes with a scalar result from LegalizeVectorOps
and handles them in LegalizeDAG instead.
This required moving the reduction promotion to LegalizeDAG. I have
removed the support integer promotion as it was incorrect for integer
min/max reductions. Since it was untested, it was best to assert on it
until it was really needed.
There are a couple regressions that can be fixed with a small DAG
combine which I will do as a follow up.
In new pass system, `MachineFunction` could be an analysis result again,
machine module pass can now fetch them from analysis manager.
`MachineModuleInfo` no longer owns them.
Remove `FreeMachineFunctionPass`, replaced by
`InvalidateAnalysisPass<MachineFunctionAnalysis>`.
Now `FreeMachineFunction` is replaced by
`InvalidateAnalysisPass<MachineFunctionAnalysis>`, the workaround in
`MachineFunctionPassManager` is no longer needed, there is no difference
between `unittests/MIR/PassBuilderCallbacksTest.cpp` and
`unittests/IR/PassBuilderCallbacksTest.cpp`.
We don't need to explicitly create an instance of ArrayRef here
because getIndexedOffsetInType takes ArrayRef, and ArrayRef can be
implicitly constructed from a C array.
This reverts commit 16bd10a38730fed27a3bf111076b8ef7a7e7b3ee.
Re-applies:
b3c55b707110084a9f50a16aade34c3be6fa18da - "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)"
8e2f6495c0bac1dd6ee32b6a0d24152c9c343624 - "[DAGCombiner] Do not always fold FREEZE over BUILD_VECTOR (#85932)"
73472c5996716cda0dbb3ddb788304e0e7e6a323 - "[SelectionDAG] Treat CopyFromReg as freezing the value (#85932)"
with a fix in DAGCombiner::visitFREEZE.
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
from the experimental namespace.
All these intrinsics exist in LLVM for more than a year now, and are
widely used, so should not be considered as experimental.
This reverts:
b3c55b707110084a9f50a16aade34c3be6fa18da - "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)"
(because it updates a test case that I don't know how to resolve the conflict for)
8e2f6495c0bac1dd6ee32b6a0d24152c9c343624 - "[DAGCombiner] Do not always fold FREEZE over BUILD_VECTOR (#85932)"
73472c5996716cda0dbb3ddb788304e0e7e6a323 - "[SelectionDAG] Treat CopyFromReg as freezing the value (#85932)"
Due to a test suite failure on AArch64 when compiling for SVE.
https://lab.llvm.org/buildbot/#/builders/197/builds/13955
clang: ../llvm/llvm/include/llvm/CodeGen/ValueTypes.h:307: MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed.
According to langref, llvm.maximum/minimum has -0.0 < +0.0 semantics and
propagates NaN.
Expand the nodes on targets not supporting the operation, by adding
extra check for NaN and using is_fpclass to check zero signs.
[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison
Handle SELECT_CC similarly as SETCC.
Handle these operations that only propagate poison/undef based on the
input operands:
SADDSAT, UADDSAT, SSUBSAT, USUBSAT, MULHU, MULHS,
SMIN, SMAX, UMIN, UMAX
These operations may create poison based on shift amount and exact
flag being violated:
SRL, SRA
One goal here is to allow pushing freeze through these operations
when allowed, as well as letting analyses such as
isGuaranteedNotToBeUndefOrPoison to not break on such operations.
Since some problems have been observed with pushing freeze through
SRA/SRL we block that explicitly in DAGCombiner::visitFreeze now.
That way we can still model SRA/SRL properly in
SelectionDAG::canCreateUndefOrPoison, e.g. when used by
isGuaranteedNotToBeUndefOrPoison, even if we do not want to push
freeze through those instructions.
It's really unfortunate that STORE and ATOMIC_STORE are separate
opcodes. This duplicates a basic simplify demanded for the truncating
case. This avoids some AMDGPU lit regressions in a future patch.
I'm not sure how to craft a test that exposes this without first
introducing the regressions by promoting half to i16.
(Sorry, not an actual build_pair node just a similar pattern).
For cases where we're concatenating 2 integers into a double width integer, see if both integer sources are NOT patterns.
We could take this further and handle all logic ops with a constant operands, but I just wanted to handle the case reported on #89533 initially.
Fixes#89533
Avoid turning a BUILD_VECTOR that can be recognized as "all zeros",
"all ones" or "constant" into something that depends on
freeze(undef), as that would destroy those properties.
Instead we replace undef by 0/-1 in such vectors, making it possible
to fold away the freeze. We typically use -1 if the BUILD_VECTOR
would identify as "all ones", and otherwise we use the value 0.
The description of CopyFromReg in ISDOpcodes.h says that the input
valus is defined outside the scope of the current SelectionDAG. I
think that means that we basically can treat it as a FREEZE in the
sense that it can be seen as neither being undef nor poison.
Being able to fold freeze(CopyFromReg) into CopyFromReg seems
useful to avoid regressions if we start to introduce freeze
instruction in DAGCombiner/foldBoolSelectToLogic, e.g. to solve
https://github.com/llvm/llvm-project/issues/84653
Things _not_ dealt with in this patch:
- Depending on calling convention an input argument can be passed
also on the stack and not in a register. If it is allowed to treat
an argument received in a register as not being poison, then I think
we want to treat arguments received on the stack the same way. But
then we need to attribute load instructions, or add explicit FREEZE
when lowering formal arguments.
- A common pattern is that there is an AssertZext or AssertSext just
after CopyFromReg. I think that if we treat CopyFromReg as never
being poison, then it should be allowed to fold
(freeze(AssertZext(CopyFromReg))) -> AssertZext(CopyFromReg))
These hooks should be removed. This is a trivial legalization transform
the legalizer needs to support. The IR just complicates things, and it
was losing metadata. Implement the DAG promotion support, and switch
AMDGPU over to using it.
Really we'd be a lot better off merging ATOMIC_LOAD and LOAD like
GlobalISel does.
This matches the style used in the Analysis version of this routine, and
makes it less likely we'll miss a poison generating flag in future
changes. Unlike IR, the check for poison generating flags doesn't need
to switch over opcode since all nodes have the SDFlags storage.
As demonstrated in the test change, when deciding to sink a trunc we
were losing its flags. This patch moves to cloning the original
instruction instead.
All of these constructors were creating a SDVTList using an EVT* created
by SDNode::getValueTypeList. This EVT needs to live at least as long as
the SDNode that uses it. To do this, SDNode::getValueTypeList contains
several function scoped static variables that hold the memory for the
EVT. So the EVT lives until global destructors run.
This is problematic since an EVT contains a Type* that points to memory
allocated by an LLVMContext. If multiple LLVMContexts are used that
don't have overlapping lifetimes, we can end up with stale or or
incorrect pointers cached in the EVTs owned by SDNode::getValueTypeList.
I want to try to make the EVTs be owned by SelectionDAG instead. This is
already done for SDVTLists with more than 1 VT. The single value case is
a very old optimizaton that should be re-evaluated. In order to do this,
I need the SDVTLists to be created by SelectionDAG rather than by the
SDNode itself.
This patch doesn't change how the allocation is done yet. It just moves
the code around.
This patch does reduce the number of calls to getVTList since we now
share with the call needed for the SDNode FoldingSet.
Part of fixing #88233.