These new debug values get inserted after the place where the spill
happens, which means they won't be reached by the reverse traversal of
basic block instructions. This would crash or fail assertions if they
contained any virtual registers to be replaced. We can manually handle
the new debug values right away to resolve this.
Fixes https://github.com/llvm/llvm-project/issues/59172
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D139590
MC and lld/ELF defaults were flipped in 2016. For Clang: CMake
ENABLE_X86_RELAX_RELOCATIONS defaults to on in 2020. It makes sense for
the TargetOptions default to be true now.
R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX require GNU ld newer than 2015-10
(subsumed by the current requirement of -fbinutils-version=).
This should fix `rustc -Z plt=no` PIC relocatable files with GNU ld.
(See https://github.com/rust-lang/rust/pull/106380)
If the divisor is 1, the magic algorithm does not return a correct
result and we end up using a select to pick the numerator for those
elements at the end.
Therefore we can use undef for that element of the earlier operations
when the divisor is 1. We sometimes get this through SimplifyDemandedVectorElts,
but not always. Definitely seems like we don't if the NPQ fixup is used.
Unfortunately, DAGCombiner is unable to fold srl X, <0, undef> to X so
I had to add flags to avoid emitting the srl unless one of the shift
amounts is non-zero.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D141022
This error was introduced in 1d1de7467c32d52926ca56b9167a2c65c451ecfa (by me)
about 1 month ago. Found while testing the D140901 patch stack.
Reviewed By: jryans
Differential Revision: https://reviews.llvm.org/D141052
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
Add G_BUILD_VECTOR and G_BUILD_VECTOR_TRUNC to the list of opcodes in
`shouldCSEOpc`. This simplifies the code generated for vector splats.
Differential Revision: https://reviews.llvm.org/D140965
At the moment, `MachineIRBuilder::buildInstr` may build an instruction
with a different opcode than the one passed in as parameter. This may
cause confusion for its consumers, such as `CSEMIRBuilder`, which will
memoize the instruction based on the new opcode, but will search
through the memoized instructions based on the original one (resulting
in missed CSE opportunities). This is all the more unpleasant since
buildInstr is virtual and may call itself recursively both directly
and via buildCast, so it's not always easy to follow what's going on.
This patch simplifies the API of `MachineIRBuilder` so that the `buildInstr`
method does the least surprising thing (i.e. builds an instruction with
the specified opcode) and only the convenience `buildX` methods
(`buildMerge` etc) are allowed freedom over which opcode to use. This can
still be confusing (e.g. one might write a unit test using
`buildBuildVectorTrunc` but instead get a plain `G_BUILD_VECTOR`), but at
least it's explained in the comments.
In practice, this boils down to 3 changes:
* `buildInstr(G_MERGE_VALUES)` will no longer call itself with
`G_BUILD_VECTOR` or `G_CONCAT_VECTORS`; this functionality is moved to
`buildMerge` and replaced with an assert;
* `buildInstr(G_BUILD_VECTOR_TRUNC)` will no longer call itself with
`G_BUILD_VECTOR`; this functionality is moved to `buildBuildVectorTrunc`
and replaced with an assert;
* `buildInstr(G_MERGE_VALUES)` will no longer call `buildCast` and will
instead assert if we're trying to merge a single value; no change is
needed in `buildMerge` since it was already asserting more than one
source operand.
This change is NFC for users of the `buildX` methods, but users that
call `buildInstr` with relaxed parameters will have to update their code
(such instances will hopefully be easy to find thanks to the asserts).
Differential Revision: https://reviews.llvm.org/D140964
Instead of doing the adjustment in 3 different places in the code
base, do it inside UnsignedDivideUsingMagic::get.
Differential Revision: https://reviews.llvm.org/D141014
This appears to be the root problematic pattern
for AArch64 regression in D140677.
We already do this, and many more, as target-specific X86 combines,
so this isn't causing much of an impact.
The magic algorithm sets IsAdd indication for division by 1 that
the caller had to ignore.
I considered folding the ignore into UnsignedDivisionByConstantInfo,
but we only allow 1 for vectors of mixed visiors. And really what we
want to end up with is undef. Currently, we get to undef via
DemandedElts optimizations using the select instruction. We could
directly emit undef.
Differential Revision: https://reviews.llvm.org/D140940
The patch also adds expandVPCTLZ and expandVPCTTZ to expand vp.ctlz/cttz nodes
and the cost model of vp.ctlz/cttz.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D140370
If the src type is not extend type, after convert the truncate to and we need to truncate the and also to make sure the all user is legal.
The old fix D137613 doesn't work when the truncate convert to and have the other users. So this time I try to add the truncate after and to avoid all these potential issues.
Fix: #59554
Reviewed By: samparker
Differential Revision: https://reviews.llvm.org/D140869
The original computation was both making assumptions that do not hold
in practice, and being overly pessimistic. We should just check
every possible split factor, and pick the best one.
Fixes https://github.com/llvm/llvm-project/issues/59781
Single-element vectors are legalized by splitting,
so the the memory operations would also get scalarized.
While we do have some support to reconstruct scalarized loads,
we clearly don't catch everything.
The comment for the affected AArch64 store suggests that
having two stores was the desired outcome in the first place.
This was showing as a source of *many* regressions
with more aggressive ZERO_EXTEND_VECTOR_INREG recognition.
This kind of pattern seems to come up as regressions
with better ZERO_EXTEND_VECTOR_INREG recognition.
For initial implementation, this is quite restricted
to the minimal viable transform, otherwise there are
too many regressions to be dealt with.
HWASAN exposes some non-determinism in the pass and triggers:
ScalarEvolution.cpp:11540: bool llvm::ScalarEvolution::isLoopEntryGuardedByCond(const Loop *, ICmpInst::Predicate, const SCEV *, const SCEV *): Assertion `isAvailableAtLoopEntry(LHS, L) && "LHS is not available at Loop Entry"' failed.
E.g.
https://lab.llvm.org/buildbot/#/builders/236/builds/1629/steps/16/logs/stdio
is broken after D137838. I tried to split D137838 into smaller patches
and the one which reproduced was just a move of cpp from one dir to another.
Maybe it has something do to with comparison of tagged pointeres and
PtrSets used in pass.
Issues is hard to reproduce, even slight changes in path, or preprocessing
cpp file hide it.
DAGCombiner replaces (load const_addr1) directly chained with (store
(val, const_addr2)) with val if address space stripped const_addr1 ==
const_addr2. The patch fixes the issue by checking address spaces as
well. However, it might makes sense to not to chain together side
effects that belong to different address spaces in the first place and
make SelectionDAG::root address space aware.
This mainly cleans up a few patterns that are legalized by scalarization
from a wide-element vector, but then are further split apart to build
a more narrow-sized-element vector. In particular this happens in some
cases for illegal ISD::ZERO_EXTEND_VECTOR_INREG.
Given a ISD::EXTRACT_VECTOR_ELT, which is a glorified bit sequence extract,
recursively analyse all of it's users. and try to model themselves as
bit sequence extractions. If all of them agree on the new, narrower element
type, and all of them can be modelled as ISD::EXTRACT_VECTOR_ELT's of that
new element type, do that, but only if unmodelled users are ISD::BUILD_VECTOR.
If the dividend has leading zeros, we can use them to reduce the
size of the multiplier and avoid the fixup cases.
This patch is for scalars only, but we might be able to do this
for vectors in a follow up.
Differential Revision: https://reviews.llvm.org/D140750
The field is currently `void*`, which was originlly chosen in 2010 to not need to include `DenseMap`. Since then, `DenseMap` has been included in the header file anyways, so there is no more need to for the indirection via `void*` and the cruft around it can be removed.
Differential Revision: https://reviews.llvm.org/D140758
We might have sunk a bitcast into shuffle, and now it might be operating
on more fine-grained elements than what we'd match, so we must not be
dependent on whatever the granularity the shuffle happened to be in,
but transform it into the one canonical for us - with widest elements.
Sometimes we end up with a shuffles in DAG that would be
better represented as a `ISD::ZERO_EXTEND_VECTOR_INREG`,
and a failure to do so causes suboptimal codegen in a number of cases,
especially when we will then cast vector to scalar.
I acknowledge, the test changes here are rather underwhelming,
but as with all of codegen, it's always a yak shawing,
and this is the most stripped down version of the patch
that shows *some* effect without having insurmountable amount
of fallout to deal with. The next change resolves this regression.
The transformation will be extended in follow-ups.
This transformation could've triggered a verifier assert if RegA and RegB
were of different reg classes. Fix this by constraining as the comment
for replaceRegWith suggests.
Differential Revision: https://reviews.llvm.org/D140672
Adding zero-ext support isn't as straight-forward, and it's easier
to to so in a new function, but this helper is useful there.
This does not change any existing behaviour.
This patch fixes a simple bug where `DbgValueHistoryMap::hasNonEmptyLocation` was incorrectly handling DBG_VALUE_LIST instructions, treating empty values as non-empty, causing empty variables to be emitted into DWARF.
Reviewed By: Orlando
Differential Revision: https://reviews.llvm.org/D133925
Depending on the particular DAG, we might either create a `freeze`,
or not. And only in the former case, the cycle would be formed.
It would be nicer to have `ReplaceAllUsesOfValueWithIf()`,
like we have in IR, but we don't have that.
Fixes https://github.com/llvm/llvm-project/issues/59677
The original code was confusing. It was stripping poison-generating flags,
but the comments were saying that doing so was a TODO.
If the poison-generating flags are present, then even if all operands
are guaranteed not to be undef or poison, the whole operation may still
produce undef or poison. We can still deal with that case,
and we already do deal with it in fact, by also dropping those flags.
Refs. https://github.com/llvm/llvm-project/issues/59676
Lack of such operands implies that the op might be poison-producing due to
it's flags. We seem to drop them already, but the comments are confusing.
Fixes https://github.com/llvm/llvm-project/issues/59676