This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfterPointer(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support, but none of that is included in this
patch. It should mostly be NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
is not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
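
As a rough illustration of the hasValue()/getValue() pattern mentioned above, here is a minimal sketch. `SizeOrUnknown`, `fromLegacy` and `accessFitsIn` are hypothetical stand-ins written for this example, not LLVM's LocationSize or any MMO API; only the hasValue()/getValue() naming follows the text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a size that is either precise or unknown.
class SizeOrUnknown {
  uint64_t Value;
  bool Known;
  SizeOrUnknown(uint64_t V, bool K) : Value(V), Known(K) {}

public:
  static SizeOrUnknown precise(uint64_t V) { return SizeOrUnknown(V, true); }
  static SizeOrUnknown unknown() { return SizeOrUnknown(0, false); }

  // Mirrors the conversion described above: the all-ones sentinel
  // ~UINT64_C(0) is mapped to "unknown" instead of a huge unsigned size.
  static SizeOrUnknown fromLegacy(uint64_t V) {
    return V == ~UINT64_C(0) ? unknown() : precise(V);
  }

  bool hasValue() const { return Known; }
  uint64_t getValue() const {
    assert(Known && "size is unknown");
    return Value;
  }
};

// Caller pattern: read the size only after hasValue() has been checked.
bool accessFitsIn(SizeOrUnknown Size, uint64_t Limit) {
  return Size.hasValue() && Size.getValue() <= Limit;
}
```

With the sentinel kept as a plain uint64_t, a comparison such as `Size <= Limit` would silently treat "unknown" as a very large number; the explicit check makes that case visible at each call site.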
Perform the requested arithmetic and produce a carry output in addition
to the normal result.
Clang has them as builtins (__builtin_add_overflow_p). The middle end
has intrinsics for them (sadd_with_overflow).
AArch64: ADDS Add and set flags
On Neoverse V2, they run at half the throughput of basic arithmetic and
have a limited set of pipelines.
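
For readers unfamiliar with these operations, a small sketch of "arithmetic plus an overflow output" at the source level. It uses `__builtin_add_overflow`, which GCC and Clang provide, rather than the `_p` variant named above; `AddResult` and `addWithOverflow` are names invented for this example.

```cpp
#include <cstdint>

// Hypothetical result pair: the wrapped sum and the overflow bit.
struct AddResult {
  int32_t Value;
  bool Overflow;
};

// Signed 32-bit add that also reports whether the exact result did not fit.
AddResult addWithOverflow(int32_t A, int32_t B) {
  AddResult R;
  // The builtin stores the wrapped result and returns true on overflow.
  R.Overflow = __builtin_add_overflow(A, B, &R.Value);
  return R;
}
```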
This combine transforms an unmerge where only the first element is used
into a truncate. That works OK for scalars, but for vectors it needs to
insert a bitcast to integers, perform the truncate, then bitcast back to
vectors. This generates more awkward code than using an Unmerge.
It is purely based on symmetry. Registers can be scalars, vectors, and
non-constants.
X < 5.0 || X > 5.0
->
X != 5.0
X < Y && X > Y
->
FCMP_FALSE
X < Y && X < Y
->
FCMP_TRUE
see InstCombinerImpl::foldLogicOfFCmps
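
One way to picture this kind of fold, as a sketch only: LLVM's fcmp predicates are numbered so that each of the low four bits stands for one outcome (equal, greater, less, unordered). For two compares of the same operands, AND and OR of the compares then correspond to bitwise AND and OR of the predicate codes. The enum below copies the FCMP_* numbering from CmpInst::Predicate; `foldAnd` and `foldOr` are hypothetical helpers, not the combiner's actual code.

```cpp
#include <cstdint>

// fcmp predicate codes: bit 0 = equal, bit 1 = greater,
// bit 2 = less, bit 3 = unordered.
enum FCmpCode : uint8_t {
  FCMP_FALSE = 0, FCMP_OEQ = 1,  FCMP_OGT = 2,  FCMP_OGE = 3,
  FCMP_OLT = 4,   FCMP_OLE = 5,  FCMP_ONE = 6,  FCMP_ORD = 7,
  FCMP_UNO = 8,   FCMP_UEQ = 9,  FCMP_UGT = 10, FCMP_UGE = 11,
  FCMP_ULT = 12,  FCMP_ULE = 13, FCMP_UNE = 14, FCMP_TRUE = 15
};

// (fcmp L X, Y) && (fcmp R X, Y): both must hold, so intersect the outcomes.
constexpr FCmpCode foldAnd(FCmpCode L, FCmpCode R) {
  return static_cast<FCmpCode>(L & R);
}

// (fcmp L X, Y) || (fcmp R X, Y): either may hold, so take the union.
constexpr FCmpCode foldOr(FCmpCode L, FCmpCode R) {
  return static_cast<FCmpCode>(L | R);
}
```

Under this encoding, `foldOr(FCMP_OLT, FCMP_OGT)` gives FCMP_ONE and `foldAnd(FCMP_OLT, FCMP_OGT)` gives FCMP_FALSE, matching the first two examples above.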
…tructions into account
Hint instructions like G_ASSERT_ZEXT can be viewed as a copy. Teaching
the combiner this fact allows it to match more patterns involving
such instructions.
Since we already know which register we want to extend, we don't have to
ask its defining MI about it
---------
Co-authored-by: Emil Tywoniak <Emil.Tywoniak@hightec-rt.com>
Instcombine canonicalizes selects to floating point and integer minmax.
This and the dag combiner canonicalize to floating point minmax. None of
them canonicalizes to integer minmax. On Neoverse V2 basic integer
arithmetic and integer minmax have the same costs.
The pre-index matcher just needs some small heuristics to make sure it
doesn't cause regressions. Apart from that it's a simple change, since
the only difference is an immediate operand of '1' vs '0' in the
instruction.
There isn't a test for this yet since the combines aren't used atm, but it will
be tested as part of a future commit. I'm just making this a separate change
for tidiness reasons.
Combine any funnel shift with a shift amount of 0 to a copy.
Modulo is applied to shift amount if it is larger than the
instruction's bitwidth.
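
To see why a zero (or bitwidth-multiple) shift amount degenerates to a copy, here is a reference sketch of 32-bit funnel shift semantics. `fshl32` and `fshr32` are hypothetical helpers for illustration, not the combiner's code.

```cpp
#include <cstdint>

// Funnel shift left: concatenate Hi:Lo, shift left by Amt modulo 32,
// and keep the upper 32 bits.
uint32_t fshl32(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  Amt %= 32;
  if (Amt == 0)
    return Hi; // plain copy of the first operand
  return (Hi << Amt) | (Lo >> (32 - Amt));
}

// Funnel shift right: concatenate Hi:Lo, shift right by Amt modulo 32,
// and keep the lower 32 bits.
uint32_t fshr32(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  Amt %= 32;
  if (Amt == 0)
    return Lo; // plain copy of the second operand
  return (Hi << (32 - Amt)) | (Lo >> Amt);
}
```

Note that the two directions copy different operands when the effective amount is zero: the left shift yields the first operand and the right shift yields the second.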
Differential Revision: https://reviews.llvm.org/D157591
uses when looking for load/store users. This was a simple logic bug during translation
of the equivalent function in SelectionDAG:
```
  for (SDNode *Node : N->uses()) {
    if (auto *LoadStore = dyn_cast<MemSDNode>(Node)) {
```
After D157690 we are seeing some crashes from Global ISel, which seem to be
related to the shift_of_shifted_logic_chain combine that can remove too many
instructions if the shift amount is zero.
This limits the fold to non-zero shifts, under the assumption that it is better
in that case to fold away the shift to a COPY.
Differential Revision: https://reviews.llvm.org/D158596
Rewrites some simple rules that cause little to no codegen regressions as MIR patterns.
I may have missed some easy cases, but some other rules have intentionally been left as-is because bigger
changes are needed to make them work.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D157690
This check was unnecessary/incorrect: it was already being done by the target
hook default implementation, and the one in the matcher was checking for a
completely different thing. This change:
1) Removes the check and updates affected tests which now do some more reassociations.
2) Modifies the AMDGPU hooks which were stubbed with "return true" to also do the oneuse
check. Not sure why I didn't do this the first time.
There is no case where those functions return false. They always return true.
Even if they were to return false, it's not really something we should rely on, I think.
With the current combiner implementation, it would just make `tryCombineAll` return false without trying any more rules.
I also believe that if an apply function were to return false, it would mean that the match function is not good enough. Asserting on failure in an apply function is a better idea, IMO.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153619
- (op (op X, C1), C2) -> (op X, (op C1, C2))
- (op (op X, C1), Y) -> (op (op X, Y), C1)
Some code duplication with the G_PTR_ADD reassociations unfortunately but no
easy way to avoid it that I can see.
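
The two rewrites listed above can be sketched on plain integers, assuming `op` is associative and commutative (for example wrapping add, and, or, xor). `reassocConstants` and `moveConstantOutward` are hypothetical names for this illustration and operate on values, not on MIR.

```cpp
#include <cstdint>

// Rule 1: (op (op X, C1), C2) -> (op X, (op C1, C2)).
// The inner (op C1, C2) involves only constants, so it can be folded.
template <typename Op>
uint32_t reassocConstants(Op op, uint32_t X, uint32_t C1, uint32_t C2) {
  uint32_t Folded = op(C1, C2);
  return op(X, Folded);
}

// Rule 2: (op (op X, C1), Y) -> (op (op X, Y), C1).
// The constant moves to the outer operation.
template <typename Op>
uint32_t moveConstantOutward(Op op, uint32_t X, uint32_t C1, uint32_t Y) {
  return op(op(X, Y), C1);
}
```

Both forms compute the same value as the original expression for such operations; the benefit is that constants end up next to each other or on the outermost node, where later folds can reach them.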
Differential Revision: https://reviews.llvm.org/D150230