383 Commits

Author SHA1 Message Date
Amara Emerson
a946934a12 [GlobalISel][NFC] Use GPhi wrapper in more places instead of iterating over operands. 2024-01-11 22:25:53 -08:00
Emil J
3baedb4111
[GISel] Fix #77762: extend correct source registers in combiner helper rule extend_through_phis (#77765)
Since we already know which register we want to extend, we don't have to
ask its defining MI about it

---------

Co-authored-by: Emil Tywoniak <Emil.Tywoniak@hightec-rt.com>
2024-01-12 12:09:58 +08:00
Thorsten Schütt
d7642b2200
[GlobalIsel] Combine select to integer minmax (second attempt). (#77520)
Instcombine canonicalizes selects to floating point and integer minmax.
This and the dag combiner canonicalize to floating point minmax. None of
them canonicalizes to integer minmax. On Neoverse V2 basic integer
arithmetic and integer minmax have the same costs.
2024-01-11 09:50:33 +01:00
Thorsten Schütt
a085402ef5 Revert "[GlobalIsel] Combine select of binops (#76763)"
This reverts commit 1687555572ee4fb435da400dde02e7a1e60b742c.
2024-01-06 17:04:24 +01:00
Thorsten Schütt
1687555572
[GlobalIsel] Combine select of binops (#76763) 2024-01-06 11:28:10 +01:00
Thorsten Schütt
4b9194952d
[GlobalIsel] Combine selects with constants (#76089)
A first small step at combining selects.
2024-01-02 17:26:39 +01:00
Jie Fu
a8d5f731d6 [GlobalISel] Remove unused variable 'ResultTy' in CombinerHelper.cpp (NFC)
llvm-project/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:1179:7:
 error: unused variable 'ResultTy' [-Werror,-Wunused-variable]
  LLT ResultTy = MRI.getType(MI.getOperand(0).getReg());
      ^
1 error generated.
2023-12-06 14:08:07 +08:00
Pranav Taneja
41507fe595
[GISel] Combine (Scalarize) vector load followed by an element extract. 2023-12-06 11:23:23 +05:30
Antonio Frighetto
0ff5281c94 [GlobalISel] Treat shift amounts as unsigned in matchShiftImmedChain
A miscompilation issue in the GISel pre-legalization
phase has been addressed with improved routines.

Fixes: https://github.com/llvm/llvm-project/issues/71440.
2023-11-24 18:14:52 +01:00
Pierre van Houtryve
b26e6a8eb5
[GlobalISel] Add GITypeOf special type (#66079)
Allows creating a register/immediate that uses the same type as a
matched operand.
2023-10-31 09:57:10 +01:00
Amara Emerson
93659947d2
[AArch64][GlobalISel] Add support for pre-indexed loads/stores. (#70185)
The pre-index matcher just needs some small heuristics to make sure it
doesn't cause regressions. Apart from that it's a simple change, since
the only difference is an immediate operand of '1' vs '0' in the
instruction.
2023-10-26 10:29:12 -07:00
Amara Emerson
1b11729dc0
[AArch64][GlobalISel] Add support for post-indexed loads/stores. (#69532)
Gives small code size improvements across the board at -Os CTMark.

Much of the work is porting the existing heuristics in the DAGCombiner.
2023-10-24 13:51:59 -07:00
Jay Foad
21c2ba4bdb [GlobalISel] Remove TargetLowering::isConstantUnsignedBitfieldExtractLegal
Use LegalizerInfo::isLegalOrCustom instead.

Differential Revision: https://reviews.llvm.org/D116807
2023-09-27 15:58:01 +01:00
Amara Emerson
ea7157ff4f [GlobalISel] Propagate mem operands for indexed load/stores.
There isn't a test for this yet since the combines aren't used atm, but it will
be tested as part of a future commit. I'm just making this a separate change
tidyness reasons.
2023-09-25 14:41:20 -07:00
Amara Emerson
7e5c2672cb [GlobalISel][NFC] Clean up and modernize the indexed load/store combines.
Use wrappers and helpers to tidy it up, and remove some debug prints.
2023-09-25 00:32:36 -07:00
Amara Emerson
bc6e7f0573 [GlobalISel][NFC] Remove unused method CombinerHelper::tryCombine()
The combines were ported to the tablegen combiner a long time ago so this
manual method isn't needed.
2023-09-24 22:20:21 +08:00
Amara Emerson
985362e2f3
[AArch64][GlobalISel] Avoid running the shl(zext(a), C) -> zext(shl(a, C)) combine. (#67045) 2023-09-22 09:37:52 +08:00
Amara Emerson
eaab3245d4
[GlobalISel] Add constant folding support for G_FMA/G_FMAD in the combiner. (#65659) 2023-09-09 16:32:02 +08:00
Amara Emerson
1cc9f626cb
[GlobalISel] Add constant-folding of FP binops to combiner. (#65230) 2023-09-07 19:33:35 +03:00
Tuan Chuong Goh
b7a305deca [AArch64][GlobalISel] Optimise Combine Funnel Shift
Combine any funnel shift with a shift amount of 0 to a copy.
Modulo is applied to shift amount if it is larger than the
instruction's bitwidth.

Differential Revision: https://reviews.llvm.org/D157591
2023-09-07 11:58:12 +01:00
Vladislav Dzhidzhoev
1d0d2dfce7 [GlobalISel] Fold G_SHUFFLE_VECTOR with a single element mask to G_EXTRACT_VECTOR_ELT
It introduces minor regression in arm64-vcvt_f.ll, which will be fixed
later.
2023-09-07 12:03:04 +02:00
Amara Emerson
6c31f20fee
[GlobalISel] Fold fmul x, 1.0 -> x (#65379) 2023-09-06 03:14:16 +08:00
Amara Emerson
08e04209d8
[GlobalISel] Commute G_FMUL and G_FADD constant LHS to RHS. (#65298) 2023-09-05 23:48:34 +08:00
Amara Emerson
91746d15d2
[GlobalISel] Fix G_PTR_ADD immediate chain combine using the wrong im… (#65271) 2023-09-05 08:06:40 +08:00
Amara Emerson
0065640f40 [GlobalISel] Look through a G_PTR_ADD's def register instead of it's source operand's
uses when looking for load/store users. This was a simple logic bug during translation
of the equivalent function in SelectionDAG:
```
    for (SDNode *Node : N->uses()) {
      if (auto *LoadStore = dyn_cast<MemSDNode>(Node)) {
```
2023-09-04 00:28:57 -07:00
Amara Emerson
1ef1eec8db [GlobalISel][NFC] Use GenericMachineInstrs in CombinerHelper::reassociationCanBreakAddressingModePattern(). 2023-09-03 21:17:31 -07:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Amara Emerson
c95ed6e492 [GlobalISel] Try to commute G_CONSTANT_FOLD_BARRIER LHS operands to RHS.
Differential Revision: https://reviews.llvm.org/D159097
2023-08-30 08:07:22 -07:00
Matt Arsenault
d86a7d631c GlobalISel: Add constant fold combine for zext/sext/anyext
Could use more work for vectors.

https://reviews.llvm.org/D156534
2023-08-24 08:10:01 -04:00
David Green
adaf545a50 [GlobalISel] Limit shift_of_shifted_logic_chain to non-zero folds
After D157690 we are seeing some crashes from Global ISel, which seem to be
related to the shift_of_shifted_logic_chain combine that can remove too many
instructions if the shift amount is zero.

This limits the fold to non-zero shifts, under the assumption that it is better
in that case to fold away the shift to a COPY.

Differential Revision: https://reviews.llvm.org/D158596
2023-08-23 18:17:37 +01:00
pvanhout
2d87319f06 [GlobalISel] Rewrite some simple rules using MIR Patterns
Rewrites some simple rules that cause little to no codegen regressions as MIR patterns.

I may have missed some easy cases, but some other rules have intentionally been left as-is because bigger
changes are needed to make them work.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D157690
2023-08-22 09:09:54 +02:00
Jay Foad
a1a9c53ae7 [GlobalISel] Fix infinite loop in reassociation combine
Don't reassociate (C1+C2)+Y -> C1+(C2+Y).

Fixes https://github.com/llvm/llvm-project/issues/63849

Differential Revision: https://reviews.llvm.org/D155284
2023-07-16 14:15:24 +01:00
Amara Emerson
3a80bdb316 [GlobalISel] Remove an erroneous oneuse check in the G_ADD reassociation combine.
This check was unnecessary/incorrect, it was already being done by the target
hook default implementation, and the one in the matcher was checking for a
completely different thing. This change:
 1) Removes the check and updates affected tests which now do some more reassociations.
 2) Modifies the AMDGPU hooks which were stubbed with "return true" to also do the oneuse
    check. Not sure why I didn't do this the first time.
2023-07-10 01:03:12 -07:00
pvanhout
5eb8cb0949 [NFC][GlobalISel] Don't return bool from apply functions
There is no case where those functions return false. It's always return true.
Even if they were to return false, it's not really something we should rely on I think.
With the current combiner implementation, it would just make `tryCombineAll` return false without retrying anymore rules.

I also believe that if an applyer were to return false, it would mean that the match function is not good enough. Asserting on failure in an apply function is a better idea, IMO.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153619
2023-06-26 09:23:58 +02:00
Amara Emerson
6f6298e5b3 [GlobalISel] Fix D144336 in a different way, by choosing operands from the first of the div/rem insts.
Differential Revision: https://reviews.llvm.org/D144336
2023-06-09 15:06:06 -07:00
Amara Emerson
086601eac2 [GlobalISel] Implement some binary reassociations, G_ADD for now
- (op (op X, C1), C2) -> (op X, (op C1, C2))
- (op (op X, C1), Y) -> (op (op X, Y), C1)

Some code duplication with the G_PTR_ADD reassociations unfortunately but no
easy way to avoid it that I can see.

Differential Revision: https://reviews.llvm.org/D150230
2023-06-08 21:14:58 -07:00
Mirko Brkusanin
792667dadd [GlobalISel] Check if ShiftAmt is greater then size of operand
matchCombineShlOfExtend did not check if the size of new shift would be
wider then a size of operand. Current condition did not work if the value
being shifted was zero. Updated to support vector splat.

Patch by: Acim Maravic

Differential Revision: https://reviews.llvm.org/D151122
2023-06-08 17:37:59 +02:00
Amara Emerson
fbdcd54442 [GlobalISel] Fix DIVREM combine from inserting a divrem before its operands' defs.
In some rare corner cases where in between the div/rem pair there's a def of
the second instruction's source (but a different vreg due to the combine's
eqivalence checks), it will place the DIVREM at the first instruction's point,
causing a use-before-def. There wasn't an obvious fix that stood out to me
without doing more involved analysis than a combine should really be doing.

Fixes issue #60516

I'm open to new suggestions on how to approach this, as I'm not too happy
at bailing out here. It's not the first time we run into issues with value liveness
that the DAG world isn't affected by.

Differential Revision: https://reviews.llvm.org/D144336
2023-06-04 00:24:38 -07:00
Matt Arsenault
a5e03972f7 GlobalISel: Move fconstant matching into tablegen
I don't really understand what the point of wip_match_opcode is.
It doesn't seem to have any purpose other than to list opcodes
to have all the logic in pure C++. You can't seem to  use it to
select multiple opcodes in the same way you use match.

Something is wrong with it, since the match emitter prints
"errors" if an opcode is covered by wip_match_opcode and
then appears in another pattern. For exmaple with this patch,
you see this several times in the build:

  error: Leaf constant_fold_fabs is unreachable
  note: Leaf idempotent_prop will have already matched

The combines are actually produced and the tests for them
do pass, so this seems to just be a broken warning.
2023-05-19 22:44:12 +01:00
Matt Arsenault
7f54b38e28 GlobalISel: Refactor unary FP op constant folding 2023-05-18 08:33:43 +01:00
Mikael Holmen
6647f3cd01 [CombinerHelper] Fix gcc warning [NFC]
Without the fix gcc complains with
 ../lib/CodeGen/GlobalISel/CombinerHelper.cpp:1652:52: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  1652 |          SrcDef->getOpcode() == TargetOpcode::G_OR && "Unexpected op");
       |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
2023-05-10 09:08:06 +02:00
Amara Emerson
e1472db58e [GlobalISel] Implement commuting shl (add/or x, c1), c2 -> add/or (shl x, c2), c1 << c2
There's a target hook that's called in DAGCombiner that we stub here, I'll
implement the equivalent override for AArch64 in a subsequent patch since it's
used by different shift combine.

This change by itself has minor code size improvements on arm64 -Os CTMark:
Program                                       size.__text
                                              outputg181ppyy output8av1cxfn diff
consumer-typeset/consumer-typeset             410648.00      410648.00       0.0%
tramp3d-v4/tramp3d-v4                         364176.00      364176.00       0.0%
kimwitu++/kc                                  449216.00      449212.00      -0.0%
7zip/7zip-benchmark                           576128.00      576120.00      -0.0%
sqlite3/sqlite3                               285108.00      285100.00      -0.0%
SPASS/SPASS                                   411720.00      411688.00      -0.0%
ClamAV/clamscan                               379868.00      379764.00      -0.0%
Bullet/bullet                                 452064.00      451928.00      -0.0%
mafft/pairlocalalign                          246184.00      246108.00      -0.0%
lencod/lencod                                 428524.00      428152.00      -0.1%
                           Geomean difference                               -0.0%

Differential Revision: https://reviews.llvm.org/D150086
2023-05-08 22:37:43 -07:00
NAKAMURA Takumi
d45fae6010 Move CodeGen/LowLevelType => CodeGen/LowLevelTypeUtils
Before restoring `CodeGen/LowLevelType`, rename this to `LowLevelTypeUtils`.

Differential Revision: https://reviews.llvm.org/D148768
2023-04-25 08:53:17 +09:00
Amara Emerson
29c851f4e2 [GlobalISel] Move the truncstore_merge combine to the LoadStoreOpt pass and add support for an extra case.
If we have set of mergeable stores of shifts, but the original source value being shifted
is wider than the merged size, we should still be able to merge if we truncate first. To do this
however we need to search for stores speculatively up the block, without knowing exactly how
many stores we should see before we stop. The old algorithm has to match an exact number of
stores to fit the wide type, or it dies. The new one will try to set the wide type to however
many stores we found in the upwards block traversal and use later checks to verify if they're
a valid mergeable set.

The reason I need to move this to LoadStoreOpt is because the combiner works going top down
inside a block, which means that we end up doing partial merges because we haven't seen all
the possible stores before we mutate the MIR. In LoadStoreOpt we can go bottom up.

As a side effect of this change, we also end up doing better on an existing test case (missing_store)
since we manage to do a partial merge there.
2023-04-12 16:43:14 -07:00
Amara Emerson
4bc6434624 [GlobalISel] Fix an assertion failure in matchHoistLogicOpWithSameOpcodeHands().
We use this combine in the AArch64 postlegalizer combiner, which causes this
function to query the legalizer rules for the action for an invalid opcode/type
combination (G_AND and p0). Moving the legalizer query until after the validity
check in matchHoistLogicOpWithSameOpcodeHands() fixes this.
2023-02-26 15:42:57 -08:00
Kazu Hirata
9e5d2495ac Use APInt::isOne instead of APInt::isOneValue (NFC)
Note that isOneValue has been soft-deprecated in favor of isOne.
2023-02-19 23:06:36 -08:00
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC) 2023-02-19 22:04:47 -08:00
Amara Emerson
ddf167c442 [GlobalISel] Fix G_ZEXTLOAD being converted to G_SEXTLOAD incorrectly.
The extending loads combine tries to prefer sign-extends folding into loads vs
zexts, and in cases where a G_ZEXTLOAD is first used by a G_ZEXT, and then used
by a G_SEXT, it would select the G_SEXT even though the load is already
zero-extending.

Fixes issue #59630
2023-02-18 10:05:08 -08:00
Amara Emerson
b309bc04ee [GlobalISel] Combine out-of-range shifts to undef.
Differential Revision: https://reviews.llvm.org/D144303
2023-02-17 15:05:00 -08:00
Kazu Hirata
7e6e636fb6 Use llvm::has_single_bit<uint32_t> (NFC)
This patch replaces isPowerOf2_32 with llvm::has_single_bit<uint32_t>
where the argument is wider than uint32_t.
2023-02-15 22:17:27 -08:00