53 Commits

Author SHA1 Message Date
Jay Foad
2012b25420
[AMDGPU][GlobalISel] Disable fixed-point iteration in all Combiners (#105517)
Disable fixed-point iteration in all AMDGPU Combiners after #102163.

This saves around 2% compile time in ad hoc testing on some large
graphics shaders. I did not notice any regressions in the generated
code, just a bunch of harmless differences in instruction selection and
register allocation.
2024-08-22 17:14:53 +01:00
David Green
b635d690ed [NFC] Fix laod -> load typos. NFC 2024-06-21 09:26:44 +01:00
paperchalice
837dc542b1
[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-11 21:27:14 +08:00
Jay Foad
e2c89254e1 [AMDGPU] Fix typo in function name 2024-05-06 08:55:09 +01:00
Jay Foad
d5ca2e46ca
[AMDGPU] Improve MIR pattern for FMinFMaxLegacy combine. NFC. (#90968) 2024-05-05 10:29:01 +01:00
Jay Foad
1cde1240ed [AMDGPU] Use replaceOpcodeWith instead of applyCombine_s_mul_u64. NFC. 2024-05-03 15:32:47 +01:00
Jay Foad
99ca40849d [AMDGPU] Remove unneeded calls to setInstrAndDebugLoc in matchers. NFC. 2024-05-03 15:01:47 +01:00
Jay Foad
72e07d48e0 [AMDGPU] Simplify applySelectFCmpToFMinToFMaxLegacy. NFC. 2024-05-03 14:20:53 +01:00
Nick Anderson
8bd327d6fe
[AMDGPU][GlobalISel] Add fdiv / sqrt to rsq combine (#78673)
Fixes #64743
2024-02-22 09:47:36 +01:00
Jay Foad
4a77414660
[AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads (#77633) 2024-01-17 10:28:03 +00:00
Jay Foad
08da7ac80c
[AMDGPU] Fix broken sign-extended subword buffer load combine (#77470) 2024-01-10 10:50:13 +00:00
Jay Foad
daa4728dee
[AMDGPU] Add CodeGen support for GFX12 s_mul_u64 (#75825) 2024-01-08 19:13:38 +00:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
pvanhout
aaf6755631 [GlobalISel] Refactor Combiner API
Remove CodeGen leftovers from the old combiner backend and adapt the API to fit the new backend better.
It's now quite a bit closer to how InstructionSelector works.

- `CombinerInfo` is now a simple "options" struct.
- `Combiner` is now the base class of all TableGen'd combiner implementation.
    - Many fields have been moved from derived classes into that class.
    - It has been refactored to create & own the Observer and Builder.
- `tryCombineAll` TableGen'd method can now be renamed, which allows targets to implement the actual `tryCombineAll` call manually and do whatever they want to do before/after it.

Note: `CombinerHelper` needs to be mutable because none of its methods are const. This can be revisited later.

Depends on D158710

Reviewed By: aemerson, dsanders

Differential Revision: https://reviews.llvm.org/D158713
2023-09-05 08:19:05 +02:00
Matt Arsenault
e954085f80 AMDGPU: Fix more unsafe rsq formation
Introducing rsq contract flags is wrong, and also requires some level
of approximate functions. AMDGPUCodeGenPrepare already should handle
the f32 cases with appropriate flags, and I don't see how new
situations to handle would arise during legalization (other than cases
involving the rcp intrinsic, which instcombine tries to
handle). AMDGPUCodeGenPrepare does need to learn better handling of
rcp/rsq for f64 though, which we never bothered to handle well.

Removes another obstacle to correctly lowering sqrt.

https://reviews.llvm.org/D158099
2023-08-23 19:28:49 -04:00
Sameer Sahasrabuddhe
d9847cde48 [GlobalISel] convergent intrinsics
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:

- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS

Out of the targets that currently have some support for GlobalISel, the patch
assumes that the convergent intrinsics only relevant to SPIRV and AMDGPU.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154766
2023-07-31 12:15:39 +05:30
Sameer Sahasrabuddhe
7c760b224b Restore "[GlobalISel] GIntrinsic subclass to represent intrinsics in Generic Machine IR"
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic
ID is the first non-def operand to the instruction. These are now represented as
a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID()
is now moved to this subclass GIntrinsic.

Some target-defined instructions behave like GMIR intrinsics, and have an
Intrinsic::ID operand. But they should not be recognized as generic intrinsics,
and should not use GIntrinsic::getIntrinsicID(). Separated these out by
introducing a new AMDGPU::getIntrinsicID().

Reviewed By: arsenm, Pierre-vh

Differential Revision: https://reviews.llvm.org/D155556

This restores commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa.
Originally reverted in d0f7850b01cf17e50a4f4b00e3b84dded94df6b8.
2023-07-27 14:49:17 +05:30
Sameer Sahasrabuddhe
d0f7850b01 Revert "[GlobalISel] GIntrinsic subclass to represent intrinsics in Generic Machine IR"
This reverts commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa.

The changes did not cover all occurrences of the deteleted function
MachineInstr::getIntrinsicID().
2023-07-27 10:14:24 +05:30
Sameer Sahasrabuddhe
baa3386edb [GlobalISel] GIntrinsic subclass to represent intrinsics in Generic Machine IR
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic
ID is the first non-def operand to the instruction. These are now represented as
a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID()
is now moved to this subclass GIntrinsic.

Some target-defined instructions behave like GMIR intrinsics, and have an
Intrinsic::ID operand. But they should not be recognized as generic intrinsics,
and should not use GIntrinsic::getIntrinsicID(). Separated these out by
introducing a new AMDGPU::getIntrinsicID().

Reviewed By: arsenm, Pierre-vh

Differential Revision: https://reviews.llvm.org/D155556
2023-07-27 10:00:45 +05:30
pvanhout
8444038d16 [AMDGPU] Use GlobalISel MatchTable Combiner Backend
Use the new matchtable-based combiner backend for all AMDGPU combiners.
This drop-in from the user's perspective; there are no test changes, the new combiner behaves exactly like the old one.

Depends on D153757

NOTE: This would land iff D153757 (RFC) lands too.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153758
2023-07-11 11:27:13 +02:00
Konstantina Mitropoulou
944f429b21 [AMDGPU] Improve the lowering of raw_buffer_load_{i8,i16} and struct_buffer_load_{i8,i16} intrinsics
Currently, raw_buffer_load_{i8,i16} and struct_buffer_load_{i8,i16}
intrinsics are lowered as buffer_load_{u8,u16}. This patch combines
buffer_load_{u8,u16} and sign extension instructions in order to
generate buffer_load_{i8,i16} instructions.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D144313
2023-02-22 09:01:33 -08:00
Matt Arsenault
36cfe26a52 AMDGPU: Try to unfold fneg source when matching legacy fmin/fmax
This is NFC as it stands, since other combines will effectively
prevent this from being reachable. This will avoid regressions in a
future change which tries to make better use of select source
modifiers.

Didn't bother with the GlobalISel part for now, since the baseline
combine doesn't seem to work on the existing test.
2023-02-02 22:50:23 -04:00
Pierre van Houtryve
63390dccd8 [GlobalISel] Add Predicates to GICombineRule
Small QoL change to allow Predicates to be used in GICombineRule.
Currently only one combine in the AMDGPU backend makes use of it.

The implementation is pretty simple to get started but of course we can expand this later on and optimize predicate checking better if needed.

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D136681
2022-10-26 07:13:40 +00:00
Jessica Paquette
42cb2f8b12 [GlobalISel] Mark mi_match as nodiscard
Typically when you match something, you want to check the result.

Fix a couple warnings in the AMDGPUPostLegalizerCombiner which appear as a
result of this.

Differential Revision: https://reviews.llvm.org/D135491
2022-10-07 15:47:05 -07:00
Amara Emerson
3daf7ddaef [GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo.
Before, the isPreLegalize() query in CombinerHelper only checked for the
presence of a LegalizerInfo object. This is problematic when we want to have
a combine actually check for legality in a pre-legalizer combine pass, since
if we pass a LegalizerInfo object to the constructor it causes the combines to
think that we're running *post* legalizer, which isn't true.

This change fixes it to instead check an explicit bool that passes to signal
whether the pass will be run before or after legalization.

Doing so exposed a bug in the extending loads combine, which tried to check for
legality of candidate extending loads if LegalizerInfo was present. Since we
only ran it pre-legalizer and therefore with a null LegalizerInfo, it never
actually ran. Also fixes the legality checks to keep the tests passing.

Differential Revision: https://reviews.llvm.org/D135044
2022-10-03 07:36:18 +01:00
Mateja Marjanovic
ca57b80cd6 Code quality: Combine V_RSQ
Combine V_RCP and V_SQRT into V_RSQ on AMDGPU for GlobalISel.

Change-Id: I93c5dcb412483156a6e8b68c4085cbce83ac9703
2021-11-30 17:17:15 +01:00
Mirko Brkusanin
db6bc2ab51 [AMDGPU][GlobalISel] Fold G_FNEG above when users cannot fold mods
If possible fold fneg into instruction above if users cannot fold mods and we
know it will decrease instruction count.
Follows same logic as SDAG combiner in choosing opportunities to combine.

Differential Revision: https://reviews.llvm.org/D112827
2021-11-17 14:25:13 +01:00
Petar Avramovic
fb7be0d912 AMDGPU/GlobalISel: Remove redundant G_FCANONICALIZE
Add basic version of isCanonicalized for global-isel. Copied from sdag.
Add post legalizer combine that deletes G_FCANONICALIZE when its input
is already Canonicalized.

Differential Revision: https://reviews.llvm.org/D96605
2021-04-27 12:26:37 +02:00
Thomas Symalla
faeed774d1 Fixed includes.
Differential Revision: https://reviews.llvm.org/D93708
2021-02-02 09:14:54 +01:00
Thomas Symalla
09508d2849 Reverted whitespace changes.
Differential Revision: https://reviews.llvm.org/D90968
2021-02-02 09:14:54 +01:00
Thomas Symalla
52bfb50145 Formatting changes 2021-02-02 09:14:53 +01:00
Thomas Symalla
7d24026ed2 Formatting changes. 2021-02-02 09:14:53 +01:00
Thomas Symalla
bcd6c2d203 Updating formatting changes. 2021-02-02 09:14:53 +01:00
Thomas Symalla
ecbed4e0ab Resolve formatting changes. 2021-02-02 09:14:53 +01:00
Thomas Symalla
cdfd9b3bf5 Move Combiner to PreLegalize step 2021-02-02 09:14:53 +01:00
Thomas Symalla
9a8da909f1 Reverted unintended git-format change. 2021-02-02 09:14:52 +01:00
Thomas Symalla
dae85e4671 Fixed the lit tests and a bug in the implementation. 2021-02-02 09:14:52 +01:00
Thomas Symalla
88a832aef1 Refactored the pattern matching. 2021-02-02 09:14:52 +01:00
Thomas Symalla
fce3230be2 Added early exit. 2021-02-02 09:14:52 +01:00
Thomas Symalla
d722924f20 Added comments. 2021-02-02 09:14:52 +01:00
Thomas Symalla
ec043967ec clang-format 2021-02-02 09:14:52 +01:00
Thomas Symalla
62af0305b7 Added clamp i64 to i16 global isel pattern. 2021-02-02 09:14:52 +01:00
dfukalov
560d7e0411 [NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036
2021-01-20 22:22:45 +03:00
dfukalov
6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Petar Avramovic
0031418dce AMDGPU/GlobalISel: Use same builder/observer in post-legalizer-combiner
Change match/apply functions into methods of new target specific combiner
helper class. Use reference to MachineIRBuilder from helper instead of
constructing new MachineIRBuilder each time new instruction needs to made.
Allows correct tracking of newly created instructions.

Differential Revision: https://reviews.llvm.org/D90623
2020-11-03 09:24:50 +01:00
Matt Arsenault
66ffa0e91f AMDGPU/GlobalISel: Fix using post-legal combiner without LegalizerInfo 2020-08-17 09:19:22 -04:00
Matt Arsenault
16bcd54570 AMDGPU/GlobalISel: Mark GlobalISel classes as final 2020-07-28 11:42:17 -04:00
Matt Arsenault
db777eaea3 AMDGPU/GlobalISel: Fix asserts on non-s32 sitofp/uitofp sources
The combine to form cvt_f32_ubyte0 was assuming the source type was
always 32-bit, but this needs to tolerate any legal source type.
2020-06-23 10:00:35 -04:00
Daniel Sanders
e35ba09961 [gicombiner] Allow generated combiners to store additional members
Summary:
Adds the ability to add members to a generated combiner via
a State base class. In the current AArch64PreLegalizerCombiner
this is used to make Helper available without having to
provide it to every call.

As part of this, split the command line processing into a
separate object so that it still only runs once even though
the generated combiner is constructed more frequently.

Depends on D81862

Reviewers: aditya_nandakumar, bogner, volkan, aemerson, paquette, arsenm

Reviewed By: arsenm

Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81863
2020-06-16 14:47:04 -07:00
Matt Arsenault
0ba40d4ccf AMDGPU/GlobalISel: Combines for V_CVT_F32_UBYTE[0-3]
Ports the existing DAG combines, minus the simplify demanded bits
which seems to have no equivalent now. Without these, this isn't
particularly helpful in most of the IR sample cases.
2020-04-13 19:18:19 -04:00