choikwa
45759fe5b4
[AMDGPU] Filter candidates of LiveRegOptimizer for profitable cases ( #124624 )
...
It is known that vectors whose elements fit in i16 will be split and
scalarized in SelectionDAG's type legalizer
(see SIISelLowering::getPreferredVectorAction).
LRO attempts to undo the scalarizing of vectors across basic block
boundaries and shoehorn Values into VGPRs. LRO is beneficial for operations
that natively work on illegal vector types, because it prevents flip-flopping
between unpacked and packed forms. If we know that operations on a vector
will be split and scalarized, then we don't want to shoehorn them back into
packed VGPRs.
Operations that we know to work natively on illegal vector types usually
come in the form of intrinsics (MFMA, DOT8), buffer stores, shuffles, and
phi nodes, to name a few.
2025-03-05 18:44:48 -05:00
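The filtering idea in the commit above (#124624) can be illustrated with a small standalone model. This is only a sketch and not the code in the patch: the `UserKind` enum, both function names, and the rule "profitable if at least one user works natively on the packed vector" are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified classification of the users of a candidate value.
enum class UserKind {
  MFMA,                 // matrix intrinsic
  Dot8,                 // dot-product intrinsic
  BufferStore,
  Shuffle,
  Phi,
  ScalarizedArithmetic  // e.g. an add that the legalizer will split anyway
};

// Assumption: these are the user kinds that consume an illegal vector type
// natively, following the list given in the commit message.
bool worksNativelyOnPackedVector(UserKind Kind) {
  switch (Kind) {
  case UserKind::MFMA:
  case UserKind::Dot8:
  case UserKind::BufferStore:
  case UserKind::Shuffle:
  case UserKind::Phi:
    return true;
  case UserKind::ScalarizedArithmetic:
    return false;
  }
  return false;
}

// Assumption: packing is only worthwhile when some user can take the packed
// form directly; otherwise the value would be unpacked again immediately.
bool isCoercionProfitable(const std::vector<UserKind> &Users) {
  return std::any_of(Users.begin(), Users.end(), worksNativelyOnPackedVector);
}
```

Under this model a value used only by scalarized arithmetic is left alone, while a value that also feeds an MFMA intrinsic remains a candidate.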
Kazu Hirata
66c31f5d02
[AMDGPU] Avoid repeated hash lookups (NFC) ( #126401 )
...
This patch just cleans up the "if" condition. Further cleanups are
left to subsequent patches.
2025-02-08 23:17:06 -08:00
Shilei Tian
f15da5fb78
[AMDGPU] Fix an invalid cast in AMDGPULateCodeGenPrepare::visitLoadInst ( #122494 )
...
Fixes: SWDEV-507695
2025-01-12 23:40:25 -05:00
Jay Foad
f9f7c42ca6
[AMDGPU] Refine AMDGPULateCodeGenPrepare class. NFC. ( #118792 )
...
Use references instead of pointers for most state and initialize it all
in the constructor, and similarly for the LiveRegOptimizer class.
2024-12-05 14:05:51 +00:00
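The refactoring pattern named in the commit above (#118792) looks roughly like the following. All type and member names here are hypothetical stand-ins and do not mirror the real class layout; the sketch only shows state held by reference and bound once in the constructor.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the state a pass depends on.
struct TargetInfo {
  std::string Name;
};
struct Uniformity {
  bool AllUniform;
};

class LatePrepareSketch {
  // References cannot be null and cannot be reseated, so every member is
  // valid for the whole lifetime of the object.
  const TargetInfo &TI;
  const Uniformity &UA;

public:
  // All state is bound here; there is no separate "init" step that could be
  // skipped or run twice.
  LatePrepareSketch(const TargetInfo &TI, const Uniformity &UA)
      : TI(TI), UA(UA) {}

  // Illustrative query that touches both pieces of state.
  bool isReady() const { return UA.AllUniform && !TI.Name.empty(); }
};
```

Compared with pointer members that are assigned later, this removes null checks from the member functions and makes a partially initialized object impossible to construct.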
Jay Foad
3923e0451a
[AMDGPU] Preserve all analyses if nothing changed ( #117994 )
2024-11-28 14:33:05 +00:00
Kazu Hirata
be187369a0
[AMDGPU] Remove unused includes (NFC) ( #116154 )
...
Identified with misc-include-cleaner.
2024-11-13 21:10:03 -08:00
Kazu Hirata
0cb80c4f00
[AMDGPU] Avoid repeated hash lookups (NFC) ( #113409 )
2024-10-22 23:02:34 -07:00
Jay Foad
8d13e7b8c3
[AMDGPU] Qualify auto. NFC. ( #110878 )
...
Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
2024-10-03 13:07:54 +01:00
Kazu Hirata
d7db094340
[AMDGPU] Avoid repeated hash lookups (NFC) ( #109506 )
2024-09-21 00:02:19 -07:00
Matt Arsenault
05b75e006b
AMDGPU/NewPM: Port AMDGPULateCodeGenPrepare to new pass manager ( #102806 )
2024-08-12 15:09:12 +04:00
Jeffrey Byrnes
03936534b5
[AMDGPU] Protect against null entries in ValMap
...
Change-Id: Icbda7c3fecf38679d06006986e5e17cb1f1b8749
2024-07-22 16:50:54 -07:00
Jeffrey Byrnes
6e68b75e66
[AMDGPU] Reland: Do not use original PHIs in coercion chains
...
Change-Id: I579b5c69a85997f168ed35354b326524b6f84ef7
2024-07-19 09:02:28 -07:00
Jay Foad
1612e4a351
Revert "[AMDGPU] Do not use original PHIs in coercion chains ( #98063 )"
...
This reverts commit dc8ea046a516c3bdd0ece306f406c9ea833d4dac.
It generated broken IR as described here:
https://github.com/llvm/llvm-project/pull/98063#issuecomment-2225259451
2024-07-15 15:15:29 +01:00
Jeffrey Byrnes
dc8ea046a5
[AMDGPU] Do not use original PHIs in coercion chains ( #98063 )
...
It's possible that we are unable to coerce all the incoming values of a
PHINode (A). Thus, we are unable to coerce the PHINode. In this
situation, we previously would add the PHINode back to the ValMap. This
would cause a problem if PHINode (B) was a user of A. In this scenario,
if B had been coerced, we would hit an assert regarding the incompatible
types of the PHINode and its incoming value.
Deleting non-coerced PHINodes from the map, and propagating the removal
to users, resolves the issue.
2024-07-10 11:32:45 -07:00
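The "delete and propagate to users" step from the commit above (#98063) can be modeled with a small worklist over integer ids. This is a sketch under stated assumptions, not the patch itself: `PhiId`, `UserGraph`, and `dropUncoercedPhi` are hypothetical names, and a plain set stands in for the ValMap.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using PhiId = int;

// Users[P] lists the PHIs that take P as an incoming value.
using UserGraph = std::map<PhiId, std::vector<PhiId>>;

// Removes Failed from the coerced set and then, transitively, every coerced
// PHI that uses a removed PHI. The Visited set keeps PHI cycles from looping.
void dropUncoercedPhi(PhiId Failed, const UserGraph &Users,
                      std::set<PhiId> &Coerced) {
  std::vector<PhiId> Worklist{Failed};
  std::set<PhiId> Visited;
  while (!Worklist.empty()) {
    PhiId Current = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(Current).second)
      continue; // already handled
    Coerced.erase(Current);
    auto It = Users.find(Current);
    if (It == Users.end())
      continue; // no PHI users recorded
    for (PhiId User : It->second)
      if (Coerced.count(User))
        Worklist.push_back(User);
  }
}
```

In the A/B scenario from the message, failing to coerce A also removes B, so no coerced PHI is left with an incoming value of the original type.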
Jeffrey Byrnes
5da7179cb3
[AMDGPU] Reland: Add IR LiveReg type-based optimization
2024-07-03 09:26:19 -07:00
Vitaly Buka
3e53c97d33
Revert "[AMDGPU] Add IR LiveReg type-based optimization" ( #97138 )
...
Part of #66838.
https://lab.llvm.org/buildbot/#/builders/52/builds/404
https://lab.llvm.org/buildbot/#/builders/55/builds/358
https://lab.llvm.org/buildbot/#/builders/164/builds/518
This reverts commit ded956440739ae326a99cbaef18ce4362e972679.
2024-06-28 23:18:26 -07:00
Jeffrey Byrnes
ded9564407
[AMDGPU] Add IR LiveReg type-based optimization
...
Change-Id: Ia0d11b79b8302e79247fe193ccabc0dad2d359a0
2024-06-28 15:01:39 -07:00
Jay Foad
89226ecbb9
[AMDGPU] Do not widen scalar loads on GFX12 ( #78724 )
...
GFX12 has subword scalar loads so there is no need to do this.
2024-01-19 15:30:07 +00:00
Jay Foad
4a77414660
[AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads ( #77633 )
2024-01-17 10:28:03 +00:00
Matt Arsenault
3e16167c14
AMDGPU: Use getTypeStoreSizeInBits
2023-04-29 10:35:06 -04:00
Matt Arsenault
4202ad5d94
AMDGPU: Don't create a pointer bitcast in AMDGPULateCodeGenPrepare
2023-04-29 10:34:21 -04:00
pvanhout
036431e31e
[AMDGPU] Use UniformityAnalysis in LateCodeGenPrepare
...
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D145366
2023-03-06 13:35:57 +01:00
Kazu Hirata
4bef0304e1
[AArch64, AMDGPU] Use make_early_inc_range (NFC)
2021-11-03 09:22:51 -07:00
Nikita Popov
357756ecf6
[OpaquePtr] Remove uses of CreateConstGEP1_64() without element type
...
Remove uses of to-be-deprecated API.
2021-07-17 16:43:20 +02:00
Matt Arsenault
a15ed701ab
AMDGPU: Fix assert on constant load from addrspacecasted pointer
...
This was trying to create a bitcast between different address spaces.
2021-05-11 20:12:20 -04:00
Nikita Popov
46354bac76
[OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC)
...
Explicitly pass loaded type when creating loads, in preparation
for the deprecation of these APIs.
There are still a couple of uses left.
2021-03-11 14:40:57 +01:00
dfukalov
6a87e9b08b
[NFC][AMDGPU] Reduce include files dependency.
...
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Michael Liao
46c3d5cb05
[amdgpu] Add the late codegen preparation pass.
...
Summary:
- Teach that pass to widen naturally aligned but not DWORD aligned
sub-DWORD loads.
Reviewers: rampitec, arsenm
Subscribers:
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80364
2020-10-27 14:07:59 -04:00
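The widening mentioned in the summary above amounts to a small piece of arithmetic: load the DWORD that contains the value, shift the wanted bytes down, and truncate. The sketch below shows that arithmetic on a plain byte buffer. It is an illustration only; the function name and the buffer-based interface are assumptions, and the bytes are assembled in little-endian order by hand so the result does not depend on the host.

```cpp
#include <cassert>
#include <cstdint>

// Reads a naturally aligned 1- or 2-byte value at byte Offset by loading the
// enclosing DWORD instead of issuing a sub-DWORD load.
uint32_t loadSubDwordWidened(const unsigned char *Base, uint64_t Offset,
                             unsigned SizeInBytes) {
  assert((SizeInBytes == 1 || SizeInBytes == 2) && "expected a sub-DWORD load");
  assert(Offset % SizeInBytes == 0 && "expected a naturally aligned load");

  // Distance from the start of the enclosing DWORD.
  uint64_t Adjust = Offset & 0x3;
  uint64_t AlignedOffset = Offset - Adjust;

  // The DWORD-aligned 32-bit load, assembled in little-endian byte order.
  uint32_t Dword = 0;
  for (unsigned I = 0; I < 4; ++I)
    Dword |= static_cast<uint32_t>(Base[AlignedOffset + I]) << (8 * I);

  // Shift the requested bytes to the bottom and truncate to the load size.
  uint32_t Shifted = Dword >> (Adjust * 8);
  uint32_t Mask = (1u << (SizeInBytes * 8)) - 1;
  return Shifted & Mask;
}
```

Natural alignment matters here: a naturally aligned 1- or 2-byte value never straddles a DWORD boundary, so a single 32-bit load always covers it. When `Adjust` is zero the shift is a no-op and only the truncation remains.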