10 Commits

Author SHA1 Message Date
Phoebe Wang
08ddbab866
[X86][AMX] Fix missing stride register for tileloadd (#110226)
Fixes: #110190
2024-10-15 13:02:00 +08:00
Phoebe Wang
e5c93ed348
[X86][AMX] Checking AMXProgModel in X86LowerTileCopy (#94358)
This fixes compile time regression after #93692.
2024-06-06 21:10:14 +08:00
Phoebe Wang
8aa33f16e9
[X86][AMX] Check also AMX register live out for copy lowering (#93692)
Another bug fix for #83628.
2024-06-03 22:12:22 +08:00
Phoebe Wang
b576a6b045
[X86][AMX] Fix a bug after #83628 (#91207)
We need to check whether `GR64Cand` is a valid register before using it.

A test is not needed since this is covered by llvm-test-suite.

Fixes #90954
2024-05-15 23:15:48 +08:00
Phoebe Wang
42bc4f692d
Reland "[X86] X86LowerTileCopy: Find dead register to use to prevent save-reload of tile register (#83628)"
Fixes compile time regression in previous commit.
2024-04-29 09:52:50 +08:00
Nikita Popov
e32c4dfefc
Revert "[X86] X86LowerTileCopy: Find dead register to use to prevent save-reload of tile register (#83628)"
This reverts commit 34acbb3801515f9f41cc2d790d26072eb004ac46.

This change causes major compile-time regressions.
2024-04-21 16:02:34 +09:00
AtariDreams
34acbb3801
[X86] X86LowerTileCopy: Find dead register to use to prevent save-reload of tile register (#83628)
2024-04-21 09:40:06 +08:00
XinWang10
dd6fec5d4f
[X86][APX]Support lowering for APX promoted AMX-TILE instructions (#78689)
Encoding and decoding of promoted AMX-TILE instructions were added in
https://github.com/llvm/llvm-project/pull/76210.
This patch adds lowering support for promoted AMX-TILE instructions and
integrates the tests into the existing tests.
2024-01-22 11:33:23 +08:00
Kazu Hirata
2c4ba3e9d3
[Target] Use make_early_inc_range (NFC)
2021-11-05 09:14:32 -07:00
Luo, Yuanke
8f48ddd193
[X86][AMX] Lower tile copy instruction.
Since there is no tile copy instruction, we need to store the tile
register to the stack and load it from the stack into another tile
register. We need an extra GR to hold the stride, and a stack slot to
hold the tile register's data. We run this pass after copy
propagation so that we don't miss copy optimizations, and before
prolog/epilog insertion so that we can still allocate the stack
slot.

Differential Revision: https://reviews.llvm.org/D97112
2021-02-23 07:49:42 +08:00
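
The store-then-reload scheme described in the oldest commit above can be sketched as a small standalone model. This is an illustrative sketch only, not the code in X86LowerTileCopy: the function `lowerTileCopy`, the placeholder operands `tile_slot` and `gr_slot`, and the 64-byte stride are assumptions made for the example. The `scratchIsDead` flag loosely mirrors the idea in #83628 of using a dead register so that the scratch register does not have to be saved and reloaded.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model, not the LLVM implementation: returns a textual
// instruction sequence for "dst = COPY src" between two tile registers.
// `scratch` is the general-purpose register chosen to hold the stride;
// `scratchIsDead` says whether its old value may be clobbered freely.
// `tile_slot` and `gr_slot` are placeholder names for stack slots.
std::vector<std::string> lowerTileCopy(const std::string &src,
                                       const std::string &dst,
                                       const std::string &scratch,
                                       bool scratchIsDead) {
  assert(src != dst && "a self-copy needs no lowering");
  std::vector<std::string> seq;
  if (!scratchIsDead)
    seq.push_back("mov %" + scratch + ", (gr_slot)"); // save the live value
  // Stride between tile rows; 64 bytes is an assumption of this sketch.
  seq.push_back("mov $64, %" + scratch);
  // Spill the source tile to the stack, then reload it into the destination.
  seq.push_back("tilestored %" + src + ", (tile_slot,%" + scratch + ")");
  seq.push_back("tileloadd (tile_slot,%" + scratch + "), %" + dst);
  if (!scratchIsDead)
    seq.push_back("mov (gr_slot), %" + scratch); // restore the live value
  return seq;
}
```

Calling `lowerTileCopy("tmm0", "tmm1", "rax", false)` yields five entries (save, set stride, store, load, restore), while passing `true` for a dead scratch register yields only the three middle ones.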