602 Commits

Author SHA1 Message Date
Gang Chen
60dbde69cd
[AMDGPU] report named barrier cnt part2 (#154588) 2025-08-20 12:00:45 -07:00
Gang Chen
ef68d1587d
[AMDGPU] upstream barrier count reporting part1 (#154409) 2025-08-19 16:42:31 -07:00
Stanislav Mekhanoshin
3d6177c14b
[AMDGPU] Avoid setting op_sel_hi bits if there is matrix_b_scale. NFCI. (#154176)
This is NFCI now as there is no matrix_b_scale without matrix_b_reuse,
but technically this condition shall be here.
2025-08-18 12:13:31 -07:00
Kazu Hirata
cbf5af9668
[llvm] Remove unused includes (NFC) (#154051)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-08-17 23:46:35 -07:00
Fangrui Song
2cedb286b8 MCSymbol: Remove unused IsTarget parameter from declareCommon 2025-08-16 15:47:39 -07:00
Stanislav Mekhanoshin
57c1e01e48
[AMDGPU] Don't allow wgp mode on gfx1250 (#153680)
- gfx1250 only supports cu mode
2025-08-14 15:16:56 -07:00
Stanislav Mekhanoshin
ea14834966
[AMDGPU] Per-subtarget DPP instruction classification (#153096)
This is NFCI at this point.
2025-08-11 15:41:02 -07:00
Stanislav Mekhanoshin
37fe9f6382
[AMDGPU] Add gfx1250 v_wmma_scale[16]_f32_16x16x128_f8f6f4 MC support (#152014)
This adds new VOP3PX2e encoding
2025-08-04 14:20:12 -07:00
Stanislav Mekhanoshin
dd0737bd99
[AMDGPU] gfx1250 v_wmma_ld_scale instructions (#152010) 2025-08-04 11:36:48 -07:00
Fangrui Song
e640ca8b9a MCSymbolELF: Migrate away from classof
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
2025-08-03 15:45:36 -07:00
Fangrui Song
d3589edafc MCAsmBackend::applyFixup: Change Data to indicate the relocated location
`Data` now references the first byte of the fixup offset within the current fragment.

MCAssembler::layout asserts that the fixup offset is within either the
fixed-size content or the optional variable-size tail, as this is the
most the generic code can validate without knowing the target-specific
fixup size.

Many backends applyFixup assert
```
assert(Offset + Size <= F.getSize() && "Invalid fixup offset!");
```

This refactoring allows a subsequent change to move the fixed-size
content outside of MCSection::ContentStorage, fixing the
-fsanitize=pointer-overflow issue of #150846

Pull Request: https://github.com/llvm/llvm-project/pull/151724
2025-08-02 09:27:06 -07:00
Fangrui Song
491c7bdd58 MCAsmBackend::applyFixup: Replace Data.getSize() with F.getSize()
to facilitate replacing `MutableArrayRef<char> Data` (fragment content)
with the relocated location. This is necessary to fix the
pointer-overflow sanitizer issue and reland #150846
2025-08-01 00:31:51 -07:00
Stanislav Mekhanoshin
c7bb105e97
[AMDGPU] Add v_cvt_scale_pk8_* gfx1250 instructions (#151616) 2025-07-31 18:55:59 -07:00
Stanislav Mekhanoshin
ce40863209
[AMDGPU] Add v_cvt_sr|pk_bf8|fp8_f16 gfx1250 instructions (#151415) 2025-07-30 17:24:45 -07:00
Pierre van Houtryve
be17791f26
[AMDGPU][gfx1250] Add cu-store subtarget feature (#150588)
Determines whether we can use `SCOPE_CU` stores (on by default), or
whether all stores must be done at `SCOPE_SE` minimum.
2025-07-29 11:38:43 +02:00
Stanislav Mekhanoshin
a0b854d576
[AMDGPU] MC support for gfx1250 scale_offset modifier (#149881) 2025-07-21 15:04:59 -07:00
Changpeng Fang
d6094370cb
AMDGPU: Support v_wmma_f32_16x16x128_f8f6f4 on gfx1250 (#149684)
Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
2025-07-21 10:09:42 -07:00
Stanislav Mekhanoshin
6d8e53d4af
[AMDGPU] Support nv memory instructions modifier on gfx1250 (#149582) 2025-07-18 14:38:46 -07:00
Stanislav Mekhanoshin
2d6534b7da
[AMDGPU] gfx1250 64-bit relocations and fixups (#148951) 2025-07-15 17:13:42 -07:00
Changpeng Fang
b80b02536b
AMDGPU: Implement MC layer support for gfx1250 wmma instructions. (#148570)
Regular wmma/swmmac plus matrix reuse only.

---------

Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Co-authored-by: Shilei Tian <Shilei.Tian@amd.com>
2025-07-15 00:48:57 -07:00
Fangrui Song
5ba458c559 MCFixup: Replace getTargetKind with getKind 2025-07-15 00:21:07 -07:00
Fangrui Song
0b674f4c52 MCFixup: Replace getTargetKind with getKind
MCFixupKind is now a type alias (fixup kinds are inherently
target-specific). getTargetKind is no longer necessary.
2025-07-15 00:08:45 -07:00
Stanislav Mekhanoshin
f090554359
[AMDGPU] MC support for v_fmaak_f64/v_fmamk_f64 gfx1250 intructions (#148282) 2025-07-11 14:17:03 -07:00
Mariusz Sikora
d7859ed047
[AMDGPU][NFC] Remove unused return (#147912) 2025-07-10 11:33:35 +02:00
Stanislav Mekhanoshin
00a85e5704
[AMDGPU] gfx1250: MC support for 64-bit literals (#147861) 2025-07-09 22:25:47 -07:00
LU-JOHN
85cc4afdef
[NFC][AMDGPU] Do not hardcode minimum instruction alignment (#147785)
Use symbolic value for minimum instruction alignment.

Signed-off-by: John Lu <John.Lu@amd.com>
2025-07-09 14:24:33 -04:00
Fangrui Song
0393084adc
MC: Store MCRelaxableFragment MCInst out-of-line
Follow-up to #146307

Moved MCInst storage to MCSection, enabling trivial ~MCRelaxableFragment
and eliminating the need for a fragment walk in ~MCSection.

Updated MCRelaxableFragment::getInst to construct an MCInst on demand.
Modified MCAssembler::relaxInstruction's mayNeedRelaxation to accept
opcode and operands instead of an MCInst, avoiding redundant MCInst
creation. Note that MCObjectStreamer::emitInstructionImpl calls
mayNeedRelaxation before determining the target fragment for the MCInst.

Unfortunately, we also have to encode `MCInst::Flags` to support
the EVEX prefix, e.g. `{evex} xorw $foo, %ax`

There is a small decrease in max-rss (stage1-ReleaseLTO-g (link only))
with negligible instructions:u change.
https://llvm-compile-time-tracker.com/compare.php?from=0b533f2d9f0551aaffb13dcac8e0fd0a952185b5&to=f26b57f33bc7ccae749a57dfc841de7ce2acc2ef&stat=max-rss&linkStats=on

Next: Enable MCFragment to store fixed-size data (was MCDataFragment's job)
and optional Opcode/Operands data (was MCRelaxableFragment's job),
and delete MCDataFragment/MCRelaxableFragment.
This will allow re-encoding of Data+Relax+Data+Relax sequences as
Frag+Frag. The saving should outweigh the downside of larger
MCFragment.

Pull Request: https://github.com/llvm/llvm-project/pull/147229
2025-07-08 09:44:27 -07:00
Fangrui Song
5a40023497 MCAsmBackend: Reduce FK_NONE uses 2025-07-05 13:22:07 -07:00
Fangrui Song
244e053b6c MC: Remove llvm/MC/MCFixupKindInfo.h
The file used to define `MCFixupKindInfo`, a simple structure,
which is now in MCAsmBackend.h.
2025-07-05 11:24:11 -07:00
Fangrui Song
158fa4ae83 AMDGPU: Replace deprecated FK_PCRel_ with FK_Data_ fixup and PCRel flag
We will unify the generic fixup kinds FK_Data_ and FK_PCRel_. A
FK_PCRel_ kind is essentially the corresponding FK_Data_ fixup with the
PCRel flag set.
2025-07-04 22:45:52 -07:00
Fangrui Song
b418e73bec AMDGPUMCCodeEmitter: Set PCRel at fixup creation
Avoid reliance on the MCAssembler::evaluateFixup workaround that checks
MCFixupKindInfo::FKF_IsPCRel. Additionally, standardize how fixups are
appended. This helper will facilitate future fixup data structure
optimizations.
2025-07-04 18:00:29 -07:00
Fangrui Song
20b3ab5683 MCFixup: Remove unused Loc argument
MCFixup::Loc has been removed in favor of MCExpr::Loc through
`const MCExpr *Value` (commit 777391a2164b89d2030ca013562151ca3c3676d1).
2025-07-04 12:23:04 -07:00
Changpeng Fang
eda3161c35
AMDGPU: Implement tensor load and store instructions for gfx1250 (#146636)
Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
2025-07-03 13:49:34 -07:00
Fangrui Song
b59763a7db MCAsmBackend: Simplify shouldForceRelocation overrides 2025-07-02 23:30:36 -07:00
Fangrui Song
dd2891535d
MCAsmBackend: Merge addReloc into applyFixup (#146820)
Follow-up to #141333. Relocation generation called both addReloc and
applyFixup, with the default addReloc invoking shouldForceRelocation,
resulting in three virtual calls. This approach was also inflexible, as
targets needing additional data required extending
`shouldForceRelocation` (see #73721, resolved by #141311).

This change integrates relocation handling into applyFixup, eliminating
two virtual calls. The prior default addReloc is renamed to
maybeAddReloc. Targets overriding addReloc now call their customized
addReloc implementation.
2025-07-02 23:14:11 -07:00
Fangrui Song
56b2c7d988 MC: Rename initializeVariantKinds to initializeAtSpecifiers
We introduced VariantKinds after MCSymbolRefExpr::VariantKind and then
deprecated the VariantKind naming in favor of AtSpecifier (#133214).
Rename the function and type to use the recommended convention.
2025-06-26 22:46:20 -07:00
Nicolai Hähnle
d58b0f23d0
AMDGPU/MC: Try harder to evaluate absolute MC expressions (#145146)
This is a follow-up to commit 24c860547e8 ("AMDGPU/MC: Fix emitting
absolute expressions (#136789)").

In some downstream work, we end up with an MCTargetExpr that is a
maximum (AGVK_Max) in an instruction operand. getMachineOpValueCommon
recognizes the absolute nature of the expression and doesn't emit a
fixup. getLitEncoding needs to be aligned with this decision, else we
end up with a 0 immediate without a corresponding fixup.

Note that evaluateAsAbsolute checks for MCConstantExpr as a fast path,
so this accepts strictly more cases than before.

I've tried several ways to write a test for this without success. The
challenge is that there is no upstream way to generate this kind of
expression in an instruction operand natively, and trying to create one
via inline assembly fails because the assembly parser evaluates the
expression to a constant during parsing.
2025-06-26 19:22:44 -07:00
Diana Picus
a201f8872a
[AMDGPU] Replace dynamic VGPR feature with attribute (#133444)
Use a function attribute (amdgpu-dynamic-vgpr) instead of a subtarget
feature, as requested in #130030.
2025-06-24 11:09:36 +02:00
Stanislav Mekhanoshin
69974658f0
[AMDGPU] Initial support for gfx1250 target. (#144965)
This is just a stub for now.
2025-06-19 22:52:51 -07:00
Andrew Rogers
19658d1474
[llvm] annotate interfaces in llvm/Target for DLL export (#143615)
## Purpose

This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.

## Background

This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

A sub-set of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.

The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementation is required here
because they do not `#include "llvm/Support/TargetSelect.h"`, which
contains the declarations for this functions and was already updated
with `LLVM_ABI` in a previous patch. I considered patching these files
with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a bunch of preprocessor x-macro
stuff in it I was concerned it would unnecessarily impact compile times.

In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.

## Validation

Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-06-17 13:28:45 -07:00
Fangrui Song
95acd6199f AMDGPU: Replace deprecated MCExpr::print with MCAsmInfo::printExpr 2025-06-15 17:11:20 -07:00
Fangrui Song
28bda77843
Introduce MCAsmInfo::UsesSetToEquateSymbol and prefer = to .set
Introduce MCAsmInfo::UsesSetToEquateSymbol to control the preferred
syntax for symbol equating. We now favor the more readable and common
`symbol = expression` syntax over `.set`. This aligns with pre- https://reviews.llvm.org/D44256 behavior.

On Apple platforms, this resolves a clang -S vs -c behavior difference (resolves #104623).

For targets whose = support is unconfirmed, UsesSetToEquateSymbol is set to false.
This also minimizes test updates.

Pull Request: https://github.com/llvm/llvm-project/pull/142289
2025-06-11 22:19:31 -07:00
Fangrui Song
ce270b495d MCExpr: Move isSymbolUsedInExpression workaround to AMDGPU
This function was a workaround used to detect cyclic dependency
(properly resolved by 343428c666f9293ae260bbcf79130562b830b268).
We do not want backends to use it. However, #112251 exposed it to MCExpr
to be reused by AMDGPU. Keep the workaround within AMDGPU to prevent
other backends from accidentally relying on it.
2025-06-08 00:02:27 -07:00
Fangrui Song
97a32f2ad9 MC: Add MCSpecifierExpr to unify target MCExprs
Many targets define MCTargetExpr subclasses just to encode an expression
with a relocation specifier. Create a generic MCSpecifierExpr to be
inherited instead. Migrate M68k and SPARC as examples.
2025-06-07 11:33:40 -07:00
Fangrui Song
b3873e8aa4 MCSymbol: Remove the default argument of getVariableValue
It has been made ineffective by e015626f189dc76f8df9fdc25a47638c6a2f3feb.
This change migrates the users.
2025-05-27 20:34:18 -07:00
Fangrui Song
fe32806d67 ELFObjectWriter: Remove the MCContext argument from getRelocType
Additionally, swap MCFixup/MCValue order to match addReloc/recordRelocation.
2025-05-24 21:48:11 -07:00
Fangrui Song
068868d7ac ELFObjectWriter: Replace Ctx.reportError with reportError
Prepare for removing MCContext from getRelocType functions.
2025-05-24 21:11:17 -07:00
Kazu Hirata
1e8e662174
[AMDGPU] Remove unused includes (NFC) (#141376)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-24 14:48:46 -07:00
Fangrui Song
15c9f2781e MCAsmBackend: Remove the MCAssembler argument from shouldForceRelocation
It is only required by ARM, which can now use the member variable.
2025-05-23 23:21:30 -07:00
Fangrui Song
871b0a3221
MCAsmBackend: Simplify applyFixup (#141333)
Remove the MCSubtargetInfo argument from applyFixup, introduced in
https://reviews.llvm.org/D45962 , as it's only required by ARM. Instead,
add const MCFragment & so that ARMAsmBackend can retrieve
MCSubtargetInfo via a static member function.

Additionally, remove the MCAssembler argument, which is also only
required by ARM.

Additionally, make applyReloc non-const. Its arguments now fully cover
addReloc's functionality.
2025-05-23 23:09:56 -07:00