5 Commits

Author SHA1 Message Date
Sergei Barannikov
03847f19f2
[SelectionDAG] Add empty implementation of SelectionDAGInfo to some targets (#119968)
#119969 adds a couple of new methods to this class, which will need to
be overridden by these targets.

Part of #119709.

Pull Request: https://github.com/llvm/llvm-project/pull/119968
2024-12-16 15:13:46 +03:00
Jay Foad
6f956e3117
[AMDGPU] Rename LocalMemorySize features to AddressableLocalMemorySize (#110242)
Change the names of the TableGen features to match the names used by
AMDGPUSubtarget. "Addressable" refers to the amount that can be accessed
by a single workgroup. Add some explanatory comments. NFC.
2024-09-30 10:29:31 +01:00
Nicolai Hähnle
10cef708a7 AMDGPU: Clean up LDS-related occupancy calculations
Occupancy is expressed as waves per SIMD. This means that we need to
take into account the number of SIMDs per "CU" or, to be more precise,
the number of SIMDs over which a workgroup may be distributed.

getOccupancyWithLocalMemSize was wrong because it didn't take SIMDs
into account at all.

At the same time, we need to take into account that WGP mode offers
access to a larger total amount of LDS, since this can affect how
non-power-of-two LDS allocations are rounded. To make this work
consistently, we distinguish between (available) local memory size and
addressable local memory size (which is always limited by 64kB on
gfx10+, even with WGP mode).

This change results in a massive amount of test churn. A lot of it is
caused by the fact that the default work group size is 1024, which means
that (due to rounding effects) the default occupancy on older hardware
is 8 instead of 10, which affects scheduling via register pressure
estimates. I've adjusted most tests by just running the UTC tools, but
in some cases I manually changed the work group size to 32 or 64 to make
sure that work group size chunkiness has no effect.

Differential Revision: https://reviews.llvm.org/D139468
2023-01-23 21:43:06 +01:00
Jay Foad
8a53b25ed5 [AMDGPU] Use default member initializers in Subtarget classes
Use default member initializers in AMDGPUSubtarget and subclasses. This
is to guard against adding a new feature boolean in AMDGPUSubtarget.h
but forgetting to initialize it to false in AMDGPUSubtarget.cpp.

This was mostly autogenerated by:
clang-tidy -checks=-*,cppcoreguidelines-prefer-member-initializer,modernize-use-default-member-init -header-filter=Subtarget -fix lib/Target/AMDGPU/*Subtarget.cpp

Differential Revision: https://reviews.llvm.org/D123613
2022-04-12 16:42:30 +01:00
Daniil Fukalov
48958d02d2 [NFC][AMDGPU] Reduce includes dependencies.
1. Splitted out some parts of R600 target to separate modules/headers.
2. Reduced some include lists in headers.
3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()`
   and `R600TargetMachine::getSubtargetImpl()` had different return value type
   than base class.
4. Minor forward declarations cleanup.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D108596
2021-08-25 12:01:55 +03:00