Previously any load (global, local or constant) feeding into a
global load or store would be counted as an indirect access. This
patch only counts global loads feeding into a global load or store.
The rationale is that the latency for global loads is generally
much larger than the other kinds.
As a side effect this makes it easier to write small kernels test
cases that are not counted as having indirect accesses, despite
the fact that arguments to the kernel are accessed with an SMEM
load.
Differential Revision: https://reviews.llvm.org/D122804
Unfortunately this just shows that the test case for D47740 never
really tested what it was supposed to test.
Differential Revision: https://reviews.llvm.org/D122664
A function with less memory instructions but wider access
is the same as a function with more but narrower accesses
in terms of memory boundness. In fact the pass would give
different answers before and after vectorization without
this change.
Differential Revision: https://reviews.llvm.org/D105651
Rational: if there is indirect access that is usually an issue
because load is not ready by the use. However, if use is inside a
loop and load is outside that is potentially an issue for a first
iteration only.
Differential Revision: https://reviews.llvm.org/D47740
llvm-svn: 334420
This is adoption of HSAIL perfhint pass. Two types of hints are produced:
1. Function is memory bound.
2. Kernel can use wave limiter.
Currently these hints are used in the scheduler. If a function is suspected
to be memory bound we allow occupancy to decrease to 4 waves in the course
of scheduling.
Differential Revision: https://reviews.llvm.org/D46992
llvm-svn: 333289