
This adds support for all of the surface read and write calls to Clang, extending the pattern already used for textures to surfaces.

I tested this by generating all the permutations of the calls and argument types with a Python script, compiling them with both Clang and nvcc, and comparing the generated PTX for equivalence. The outputs agree, apart from register allocation and a few places where Clang picks different memory write instructions.

An example kernel is:

```
__global__ void testKernel(cudaSurfaceObject_t surfObj, int x, float2* result)
{
    *result = surf1Dread<float2>(surfObj, x, cudaBoundaryModeZero);
}
```

---------

Signed-off-by: Austin Schuh <austin.linux@gmail.com>
3 lines
66 B
C
// required for __clang_cuda_runtime_wrapper.h tests

#pragma once