Renaud Kauffmann 27e458c8cb
[flang][cuda] Distinguish constant fir.global from globals with a #cuf.cuda<constant> attribute (#118912)
1. In `CufOpConversion`, `isDeviceGlobal` was renamed
`isRegisteredDeviceGlobal` and moved to the common file.
`isRegisteredDeviceGlobal` excludes constant `fir.global` operations from
registration. This avoids calls to `_FortranACUFGetDeviceAddress` on
globals that have no symbols in the runtime. This was done for
`_FortranACUFRegisterVariable` in #118582, but also needs to be done
here after #118591.
2. `CufDeviceGlobal` no longer adds the `#cuf.cuda<constant>` attribute
to the constant global. As discussed in #118582, a module variable with
the `#cuf.cuda<constant>` attribute is not a compile-time constant. Yet
the compile-time constant also needs to be copied into the GPU module.
The candidates for copy to the GPU module are:
   - the globals needing registration, regardless of their use in device
     code (they can be referred to in host code as well)
   - the compile-time constants when used in device code

3. The registration of "constant" module device variables
(`#cuf.cuda<constant>`) can be restored in `CufAddConstructor`.
2024-12-05 18:36:48 -08:00

57 lines
2.1 KiB
C++

//===-- CUFCommon.cpp - Shared functions between passes ---------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#include "flang/Optimizer/Transforms/CUFCommon.h"
#include "flang/Optimizer/Dialect/CUF/CUFOps.h"
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Dialect/LLVMIR/NVVMDialect.h"

/// Retrieve or create the CUDA Fortran GPU module in the given \p mod.
mlir::gpu::GPUModuleOp cuf::getOrCreateGPUModule(mlir::ModuleOp mod,
                                                 mlir::SymbolTable &symTab) {
  if (auto gpuMod = symTab.lookup<mlir::gpu::GPUModuleOp>(cudaDeviceModuleName))
    return gpuMod;

  auto *ctx = mod.getContext();
  mod->setAttr(mlir::gpu::GPUDialect::getContainerModuleAttrName(),
               mlir::UnitAttr::get(ctx));

  mlir::OpBuilder builder(ctx);
  auto gpuMod = builder.create<mlir::gpu::GPUModuleOp>(mod.getLoc(),
                                                       cudaDeviceModuleName);
  mlir::Block::iterator insertPt(mod.getBodyRegion().front().end());
  symTab.insert(gpuMod, insertPt);
  return gpuMod;
}

/// Return true if \p op is nested in a cuf.kernel or gpu.func operation, or
/// in a func.func whose CUDA procedure attribute is anything other than host.
bool cuf::isInCUDADeviceContext(mlir::Operation *op) {
  if (!op)
    return false;
  if (op->getParentOfType<cuf::KernelOp>() ||
      op->getParentOfType<mlir::gpu::GPUFuncOp>())
    return true;
  if (auto funcOp = op->getParentOfType<mlir::func::FuncOp>()) {
    if (auto cudaProcAttr = funcOp->getAttrOfType<cuf::ProcAttributeAttr>(
            cuf::getProcAttrName())) {
      return cudaProcAttr.getValue() != cuf::ProcAttribute::Host;
    }
  }
  return false;
}
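
/// Return true if \p op must be registered with the CUDA Fortran runtime:
/// it is not a constant fir.global and it carries a device, managed, or
/// constant CUDA data attribute.
bool cuf::isRegisteredDeviceGlobal(fir::GlobalOp op) {
  if (op.getConstant())
    return false;
  auto attr = op.getDataAttr();
  if (attr && (*attr == cuf::DataAttribute::Device ||
               *attr == cuf::DataAttribute::Managed ||
               *attr == cuf::DataAttribute::Constant))
    return true;
  return false;
}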
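
The registration rule in `isRegisteredDeviceGlobal` and the copy candidates listed in the commit message can be modeled without MLIR. The following is a minimal sketch, not the pass implementation: `GlobalModel`, the local `DataAttribute` enum, and `isCopyCandidate` are hypothetical stand-ins introduced for illustration, and the `usedInDeviceCode` flag is assumed to come from some use analysis that is not shown here.

```cpp
#include <optional>

// Hypothetical stand-in for cuf::DataAttribute (only a few values modeled).
enum class DataAttribute { Device, Managed, Constant, Pinned };

// Hypothetical stand-in for the two fir::GlobalOp properties used above.
struct GlobalModel {
  bool isConstant;                       // models op.getConstant()
  std::optional<DataAttribute> dataAttr; // models op.getDataAttr()
};

// Mirrors the rule in cuf::isRegisteredDeviceGlobal: compile-time constants
// are never registered; otherwise the data attribute decides.
bool isRegisteredDeviceGlobal(const GlobalModel &g) {
  if (g.isConstant)
    return false;
  return g.dataAttr.has_value() && (*g.dataAttr == DataAttribute::Device ||
                                    *g.dataAttr == DataAttribute::Managed ||
                                    *g.dataAttr == DataAttribute::Constant);
}

// Assumed reading of the commit message: a global is copied into the GPU
// module if it needs registration (regardless of use), or if it is a
// compile-time constant that is used in device code.
bool isCopyCandidate(const GlobalModel &g, bool usedInDeviceCode) {
  return isRegisteredDeviceGlobal(g) || (g.isConstant && usedInDeviceCode);
}
```

Under this model, a module variable carrying the constant data attribute is registered, while a compile-time constant global is never registered and becomes a copy candidate only when device code uses it.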