CUDA Overview
Introduction
This page describes how we translate CUDA programs into CIVL-C code. Primarily, we focus on how the cuda-civl library is organized and how it is used in our final translation of a CUDA program. We assume basic knowledge of CUDA concepts such as streams, kernels, blocks, and threads.
The CUDA Context
cuda-civl provides a structure called $cuda_context_t which is meant to house all CUDA information that pertains globally to a CUDA program. As such, our translation only creates one instance of this structure as a global variable simply called $cuda_current_context. Currently, the only information that $cuda_context_t manages is the set of CUDA streams being used in the program (including the null stream which is present in every program).
typedef struct $cuda_context $cuda_context_t;
struct $cuda_context {
$cuda_stream_node_t* headNode;
cudaStream_t nullStream;
int numStreams;
};
$cuda_stream_node_t is simply a structure that holds a cudaStream_t and a pointer to another $cuda_stream_node_t. In other words, it is a node of a linked list of cudaStream_t values. In general, we use the pattern in which types of the form <T>_node_t are structures representing nodes of a linked list whose elements have type T. The streams in this list are the "non-default" or "non-null" CUDA streams meant for asynchronous execution of kernels. The integer numStreams is the length of this list. nullStream holds the null stream, which is used by default when executing kernels and is executed sequentially. Because the null stream is stored separately and is not counted in numStreams, the total number of streams at any point in the program is
$cuda_current_context.numStreams + 1.
The way in which these values are initialized, managed, and ultimately destroyed is discussed in a later section.
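To make the layout of the context concrete, here is a minimal sketch in plain C rather than CIVL-C. It is not the cuda-civl implementation: the `$` prefixes are dropped, `stream_t` is a hypothetical stand-in for cudaStream_t, and the helpers `context_push_stream`, `context_list_length`, and `context_total_streams` are names invented for illustration. It only shows how the linked list, numStreams, and the separately stored null stream relate to one another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cudaStream_t; the real type is defined by cuda-civl. */
typedef struct stream { int id; } *stream_t;

/* Linked-list node holding one stream, following the <T>_node_t pattern. */
typedef struct stream_node {
  stream_t stream;
  struct stream_node *next;
} stream_node_t;

/* Plain-C mirror of the context structure (assumed field meanings as in the text). */
typedef struct context {
  stream_node_t *headNode; /* list of non-null streams */
  stream_t nullStream;     /* the default stream, kept outside the list */
  int numStreams;          /* length of the list starting at headNode */
} context_t;

/* Hypothetical helper: push a non-null stream onto the front of the list. */
void context_push_stream(context_t *ctx, stream_t s) {
  stream_node_t *node = malloc(sizeof *node);
  assert(node != NULL);
  node->stream = s;
  node->next = ctx->headNode;
  ctx->headNode = node;
  ctx->numStreams++;
}

/* Count the non-null streams by walking the list. */
int context_list_length(const context_t *ctx) {
  int n = 0;
  for (const stream_node_t *cur = ctx->headNode; cur != NULL; cur = cur->next)
    n++;
  return n;
}

/* Total streams: the listed non-null streams plus the null stream. */
int context_total_streams(const context_t *ctx) {
  return ctx->numStreams + 1;
}
```

In this sketch, a freshly created context with an empty list reports a total of one stream (the null stream alone), and each pushed stream raises both the list length and the total by one.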
