= CUDA Overview =

== Introduction ==

This page describes how we translate CUDA programs into CIVL-C code. It focuses on how the cuda-civl library is organized and how it is used in the final translation of a CUDA program. We assume basic knowledge of CUDA concepts such as streams, kernels, blocks, and threads.

== The CUDA Context ==

cuda-civl provides a structure called `$cuda_context_t`, which houses all information that pertains globally to a CUDA program. Accordingly, our translation creates exactly one instance of this structure, a global variable called `$cuda_current_context`. Currently, the only information that `$cuda_context_t` manages is the set of CUDA streams used by the program, including the null stream, which is present in every program.

{{{
typedef struct $cuda_context $cuda_context_t;

struct $cuda_context {
  $cuda_stream_node_t* headNode; /* head of the list of non-null streams */
  cudaStream_t nullStream;       /* the default (null) stream */
  int numStreams;                /* number of nodes in the list */
};
}}}

`$cuda_stream_node_t` is a structure that holds a `cudaStream_t` and a pointer to another `$cuda_stream_node_t`. In other words, it is a node of a linked list of `cudaStream_t` values. In general, we follow the pattern that types whose names end in `_node_t` are structures representing nodes of a linked list whose elements have some type `T`.

The streams in this list are the "non-default" or "non-null" CUDA streams, which are meant for asynchronous execution of kernels. The integer `numStreams` is the length of this list. `nullStream` holds the null stream, which is used by default when launching kernels and whose kernels are executed sequentially. Thus, the total number of streams at any point in the program is `$cuda_current_context.numStreams + 1`.

How these values are initialized, managed, and ultimately destroyed is discussed later in section '''ADD SECTION REF HERE'''.

== CUDA Streams ==
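
Before looking at streams individually, it may help to see the context's stream bookkeeping in miniature. The following is a minimal sketch in plain C, not the cuda-civl implementation: `stream_t`, `stream_node_t`, `context_t`, and every helper function are hypothetical stand-ins that only mirror the shape of `$cuda_context_t` and `$cuda_stream_node_t`. It illustrates the invariant that `numStreams` equals the length of the list of non-null streams, and that the total number of streams is `numStreams + 1`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for cudaStream_t (the real type comes from cuda-civl). */
typedef struct { int id; } stream_t;

/* Mirrors the shape of $cuda_stream_node_t: a stream plus a link to the next node. */
typedef struct stream_node {
    stream_t stream;
    struct stream_node *next;
} stream_node_t;

/* Mirrors the shape of $cuda_context_t. */
typedef struct {
    stream_node_t *headNode; /* list of non-null streams */
    stream_t nullStream;     /* always present */
    int numStreams;          /* length of the list above */
} context_t;

/* Hypothetical helper: prepend a non-null stream.
   Returns 0 on success, -1 if allocation fails. */
static int context_add_stream(context_t *ctx, stream_t s) {
    stream_node_t *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->stream = s;
    node->next = ctx->headNode;
    ctx->headNode = node;
    ctx->numStreams++;
    return 0;
}

/* Count the nodes by walking the list. */
static int context_list_length(const context_t *ctx) {
    int n = 0;
    for (const stream_node_t *p = ctx->headNode; p != NULL; p = p->next) n++;
    return n;
}

/* Total streams: the non-null streams plus the null stream. */
static int context_total_streams(const context_t *ctx) {
    return ctx->numStreams + 1;
}

/* Free every node and reset the bookkeeping. */
static void context_destroy(context_t *ctx) {
    while (ctx->headNode != NULL) {
        stream_node_t *next = ctx->headNode->next;
        free(ctx->headNode);
        ctx->headNode = next;
    }
    ctx->numStreams = 0;
}

/* Small demonstration: two non-null streams plus the null stream. */
int demo_total_streams(void) {
    context_t ctx = { NULL, { 0 }, 0 };
    stream_t a = { 1 }, b = { 2 };
    if (context_add_stream(&ctx, a) != 0) return -1;
    if (context_add_stream(&ctx, b) != 0) { context_destroy(&ctx); return -1; }
    assert(context_list_length(&ctx) == ctx.numStreams);
    int total = context_total_streams(&ctx);
    context_destroy(&ctx);
    return total;
}
```

Prepending is used here only because it is the simplest list insertion. Nothing in the description of the context says which end of the list cuda-civl inserts at.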