= CIVL Support For CUDA =

== Introduction ==

CIVL supports a small subset of CUDA features, including:

1. Defining CUDA kernels with the `__global__` specifier,
2. Enqueuing kernel calls into streams,
3. Allocating, deallocating, and copying to and from device memory.

CIVL automatically detects the use of these features and translates them into CIVL-C code, which is then analyzed as usual.

== Supported Features ==

* CUDA kernels with the `__global__` specifier
* The `dim3` struct type
* Use of the CUDA variables `threadIdx`, `blockIdx`, `gridDim`, and `blockDim`
* `__syncthreads`
* `__shared__`
* Enqueuing multiple kernel calls into streams
* `cudaMalloc`
* `cudaMemcpy`
* `cudaFree`
* `cudaDeviceSynchronize`
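
As an illustration, the following is a minimal sketch of a program that stays within the features listed above. The kernel name `reverse_tiles`, the helper `run_reverse_tiles`, and the size macros are hypothetical names chosen for this example; they are not part of CIVL or CUDA. The sketch uses ordinary CUDA syntax and launches the kernel on the default stream. Whether CIVL accepts this exact program unchanged has not been checked here.

```cuda
#include <assert.h>
#include <cuda_runtime.h>

#define N_BLOCKS 2
#define BLOCK_SIZE 4
#define N (N_BLOCKS * BLOCK_SIZE)

/* Hypothetical example kernel: each block reverses its own
 * BLOCK_SIZE-element tile, staging the data in shared memory. */
__global__ void reverse_tiles(const int *in, int *out) {
  __shared__ int tile[BLOCK_SIZE];
  int local = threadIdx.x;
  int global = blockIdx.x * blockDim.x + local;

  tile[local] = in[global];
  /* Every thread in the block must finish writing its element of
   * tile before any thread reads a different element. */
  __syncthreads();
  out[global] = tile[blockDim.x - 1 - local];
}

/* Hypothetical host helper: runs the kernel and returns the number
 * of output elements that differ from the expected value. */
int run_reverse_tiles(void) {
  int h_in[N], h_out[N];
  for (int i = 0; i < N; i++)
    h_in[i] = i;

  int *d_in, *d_out;
  cudaMalloc((void **)&d_in, N * sizeof(int));
  cudaMalloc((void **)&d_out, N * sizeof(int));
  cudaMemcpy(d_in, h_in, N * sizeof(int), cudaMemcpyHostToDevice);

  dim3 grid(N_BLOCKS, 1, 1);
  dim3 block(BLOCK_SIZE, 1, 1);
  reverse_tiles<<<grid, block>>>(d_in, d_out);
  cudaDeviceSynchronize();

  cudaMemcpy(h_out, d_out, N * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);

  int mismatches = 0;
  for (int i = 0; i < N; i++) {
    int b = i / BLOCK_SIZE; /* block that produced element i */
    int l = i % BLOCK_SIZE; /* thread index within that block */
    int expected = b * BLOCK_SIZE + (BLOCK_SIZE - 1 - l);
    if (h_out[i] != expected)
      mismatches++;
  }
  return mismatches;
}

int main(void) {
  /* The assertion states the property to be checked. */
  assert(run_reverse_tiles() == 0);
  return 0;
}
```

The grid and block sizes are deliberately tiny. The example also avoids `warpSize` and atomic functions, which are listed as missing below.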

== Major Limitations ==

=== Missing Features ===

* Use of the `warpSize` variable
* Atomic functions (e.g. `atomicAdd`)