Changes between Version 4 and Version 5 of Implementation_of_CUDA_in_CIVL


Timestamp:
03/17/22 13:56:49 (4 years ago)
Author:
Alex Wilton
Comment:

--

  • Implementation_of_CUDA_in_CIVL

    v4 v5  
    6262
    6363== CUDA Kernels ==
     64=== The Signature of a Kernel ===
     65A typical CUDA kernel call has the form:
     66{{{
     67K<<<gridDim, blockDim, memSize, cudaStream>>>(args);
     68}}}
     69The four parameters inside the angle brackets are called the ''execution configuration parameters'' or simply the ''configuration parameters''. The parameters `gridDim` and `blockDim` are then implicitly available from within the kernel under those names. To support these features, we simply transform the definition of the kernel into a regular function with the signature
     70{{{
     71void _cuda_K(dim3 gridDim, dim3 blockDim, size_t _cuda_mem_size, cudaStream_t _cuda_stream, args);
     72}}}
     73The function call is then transformed to match this new signature.
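
As a rough illustration of this rewrite, the sketch below shows a transformed kernel and call site in plain C. The `dim3` and `cudaStream_t` definitions are simplified stand-ins, and the kernel body and `launch_example` are hypothetical; none of this is taken from the CIVL sources.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the CUDA types; illustrative only, not CIVL's definitions. */
typedef struct { unsigned x, y, z; } dim3;
typedef int cudaStream_t;

/* Original CUDA source, shown as a comment because it is not plain C:
 *   __global__ void K(int *out) { out[0] = gridDim.x * blockDim.x; }
 *   K<<<grid, block, 0, stream>>>(out);
 */

/* Transformed kernel: the four configuration parameters become ordinary
 * leading arguments, so gridDim and blockDim are in scope by name. */
void _cuda_K(dim3 gridDim, dim3 blockDim, size_t _cuda_mem_size,
             cudaStream_t _cuda_stream, int *out) {
    (void)_cuda_mem_size;   /* unused in this sketch */
    (void)_cuda_stream;     /* unused in this sketch */
    assert(out != NULL);
    out[0] = (int)(gridDim.x * blockDim.x);
}

/* Transformed call site: the <<<...>>> arguments are passed first,
 * followed by the original kernel arguments. */
int launch_example(void) {
    dim3 grid = {2, 1, 1};
    dim3 block = {4, 1, 1};
    int out[1] = {0};
    _cuda_K(grid, block, 0, 0, out);
    return out[0];
}
```

Calling `launch_example()` runs the transformed kernel once with a 2-block grid of 4 threads each; the body only reads the configuration values and does not yet emulate any threads.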
     74
     75=== The Layers of a CUDA Kernel in CIVL-C ===
     76The transformation described above allows us to support the use of the configuration parameters. However, we still have to add code that emulates the true execution of a CUDA kernel with the four configuration parameters given. That means we must spawn the appropriate number of threads, each with the appropriate local CUDA variables `blockIdx` and `threadIdx` in scope and initialized, and then enqueue the kernel into the given stream, waiting as necessary on other CUDA kernels in the stream.
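
To make the thread-spawning requirement concrete, here is a minimal sketch in plain C. It is an assumption-laden stand-in: sequential loops replace real thread spawning, stream ordering is not modeled at all, and the names `_cuda_K_thread` and the kernel body are hypothetical rather than taken from CIVL.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the CUDA types; illustrative only. */
typedef struct { unsigned x, y, z; } dim3;
typedef int cudaStream_t;

/* Hypothetical per-thread body: each emulated thread sees its own
 * blockIdx and threadIdx and writes its global x index. */
static void _cuda_K_thread(dim3 gridDim, dim3 blockDim,
                           dim3 blockIdx, dim3 threadIdx, int *out) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    (void)gridDim;  /* available to the body, unused here */
    out[i] = (int)i;
}

/* Outer layer: one iteration per (block, thread) pair. A real emulation
 * would spawn concurrent processes and wait on the stream instead. */
void _cuda_K(dim3 gridDim, dim3 blockDim, size_t _cuda_mem_size,
             cudaStream_t _cuda_stream, int *out) {
    dim3 b, t;
    (void)_cuda_mem_size;   /* shared memory is not modeled */
    (void)_cuda_stream;     /* stream ordering is not modeled */
    assert(out != NULL);
    for (b.z = 0; b.z < gridDim.z; b.z++)
      for (b.y = 0; b.y < gridDim.y; b.y++)
        for (b.x = 0; b.x < gridDim.x; b.x++)
          for (t.z = 0; t.z < blockDim.z; t.z++)
            for (t.y = 0; t.y < blockDim.y; t.y++)
              for (t.x = 0; t.x < blockDim.x; t.x++)
                _cuda_K_thread(gridDim, blockDim, b, t, out);
}
```

The sequential loops only show how each emulated thread receives its `blockIdx` and `threadIdx`; they say nothing about interleavings or synchronization, which is exactly what a concurrent emulation has to add.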