| | 1 | This is a proposal to add some new constructs to CIVL-C. The goals are to make it easier to translate from source languages/APIs to CIVL-C and to enable optimizations in the model checker and possibly other static analysis tools. |
| | 2 | |
| | 3 | == New types == |
| | 4 | |
| | 5 | * `$range` : an ordered set of integers |
| | 6 | * `$domain(n)` : an ordered set of n-tuples of integers; an "n-dimensional" domain. n is at least 1. |
| | 7 | * `$domain`: union over all n>=1 of `$domain(n)`. For each n, `$domain(n)` is a subtype of `$domain`. it is exactly the same as the array types in CIVL-C. |
| | 8 | |
| | 9 | == New functions dealing with the types above == |
| | 10 | |
| | 11 | {{{ |
| | 12 | // range consisting of lo, lo+step, lo+2*step, ... |
| | 13 | // the sequence stops just before the first number |
| | 14 | // greater than hi. |
| | 15 | $range $range_regular(int lo, int hi, int step); |
| | 16 | |
| | 17 | // domain which is the Cartesian products of the given ranges. |
| | 18 | // this is a var-args function; the number of arguments determines |
| | 19 | // the dimension of the domain returned. The order is dictionary |
| | 20 | // order. |
| | 21 | $domain $domain_rectangular($range r1, ...); |
| | 22 | |
| | 23 | // returns the dimension of the domain |
| | 24 | int $domain_get_dim($domain d); |
| | 25 | }}} |
| | 26 | |
| | 27 | |
| | 28 | -- New statements == |
| | 29 | |
| | 30 | {{{ |
| | 31 | $for (i1, i2, ... : dom) S |
| | 32 | }}} |
| | 33 | |
| | 34 | `i1`, `i2`, ... are variables of integer type. These are called the |
| | 35 | "index variables". `dom` is an expression of type `$domain`. |
| | 36 | The number of index variables must equal the dimension of |
| | 37 | `dom`. |
| | 38 | |
| | 39 | `S` is a statement. |
| | 40 | |
| | 41 | Semantics: loop over all elements of the domain |
| | 42 | and execute S for each such element. |
| | 43 | |
| | 44 | In the form above, the index variables must have been previously |
| | 45 | declared; alternatively, they can be declared inside the statement: |
| | 46 | {{{ |
| | 47 | $for (int i1, i2, ... : dom) S |
| | 48 | }}} |
| | 49 | |
| | 50 | Next, a parallel version of above: |
| | 51 | |
| | 52 | {{{ |
| | 53 | $parfor (i1, i2, ... : dom) S |
| | 54 | }}} |
| | 55 | Similar to `$for`, but instead of executing sequentially, |
| | 56 | a process (thread) is instantiated for |
| | 57 | each element of the domain and they all execute concurrently. |
| | 58 | There is an implicit barrier at the end of S and all the processes |
| | 59 | are waited on. |
| | 60 | |
| | 61 | == Example 1: OpenMP parallel construct == |
| | 62 | |
| | 63 | {{{ |
| | 64 | #pragma omp parallel { |
| | 65 | ... |
| | 66 | } |
| | 67 | }}} |
| | 68 | |
| | 69 | => |
| | 70 | |
| | 71 | {{{ |
| | 72 | int nthreads = 1+$choose_int(THREADS_MAX); |
| | 73 | $range thread_range = $range_regular(0, nthreads-1); |
| | 74 | $domain(1) thread_dom = $domain_rectangular(thread_range); |
| | 75 | $parfor (int i : thread_dom) { |
| | 76 | ... |
| | 77 | } |
| | 78 | }}} |
| | 79 | |
| | 80 | == Example 2: CUDA 3d grid == |
| | 81 | |
| | 82 | {{{ |
| | 83 | __kernel__ grid() { |
| | 84 | ... |
| | 85 | } |
| | 86 | |
| | 87 | grid<<<nx,ny,nz>>>(); |
| | 88 | }}} |
| | 89 | |
| | 90 | => |
| | 91 | |
| | 92 | {{{ |
| | 93 | grid(int idx, int idy, int idz) { |
| | 94 | ... |
| | 95 | } |
| | 96 | |
| | 97 | $range grid_range_x = $range_regular(0, nx-1); |
| | 98 | $range grid_range_y = $range_regular(0, ny-1); |
| | 99 | $range grid_range_z = $range_regular(0, nz-1); |
| | 100 | $domain(3) block_dom = $domain_rectangular(grid_range_x, |
| | 101 | grid_range_y, grid_range_z); |
| | 102 | $parfor (int idx, idy, idz : block_dom) grid(idx, idy, idz); |
| | 103 | }}} |
| | 104 | |
| | 105 | == Optimization == |
| | 106 | |
| | 107 | The purpose of the `parfor` construct is to enable possible optimization |
| | 108 | in CIVL. The model checker, for example, when control reaches the |
| | 109 | `parlor` loop, may try to determine that any two iterations are |
| | 110 | non-interfering, i.e., any atomic statement from one iteration will |
| | 111 | commute with any atomic statement from a different iteration. |
| | 112 | In this case, it will just execute the loop sequentially without |
| | 113 | creating any new threads. |