| 65 | | |
| 66 | | |
| 67 | | |
| 68 | | |
| 69 | | == Modeling shared variables == |
| | 67 | * `void $omp_read(void *result, $omp_ref ref)` |
| | 68 | ** called by a thread to read a shared object pointed to by `ref`. The result of the read is stored in the memory unit pointed to by `result`. |
| | 69 | * `void $omp_write($shared_ref ref, void *value)` |
| | 70 | ** called by a thread to write to the shared object pointed to by `ref`. The value to be written is taken from the memory unit pointed to by `value`. |
| | 71 | * `void $omp_flush($omp_shared shared)` |
| | 72 | ** performs an OpenMP flush operation on the shared object |
| | 73 | * `void $omp_flush_all($omp_team)` |
| | 74 | ** performs an OpenMP flush operation on all shared objects. This is the default in OpenMP if no argument is specified for a flush construct. |
| | 75 | |
| | 76 | === Worksharing and barriers === |
| | 77 | |
| | 78 | * `void $omp_barrier($omp_team team)` |
| | 79 | ** performs a barrier only. Note however that usually (always?) a barrier is accompanied by a flush-all, so `$omp_barrier_and_flush` should be used instead. |
| | 80 | * `void $omp_barrier_and_flush($omp_team team)` |
| | 81 | ** combines a barrier and a flush on all shared objects owned by the team. Implicit in many OpenMP worksharing constructs. |
| | 82 | * `$domain $omp_arrive_loop($omp_team, $domain loop_dom)` |
| | 83 | ** called by a thread when it reaches an omp for loop; returns the subset of the loop domain specifying the iterations that this thread will execute. |
| | 84 | * `$domain $omp_arrive_sections($omp_team, int numSections)` |
| | 85 | ** called by a thread when it reaches an omp sections construct; returns the subset of the integers 0..numSections-1 specifying the indexes of the sections that this thread will execute. The sections are numbered from 0 in increasing order. |
| | 86 | * `int $omp_arrive_single($omp_team team)` |
| | 87 | ** called by a thread when it reaches an omp single construct; returns the thread ID of the thread that will execute the single construct. |
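The section numbering used by `$omp_arrive_sections` can be illustrated with a small sketch in plain C. The round-robin policy below is only an assumption made for illustration; the text above does not say which distribution is chosen. The function name `assign_sections` and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: fills sections[] with the indexes of the
 * sections that thread tid would execute under an ASSUMED round-robin
 * policy (section s goes to thread s % nthreads). Sections are numbered
 * from 0 in increasing order. Returns how many sections were assigned. */
int assign_sections(int tid, int nthreads, int totalSections,
                    int sections[], int capacity) {
  assert(nthreads > 0 && 0 <= tid && tid < nthreads);
  int n = 0;
  for (int s = tid; s < totalSections; s += nthreads) {
    assert(n < capacity); /* caller must supply a large enough buffer */
    sections[n++] = s;
  }
  return n;
}
```

With 3 threads and 7 sections, thread 1 would receive sections 1 and 4 under this assumed policy. Any other policy that partitions 0..numSections-1 among the threads would fit the description equally well.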
| | 88 | |
| | 89 | |
| | 90 | |
| | 91 | |
| | 92 | |
| | 93 | |
| | 94 | == Memory model == |
| | 95 | |
| | 96 | This section describes how the memory model is modeled. These protocols are used in the implementations |
| | 97 | of the system functions dealing with shared objects. |
| 92 | | We will implement the following function, which is implicit in many of the OpenMP constructs: |
| 93 | | {{{ |
| 94 | | barrier_and_flush(); |
| 95 | | }}} |
| 96 | | It does a barrier on `_barrier` and a flush on all shared variables. After this completes, all local copies will agree with each other and with the shared copy of the variable, and all state variables will be -1. |
| 97 | | |
| 98 | | == Modeling worksharing state == |
| 99 | | |
| 100 | | The worksharing state will be stored in another handle-type object. The situation here is analogous to the use of `$gcomm` and `$comm` for MPI: those objects store the shared state for message-passing. We need similar objects for the shared state that coordinates work-sharing and barrier constructs: |
| 101 | | * `$omp_gws`: global work-sharing state |
| 102 | | * `$omp_ws`: local state. A reference to a global object and a thread ID. |
| 103 | | |
| 104 | | The following object is used to specify the sequence of iterations to be assigned to one thread executing an omp for loop: |
| 105 | | {{{ |
| 106 | | typedef struct { |
| 107 | | int numIters; |
| 108 | | int collapse; |
| 109 | | int iters[][]; |
| 110 | | } CIVL_omp_loop_info; |
| 111 | | }}} |
| 112 | | |
| 113 | | The dimensions are `iters[numIters][collapse]`. The integer `iters[i][j]` is the value of the j-th loop variable in the i-th iteration performed by this thread. |
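The `iters[numIters][collapse]` layout can be sketched in plain C. Standard C does not allow a struct member declared as `int iters[][]`, so this sketch assumes a row-major flat buffer instead. The names `loop_info`, `loop_info_init`, `loop_info_at`, and `loop_info_free` are hypothetical and are not part of the API described here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical C rendering of CIVL_omp_loop_info: the logical 2-D array
 * iters[numIters][collapse] is stored row-major in one flat buffer. */
typedef struct {
  int numIters; /* number of iterations assigned to this thread */
  int collapse; /* number of collapsed loop variables */
  int *iters;   /* numIters * collapse entries, row-major */
} loop_info;

/* Allocates a zero-initialized table; returns 0 on success, -1 on failure. */
int loop_info_init(loop_info *info, int numIters, int collapse) {
  assert(numIters >= 0 && collapse >= 1);
  info->numIters = numIters;
  info->collapse = collapse;
  info->iters = calloc((size_t)numIters * (size_t)collapse, sizeof(int));
  return (info->iters != NULL || numIters == 0) ? 0 : -1;
}

/* Returns a pointer to the logical element iters[i][j]: the value of
 * the j-th loop variable in the i-th iteration of this thread. */
int *loop_info_at(loop_info *info, int i, int j) {
  assert(0 <= i && i < info->numIters);
  assert(0 <= j && j < info->collapse);
  return &info->iters[i * info->collapse + j];
}

/* Releases the buffer owned by the table. */
void loop_info_free(loop_info *info) {
  free(info->iters);
  info->iters = NULL;
}
```

For a loop nest with `collapse` equal to 2, `*loop_info_at(&info, i, 0)` and `*loop_info_at(&info, i, 1)` would hold the outer and inner loop variable values for the thread's i-th iteration.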
| 114 | | |
| 115 | | The following object is used to specify the subset of sections assigned to one thread executing an omp sections construct: |
| 116 | | {{{ |
| 117 | | typedef struct { |
| 118 | | int numSections; |
| 119 | | int sections[]; |
| 120 | | } CIVL_omp_sections_info; |
| 121 | | }}} |
| 122 | | The length of the array `sections` is `numSections`. The integer `sections[i]` is the index of the i-th section that this thread will execute. |
| 123 | | |
| 124 | | API: |
| 125 | | {{{ |
| 126 | | /* Creates new global work-sharing state object, returning |
| 127 | | * handle to it. nthreads is the number of threads in |
| 128 | | * the parallel region. There is one of these per parallel region, |
| 129 | | * created upon entering the region */ |
| 130 | | $omp_gws $omp_gws_create($scope scope, int nthreads); |
| 131 | | |
| 132 | | void $omp_gws_destroy($omp_gws gws); |
| 133 | | |
| 134 | | /* Creates a local work-sharing object, which is basically |
| 135 | | * a pair consisting of a global work-sharing handle and |
| 136 | | * a thread id. */ |
| 137 | | $omp_ws $omp_ws_create($scope scope, $omp_gws, int tid); |
| 138 | | |
| 139 | | void $omp_ws_destroy($omp_ws ws); |
| 140 | | |
| 141 | | /* for "for" loops only: called when a thread arrives, it |
| 142 | | * returns the sequence of loop iterations to be performed by |
| 143 | | * the thread. Parameter location is the ID of the model location |
| 144 | | * of the top of the loop. It is needed to check that all threads |
| 145 | | * encounter the same worksharing statements in the same order. |
| 146 | | * The implementation will need the value start, the initial value of the loop variable; |
| 147 | | * end is its final value; and inc, the increment (which can be |
| 148 | | * positive or negative). These values can all be obtained by getting |
| 149 | | * the loop statement from the location and evaluating the expressions |
| 150 | | * occurring there.*/ |
| 151 | | CIVL_omp_loop_info $omp_ws_arrive_loop($omp_ws ws, int location); |
| 152 | | |
| 153 | | /* for sections: called at arrival, returns the sequence of sections to |
| 154 | | * be executed by calling thread. The sections are numbered in order, |
| 155 | | * starting from 0. */ |
| 156 | | CIVL_omp_sections_info $omp_ws_arrive_sections($omp_ws ws, int location); |
| 157 | | |
| 158 | | /* for single: called on arrival, returns whether or not to execute |
| 159 | | * the single code */ |
| 160 | | _Bool $omp_ws_arrive_single($omp_ws ws, int location); |
| 161 | | |
| 162 | | /* called when arriving at a barrier. This does not |
| 163 | | * impose the barrier, you still need to call system function |
| 164 | | * $barrier... for that. This is needed to ensure all threads |
| 165 | | * in the team call the same sequence of worksharing and barrier |
| 166 | | * constructs. */ |
| 167 | | void $omp_ws_arrive_barrier($omp_ws ws, int location); |
| 168 | | }}} |
| 169 | | |
| 170 | | What these functions do: basically the global data structure comprises a FIFO queue for each thread. The queue contains work-sharing records, one record for each work-sharing or barrier construct encountered. The record contains the basic information about the construct as provided by the arguments to the arrival function, as well as the distribution chosen for that thread. |
| | 120 | The function `$omp_barrier_and_flush` performs a barrier on the team and a flush on all shared variables. After this completes, all local copies will agree with each other and with the shared copy of the variable, and all state variables will be -1. |
| | 121 | |
| | 122 | == Worksharing model == |
| | 123 | |
| | 124 | This section describes how the system functions dealing with worksharing are implemented. |
| | 125 | |
| | 126 | The global data structure `$omp_gteam` contains a FIFO queue for each thread. The queue contains work-sharing records, one record for each work-sharing or barrier construct encountered. The record contains the basic information about the construct as provided by the arguments to the arrival function, as well as the distribution chosen for that thread. |
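The per-thread record queues can be sketched in plain C to show how they let the model check that all threads encounter the same worksharing and barrier constructs in the same order. This is a simplified illustration under stated assumptions, not the CIVL implementation: records are never dequeued, the chosen distribution is omitted from the record, and every name (`gteam`, `ws_record`, `gteam_arrive`, the fixed capacities) is hypothetical.

```c
#include <assert.h>

#define MAX_THREADS 8  /* assumed capacity, for the sketch only */
#define MAX_RECORDS 64 /* assumed capacity, for the sketch only */

typedef enum { WS_LOOP, WS_SECTIONS, WS_SINGLE, WS_BARRIER } ws_kind;

/* One record per worksharing or barrier construct encountered. */
typedef struct {
  ws_kind kind;
  int location; /* ID of the model location of the construct */
} ws_record;

/* Hypothetical stand-in for the global team state: one queue per thread. */
typedef struct {
  int nthreads;
  int count[MAX_THREADS]; /* number of records enqueued by each thread */
  ws_record queue[MAX_THREADS][MAX_RECORDS];
} gteam;

void gteam_init(gteam *g, int nthreads) {
  assert(0 < nthreads && nthreads <= MAX_THREADS);
  g->nthreads = nthreads;
  for (int t = 0; t < nthreads; t++) g->count[t] = 0;
}

/* Called when thread tid arrives at a construct. Returns 1 and enqueues
 * a record if the arrival agrees with what every thread that is already
 * further along recorded at the same position; returns 0 otherwise. */
int gteam_arrive(gteam *g, int tid, ws_kind kind, int location) {
  assert(0 <= tid && tid < g->nthreads);
  int n = g->count[tid];
  assert(n < MAX_RECORDS);
  for (int t = 0; t < g->nthreads; t++) {
    if (t != tid && g->count[t] > n) {
      const ws_record *r = &g->queue[t][n];
      if (r->kind != kind || r->location != location) return 0;
    }
  }
  g->queue[tid][n].kind = kind;
  g->queue[tid][n].location = location;
  g->count[tid] = n + 1;
  return 1;
}
```

In this sketch, a thread that reaches a single construct where another thread has already recorded a barrier at the same position is reported as a mismatch, which is the kind of error the arrival functions are meant to detect.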
| 197 | | $proc _threads[_nthreads]; |
| 198 | | $omp_gws _gws = $omp_gws_create($here, _nthreads); |
| 199 | | $gbarrier _gbarrier = $gbarrier_create($here, _nthreads); |
| 200 | | // declare shared variables and corresponding state variables |
| 201 | | // initialize all state components to -1 |
| 202 | | void _thread(int _tid) { |
| 203 | | $omp_ws _ws = $omp_ws_create($here, _gws, _tid); |
| 204 | | $barrier _barrier = $barrier_create($here, _gbarrier, _tid); |
| | 153 | $omp_gteam gteam = $omp_gteam_create($here, nthreads); |
| | 154 | // declare shared variables and create shared objects |
| | 155 | |
| | 156 | $parfor (int _tid : 0..nthreads-1) { |
| | 157 | $omp_team team = $omp_team_create($here, gteam, _tid); |