Version 18 (modified 12 years ago)
OpenMP Primitives
Constructs
- parallel
- worksharing: for, sections (and section), single
- synchronization: barrier, critical, atomic, ordered, master
- threadprivate
- flush
Clauses
- private(list)
- firstprivate(list)
- lastprivate(list)
- copyin(list)
- shared(list)
- default(none|shared)
- num_threads(n)
- schedule(static, n)
- schedule(dynamic, n)
- ...
- ordered
- nowait
Functions
- omp_get_num_threads()
- omp_get_thread_num()
Helper primitives
$int_iter is a handle type for an iterator of integers.
/* Tells whether the integer iterator has any more elements */
_Bool $int_iter_hasNext($int_iter iter);

/* Returns the next element in the iterator (and updates the iterator) */
int $int_iter_next($int_iter iter);
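As a concrete picture of the intended semantics, here is a plain-C sketch of such an integer iterator. In CIVL-C the handle type is opaque, so the struct layout and lowercase names below are assumptions for illustration only:

```c
#include <stdbool.h>

/* Hypothetical concrete representation of an integer iterator:
 * walks from its current value to end (inclusive) in steps of inc. */
typedef struct int_iter {
    int next;  /* next value to yield */
    int end;   /* final value, inclusive */
    int inc;   /* step, positive or negative */
} int_iter;

/* Tells whether the iterator has any more elements. */
bool int_iter_hasNext(int_iter *it) {
    return it->inc > 0 ? it->next <= it->end : it->next >= it->end;
}

/* Returns the next element and advances the iterator. */
int int_iter_next(int_iter *it) {
    int v = it->next;
    it->next += it->inc;
    return v;
}
```

For example, an iterator initialized as `{0, 4, 2}` yields 0, 2, 4; a negative increment walks downward.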
Worksharing state
The worksharing state will be stored in another handle type object. The situation here is analogous to the $gcomm and $comm objects used for MPI: those objects store the shared state for message passing. We need a similar object for the shared state that coordinates the worksharing and barrier constructs:
- $omp_gws: global work-sharing state
- $omp_ws: local state; a reference to a global object and a thread ID
API:
/* Creates a new global work-sharing state object, returning a
 * handle to it.  nthreads is the number of threads in the parallel
 * region.  There is one of these per parallel region, created upon
 * entering the region. */
$omp_gws $omp_gws_create($scope scope, int nthreads);

void $omp_gws_destroy($omp_gws gws);

/* Creates a local work-sharing object, which is basically a pair
 * consisting of a global work-sharing handle and a thread ID. */
$omp_ws $omp_ws_create($scope scope, $omp_gws gws, int tid);

void $omp_ws_destroy($omp_ws ws);

/* For "for" loops only: called when a thread arrives; returns the
 * sequence of loop iterations to be performed by the thread.
 * Parameter location is the ID of the model location of the top of
 * the loop.  It is needed to check that all threads encounter the
 * same worksharing statements in the same order.  Parameter start is
 * the initial value of the loop variable; end is its final value;
 * and inc is the increment (which can be positive or negative). */
$int_iter $omp_ws_arrive_loop($omp_ws ws, int location, int start, int end, int inc);

/* For sections: called at arrival; returns the sequence of sections
 * to be executed by the calling thread.  The sections are numbered
 * in order, starting from 0. */
$int_iter $omp_ws_arrive_sections($omp_ws ws, int location);

/* For single: called on arrival; returns whether or not to execute
 * the single code. */
_Bool $omp_ws_arrive_single($omp_ws ws, int location);

/* Called when arriving at a barrier.  This does not impose the
 * barrier -- you still need to call the system function $barrier...
 * for that.  It is needed to ensure all threads in the team call the
 * same sequence of worksharing and barrier constructs. */
void $omp_ws_arrive_barrier($omp_ws ws, int location);
Translation Strategies
Translating shared variables
For each shared variable v introduce a second variable v_state. The type of v_state is obtained from the type of v by replacing all primitive types (leaf nodes in the type tree) by int. Initially all these ints are -1.
A write to (some part of) the shared variable by thread tid:
- if the state value is -1, set it to tid, then do the write
- if the state value is tid, do the write
- else report a data race.
A read from (some part of) the shared variable by thread tid:
- if the state value is -1 or tid, do the read
- else report a data race.
A flush of (some part of) the shared variable by thread tid:
- if the state value is -1: no-op
- if the state value is tid: set it to -1
- else: some other thread has a write to the variable which it hasn't flushed. Not exactly sure what is supposed to happen here, but I think this amounts to a race condition
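The three rules above can be sketched as a plain-C state machine over a single leaf's state value. The function names and the convention of returning false for a detected race are assumptions for illustration; the real translation would report an error instead:

```c
#include <stdbool.h>

#define NO_OWNER (-1)

/* Per-leaf state: NO_OWNER, or the id of the thread that has an
 * unflushed write.  Each function returns false if the access is a
 * data race under the rules above. */

bool shared_write(int *state, int tid) {
    if (*state == NO_OWNER) { *state = tid; return true; }  /* claim, then write */
    if (*state == tid) return true;                         /* owner may rewrite */
    return false;                                           /* data race */
}

bool shared_read(int *state, int tid) {
    return *state == NO_OWNER || *state == tid;  /* race otherwise */
}

bool shared_flush(int *state, int tid) {
    if (*state == NO_OWNER) return true;                    /* no-op */
    if (*state == tid) { *state = NO_OWNER; return true; }  /* publish */
    return false;  /* another thread holds an unflushed write */
}
```

A typical trace: thread 1 writes (state goes from -1 to 1), a read by thread 2 now races, thread 1 flushes (state back to -1), and the value becomes visible to everyone.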
Translating parallel
parallel: this spawns some nondeterministic number of threads. We will assume there is a constant THREAD_MAX defined somewhere. The number of threads created will be between 1 and THREAD_MAX (inclusive). Each thread is assigned an ID. The original ("master") thread has ID 0. All threads execute the parallel region.
#pragma omp parallel ... S
=>
{
int _nthreads = 1+$choose_int(THREAD_MAX);
$proc _threads[_nthreads];
$omp_gws _gws = $omp_gws_create($here, _nthreads);
$gbarrier _gbarrier = $gbarrier_create($here, _nthreads);
void _thread(int _tid) {
$omp_ws _ws = $omp_ws_create($here, _gws, _tid);
$barrier _barrier = $barrier_create($here, _gbarrier, _tid);
translate(S)
}
for (int i=0; i<_nthreads; i++) _threads[i]=$spawn _thread(i);
for (int i=0; i<_nthreads; i++) $wait(_threads[i]);
}
All variables that occur in the parallel construct, i.e., the lexical extent of the parallel construct, must be determined to be either private or shared. This is determined by the clauses and the default rules as specified in the OpenMP Standard. Obviously any variable declared within the construct itself must be private.
For all private variables x not declared within the parallel construct, create a new variable of the same type, _x. The new variable is declared within the thread scope. If x is also firstprivate, then _x is initialized with the value of x, e.g. int _x=x;. Otherwise, _x is uninitialized, so has an undefined value.
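A minimal plain-C illustration of the firstprivate case (the file-scope x and the name thread_body are hypothetical; in the real translation _x is declared inside the spawned _thread function):

```c
/* Source (hypothetical):
 *     int x = 5;
 *     #pragma omp parallel firstprivate(x)
 *     { x++; ... }
 *
 * Translation sketch: the thread body declares its own copy _x,
 * copy-initialized from the shared x; updates stay local. */
int x = 5;

int thread_body(int tid) {
    (void)tid;       /* unused in this tiny example */
    int _x = x;      /* firstprivate: initialized from x's value */
    _x++;            /* private update; the shared x is untouched */
    return _x;
}
```

Without firstprivate, _x would simply be left uninitialized, matching the undefined initial value the standard prescribes for private copies.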
Translating for
Try to determine whether the loop iterations are independent. If they are, they can all be executed by one thread. Otherwise:
// location 23:
#pragma omp parallel for
for (i=0; i<n; i++) S
=>
{
$int_iter iter = $omp_ws_arrive_loop(_ws, 23, 0, n-1, 1);
while ($int_iter_hasNext(iter)) {
int i = $int_iter_next(iter);
translate(S);
}
$barrier_call(_barrier);
}
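The API leaves the distribution policy to $omp_ws_arrive_loop. One plausible policy, sketched in plain C below, is the near-equal contiguous blocking that OpenMP's schedule(static) with no chunk size uses; the function name and signature are assumptions, not part of the API above:

```c
/* Split iterations first..last (inclusive, step 1) into nthreads
 * near-equal contiguous blocks and return thread tid's block in
 * [*lo, *hi].  An empty block is signaled by *lo > *hi.
 * The first (total % nthreads) threads get one extra iteration. */
void static_block(int first, int last, int nthreads, int tid,
                  int *lo, int *hi) {
    int total = last - first + 1;
    if (total <= 0) { *lo = 1; *hi = 0; return; }
    int base = total / nthreads, rem = total % nthreads;
    int start = tid * base + (tid < rem ? tid : rem);
    int len = base + (tid < rem ? 1 : 0);
    *lo = first + start;
    *hi = *lo + len - 1;
}
```

For 10 iterations (0..9) over 3 threads this yields 0..3, 4..6, and 7..9: disjoint blocks covering every iteration exactly once, which is the property the checker cares about.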
Translating sections
If there are n sections, create n functions: section1, section2, .... Again the question is how to distribute them among threads and in what order. As with loops, you really want to check these are independent and only do the interleaving exploration as a last resort.
// location 42:
#pragma omp sections
#pragma omp section
S0
#pragma omp section
S1
...
=>
{
$int_iter iter = $omp_ws_arrive_sections(_ws, 42);
while ($int_iter_hasNext(iter)) {
int _i = $int_iter_next(iter);
switch (_i) {
case 0: {
translate(S0);
break;
}
case 1: {
translate(S1);
break;
}
...
} /* end of switch */
} /* end of while loop */
$barrier_call(_barrier);
}
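One simple distribution $omp_ws_arrive_sections could use is round-robin: section i goes to thread i mod nthreads. A plain-C sketch (the function name and out-array interface are assumptions):

```c
/* Write the indices of the sections thread tid executes, in
 * increasing order, into out; return how many there are.
 * n is the total number of sections, numbered 0..n-1. */
int sections_for_thread(int n, int nthreads, int tid, int *out) {
    int count = 0;
    for (int i = tid; i < n; i += nthreads)
        out[count++] = i;
    return count;
}
```

With 5 sections and 2 threads, thread 0 gets sections 0, 2, 4 and thread 1 gets 1, 3; together the threads cover each section exactly once.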
Translating single
// location 33: #pragma omp single S
=>
if ($omp_ws_arrive_single(_ws, 33)) {
translate(S);
}
$barrier_call(_barrier);
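The first-arrival semantics of $omp_ws_arrive_single can be pictured with a per-location flag. This plain-C sketch (the names and the fixed table size are assumptions) ignores the atomicity the real checker would need around the test-and-set:

```c
#include <stdbool.h>

#define MAX_LOCATIONS 64

/* Per-location flag: has some thread already claimed this single? */
static bool single_claimed[MAX_LOCATIONS];

/* The first thread arriving at a given single location returns true
 * and executes the block; every later arrival returns false.  In the
 * real translation this check-and-set must be atomic with respect to
 * the other threads in the team. */
bool arrive_single(int location) {
    if (!single_claimed[location]) {
        single_claimed[location] = true;
        return true;
    }
    return false;
}
```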
Translating barrier
// location 58: #pragma omp barrier
=>
$omp_ws_arrive_barrier(_ws, 58);
$barrier_call(_barrier);
Translating critical
Basically, use a lock for each critical name, plus one for the "no name" case. A thread must obtain the lock to enter the critical section, then release it on exit.
I.e., if there are critical sections named a, b, and c, there should be global root-scope variables of boolean type named _critical_noname, _critical_a, etc.
#pragma omp critical (a)
S
=>
...
_Bool _critical_a = $false;
...
$when (!_critical_a) _critical_a = $true;
translate(S);
_critical_a = $false;
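In plain C the same pattern corresponds to a test-and-set lock per critical name; here is a sketch using C11's atomic_flag (names are illustrative), where the busy-wait plays the role of $when (!_critical_a):

```c
#include <stdatomic.h>

/* One flag per critical name; this one stands in for the boolean
 * guarding "#pragma omp critical (a)". */
static atomic_flag critical_a = ATOMIC_FLAG_INIT;

void enter_critical_a(void) {
    while (atomic_flag_test_and_set(&critical_a))
        ;  /* spin until the previous holder clears the flag */
}

void exit_critical_a(void) {
    atomic_flag_clear(&critical_a);  /* _critical_a = $false */
}
```

Unlike the spinning here, the CIVL-C $when blocks the thread until the guard holds, so the model checker never explores useless spin iterations.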
Translating atomic
This is just $atomic. Or nothing, since assignment expressions are already atomic in CIVL-C. The question is what to do with non-atomic updates: they should register as race conditions.
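For contrast, a C11 stdatomic sketch of the difference between an atomic update and the non-atomic read-modify-write that should be flagged (the counter and function names are illustrative):

```c
#include <stdatomic.h>

static atomic_int counter = 0;

/* "#pragma omp atomic" on counter++ corresponds to one indivisible
 * read-modify-write: */
void atomic_increment(void) {
    atomic_fetch_add(&counter, 1);
}

/* A non-atomic update decomposes into a separate read and write;
 * between them another thread can interleave, losing an update --
 * exactly the race the checker should register. */
void racy_increment(void) {
    int tmp = atomic_load(&counter);  /* read            */
    atomic_store(&counter, tmp + 1);  /* write (gap here) */
}
```

Single-threaded the two behave identically; only under interleaving does the second lose updates, which is why the checker must treat it as a race rather than an atomic step.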
Translating ordered
This can only be used inside an OpenMP for loop whose pragma used the ordered clause. (Check that.) It indicates that the specified region must be executed in iteration order.
In this case the system function must return an int iterator in which the ints occur in loop order.
#pragma omp for ordered
for (i=a; i<b; i++) {
...
#pragma omp ordered
S1
...
#pragma omp ordered
S2
...
}
=>
{
$int_iter iter = $omp_ws_arrive_loop(_ws, 23, a, b-1, 1);
int order1=a, order2=a;
while ($int_iter_hasNext(iter)) {
int i = $int_iter_next(iter);
...
$when (order1==i) {
translate(S1);
order1++;
}
...
$when (order2==i) {
translate(S2);
order2++;
}
...
}
}
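The $when (order==i) gate above can be pictured as a tiny state machine in plain C (the struct and function names are assumptions): iteration i may enter the ordered region only after every iteration below i has exited it.

```c
#include <stdbool.h>

/* Order gate for one ordered region: tracks the next iteration
 * allowed to enter. */
typedef struct { int next; } order_gate;

/* Corresponds to the guard $when (order == i): may iteration i
 * enter the ordered region now? */
bool gate_try_enter(order_gate *g, int i) {
    return g->next == i;
}

/* Corresponds to order++ on leaving the region: hand the gate to
 * the next iteration. */
void gate_exit(order_gate *g) {
    g->next++;
}
```

Each ordered region in the loop body gets its own gate (order1, order2 above), all initialized to the loop's starting iteration; a thread holding a later iteration simply blocks at the guard until the gate catches up.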
