OpenMP Constructs
- parallel
- worksharing: for, sections and section, single
- synchronization: barrier, critical, atomic, ordered, master
- threadprivate, flush
Clauses
- private(list), firstprivate(list), lastprivate(list), copyin(list), shared(list), default(none|shared), num_threads(n), schedule(static, n), schedule(dynamic, n), ...
- ordered, nowait
Functions
- omp_get_num_threads(), omp_get_thread_num()
Translation Strategies
Shared variables
parallel
parallel: this spawns a nondeterministic number of threads. Each thread is assigned an ID; the original ("master") thread has ID 0. All threads execute the parallel region.
#pragma omp parallel ... S
=>
{
  int _nthreads = 1 + $choose_int(THREAD_MAX);
  $proc _threads[_nthreads];
  void _thread(int _tid) {
    translate(S)
  }
  for (int i = 0; i < _nthreads; i++) _threads[i] = $spawn _thread(i);
  for (int i = 0; i < _nthreads; i++) $wait(_threads[i]);
}
All variables that occur in the parallel construct, i.e., the lexical extent of the parallel construct, must be determined to be either private or shared. This is determined by the clauses and the default rules as specified in the OpenMP Standard. Obviously any variable declared within the construct itself must be private.
For each private variable x not declared within the parallel construct, create a new variable of the same type, _x, declared within the thread scope. If x is also firstprivate, then _x is initialized with the value of x, e.g. int _x = x;. Otherwise, _x is uninitialized, so it has an undefined value.
for
Try to determine whether the loop iterations are independent. In that case, any schedule is equivalent, so they can all be executed by one thread.
Otherwise, iterations must be distributed among the threads in some nondeterministic way. This could blow up rapidly! Also, a thread does not have to execute its iterations in increasing order. It can execute them in any order.
Trying a few different things for now: picking a particular scheduling policy like round-robin (static with chunk size 1). Of course you can always do this if schedule is specified to be static.
The question is do we ever want to try to explore these interleavings?
Is there any loss of generality by just running all iterations concurrently?
sections
If there are n sections, create n functions: section1, section2, .... Again the question is how to distribute them among threads and in what order. As with loops, you really want to check that the sections are independent and only do the interleaving exploration as a last resort.
