Changes between Version 6 and Version 7 of OpenMPTransformation
Timestamp: 04/20/14 10:21:00
OpenMPTransformation
 v6  v7
  1   1
  2   2
  3      - == OpenMP Constructs ==
      3  + == OpenMP Primitives ==
  4   4
      5  + Constructs
  5   6    * `parallel`
  6   7    * worksharing
  …   …
 37  38    == Translation Strategies ==
 38  39
 39      - === Shared variables ===
     40  + === Translating shared variables ===
 40  41
 41      - === `parallel` ===
     42  + === Translating `parallel` ===
 42  43
 43      - `parallel`: this spawns some nondeterministic number of threads. Each thread is assigned an ID. The original ("master") thread has ID 0. All threads execute the parallel region.
     44  + `parallel`: this spawns some nondeterministic number of threads. We will assume there is a constant `THREAD_MAX` defined somewhere. The number of threads created will be between 1 and `THREAD_MAX` (inclusive). Each thread is assigned an ID. The original ("master") thread has ID 0. All threads execute the parallel region.
 44  45
 45  46    {{{
  …   …
 67  68    For all private variables `x` not declared within the parallel construct, create a new variable of the same type, `_x`. The new variable is declared within the thread scope. If `x` is also firstprivate, then `_x` is initialized with the value of `x`, e.g. `int _x=x;`. Otherwise, `_x` is uninitialized, so has an undefined value.
 68  69
 69      - === `for` ===
     70  + === Translating `for` ===
 70  71
 71  72    Try to determine whether the loop iterations are independent. In that case, they can all be executed by one thread.
  …   …
 78  79
 79  80    Is there any loss of generality by just running all iterations concurrently?
     81  +
     82  + One approach: assume you have a function or macro `CIVL_owns(n, t, i)`. It takes three ints and returns a boolean. The arguments are `n`: the number of threads; `t`: a thread ID between 0 and `n`-1 (inclusive); and `i`, an iteration index.
 80  83
 81  84    {{{
  …   …
 97 100
 98 101
 99      - === `sections` ===
    102  + === Translating `sections` ===
100 103
101 104    If there are n sections, create n functions: section1, section2, .... Again the question is how to distribute them among threads and in what order.
102 105    As with loops, you really want to check these are independent and only do the interleaving exploration as a last resort.
103 106
104      - === `single` ===
    107  + {{{
    108  + #pragma omp sections
    109  + {
    110  +   #pragma omp section
    111  +   ...
    112  +   #pragma omp section
    113  +   ...
    114  + }
    115  + }}}
    116  +
    117  + =>
    118  +
    119  + {{{
    120  + {
    121  +   void section0() {
    122  +     ...
    123  +   }
    124  +   void section1() {
    125  +     ...
    126  +   }
    127  +   ...
    128  +   if (CIVL_owns(_nthreads, _tid, 0))
    129  +     section0();
    130  +   if (CIVL_owns(_nthreads, _tid, 1))
    131  +     section1();
    132  +   ...
    133  +   barrier unless nowait;
    134  + }
    135  + }}}
    136  +
    137  + === Translating `single` ===
105 138
106 139    Nondeterministically choose a thread, i.e., `$choose_int(threads)`. That thread executes the code; the rest skip it.
107      - The question is, which thread does the choosing? The first thread to arrive at that construct? Once again, try to determine if it matters. If the modifications and reads do not involve and private data, it doesn't matter which thread does it, so make it thread 0.
    140  + The question is, which thread does the choosing? The first thread to arrive at that construct?
    141  +
    142  + Once again, try to determine if it matters. If the modifications and reads do not involve any private data, it doesn't matter which thread does it, so make it thread 0.
108 143
109 144    There is a barrier at the end.
110 145
111 146
112      - === `barrier` ===
    147  + === Translating `barrier` ===
113 148
114 149    Provide some system functions for this. All the threads in the team (threads[i]) register with a barrier object and partake in the barrier. Can re-use that barrier object for multiple barriers.
115 150
116      - === `critical` ===
    151  + === Translating `critical` ===
117 152
118 153    Basically, use a lock for each critical name, plus one for the "no name". All threads must obtain lock to enter the critical section, then release it.
119 154
120      - === `atomic` ===
    155  + === Translating `atomic` ===
121 156
122 157    This is just `$atomic`.
123 158
124      - === `ordered` ===
    159  + === Translating `ordered` ===
125 160
126 161    This can only be used inside an OMP `for` loop in which the pragma used the `ordered` clause. (Check that.) It indicates that the specified region must be executed in iteration order.
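
The `for` and `sections` translations in this revision both rely on `CIVL_owns(n, t, i)` without fixing how ownership is decided. As a minimal sketch, one deterministic choice is a round-robin assignment. The names `owns_round_robin` and `count_owners` are hypothetical and are not part of CIVL; the page itself leaves the distribution among threads as an open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CIVL_owns(n, t, i): thread t owns
 * iteration (or section) i when i mod n equals t. */
bool owns_round_robin(int n, int t, int i) {
    assert(n > 0);            /* at least one thread */
    assert(t >= 0 && t < n);  /* thread ID in 0..n-1 */
    assert(i >= 0);           /* non-negative iteration index */
    return i % n == t;
}

/* Number of threads in a team of size n that claim iteration i.
 * A worksharing construct runs each iteration once, so an
 * ownership predicate used for it should give exactly 1 here. */
int count_owners(int n, int i) {
    int owners = 0;
    for (int t = 0; t < n; t++) {
        if (owns_round_robin(n, t, i)) {
            owners++;
        }
    }
    return owners;
}
```

With 3 threads, iterations 0 through 5 go to threads 0, 1, 2, 0, 1, 2, so each iteration is claimed by exactly one member of the team.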
