Changes between Version 24 and Version 25 of OpenMPTransformation
Timestamp: 04/24/14 11:54:39
Legend:
- Unmodified
- Added
- Removed
- Modified
OpenMPTransformation
  * `default(none|shared)`
  * `num_threads(n)`
+ * `collapse(n)`
  * `schedule(static, n)`
  * `schedule(dynamic, n)`

…

  does a barrier on `_barrier` and a flush on all shared variables.

+ PROBLEM: the above does not seem to guarantee that both the writing and the reading thread do flushes. According to the Standard, events should follow this order: thread 1 writes, thread 1 flushes, thread 2 flushes, thread 2 reads. The protocol above does not require the thread 2 flush.

  == Modeling worksharing state ==

…

  * `$omp_gws`: global work-sharing state
  * `$omp_ws`: local state. A reference to a global object and a thread ID.

+ The following object is used to specify the sequence of iterations to be assigned to one thread executing an omp for loop:
+ {{{
+ typedef struct {
+   int numIters;
+   int collapse;
+   int iters[][];
+ } CIVL_omp_loop_info;
+ }}}
+ The dimensions are `iters[numIters][collapse]`. The integer `iters[i][j]` is the value of the j-th loop variable in the i-th iteration performed by this thread.
+
+ The following object is used to specify the subset of sections assigned to one thread executing an omp sections construct:
+ {{{
+ typedef struct {
+   int numSections;
+   int sections[];
+ } CIVL_omp_sections_info;
+ }}}
+ The length of the array `sections` is `numSections`. The integer `sections[i]` is the index of the i-th section that this thread will execute.

  API:

…

  * Parameter start is the initial value of the loop variable;
  * end is its final value; and inc is the increment (which can be
- * positive or negative). */
- $int_iter $omp_ws_arrive_loop($omp_ws ws, int location, int start, int end, int inc);
+ * positive or negative). These values can all be obtained by getting
+ * the loop statement from the location and evaluating the expressions
+ * occurring there. */
+ CIVL_omp_loop_info $omp_ws_arrive_loop($omp_ws ws, int location);

  /* for sections: called at arrival, returns the sequence of sections to
   * be executed by the calling thread. The sections are numbered in order,
   * starting from 0. */
- $int_iter $omp_ws_arrive_sections($omp_ws ws, int location);
+ CIVL_omp_sections_info $omp_ws_arrive_sections($omp_ws ws, int location);

  /* for single: called on arrival, returns whether or not to execute

…

  {{{
  {
-   $int_iter iter = $omp_ws_arrive_loop(_ws, 23, 0, n-1, 1);
-
-   while ($int_iter_hasNext(iter)) {
-     int i = $int_iter_next(iter);
+   CIVL_omp_loop_info info = $omp_ws_arrive_loop(_ws, 23);
+
+   int numIters = info.numIters;
+   for (int j=0; j<numIters; j++) {
+     int i = info.iters[j][0];

      translate(S);

…

  We vary the way iterators are chosen to explore different tradeoffs and strategies. On one extreme, every kind of partition can be explored; on the other, some fixed strategy like round-robin with chunksize 1 can be used. This only changes the definition of `$omp_ws_arrive_loop`, not the translation above.

+ {{{
+ // location 78:
+ #pragma omp parallel for collapse(3)
+ for (i=0; i<n; i++)
+   for (j=0; j<m; j++)
+     for (k=0; k<l; k++) {
+       S
+     }
+ }}}
+
+ =>
+
+ {{{
+ {
+   CIVL_omp_loop_info info = $omp_ws_arrive_loop(_ws, 78);
+
+   int numIters = info.numIters;
+   for (int count=0; count<numIters; count++) {
+     int i = info.iters[count][0];
+     int j = info.iters[count][1];
+     int k = info.iters[count][2];
+
+     translate(S);
+   }
+   barrier_and_flush();
+ }
+ }}}

  === Translating `sections` ===
