Changes between Version 24 and Version 25 of OpenMPTransformation


Timestamp:
04/24/14 11:54:39 (12 years ago)
Author:
siegel
Comment:

--

  • OpenMPTransformation

    v24 v25  
    2626* `default(none`|`shared)`
    2727* `num_threads(`n`)`
     28* `collapse(n)`
    2829* `schedule(static, n)`
    2930* `schedule(dynamic, n)`
     
    7374does a barrier on `_barrier` and a flush on all shared variables.
    7475
     76PROBLEM: the above does not seem to guarantee that both the writing thread and the reading thread perform flushes.  According to the Standard, events should occur in this order: thread 1 writes, thread 1 flushes, thread 2 flushes, thread 2 reads.  The protocol above does not require the thread 2 flush.
     77
    7578
    7679== Modeling worksharing state ==
     
    7982* `$omp_gws`:  global work-sharing state
    8083* `$omp_ws`: local state.  A reference to a global object and a thread ID.
     84
     85The following object is used to specify the sequence of iterations to be assigned to one thread executing an omp for loop:
     86{{{
     87typedef struct {
     88  int numIters;
     89  int collapse;
     90  int iters[][];
     91} CIVL_omp_loop_info;
     92}}}
     93
     94The dimensions are `iters[numIters][collapse]`.  The integer `iters[i][j]` is the value of the j-th loop variable in the i-th iteration performed by this thread.
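
The indexing convention above can be illustrated with a minimal plain-C sketch. This is not the CIVL definition: `loop_info_sketch`, `loop_info_get`, and `loop_info_all_2d` are hypothetical names, and the two-dimensional array is assumed to be flattened row-major into a single allocation, since standard C has no `int iters[][]` member.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical plain-C stand-in for CIVL_omp_loop_info: the logical array
 * iters[numIters][collapse] is stored row-major in one allocation. */
typedef struct {
  int numIters;
  int collapse;
  int *iters; /* iters[i * collapse + j] plays the role of iters[i][j] */
} loop_info_sketch;

/* Value of the j-th loop variable in the i-th iteration of this thread. */
int loop_info_get(const loop_info_sketch *info, int i, int j) {
  assert(0 <= i && i < info->numIters);
  assert(0 <= j && j < info->collapse);
  return info->iters[i * info->collapse + j];
}

/* Builds the record for a single thread that executes every iteration of
 * the collapse(2) nest "for (a=0; a<n; a++) for (b=0; b<m; b++)" in
 * program order.  The caller frees info.iters. */
loop_info_sketch loop_info_all_2d(int n, int m) {
  assert(n >= 0 && m >= 0);
  loop_info_sketch info;
  info.numIters = n * m;
  info.collapse = 2;
  size_t cells = (size_t)info.numIters * 2;
  /* Allocate at least one cell so an empty loop still gets a valid pointer. */
  info.iters = malloc(sizeof(int) * (cells > 0 ? cells : 1));
  assert(info.iters != NULL);
  int count = 0;
  for (int a = 0; a < n; a++) {
    for (int b = 0; b < m; b++) {
      info.iters[count * 2 + 0] = a; /* outer loop variable */
      info.iters[count * 2 + 1] = b; /* inner loop variable */
      count++;
    }
  }
  return info;
}
```

With `n = 2` and `m = 3`, the record holds six iterations, and iteration 4 corresponds to the pair `(1, 1)`.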
     95
     96The following object is used to specify the subset of sections assigned to one thread executing an omp sections construct:
     97{{{
     98typedef struct {
     99  int numSections;
     100  int sections[];
     101} CIVL_omp_sections_info;
     102}}}
     103The length of the array `sections` is `numSections`.  The integer `sections[i]` is the index of the i-th section that this thread will execute.
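
Whatever assignment strategy is used, the per-thread records taken together should cover each section exactly once, since OpenMP executes each section once. The following is a hedged plain-C sketch of that check. The names `sections_info_sketch` and `sections_form_partition` are hypothetical and not part of the CIVL API; the array member is replaced by a pointer so that the struct is valid standard C.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical plain-C stand-in for CIVL_omp_sections_info. */
typedef struct {
  int numSections;
  const int *sections; /* sections[i] is the i-th section this thread runs */
} sections_info_sketch;

/* Returns true iff the records of all nthreads threads, taken together,
 * contain each section index 0..total-1 exactly once. */
bool sections_form_partition(const sections_info_sketch *infos, int nthreads,
                             int total) {
  assert(nthreads > 0 && total >= 0);

  /* The total number of assigned entries must equal the number of sections. */
  int entries = 0;
  for (int t = 0; t < nthreads; t++) {
    entries += infos[t].numSections;
  }
  if (entries != total) {
    return false;
  }

  /* Each section index must be assigned to exactly one thread, once. */
  for (int s = 0; s < total; s++) {
    int hits = 0;
    for (int t = 0; t < nthreads; t++) {
      for (int i = 0; i < infos[t].numSections; i++) {
        if (infos[t].sections[i] == s) {
          hits++;
        }
      }
    }
    if (hits != 1) {
      return false;
    }
  }
  return true;
}
```

For three sections and two threads, the assignment `{0, 2}` and `{1}` passes the check, while `{0, 2}` and `{1, 2}` fails because section 2 would run twice.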
    81104
    82105API:
     
    104127 * Parameter start is the initial value of the loop variable;
    105128 * end is its final value; and inc is the increment (which can be
    106  * positive or negative). */
    107 $int_iter $omp_ws_arrive_loop($omp_ws ws, int location, int start, int end, int inc);
     129 * positive or negative).   These values can all be obtained by getting
     130 * the loop statement from the location and evaluating the expressions
     131 * occurring there.*/
     132CIVL_omp_loop_info $omp_ws_arrive_loop($omp_ws ws, int location);
    108133
    109134/* for sections: called at arrival, returns the sequence of sections to
    110135 * be executed by calling thread.  The sections are numbered in order,
    111136 * starting from 0. */
    112 $int_iter $omp_ws_arrive_sections($omp_ws ws, int location);
     137CIVL_omp_sections_info $omp_ws_arrive_sections($omp_ws ws, int location);
    113138
    114139/* for single: called on arrival, returns whether or not to execute
     
    188213{{{
    189214{
    190   $int_iter iter = $omp_ws_arrive_loop(_ws, 23, 0, n-1, 1);
    191 
    192   while ($int_iter_hasNext(iter)) {
    193     int i = $int_iter_next(iter);
     215  CIVL_omp_loop_info info = $omp_ws_arrive_loop(_ws, 23);
     216
     217  int numIters = info.numIters;
     218  for (int j=0; j<numIters; j++) {
     219    int i = info.iters[j][0];
    194220
    195221    translate(S);
     
    200226
    201227We vary the way iterators are chosen to explore different tradeoffs and strategies.  On one extreme, every kind of partition can be explored; on the other, some fixed strategy like round-robin with chunk size 1 can be used.  This only changes the definition of `$omp_ws_arrive_loop`, not the translation above.
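
As an illustration of the fixed-strategy extreme, here is a minimal plain-C sketch of round-robin distribution with chunk size 1 for a single (non-collapsed) loop. The helper `round_robin_iters` and all of its parameters are hypothetical; it assumes the loop bounds are already known and that `end` is the final value of the loop variable, inclusive. It is not the actual definition of `$omp_ws_arrive_loop`.

```c
#include <assert.h>

/* Hypothetical helper: writes to out[] the loop-variable values that thread
 * tid (0-based) executes under round-robin distribution with chunk size 1,
 * and returns how many values were written.  The loop variable runs from
 * start to end inclusive in steps of inc, which may be negative but must
 * not be zero.  cap is the capacity of out[]. */
int round_robin_iters(int start, int end, int inc, int tid, int nthreads,
                      int *out, int cap) {
  assert(inc != 0);
  assert(nthreads > 0 && 0 <= tid && tid < nthreads);
  int count = 0;
  int k = 0; /* position of the current iteration in program order */
  for (int v = start; inc > 0 ? v <= end : v >= end; v += inc, k++) {
    if (k % nthreads == tid) { /* iteration k goes to thread k mod nthreads */
      assert(count < cap);
      out[count++] = v;
    }
  }
  return count;
}
```

For the loop `0..9` with increment 1 and three threads, thread 1 receives the values 1, 4, and 7. Exploring every partition instead would replace the `k % nthreads == tid` test with a nondeterministic choice, leaving the translated loop unchanged.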
     228
     229{{{
     230// location 78:
     231#pragma omp parallel for collapse(3)
     232for (i=0; i<n; i++)
     233  for (j=0; j<m; j++)
     234    for (k=0; k<l; k++) {
     235      S
     236    }
     237}}}
     238
     239=>
     240
     241{{{
     242{
     243  CIVL_omp_loop_info info = $omp_ws_arrive_loop(_ws, 78);
     244
     245  int numIters = info.numIters;
     246  for (int count=0; count<numIters; count++) {
     247    int i = info.iters[count][0];
     248    int j = info.iters[count][1];
     249    int k = info.iters[count][2];
     250
     251    translate(S);
     252  }
     253  barrier_and_flush();
     254}
     255}}}
     256
    202257
    203258=== Translating `sections` ===