== The Sequentially Consistent Subset of OpenMP ==
OpenMP programs that:
* Do not use non-sequentially consistent atomic directives;
* Do not rely on the accuracy of a false result from omp_test_lock and omp_test_nest_lock; and
* Correctly avoid data races as required in Section 1.4.1 on page 23 (OpenMP spec 5.0) ([https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf spec])

The relaxed consistency model is invisible for such programs, and any explicit flush operations in such programs are redundant.

== OpenMP Constructs ==

* `parallel`
 * `private(`list`)`
 * `firstprivate(list)`
 * `copyin(list)`
 * `shared(`list`)`
 * `default(none`|`shared)`
 * `num_threads(`n`)`
 * `reduction(op:list)`

* `sections`
 * `private(`list`)`
 * `firstprivate(list)`
 * `lastprivate(list)`
 * `reduction(op:list)`
 * `nowait`

* `section`

* `single`
 * `private(`list`)`
 * `firstprivate(list)`
 * `copyprivate(list)`
 * `nowait`
 
* `for`
 * `private(`list`)`
 * `firstprivate(list)`
 * `lastprivate(list)`
 * `reduction`
 * `schedule`
 * `collapse`
 * `nowait`

* `simd`
 * `safelen(n)`
 * `linear(n)`
 * `aligned(n)`
 * `private`
 * `lastprivate`
 * `reduction`
 * `collaplse`

* `for simd`
 * `safelen(n)`
 * `linear(n)`
 * `aligned(n)`
 * `private`
 * `lastprivate`
 * `reduction`
 * `collapse`
 * `firstprivate`
 * `nowait`
 * `schedule`

* `declare simd`
 * `simdlen(n)`
 * `linear`
 * `aligned(n)`
 * `uniform`
 * `inbranch`
 * `notinbranch`

* `barrier`

* `critical`
 * `[name]`

* `atomic`
 * `read | write | update | capture`
 * `seq_cst`

* `master`

== OpenMP Types ==
* `omp_lock_t`

== OpenMP Functions ==
* `omp_get_num_threads()`
* `omp_get_thread_num()`
* `omp_get_wtime()`

== OpenMP Functions ==
* `omp_init_lock`
* `omp_destroy_lock`
* `omp_set_lock`
* `omp_unset_lock`
* `omp_test_lock`

== Plan ==
* We are going to get rid of the current OMP2CIVL transformer and come up a new transformer that assumes given OpenMP programs are sequentially consistent
* We are gong to improve the current OmpSimplifier using pointer alias analysis
* Note that an atomic construct without `seq_cst` is outside of the  sequentially consistent subset of the language, we need a way to deal with that.

== Notes ==
* Currently, the simplifier is not aware of the cases that out-of-bound access on multiple dimensional arrays can raise data race.  For example, 
{{{
int a[10][5];
#pragma omp parallel for 
for (int i = 0; i < 5; i++)
  for (int j = 1; j < 10; j++)
    a[i][j] = a[i][j-1] // a[0][4] and a[1][-1] refer to the same element
}}}

The current simplifier will incorrectly sequentialize the example above without realizing the fact that this example is sequentializable if and only if no "logical" out-of-bound happens during the execution.
A fix for the simplifier could be sequentializing the example with inserted assertion for making sure that there is no "logical" out-of-bound error.

== Related ==
* [[wiki: StaticAnalysis| Static Analysis]]

== OpenMP Simplifier ==
* We improve the existing OpenMP simplifier with the informations provided by the [[wiki: StaticAnalysis| Static Analysis]].
* Example 1: (`DRB067-restrictpointer1-orig-no.c` from [https://github.com/LLNL/dataracebench/releases/tag/v1.2.0 DataRaceBench v1.2.0])
{{{
#include <stdlib.h>
typedef double real8;

void foo(real8 * restrict newSxx, real8 * restrict newSyy, int length)
{
  int i;

#pragma omp parallel for private (i) firstprivate (length)
  for (i = 0; i <= length - 1; i += 1) {
    newSxx[i] = 0.0;
    newSyy[i] = 0.0;
  }
}

int main()
{
  int length=10;
  real8* newSxx = malloc (length* sizeof (real8));
  real8* newSyy = malloc (length* sizeof (real8));

  foo(newSxx, newSyy, length);

  free (newSxx);
  free (newSyy);
  return 0;
}
}}}
  * The OpenMP simplifier analyzes the parallel region with points-to informations about the two pointer argument `newSxx` and `newSyy` from static analysis.
  * The OpenMP simplifier can determine that no data race will happen in the parallel region as long as no array out-of-bound error happens in it.
  * The OpenMP simplifier sequentializes the program which will be checked by CIVL.  Any possible array out-of-bound error will be caught by CIVL.

* Example 2 : ( `DRB014-outofbounds-orig-yes.c` from  [https://github.com/LLNL/dataracebench/releases/tag/v1.2.0 DataRaceBench v1.2.0] )
{{{
#include <stdio.h>
int main(int argc, char* argv[])
{
  int i,j;
  int n=10, m=10;
  double b[n][m];
#pragma omp parallel for private(j)
  for (i=1;i<n;i++)
    for (j=0;j<m;j++) // Note there will be out of bound access                                                               
      b[i][j]=b[i][j-1];

  printf ("b[50][50]=%f\n",b[50][50]);

  return 0;
}
}}}
  * OpenMP Simplifier proves that `b[i][j]` and `b[i][j-1]` that accessed by different threads which hold different values of `i` will have no data race as long as no array out-of-bound error happens.
  * OpenMP Simplifier sequentializes the OpenMP program while the verification condition for array out-of-bound freedom will be checked by CIVL by default.
  * The CIVL back-end runs the sequentialized program and detects the array out-of-bound access.

* Example 3
{{{
#include <omp.h>
int main()
{
  int a[10],i;

#pragma omp parallel for
  for(i=0; i<10; i++)
    {
      int c;

      c = i % 2;
      a[i + c] = 0;
    }
}
}}}
  * OpenMP Simplifier cannot prove data-race freedom for this program.
  * OpenMP2CIVL Transformer will transform this program to a equivalent CIVL-C program for model checking
  * CIVL reports PROVABLE errors:
  {{{
Thread 1 can not safely write to memory location &<d5>a[0], because thread 0 has
written to that memory location and hasn't flushed yet.

Violation 0 encountered at depth 64:
CIVL execution violation in p2 (kind: ASSERTION_VIOLATION, certainty: PROVEABLE)
at OpenMPTransformer "_omp_a_shared, &(_om" inserted by OpenMPTransformer.a_sharedWriteCall before civlc.cvh:105.14-20 "$malloc"
Assertion: false
 -> false
. . .
Call stacks:
process 0:
  main at OpenMPTransformer "$parfor (int _omp_ti" inserted by OpenMPTransformer.parallelPragma before civlc.cvh:105.14-20 "$malloc"
process 1:
  $barrier_exit at concurrency.cvl:58.2-6 "$when" called from
  $barrier_call at concurrency.cvl:63.2-14 "$barrier_exit" called from
  $omp_barrier_and_flush at civl-omp.cvl:322.2-14 "$barrier_call" called from
  _par_proc0 at OpenMPTransformer "_omp_team" inserted by OpenMPTransformer.barrierAndFlushCall before civlc.cvh:105.14-20 "$malloc"
process 2:
  $omp_write at civl-omp.cvl:247.4-10 "$assert" called from
  _par_proc0 at OpenMPTransformer "_omp_a_shared, &(_om" inserted by OpenMPTransformer.a_sharedWriteCall before civlc.cvh:105.14-20 "$malloc"
process 3:
  $omp_arrive_loop at civl-omp.cvl:405.2-8 "$atomic" called from
  _par_proc0 at OpenMPTransformer "_omp_team, 0, ($doma" inserted by OpenMPTransformer.myItersDeclaration before civlc.cvh:105.14-20 "$malloc"

Logging new entry 0, writing trace to CIVLREP/testOmp5_0.trace
Terminating search after finding 1 violation.

  }}}

== Capture Read / Write at Runtime ==
* If the OpenMP Simplifier fails to sequentialize a program,  transform the OpenMP program to the following form for model checking:
{{{
$mem writes[nthreads], reads[nthreads];
$parfor (int tid:0..nthreads-1) {
  $write_set_push();
  $read_set_push();
  block1;
  // barrier
  writes[tid] = $write_set_pop();                                                                                                                
  reads[tid] = $read_set_pop();
  // check for dataraces (collective operation)                                                                            
  $write_set_push();
  $read_set_push();
  // barrier                                                                                                                  
  stmt2;
  // barrier
  writes[tid] = $write_set_pop();                                                                                                                
  reads[tid] = $read_set_pop();                                                                                                                 
  // check for dataraces (collective operation)
}
}}}
* Functions for managing `$mem` objects can be found in `mem.cvh`