\section{MPI Function Contract System Implementation} \label{sec:mpi:impl}
%intro: what are needed by a contract system starting from CIVL
%framwork ?
The implementation of the MPI function contract system in CIVL is a
solution to the following 4 problems: 1) CIVL is designed for
verifying complete programs monolithically hence cannot verify
functions separately; 2) the semantics of the contract system is
different from the one of CIVL-C, e.g., a call to a function with a
contract shall be summarized only from the contract instead of the
function definition; 3) there is no existing primitive in CIVL-C that
is close to collate or collective states; 4) CIVL does not support
path predicates.

Problem 1 and 2 are tackled by AST-level code transformation, and
Problem 3 and 4 are solved by implementing CIVL-C libraries for
collates and absence assertions.  Both the two methods are in keeping
with the philosophy of CIVL, which is to support a small verification
kernel well, and do the rest of the work by customized libraries and
source code transformation.

In the rest of this section, we first describe the collate library in
\S\ref{sec:mpi:impl:collate}, then explain the method of reasoning
about absence assertions in \S\ref{sec:mpi:impl:absence}.  Finally, in
\S\ref{sec:mpi:impl:transform}, we show that how does source code
transformation converts an MPI program to a CIVL-C program, which can
be verified by the common CIVL verifier with respect to the contract
system semantics.


\subsection{Implementing Collates} \label{sec:mpi:impl:collate}

One of the extensions the contract system made to CIVL is a
\texttt{collate} library.  This library provides data types and
functions to realize the collate-based algorithm for constructing
collective states.  Like the MPI library in CIVL, the \texttt{collate}
library hides the details that a collate queue is in fact a shared
object from the clients---MPI processes.  Therefore, conceptually
the \texttt{collate} library introduces no shared object.

In the library, a collate queue is implemented by a data structure,
\texttt{\$gcollator}.  A collate queue is associated with a group of
processes in a communicator.  Each process owns a process-local
handle, \texttt{\$collator}, to the collate queue.  Operations on the
collate queue can only be performed through the handle.  A collate in
the queue is implemented as a data structure,
\texttt{\$gcollate\U{}state}.  Similarly, a process can only access a
collate through a process-local handle, \texttt{\$collate\U{}state}.

The CIVL-C data structure representing the collate queue
\texttt{\$gcollator} (left) and the collate
\texttt{\$gcollate\U{}state} (right) is given in Fig.\
\ref{fig:mpi:impl:data-structure}.  The left data structure is simply
a wrap of a CIVL-C sequence of collates and two integers: the number
of processes and the queue size.  The right data structure represents
a collate.  In a collate, the array, \texttt{status}, keeps the track
of whether every process have copied its snapshot to this collate.  If
\texttt{status[$p$]} is true, process $p$ has copied its snapshot to
this collate.  The field \texttt{state} holds the program state that
is merged from the snapshots that have been copied to this
collate. The two arrays , \texttt{requirements} and
\texttt{guarantees}, store the corresponding absence assertions to
this collate for each process.  The field \texttt{isPre} labels the
collate for whether it is associated to a collective pre-state.  The
field \texttt{isTarget} labels the collate for whether it is
associated to a collective state of the verifying function.  Finally,
\texttt{id} holds an identifier for the function associated with this
collate.

The \texttt{\$state} is a CIVL-C built-in type representing program
states.  The \texttt{state} field in a collate will keep being updated
via being merged with the snapshots from processes.  Once
\texttt{status[$p$]} is true for every process $p$, the collate is
completed, the \texttt{state} field becomes equivalent to the
collative state, to which the collate is associated.

Following the collate management algorithm (described in
\S\ref{sec:minimp:system:onthefly}), an object (collate) of
\texttt{\$gcollate\U{}state} type will be created and enqueued by the
first process that reaches a collective location, i.e., the entry or
exit of a contracted function.  The \texttt{id} field is initialized
during the creation.  When every following process reaches a
collective location and copies its snapshot to the collate, the
process compares the identifier of the reached function with the
\texttt{id} in the collate.  A mismatch means that there is a
collective-style function in the program that is not called by all
processes collectively.

The other four fields are used by the absence assertion checking
algorithm, which will be described in the next subsection.

\begin{figure}
  \begin{center}  
    \begin{minipage}[t][][t]{.45\textwidth}
      \begin{program}
        \ccode{struct} \U{}gcollator \lb{} \\
        \ \  \ccode{int} nprocs;  \\
        \ \  \ccode{int} queue\U{}length; \\
        \ \  \$gcollate\U{}state queue[]; \\
\rb{};
      \end{program}
    \end{minipage}
    \begin{minipage}[t][][t]{.45\textwidth}
    \begin{program}
      \ccode{struct} \U{}gcollate\U{}state \lb{} \\
      \ \ \U{}Bool status[]; \\
      \ \ \$state state; \\
      \ \ \U{}Bool requirements[]; \\
      \ \   \U{}Bool guarantees[];   \\
      \ \   \U{}Bool isPre;       \\
      \ \   \U{}Bool isTarget;   \\
      \ \ \ccode{int} id;\\
      \rb{};
    \end{program}
  \end{minipage}
\end{center}
\caption{CIVL-C implementation for collate and collate queue}
\label{fig:mpi:impl:data-structure}
\end{figure}

Around these data structures, the \texttt{collate} library provides a
set of functions for the clients to interact with them.  Among those
functions, there is a pair for processes to copy their snapshots to
proper collates and to free collates collectively:

\begin{itemize}
\item[] \begin{programNoNum}
\textbf{1.}    \$collate\U{}state \$collate\U{}arrives(\$collator c, \$scope
    scope, \\
    \ \ \ \ \ \ \ \ \U{}Bool isPre, \U{}Bool isTarget, 
					  \U{}Bool requirement, \\
    \ \ \ \ \ \ \ \ \U{}Bool guarantee, \ccode{int} id);
  \end{programNoNum}
  
\item[] \begin{programNoNum}
\textbf{2.}    \ccode{void} \$collate\U{}departs(\$collator c, \$collate\U{}state cs);
  \end{programNoNum}
\end{itemize}

When a process reaches a collective location associated with a
collective-style function $g$, the process calls the
\texttt{\$collate\U{}arrives} function to save its snapshot to the
proper collate in a collate queue.  The queue is given by the
\texttt{\$collator} handle.  In addition to the snapshot, the
partially evaluated path-requirement and -guarantee of $g$ of the
process are stored in the collate.  This \texttt{\$collate\U{}arrives}
function returns a handle to the corresponding collate.  To free a
collate, each process calls \texttt{\$collate\U{}departs} with a
handle to the collate collectively.  The function does not guarantee
that the collate object is actually destroyed.  But a process shall
not access a collate once it is ``freed'' by a call to
\texttt{\$collate\U{}departs}.



A snapshot of a process is also a \texttt{\$state}.  Initially, the
\texttt{state} field in a collate holds the snapshot of the first
``arrived'' process.  Then, whenever a process ``arrives'', its
snapshot will be merged into the \texttt{state} field.  

In CIVL, a program state is mainly a tree of \emph{dynamic scopes}
(dyscopes) and a group of call stacks.  A dynamic scope is a runtime
instance of a lexical program scope and is assigned a unique ID.  A
call stack belongs to a process.  An entry of a call stack is a
reference to a dyscope.

Figure \ref{fig:civl:state} shows a CIVL state of a CIVL-C program
transformed from a general MPI program.  Recall that the
transformation template is given in Fig.\ \ref{fig:sec:mpi:transform}.
The state belongs to a run with two MPI processes.  Note that in
addition to the two processes that run the
\texttt{\U{}mpi\U{}process}, the state includes a ``root'' process p0
that was spawned at the time the CIVL-C program launches.  The
``parent-of'' relation in between dyscopes relates to their lexical
scopes.  If a dyscope $d$ is the parent of another dyscope $d'$, the
lexical scope of $d$ is the parent of the lexical scope of $d'$.  For
example, in Fig.\ \ref{fig:civl:state}, \textbf{d0} is a runtime
instance of the root scope $s_0$ and \textbf{d1} is a runtime instance
of the lexical scope $s_1$ of the \texttt{main} function.  So $s_0$ is
the parent of $s_1$ and \textbf{d0} is the parent of \textbf{d1}.  For
a lexical scope, it can have multiple runtime instances in a state.

\begin{figure}[t]
  \begin{center}
    \includegraphics[scale=.45]{civl-state}
  \end{center}
  \caption{A CIVL state of a CIVL-C program transformed from an
    general MPI program, which contains a function \texttt{f}.  The
    boxes labeled by \textbf{d0, d1, ...} are dyscopes.  Arrows
    between dyscopes denote the ``parent-of'' relation.  Boxes
    under ``p0, p1, p2'' are call stack entries.  Dashed arrows mark
    the referred dyscope of every call stack entry.}
  \label{fig:civl:state}
\end{figure}

In a snapshot of a process $p$ taken from a state $\omega$, the
dyscope tree contains only the dyscopes reachable by $p$ in $\omega$.
There is only one call stack in the snapshot, which belongs to $p$.

%%For two dyscopes $d$ and $d'$ in any different dyscope trees, $d$ is
%%said to be equal to $d'$ iff they have the same ID.  Let
%%$\mymath{merge(t, l)}$ be a tree, which depends on a tree $t$ and a
%%list $l$.  The $\mymath{merge(t, l)}$ is obtained by appending a
%%sub-list $l'$ of $l$ to $\mymath{lca(t, l)}$ in $t$.  Here $l'$ is a
%%sub-list of $l$ such that the root of $l'$ is the child of
%%$\mymath{lca(t, l)}$ in $l$.

Note that the \texttt{\$collate\U{}arrives} function also takes a
\texttt{\$scope} parameter, which shall refer to the dyscope where the
process invokes the call to this function.  Let $h$ be such
\texttt{\$scope} parameter.  In the snapshot of a process taken at a
call to \texttt{\$collate\U{}arrives}, no descendant dyscopes of $h$
needs to be included in the snapshot.

The \texttt{\$state} field in a collate is initialized by the snapshot
of the first process that ``arrives'' the collate.  We call the
\texttt{\$state} object in an incomplete collate a \emph{partially
  merged state}.  Merging a snapshot $\omega_p$ to a partially merged
state $\omega_c$ includes the following steps:

\begin{enumerate}
\item Re-numbering dyscope IDs in $\omega_p$ to be consistent with the
  $\omega_c$.  For a dyscope $d$ in $\omega_p$, if there is a dyscope
  $d'$ in $\omega_c$ such that $d$ and $d'$ have the same ID (denoted,
  $d = d'$), let $d$ keeps its ID.  Otherwise, re-numbering the ID of
  $d$ to be a new unique one that does not exist in $\omega_c$.

\item Let $d$ and $d'$ be two dyscopes in the re-numbered $\omega_p$
  and $\omega_c$, respectively.  We call $d$ or $d'$ the \emph{least
    common ancestor} of the two trees in $\omega_p$ and $\omega_c$
  if 
  \begin{itemize}
  \item $d = d'$,
  \item any leaf dyscope in the dyscope tree of $\omega_p$ is a
    descendent of $d$, and any leaf dyscope in the dyscope tree of
    $\omega_c$ is a descendent of $d'$, and
  \item there is not a pair of dyscopes $\delta$ and $\delta'$ such
    that $\delta$ is $d$ or a descendent of $d$, $\delta'$ is $d'$ or
    a descendent of $d'$, and $\delta$ and $\delta'$ satisfy the both
    two conditions above.    
  \end{itemize}
  The merging is then carried out by appending all child dyscopes of
  $d$ to $d'$.

\item Finally, copying the sole call stack in $\omega_p$ to
  $\omega_c$.  Re-number the ID of the process of the call stack to
  its rank corresponding to the collate.
\end{enumerate}


\begin{exmp}
  Figure \ref{fig:state:merge} shows merging a snapshot (on the
  right), which is taken from the state in Fig.\ \ref{fig:civl:state}
  for process p1, to a \emph{partially merged state} (on the left), on
  which processes are re-numbered by their ranks and are at some
  collective locations.  The dyscope \textbf{d1} is the least common
  ancestor of the two dyscope trees on the left and right,
  respectively.  Dyscopes \textbf{d2} and \textbf{d3} on the right
  will then be re-numbered to \textbf{d6} and \textbf{d7},
  respectively.  The process p0 on the right will then be re-numbered
  in correspondence to its rank in the collate.  $\qed$
\end{exmp}

  \begin{figure}[t]
    \begin{center}
      \includegraphics[scale=.35]{merging}
    \end{center}
    \caption{Merging a snapshot to a state.  The snapshot (on the
      right) was taken from the state in Fig.\ \ref{fig:civl:state}
      for process p1.}
    \label{fig:state:merge}
  \end{figure}

According to such state merging process, the ``shared'' dyscopes,
i.e., the ones reachable by all processes, are coming from the
snapshot of the first ``arrived'' process.  The MPI communicators
reside in these ``shared'' dyscopes.  Following the collate management
algorithm, message channels in the MPI communicators in a collate
shall then be updated whenever a process $p$, which has not arrived
the collate, executes a send or a receive operation.

To update the message channels in collates, the \texttt{collate}
library provides a pair of functions:

\begin{itemize}
\item \texttt{\$collate\U{}enqueue(\$collator
    c, \$comm comm, \$message m)}

\item \texttt{\$collate\U{}dequeue(\$collator
    c, \$comm comm, \$message m)}  
\end{itemize}
For every collate that has not ``arrived'' by the running process in
the queue referred by \texttt{c}, the \texttt{\$collate\U{}enqueue}
function enqueues the \texttt{\$message m} to the proper message
channel of the communicator referred by \texttt{comm} residing in the
merged state of the collate.  Note the source and destination
information of a \texttt{\$message} are encoded in itself.  Vice versa
for the \texttt{\$collate\U{}dequeue} function.

\subsection{Implementing Absence
  Assertions} \label{sec:mpi:impl:absence} Absence assertions in the
MPI contract language are translated to CIVL-C \emph{abstract
  functions}.  An abstract function is a mathematical uninterpreted
function.

\begin{itemize}
\item \texttt{\mygammaMPI{p}{t}} is translated to an boolean abstract
  function \texttt{\mygammaCIVL{p}{t}};
\item \texttt{\myguarantyMPI{p}{t}{q}} is translated to an boolean
  abstract function \texttt{\myguarantyCIVL{p}{t}{q}};
\item  \texttt{\mywaitsforMPI{p}} is translated to an boolean abstract
  function \texttt{\mywaitsforCIVL{p}}.
\end{itemize}

We define these abstract functions in a CIVL-C library,
\texttt{absence\U{}assertion.cvh}, which is another extension to CIVL.
In addition to these abstract functions, the library provides a set of
functions for reasoning about the absence assertions.  For example,
there is an \texttt{\$aa\U{}not\U{}exists(\U{}Bool set, \U{}Bool
  absence)} function.  It takes two boolean expressions: \texttt{set}
is the conjunction of a set of absence assertions and \texttt{absence}
is a single absence assertion.  The function returns true iff the
\texttt{absence} assertion is not an element of the \texttt{set}.  For
instance, if for some integers $a, b, c, n, r, p$, \texttt{set} is
$\forall{i\in\nprocs{n}}.\forall{t \in \mathbb{Z}}.\,
\texttt{\myguarantyCIVL{i}{t}{r}} \wedge{}
\texttt{\mywaitsforCIVL{p}}$ and \texttt{absence} is
\texttt{\myguarantyCIVL{a}{b}{c}}, the function returns $(\forall{i
  \in \nprocs{n}}.\, i \not= a) \vee (\forall{t\in\mathbb{Z}}.\,
t\not=b) \vee r \not= c$.

Absence assertions are checked for the freedom of the errors defined
in Fig.\ \ref{fig:minimp:mono:error} and
\ref{fig:modular:verify:error} when processes reach specific
locations.  We use the CIVL-C function \texttt{check\U{}guarantee}
that checks the freedom of \textbf{PGV} as an
example.  Figure \ref{fig:mpi:imple:check:upsilon} shows the
simplified CIVL-C code of this \texttt{check\U{}guarantee} function.
The function is executed immediately after a process performs a send
operation.  At line 6-10, the code searches the collate of the
collective pre-state of the verifying function by iterating the given
collate queue.  By the collate management algorithm, the search will
succeed because the goal collate will not be freed until all processes
have reached the end of the verifying function.  Line 16-20 checks for
each process $q$ that has not entered the verifying function, if the
guarantee of the function of the running process (identified by
\texttt{place} in the communicator) contains
\begin{center}
  \texttt{\myguarantyMPI{\texttt{dest}}{\texttt{tag}}{q}},
\end{center}
the guarantee is violated.


\begin{figure}[t]
  \begin{small}
  \begin{program}
    \ccode{void} check\U{}guarantee(\ccode{int} place, \ccode{int} dest, \ccode{int} tag, \ccode{int} nprocs, \\
    \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \$collator c)  \lb{} \\
    \ \   \$collate\U{}state cs; \\
    \ \  \ccode{int} queue\U{}size = \$collate\U{}queue\U{}size(c, place); \\
    \\
    \ \   \ccode{for} (\ccode{int} i = 0; i < queue\U{}size; i++)  \lb{} \\
    \ \ \ \    cs = \$collate\U{}get\U{}in\U{}queue(c, i); \\
    \ \ \ \    \ccode{if} (\$collate\U{}state\U{}is\U{}pre(cs) \&\& \$collate\U{}state\U{}is\U{}target(cs)) \\
    \ \ \ \ \ \      \ccode{break};       \\
    \ \ \rb{} \\
    \ \ \textcolor{red}{// at this point, cs must hold the handle to the collate} \\
    \ \ \textcolor{red}{// associated to the collective pre-state of the function}\\
    \ \ \textcolor{red}{// being verified.}\\
    \ \   \U{}Bool element;\\
    \ \ \U{}Bool set = \$collate\U{}get\U{}guarantee(cs, place);\textcolor{red}{// get my guarantee}  \\
    \\
    \ \   \ccode{for} (\ccode{int} i = 0; i < nprocs; i++) \\
    \ \ \ \  \ccode{if} (!\$collate\U{}arrived(cs, i))  \lb{} \\
    \ \ \ \ \ \ element = \$absent\U{}sendto(dest, tag, i); \\
    \ \ \ \ \ \ \$assert(\$aa\U{}not\U{}exists(element, set),\\
    \ \ \ \ \ \ \ "the guarantee
    of the verifying function is violated!"); \\
    \ \ \ \     \rb{}\\
    \rb{}
  \end{program}
  \end{small}
  \caption{The simplified code of the CIVL-C function that will be
    called every time a send operation was performed by a process in
    order to check for the free of \textbf{PGV}.}
  \label{fig:mpi:imple:check:upsilon}
\end{figure}


\subsection{Transformation} \label{sec:mpi:impl:transform}

A \emph{contract transformer} was developed and added to CIVL for
transforming a C/MPI/CIVL-C program along with ACSL function contracts
to a pure CIVL-C program.  The transformer takes a parameter $f$,
which is the name of the verifying function, and then transforms all
the annotated functions in the original program in two ways:

\begin{enumerate}
\item For the function $f$, the transformer copies $f$ to
  $f\texttt{\U{}origin}$ and creates a new function
  $f\texttt{\U{}driver}$.  The $f\texttt{\U{}driver}$ initializes all
  the variables with respect to the state-requirement of $f$, then
  calls $f\texttt{\U{}origin}$ to launch the exploration of the
  original function body.  The state-guarantee of $f$ is checked in
  $f\texttt{\U{}driver}$ after the control returns from
  $f\texttt{\U{}origin}$.
  
\item For every annotated function $g$ (including $f$), the
  transformer replaces the definition of $g$ with a CIVL-C code that
  1) asserts the state-requirement of $g$; 2) refreshes the objects
  specified in \texttt{assigns} clauses; 3) assumes the
  state-guarantee of $g$.
\end{enumerate}
Note that the verifying function $f$ will be transformed in both of
the ways.  This results in three different functions that all
originate from $f$: $f\texttt{\U{}driver}$, $f\texttt{\U{}origin}$ and
the $f$ with a new body.

The transformation relies on a number of basic CIVL-C primitives, as
well as helper libraries that provide contract-specific data types and
functions.  A selected set of such primitives are listed below:

\begin{itemize}
   
\item \texttt{\$value\U{}at(\textsf{expr}, \textsf{state}, \textsf{pid})}
  is an expression that represents the evaluation of
  \texttt{\textsf{expr}} on \texttt{\textsf{state}} by the process of
  \texttt{\textsf{pid}}.  The \texttt{\BSL{}on}, as well as the
  \texttt{\BSL{}old} construct from ACSL, will be translated to this
  expression.  
  
\item \texttt{\$run \textsf{stmt}} is a CIVL-C statement that creates
  a new process to execute \textsf{stmt}. 
  
\item \texttt{\$with(\textsf{state}) \textsf{stmt}} is a CIVL-C
  statement that executes the given \texttt{\textsf{stmt}} starting from
  the given \texttt{\textsf{state}}.
  
\item \texttt{\$havoc(\textsf{pointer})} ``refreshes'' an object by
  assigning a unique and unconstrained symbolic constant to the
  object referred by the \texttt{\textsf{pointer}}.
  
\item \texttt{\$mem} is a type representing a set of memory
  locations.

\item \texttt{\$mem\U{}havoc($m$)} assigns a unique and
  unconstrained symbolic constant to every object in the memory
  locations in $m$.
  
\item \texttt{\$mem\U{}contains($m_0$, $m_1$)} returns true iff every
  element in the memory location set $m_0$ is also an element of the
  memory location set $m_1$.

\item \texttt{\$write\U{}set\U{}push()} informs the verifier to
  start to save all the memory locations that are modified during the
  execution.  This is implemented by pushing a new empty memory
  location set to a stack that is maintained by every process.  The
  stack is called the \emph{write set stack}.  During the execution, a
  write operation that is performed by a process $p$ on a memory
  location $m$ will cause $m$ be added into the top entry of the write
  set stack of $p$ unless the stack is empty.

\item \texttt{\$write\U{}set\U{}pop()} pops and returns the top
  entry of the write set stack of the executing processes.  When the
  write set stack is empty, the verifier will not save any memory
  location.

\item \texttt{\$mpi\U{}snapshot(\textsf{comm}, \textsf{scope},
    \textsf{isPre}, \textsf{isTarget}, \textsf{requirement},
    \textsf{guarantee}, \textsf{fid})} is a helper function for the
  transformer that wraps the \texttt{\$collate\U{}arrives} function.
  In addition to a call to \texttt{\$collate\U{}arrives}, this
  function contains CIVL-C code that checks absence assertion
  violation and performs POR.  Dually, there is a
  \texttt{\$mpi\U{}unsnapshot(\textsf{collate\U{}state})} function.

  We remark that the \texttt{\textsf{comm}} parameter is of
  \texttt{MPI\U{}Comm} type.  In our extended implementation for MPI
  contracts, an MPI communicator was extended with an extra field
  holding a collate queue (i.e., a \texttt{\$gcollator}).  This is be
  in consistent with the theory that a collate queue is associated
  with a communication universe.  As a consequence, collate queues in
  different MPI communicators are naturally independent.
\end{itemize}

All the MPI specific constructs in the MPI contract language are
translated to special CIVL system functions.  A system function in
CIVL is a function that is implemented in the language that writes
CIVL, i.e., Java 8 \cite{java8}, instead of CIVL-C.  For example, the
\texttt{\BSL{}mpi\U{}agree($e$)} expression in contract will be
translated to a call to the system function
\texttt{\$mpi\U{}agree($e$)}.  The system function returns true iff
every process evaluates $e$ to a same value at the current state.


\textbf{Driver Function Generation.}
For a given verifying function \texttt{f}, \texttt{f} must be assigned
a function contract.  The transformer moves the original body of
\texttt{f} to the new function \texttt{f\U{}origin}.  Then generates a
new function \texttt{f\U{}driver} with respect to the contract.
Figure \ref{fig:mpi:impl:transform:driver} summarizes the idea of
generating \texttt{f\U{}driver} with a template.

In the template, the pseudocode above the horizontal line is the
original verifying function \texttt{f} with its contract.  The
\texttt{T, T1, T2} and \texttt{T3} represent arbitrary types. The
$\psi$ and $\phi$ are boolean expressions.  The $\Delta$ represents a
list of lvalue expressions.  The $\Gamma$ and $\Upsilon$ are sets of
absence assertions.

The pseudocode below the horizontal line is the generated
\texttt{f\U{}driver} function.  The MPI communicator \texttt{comm}
associated to the collective behavior in the contract is initialized
by duplicating the \texttt{MPI\U{}COMM\U{}WORLD} (line 4).  Two
variables \texttt{\$mpi\U{}comm\U{}rank} and
\texttt{\$mpi\U{}comm\U{}size} are created to replace the two constants
\texttt{\BSL{}mpi\U{}comm\U{}rank} and
\texttt{\BSL{}mpi\U{}comm\U{}size}, respectively.
Formal parameters are declared as local variables. These local
variables, as well as the global variables, are initialized by
unconstrained symbolic constants (line 8).

For a pointer type formal parameter $v$, one cannot assume that $v$
can be safely dereferenced unless the contract ``requires''
\texttt{\BSL{}valid($v$)}.  For such $v$, a unique object $o$ of the
referenced type of $v$ will be declared and ``refreshed''.  The
expression \texttt{\BSL{}valid($v$)} in the state-requirement is then
replaced by \texttt{$v$ == \&$o$}.

We remark that such a transformation for \texttt{\BSL{}valid}
expressions in state-requirements is only sound under an assumption
that there is no \emph{pointer aliasing}, i.e., a pointer type formal
parameter is assumed not to refer to a program variable or an object
referred by another pointer unless it is explicitly expressed in the
state-requirement.

The state-requirement is \texttt{\$assume}d in a pair of barriers so
that the state from which this \texttt{\$assume} statement is executed
will be equivalent to a pre-state of \texttt{f}.  Similar for the case
of \texttt{\$assert}-ing the state-guarantee.

Although for \texttt{f}, there is no need to construct its collective
pre- and post-states, there are still \texttt{\$mpi\U{}snapshot} calls
that create collates for \texttt{f}.  This is because that we use
these two calls to represent the events of entering and exiting of
\texttt{f}, respectively.  In addition, the created collective
pre-state of \texttt{f} will be used to evaluate the
\texttt{\BSL{}old} expressions in the state-guarantee.

The call to the \texttt{f\U{}origin} is surrounded by a pair of
\texttt{\$write\U{}set\U{}push} and \texttt{\$write\U{}set\U{}pop}.
By doing so, any memory location that is modified during the execution
of the original function body will be saved in \texttt{m} (line 17).
Then the validity of the frame condition can be verified by testing if
\texttt{m} is a subset of $\Delta$.

Finally, the collates created by calls to \texttt{\$mpi\U{}snapshot}
will be collectively ``destroyed'' through the calls to
\texttt{\$mpi\U{}unsnapshot} (line 26-27).

\begin{figure}
  \begin{program}
    T3 g, ...\ ; \\
    \mycomment{/*@ \ \BSL{}mpi\U{}collective(comm, $\mymath{kind}$):}\\
    \mycomment{\ \ \ \ \ \ requires $\bigwedge{\Gamma}$;}\\
    \mycomment{\ \ \ \ \ \ requires $\psi$;}\\
    \mycomment{\ \ \ \ \ \ assigns $\Delta$;}\\
    \mycomment{\ \ \ \ \ \ ensures $\phi$;}\\
    \mycomment{\ \ \ \ \ \ ensures $\bigwedge{\Upsilon}$; */}\\
    T f(T1 a, T2 * b, ...\ ) \lb{} 
     ...\ 
    \rb{}
  \end{program}
  
  \rule{\textwidth}{.5pt}
  
  \begin{program}
    \ccode{void} f\U{}driver() \lb{} \\
    \ \ \ccode{int} \$mpi\U{}comm\U{}rank, \$mpi\U{}comm\U{}size;\\
    \ \ MPI\U{}Comm comm; \\
    \ \ MPI\U{}Comm\U{}dup(MPI\U{}COMM\U{}WORLD, \&comm); \\
    \ \ MPI\U{}Comm\U{}rank(comm, \&\$mpi\U{}comm\U{}rank); \\
    \ \ MPI\U{}Comm\U{}size(comm, \&\$mpi\U{}comm\U{}size);\\
    \ \ T1 a, T2 *b, T2 b\U{}obj ...\ ; \\
    \ \ \$havoc(\&g); \$havoc(\&a); \$havoc(\&b); \$havoc(\&b\U{}obj);
    ...\; \\
     \ \ MPI\U{}Barrier(comm); \\
     \ \ \$assume($\psi$); \\
     \ \ MPI\U{}Barrier(comm); \\
    \ \ \$collate\U{}state \U{}cs\U{}pre = \$mpi\U{}snapshot(comm,
    \$here, \$true,  \$true,  \\
    \ \ \ \ $\bigwedge{}\Gamma$,  $\bigwedge{}\Upsilon$, ''f'');\\    
    \\
    \ \ \$write\U{}set\U{}push(); \\
    \ \ T \$result = f\U{}origin(comm, a, b, ...\ );\\
    \ \ \$mem m = \$write\U{}set\U{}pop();\\
    \ \ \$assert(\$mem\U{}contains($\Delta$, m));\\
    \\
    \ \ \$collate\U{}state \U{}cs\U{}post = \$mpi\U{}snapshot(comm,
    \$here, \$false, \$true, \\
    \ \ \ \  $\bigwedge{}\Gamma$,  $\bigwedge{}\Upsilon$, ''f''); \\
    \ \ MPI\U{}Barrier(comm);\\
    \ \ \$state* \U{}s\U{}pre = \$collate\U{}get\U{}state(\U{}cs\U{}pre);\\
    \ \ \$assert($\phi$); \\
    \ \ MPI\U{}Barrier(comm);\\
    \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}pre);\\
    \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}post); \\
    \rb{}
  \end{program}
  \caption{Transformation template for generating \texttt{f\U{}driver}
    from a function \texttt{f} with its contract. The upper shows the
    original annotated function \texttt{f} and the lower shows the
    generated \texttt{f\U{}driver}.}
  \label{fig:mpi:impl:transform:driver}
\end{figure}


\textbf{Summarizing Function Behavior From Contract.}  For every
contracted function, including the verifying function, its original
function body will be replaced by the transformer with a CIVL-C code
that summarizes the function from the function contract.  The template
for generating such a new function body from a contract is given in
Fig.\ \ref{fig:mpi:impl:transform:callee}.

In this template, a pair of collective pre- and post-state of the
function \texttt{f} are created through calls to
\texttt{\$mpi\U{}snapshot}.  The state-requirement and -guarantee will
be evaluated on these two states respectively once they are completed.
For each MPI process, it shall not wait at the location until a
collective state is completed because if doing so, the semantics of
the program gets changed.  Therefore, an MPI process will ``fork'' a
new ``deamon'' process that will asynchronously waits for the
completion of a collective state and then evaluates the
state-requirement or -guarantee on the state.  The ``deamon'' process
is created by the \texttt{\$run} statement (line 10 and 21).  The
``deamon'' process will be blocked by the \texttt{\$when} statement
(line 11 and 22), in which the guard is a system function
\texttt{\$collate\U{}complete($c$)} that returns true iff the collate
referred by $c$ is completed.  Once the guard evaluates to true, the
``deamon'' process asserts (or assumes) the state-requirement (or the
state-guarantee) from the completed collective state.  The collective
state is given by a pointer, which points to the \texttt{\$state}
object in a collate (line 9 and 20).  The ``deamon'' process
terminates automatically after the assertion (or assumption).

In addition to making assertions and assumptions, the objects listed
in \texttt{assigns} clauses will be ``refreshed'' by the
\texttt{\$havoc} calls.

A special variable \texttt{\$result} is created to represent the
returned value of \texttt{f}.  The expression \texttt{\BSL{}result} in
the contract will be replaced by \texttt{\$result}.

The running process will be blocked (at line 24) if it has to wait for
some other processes $p$ that have not ``entered'' this function.
Recall that the entering-function event is represented by the
transition that an MPI process marks itself on the collate associated
to the collective pre-state.  This code realizes the guarantees in the
contract of the form: \texttt{\mywaitsforMPI{$\mymath{p}$}}.

\begin{figure}
  \begin{program}
    T3 g, ...\ ;\\
    T f(MPI\U{}Comm comm, T1 a, T2 *b, ...\ ) \lb{} \\
    \ \ \ccode{int} \$mpi\U{}comm\U{}rank, \$mpi\U{}comm\U{}size;\\
    \ \ MPI\U{}Comm\U{}rank(comm, \&\$mpi\U{}comm\U{}rank); \\
    \ \ MPI\U{}Comm\U{}size(comm, \&\$mpi\U{}comm\U{}size); \\
\\
  \ \ \$collate\U{}state \U{}cs\U{}pre = \$mpi\U{}snapshot(comm,
  \$here, \$true, \$false, \\
  \ \ \ \ $\bigwedge{}\Gamma$, $\bigwedge{}\Upsilon$, ''f''); \\
  \ \ \$state* \U{}s\U{}pre =
  \$collate\U{}get\U{}state(\U{}cs\U{}pre); \\
  \ \ \$run \\
  \ \ \ \ \$when (\$collate\U{}complete(\U{}cs\U{}pre))  \\
  \ \ \ \ \ \  \$with (*\U{}s\U{}pre) \$assert($\psi$); \\
  \\
  \ \ T \$result; \\
  \ \  $\mymath{\forall{d \in \Delta}.\, }$
  \$havoc(\&$d$);\\
  \ \ \$havoc(\&\$result);\\
\\
  \ \ \$collate\U{}state \U{}cs\U{}post = \$mpi\U{}snapshot(comm,
  \$here, \$false, \$false, \\
  \ \ \ \ $\bigwedge{}\Gamma$, $\bigwedge{}\Upsilon$, ''f''); \\
  \ \ \$state* \U{}s\U{}post =
  \$collate\U{}get\U{}state(\U{}cs\U{}post); \\
  \ \ \$run \\
  \ \ \ \ \$when (\$collate\U{}complete(\U{}cs\U{}post)) \\
  \ \ \ \ \ \  \$with (*\U{}s\U{}post) \$assume($\phi$);\\
  \ \ \$when(\$collate\U{}arrived(\U{}cs\U{}pre, $\mymath{p}$)); \\
  \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}pre);\\
  \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}post);\\
\rb{}
\end{program}
\caption{The transformation template for summarizing \texttt{f} with
  respect to its contract.}
\label{fig:mpi:impl:transform:callee}
\end{figure}

\begin{comment}
\begin{exmp}
  For the function \texttt{ring} given in Fig.\ \ref{}, we present the
  \texttt{ring\U{}driver} function in Fig.\
  \ref{exmp:mpi:contract:ring} and the transformed \texttt{ring}
  function in Fig.\ \ref{exmp:mpi:impl:ring-driver}. $\qed$
\end{exmp}


\begin{figure}
  \begin{program}
    \ccode{int} rank, size;\\
    ...\ \\
    \ccode{void} ring\U{}driver() \lb{}\\
    \ \   \ccode{int} \$mpi\U{}comm\U{}rank, \$mpi\U{}comm\U{}size; \\
    \ \   MPI\U{}Comm comm;\\
    \ \   MPI\U{}Comm\U{}dup(MPI\U{}COMM\U{}WORLD, comm);\\
    \ \   MPI\U{}Comm\U{}rank(comm, \&\$mpi\U{}comm\U{}rank); \\
    \ \ MPI\U{}Comm\U{}size(comm,   \&\$mpi\U{}comm\U{}size);  \\
    \\
    \ \  \$havoc(\&rank); \$havoc(\&size);\\
    \ \  MPI\U{}Barrier(comm);\\
    \ \ \$assume(rank == \$mpi\U{}comm\U{}rank \&\& size ==
    \$mpi\U{}comm\U{}size);\\
    \ \  MPI\U{}Barrier(comm);\\
    \ \ \$collate\U{}state \U{}cs \U{}pre =
    \$mpi\U{}snapshot(comm, \$here, 1, 1, 1, 1, \\
    \ \ \ \ ''ring'');\\
    \\
    \ \  \$write\U{}set\U{}push();\\
    \ \ ring\U{}origin();\\
    \ \ \$mem m = \$write\U{}pop\U{}push();\\
    \ \ \$assert(\$mem\U{}contains((\$mem)\&buf, m));\\
    \\
    \ \ \$collate\U{}state \U{}cs \U{}post =
    \$mpi\U{}snapshot(comm, \$here, 0, 1, 1, \\
    \ \ \ \ \$abent\U{}exit((rank + 1) \% size), ''ring'');\\
    \ \  MPI\U{}Barrier(comm);\\
    \ \ \$state * \U{}s\U{}post = \$collate\U{}get\U{}state(\U{}cs\U{}post);\\
    \ \ \$assert(\$value\U{}at(buf, *\$state\U{}post, (rank + 1) \%
    size) == data);\\
        \ \  MPI\U{}Barrier(comm);\\
    \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}pre); \\
    \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}post);\\
    \rb{}\\
  \end{program}
  \caption{The \texttt{ring\U{}driver} function created by the
    contract transformer.  This function verifies that whether
    \texttt{ring} satisfies its function contract.}
  \label{exmp:mpi:impl:ring-driver}
\end{figure}

  
\begin{figure}
  \begin{program}
    \ccode{int} ring() \lb{}  \\
    \ \   \ccode{int} \$mpi\U{}comm\U{}rank, \$mpi\U{}comm\U{}size; \\
    \ \   MPI\U{}Comm\U{}rank(comm, \&\$mpi\U{}comm\U{}rank); \\
    \ \ MPI\U{}Comm\U{}size(comm,
    \&\$mpi\U{}comm\U{}size);   \\
    \ \ \$collate\U{}state \U{}cs\U{}pre =
    \$mpi\U{}snapshot(comm, \$here, 1, 0, 1, 1, \\
    \ \ \ \ ''ring''); \\
    \ \ \$state* \U{}s\U{}pre =
    \$collate\U{}get\U{}state(\U{}cs\U{}pre);\\
    \ \ \$run \$when (\$collate\U{}complete(\U{}cs\U{}pre)) \\
    \ \ \ \  \$with (*\U{}s\U{}pre) \\
    \ \ \ \ \ \ \$assert(rank == \$mpi\U{}comm\U{}rank \&\& size ==
    \$mpi\U{}comm\U{}size);\\
    \ \ \$havoc(\&buf); \\ \\
    \ \ \$collate\U{}state \U{}cs\U{}post = \$mpi\U{}snapshot(comm,
    \$here, 0, 0, 1, \\
    \ \ \ \ \$absent\U{}exit((rank + 1) \% size), ''ring''); \\
    \ \ \$state* \U{}s\U{}post =
    \$collate\U{}get\U{}state(\U{}cs\U{}post);\\
    \ \ \$run \$when (\$collate\U{}complete(\U{}cs\U{}post)) \\
    \ \ \ \  \$with (*\U{}s\U{}post) \\
    \ \ \ \ \ \ \$assume(\$value\U{}at(buf, \$state\U{}null, (rank +
    1) \% size) == data);\\
    \ \ \$when (\$collate\U{}arrived(
    \U{}cs\U{}pre, (rank + 1) \% size,)); \\
    \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}pre); \\
    \ \ \$mpi\U{}unsnapshot(\U{}cs\U{}post); \\
    \rb{}\\
  \end{program}
  \caption{The \texttt{ring} function after transformation.  The
    original function body is replaced with CIVL-C code that
    summarizes the behavior of \texttt{ring} with respect to its
    contract.}
  \label{exmp:mpi:impl:ring}
\end{figure}
\end{comment}
