%% Messaga-passing is hard to get right

%% Verification for Message-passing mostly are state-searching based but suffers state space
%% explosiotn

%% model message=passing in CIVL, CIVL has the same problem

%% a divide and conquer approach: advantages

%% collective-style feature, the approach is based on collective-style procedures
%% real applications use collective style procedures

%%  overview of HPC, MPI and reliability

Message-passing is used prevasively in high-performance computing
(HPC).  This is expected to continue for the foreseeable future: a
recent survey, for example, found that 93\% of applications being
developed under the U.S.\ Department of Energy's Exascale Computing
Project are currently using or plan to use the Message Passing
Interface (MPI, \cite{...}).

These applications deal with areas of fundamental importance
to science, engineering and society more generally.
They include modeling of earthquakes to predict structural
damage (DOI: 10.1109/MCSE.2017.3421558) ...

These applications are typically run on supercomputers.  These
machines are extremely expensive to build, maintain, and operate.
Executing erroneous versions of programs or debugging at scale on
these machines is a therefore a waste of valuable resources.  More
importantly, the results computed by the applications are used to
inform decisions of the utmost importance to society, such as how to
manage climate change, ....[other examples, medical?  weather
prediction, the design of nuclear reactors, climate modeling, the
design of aircraft and buildings?].  For all of these reasons, the
ability to verify correctness of HPC applications is of paramount
importance.

--------


From here, I think the distinction you are drawing between automated
and deductive is not quite right.   These are the important points
I think you should make:

1.  There are many aspects to correctness.   There are generic
properties related to the programming language or concurrency APIs
used.  For example, any correct C program should not ....   Programs
should not deadlock.   Etc.

The techniques described in this disseration will deal with these
issues, but the focus is more on *functional correctness* --- that is,
the program is actually computing what the programmer intends.   
This requires some form of *specification*.

Many approaches have been taken to functional specification and
verification of programs.  One of the most successful involves
*contracts*.  Contracts decompose the spec/ver. problem into smaller
problems that be worked in independently.  Procedure
contracts... (brief idea of what they are, mention LARCH, JML, ACSL).

Verifying contracts --- different approaches have been taken.

All of above research has primarily involved sequential programs.
Very little on parallel programs, and almost none on message-passing
parallel programs.



2.  It is not at obvious how to extend the ideas to message-passing.
For a procedure called by all processes in an MPI programs --- each
process may enter the procedure at a different time, and may exit at
different times.  There is not necessarily a single ``pre-state'' and
``post-state''.

The first contribution of this dissertation is the development of a
theory of contracts for message passing programs.  New concepts...  a
precise semantics to these contracts.  The main idea is the
``collective'' point of view.  [CAN YOU GIVE A STATISTIC ON HOW
PREVALENT COLLECTIVE PROCEDURES ARE IN MPI PROGRAMS?  We once had
Yihao do a study, I think on AMG, and found that almost all procedures
are collective.  This would be a very nice stat to give.]

Second contribution is a method to verify such contracts.  Instead of
deductive reasoning (explain what that is, pros and cons) you will use
bounded model checking and symbolic execution.  **The great advantage
of this approach is that it does not require the user to formulate
complex invariants**.  The drawback is that you must place a small
upper bound on the number of processes.  Talk about the small scope
hypothesis and why it is effective.

Third: the theoretical contributions above are formulate in a simple
'toy' message-passing language (MiniMP).  This allows the ideas and
results to be presented and proved with mathematical precisions.
However the real world is much more messy.  MPI involves things like
multiple communicators, buffers specified with void * pointers,
datatypes specified as value, ....   

The third contribution is the development of a specification language
extending ACSL (which applies to sequential C programs) for C/MPI
programs.    ....

Fourth: Finally, the theoretical techniques for verifying such
contracts have been realized in the CIVL verification platform.
Briefly describe what this involved.
The approach has been evaluated by applying to a number of different
simple C/MPI programs, ......


Now go to the organization of the dissertation.

-----




has been and will continue
\cite{bernholdt-etal:2018:mpi_exascale} to be pervasively used to
parallelize high-performance computing (HPC) applications such as the
Monte Carlo particle transport code OpenMC \cite{ROMANO201590},
\cite{osti_1389816} [this is AMG?  That's a benchmark, not an
application], the PETSc library for solving partial differential
equations \cite{petsc-user-ref} [better reference?  maybe also web
page?], the hypre library for solving sparse linear systems with
preconditioning \cite{10.1007/3-540-47789-6_66}, and
\cite{tramm2014xsbench} [this is XSBench which is a
micro-benchmark]. These applications are normally run on
supercomputers.  Each of such runs is expensive.  Besides, the
computation work carried out by an HPC application is critical, such
as climate predication \cite{Drake2005OverviewOT}.  Therefore, the
reliability of the message-passing HPC applications is profoundly
concerned by researchers and developers
\cite{gopalakrishnan_hovland_iancu_krishnamoorthy_laguna_lethin_sen_siegel_solar-lezama_2017}.


%% meaning of correctness
A program is considered \emph{reliable}, if it always behave as
expected.  The expectation to a program is called the
\emph{specification} of the program.  A program is reliable if it
\emph{satisfies} its specification.  A specification describes the
\emph{properties} of a program without ambiguity.  For example, a
common property for message-passing parallel programs is deadlock
freedom; free of invalid pointer dereferencing is a common C
\cite{iso:2011:c11} program property.

%% hard to debug and test, verification approaches are more effective:
The behaviors of a message-passing programs is asynchronous and
nondeterministic, so traditional software testing methods can hardly
cover all possible behaviors of a message-passing program
\cite{Pervez:2010:FMA:1712067.1712069, godefroid2008concurrency}.  The
research work in recent decades \cite{vakkalanka2008dynamic,
Yu:2018:CSE:3183440.3190336, khanna2018dynamic, siegel2011tass,
Luo:2017:VMP:3127024.3127032, Strout:2006:DAM:1156433.1157634} shows
that static analysis and formal verification approaches are effective
in finding bugs in message-passing concurrent systems.  These
approaches can roughly be divided into two kinds: \emph{automated} and
\emph{deductive}.

Automated analysis and verification approaches require little effort
from the user side and is able to check a set of pre-defined
\emph{standard} properties for a message-passing program.  For
example, the ISP \cite{vakkalanka2008dynamic} model checker, which
uses the model checking \cite{Clarke:2000:MC:332656} technique, takes
a message-passing program as an input and verifies whether the program
is free of deadlock or assertion violation.  Message-passing
applications usually have large codebases, hence automated approaches
are attractive to developers.

In contrast, deductive verification approaches \cite{luo2018towards,
Lopez:2015:PVM:2814270.2814302, oortwijn2016future} require the user
to provide a specification for the verifying program.  The
specification usually is written in a specific programming language so
that it can be understood by computers.  By writting a specification,
the user is able to express rich and customized properties.  A
deductive verification tool performs parameteric verification for a
program against its specification in a composite and modular way.

Both kinds of the verification approaches have their disadvantages.
Automated approaches are mainly based on finite-state searching or
static analysis techniques.  The former technique requires the program
parameters to be bounded in small scopes.  The latter technique
produces false positives.  For deductive approaches, the human effort
that needs to be made is not only excessive but also requiring expert
knowledge.


%% what does this dissertation do ?
The main contribution of this dissertation is the attempt to combine
these two kinds of approaches to verify message-passing programs.  In
summary, our approach is to divide a message-passing program into a
set of \emph{collective-style} procedures.  The division is done via
assigning each collective-style procedure a specification.  By
verifying the satisfaction to the specification for each
\emph{collective-style} procedure independently, the whole program is
thus verified.  The verification of each procedure is based on
automated approaches.

%% the advantage of the combinatorial approach ?
Our approach has three main advantages.  First, it enables composite
and modular verification.  Verifying each of the divided procedures is
a much smaller problem than the verification of the whole program.
During the verification of a single procedure, all other procedures
will be assumed being verified and their verification results will be
re-used.  Second, instead of dividing a message-passing program into
arbitrary procedures, the collective-style based division allows the
user to pay more attention to the inputs and outputs of procedures
than the interaction among processes.  All the non-collective style
procedures are automatically executed during the verification.
Finally, encoding specifications in a programming language encourages
the developers to document and be careful with their code.


This approach is implemented in a general verification framework, CIVL
\cite{siegel-etal:2015:civl_sc}.  The implementation is evaluated with
MPI programs.  MPI, the Message-Passing Interface
\cite{mpi-forum:2012:mpi30}, is the \emph{de-facto} standard of
message-passing programming in HPC.  In order to specify properties
for MPI programs, we extended and designed a \emph{contract language}
for C/MPI programs from an existing specification language for
sequential C programs.  This language is another contribution of this
dissertation.


\textbf{Outline.}  The dissertation is composed of three parts.

Part I includes chapters for preliminary information. Chapter
\ref{chp:background-related} introduces background knowledge and
related work.  Chapter \ref{chp:notation} gives general notations that
will be used through the dissertation.

In Part II, we describe our verification approach in theory.  For
simplicity and precision, we introduce a simple message-passing
programming language, MiniMP, to help illustrating the idea.  The
MiniMP language is formally described in Chapter \ref{chp:minimp}.  In
Chapter \ref{chp:minimp:spec}, we designed a specification language
for MiniMP.  The correctness of a MiniMP program then can be expressed
in this specification language.  Chapter \ref{chp:minimp:rules}
presents an inference system, which consists of four rules, for
reasoning about MiniMP programs with specifications.  Based on the
inference system, Chapter \ref{chp:minimp:system} talks about a
contract system for verifying MiniMP programs using model checking and
symbolic execution techniques.

Part III focus on bringing the theoretical approach to practice.
Chapter \ref{chp:mpi:model} gives an overview on the CIVL framework,
on which a contract system for verifying C/MPI programs will be built.
In Chapter \ref{chp:mpi:system}, we introduce our \emph{MPI contract
language} and the implementation of an C/MPI contract system.  Chapter
\ref{chp:mpi:eval} presents our evaluation of the contract language
and system.  It also includes a discussion on the experimental
results.

Finally, we conclude this dissertation in Chapter \ref{chp:conclusion}.
