source: CIVL/doc/manual/part-language.tex@ d87ec9c

1.23 2.0 acw/focus-triggers main test-branch
Last change on this file since d87ec9c was f3e58ea, checked in by Manchun Zheng <zmanchun@…>, 12 years ago

added $equals to manual.

git-svn-id: svn://vsl.cis.udel.edu/civl/trunk@1239 fb995dde-84ed-4084-dfe6-e5aef3e2452c

  • Property mode set to 100644
File size: 66.2 KB
Line 
1\part{Language}
2\label{part:lang}
3
4\chapter{Overview of CIVL-C}
5
6\section{Main Concepts}
7
8CIVL-C is an extension of a subset of the C11 dialect of C. It
9includes the most commonly-used elements of C, including most of the
10syntax, types, expressions, and statements. Missing are some of the
11more esoteric type qualifiers, bitwise operations (at least for now),
12and much of the standard library. Moreover, none of the C11 language
13elements dealing with concurrency are included, as CIVL-C has its own
14concurrency primitives.
15
16The keywords in CIVL-C not already in C begin with the symbol \cckey.
17This makes them readily identifiable and also prevents any naming
18conflicts with identifiers in C programs. This means that most legal
19C programs will also be legal CIVL-C programs.
20
21One of the most important features of CIVL-C not found in standard C
22is the ability to define functions in any scope. (Standard C allows
23function definitions only in the file scope.) This feature is also
24found in GNU C, the GNU extension of C.
25
26Another central CIVL-C feature is the ability to \emph{spawn}
27functions, i.e., run the function in a new \emph{process} (thread).
28
29\emph{Scopes} and \emph{processes} are the two central themes of
30CIVL-C. Each has a static and a dynamic aspect. The static scopes
31correspond to the lexical scopes in the program---typically, regions
32delimited by curly braces \lb \ldots \rb. At runtime, these scopes
33are \emph{instantiated} when control in a process reaches the
34beginning of the scope. Processes are created dynamically by
35\emph{spawning} functions; hence the functions are the static
36representation of processes.
37
38\section{Example Illustrating Scopes and Processes}
39
40To understand the static and dynamic nature of scopes and processes,
41and the relations between them, we consider the (artificial) example
42code of Figure \ref{fig:scopecodeex}. The static scopes in the scope
43are numbered from $0$ to $6$.
44
45\begin{figure}[t]
46 \centering
47 \includegraphics[scale=1.2]{scopeCodeExample}
48 \caption{CIVL-C code skeleton to illustrate scope hierarchy}
49 \label{fig:scopecodeex}
50\end{figure}
51
52\begin{figure}
53 \centering
54 \includegraphics[scale=1.2]{scopeStateExample}
55 \caption{Static scope tree and a state for example program}
56 \label{fig:scopestateex}
57\end{figure}
58
59The static scopes have a tree structure: one scope is a child of
60another if the first is immediately contained in the second. Scope 0,
61which is the file scope (or \emph{root} scope) is the root of this
62tree. The static scope tree is depicted in Figure
63\ref{fig:scopestateex} (left). Each scope is identified by its
64integer ID. Additionally, if the scope happens to be the scope of a
65function definition, the name of the function is included in this
66identifier. A node in this tree also shows the variables and
67functions declared in the scope. For brevity, we omit the \emph{proc}
68variables.
69
70We now look at what happens when this program executes. Figure
71\ref{fig:scopestateex} (right) illustrates a possible state of the
72program at one point in an execution. We now explain how
73this state is arrived at.
74
75First, there is an implicit \emph{root function} placed around the
76entire code. The body of the \emph{main} function becomes the body
77of the root function, and the \emph{main} function itself disappears.
78This minor transformation does not change the structure of the scope
79tree.
80
81Execution begins by spawning a process $p_0$ to execute the root
82function. This causes scope $0$ to be instantiated. An instance of a
83static scope is known as a \emph{dynamic scope}, or \emph{dyscopes}
84for short. The dynamic scopes are represented by the ovals with
85double borders on the right side of Figure \ref{fig:scopestateex}.
86Each dyscope specifies a value for every variable declared in the
87corresponding static scope. In this case, the value $3$ has been
88assigned to variable \texttt{x}.
89
90The state of process $p_0$ is represented by a \emph{call stack}
91(green). The entries on this stack are \emph{activation frames}.
92Each frame contains two data: a reference to a dyscope (indicated by
93blue arrows) and a current location (or programmer counter vaule) in
94the static scope corresponding to that dyscope (not shown). The
95dyscope defines the environment in which the process evaluates
96expressions and executes statements. The currently executing function
97of a process, corresponding to the top frame in the call stack, can
98``see'' only the variables in its dyscope and those of all the
99ancestors of its dyscope in the dyscope tree.
100
101Returning to the example, $p_0$ enters scope 6, instanitating that
102scope, and then spawns procedure \texttt{f}. This creates process
103$p_1$, with a new stack with a frame pointing to a dyscope
104corresponding to static scope 1. The new process proceed to run
105concurrently with $p_0$. Meanwhile, $p_0$ calls procedure \texttt{g},
106which pushes a new entry onto its call stack, and instantiates scope
1075. Hence $p_0$ has two entries on its stack: the bottom one pointing
108to the instance of scope 6, the top one pointing to the instance of
109scope 5.
110
111Meanwhile, assume $\texttt{x}>0$, so that $p_1$ takes the \emph{true}
112branch of the \texttt{if} statement, instantiating scope 3 under the
113instance of scope 1. It then spawns two copies of procedure
114\texttt{f1}, creating processes $p_2$ and $p_3$ and two instances of
115scope 2. Then $p_1$ spawns \texttt{f2}, creating process $p_4$ and an
116instance of scope 4. Note that the instance of scope 4 is a child of
117the instance of scope 3, since the (static) scope 4 is a child of
118scope 3. Finally, $p_1$ calls \texttt{f2}, pushing a new entry on its
119stack and creating another instance of scope 4. The final state
120arrived at is the one shown.
121
122There are few key points to understand:
123\begin{itemize}
124\item In any state, there is a mapping from the dyscope tree to the
125 static scope tree which maps a dyscope to the static scope of which
126 it is an instance. This mapping is a \emph{tree homomorphism},
127 i.e., if dyscope $u$ is a child of dyscope $v$, then the static
128 scope corresponding to $u$ is a child of the static scope
129 corresponding to $v$.
130\item A static scope may have any number of instances, including 0.
131\item Dynamic scopes are created when control enters the corresponding
132 static scope; they disappear from the state when they become
133 unreachable. A dyscope $v$ is ``reachable'' if some process has a
134 frame pointing to a dyscope $u$ and there is a path from $u$ up to
135 $v$ that follows the parent edges in the dyscope tree.
136\item Processes are created when functions are spawned; they disappear
137 from the state when their stack becomes empty (either because the
138 process terminates normally or invokes the \emph{exit} system
139 function).
140\end{itemize}
141
142\section{Structure of a CIVL-C program}
143
144A CIVL-C program is structured very much like a standard C program.
145In particular, a CIVL-C program may use the preprocessor directives
146specified in the C Standard, and with the same meaning. A source
147program is preprocessed, then parsed, resulting in a translation unit,
148just as with standard C. The main differences are the nesting of
149function definitions and the new primitives beginning with
150\texttt{\$}, which are described in detail in the remainder of this
151part of the manual.
152
153A CIVL-C program must begin with the line
154\begin{verbatim}
155#include <civlc.h>
156\end{verbatim}
157which includes the main CIVL-C header file, which declares all the
158types and other CIVL primitives.
159
160As usual, a translation unit consists of a sequence of variable
161declarations, function prototypes, and function definitions in file
162scope. In addition, \emph{assume} statements may occur in the file
163scope. These are used to state assumptions on the input values
164to a program.
165
166\chapter{Sequential Elements}
167
168In this chapter we describe the main sequential elements of the
169language. For the most part these are the same as in C.
170Primitives dealing with concurrency are introduced in Chapter
171\ref{chap:concurrency}.
172
173\section{Types}
174
175\subsection{Standard types inherited from C}
176
177The boolean type is denoted \verb!_Bool!, as in C. Its values are $0$
178and $1$, which are also denoted by $\cfalse$ and $\ctrue$,
179respectively.
180
181There is one integer type, corresponding to the mathematical integers.
182Currently, all of the C integer types \texttt{int}, \texttt{long},
183\texttt{unsigned\ int}, \texttt{short}, etc., are mapped to the CIVL
184integer type.
185
186There is one real type, corresponding to the mathematical real
187numbers. Currently, all of the C real types \texttt{double},
188\texttt{float}, etc., are mapped to the CIVL real type.
189
190Array types, \texttt{struct} and \texttt{union} types, \texttt{char},
191and pointer types (including pointers to functions) are all exactly as
192in C.
193
194% \subsection{The heap type $\cheap$ and handles}
195
196% Unlike C, a CIVL-C program does not necessarily have access to a
197% single, global heap. Instead, there is a $\cheap$ type, and heaps may
198% be declared explicitly wherever they are needed. Hence a CIVL-C
199% program may have several heaps, and these may exist in different
200% scopes.
201
202% A heap is declared and created as follows:
203% \begin{verbatim}
204% $heap h = $heap_create();
205% \end{verbatim}
206% The function \verb!$heap_create()! creates a new empty heap in the
207% current scope and returns a \emph{handle} to that heap. A handle is
208% like a pointer: it is a reference to another object. However, a handle
209% is much more restricted than a general pointer. In particular, it
210% cannot be dereferenced (by the \ct{*} operator). The underlying heap
211% object can only be accessed by using a handle to it as an argument to
212% a system function.
213
214% Handles can be used in assignments and passed as arguments to functions.
215% For example, this declaration could follow the one above:
216% \begin{verbatim}
217% $heap h2=h;
218% \end{verbatim}
219% After executing this code, \ct{h2} and \ct{h} will be aliased, i.e., the two
220% handles will refer to the same heap object.
221
222% The heap object exists in the scope in which it is created. In
223% particular, it will disappear when that scope disappears, i.e., when
224% control reaches the right curly brace that defines the end of the
225% scope. At that point, any references into the heap become invalid.
226
227% The following system functions deal with heaps:
228% \begin{verbatim}
229% void* $malloc($heap h, int size);
230% void free(void *p)
231% \end{verbatim}
232% The first function is like C's \texttt{malloc}, except that you
233% specify the heap in which the allocation takes place.
234% This modifies the specified heap and returns a pointer to the new object.
235% The function can only occur in a context in which the type of the object is
236% specified, as in:
237% \begin{verbatim}
238% $heap h;
239% int n = 10;
240% double *p = (double*)$malloc(h, n*sizeof(double));
241% \end{verbatim}
242% The function \ct{free} is exactly the same as in C. Note that
243% \texttt{free} modifies the heap which was used to allocate \texttt{p}.
244
245
246\subsection{The bundle type: \cbundle}
247
248CIVL-C includes a type named \cbundle. A bundle is basically a
249sequence of data, wrapped into an atomic package. A bundle is created
250using a function that specifies a region of memory. One can create a
251bundle from an array of integers, and another bundle from an array of
252reals. Both bundles have the same type, \cbundle. They can therefore
253be entered into an array of \cbundle, for example. Hence bundles are
254useful for mixing objects of different (even statically unknown) types
255into a single data structure. Later, the contents of a bundle can be
256extracted with another function that specifies a region of memory into
257which to unpack the bundle; if that memory does not have the right
258type to receive the contents of the bundle, a runtime error is
259generated.
260
261\begin{figure}
262\begin{verbatim}
263/* Creates a bundle from the memory region specified by ptr and size,
264 * copying the data into the new bundle */
265$bundle $bundle_pack(void *ptr, int size);
266
267/* Returns the size (number of bytes) of the bundle */
268int $bundle_size($bundle b);
269
270/* Copies the data out of the bundle into the region specified */
271void $bundle_unpack($bundle bundle, void *ptr);
272\end{verbatim}
273 \caption{The \emph{bundle} abstract data type}
274 \label{fig:bundle}
275\end{figure}
276
277The relevant functions for creating and manipulating bundles
278are given in Figure \ref{fig:bundle}.
279
280\subsection{The \cscope{} type}
281\label{sec:scopetype}
282
283An object of type $\cscope$ is a reference to a dynamic scope. It may
284be thought of as a ``dynamic scope ID, '' but it is not an integer and
285cannot be converted to an integer. Operations defined on scopes are
286discussed in Section \ref{sec:scopeexpr}.
287
288\subsection{The \crange{} and \cdomain{} types}
289
290CIVL-C provides certain abstract datatypes that are useful for
291representing iteration spaces of loops in an abstract way.
292
293First, there is a built-in type $\crange$. An object of this type
294represents an ordered set of integers. There are expressions for
295specifying range values; these are described in Section
296\ref{sec:range_expr}. Ranges are typically used as a step
297in constructing \emph{domains}, described next.
298
299A domain type is used to represent a set of tuples of integer values.
300Every tuple in a domain object has the same arity (i.e., number of
301components). The arity must be at least 1, and is called the
302\emph{dimension} of the domain object.
303
304For each integer constant expression $n$, there is a type
305\cdomainof{\(n\)}, representing domains of dimension $n$.
306% \texttt{\cdomain{}(}]\(n\)\texttt{)}
307The \emph{universal domain type}, denoted \cdomain{}, represents
308domains of all positive dimensions, i.e., it is the union over all
309$n\geq 1$ of \cdomainof{\(n\)}. In particular, each \cdomainof{\(n\)}
310is a subtype of \cdomain{}.
311
312There are expressions for specifying domain values; these are
313described in Section \ref{sec:domain_expr}. There are also certains
314statements that use domains, such as the ``CIVL-\emph{for}'' loop
315\cfor; see Section \ref{sec:cfor}.
316
317
318\section{Expressions}
319
320\subsection{Expressions inherited from C}
321
322The following C expressions are included in CIVL:
323\begin{itemize}
324\item \emph{constant} expressions
325\item \emph{identifier} expressions (\texttt{x})
326\item parenthetical expressions (\verb!(e)!)
327\item numerical \emph{addition} (\verb!a+b!), \emph{subtraction} (\verb!a-b!),
328 \emph{multiplication} (\verb!a*b!), \emph{division} (\verb!a/b!),
329 \emph{unary plus} (\verb!+a!), \emph{unary minus} (\verb!-a!),
330 \emph{integer division} (\verb!a/b!) and \emph{modulus} (\verb!a%b!),
331 all with their ideal mathematical interpretations
332\item array \emph{index} expressions (\verb!a[e]!) and struct or union
333 \emph{navigation} expressions (\verb!x.f!, \verb!p->f!)
334\item \emph{address-of} (\verb!&e!), pointer \emph{dereference} (\verb!*p!),
335 pointer \emph{addition} (\verb!p+i!) and \emph{subtraction} (\verb!p-q!)
336 expressions
337\item relational expressions (\verb!a==b!, \verb~a!=b~, \verb!a>=b!,
338 \verb!a<=b!, \verb!a<b!, \verb!a>b!)
339\item logical \emph{not} (\verb~!p~), \emph{and} (\verb!p&&q!), and
340 \emph{or} (\verb!p||q!)
341\item \emph{sizeof} a type (\verb!sizeof(t)!) or expression (\verb!sizeof(e)!)
342\item \emph{assignment} expressions (\verb!a=b!, \verb!a+=b!, \verb!a-=b!,
343 \verb!a*=b!, \verb!a/=b!, \verb!a%=b!, \verb!a++!, \verb!a--!)
344\item function \emph{calls} \verb!f(e1,...,en)!
345\item \emph{conditional} expressions (\verb!b ? e : f!).
346\item \emph{cast} expressions (\verb!(t)e!)
347\end{itemize}
348
349Bit-wise operations are not yet supported.
350
351\subsection{Scope expressions}
352\label{sec:scopeexpr}
353
354As mentioned in Section \ref{sec:scopetype}, CIVL-C provides a type
355\cscope. An object of this type is a reference to a dynamic scope.
356Several constants, expressions, and functions dealing with the
357\cscope{} type are also provided.
358
359The $\cscope$ type is like any other object type. It may be used as
360the element type of an array, a field in a structure or union, and so
361on. Expressions of type $\cscope$ may occur on the left or right-hand
362sides of assignments and as arguments in function calls just like any
363other expression. Two different variables of type $\cscope$ may be
364aliased, i.e., they may refer to the same dynamic scope.
365
366A dynamic scope $\delta$ is \emph{reachable} if there exists a path
367which starts from the dyscope referenced by some frame on the call
368stack of a process, follows the parent edges in the dyscope tree, and
369terminates in $\delta$. If a dyscope is not reachable, it can never
370become reachable, and it cannot have any effect on the subsequent
371execution of the program.
372
373Normally, a dynamic scope will eventually become unreachable. At some
374point after it becomes unreachable, it will be collected in a
375garbage-collection-like sweep, and any existing references to that
376scope will become \emph{undefined}. An object of type $\cscope$ is
377also undefined before it is initialized. Any use of an undefined
378value is reported as an error by CIVL, so it is important to be sure
379that a scope variable is defined before using it.
380
381
382\subsubsection{Checking if a dyscope is defined: \cscopedefined}
383
384The system function \cscopedefined{} has signature
385\begin{verbatim}
386 _Bool $scope_defined($scope s);
387\end{verbatim}
388It returns \emph{true} if the dynamic scope specified by \texttt{s} is
389defined, else it returns \emph{false}.
390
391\subsubsection{The constant \chere}
392
393A constant \chere{} exists in every scope. This constant has
394type \cscope{} and refers to the dynamic scope in which it is
395contained. For example,
396\begin{verbatim}
397 { // scope s
398 int *p = (int*)$malloc($here, n*sizeof(int));
399 }
400\end{verbatim}
401allocates an object consisting of $n$ ints in the scope $s$.
402
403\subsubsection{The constant \cscoperoot{}}
404
405There is a global constant \cscoperoot{} of type $\cscope$ which
406refers to the root dynamic scope.
407
408
409\subsubsection{Scope relational operators}
410
411Let $s_1$ and $s_2$ be expressions of type \cscope. The following are
412all CIVL-C expressions of boolean type:
413\begin{itemize}
414\item $s_1$ \ct{==} $s_2$. This is \emph{true} iff $s_1$ and $s_2$
415 refer to the same dynamic scope.
416\item $s_1$ \ct{!=} $s_2$. This is \emph{true} iff $s_1$ and $s_2$
417 refer to different dynamic scopes.
418\item $s_1$ \ct{<=} $s_2$. This is \emph{true} iff $s_1$ is equal to
419 or a descendant of $s_2$, i.e., $s_1$ is equal to or contained in $s_2$.
420\item $s_1$ \ct{<} $s_2$. This is \emph{true} iff $s_1$ is a strict
421 descendant of $s_2$, i.e., $s_1$ is contained in $s_2$ and is not
422 equal to $s_2$.
423\item $s_1$ \ct{>} $s_2$. This is equivalent to $s_2$ \ct{<} $s_1$.
424\item $s_1$ \ct{>=} $s_2$. This is equivalent to $s_2$ \ct{<=} $s_1$.
425\end{itemize}
426If $s_1$ or $s_2$ is undefined in any of these expressions, an error
427will be reported.
428
429\subsubsection{Scope parent function \texorpdfstring{\cscopeparent}{\$scope\_parent}}
430
431The system function
432\begin{verbatim}
433 $scope $scope_parent($scope s);
434\end{verbatim}
435returns the parent dynamic scope of the dynamic scope referenced by
436\ct{s}. If \ct{s} is the root dynamic scope, it returns the undefined
437value of type $\cscope$.
438
439\subsubsection{Lowest Common Ancestor: \ct{+}}
440
441The expression $s_1$ \ct{+} $s_2$, where $s_1$ and $s_2$ are
442expressions of type \cscope, evaluates to the lowest common ancestor
443of $s_1$ and $s_2$ in the dynamic scope tree. This is the smallest
444dynamic scope containing both $s_1$ and $s_2$.
445
446\subsubsection{The \cscopeof{} expression}
447
448Given any left-hand-side expression \ct{expr}, the expression
449\begin{verbatim}
450 $scopeof(expr)
451\end{verbatim}
452evaluates to the dynamic scope containing the object specified by
453\ct{expr}.
454
455The following example illustrates the semantics of the \cscopeof{}
456operator. All of the assertions hold:
457\begin{verbatim}
458{
459 $scope s1 = $here;
460 int x;
461 double a[10];
462
463 {
464 $scope s2 = $here;
465 int *p = &x;
466 double *q = &a[4];
467
468 assert($scopeof(x)==s1);
469 assert($scopeof(p)==s2);
470 assert($scopeof(*p)==s1);
471 assert($scopeof(a)==s1);
472 assert($scopeof(a[5])==s1);
473 assert($scopeof(q)==s2);
474 assert($scopeof(*q)==s1);
475 }
476}
477\end{verbatim}
478
479\subsection{Range and domain expressions}
480
481\subsubsection{Regular range expressions}
482\label{sec:range_expr}
483
484An expression of the form
485\begin{verbatim}
486 lo .. hi
487\end{verbatim}
488where \texttt{lo} and \texttt{hi} are integer expressions, represents
489the range consisting of the integers $\texttt{lo}, \texttt{lo}+1,
490\ldots, \texttt{hi}$ (in that order).
491
492An expression of the form
493\begin{verbatim}
494 lo .. hi # step
495\end{verbatim}
496where \texttt{lo}, \texttt{hi}, and \texttt{step} are integer
497expressions is interpreted as follows. If \texttt{step} is positive,
498it represents the range consisting of $\texttt{lo},
499\texttt{lo}+\texttt{step}, \texttt{lo}+2*\texttt{step}, \ldots$, up to
500and possibly including \texttt{hi}. To be precise, the infinite
501sequence is intersected with the set of integers less than or equal to
502\texttt{hi}.
503
504If \texttt{step} is negative, the expression represents the range
505consisting of $\texttt{hi}, \texttt{hi}+\texttt{step},
506\texttt{hi}+2*\texttt{step}, \ldots$, down to and possibly including
507\texttt{lo}. Precisely, the infinite sequence is intersected with the
508set of integers greater than or equal to \texttt{lo}.
509
510\subsubsection{Cartesian domain expressions}
511\label{sec:domain_expr}
512
513An expression of the form
514\begin{verbatim}
515 ($domain) { r1, ..., rn }
516\end{verbatim}
517where \texttt{r1}, \ldots, \texttt{rn} are $n$ expressions of type
518\crange, is a \emph{Cartesian domain expression}. It represents the
519domain of dimension $n$ which is the Cartesian product of the $n$
520ranges, i.e., it consists of all $n$-tuples $(x_1,\ldots,x_n)$ where
521$x_1\in\texttt{r1}$, \ldots, $x_n\in\texttt{rn}$. The order on the
522domain is the dictionary order on tuples. The type of this expression
523is \cdomainof{\(n\)}.
524
525When a Cartesian domain expression is used to initialize an object of
526domain type, the ``\texttt{(}\cdomain\texttt{)}'' may be omitted.
527For example:
528\begin{verbatim}
529 $domain(3) dom = { 0 .. 3, r2, 10 .. 2 # -2 };
530\end{verbatim}
531
532
533\section{Statements}
534
535\subsection{C Statements}
536
537The usual C statements are supported:
538\begin{itemize}
539\item \emph{no-op} (\ct{;})
540\item expression statements (\ct{e;})
541\item labeled statements, including \ct{case} and \ct{default} labels
542 (\ct{l: s})
543\item \emph{for} (\ct{for (init; cond; inc) s}), \emph{while}
544 (\ct{while (cond) s}) and \emph{do} (\ct{do s while (cond)})
545 loops
546\item compound statements (\lb \ct{s1;s2;} \ldots \rb)
547\item \texttt{if} and \verb!if! \ldots \verb!else!
548\item \verb!goto!
549\item \verb!switch!
550\item \verb!break!
551\item \verb!continue!
552\item \verb!return!
553\end{itemize}
554
555\subsection{Guards and nondeterminism}
556
557\subsubsection{Guarded commands: \cwhen}
558
559A guarded command is encoded in CIVL-C using a $\cwhen$ statement:
560\begin{verbatim}
561 $when (expr) stmt;
562\end{verbatim}
563All statements have a guard, either implicit or explicit. For most
564statements, the guard is \ctrue. The \cwhen{} statement allows one to
565attach an explicit guard to a statement.
566
567When \texttt{expr} is \emph{true}, the statement is enabled, otherwise
568it is disabled. A disabled statement is \emph{blocked}---it will not
569be scheduled for execution. When it is enabled, it may execute by
570moving control to the \texttt{stmt} and executing the first atomic
571action in the \texttt{stmt}.
572
573If \texttt{stmt} itself has a non-trivial guard, the guard of the
574\cwhen{} statement is effectively the conjunction of the \texttt{expr}
575and the guard of \texttt{stmt}.
576
577The evaluation of \texttt{expr} and the first atomic action of
578\texttt{stmt} effectively occur as a single atomic action. There is
579no guarantee that execution of \texttt{stmt} will continue atomically
580if it contains more than one atomic action, i.e., other processes may
581be scheduled.
582
583Examples:
584\begin{verbatim}
585 $when (s>0) s--;
586\end{verbatim}
587This will block until \texttt{s} is positive and then decrement
588\texttt{s}. The execution of \texttt{s--} is guaranteed to take place
589in an environment in which \texttt{s} is positive.
590
591\begin{verbatim}
592 $when (s>0) {s--; t++}
593\end{verbatim}
594The execution of \texttt{s--} must happen when \texttt{s>0}, but
595between \texttt{s--} and \texttt{t++}, other processes may execute.
596
597\begin{verbatim}
598 $when (s>0) $when (t>0) x=y*t;
599\end{verbatim}
600This blocks until both \texttt{x} and \texttt{t} are positive then
601executes the assignment in that state. It is equivalent to
602\begin{verbatim}
603 $when (s>0 && t>0) x=y*t;
604\end{verbatim}
605
606\subsubsection{Nondeterministic selection statement: \cchoose}
607
608A \cchoose{} statement has the form
609\begin{verbatim}
610 $choose {
611 stmt1;
612 stmt2;
613 ...
614 default: stmt
615 }
616\end{verbatim}
617The \texttt{default} clause is optional.
618
619The guards of the statements are evaluated and among those that are
620\emph{true}, one is chosen nondeterministically and executed. If none
621are \emph{true} and the \texttt{default} clause is present, it is
622chosen. The \texttt{default} clause will only be selected if all
623guards are \emph{false}. If no \texttt{default} clause is present and
624all guards are \emph{false}, the statement blocks. Hence the implicit
625guard of the \cchoose{} statement without a \texttt{default} clause is
626the disjunction of the guards of its sub-statements. The implicit
627guard of the \cchoose{} statement with a default clause is
628\emph{true}.
629
630Example: this shows how to encode a ``low-level'' CIVL guarded
631transition system:
632
633\begin{verbatim}
634 l1: $choose {
635 $when (x>0) {x--; goto l2;}
636 $when (x==0) {y=1; goto l3;}
637 default: {z=1; goto l4;}
638 }
639 l2: $choose {
640 ...
641 }
642 l3: $choose {
643 ...
644 }
645\end{verbatim}
646
647
648\subsubsection{Nondeterministic choice of integer:
649 \texorpdfstring{\cchooseint}{\$choose\_int}}
650
651The system function \cchooseint{} has the following signature:
652\begin{verbatim}
653 int $choose_int(int n);
654\end{verbatim}
655This function takes as input a positive integer \texttt{n} and
656nondeterministicaly returns an integer in the range
657$[0,\texttt{n}-1]$.
658
659\subsection{Iteration using domains with \cfor}
660\label{sec:cfor}
661
662A \emph{CIVL-for} statement has the form
663\begin{verbatim}
664 $for (int i1, ..., in : dom) S
665\end{verbatim}
666where \texttt{i1}, \ldots, \texttt{in} are $n$ identifiers,
667\texttt{dom} is an expression of type \cdomainof{\(n\)}, and
668\texttt{S} is a statement. The identifiers declare $n$ variables of
669integer type. Control iterates over the values of the domain,
670assigning the integer variables the components of the current tuple in
671the domain at the start of each iteration. The scope of the variables
672extends to the end of \texttt{S}. The iterations takes place in the
673order specified by the domain, e.g., dictionary order for a Caretesian
674domain.
675
676There is a also a parallel version of this construct, \cparfor,
677described in \ref{sec:parfor}.
678
679\section{Functions}
680\subsection{Abstract function: \cabstract}
681
682An abstract function declares a function without a body, and it has the form
683
684\begin{verbatim}
685 $abstract type function(list-of-parameters);
686\end{verbatim}
687
688It is required that the function should have a non-void return type and take at least one parameter.
689The return value of the function is evaluated symbolically using the actual arguments of the function call.
690
691\subsection{Utility functions}
692
693\subsubsection{Equality checking of any two objects: \texorpdfstring{\cequals}{\$equals}}
694
695The system function \cequals has the form
696\begin{verbatim}
697 _Bool $equals(void *x, void *y);
698\end{verbatim}
699
700This function takes two non-null pointers as input. If the two objects that the pointers refer to have the same value, then the
701function returns \ctrue. Otherwise, it returns \cfalse.
702
703\chapter{Concurrency}
704\label{chap:concurrency}
705
706\section{Process creation and management}
707
708\subsection{The process type: \cproc}
709
710This is a primitive object type and functions like any other primitive
711C type (e.g., \texttt{int}). An object of this type refers to a
712process. It can be thought of as a process ID, but it is not an
713integer and cannot be cast to one. It is analogous to the $\cscope$
714type for dynamic scopes.
715
716Certain expressions take an argument of \cproc{} type and some return
717something of \cproc{} type. The operators \verb!==! and \verb~!=~ may
718be used with two arguments of type \cproc{} to determine whether the
719two arguments refer to the same process.
720
721\subsection{Checking if a process is defined: \cprocdefined}
722
723An object of type \cproc{} is initially undefined, so a use of that
724object would result in an error. One can check whether a \cproc{}
725object is defined using the method \cprocdefined:
726\begin{verbatim}
727 _Bool $proc_defined($proc p);
728\end{verbatim}
729
730\subsection{Obtaining the null process reference: \cprocNull}
731
732This function takes a pointer to an object of \cproc{} type as a parameter:
733
734\begin{verbatim}
735 void $proc_null($proc *p);
736\end{verbatim}
737
738It updates the corresponding object with the default null value of the \cproc{} type.
739The null value of \cproc{} type is considered as defined, i.e., there will be no
740error to use an \cproc{} object with null value.
741
742\subsection{The \emph{self} process constant: \cself}
743
744This is a constant of type \cproc. It can be used wherever an argument
745of type \cproc{} is called for. It refers to the process that is
746evaluating the expression containing \cself.
747
748\subsection{Spawning a new process: \cspawn}
749
750A \emph{spawn} expression is an expression with side-effects. It
751spawns a new process and returns a reference to the new process, i.e.,
752an object of type \cproc. The syntax is the same as a procedure
753invocation with the keyword \cspawn{} inserted in front:
754\begin{verbatim}
755 $spawn f(expr1, ..., exprn)
756\end{verbatim}
757Typically the returned value is assigned to a variable, e.g.,
758\begin{verbatim}
759 $proc p = $spawn f(i);
760\end{verbatim}
761If the invoked function \texttt{f} returns a value, that value is
762simply ignored.
763
764\subsection{Waiting for another process to terminate: \cwait}
765
766The system function $\cwait$ has signature
767\begin{verbatim}
768 void $wait($proc p);
769\end{verbatim}
770When invoked, this function will not return until the process
771referenced by \ct{p} has terminated. Note that $p$ can be any
772expression of type \cproc{}, not just a variable.
773
774\subsection{Waiting for a number of processes to terminate: \cwaitall}
775
776The system function $\cwaitall$ has signature
777\begin{verbatim}
778 void $waitall($proc *procs, int numProcs);
779\end{verbatim}
780When invoked, this function will not return until all the \ct{numProcs} processes
781referenced by the memory specified by \ct{procs} have terminated.
782
783\subsection{Terminating a process immediately: \cexit}
784
785This function takes no arguments. It causes the
786calling process to terminate immediately, regardless of the state of
787its call stack:
788\begin{verbatim}
789 void $exit(void);
790\end{verbatim}
791
792\section{Atomicity}
793
794\subsection{Atom blocks: \catom} This defines a number of statements
795to be executed as a single atomic transition. An \catom~block has the
796following form:
797\begin{verbatim}
798 $atom {
799 stmt1;
800 stmt2;
801 ...
802 }
803\end{verbatim}
804
805The statements inside an \catom\ block are to be executed as one
806transition. It is required that the execution of the statements in an
807\catom\ block satisfy all of the following properties:
808\begin{enumerate}
809\item \emph{deterministic}: at each step in the execution of the atom
810 block, there must be at most one enabled statement;
811\item \emph{nonblocking}: at each step in the execution, there must be
812 at least one enabled statement, hence, together with (1), there must
813 be exactly one enabled statement;
814\item \emph{finite}: the execution of the atom block must terminate
815 after a finite number of steps; and
816\item \emph{isolated}: there are no jumps from outside the atom block
817 to inside the atom block, or from inside the atomc block to outside
818 of it.
819\end{enumerate}
820
821Violations of the \emph{deterministic}, \emph{nonblocking}, or
822\emph{isolated} properties will be reported either statically or
823dynamically. If the \emph{finite} property is violated, the
824verification may just run forever.
825
826Once the process enters an \catom\ block is said to be \emph{executing
827 atomly}. The process remains executing atomly until it reaches the
828terminating right brace of the block. Hence \emph{executing atomly}
829is a dynamic, not static condition. For example, the block might
830contain a function call which takes the process to a point in code
831which is not statically contained in an atom block; that process is
832nevertheless still executing atomly and is subject to the rules above.
833The process only stops executing atomly when that function call
834returns and control finally reaches the right curly brace at the end
835of the atom block (assuming the block is not contained in another atom
836block).
837
838\emph{Note:} \cwait\ statements are not allowed in \catom\ blocks.
839The rationale for this is that there is never a way to know for
840certain that another process has terminated (until \cwait\ has
841returned) so there is never a way to be certain the \cwait\ statement
842will not block. If one does occur in an \catom\ block, an error will
843be reported statically (if it can be detected statically) or
844dynamically (otherwise). Note that it is not always possible to
845detect this statically because the \catom\ block may contain a
846function call, and the function may contain the \cwait\ statement.
847
848\subsection{Atomic blocks: \catomic}
849
850The statements in an \emph{atomic} block will be executed without
851other processes interleaving, to the extent possible. It has the
852form:
853\begin{verbatim}
854 $atomic {
855 stmt1;
856 stmt2;
857 ...
858 }
859\end{verbatim}
860It is essentially a weaker form of \catom. Unlike \catom, there are
861no restrictions on the statements that can go inside an \catomic\
862block. A process executing an \catomic~block will try to execute the
863statements without interleaving with other processes, unless it
864becomes blocked. Unlike an \catom, the statements in an atomic block
865do not necessarily execute as a single transition; they may be spread
866out over multiple transitions.
867
868When no statement is enabled, the execution of the \catomic\ block
869will be interrupted. At this point, other processes are allowed to
870execute. Eventually, if the original process becomes enabled due to
871the actions of other processes, it may be scheduled again, in which
872case it regains atomicity and continues where it left off. For
873example, after executing the first loop, the process executing the
874following code will become blocked at the first \cwait\ statement:
875 \begin{verbatim}
876$atomic{
877 for(int i = 0; i < 5; i++) p[i] = $spawn foo(i);
878 for(int i = 0; i < 5; i++) $wait p[i];
879}
880\end{verbatim}
881Other processes will then execute. Eventually, if the process being
882waited on terminates, the original process becomes enabled and may be
883scheduled, in which case it regain atomicity, increments \texttt{i}
884and proceeds to the next $\cwait$ statement. This is in fact a common
885idiom for spawning and waiting on a set of processes.
886
887A process that enters an $\catomic$ block is said to be
888\emph{executing atomically}; it remains executing atomically until it
889reaches the closing curly brace.
890
891Both $\catom$ and $\catomic$ blocks can be nested arbitrarily, but
892$\catom$ overrides $\catomic$: a process that is executing atomly will
893continue executing atomly if it encounters an $\catomic$ statement;
894but a process executing atomically that encounters an $\catom$ will
895begin executing atomly.
896
897The atomic semantics are defined more precisely as follows: there is a
898single global variable called the \emph{atomic lock}. This variable
899can either be null (meaning the atomic lock is ``free''), or it can
900hold the PID of a process; that process is said to ``hold'' the atomic
901lock. Moreover, each process contains a special integer variable, its
902\emph{atomic counter}, which is initially 0. Every time a process
903enters an atomic block, it increments its atomic counter; every time
904it exits an atomic block, it decrements its counter. In order to
905increment its counter from $0$ to $1$, it must first wait for the
906atomic lock to become free, and then take the lock. When it
907decrements its counter from $1$ to $0$, it releases the atomic lock.
908When a process executing atomically becomes blocked, it releases the
909lock (without changing the value of its atomic counter).
910
911\section{Parallel loops with \cparfor}
912\label{sec:parfor}
913
914A parallel loop statement has the form
915\begin{verbatim}
916 $parfor (int i1, ..., in : dom) S
917\end{verbatim}
918The syntax is exactly the same as that for the sequential loop \cfor
919(Section \ref{sec:cfor}), only with \cparfor{} replacing \cfor.
920
921The semantics are as follows: when control reaches the loop, one
922process in spawned for each element of the domain. That process has
923local variables corresponding to the iteration variables, and those
924local variables are initialized with the components of the tuple for
925the element of the domain that process is assigned. Each process
926executes the statement \texttt{S} in this context. Finally, each of
927these processes is waited on at the end. In particular, there is an
928effective barrier at the end of the loop, and all the spawned
929processes disappear after this point.
930
931\section{Message-Passing}
932
933CIVL-C provides a number of additional primitives that can be used to
934model message-passing systems. This part of the language is built in
935two layers: the lower layer defines an abstract data type for
936representing messages; the higher layer defines an abstract data type
937of \emph{communicators} for managing sets of messages being
938transferred among some set of processes.
939
940\subsection{Messages: \cmessage}
941
942Messages are similar to bundles, but with some additional meta-data.
943The \emph{data} component of the message is the ``contents'' of the
944message and is formed and extracted much like a bundle. The meta-data
945consists of an integer identifier for the \emph{source} place of the
946message, an integer identifier for the message \emph{destination}
947place, and an integer \emph{tag} which can be used by a process to
948discriminate among messages for reception. This is very similar to
949MPI.
950
951\begin{figure}
952 \begin{small}
953\begin{verbatim}
954/* creates a new message, copying data from the specified buffer */
955$message $message_pack(int source, int dest, int tag, void *data, int size);
956
957/* returns the message source */
958int $message_source($message message);
959
960/* returns the message tag */
961int $message_tag($message message);
962
963/* returns the message destination */
964int $message_dest($message message);
965
966/* returns the message size */
967int $message_size($message message);
968
969/* transfers message data to buf, throwing exception if message
970 * size exceeds specified size */
971void $message_unpack($message message, void *buf, int size);
972\end{verbatim}
973 \end{small}
974 \caption{The \emph{message} abstract data type}
975 \label{fig:message}
976\end{figure}
977
978The functions for creating, and extracting information from, messages
979are given in Figure \ref{fig:message}.
980
981\subsection{Communicators: \cgcomm{} and \ccomm}
982\label{sec:communicators}
983
984CIVL-C defines a \emph{global communicator} type $\cgcomm$ and a
985\emph{local communicator} type $\ccomm$. The global communicator is an
986abstraction for a ``communication universe'' that stores buffered
987messages and perhaps other data. The local communicator wraps
988together a reference to a global communicator and an integer
989\emph{place}. Most of the message-passing commands take a local
990communicator as an argument to specify the communication universe used
991for that operation and the place from which that operation will be
992executed. The communication universes are isolated from one
993another---a message sent on one can never be received using a
994different communicator, for example.
995
996The global communicator is the shared object that must be declared in
997a scope containing all scopes in which communication in that universe
998will take place. It is created by specifying the number of
999\emph{places} that will comprise the communicator. A place is an
1000address to which messages may be sent or where they may be received.
1001There is not necessarily a one-to-one correspondence between places and
1002processes: many processes can use the same place.
1003
1004Local communicators are created (typically in some child scope of the
1005scope in which the global communicator is declared) by specifying the
1006gobal communicator to which the local one will be associated and the
1007place ID. The local communicator will be used in most of the
1008message-passing functions; it may be thought of as an ordered pair
1009consisting of a reference to the global communicator and the integer
1010place ID. The place ID must be in $[0,\texttt{size}-1]$, where
1011\texttt{size} is the size of the global communicator. The place ID
1012specifies the place in the global communication universe that will be
1013occupied by the local communicator. The local communicator handle may
1014be used by more than one process, but all of those processes will be
1015viewed as occupying the same place. Only one call to \ccommcreate{}
1016may occur for each gcomm-place pair.
1017
1018
1019Both types ($\cgcomm$ and $\ccomm$) are handle types. When declared
1020with a call to the corresponding creation function, they create an
1021object in the specified scope and return a handle to that object. The
1022object can only be accessed through the specified system functions
1023that take this handle as an argument.
1024
1025 % This local communicator handle will be used as an
1026 % * argument in most message-passing functions. The place must be in
1027 % * [0,size-1] and specifies the place in the global communication universe
1028 % * that will be occupied by the local communicator. The local communicator
1029 % * handle may be used by more than one process, but all of those
1030 % * processes will be viewed as occupying the same place.
1031 % * Only one call to $comm_create may occur for each gcomm-place pair.
1032
1033\begin{figure}
1034 \begin{small}
1035\begin{verbatim}
1036/* Creates a new global communicator object and returns a handle to it.
1037 * The global communicator will have size communication places. The
1038 * global communicator defines a communication "universe" and encompasses
1039 * message buffers and all other components of the state associated to
1040 * message-passing. The new object will be allocated in the given scope. */
1041$gcomm $gcomm_create($scope s, int size);
1042
1043void $gcomm_destroy($gcomm gcomm); // Destroys the gcomm
1044
1045_Bool $gcomm_defined($gcomm gcomm); // Is the gcomm object defined?
1046
1047/* Creates a new local communicator object and returns a handle to it.
1048 * The new communicator will be affiliated with the specified global
1049 * communicator. The new object will be allocated in the given scope. */
1050$comm $comm_create($scope s, $gcomm gcomm, int place);
1051
1052void $comm_destroy($comm comm); // Destroys the comm
1053
1054_Bool $comm_defined($comm comm); // Is the comm object defined?
1055
1056/* Returns the size (number of places) in the global communicator associated
1057 * to the given comm. */
1058int $comm_size($comm comm);
1059
1060/* Returns the place of the local communicator. This is the same as the
1061 * place argument used to create the local communicator. */
1062int $comm_place($comm comm);
1063
1064/* Adds the message to the appropriate message queue in the communication
1065 * universe specified by the comm. The source of the message must equal
1066 * the place of the comm. */
1067void $comm_enqueue($comm comm, $message message);
1068
1069/* Returns true iff a matching message exists in the communication universe
1070 * specified by the comm. A message matches the arguments if the destination
1071 * of the message is the place of the comm, and the sources and tags match. */
1072_Bool $comm_probe($comm comm, int source, int tag);
1073
1074/* Finds the first matching message and returns it without modifying
1075 * the communication universe. If no matching message exists, returns a message
1076 * with source, dest, and tag all negative. */
1077$message $comm_seek($comm comm, int source, int tag);
1078
1079/* Finds the first matching message, removes it from the communicator,
1080 * and returns the message */
1081$message $comm_dequeue($comm comm, int source, int tag);
1082\end{verbatim}
1083 \end{small}
1084 \caption{The \emph{communicator} interface specifies handle
1085 types $\cgcomm$ and $\ccomm$ and the functions above}
1086 \label{fig:comm}
1087\end{figure}
1088
1089The communicator interface is given in Figure \ref{fig:comm}.
1090
1091Certain restrictions are enforced on some relations between the
1092objects involved in a communication universe.
1093
1094Fix a \cgcomm{} object. This object corresponds to a single
1095communication universe with, say, $n$ places. At any time, there can
1096be \emph{at most one} \ccomm{} object associated to a given place. If
1097a program attempts to create a \ccomm{} object with the same \cgcomm{}
1098and place as an earlier created \ccomm{} object, a runtime error will
1099occur. In particular, there can be at most $n$ \ccomm{} objects
1100associated to the \cgcomm.
1101
1102The relation between processes and \ccomm{} objects is unconstrained.
1103One process may use any number of \ccomm{} objects. (Of course, the
1104process must have access to handles for those \ccomm{} objects.)
1105Dually, a single \ccomm{} object may be used by any number of
1106processes; this situation arises naturally when modeling a
1107multi-threaded MPI program.
1108
1109\begin{figure}
1110 \begin{small}
1111\begin{verbatim}
1112$gcomm gcomm = $gcomm_create($here, nprocs);
1113void Process(int rank) {
1114 $comm comm = $comm_create($here, gcomm, rank);
1115
1116 void Thread(int tid) {
1117 ...$comm_enqueue(comm, msg)...
1118 ...$comm_dequeue(comm, source, tag)...
1119 }
1120
1121 for (int i=0; i<nthreads; i++) $spawn Thread(i);
1122 ...
1123 $comm_destroy(comm);
1124}
1125for (int i=0; i<nprocs; i++) $spawn Process(i);
1126...
1127$gcomm_destroy(gcomm);
1128\end{verbatim}
1129 \end{small}
1130 \caption{Code skeleton for model of multithreaded MPI program
1131 showing placement of global and local communicator objects}
1132 \label{fig:mpi-threads-comm}
1133\end{figure}
1134
1135There is no special status given to the process which creates the
1136\ccomm{} object of a given place. Any process which can access a
1137handle for that \ccomm{} object can use it to send or receive
1138messages, regardless of whether that process was the one that created
1139the \ccomm{} object. However, users should be aware that verification
1140is likely to be most efficient when variables are declared as locally
1141as possible, so it is best to declare the \ccomm{} object in the
1142innermost scope possible. Figure \ref{fig:mpi-threads-comm}
1143illustrates an effective way to do this in the context of modeling a
1144multithreaded MPI program. In the code skeleton, each thread can
1145access the local communicator object of its process, but not that of
1146any other process.
1147
1148\subsection{Barriers: \cgbarrier{} and \cbarrier}
1149\label{sec:barriers}
1150
1151CIVL-C defines a \emph{global barrier} type $\cgbarrier$ and a
1152\emph{local barrier} type $\cbarrier$. They provide an implementation of
1153a barrier for concurrent programs.
1154
1155The global barrier is a shared object that must be declared in
1156a scope containing all scopes in which the barrier will be called.
1157 It is created by specifying the number of
1158\emph{places} that will comprise the barrier.
1159
1160Local barriers are created (typically in some child scope of the
1161scope in which the global barrier is declared) by specifying the
1162gobal barrier to which the local one will be associated and the
1163place ID. The local barrier will be used in the call to the barrier;
1164it may be thought of as an ordered pair
1165consisting of a reference to the global barrier and the integer
1166place ID. The place ID must be in $[0,\texttt{size}-1]$, where
1167\texttt{size} is the size of the global barrier.
1168Only one call to \cbarriercreate{}
1169may occur for each gbarrier-place pair.
1170
1171
1172Both types ($\cgbarrier$ and $\cbarrier$) are handle types. When declared
1173with a call to the corresponding creation function, they create an
1174object in the specified scope and return a handle to that object. The
1175object can only be accessed through the specified system functions
1176that take this handle as an argument.
1177
1178
1179\begin{figure}
1180 \begin{small}
1181\begin{verbatim}
1182/* Creates a new barrier object and returns a handle to it.
1183 * The barrier has the specified size.
1184 * The new object will be allocated in the given scope. */
1185$gbarrier $gbarrier_create($scope scope, int size);
1186
1187/* Destroys the gbarrier */
1188void $gbarrier_destroy($gbarrier barrier);
1189
1190/* Creates a new local barrier object and returns a handle to it.
1191 * The new barrier will be affiliated with the specified global
1192 * barrier. This local barrier handle will be used as an
1193 * argument in most barrier functions. The place must be in
1194 * [0,size-1] and specifies the place in the global barrier
1195 * that will be occupied by the local barrier.
1196 * Only one call to $barrier_create may occur for each barrier-place pair.
1197 * The new object will be allocated in the given scope. */
1198$barrier $barrier_create($scope scope, $gbarrier gbarrier, int place);
1199
1200/* Calls the barrier associated with this local barrier object.*/
1201void $barrier_call($barrier barrier);
1202
1203/* Destroys the barrier. */
1204void $barrier_destroy($barrier barrier);
1205\end{verbatim}
1206 \end{small}
1207 \caption{The \emph{barrier} interface specifies handle
1208 types $\cgbarrier$ and $\cbarrier$ and the functions above}
1209 \label{fig:barrier}
1210\end{figure}
1211
1212The barrier interface is given in Fig.\ \ref{fig:barrier}.
1213
1214\chapter{Specification}
1215
1216\section{Overview}
1217
1218Specification is the means by which one expresses what a program is
1219supposed to do, i.e., what it means for it to be correct.
1220
1221There are several specification mechanisms in CIVL-C. First, there are
1222the default properties: these are generic properties which are checked
1223by default in any program, and require no additional specification
1224effort. These properties include absence of deadlocks, division by 0,
1225illegal pointer dereferences, and out of bounds array indexes.
1226
1227Many more program-specific properties can be specified using
1228assertions. CIVL-C has a rich assertion language which extends the
1229language of boolean-valued C expressions. Assumptions are a
1230specification dual to assertions in that they restrict the set
1231of executions on which the assertions are checked.
1232
1233Functional equivalence is a power specification mechanism. In this
1234approach, two programs are provided, one playing the role of the
1235specification, the other the role of the implementation. The
1236implementation is correct if, for all inputs $x$, it produces the same
1237output as that produced by the specification on input $x$. In other
1238words, the two programs define the same function; this is sometimes
1239known as \emph{input-output equivalence}. In order to take this
1240approach, one must first have a way to specify what the inputs and
1241outputs of a programs are; CIVL-C provides special keywords for this.
1242
1243Procedure contracts are another powerful specification mechanisms.
1244These typically involve specifying preconditions and postconditions
1245for a function. The function is correct if, whenever it is called in a
1246state satisfying the precondition, when it returns the state will
1247satsify the postcondition. A program is correct if all its functions
1248satsify their contract.
1249
1250\section{Input-output signature}
1251
1252\subsection{Input type qualifier: \cinput}
1253
1254The declaration of a variable in the root scope may
1255include the type qualifier \cinput, e.g.,
1256\begin{verbatim}
1257 $input int n;
1258\end{verbatim}
1259This declares the variable to be an input variable, i.e., one which is
1260considered to be an input to the program. Such a variable is
1261initialized with an arbitrary (unconstrained) value of its type. When
1262using symbolic execution to verify a program, such a variable will be
1263assigned a unique symbolic constant of its type.
1264
1265In contrast, variables in the root scope which are not input variables
1266will instead be initialized with the ``undefined'' value. If an
1267undefined value is used in some way (such as in an argument to an
1268operator), an error occurs.
1269
1270In addition, input variables may only be read, never written to.
1271
1272Alternatively, it is also possible to specify a particular concrete
1273initial value for an input variable. This is done using a command
1274line argument when verifying or running the program.
1275
1276Input (and output) variables also play a key role when determining
1277whether two programs are functionally equivalent. Two programs are
1278considered functionally equivalent if, whenever they are given the
1279same inputs (i.e., corresponding \cinput{} variables are initialized
1280with the same values) they will produce the same outputs (i.e.,
1281corresponding \coutput{} variables will end up with the same values at
1282termination).
1283
1284\subsection{Output type qualifier: \coutput}
1285
1286A variable in the root scope may be declared with this type qualifier
1287to declare it to be an output variable. Output variables are ``dual''
1288to input variables. They may only be written to, never read. They
1289are used primarily in functional equivalence checking.
1290
1291\section{Assertions and assumptions}
1292
1293\subsection{Assertions: \cassert}
1294
1295The system function \cassert{} takes an argument of boolean type:
1296\begin{verbatim}
1297 void $assert (_Bool expr);
1298\end{verbatim}
1299During verification, the assertion is checked. If it cannot be proved
1300that it must hold, a violation is reported.
1301
1302Note that CIVL-C boolean expressions have a richer syntax than C
1303expressions, and may include universal or existential quantifiers
1304(see below), and the boolean values \ctrue{} and \cfalse{}.
1305
1306The assertion function may take additional optional arguments used to
1307print a specific message if the assertion is violated. These
1308additional arguments are similar in form to those used in C's
1309\texttt{printf} statement: a format string, followed by some number of
1310arguments which are evaluated and substituted for successive codes in
1311the format string. For example,
1312\begin{verbatim}
1313 $assert(x<=B, "x-coordinate %f exceeds bound %f", x, B);
1314\end{verbatim}
1315
1316The C function \texttt{assert}, included in the standard library
1317\texttt{assert.h}, is identical to \cassert. Programmers are
1318free to use either one.
1319
1320
1321\subsection{Assume statements: \cassume}
1322
1323As \emph{assume statement} has the form
1324\begin{verbatim}
1325 $assume expr;
1326\end{verbatim}
1327During verification, the assumed expression is assumed to hold. If
1328this leads to a contradiction on some execution, that execution is
1329simply ignored. It never reports a violation, it only restricts the
1330set of possible executions that will be explored by the verification
1331algorithm.
1332
1333Like as assertion statement, as assume statement can be used any place
1334a statement is expected. In addition, as assume statement can be used
1335in file scope to place restrictions on the global variables of the
1336programs. For example,
1337\begin{verbatim}
1338$input int B;
1339$input int N;
1340$assume 0<=N && N<=B;
1341\end{verbatim}
1342declares \texttt{N} and \texttt{B} to be integer inputs and restricts
1343consideration to inputs satisfying $0\leq\texttt{N}\leq\texttt{B}$.
1344
1345
1346\section{Formulas}
1347
1348A formula is a boolean expression that can be used in an assert
1349statement, assume statement, procedure contract (below), or invariant.
1350Any ordinary C boolean expression is a formula. CIVL-C provides some
1351additional kinds of formulas, described below.
1352
1353\subsection{Implication: \cimplies}
1354
1355The binary operation \cimplies{} represents logical implication.
1356The expression \verb!p=>q! is equivalent to \verb~(!p)||q~.
1357
1358\subsection{Universal quantifier: \cforall}
1359
1360The universally qunatified formula has the form
1361\begin{verbatim}
1362 $forall { type identifier | restriction} expr
1363\end{verbatim}
1364where \verb!type! is a type name (e.g., \texttt{int} or
1365\texttt{double}), \verb!identifier! is the name of the bound variable,
1366\verb!restriction! is a boolean expression which expresses some
1367restriction on the values that the bound variable can take, and
1368\verb!expr! is a formula. The universally quantified formula
1369holds iff for all values assignable to the bound variable
1370for which the restriction holds, the formula \ct{expr} holds.
1371
1372A variation on the construct above can be used in the special case
1373where the bound variable is to range over a finite interval
1374of integers. In this case the quantified formula may be written:
1375\begin{verbatim}
1376 $forall { type identifier=lower .. upper } expr
1377\end{verbatim}
1378where \ct{lower} and \ct{upper} are integer expressions.
1379
1380\subsection{Existential quantifier: \cexists}
1381
1382The syntax for existentially quantified expressions is exactly the
1383same as for universally quantified expressions, with \cexists{} in
1384place of \cforall{}.
1385
1386\section{Contracts}
1387
1388\subsection{Procedure contracts: \crequires{} and \censures{}}
1389The \crequires{} and \censures{} primitives are used to encode
1390procedure contracts. There are optional
1391elements that may occur in a procedure declaration or definition,
1392as follows. For a function prototype:
1393\begin{verbatim}
1394 T f(...)
1395 $requires expr;
1396 $ensures expr;
1397 ;
1398\end{verbatim}
1399For a function definition:
1400\begin{verbatim}
1401 T f(...)
1402 $requires expr;
1403 $ensures expr;
1404 {
1405 ...
1406 }
1407\end{verbatim}
1408The value \cresult{} may be used in post-conditions to refer
1409to the result returned by a procedure.
1410
1411\emph{Status}: parsed, but nothing is currently done with this
1412information.
1413
1414\subsection{Loop invariants: \cinvariant}
1415
1416This indicates a loop invariant. Each C loop
1417construct has an optional invariant clause as follows:
1418\begin{verbatim}
1419 while (expr) $invariant (expr) stmt
1420 for (e1; e2; e3) $invariant (expr) stmt
1421 do stmt while (expr) $invariant (expr) ;
1422\end{verbatim}
1423The invariant encodes the claim that if \texttt{expr} holds upon
1424entering the loop and the loop condition holds, then it will hold
1425after completion of execution of the loop body. The invariant is used
1426by certain verification techniques.
1427
1428\emph{Status:} parsed, but nothing is currently done with this
1429information.
1430
1431\section{Concurrency specification}
1432
1433\subsection{Remote expressions: \texttt{e@x}}.
1434
1435These have the form \verb!expr@x! and refer to a variable in another
1436process, e.g., \verb!procs[i]@x!. This special kind of expression is
1437used in collective expressions, which are used to formulate collective
1438assertions and invariants.
1439
1440The expression \verb!expr! must have \cproc{} type. The variable
1441\texttt{x} must be a statically visible variable in the context in
1442which it is occurs. When this expression is evaluated, the evaluation
1443context will be shifted to the process referred to by \texttt{expr}.
1444
1445\emph{Status}: not implemented.
1446
1447\subsection{Collective expressions: \ccollective}. These have the form
1448\begin{verbatim}
1449 $collective(proc_expr, int_expr) expr
1450\end{verbatim}
1451This is a collective expression over a set of processes. The
1452expression \texttt{proc{\U}expr} yields a pointer to the first element
1453of an array of \cproc. The expression \texttt{int{\U}expr} gives the
1454length of that array, i.e., the number of processes. Expression
1455\texttt{expr} is a boolean-valued expression; it may use remote
1456expressions to refer to variables in the processes specified in the
1457array. Example:
1458\begin{verbatim}
1459 $proc procs[N];
1460 ...
1461 $assert $collective(procs, N) i==procs[(pid+1)%N]@i ;
1462\end{verbatim}
1463
1464\emph{Status}: not implemented.
1465
1466\chapter{Pointers and Heaps}
1467\label{chap:pointers}
1468
1469CIVL-C supports pointers, using the same operators with the same
1470meanings as C (\texttt{\&}, \texttt{*}, pointer arithmetic). There is
1471also a heap in every scope, and system functions to allocate and
1472deallocate objects in the specified scope.
1473
1474\section{Memory functions: \texttt{memcpy}}
1475
1476The function \texttt{memcpy} is defined in the standard C library
1477\texttt{string.h} and works exactly the same in CIVL-C: it copies
1478data from the region pointed to by \ct{q} to that pointed to by
1479\ct{p}. The signature is
1480
1481\begin{verbatim}
1482 void memcpy(void *p, void *q, size_t size);
1483\end{verbatim}
1484
1485\section{Heaps, \cmalloc{} and \cfree}
1486
1487As mentioned above, each dynamic scope has an implicit heap on which
1488objects can be allocated and deallocated dynamically. To allocate an
1489object, one first needs a reference to the dynamic scope to be used.
1490The system function $\cmalloc$ is like C's \texttt{malloc}, but takes
1491this extra scope argument:
1492\begin{verbatim}
1493 void * $malloc($scope scope, int size);
1494\end{verbatim}
1495The standard C function
1496\begin{verbatim}
1497 void * malloc(int size);
1498\end{verbatim}
1499declared in \texttt{stdlib.h}, is equivalent to \verb!$malloc($root, size)!.
1500
1501The system function \cfree{} is used to deallocate a heap object;
1502it is just like C's \texttt{free}:
1503\begin{verbatim}
1504 void $free(void *p);
1505\end{verbatim}
1506An error is generated if the pointer is not one that was returned by
1507\cmalloc, or if it was already freed. The standard C function
1508\texttt{free}, declared in \texttt{stdlib.h} is identical to \cfree.
1509The two functions are interchangeable.
1510
1511% \section{Pointer types}
1512
1513% Given any object type $T$ and a static scope $s$ in a CIVL-C program,
1514% there is a type \emph{pointer-to-$T$-in-$s$}. The type is used to
1515% represent a pointer to a memory location of type $T$ in scope $s$ or a
1516% descendant of $s$ (i.e., some scope contained in $s$).
1517
1518% If scope $s_1$ is a descendant of $s_2$ (i.e., $s_1$ is lexically
1519% contained in $s_2$), the type \emph{pointer-to-$T$-in-$s_1$} is a
1520% subtype of \emph{pointer-to-$T$-in-$s_2$}. This means that any
1521% expression of the first type can be used wherever an object of the
1522% second type is expected. In particular, any expression $e$ of the
1523% subtype can be assigned to a left-hand-side expression of the
1524% supertype without explicit casts; also $e$ can be used as an argument
1525% to a function for which the corresponding parameter has the supertype.
1526
1527% The syntax for denoting this type adheres to the usual C syntax for
1528% denoting the type \emph{pointer-to-$T$} with the addition of a scope
1529% parameter within angular brackets immediately following the \texttt{*}
1530% token. For example, to declare a variable \texttt{p} of type
1531% \emph{pointer-to-$T$-in-$s$}, one writes
1532% \begin{verbatim}
1533% int *<s> p;
1534% \end{verbatim}
1535% If the scope modifier \texttt{<...>} is absent, the scope is taken to
1536% be the root scope $s_0$. The object has type
1537% \emph{pointer-to-$T$-in-$s_0$}, which is abreviated as
1538% \emph{pointer-to-$T$}. In this way, stanard C programs can be
1539% interpreted as CIVL-C programs.
1540
1541% \section{Address-of operator}
1542
1543% The address-of operator \texttt{\&} returns a pointer of the
1544% appropriate subtype using the innermost scope in which its left-hand-side
1545% argument is declared. For example
1546
1547% \begin{verbatim}
1548% {
1549% $scope s1 = $here();
1550% int x;
1551% double a[N];
1552% int *<s1> p = &x;
1553% double *<s1> q = &a[2];
1554% }
1555% \end{verbatim}
1556% is correct (in particular, it is type-correct) because \texttt{\&x}
1557% has type \emph{pointer-to-\texttt{int}-in-\texttt{s1}}, since
1558% \texttt{s1} is the scope in which \texttt{x} is declared.
1559
1560% Another pointer example:
1561% \begin{small}
1562% \begin{verbatim}
1563% { $scope s0 = $here();
1564% { $scope s1 = $here();
1565% double x;
1566% { $scope s2 = $here();
1567% double y;
1568% double *<s1> p;
1569% /* p can only point to something in s1 or descendant, for example, s2 */
1570% p = &x; // fine
1571% p = &y; // fine
1572% p = (double*)$malloc(s0, 10*sizeof(double)); // static type error
1573% }
1574% }
1575% }
1576% \end{verbatim}
1577% \end{small}
1578
1579% \section{Pointer addition and subtractions}
1580
1581% If \texttt{e} is an expression of type \emph{pointer-to-$T$-in-$s$}
1582% and \texttt{i} is an expression of integer type then \texttt{e+i} also
1583% has type \emph{pointer-to-$T$-in-$s$}. In other words, pointer
1584% addition cannot leave the scope of the original pointer. This
1585% reflects the fact that every object is contained in one scope, and
1586% pointer addition cannot leave the object.
1587
1588% Pointer subtraction is defined on two pointers of the same type, where
1589% ``same'' includes the scope. That is checked statically. As in C, it
1590% is only defined if the two pointers point to the same object. In
1591% CIVL-C, a runtime error will be thrown if they do not point to the
1592% same object.
1593
1594% \section{Semantics of scopes and pointer types}
1595
1596% A variable of type \cscope{} is treated like any other variable.
1597% It becomes part of the state when the scope in which it is declared
1598% is instantiated to form a dynamic scope. The variable is
1599% initialized at that time and its value cannot change.
1600
1601% Each time a dynamic scope is instantiated, it is assigned a unique ID
1602% number. The exactly value of the ID number is not relevant, it just
1603% has to be distince from any other scope ID number that currently
1604% exists in the state. This is the value that is assigned to the scope
1605% variable. Therefore, if a static scope contains a scope variable, and
1606% that scope is instantiated twice to form two distinct dynamic scopes,
1607% the values assigned to the two variables will be distinct.
1608
1609% A pointer value is an ordered pair $\langle \delta,r \rangle$, where
1610% $\delta$ is a dynamic scope ID and $r$ is a reference to a memory
1611% location in the static scope associated to $\delta$. (We will define
1612% the exact form of a reference later.)
1613
1614% When a dynamic scope is instantiated, each new variable created is
1615% assigned a \emph{dynamic type}. This is a refinement of the static
1616% type associated to the static variable. Every dynamic type
1617% is an instance of exactly one static type. The dynamic
1618% type of the newly instantiated variable is an instance of the
1619% static type of the static variable.
1620
1621% The dynamic pointer types have the form
1622% \emph{pointer-to-$t$-in-$\delta$}, where $t$ is a dynamic type and
1623% $\delta$ is a dynamic scope ID. For a program to be dynamically type
1624% safe, such a variable should hold only values of the form $\langle
1625% \delta, r\rangle$. In particular, the variable should never be
1626% assigned a value where the dynamic scope component is a different
1627% instance of the static scope $s$ associated to $\delta$.
1628
1629% \section{Pointer casts}
1630
1631% If scope $s_1$ is contained in scope $s_2$, an expression of type
1632% \emph{pointer-to-$T$-in-$s_1$} can always be cast to
1633% \emph{pointer-to-$T$-in-$s_2$},
1634% because the first is a subtype of the second. (As described above,
1635% the cast is unnecessary.)
1636
1637% The cast in the other direction is also allowed, but the dynamic type
1638% safety of that cast will only be checked at runtime. In particular, a
1639% runtime error will result if the cast attempts to cast the pointer
1640% value to a dynamic scope which does not contain (is an ancestor of)
1641% the dynamic scope component of the pointer value.
1642
1643% A type \emph{pointer-to-$T_1$-in-$s$} can be cast to a type
1644% \emph{pointer-to-$T_2$-in-$s$} according to the usual rules of C. In
1645% other words, usual casting rules apply as long as you don't change the
1646% scope.
1647
1648% \section{Scope-Parameterized Functions}
1649
1650% Coming soon. (Parsed, type checked, not currently used otherwise.)
1651
1652% \section{Scope-Parameterized Type Definitions}
1653
1654% Coming soon. (Ditto.)
1655
1656\chapter{Libraries}
1657
1658Each of the following libraries is at least partially implemented and can
1659be included in a CIVL-C program:
1660\begin{itemize}
1661\item \ct{assert}
1662 \begin{itemize}
1663 \item \verb!void assert(_Bool expr);!
1664 \end{itemize}
1665\item \ct{math}
1666 \begin{itemize}
1667 \item \verb!double sqrt(double x);!
1668 \item \verb!double ceil(double x);!
1669 \item \verb!double exp(double x);!
1670 \end{itemize}
1671\item \ct{stdlib}
1672 \begin{itemize}
1673 \item \verb!size_t!
1674 \item \verb!void * malloc(size_t size);!
1675 \item \verb!void free(void * ptr);!
1676 \end{itemize}
1677\item \ct{stdbool}
1678 \begin{itemize}
1679 \item \verb!true!
1680 \item \verb!false!
1681 \end{itemize}
1682\item \ct{stddef}
1683 \begin{itemize}
1684 \item \verb!size_t!
1685 \item \verb!NULL!
1686 \end{itemize}
1687\item \ct{stdio}
1688 \begin{itemize}
1689 \item \verb!int printf(const char * restrict format, ...);!
1690 \end{itemize}
1691\item \ct{string}
1692 \begin{itemize}
1693 \item \verb!size_t!
1694 \item \verb!NULL!
1695 \item \verb!void memcpy(void * restrict dst, const void * restrict src, size_t n);!
1696 \end{itemize}
1697\end{itemize}
Note: See TracBrowser for help on using the repository browser.