wiki:IR2

Version 68 (modified by siegel, 5 years ago) ( diff )

--

CIVL IR v2.0

Language principles

  • CIVL-IR is a subset of CIVL-C. A CIVL-IR program is a CIVL-C program, and has the same semantics.
  • CIVL-IR should be a small language. Whenever there is a way to express a new construct using other existing constructs, the preferred approach is to not add the new construct to the language.
  • CIVL-IR is to be as close as possible to a language representation of a CIVL "Model". This is the lowest-level representation of a CIVL program, one appropriate for efficient (symbolic) execution and/or model checking.
  • The basic representation of a function is as a "program graph"---a directed graph in which the nodes are locations and edges are atomic statements.
  • Expressions are side-effect free.
  • There are no automatic conversions. All conversions must be by explicit casts or other functions. Operations such as numeric addition (add) require that both operands have the exact same type.
  • In general, symbols can be used before they are defined, as long as they are in scope. For example, a function f can call g, even if g is defined after f in the same scope. There is no need for a "prototype". Similarly, a type definition can refer to a type defined later, or can refer to itself in its own definition.
  • Many of the core operations are implemented as functions in the standard library. This is similar to C. The header files for the standard library are an integral part of the language.
  • Evaluation of expressions and execution of statements should always result in a value or state. There is no concept of exception or error. Hence anything that should be checked should be checked by explicit assertions.

Questions

  1. Do variables have default initial values?
    • No, a declared variable must be initialized before it is used. Otherwise, the behavior is undefined.
  2. How do you initialize a variable?
    • By assigning a value to it. For example n=$new($int); will assign n an arbitrary integer, while n=0; will assign the integer 0 to n.
  3. How is an array allocated?
    • An array variable a is declared with a decl such as T a[];, and then a statement such as a=$new(T[n]); will assign to a a new arbitrary array value for an array of length n of elements of type T. For heap-allocation, a pointer is declared with a decl such as T * p;, and a heap variable is also declared somewhere with a decl such as $heap heap; and then a statement such as p = $alloc(&heap, n, T); will add a new object to the heap and return a pointer to the first element. An $alloc-ed object can be deallocated with $free(p);.
  4. Is there an "array-pointer pun", as in C?
    • No, if you want a pointer to element 0 of an array, you have to explicitly write something like &a[0].
  5. How to you translate between sequences and arrays?
    • There are functions in the standard library to do that.
  6. Can you make types values? (reification)
    • For now, no. There are several statements and expressions which do take a type name as an argument, but there is no way for one to define new expressions or functions that do that.
  7. How do you monitor reads and writes?
    • There are functions in the standard library (mem.h) for this.
  8. Is there a type for $state ?
    • Coming soon.
  9. How do you model malloc when the element type is not known?
    • For now, you can't. [Note to self: consider creating a type like $byte for this purpose?]
  10. How do you iterate over a domain?
    • Coming soon.
  11. Bitvectors?
    • Coming soon.

Grammar

program: typedef* decl* function-definition+ ;
typedef:
  struct ID '{' decl* '}' ';'
  | union ID '{' decl* '}' ';'
  | 'typedef' type-specifier declarator ';'
  ;
decl: qualifier* type-specifier declarator contract-clause* ';' ;
function-definition: qualifier* type-specifier declarator contract-clause* block ;
qualifier:
    '$input'  /* input variable; only for global decls */
  | '$output'  /* output variable; only for global decls */
  | '$abstract_f'  /* abstract function; only for function decls */
  | '$pure_f'  /* function is a mathematical function of its parameters */
  | '$state_f'  /* function is a mathematical function of the state */
  | '$atomic_f'  /* function invocations take place atomically */
  | '$system_f' '<' STRING ',' STRING '>'  /* function is defined in system code elsewhere */
  ;
block: '{' typedef* decl* function-definition* statement* '}' ;
statement: block | simpleStmt | chooseStmt ;
simpleStmt: label? dependency? guard? primitiveStmt gotoStmt? ;
chooseStmt: label? '$choose' '{' simpleStmt* '}' ;
label: ID ':' ;
guard: '$when' '(' expr ')' ;
dependency: '$depends_on' '(' expression-list? ')'
expression-list: expr (',' expr)* ;
gotoStmt: 'goto' ID ;
INT: ... /* integer constant */
ID: ... /* identifier */
STRING: ... /* string literal in double quotes */

Notes

  • A declaration declares a variable to have either an object type or a function type. (An object type is any type that is not a function type.)
  • A variable of function type represents either an abstract or a system function. The declaration of such a variable must use either the qualifier $abstract_f or $system_f. Moreover, those two qualifiers can only be used in a declaration of function type.
  • An abstract function represents a new uninterpreted pure function.
  • A system function has no definition in the program, but is instead defined elsewhere (for example, in C or Java code). Such a function will always be executed atomically. The first string specifies a path (e.g., Java package) to the library containing the function, the second is the name of the library. These two Strings should be enough to tell CIVL where to find the system definition of the function.
  • $atomic_f can only be used in a function definition. It indicates that a defined function is to behave atomically, i.e., every call to such a function will be executed as if in atomic region. (An abstract or system function must necessarily be atomic.)
  • $pure_f can be applied to a system function or defined function only. (An abstract function is necessarily pure.) The use of $pure_f indicates that the function has no side-effects, and the value returned in a mathematical function of the arguments. If the function is not actually pure, the behavior is undefined.
  • $state_f can be applied to a system function or defined function only. It declares that the function has no side-effects, and the value returned is a mathematical function of the state in which the call occurred; i.e., the value returned may depend on the arguments in the call, the values of global variables, or any other component of the state. But, if called twice from the same state, it will always return the same value. If the function does not actually have this property, the behavior is undefined. Any $pure_f function is necessarily a $state_f, so at most one of these two qualifiers can occur in a declaration or definition.
  • Contract clauses can occur only with system functions and defined functions (not with abstract functions, not with variables).
  • A program must contain a function definition for a function named main.
  • $input and $output can be used only on global variable declarations (not on function definitions, not in block scope).
  • The expr in a guard must have type $bool.
  int a[];
  void main() {
    a = (int[10])$lambda(int i; $undefined);
    a = $new(int[10]); // 
    ... initialized a somehow ...
   $assume($defined(a, 0, 10)); // ?

Contracts

contract-clause:
    '$assigns' '(' expression-list? ')'   /* frame condition */
  | '$requires' '(' expr ')'  /* precondition */
  | '$ensures' '(' expr ')'  /* postcondition */
  | '$depends_on' '(' expression-list? ')'  /* dependency specification */
  | '$when '(' ID ')'  /* guard clause */
  ;

Notes

  • A guard clause can occur only in the declaration of a system function. The identifier is the name of the function that will be used as the guard in all calls to the system function. The guard function must have the same input signature as the system function, but must return $bool. Any call to the system function will block unless the guard function returns $true on the arguments used in the call. The execution of the system function and the return of "true" from the guard function will occur atomically, i.e., no state change will occur between the return of true and the call of the system function. The guard function must be a $state_f function; in particular, it must be side-effect free. The guard function may be a system function or a defined function; it cannot be abstract. If omitted, the guard is understood to be "true", i.e., a call to the system function will never block.
  • The expressions in a dependency clause must have pointer type. This declares that the following transition depends only on the objects pointed to.

Types

type-specifier:
    ID  /* typedef use */
  | '$int'  /* mathematical integers */
  | '$bool'  /* boolean type ($true and $false, unrelated to integers) */
  | '$char'  /* character type (Unicode characters, unrelated to integers) */
  | '$real'   /* mathematical reals */
  | '$float' '<' INT ',' INT '>'  /* IEEE floating-point numbers e=significand bits, f=exponent bits */
  | '$herbrand' '<' type-name '>'  /* Herbrand type of non-Herbrand numeric type T */
  | '$proc'  /* process type */
  | '$bundle'  /* bundle type for sequence of any type (same as seq<T>?) */
  | '$heap'  /* heap type, for dynamic allocation */
  | '$mem'  /* set of memory locations */
  | 'enum' ID  /* enumerated type */
  | 'struct' ID  /* structure type */
  | 'union' ID  /* union type */
  | 'void'   /* use as pointer element-type and return type of a function */
  | '$seq' '<' type-name '>'  /* sequence of T */
  | '$set' '<' type-name '>'  /* set of T */
  | '$map' '<' type-name ',' type-name '>'  /* partial function from T1 to T2 */
  | '$rel' '<' type-list '>'  /* relation: set of n-tuples with specified component types */
  ;
declarator '*'* direct-declarator ;
direct-declarator:
    ID  /* variable being declared */
  | direct-declarator '[' expr? ']'  /* array of ... */
  | direct-declarator  '(' type-list? ')'  /* function consuming ... and returning ... */
  | '(' declarator ')'
  ;
type-name: ... /* same as declarator but without the ID */
type-list: type-name (',' type-name)* ;

Notes

  • Sequences, sets, maps, relations, and $mem objects are immutable. An assignment using objects of this type creates a new copy of the object, just as with primitive types like $int.
  • Arrays are similar to sequences. The main differences are as follows:
    • An object (i.e., a variable or component of an object) of array type is initialized once, then will never be assigned to again. Hence there cannot be a statement of the form a=... where a has array type.
    • After initialization, an object or array type can appear only as the first (left) argument to the subscript operator [] or as the argument to the $length operator.
    • Arrays are mutable. As in C, the left hand size of an assignment may have the form a[i], where a is an object of array type. Sequences are immutable.
    • Elements of an array are addressable, i.e., one can form a pointer such as &a[i]. This is not possible with sequences, sets, maps, or relations---there is no way to have a pointer to any component of such a type.
    • A function may neither consume nor return an object of array type. There is no such restriction for sequences.
  • The difference between the function type and map type: a function is really a procedure in the language, so it can modify the state as well as return a value. This is like the C notion of "function". A map is a logical partial function: it is defined on some subset of the domain type, it will always "return" the same value on a given input, and reading it cannot modify the state.

Statements

primitiveStmt:
    ';'  /* noop */
  | lvalue '=' expr ';'  /* assignment */
  | expr '(' arg-list? ')' ';'  /* function call */
  | (lvalue '=')? '$spawn' expr '(' expression-list? ')' ';'  /* process creation */
  | '$wait' expr ';'  /* wait until p terminates */
  | 'return' expr? ';'  /* return from function call */
  | (lvalue '=')? '$alloc' '(' expr ',' expr ',' type-name ')' ';'  /* heap allocation */
  | '$free' expr ';'  /* frees something that was $alloc-ed */
  | '$atomic_enter' ';'  /* enter atomic region */
  | '$atomic_exit' ';'  /* exit atomic region */
  | '$yield' ';'  /* release atomic lock */
  | '$assert' expr (',' expr)* ';'  /* assertion with optional error message */
  | '$assume' expr ';'  /* assumption */
  ;

Notes

  • For function calls and spawns, the first expression shall have a functional type. That function must be a system or defined function (not an abstract function).
  • The first expression following $alloc has type $heap*. It is a pointer to the heap that will be modified by allocating the new memory. The second expression has type $int and is the number of elements being allocated. This is followed by the element type. The function returns a pointer to the first element of an array, similar to C's malloc. It is deallocated using function$free.

Expressions

expr:
  | ID
  | lvalue
  | '$true'
  | '$false'
  | INT
  | REAL
  | FLOAT
  | CHAR
  | STRING
  | 'NULL'
  | '$proc_null'
  | '(' type-name ')' expr  /* cast */
  | '(' type-name ')' '{' expression-list? '}'  /* concrete array, struct, $seq, $set, $mem */
  | '(' type-name ')' '{' expr-pair-list? '}'  /* concrete map */
  | '(' type-name ')' '$lambda' '(' '$int' identifier-list ')' expr  /* array literal, aka array lambda */
  | expr '+' expr  /* numeric or pointer addition */
  | expr '-' expr  /* numeric or pointer subtraction */
  | expr '/' expr  /* division */
  | expr '%' expr  /* modulo */
  | expr '&&' expr  /* logical and */
  | expr '||' expr  /* logical or */
  | expr '==>' expr  /* logical implication */
  | expr '==' expr  /* equality */
  | '!' expr  /* logical not */
  | expr '<' expr  /* less than */
  | expr '<=" expr  /* less than or equal to */
  | expr '[' expr ']'  /* array element access */
  | expr '.' ID  /* field access */
  | '(' expr ')'
  | '-' expr  /* negative */
  | '*' expr  /* pointer dereference */
  | '&' lvalue  /* address-of */
  | $new(type-name)  /* returns a new arbitrary value of the given type */
  | '$forall' '(' decl expr? ')' expr  /* universal quantification */
  | '$exists' '(' decl expr? ')' expr  /* existential quantification */
  | expr '?' expr ':' expr  /* if-then-else expression */
  | '$length' '(' expr ')'  /* length of array */
  ;
expr-pair-list: expr-pair (',' expr-pair)* ;
expr-pair: '{' expr ',' expr '}' ;

Standard CIVL Library

civlc.cvh

Work in progress...

  • round(e,t): returns value in numerical type t that is "closest" to the given value e
    • add argument for rounding mode?
    • exception if out of range?
  • floor(e) : given a real or floating number, returns the greatest Integer less than or equal to it.
  • ceil(e) : given a real or floating number, returns the least Integer greater than or equal to it.
  • abs(e): given any numeric expression e, returns the absolute value; result is same type as e.
  • pow(e,n): power operator
    • given any numeric expression e and expression n of any integral type, returns e to the n-th power.
    • n must evaluate to a nonnegative integer.
  • herbrand(e): "Herbrandize" a numeric value.
    • given a numeric value of type T, returns the value as type Herbrand[T], a trivial symbolic expression consisting of a single node.
  • eval(e) : evaluate a Herbrand expression
    • given a value of type Herbrand[T], returns the value of type T obtained by evaluating all the delayed operations
    • exceptions may be thrown if evaluating any of the delayed operations results in an exception

mem.cvh

Work in progress...

  • mem_reach(ptr), where ptr is an expression with a pointer type.
    • This represents the set of all memory units reachable from ptr, including the memory unit pointed to by ptr itself.
  • mem_union(mem1,mem2), where mem1 and mem2 are expressions of type Mem.
    • This is the union of the two memory sets.
  • mem_isect(mem1,mem1) : set intersection
  • mem_comp(mem1) : set complement (everything not in mem1)
  • mem_slice(a,dom)
    • where a is an expression of array type and dom is an expression of Domain type.
    • The dimension of the array must match the dimension of the domain. This represents all memory units which are the cells in the array indexed by a tuple in dom.

seq.cvh

<T> $int $seq_length( $seq<T> a ); // length of a
<T> T $seq_get( $seq<T> a, $int i ); // get element i of a
<T> $seq<T> $seq_set( $seq<T> a, $int i, T x ); // seq obtained by replacing element i of a with x
<T> $seq<T> $seq_subseq( $seq<T> a, $int start, $int stop ); // subsequence from start to stop-1
<T> $seq<T> $seq_add( $seq<T> a, T x ); // sequence obtained by adding element x to the end of a
<T> $seq<T> $seq_append( $seq<T> a1, $seq<T> a2 ); //  sequence obtained by concatenating a1 and a2
<T> $seq<T> $seq_remove( $seq<T> a, $int i ); // sequence obtained by removing element at position i from a
<T> $seq<T> $seq_insert( $seq<T> a, $int i, T x ); // sequence obtained by inserting element x at position i in a
<T> void $seq_write( $int n, T * ptr, $seq<T> a );  // write the first n elements of a to ptr, ptr+1, ..., ptr+n-1
<T> $seq<T> $seq_read( $int n, T * ptr ); // form a sequence by reading ptr, ptr+1, ..., ptr+n-1

set.cvh

<T> $int $set_size( $set<T> s ); // number of elements in s
<T> $seq<T> $set_seq( $set<T> s ); // elements of s arranged in a sequence
<T> $bool $set_contains( $set<T> s, T x ); // does s contain x as member?
<T> $bool $set_containsAll( $set<T> s1, $set<T> s2 ); // is s2 a subset of s1?
<T> $set<T> $set_add( $set<T> s, T x ); // result of adding x to s
<T> $set<T> $set_union( $set<T> s1, $set<T> s2 ); // union of s1 and s2
<T> $set<T> $set_intersection( $set<T> s1, $set<T> s2 ); // intersection of s1 and s2
<T> $set<T> $set_difference( $set<T> s1, $set<T> s2 ); // elements of s1 not in s2

map.cvh

rel.cvh

concurrency.cvh

void $parspawn($proc * proc_array, $domain d, void (*f)($int, ...));  /* parallel spawn returns immediately*/
void $waitall( $int nprocs, $proc * procs );  /* wait for all procs in list to terminate */

  • In$parspawn *f must have the function type that consumes n $ints, where n is the dimension of the domain. The function is spawned once for each element of the domain. References to the new processes are stored in the process array. The call to $parspawn returns immediately.
Note: See TracWiki for help on using the wiki.