| Version 78 (modified by , 10 years ago) ( diff ) |
|---|
The CIVL-IR language. A program in this language is also known as a "CIVL model".
Questions:
- fix notation for contracts
- can labels, whens, chooses be optional?
- quotes around variables in declarations?
- quotes around function names in definitions, prototypes?
- is there ever a need for a function prototype? Could variable decl notation be used instead? NO, variable decls are different, those are state variables.
- should we leave our parameter names in abstract and system functions? They are not needed for anything.
Properties of the language:
- the language is not intended to be written by humans; it is an intermediate form constructed by CIVL. However it should be readable to help debug things
- a CIVL-IR program represents a guarded-transition system explicitly
- as in CIVL-C, there are functions, scopes, and functions can be defined in any scope
- all blocks (including a function body) consist of the following elements, in this order:
- a sequence of type definitions
- a sequence of variable declarations with no initializers
- a sequence of function definitions
- a sequence of labeled statements. Each clause in the labeled statement is a
whenstatement with some guard and a primitive statement, followed by agotostatement
- an array is declared without any length expression. When it is initialized it can specify length.
- curly braces are used only to indicate scopes, as in
{ // new scope ... } - parentheses are used to indicate function invocations, as in add(x,y)
- angular brackets are used to delimit tuples or sequences
- square brackets are used to delimit parameters in types
- unlike C, there is no "array-pointer pun". If an array
aneeds to be converted to a pointer, you must use addr(asub(a, 0))`. - there are no automatic conversions. All conversions must be by explicit casts or other functions. Operations such as numeric addition (
add) require that both operands have the exact same type.
Example:
def f(u:Integer, a:Array[Real]): Integer {
x: Real;
y: Real;
z: Float[16,23];
L1 :
when g1 do stmt1; goto L2;
when g2 do stmt2; goto L3;
{ // begin new scope
x: Real;
L2 :
when g3 do stmt3; goto L4;
...
} // end new scope
...
}
// etc.
Example:
{
int x=3*y;
int a[x+1];
}
translates to:
{
x: Integer;
a_t : Dytype[Array[Integer]];
a: Array[Integer];
L1:
when true do ASSIGN "x", mul(3,"y"); goto L2;
L2:
when true do ASSIGN "a_t", dytype(Array[Integer, add("x",1)]); goto L3;
L3:
when true do ASSIGN "a", new("a_t"); goto L4;
L4:
}
Types
The types are:
Bool: boolean type, values aretrueandfalseProc: process typeScope: scope typeChar: character typeBundle: type representing some un-typed chunk of dataHeap: heap typeRange: ordered set of integersDomain: ordered set of tuples of integersDomain[n], n is an integer at least 1; subtype ofDomainin which all tuples have arity n.Enumtypes.- different from integers or like C?
Integer: the mathematical integersInt[lo,hi,wrap]lo,hiare concrete (?) integers,wrapis boolean- finite interval of integers
[lo,hi]. Ifwrapis true then all operations "wrap", otherwise, any operation resulting in a value outside of the interval results in an exception being thrown. - Do we want to allow
loandhito be any values of typeInteger, which means they are dynamic types, like complete array types?
HerbrandInt: Herbrand integers. Values are unsimplified symbolic expressions.Real: the mathematical real numbersFloat[e,f],e,fare concrete integers, each at least 1.- IEEE754 floating point numbers
HerbrandReal: Herbrand real numbers. Values are unsimplified symbolic expressions.Tuple[<T0, T1, ...>]: a tuple type, the Cartesian product ofT0,T1, ...- What about bit-widths?
Union[<T0, T1, ...>]: union type, the disjoint union ofT0,T1, ...Array[T]: arrays of any length whose elements belong to TFunction[<T0,T1,...>,T]: functions consuming T0,T1,... and returning T. T can bevoidto indicate nothing is returned.Mem: type representing a memory set. May be thought of as a set of pointers.Pointer: all pointers, a subtype ofMemPointer[T]: pointer-to-T, subtype ofPointerDytype: the set of all dynamic typesDytype[T]: dynamic types refining T. Values of this type represent dynamic types that refine T. For exampledytype(Array[Integer,24])has typeDytype[Array[Integer]]
Type facts:
Static types are the types assigned to variables in a program statically. A static type contains no values anywhere in the type tree. That is, there is no array length expression in the type. These are the types that are used in declarations. Each variable is declared to have some static type.
Value types (aka dynamic types) are the types associated to values. They include all the static types plus possible length expressions. A value type refines a static type if when you delete the values from the value type you get the static type.
A type name is a syntactic element that names a (static or value) type. Examples of type names include Array[Integer] and Array[Integer,24].
The expression new(t) takes a Dytype and returns the initial value for an object of that type. The initial value of Integer and other primitive (non-compound) types is "undefined". The initial value of Array[Integer] is an array of length 0 of Integer. The initial value of Pointer[Real] is the undefined pointer to Real. The initial value of Array[Real, 10] is the array of length 10 in which each element is undefined. In general, the initial value of an array of length n is the sequence of length n in which every element is the initial value of the element type of the array. The initial value of a tuple type is the tuple in which each component is assigned the initial value for its type.
Example: the C code
int n = 10;
struct S { int a[n]; };
struct S x1;
n=20;
struct S x2;
may be translated as
typedef S=Tuple[<Array[Integer]>];
n: Integer;
S_d: Dytype[S];
x1: S;
x2: S;
L0:
when true do ASSIGN "n", 10; goto L1;
L1:
when true do ASSIGN "S_d", dytype(Tuple[<Array[Integer, "n"]]>); goto L2;
L2:
when true do ASSIGN "x1", new("S_d"); goto L3;
L3:
when true do ASSIGN "n", 20; goto L4;
L4:
when true do ASSIGN "x2", new("S_d"); goto L5;
L5:
Example of self-referential data definition, a linked list:
typedef Node=Tuple[<Integer,Pointer[Node]>];
Expressions
In the following list of expressions, e, e0, e1, etc., are expressions. T is a type name. t is an expression of type Dytype.
Logical
true,false: literal values of typeBoolnot(e): logical notand(e1,e2),or(e1,e2): logical and/or operation. These operators are short-circuiting, which matters because of exception side-effects.implies(e1,e2): logical implication. Short-circuiting.eq(e1,e2),neq(e1,e2): equality/inequality testforall(<i1:T1,i2:T2,...>,e): universal quantification. For all i1 in type T1, i2 in type T2, ..., e2 holds.exists(<i1:T1,i2:T2,...>,e): existential quantification. There is some i1 in type T1, i2 in type T2, ..., such that e holds.
Numeric
- 123, -123, 3.1415, etc. : values of type
Integer,Int,Real,Float. NEED TO BE MORE SPECIFIC add(e1,e2): numeric addition.e1ande2have the same numeric type. Note that there are no "automatic conversions" as there are in C. If the original expressions have different types, explicit casts must be inserted.
sub(e1,e2): subtractionmul(e1,e2): multiplicationdiv(e1,e2): division- If both are integer types, the result is integer division. Otherwise it is real division. Need to define what happens for negative integers.
mod(e1,e2): integer modulusneg(e): negativelt(e1,e2),lte(e1,e2): less than/less than or equal to
Characters and Strings
- 'a', 'b', ... : Char values. UNICODE?
string("abc"): string literals: value of typeArray[Char, n+1], where n is the length of the string (the last element is the character\0)
Ranges and Domains
range(e1,e2,e3): value of typeRange. Ife3is positive, the integers e1, e1+e3, e1+2*e3, ... that are less than or equal toe2. Ife3is negative, the integers e1, e1+e3, e1+2*e3, ... that are greater than or equal toe2. Exception ife3is 0.domain(<r1,...,rn>): value of typeDomain[n], the Cartesian product of the n ranges, with dictionary orderhasnext(dom, <i,j,…>): an expression of boolean type, testing if the domaindomcontains any element after<i,j,...>
Arrays
array(T,<e0,...,en-1>): value of typeArray[T, n], a literal arrayarray(T,n,e): value of typeArray[T,n]in which each of the n elements iseasub(e1,e2): array subscript expression. Note thate1must have array type, not pointer type. (This is different from C.) Ife1has pointer type, usederef(padd(e1, e2))instead.seq_add(a,e): array obtained by adding element e to the end of the array. Original array a is not modified.seq_append(a1,a2): array obtained by concatenating two arrays. Original array not modified.seq_remove(a,i): array obtained by removing element at position i from a. Original array a not modified.bit_and(e1, e2),bit_or(e1, e2),bit_xor(e1, e2),bit_comp(e1): bit-wise operations: arguments are arrays of booleans of equal length.
Tuples
tuple(S,<e0,e1,...>): value of tuple typeS(tuple literal)tsel(e1,i): tuple selection of component i of e1. i must be a literal natural number.
Unions
union_inj(U,i,x):xis inTi, result is in the union typeUunion_sel(U,i,y):yis inU, result is inTi(or exception ifyis not in image of injection fromTi)union_test(U,i,y):yis inU; determines ifyis in the image under injection ofTi. Returns a Boolean.
Pointers and Memory
NULL: value of typevoid*deref(e): pointer dereferenceaddr(e): address-of operatorpadd(e1,e2): pointer addition.e1has pointer type ande2has an integer type or range type. Ife2has integer type the result has pointer type. Otherwise, the result hasMemtype.psub(e1,e2): pointer subtractionmem_reach(ptr), whereptris an expression with a pointer type. This represents the set of all memory units reachable fromptr, including the memory unit pointed to byptritself.mem_union(mem1,mem2), wheremem1andmem2are expressions of typeMemory. This is the union of the two memory sets.mem_isect(mem1,mem1): set intersectionmem_comp(mem1): set complement (everything not inmem1)mem_slice(a,dom), whereais an expression of array type anddomis an expression ofDomaintype. The dimension of the array must match the dimension of the domain. This represents all memory units which are the cells in the array indexed by a tuple indom.
Scopes and Processes
root,here: values of typeScopeself,proc_null: values of typeProc
Other expressions
"x": x is an identifier, naming either a type, variable, or functionsizeof_type(t): the size of the dynamic type t;Integertypesizeof_expr(e): the size of the value of expressione;Integertypenew(t): new (default) value ofDytypetdefined(e): isedefined?Booltypecast(e,T): castseto a value of the named type- need to list all of the legal casts and what they mean exactly
- cast of integer to array-of-boolean, and vice-versa?
- Instead of casts would it be better to have explicit functions for each legal kind of cast?
ite(e1,e2,e3): if-then-else (conditional) expression, likee1?e2:e3in C.call(e0,<e1,...,en>): a function invocation wheree0must evaluate to either an abstract or pure system functionchoose_int(e): nondeterministic choice of integer
The Primitive Statements
ASSERT e, "msg", ...;: assertion with messageASSUME e;ASSIGN e1,e2;CALL f, <e1,...,en>;andCALL e, f, <e1,...,en>;- call to a function which is not abstract and is not a pure system function. The first form has no left-hand-side. The second form assigns the result returned by the call to
e.
- call to a function which is not abstract and is not a pure system function. The first form has no left-hand-side. The second form assigns the result returned by the call to
SPAWN f, <e1,...,en>;andSPAWN e, f, <e1,...,en>;WAIT e;WAITALL e, n;whereeis a pointer to a process reference andnis the number of processes to be waited forALLOCATE e, h, t, e0;ehas typePointerhhas typeHeapthas typeDytypee0has integer type.- Allocates
e0objects of typeton heaph, returning pointer to first element ine - To translate the C
mallocyou first need to figure out the type of the elements being malloced. If the argument to malloc isn, then you first need to insert an assertioneq(mod(n, sizeof_type(t)), 0), and thenALLOCATE e, h, t, div(n, sizeof_type(t)).
FREE p;EVAL e;, whereeis an expression that might contain exceptions (e.g., array index out of bound, division by zero);NOOP;- Is there a need to add annotations for "true" or "false" branch, etc.? If so, we can just make these parameters to the Noop.
RETURN;andRETURN e;ATOMIC_ENTER;ATOMIC_EXIT;PARSPAWN p, d, f;wherepis pointer to process reference,dhasDomaintype andfhasFunctiontype.NEXT dom, <i,j,…>;: domain iterator: updatesi,j,... to be the value of the next tuple indomafter<i,j,...>FOR_ENTER dom;FIX ME
Declarations and Function Definitions
Function prototypes are considered to be declarations similar to variable declarations.
Example of declaration of a system function defined in library Concurrency:
def[lib="Concurrency"] f(x: Real, b: Bool): Float[32,33];
Function modifiers that may be placed in the brackets after def:
pure: the function has no side effects, but may be nondeterministicabstract: function is a pure, mathematical function: deterministic function of inputsatomic: function definition is atomic, and it never blocks ISN'T THIS TRUE OF EVERY SYSTEM FUNCTION? IN WHICH CASE, IS THIS ONLY FOR NON-SYSTEM FUNCTIONS?lib="...": function is a system function defined in specified libraryguard="...":the name of the guard function
System functions:
- A function declaration which is not abstract and for which no definition is provided is a system function.
- If the system function is called anywhere in the program, it must be defined by providing Java code in an Enabler and Executor. Failure to do so will result in an exception.
- A system function may modify any memory it can reach. This includes allocating new data on heaps it can reach.
- A system function may have a guard.
Example of a declaration of a system function with guard.
g(x:Real):Bool { ... }
def[lib="Blah",guard="g"] f(x:Real):Integer;
Function Contracts
- event set expressions:
EventSetExpression : read(MemorySetExpression) | write(MemorySetExpression) | access(MemorySetExpression) | calls(FunctionCallExpression) | \nothing | \everything | ‘(’ EventSetExpression ‘)’ | EventSetExpression + EventSetExpression | EventSetExpression - EventSetExpression | EventSetExpression & EventSetExpression
- depends clause:
depends [condition] { event1, event2, ...}- Example:
depends { access(n) - (calls(inc(MemorySetExpression)) + calls(dec(MemorySetExpression))) } - absence of depends clause:
- Example:
- assigns-or-reads clause
- assigns clause:
assigns [condition] {memory-list} - reads clause:
reads [condition] {memory-list} reads {nothing}impliesassigns {nothing}reads {\nothing}is equivalent to:reads {\nothing} assigns {\nothing}assigns {X}whereX != \nothing, impliesreads {X}assigns {X}is equivalent to:assigns{X} reads{X}- absence:
- absence of
readsclause: there is no assumption about the read access of the function, i.e., the function could read anything - absence of
assignsclause: similar to the absence ofreadsclause
- absence of
reads/assigns {\nothing}doesn’t necessarily means that the function never reads or assigns any variable. The function could still reads/assigns its “local” variables, including function parameters and any variable declared inside the function body.
- assigns clause:
- For an independent function which has
depends {\nothing}, usually we also need to specifyreads{nothing}, for the purpose of reachability analysis.
e.g.,
/* Returns the size of the given bundle b. */
bundle_size(Bundle b; Int)
depends {\nothing}
reads {\nothing}
;
- Example of a function declaration with contracts:
atomic_f sendRecv(Int cmd, Pointer buf; Int) depends [eq(cmd, SEND)] {write(buf)} depends [eq(cmd, RECV)] {access(deref(buf))} assigns [eq(cmd, SEND)] {\nothing} assigns [eq(cmd, RECV)] {deref(buf)} reads {deref(buf)} { L0: when (eq(cmd, SEND)) send(deref(buf), ...); goto L1; when (eq(cmd, RECV)) deref(buf):=recv(...); goto L1; when (and(neq(cmd, SEND), neq(cmd, RECV))) ; goto L1; L1: }
Program
A program consists of a sequence of global variable declarations, which may include declarations annotated by input and output,
followed by a sequence of function declarations and definitions.
inputoutput
Semantics
Semantics issues
- define every possible cast
- define every possible +, etc.
- define every kind of pointer value and casts between pointer types
- casts between pointer and integer types?
Values
Transitions
Libraries
Which libraries require system functions? pointer, ... Should these be called part of the language, or not?
