The CIVL-IR language. A program in this language is also known as a "CIVL model". Properties of the language: * the language is not intended to be written by humans; it is an intermediate form constructed by CIVL. However it should be readable to help debug things * the language (and grammar) are subsets of CIVL-C * a CIVL-IR program represents a guarded-transition system explicitly * as in CIVL-C, there are functions, scopes, and functions can be defined in any scope * all blocks (including a function body) consist of the following elements: * a sequence of type definitions * a sequence of variable declarations with no initializers * a sequence of function definitions * a sequence of labeled `$choose` statements. Each clause in the choose statement is a `$when` statement with some guard and a primitive statement, followed by a `goto` statement * an array is declared without any length expression. When it is initialized it can specify length. **Can we specify that elements must appear in the order above?** Or would that create a problem in something like {{{ { $integer x=3*y; $integer a[x+1]; } }}} Example: {{{ $integer f() { $real x; $real y; $float(16,23) z; L1 : $choose { $when (g1) stmt1; goto L2; $when (g2) stmt2; goto L3; } { // begin new scope $real x; L2 : $choose { $when (g3) stmt3; goto L4; ... } } // end new scope ... } // etc. }}} == Types == The types are: * `$type` : dynamic type. This allows a type to be realized as a value, a value of type `$type`. For example, when the type name `int[n]` is evaluated, the result is a value of type `$type`. * `$bool` : boolean type, values are `$true` and `$false` * `$proc` : process type * `$scope` : scope type * `$char` : character type. Alternatively, get rid of this and just use an integer type. * `$mem` : type representing set of memory units * `$bundle` : type representing some un-typed chunk of data * `$heap` : heap type * `$range` : ordered set of integers * `$domain` : ordered set of tuples of integers * `$domain(n)`, n is an integer at least 1; subtype of above in which all tuples have arity n. * `enum` types. * **different from integers or like C?** * `$integer` : the mathematical integers * `$int(lo,hi,wrap)` * lo, hi are integers, wrap is boolean * finite interval of integers [lo,hi]. If `wrap` is true then all operations "wrap", otherwise, any operation resulting in a value outside of the interval results in an exception being thrown. * **Do we want to allow `lo` and `hi` to be any values of type `$integer`, which means they are dynamic types, like complete array types?** * `$hint` : Herbrand integers. Values are unsimplified symbolic expressions. * `$real` : the mathematical real numbers * `$float(e,f)`, e, f are integers, each at least 1. **Same question for e and f as for lo and hi.** * IEEE754 floating point numbers * `$hreal` : Herbrand real numbers. Values are unsimplified symbolic expressions. * `struct(T1,...,Tn)` * structure type with named fields. Names may not seem necessary but if you want a subset of CIVL-C... * **What about bit-widths?** * `union(T1,...,Tn)` : similar to struct * `T[]` : array-of-T * Function * function consuming S1,...,Sn and returning T. T can be void. The actual notation is the horrible C notation. * `void*` : all pointers * `T*` : pointer-to-T, subtype of above Type facts: **Pure types** contain no values anywhere in the type tree. That is, there is no array length expression in the type. The pure types are the static types of the CIVL-IR. Each variable is declared to have some pure type. **Augmented types** include all the pure types plus possible length expressions. A **type name** is a syntactic element that names a (pure or augmented) type Examples include `int[]` and `int[n*m]`. This is the same as in C. The expression `$typefrom(T)`, where `T` is a type name, returns a value of type `$type` representing the type `T`. The expression `$typeof(e)`, where `e` is an expression, returns the type of `e`, a value of type `$type`. The expression `$initval(d)` takes a `$type` value `d` and returns the initial value for an object of type `d`. The initial value of `$integer` and other primitive (non-compound) types is "undefined". The initial value of `$integer[]` is an array of length 0 of `$integer`. The initial value of `$real*` is the undefined pointer to `$real`. The initial value of `$real[10]` is the array of length 10 in which each element is undefined. In general, the initial value of an array of length n is the sequence of length n in which every element is the initial value of the element type of the array. The initial value of a structure is the tuple in which each component is assigned the initial value for its type. Example: {{{ // type definitions struct S { int a[];} // variable decls int n; $type _S; struct S x1; struct S x2; // statements (leaving out the chooses and whens for brevity) n=10; _S=$typefrom(struct S { int a[n]; }; x1=$initval(_S); n=20; x2=$initval(_S); }}} is semantically equivalent to the C code {{{ int n = 10; struct S { int a[n]; }; struct S x1; n=20; struct S x2; }}} == Declarations == Declarations follow the C notation. Function prototypes are considered to be declarations similar to variable declarations. Example of declaration of a function: {{{ $integer f($real x, $bool y); }}} Additional modifiers that may be placed on any of above: * `$pure` : the function has no side effects, but may be nondeterministic * `$abstract`: function is a pure, mathematical function: deterministic function of inputs System functions: * A function declaration which is not abstract and for which no definition is provided is a system function. * If the system function is called anywhere in the program, it must be defined by providing Java code in an Enabler and Executor. Failure to do so will result in an exception. * A system function may modify any memory it can reach. This includes allocating new data on heaps it can reach. * A system function may have an implicit guard. This is specified **how**? * A system function may have an explicit guard. This is specified **how**? Example of a declaration of a system function with guard. {{{ $bool g($real x, $bool y) { ... } $integer f($real x, $bool y) $guard {g}; }}} == Expressions == In the following list of expressions, `e`, `e0`, `e1`, etc., are expressions. `T` is a type name. `t` is an expression of type `$type`. * literals * `$true`, `$false` : values of type `$bool` * 123, -123, 3.1415, etc. : values of type `$integer`, `$int`, `$real`, `$float` * what particular notations for floating values? * 'a', 'b', ... UNICODE? * `(T[]){e0, e1, ...}` : values of type `T[]` * `(S){e0, ...}` : values of type `S` (struct literal) * `e1..e2`, `e1..e2#e3` : values of type `$range` * `($domain){r1,...,rn}` : value of type `$domain(n)` * `"abc"` : string literals: value of type `$char[]` * `$root`, `$here` : values of type `$scope` * `$self`, `$proc_null` : values of type `$proc` * `NULL` : value of type `void*` * variables * `$typefrom(T)` : an expression of type `$type`, the type represented by the type name `T` * `$typeof(e)` : expression of type `$type`, the type of the expression `e` * `$sizeof(t)` : the size of that type. Note that the C `sizeof(expr)` can be translated as `$sizeof($typeof(expr))`. The C `sizeof(T)` can be translated as `$sizeof($typefrom(T))`. * `$initval(t)` : initial value of type `t` * `$defined(e)` * `e1+e2` : addition. One of the following must hold: * `e1` and `e2` have the same numeric type. Note that there are no "automatic conversions" as there are in C. If the original expressions have different types, explicit casts must be inserted. * `e1` has pointer type and `e2` has an integer type. (Alternatively, we could define a separate function `$pointer_add(p,n)`.) * `e1-e2` : subtraction * `e1*e2` : multiplication * `e1/e2` : division * If both are integer types, the result is integer division. Otherwise it is real division. Need to define what happens for negative integers. * `e1%e2` : modulus * `e1[e2]` : array subscript expression. Note that `e1` must have array type, not pointer type. (This is different from C.) If `e1` has pointer type, use `*(e1+e2)` instead. * `*e` : pointer dereference * `&e` : address-of * `!e` : logical not * `-e` : negative * `$cast(e,t)` : casts `e` to a value of type `t` * need to list all of the legal casts and what they mean exactly * cast of integer to array-of-boolean, and vice-versa? * `e1==e2`, `e1!=e2` * `e1&&e2`, `e1||e2` * `e1?e2:e3` * `e1