= PIL: Parallel Intermediate Language == Basic Properties * PIL programs are target of translators from C, Fortran, MPI/OpenMP/etc. They can also be written by humans. * PIL should be a high-level language like CIVL-C. * PIL programs can be easily translated to PFG. * PIL must also have a totally well-defined semantics and syntax. * PIL programs can have nested funtion definitions. * PIL programs can use preprocessor directives like in C * There are no automatic conversions. All must be done by explicit casts. * `$input`/`$output` variables in global scope (like CIVL-C) * Identifiers are like in C, e.g., `x`, `f10`, ... * Keywords, functions, etc. not in C start with `$` (like CIVL-C) * Libraries: similar to C, `#include `, but these name PIL libraries and there are additional PIL libraries * PIL programs can be divided into multiple files (translation units) * One TU can refer to a variable, function, type, in another TU. * the variable just needs to be declared somewhere in the TU * the function just needs a prototype somewhere in the TU * the use of "static" in a variable decl or function def makes it private to that TU (so two can have same name) * no need for "extern" * at most one of the TUs declaring the variable can have an initializer * at most one of the TUs can have a definition for a function * Program order: * a program is a sequence of variable, function, and type definitions. * the order is totally irrelevant. * a variable can be used anywhere it is in scope * a function can be called anywhere it is in scope == Types - type names look like `int[]`, `int*[](double)`, etc. - basic types: same as C: `_Bool`, `char`, `short`, `int`, `long`, `long long`, `unsigned`, `float`, `double`, ... - what are the limits of these types? to be decided - `$int`, `$real` (mathematical types) - `struct`: as in C - `union`: as in C - `T[]`: sequence of `T`. Note: no "T[n]". sequences can be assigned, returned, passed as arguments, etc. - `$set` : finite set of `T`, ops to add, contains, remove, size, ... - `$map` : finite map from `T1` to `T2`, ops to put, get, containsKey, ... - `$fun` : "logic functions": deterministic, total, side-effect free functions from `T1` to `T2` - `$tuple` : tuples of specified type (similar to struct) - `$rel` : relations of specified type (set of tuples) - `T*`: pointer to `T` - what can be pointed to: var, array element, struct element, union member, ... - Type definitions: `typedef typename ID;` == Functions There are two kinds of functions in PIL: 1. Imperative functions = "procedures". A procedure has a type of the form `R(T1,...,Tn)` where `R` is the return type and the Ti are the input types. 1. Logic functions. One of these has a type of the form `$fun`. === Procedures Like in C. These can be nested. Obviously they can have side-effects, be nondeterministic, and the behavior can depend on non-local state. A procedure call `f(x1,...,xn)` is an expression that can be used anywhere an expression with side-effects is allowed. Imperative procedures are not values and cannot be assigned to anything. However, as in C, a pointer to a procedure is a first-class value that can be assigned, used as an argument, returned, etc. There is a procedure type, similar to C's function type, e.g., {{{ typedef int(int) i2i; }}} defines `i2i` to be the type "procedure consuming int and returning int". Most commonly, pointers to procedure types will be used, e.g., {{{ typedef int(int)* i2ip; }}} defines the type "pointer to procedure from int to int". As in C, a pointer to a procedure can be used in a procedure call. It is an error to call a procedure f when f is not in scope. (This is similar to Gnu C.) In other words, if the call takes place in dyscope d, then the definition of f must be in d's static scope, or in the parent of d's static scope, or its parent, etc. A procedure definition can be templated: {{{ }}} This defines one procedure for each assignment of types to the Ti. === Logic functions Typically specified by a $lambda expression. These are first-class objects in PIL: one can be assigned to a variable, passed as an argument in a function call (including a procedure call), and can be returned by a function. Each has a type of the form `$fun` (functions from `T1` x ... x `Tn` to `U`). They are side-effect-free. An application of a logic function `f(x1,...,xn)` is a side-effect-free expression that can be used anywhere an expression is allowed. A logic function is not necessarily pure, i.e., the value of an application may depend on any part of the state, not just the arguments. === Lambda expressions There are two kinds of lambda expressions. The first kind defines a procedure, the second a logical function. Procedural lambda syntax: {{{ $lambda [U1 v1=init1; ... Um vm=initm;] R (T1 x1, ..., Tn xn) { S1; ... } }}} where * the `Ti` and `Uj` are types * the `xi` and `vj` are variables * R is a type (the return type), which may be void * {S1; ...} is a block (same as in function definition) * the only variables that can occur free in the block are the xi and vj * if R is not void, the block must return a value of type R The type of this expression is `R(T1, ..., Tn)`. Logic function lambda syntax: {{{ $lambda [U1 v1=init1; ... Um vm=initm;] R (T1 x1, ..., Tn xn) expr }}} where * the `Ti` and `Uj` are types * the `xi` and `vj` are variables * R is a type (the return type), which cannot be void * `expr` is a side-effect-free expression of type R * the only variables that can occur free in `expr` are the xi and vj The type of this expression is `$fun`. In both cases, the initializer expressions `initj` are expressions using any variables in scope. ==== Semantics FIX THIS: difference between logic and procedural functions... The lambda expression is evaluated to yield a *closure*: an ordered pair consisting of a dyscope and an expression. The dyscope has variables `v1`, ..., `vm`, initialized by evaluating `init1`, ..., `initm`. The dyscope has no parent scope, it is associated only with the closure. The expression is `expr`. The dyscope is destroyed whenever it becomes unreachable, i.e., whenever no part of the state is assigned the closure. A logic function application `f(e1,...,en)` is evaluated by first evaluating the arguments `ei`, then assigning the `ei` to the `xi` and evaluating expr in the closure dyscope. Note that since expr has no side effects, the values of the xi are constant. [Note: a possible change is to allow side-effects in logic functions.] == Tuples Non-mutating expressions: - `t1 == t2` - `t.i` - `($tuple){ x1, ... }` - `$logic $tuple $tuple_write($tuple t, $int i, Ti x);` Mutating expressions: - `t.i = x;` == Sets Non-mutating expressions: - `s1 == s2` - `($set)$empty` // empty set of type T - `$logic _Bool $set_in(T x, $set s);` // is x an element of s? - `$pure $set $set_with($set s, T x);` // s U {x} - `$pure $set $set_without($set s, T x);` // s - {x} - `$pure $set $set_union($set s1, $set s2);` // s1 U s2 - `$pure $set $set_difference($set s1, $set s2);` // s1-s2 - `$pure $set_intersection($set s1, $set s2);` // s1 \cap s2 - `$pure T[] $set_elements($set s);` - `$pure _Bool $set_isSubsetOf($set s1, $set s2);` - `$pure $set $set_map($set s, $fun f);` Mutating procedures: - ` _Bool $set_add($set * this, T x);` - ` _Bool $set_remove($set * this, T x);` - `void $set_addAll($set * this, $set that);` - `void $set_removeAll($set * this, $set that);` - `void $set_keepOnly($set * this, $set that);` == Sequences (arrays) Non-mutating expressions: - `a1 == a2` - `a[i]` - `(T[]){ x1, ... }` Logic functions: - `$pure T[] $seq_fun($int len, $fun<$int,T> f);` - `$pure T[] $seq_uniform($int n, T val);` - `$pure $int $length(T[] a);` // length of a - `$pure T[] $seq_write(T[] a, int i, T x);` // a[i:=x] - `$pure T[] $seq_subseq(T[] a, int i, int n);` // a[i..i+n-1] - `$pure T[] $seq_without(T[] a, int i);` // a with position i removed - `$pure T[] $seq_with(T[] a, int i, T x);` - `$pure T[] $seq_concat(T[] a1, T[] a2);` - `$pure U[] $seq_map(T[] a, $fun f);` - `$pure T[] $seq_filter(T[] a, $fun f);` - `$pure U $seq_foldl(T[] a, $fun<$tuple,U> f, U init);` - `$pure U $seq_foldr(T[] a, $fun<$tuple,U> f, U init); ` Mutating expressions: - `a[i]=x;` Mutating procedures: - `T $seq_remove(T[] * this, int i);` - `void $seq_insert(T[] * this, int i, T x);` - `void $seq_append(T[] * this, T[] that);` == Maps Non-mutating expressions: - `m1 == m2` - `($map)$empty` - `$pure V $map_get($map K key);` - `$pure _Bool $map_containsKey($map map, K key);` - `$pure _Bool $map_containsValue($map map, V val);` - `$pure $map $map_with($map map, K key, V val);` - `$pure $map $map_without($map map, K key);` - `$pure $set $map_keys($map map);` - `$pure $set<$tuple> $map_entries($map map);` Mutating procedures: - `V $map_put($map * this, K key, V val);` - `V $map_remove($map * this, K key);` - `void $map_removeAll($map * this, $set keys);` - `void $map_addAll($map * this, $map that);` == Heaps There is a single heap for dynamic allocation. The following built-in procedures are provided: - `T* $new()`: creates an object `A` of type `T` in heap and returns `&A`. - `T* $alloc(int n)`: creates an array `A` of length `n` of `T` in heap and returns `&A[0]`. - `void $free(T* p)`: frees the object referred to by `p`, the pointer returned by an earlier call to `$new` or `$alloc`. == Questions * how to deal with "undefined" values? * can there be nondeterministic expressions, i.e., expressions that evaluate to a set of values instead of one value? * how to implement C's malloc? Ideas on malloc: Union of all types occurring in program: {{{ typedef union { T1 t1; T2 t2; ... } BigUnion; }}} A call to malloc returns a memory block: - size (in bytes) (int) - num_elements (int) - sequence of tuples: - offset (int) - size (int) - value (type `BigUnion`)