= PIL: Parallel Intermediate Language

== Basic Properties
* PIL programs are target of translators from C, Fortran, MPI/OpenMP/etc. They can also be written by humans.
* PIL should be a high-level language like CIVL-C.
* PIL programs can be easily translated to PFG.
* PIL must also have a totally well-defined semantics and syntax.
* PIL programs can have nested funtion definitions.
* PIL programs can use preprocessor directives like in C
* There are no automatic conversions.  All must be done by explicit casts.
* `$input`/`$output` variables in global scope (like CIVL-C)
* Identifiers are like in C, e.g.,  `x`, `f10`, ...
* Keywords, functions, etc. not in C start with `$` (like CIVL-C) 
* Libraries: similar to C, `#include <stdlib.h>`, but these name PIL libraries and there are additional PIL libraries
* PIL programs can be divided into multiple files (translation units)
 *  One TU can refer to a variable, function, type, in another TU.
 * the variable just needs to be declared somewhere in the TU
 * the function just needs a prototype somewhere in the TU
 * the use of "static" in a variable decl or function def makes it private to that TU (so two can have same name)
 * no need for "extern"
 * at most one of the TUs declaring the variable can have an initializer
 * at most one of the TUs can have a definition for a function
* Program order:
 * a program is a sequence of variable, function, and type definitions.
 * the order is totally irrelevant.
 * a variable can be used anywhere it is in scope
 * a function can be called anywhere it is in scope

== Types
- type names look like `int[]`, `int*[](double)`, etc.
- basic types: same as C: `_Bool`, `char`, `short`, `int`, `long`, `long long`, `unsigned`, `float`, `double`, ...
- what are the limits of these types?  to be decided
- `$int`, `$real` (mathematical types)
- `struct`: as in C
- `union`: as in C
- `T[]`: sequence of `T`.  Note: no "T[n]".  sequences can be assigned, returned, passed as arguments, etc.  
- `$set<T>` : finite set of `T`, ops to add, contains, remove, size, ...
- `$map<T1,T2>` : finite map from `T1` to `T2`, ops to put, get, containsKey, ...
- `$fun<T1,T2>` :  "logic functions":  deterministic, total, side-effect free functions from `T1` to `T2`
- `$tuple<T,...>` : tuples of specified type (similar to struct)
- `$rel<T,...>` : relations of specified type (set of tuples)
- `T*`: pointer to `T`
 -   what can be pointed to: var, array element, struct element, union member,  ...
- Type definitions: `typedef typename ID;`

== Functions

There are two kinds of functions in PIL:
1. Imperative functions = "procedures".  A procedure has a type of the form `R(T1,...,Tn)` where `R` is the return type and the Ti are the input types.
1. Logic functions.  One of these has a type of the form `$fun<R,T1,...,Tn>`.

=== Procedures

Like in C.  These can be nested.  Obviously they can have side-effects, be nondeterministic, and the behavior can depend on non-local state.  A procedure call `f(x1,...,xn)` is an expression that can be used anywhere an expression with side-effects is allowed.  Imperative procedures are not values and cannot be assigned to anything.  However, as in C, a pointer to a procedure is a first-class value that can be assigned, used as an argument, returned, etc.   There is a procedure type, similar to C's function type, e.g.,
{{{
       typedef int(int) i2i;
}}}
defines `i2i` to be the type "procedure consuming int and returning int".  Most commonly, pointers to procedure types will be used, e.g.,
{{{
       typedef int(int)* i2ip;
}}}
defines the type "pointer to procedure from int to int".  As in C, a pointer to a procedure can be used in a procedure call.  It is an error to call a procedure f when f is not in scope.  (This is similar to Gnu C.)   In other words, if the call takes place in dyscope d, then the definition of f must be in d's static scope, or in the parent of d's static scope, or its parent, etc.

A procedure definition can be templated:
{{{     
       <T1,T2,...> <procedure def>
}}}   
This defines one procedure for each assignment of types to the Ti.
     
=== Logic functions

Typically specified by a $lambda expression. These are first-class objects in PIL: one can be assigned to a variable, passed as an argument in a function call (including a procedure call), and can be returned by a function.  Each has a type of the form `$fun<U,T1,...,Tn>` (functions from `T1` x ... x `Tn` to `U`).  They are side-effect-free.  An application of a logic function `f(x1,...,xn)` is a side-effect-free expression that can be used anywhere an expression is allowed.  A logic function is not necessarily pure, i.e., the value of an application may depend on any part of the state, not just the arguments.
  
=== Lambda expressions

There are two kinds of lambda expressions.  The first kind defines a procedure, the second a logical function.

Procedural lambda syntax:
{{{
  $lambda [U1 v1=init1; ... Um vm=initm;] R (T1 x1, ..., Tn xn) { S1; ... }
}}}
where
* the `Ti` and `Uj` are types
* the `xi` and `vj` are variables
* R is a type (the return type), which may be void
* {S1; ...} is a block (same as in function definition)
* the only variables that can occur free in the block are the xi and vj
* if R is not void, the block must return a value of type R

The type of this expression is `R(T1, ..., Tn)`.

Logic function lambda syntax:
{{{
  $lambda [U1 v1=init1; ... Um vm=initm;] R (T1 x1, ..., Tn xn) expr
}}}
where 
* the `Ti` and `Uj` are types
* the `xi` and `vj` are variables
* R is a type (the return type), which cannot be void
* `expr` is a side-effect-free expression of type R 
* the only variables that can occur free in `expr` are the xi and vj

The type of this expression is `$fun<R, T1, ..., Tn>`. 

In both cases, the initializer expressions `initj` are expressions using any variables in scope.

==== Semantics

FIX THIS:  difference between logic and procedural functions...



The lambda expression is evaluated to yield a *closure*:  an ordered pair consisting of a dyscope and an expression.  The dyscope has variables `v1`, ..., `vm`, initialized by evaluating `init1`, ..., `initm`.  The dyscope has no parent scope, it is associated only with the closure.  The expression is `expr`.  The dyscope is destroyed whenever it becomes unreachable, i.e., whenever no part of the state is assigned the closure.

A logic function application `f(e1,...,en)` is evaluated by first evaluating the arguments `ei`, then assigning the `ei` to the `xi` and evaluating expr in the closure dyscope.  Note that since expr has no side effects, the values of the xi are constant.  [Note: a possible change is to allow side-effects in logic functions.]

== Tuples

Non-mutating expressions:
- `t1 == t2`
- `t.i`
- `($tuple<T1,...>){ x1, ... }`
- `$logic $tuple<T1,...> $tuple_write($tuple<T1,...> t, $int i, Ti x);`

Mutating expressions:
- `t.i = x;`

== Sets

Non-mutating expressions:
- `s1 == s2`
- `($set<T>)$empty`   // empty set of type T
- `$logic _Bool $set_in(T x, $set<T> s);`  // is x an element of s?
- `$pure $set<T> $set_with($set<T> s, T x);` // s U {x}
- `$pure $set<T> $set_without($set<T> s, T x);` // s - {x}
- `$pure $set<T> $set_union($set<T> s1, $set<T> s2);` // s1 U s2 
- `$pure $set<T> $set_difference($set<T> s1, $set<T> s2);` // s1-s2 
- `$pure $set_intersection($set<T> s1, $set<T> s2);` // s1 \cap s2
- `$pure T[] $set_elements($set<T> s);`
- `$pure _Bool $set_isSubsetOf($set<T> s1, $set<T> s2);`
- `$pure $set<U> $set_map($set<T> s, $fun<T,U> f);`

Mutating procedures:
- ` _Bool $set_add($set<T> * this, T x);`
- ` _Bool $set_remove($set<T> * this, T x);`
- `void $set_addAll($set<T> * this, $set<T> that);`
- `void $set_removeAll($set<T> * this, $set<T> that);`
- `void $set_keepOnly($set<T> * this, $set<T> that);`

==  Sequences (arrays)

Non-mutating expressions:
- `a1 == a2`
- `a[i]`
- `(T[]){ x1, ... }`

Logic functions:
- `$pure T[] $seq_fun($int len, $fun<$int,T> f);`
- `$pure T[] $seq_uniform($int n, T val);`
- `$pure $int $length(T[] a);` // length of a
- `$pure T[] $seq_write(T[] a, int i, T x);`  // a[i:=x]
- `$pure T[] $seq_subseq(T[] a, int i, int n);`  // a[i..i+n-1]
- `$pure T[] $seq_without(T[] a, int i);` // a with position i removed
- `$pure T[] $seq_with(T[] a, int i, T x);`
- `$pure T[] $seq_concat(T[] a1, T[] a2);`
- `$pure U[] $seq_map(T[] a, $fun<T,U> f);`
- `$pure T[] $seq_filter(T[] a, $fun<T,_Bool> f);`
- `$pure U $seq_foldl(T[] a, $fun<$tuple<T,U>,U> f, U init);`
- `$pure U $seq_foldr(T[] a, $fun<$tuple<T,U>,U> f, U init); `

Mutating expressions:
- `a[i]=x;`

Mutating procedures:
- `T $seq_remove(T[] * this, int i);`
- `void $seq_insert(T[] * this, int i, T x);`
- `void $seq_append(T[] * this, T[] that);`

==  Maps

Non-mutating expressions:
- `m1 == m2`
- `($map<K,V>)$empty`
- `$pure V $map_get($map<K,V> K key);`
- `$pure _Bool $map_containsKey($map<K,V> map, K key);`
- `$pure _Bool $map_containsValue($map<K,V> map, V val);`
- `$pure $map<K,V> $map_with($map<K,V> map, K key, V val);`
- `$pure $map<K,V> $map_without($map<K,V> map, K key);`
- `$pure $set<K> $map_keys($map<K,V> map);`
- `$pure $set<$tuple<K,V>> $map_entries($map<K,V> map);`

Mutating procedures:
- `V $map_put($map<K,V> * this, K key, V val);`
- `V $map_remove($map<K,V> * this, K key);`
- `void $map_removeAll($map<K,V> * this, $set<K> keys);`
- `void $map_addAll($map<K,V> * this, $map<K,V> that);`


== Heaps

There is a single heap for dynamic allocation.  The following built-in procedures are provided:

- `T* $new<T>()`: creates an object `A` of type `T` in heap and returns `&A`.
- `T* $alloc<T>(int n)`: creates an array `A` of length `n` of `T` in heap and returns `&A[0]`.
- `void $free<T>(T* p)`: frees the object referred to by `p`, the pointer returned by an earlier call to `$new` or `$alloc`.


== Questions

* how to deal with "undefined" values?
* can there be nondeterministic expressions, i.e., expressions that evaluate to a set of values instead of one value?
* how to implement C's malloc?

Ideas on malloc:

Union of all types occurring in program:
{{{
typedef union {
  T1 t1;
  T2 t2;
  ...
} BigUnion;
}}}

A call to malloc returns a memory block:
  - size (in bytes) (int)
  - num_elements (int)
  - sequence of tuples:
    - offset (int)
    - size (int)
    - value (type `BigUnion`)