wiki:PIL

Context Navigation

Version 5 (modified by siegel, 21 months ago) ( diff )
--

PIL: Parallel Intermediate Language

Basic Properties

PIL programs are target of translators from C, Fortran, MPI/OpenMP/etc. They can also be written by humans.
PIL should be a high-level language like CIVL-C.
PIL programs can be easily translated to PFG.
PIL must also have a totally well-defined semantics and syntax.
PIL programs can have nested funtion definitions.
PIL programs can use preprocessor directives like in C
There are no automatic conversions. All must be done by explicit casts.
$input/$output variables in global scope (like CIVL-C)
Identifiers are like in C, e.g., x, f10, ...
Keywords, functions, etc. not in C start with $ (like CIVL-C)
Libraries: similar to C, #include <stdlib.h>, but these name PIL libraries and there are additional PIL libraries
PIL programs can be divided into multiple files (translation units)
- One TU can refer to a variable, function, type, in another TU.
- the variable just needs to be declared somewhere in the TU
- the function just needs a prototype somewhere in the TU
- the use of "static" in a variable decl or function def makes it private to that TU (so two can have same name)
- no need for "extern"
- at most one of the TUs declaring the variable can have an initializer
- at most one of the TUs can have a definition for a function
Program order:
- a program is a sequence of variable, function, and type definitions.
- the order is totally irrelevant.
- a variable can be used anywhere it is in scope
- a function can be called anywhere it is in scope

Types

type names look like int[], int*[](double), etc.
basic types: same as C: _Bool, char, short, int, long, long long, unsigned, float, double, ...
what are the limits of these types? to be decided
$int, $real (mathematical types)
struct: as in C
union: as in C
T[]: sequence of T. Note: no "T[n]". sequences can be assigned, returned, passed as arguments, etc.
$set<T> : finite set of T, ops to add, contains, remove, size, ...
$map<T1,T2> : finite map from T1 to T2, ops to put, get, containsKey, ...
$fun<T1,T2> : "logic functions": deterministic, total, side-effect free functions from T1 to T2
$tuple<T,...> : tuples of specified type (similar to struct)
$rel<T,...> : relations of specified type (set of tuples)
T*: pointer to T
- what can be pointed to: var, array element, struct element, union member, ...
Type definitions: typedef typename ID;

Functions

There are two kinds of functions in PIL:

Imperative functions = "procedures". A procedure has a type of the form R(T1,...,Tn) where R is the return type and the Ti are the input types.
Logic functions. One of these has a type of the form $fun<R,T1,...,Tn>.

Procedures

Like in C. These can be nested. Obviously they can have side-effects, be nondeterministic, and the behavior can depend on non-local state. A procedure call f(x1,...,xn) is an expression that can be used anywhere an expression with side-effects is allowed. Imperative procedures are not values and cannot be assigned to anything. However, as in C, a pointer to a procedure is a first-class value that can be assigned, used as an argument, returned, etc. There is a procedure type, similar to C's function type, e.g.,

       typedef int(int) i2i;

defines i2i to be the type "procedure consuming int and returning int". Most commonly, pointers to procedure types will be used, e.g.,

       typedef int(int)* i2ip;

defines the type "pointer to procedure from int to int". As in C, a pointer to a procedure can be used in a procedure call. It is an error to call a procedure f when f is not in scope. (This is similar to Gnu C.) In other words, if the call takes place in dyscope d, then the definition of f must be in d's static scope, or in the parent of d's static scope, or its parent, etc.

A procedure definition can be templated:

       <T1,T2,...> <procedure def>

This defines one procedure for each assignment of types to the Ti.

Logic functions

Typically specified by a $lambda expression. These are first-class objects in PIL: one can be assigned to a variable, passed as an argument in a function call (including a procedure call), and can be returned by a function. Each has a type of the form $fun<U,T1,...,Tn> (functions from T1 x ... x Tn to U). They are side-effect-free. An application of a logic function f(x1,...,xn) is a side-effect-free expression that can be used anywhere an expression is allowed. A logic function is not necessarily pure, i.e., the value of an application may depend on any part of the state, not just the arguments.

Lambda expressions

There are two kinds of lambda expressions. The first kind defines a procedure, the second a logical function.

Procedural lambda syntax:

  $lambda [U1 v1=init1; ... Um vm=initm;] R (T1 x1, ..., Tn xn) { S1; ... }

where

the Ti and Uj are types
the xi and vj are variables
R is a type (the return type), which may be void
{S1; ...} is a block (same as in function definition)
the only variables that can occur free in the block are the xi and vj
if R is not void, the block must return a value of type R

The type of this expression is R(T1, ..., Tn).

Logic function lambda syntax:

  $lambda [U1 v1=init1; ... Um vm=initm;] R (T1 x1, ..., Tn xn) expr

where

the Ti and Uj are types
the xi and vj are variables
R is a type (the return type), which cannot be void
expr is a side-effect-free expression of type R
the only variables that can occur free in expr are the xi and vj

The type of this expression is $fun<R, T1, ..., Tn>.

In both cases, the initializer expressions initj are expressions using any variables in scope.

Semantics

FIX THIS: difference between logic and procedural functions...

The lambda expression is evaluated to yield a *closure*: an ordered pair consisting of a dyscope and an expression. The dyscope has variables v1, ..., vm, initialized by evaluating init1, ..., initm. The dyscope has no parent scope, it is associated only with the closure. The expression is expr. The dyscope is destroyed whenever it becomes unreachable, i.e., whenever no part of the state is assigned the closure.

A logic function application f(e1,...,en) is evaluated by first evaluating the arguments ei, then assigning the ei to the xi and evaluating expr in the closure dyscope. Note that since expr has no side effects, the values of the xi are constant. [Note: a possible change is to allow side-effects in logic functions.]

Tuples

Non-mutating expressions:

t1 == t2
t.i
($tuple<T1,...>){ x1, ... }
$logic $tuple<T1,...> $tuple_write($tuple<T1,...> t, $int i, Ti x);

Mutating expressions:

t.i = x;

Sets

Non-mutating expressions:

s1 == s2
($set<T>)$empty empty set of type T
$logic _Bool $set_in(T x, $set<T> s); is x an element of s?
$pure $set<T> $set_with($set<T> s, T x); s U {x}
$pure $set<T> $set_without($set<T> s, T x); s - {x}
$pure $set<T> $set_union($set<T> s1, $set<T> s2); s1 U s2
$pure $set<T> $set_difference($set<T> s1, $set<T> s2); s1-s2
$pure $set_intersection($set<T> s1, $set<T> s2); s1 \cap s2
$pure T[] $set_elements($set<T> s);
$pure _Bool $set_isSubsetOf($set<T> s1, $set<T> s2);
$pure $set<U> $set_map($set<T> s, $fun<T,U> f);

Mutating procedures:

_Bool $set_add($set<T> * this, T x);
_Bool $set_remove($set<T> * this, T x);
void $set_addAll($set<T> * this, $set<T> that);
void $set_removeAll($set<T> * this, $set<T> that);
void $set_keepOnly($set<T> * this, $set<T> that);

Sequences (arrays)

Non-mutating expressions:

a1 == a2
a[i]
(T[]){ x1, ... }

Logic functions:

$pure T[] $seq_fun($int len, $fun<$int,T> f);
$pure T[] $seq_uniform($int n, T val);
$pure $int $length(T[] a); length of a
$pure T[] $seq_write(T[] a, int i, T x); a[i:=x]
$pure T[] $seq_subseq(T[] a, int i, int n); a[i..i+n-1]
$pure T[] $seq_without(T[] a, int i); a with position i removed
$pure T[] $seq_with(T[] a, int i, T x);
$pure T[] $seq_concat(T[] a1, T[] a2);
$pure U[] $seq_map(T[] a, $fun<T,U> f);
$pure T[] $seq_filter(T[] a, $fun<T,_Bool> f);
$pure U $seq_foldl(T[] a, $fun<$tuple<T,U>,U> f, U init);
$pure U $seq_foldr(T[] a, $fun<$tuple<T,U>,U> f, U init);

Mutating expressions:

a[i]=x;

Mutating procedures:

T $seq_remove(T[] * this, int i);
void $seq_insert(T[] * this, int i, T x);
void $seq_append(T[] * this, T[] that);

Maps

Non-mutating expressions:

m1 == m2
($map<K,V>)$empty
$pure V $map_get($map<K,V> K key);
$pure _Bool $map_containsKey($map<K,V> map, K key);
$pure _Bool $map_containsValue($map<K,V> map, V val);
$pure $map<K,V> $map_with($map<K,V> map, K key, V val);
$pure $map<K,V> $map_without($map<K,V> map, K key);
$pure $set<K> $map_keys($map<K,V> map);
$pure $set<$tuple<K,V>> $map_entries($map<K,V> map);

Mutating procedures:

V $map_put($map<K,V> * this, K key, V val);
V $map_remove($map<K,V> * this, K key);
void $map_removeAll($map<K,V> * this, $set<K> keys);
void $map_addAll($map<K,V> * this, $map<K,V> that);

Heaps

There is a single heap for dynamic allocation. The following built-in procedures are provided:

T* $new<T>(): creates an object A of type T in heap and returns &A.
T* $alloc<T>(int n): creates an array A of length n of T in heap and returns &A[0].
void $free<T>(T* p): frees the object referred to by p, the pointer returned by an earlier call to $new or $alloc.

Questions

how to deal with "undefined" values?
can there be nondeterministic expressions, i.e., expressions that evaluate to a set of values instead of one value?
how to implement C's malloc?

Ideas on malloc:

Union of all types occurring in program:

typedef union {
  T1 t1;
  T2 t2;
  ...
} BigUnion;

A call to malloc returns a memory block:

size (in bytes) (int)
num_elements (int)
sequence of tuples:
- offset (int)
- size (int)
- value (type BigUnion)

Note: See TracWiki for help on using the wiki.

Download in other formats:

Plain Text