How will one mark up a C program for the TASS model extractor? This page is organized by issues that arise. == Input/Output == Question: A real scientific program written in C does not have "input" and "output" variables. Inputs may come from a variety of sources, mainly from files. But preprocessor directives may also be considered input, there may be input from the terminal, or from other sources. Output is also typically written to files or the screen. How do we model the input/output in this case? What should we expect the user to do? Perhaps the user should write a "TASS harness" which wraps the code in some way that specifies the input and output clearly? Ideas, please. Answer: I/O to/from files (including stdout) will be handled by creating an abstract model of the stdio library. This will introduce some shared variables (for the files) and define functions corresponding to the usual ones in stdio.h. There will be no need for user annotations, except for exceptional situations (e.g., to ignore a certain input or output command). Further details are coming. The argv and argc parameters passed to main will become the input variables for the model. There are some ways pre-processor object-type macros can also be treated as inputs, but some thought will have to be put into this. == Assumptions == Question: In MiniMP one writes: {{{ input int N {N>0}; }}} How to do this in C? Answer: with a new pragma inserted at some appropriate place: {{{ #pragma TASS assume N>0 }}} == Non-deterministic select statement == Question: Does not exist in C. Do we need it? == Assertion Language == Question: Some MiniMP expressions do not occur in C, such as {{{forall}}}, {{{exists}}}. For accuracy, there is also the derivative notation. How to express them? Where can they occur? Can we use C's {{{assert}}}? Answer: probably we cannot use C's assert for expressions which are not C expressions. Instead use pragmas: {{{ #pragma TASS assert forall {int i | i>=0 && i=0 }}} == Loop (joint) invariants == Question: how are these expressed in the source code? Answer: These are also expressed as pragmas: {{{ #pragma TASS jointinvariant skew(2:1) i == spec.j && k == spec.s + 1 }}} or something like that. == Library calls == Question: real C programs make lots of calls to library functions. How are these modeled? Answer: Library calls will be modeled by creating abstract versions of the libraries. Syntactically, in the model these will also be function calls just like other function calls. There are three types of functions that can be called: 1. regular function that have an explicit body (locations and statements). These may be user-defined, or be defined in a library. They are treated the same in either case. 2. abstract function: these are modeled just using an uninterpreted symbolic constant function. They have no body and no semantics associated to them. They must be side-effect free. Examples are "sin" and "atanh". These will be declared in our abstract representations of the libraries, e.g.: {{{ abstract real atanh(real x); abstract continuous(\infty) real cos(real x); }}} 3. system level functions. Examples are malloc, free, send. These do not have bodies. Instead, the semantics of each will be defined by a special class in the semantics package that describes how the state is transformed by executing such a function. The function is executed in one atomic step. It is like defining a new atomic statement. == Example == Here is a program that adds the elements in a file. The commandline argument is the filename. Inputs are the filename, the contents of the file, and the parameter B which is an upper bound on the number of elements that will be read. Output is to stdout. Question: How do we annotate this file in such a way that it does not interfere with compilation, but provides all the information we need to build a TASS model? Answer: proposed solution is embedded in the source below... {{{ #include #include #pragma TASS input int B #define B 3 double sum = 0.0; /* arguments to main are inputs */ int main(int argc, char* argv[]) { char *filename; double result = 0.0; FILE *fp; double tmp; int i = 0; if (argc != 2) { printf("Usage: realadder filename\ "); fflush(stdout); return 1; } filename = argv[1]; fp = fopen(filename, "r"); assert(fp); while (1) { int more = fscanf(fp, "%lf", &tmp); #pragma TASS assume tmp>=0.0 if (more == EOF) break; i++; if (i>B) { #pragma TASS ignore printf("Too many numbers! Limit is %d.\ ", B); fflush(stdout); return 2; } sum += tmp; } printf("The sum is %lf\ ", sum); fclose(fp); return 0; } }}} == Pragmas == Question: what are all the pragmas we will need, and what will be there exact syntax and semantics: Answer: Here is a proposed list: * `#pragma TASS assert ` * verify that the expression is true; report an error if it is not * {{{#pragma TASS assume }}} * assume the expression is true, i.e., add the value of this expression to the current path condition * {{{#pragma TASS invariant }}} * this is an (ordinary) loop invariant. Check that it really is an invariant, and use that fact to compute the state after the loop exit * {{{#pragma TASS jointinvariant }}} * this is a joint loop invariant relating this loop to a loop in the specification program. Use to check functional equivalence. * {{{#pragma TASS ignore}}} * ignore the next statement (typically used for I/O statements that are not relevant) * {{{#pragma TASS input }}} * treat the next element as an input variable. Could be a preprocessor directory defining an object-type macro. * `#pragma TASS system` * declares the following function to be "system-level" * `#pragma TASS abstract` * declares the following function to be abstract (a pure mathematical side-effect free function, like sin(x)) To combine an input with assumption: {{{ #pragma input assume N>0 && N*N}}} * {{{forall { | expr} expr}}} * ditto with {{{exists}}} in place of {{{forall}}} Examples: == LLVM to TASS-AST == Pragmas will not be parsed at source-code compile time. Instead, they will be passed along as annotations to nodes in the abstract syntax tree and the ModelBuilder will use an ANTLR compiler to process individual pragmas. LLVM's clang front-end is building the TASS-abstract syntax tree and handling some of the language-level manipulations such as moving variables to outer scopes. The pragma handler for LLVM's clang front-end compiler allows us to attach these pragmas to a specific node in the abstract syntax tree, which we'll then pass along to the model builder.