| Version 26 (modified by , 15 years ago) ( diff ) |
|---|
Notes on TASS AST and Translation to Model
Questions:
- How to handle pre-processor macros? Include in AST?
- How abstract should the AST be?
- Should it contain semantic information, e.g., types, and variables?
- How will it handle things like (foo)*bar: this could be either a cast of *bar to type foo, or it could be the product of foo and bar; you need to know whether foo defines a type, which is some semantic information
- cf. http://www.computing.surrey.ac.uk/research/dsrg/fog/CxxGrammar.y
- approach: just choose one way, then change in later pass when analyzing
Elements of a TASS AST
See description of CIL AST:
Preprocessing: before an AST is created, the source file(s) are preprocessed to create a stream of tokens with complete source information. This stream is fed to the parser which creates the AST.
All AST nodes have source information.
Types of AST Node
- identifiers
- name: string
- type node
- void
- integer
- integer sub-types, specified by parameters
- real
- real sub-stypes, specified by parameters (e.g., IEEE754 floating point numbers)
- boolean
- char
- array of t
- and possible extent?
- pointer to t
- record ("struct" in C)
- name
- sequence of fields, each with type and name
- function from T1 to T2
- enumeration type
- name
- sequence of elements
- function declaration node (no body)
- function definition node (body)
- type definition node (typedef...)
- statements (may have label)
- assign (translate x+=a, x*=a, ...)
- assert
- assume
- pragma (any kind of pragma, represented as just a string)
- string
- assert, assume, invariant, input, output, ...
- case statement (select)
- if-then, if-then-else
- while
- for
- until
- break
- continue
- goto
- return
- no-op
- block ({...})
- variable declaration section
- sequence of statements
- variable declaration
- with possible initialization expression
- possible array extents information and other information modifying type?
- storage class: automatic, static, extern, ...?
- expressions (side-effect-free)
- literals (including named literals): integers, reals, strings, chars
- variable
- operators
- +,-,*,/,%,<,<=,>,>=,==,!=,!,&&, | |, (x?y:z), bitand, bitor, bitxor, lshift, rshift, & (address-of), * (dereference)
- cast-to-t
- a[i] (array index)
- x.a (record navigation)
- quantifiers: \forall, \exists, \uniform, \sum
- initializers (special kind of expression used to initialize variables in their decl)
- sizeof (type, expression, or string literal)
- startOf(a): &a[0] -- that is the Cil way of converting an array to pointer to first element
- function invocation f(x) when f is abstract (pure) function
- expressions with side-effects
- assignments: x=expr, x++, x--, ++x, --x
- function invocation f(x) when f is concrete and return type of f is non-null
Processing Stages
The AST can be used to represent the program at different stages of translation.
Stages
- The input source file(s) are pre-processed, creating a stream of tokens with source information
- The token stream is parsed to produce an AST in which all variables, types, etc., are represented simply as identifiers
- Symbol table information is associated to the identifiers in the AST
- (Static) types are created and associated to every expression (use the types in the model package?)
- Variable objects are created and inserted into AST
- Side effects are removed by introducing temp variables
- --Local variables are all lifted to outermost scope?-- Scopes will be added to model layer
- Model is produced
Modules
- CPreProcessor: an ANTLR-generated parser that takes source file and produces token stream, applies pre-processing rules, produces new token stream
- CParser: an ANTRL-generated parser that takes stream of sourced tokens from pre-processor and produces AST stage 0
- AST, ASTNode
C Example
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
double result = 0.0;
int n = atoi(argv[1]);
int i;
double a[n];
FILE *fp = fopen("data","r");
for (i=0; i<n; i++) fscanf(fp, "%lf", &a[i]);
for (i=0; i<n; i++) result += a[i];
printf("%lf",result);
fclose(fp);
return 0;
}
TASS-AST xml:
<tass-ast>
<function name = "main" typeClass="0" type="int" s="include">
<parameters>
<variableDeclaration class="Builtin" type = "int" name="argc"/>
<variableDeclaration class="IncompleteArray" type = "char *[]" name="argv"/>
</parameters>
<body>
<compoundStatement s="adder_spec.c:5:3">
<variableDeclaration class="Builtin" type = "double" name="result">
<init>
<literalExpression s="adder_spec.c:5:19">
<rationalValue>0.000000e+00</rationalValue>
</literalExpression>
</init>
</variableDeclaration>
</compoundStatement>
<compoundStatement s="adder_spec.c:6:3">
<variableDeclaration class="Builtin" type = "int" name="n">
<init>
<evaluatedFunctionExpression>
<function>
<declRefExpression value="atoi" s="adder_spec.c:6:11"/>
</function>
<argument>
<arrayAccessExpression>
<variable>
<declRefExpression value="argv" s="adder_spec.c:6:16"/>
</variable>
<subscript>
<literalExpression s="adder_spec.c:6:21">
<integerValue>1</integerValue>
</literalExpression>
</subscript>
</arrayAccessExpression>
</argument>
</evaluatedFunctionExpression>
</init>
</variableDeclaration>
</compoundStatement>
<compoundStatement s="adder_spec.c:7:3">
<variableDeclaration class="Builtin" type = "int" name="i"/>
</compoundStatement>
<compoundStatement s="adder_spec.c:8:3">
<variableDeclaration class="VariableArray" type = "double []" name="a"/>
</compoundStatement>
<compoundStatement s="adder_spec.c:9:3">
<variableDeclaration class="Pointer" type = "FILE *" name="fp">
<init>
<evaluatedFunctionExpression>
<function>
<declRefExpression value="fopen" s="adder_spec.c:9:14"/>
</function>
<argument>
<literalExpression s="adder_spec.c:9:20">
<stringExpression value ="data">
</literalExpression>
</argument>
<argument>
<literalExpression s="adder_spec.c:9:27">
<stringExpression value ="r">
</literalExpression>
</argument>
</evaluatedFunctionExpression>
</init>
</variableDeclaration>
</compoundStatement>
<forStatement s="adder_spec.c:11:3">
<initializer>
<binaryExpression kind="=">
<leftHandSide>
<declRefExpression value="i" s="adder_spec.c:11:8"/>
</leftHandSide>
<rightHandSide>
<literalExpression s="adder_spec.c:11:10">
<integerValue>0</integerValue>
</literalExpression>
</rightHandSide>
</binaryExpression>
</initializer>
<updater>
<unaryExpression postFix = "true" operator="++" s="adder_spec.c:11:18">
<expression>
<declRefExpression value="i" s="adder_spec.c:11:18"/>
</expression>
</unaryExpression>
</updater>
<condition>
<binaryExpression kind="less_than">
<leftHandSide>
<declRefExpression value="i" s="adder_spec.c:11:13"/>
</leftHandSide>
<rightHandSide>
<declRefExpression value="n" s="adder_spec.c:11:15"/>
</rightHandSide>
</binaryExpression>
</condition>
<body>
<evaluatedFunctionExpression>
<function>
<declRefExpression value="fscanf" s="adder_spec.c:11:23"/>
</function>
<argument>
<declRefExpression value="fp" s="adder_spec.c:11:30"/>
</argument>
<argument>
<literalExpression s="adder_spec.c:11:34">
<stringExpression value ="%lf">
</literalExpression>
</argument>
<argument>
<unaryExpression postFix = "false" operator="&" s="adder_spec.c:11:41">
<expression>
<arrayAccessExpression>
<variable>
<declRefExpression value="a" s="adder_spec.c:11:42"/>
</variable>
<subscript>
<declRefExpression value="i" s="adder_spec.c:11:44"/>
</subscript>
</arrayAccessExpression>
</expression>
</unaryExpression>
</argument>
</evaluatedFunctionExpression>
</body>
</forStatement>
<forStatement s="adder_spec.c:12:3">
<initializer>
<binaryExpression kind="=">
<leftHandSide>
<declRefExpression value="i" s="adder_spec.c:12:8"/>
</leftHandSide>
<rightHandSide>
<literalExpression s="adder_spec.c:12:10">
<integerValue>0</integerValue>
</literalExpression>
</rightHandSide>
</binaryExpression>
</initializer>
<updater>
<unaryExpression postFix = "true" operator="++" s="adder_spec.c:12:18">
<expression>
<declRefExpression value="i" s="adder_spec.c:12:18"/>
</expression>
</unaryExpression>
</updater>
<condition>
<binaryExpression kind="less_than">
<leftHandSide>
<declRefExpression value="i" s="adder_spec.c:12:13"/>
</leftHandSide>
<rightHandSide>
<declRefExpression value="n" s="adder_spec.c:12:15"/>
</rightHandSide>
</binaryExpression>
</condition>
<body>
<compoundAssignOperator operator="+=">
<leftHandSide>
<declRefExpression value="result" s="adder_spec.c:12:23"/>
</leftHandSide>
<rightHandSide>
<arrayAccessExpression>
<variable>
<declRefExpression value="a" s="adder_spec.c:12:33"/>
</variable>
<subscript>
<declRefExpression value="i" s="adder_spec.c:12:35"/>
</subscript>
</arrayAccessExpression>
</rightHandSide>
</body>
</forStatement>
<evaluatedFunctionExpression>
<function>
<declRefExpression value="printf" s="adder_spec.c:13:3"/>
</function>
<argument>
<literalExpression s="adder_spec.c:13:10">
<stringExpression value ="%lf">
</literalExpression>
</argument>
<argument>
<declRefExpression value="result" s="adder_spec.c:13:16"/>
</argument>
</evaluatedFunctionExpression>
<evaluatedFunctionExpression>
<function>
<declRefExpression value="fclose" s="adder_spec.c:14:3"/>
</function>
<argument>
<declRefExpression value="fp" s="adder_spec.c:14:10"/>
</argument>
</evaluatedFunctionExpression>
<returnStatement s="adder_spec.c:15:3">
<expression>
<literalExpression s="adder_spec.c:15:10">
<integerValue>0</integerValue>
</literalExpression>
</expression>
</returnStatement>
</body>
</function>
</tass-ast>
Note:
See TracWiki
for help on using the wiki.
