Changes between Version 26 and Version 27 of C to tass-AST xml


Ignore:
Timestamp:
04/18/11 11:50:57 (15 years ago)
Author:
Stephen Siegel
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • C to tass-AST xml

    v26 v27  
    11= Notes on TASS AST and Translation to Model =
    22
    3 Questions:
    4  * How to handle pre-processor macros?   Include in AST?
    5  * How abstract should the AST be?
    6  * Should it contain semantic information, e.g., types, and variables?
    7  * How will it handle things like (foo)*bar: this could be either a cast of *bar to type foo, or it could be the product of foo and bar; you need to know whether foo defines a type, which is some semantic information
    8    * cf. http://www.computing.surrey.ac.uk/research/dsrg/fog/CxxGrammar.y
    9    * approach: just choose one way, then change in later pass when analyzing
    10 
    11 == Elements of a TASS AST ==
    12 
    13 See description of CIL AST:
    14  * http://www.cs.berkeley.edu/~necula/cil/api/
    15  * http://www.cs.berkeley.edu/~necula/cil/api/Cil.html
    16 
    17 
    18 Preprocessing: before an AST is created, the source file(s) are preprocessed to create a stream of tokens with complete source information.  This stream is fed to the parser which creates the AST.
    19 
    20 All AST nodes have source information.
    21 
    22 Types of AST Node
    23  * identifiers
    24     * name: string
    25  * type node
    26     * void
    27     * integer
    28        * integer sub-types, specified by parameters
    29     * real
    30        * real sub-stypes, specified by parameters (e.g., IEEE754 floating point numbers)
    31     * boolean
    32     * char
    33     * array of t
    34        * and possible extent?
    35     * pointer to t
    36     * record ("struct" in C)
    37        * name
    38        * sequence of fields, each with type and name
    39     * function from T1 to T2
    40     * enumeration type
    41        * name
    42        * sequence of elements
    43  * function declaration node (no body)
    44  * function definition node (body)
    45  * type definition node (typedef...)
    46  * statements (may have label)
    47    * assign (translate x+=a, x*=a, ...)
    48    * assert
    49    * assume
    50    * pragma (any kind of pragma, represented as just a string)
    51       * string
    52       * assert, assume, invariant, input, output, ...
    53    * case statement (select)
    54    * if-then, if-then-else
    55    * while
    56    * for
    57    * until
    58    * break
    59    * continue
    60    * goto
    61    * return
    62    * no-op
    63    * block ({...})
    64       * variable declaration section
    65       * sequence of statements
    66    * variable declaration
    67       * with possible initialization expression
    68       * possible array extents information and other information modifying type?
    69       * storage class: automatic, static, extern, ...?
    70    * expressions (side-effect-free)
    71       * literals (including named literals): integers, reals, strings, chars
    72       * variable
    73       * operators
    74          * +,-,*,/,%,<,<=,>,>=,==,!=,!,&&, | |, (x?y:z), bitand, bitor, bitxor, lshift, rshift, & (address-of), * (dereference)
    75        * cast-to-t
    76        * a[i] (array index)
    77        * x.a (record navigation)
    78       * quantifiers: \forall, \exists, \uniform, \sum
    79       * initializers (special kind of expression used to initialize variables in their decl)
    80       * sizeof (type, expression, or string literal)
    81       * startOf(a): &a[0] -- that is the Cil way of converting an array to pointer to first element
    82       * function invocation f(x) when f is abstract (pure) function
    83    * expressions with side-effects
    84       * assignments: x=expr, x++, x--, ++x, --x
    85       * function invocation f(x) when f is concrete and return type of f is non-null
    86 
    87      
    88 
    89 == Processing Stages ==
    90 
    91 The AST can be used to represent the program at different stages of translation.
    92 
    93 Stages
    94  1. The input source file(s) are pre-processed, creating a stream of tokens with source information
    95  1. The token stream is parsed to produce an AST in which all variables, types, etc., are represented simply as identifiers
    96  1. Symbol table information is associated to the identifiers in the AST
    97  1. (Static) types are created and associated to every expression (use the types in the model package?)
    98  1. Variable objects are created and inserted into AST
    99  1. Side effects are removed by introducing temp variables
    100  1. --Local variables are all lifted to outermost scope?-- Scopes will be added to model layer
    101  1. Model is produced
    102 
    103 
    104 Modules
    105  * CPreProcessor: an ANTLR-generated parser that takes source file and produces token stream, applies pre-processing rules, produces new token stream
    106  * CParser: an ANTRL-generated parser that takes stream of sourced tokens from pre-processor and produces AST stage 0
    107  * AST, ASTNode
    1083
    1094== C Example ==