wiki: StaticAnalysis

Points-To Analysis

The Flow-Insensitive Representation

  • AssignmentIF : A flow-insensitive representation of a program is a set of AssignmentIF. An AssignmentIF instance represents an assignment in the program which is one of the following forms:
    • Base: lhs = &rhs
    • Simple: lhs = rhs
    • Complex 1: *lhs = rhs
    • Complex 2: lhs = *rhs

For the case of *lhs = *rhs , an auxiliary variable tmp is introduced: tmp = *rhs , *rhs = tmp .

For now, structs and array elements are all invisible for the flow-insensitive representation: so expr.id is abstracted to expr; expr->id is abstracted to *expr; a[I] is abstracted to a; *(p + i), where I is an integer, is abstracted to *p.

For the case of lhs = a op b (or *lhs = a op b), it is abstracted to lhs = a (or *lhs = a) and lhs = b (or *lhs = b)

Intra-Procedural Points-To Analysis

  • Reference : Ben Hardekopf & Calvin Lin "The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code"
  • PointsToGraph : A points-to graph represents an over-approximation of "what pointers are point to" information in a function.
    • In the graph, nodes are the objects that are reachable from the function associated with the points-to graph.
    • The graph also maintains a set of objects, which is the "points-to set" of a node, for every node.
    • In the graph, edges are the subset-of relations between different nodes.
    • A client of PointsToGraph can
      1. ask answers for the question "what a variable points to ?"
      2. add additional points-to information to the graph. Modification to a graph will cause a re-computation of the graph.

Inter-Procedural Points-To Analysis

  • Reference: Emami, Maryam and Ghiya, Rakesh and Hendren, Laurie J. "Context-sensitive inter-procedural points-to analysis in the presence of function pointers"
  • InvocationGraphNode: a node represents a lexical function call in the program. An invocation graph consists of nodes for all lexical function calls in a program.
    • A node n has multiple children, each of which represents a lexical function call in the function body of the function associated with n.
    • A node n also has parent node which represents a lexical call to a function whose body contains the call represented by n.
    • A node n has no child if there is no lexical call in the function body associated with it or it represents a recursive call (i.e., an ancestor of n is associated with the same function as n).
    • Every InvocationGraphNode is associated with a unique instance of a PointsToGraph
    • The inter-procedural analysis traverses the invocation graph and keeps updating the PointsToGraph of each node until a fix-point is reached

Read / Write Analysis

  • A read / write record consists of 1) an entity or an allocation/string; 2) an optional subscript-indices
  • Taking use of the result of the points-to analysis, all collected read / write records satisfy the following conditions:
    • if a record contains a meaningful subscript-indices, then the entity or allocation/string in this record must be an array (allocated object and string can be considered as arrays)
    • for the case of a[x] where a is a pointer, there are a number of records correspond to a[x] each of which contains an entity or allocation/string that is referred by a
    • for now, read / write sets of a function call is over-approximated: a function can read / write everything visible from its function body.

Todo List

  • generalizes the two analysis above and puts them into ABC
  • extends the points-to analysis for being aware of array elements and struct fields
  • extends the read / write analysis with corresponds to the extension of the points-to analysis
  • extends the read / write analysis with more precise analysis for function calls
Last modified 7 years ago Last modified on 05/09/19 15:38:39
Note: See TracWiki for help on using the wiki.