StaticAnalysis – CIVL

Context Navigation

Points-To Analysis

The Flow-Insensitive Representation

AssignmentIF : A flow-insensitive representation of a program is a set of AssignmentIF. An AssignmentIF instance represents an assignment in the program which is one of the following forms:
- Base: lhs = &rhs
- Simple: lhs = rhs
- Complex 1: *lhs = rhs
- Complex 2: lhs = *rhs

For the case of *lhs = *rhs , an auxiliary variable tmp is introduced: tmp = *rhs , *rhs = tmp .

For now, structs and array elements are all invisible for the flow-insensitive representation: so expr.id is abstracted to expr; expr->id is abstracted to *expr; a[I] is abstracted to a; *(p + i), where I is an integer, is abstracted to *p.

For the case of lhs = a op b (or *lhs = a op b), it is abstracted to lhs = a (or *lhs = a) and lhs = b (or *lhs = b)

Intra-Procedural Points-To Analysis

Reference : Ben Hardekopf & Calvin Lin "The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code"
PointsToGraph : A points-to graph represents an over-approximation of "what pointers are point to" information in a function.
- In the graph, nodes are the objects that are reachable from the function associated with the points-to graph.
- The graph also maintains a set of objects, which is the "points-to set" of a node, for every node.
- In the graph, edges are the subset-of relations between different nodes.
- A client of PointsToGraph can
  1. ask answers for the question "what a variable points to ?"
  2. add additional points-to information to the graph. Modification to a graph will cause a re-computation of the graph.

Inter-Procedural Points-To Analysis

Reference: Emami, Maryam and Ghiya, Rakesh and Hendren, Laurie J. "Context-sensitive inter-procedural points-to analysis in the presence of function pointers"
InvocationGraphNode: a node represents a lexical function call in the program. An invocation graph consists of nodes for all lexical function calls in a program.
- A node n has multiple children, each of which represents a lexical function call in the function body of the function associated with n.
- A node n also has parent node which represents a lexical call to a function whose body contains the call represented by n.
- A node n has no child if there is no lexical call in the function body associated with it or it represents a recursive call (i.e., an ancestor of n is associated with the same function as n).
- Every InvocationGraphNode is associated with a unique instance of a PointsToGraph
- The inter-procedural analysis traverses the invocation graph and keeps updating the PointsToGraph of each node until a fix-point is reached

Read / Write Analysis

A read / write record consists of 1) an entity or an allocation/string; 2) an optional subscript-indices
Taking use of the result of the points-to analysis, all collected read / write records satisfy the following conditions:
- if a record contains a meaningful subscript-indices, then the entity or allocation/string in this record must be an array (allocated object and string can be considered as arrays)
- for the case of a[x] where a is a pointer, there are a number of records correspond to a[x] each of which contains an entity or allocation/string that is referred by a
- for now, read / write sets of a function call is over-approximated: a function can read / write everything visible from its function body.

Todo List

generalizes the two analysis above and puts them into ABC
extends the points-to analysis for being aware of array elements and struct fields
extends the read / write analysis with corresponds to the extension of the points-to analysis
extends the read / write analysis with more precise analysis for function calls

Last modified 7 years ago Last modified on 05/09/19 15:38:39

Note: See TracWiki for help on using the wiki.

Download in other formats:

Plain Text