== Points-To Analysis== **The Flow-Insensitive Representation * {{{AssignmentIF}}} : A flow-insensitive representation of a program is a set of {{{AssignmentIF}}}. An {{{AssignmentIF}}} instance represents an assignment in the program which is one of the following forms: * Base: {{{ lhs = &rhs }}} * Simple: {{{ lhs = rhs }}} * Complex 1: {{{ *lhs = rhs }}} * Complex 2: {{{ lhs = *rhs }}} For the case of {{{ *lhs = *rhs }}}, an auxiliary variable {{{tmp}}} is introduced: {{{ tmp = *rhs }}}, {{{ *rhs = tmp }}}. For now, structs and array elements are all invisible for the flow-insensitive representation: so {{{ expr.id }}} is abstracted to {{{expr}}}; {{{expr->id}}} is abstracted to {{{*expr}}}; {{{a[I]}}} is abstracted to {{{a}}}; {{{*(p + i)}}}, where {{{I}}} is an integer, is abstracted to {{{*p}}}. For the case of {{{lhs = a op b}}} (or `*lhs = a op b`), it is abstracted to `lhs = a` (or `*lhs = a`) and `lhs = b` (or `*lhs = b`) ** Intra-Procedural Points-To Analysis * Reference : Ben Hardekopf & Calvin Lin "The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code" * `PointsToGraph` : A points-to graph represents an over-approximation of "what pointers are point to" information in a function. * In the graph, nodes are the objects that are reachable from the function associated with the points-to graph. * The graph also maintains a set of objects, which is the "points-to set" of a node, for every node. * In the graph, edges are the subset-of relations between different nodes. * A client of `PointsToGraph` can 1. ask answers for the question "what a variable points to ?" 2. add additional points-to information to the graph. Modification to a graph will cause a re-computation of the graph. ** Inter-Procedural Points-To Analysis * Reference: Emami, Maryam and Ghiya, Rakesh and Hendren, Laurie J. "Context-sensitive inter-procedural points-to analysis in the presence of function pointers" * `InvocationGraphNode`: a node represents a lexical function call in the program. An invocation graph consists of nodes for all lexical function calls in a program. * A node `n` has multiple children, each of which represents a lexical function call in the function body of the function associated with `n`. * A node `n` also has parent node which represents a lexical call to a function whose body contains the call represented by `n`. * A node `n` has no child if there is no lexical call in the function body associated with it or it represents a recursive call (i.e., an ancestor of `n` is associated with the same function as `n`). * Every `InvocationGraphNode` is associated with a unique instance of a `PointsToGraph` * The inter-procedural analysis traverses the invocation graph and keeps updating the `PointsToGraph` of each node until a fix-point is reached == Read / Write Analysis == * A read / write record consists of 1) an `entity` or an `allocation/string`; 2) an optional `subscript-indices` * Taking use of the result of the points-to analysis, all collected read / write records satisfy the following conditions: * if a record contains a meaningful `subscript-indices`, then the `entity` or `allocation/string` in this record must be an array (allocated object and string can be considered as arrays) * for the case of `a[x]` where `a` is a pointer, there are a number of records correspond to `a[x]` each of which contains an `entity` or `allocation/string` that is referred by `a` * for now, read / write sets of a function call is over-approximated: a function can read / write everything visible from its function body. == Todo List == * generalizes the two analysis above and puts them into ABC * extends the points-to analysis for being aware of array elements and struct fields * extends the read / write analysis with corresponds to the extension of the points-to analysis * extends the read / write analysis with more precise analysis for function calls