wiki:AliasAnalysis

Version 13 (modified by zmanchun, 10 years ago) ( diff )

--

Alias Analysis

Goal: help POR to be more precise

Concepts:

Assume that all statements are "atomic" actions, for simplicity.

dyscope: a dynamic instance of a lexical scope, with values assigned to each variable declared in the lexical scope

call frame: a call frame contains a function and a location of the function (and the corresponding dyscope of the location can be obtained)

call stack: a stack of call frames

process: the state of a process includes its call stacks and its current program location (and thus the current dyscope)

state: a state contains a dyscope tree and a number of processes

independent statements: two statements s1 and s2 are said to be independent if for any state S where s1 and s2 are enabled,

  1. s2 is still enabled in exec(S, s1) and vice versa; and
  2. exec(exec(S, s1), s2) = exec(exec(S, s2), s1)

where exec(S, s) denotes the state resulted by executing statement s at state S. Two independent statements commute.

ample set: a set of processes at a certain state such that for all processes not in the ample set, all their reachable statements commute with any enabled statement of any process in the ample set

The key of the POR is to find out the minimal ample set for a state. Usually, the dependency of statements

Examples

Let s be a statement, and p be a pointer, let L(s) be the source location of s, let O(p, s) be the set of objects that p may alias to at L(s).
Let LR(s) be the set of locations reachable from s, let OR(p, s) be the set of objects that p may alias to at any location of LR(s).
Note that each statement has a unique source location.
Let a state S be a tuple (p0: s0, p1: s1, ...), which contains the statement that maybe enabled for each process.
For simplicity, the dyscope tree and the call stacks are omitted and we assume that the execution of each process is always deterministic.
Let AMP(S) be the set of possible ample sets of S.

Example 1: decides purely local statements statically

int x=0, y=0;

void f(int *q) {
  *q = 1; //s1: would like to know p!=q here
}

int main() {
  int *p = &x;//s0'

  $proc p1 = $spawn f(&y);
  *p = 99; //s0: would like to know p!=q here
  $wait(p1);
}

Suppose p0 is the process executes the main function, and p1 is the spawned process that executes f.

Then the alias analysis result of statements s0 and s1 would be:

O(p, s0)={x} O(q, s1)={y}

and OR(p, s0)={x} OR(q, s1)={y}

Suppose the current state is S=(p0: s0, p1:s1), i.e., the enable statement of p0 and p1 is s0 and s1, respectively. Since we know statically that O(p, s0) never intersect with O(q, s1), then we know that s0 and s1 commute. So the ample set at S is either {p0} or {p1}.

In fact, the ample set of any reachable state of this program is always {p0} and {p1}. For example, if the current state is S=(p0: s0', p1: s1), the POR algorithm looks at all reachable statements of p0 and finds that OR(p0, s0') is {x} and similarly OR(p1, s1) is {y}, thus the {p0} or {p1} is an ample set.

Example 2

Now let's look at a more complicated example which is obtained by modifying Example 1.

int x=0, y=0;

void f(int *q) {
  *q = 1; //s1: would like to know p!=q here
  q = &x; //s2
  *q = 1; //s3
}

int main() {
  int *p = &x;//s0'

  $proc p1 = $spawn f(&y);
  *p = 99; //s0
  $wait(p1);
}

alias analysis result:

O(p, s0')={}
OR(p, s0')={x}

O(p, s0)={x}
OR(p, s0)={x}

O(q, s1)={y}
OR(q, s1)={x,y}

O(q, s2)={y}maybe this should be {} because the value of q is going to be changed and s2 never access the memory unit pointed to by q.
OR(q, s2)={x, y}maybe this should be {x}

O(q, s3)={x}
OR(q, s3)={x}

case 1: S=(p0:s0', p1:s1), then we have AMP(S)={{p0}, {p1}}. rationale: p is local variable of p0 and is never aliased, which implies that {p0} can be an ample set. For {p1}, since OR(q, s1)={x,y}, and s0' only writes to p, {p1} can be an ample set.

case 2: S=(p0:s0, p1:s1), then {p1} \in AMP(S) but not {p0}. The current referenced object of p1 at L(s1) is y and y can never be aliased by p0.

case 3: S=(p0:s0, p1:s2), then {p1} \in AMP(S) but not {p0}.

case 4: S=(p0:s0, p1:s3), then {p0, p1} \in AMP(S).

Example 3: decides two independent transitions with thread-specific variables

process-specific variables: process ID, arguments used when creating the process

int x,y,z;

void f(int *q){
  *q = 2; // s1
}

int main(){
  int *p=&x;

  $proc p1=$spawn f(&y);
  $proc p2=$spawn f(&z);

  *p=1; //s0
  $wait(p1);
  $wait(p2);  
}

process-specific variables: q, let vq be the value of q when the process is created

alias analysis result:

O(p, s0)={x}
OR(p, s0)={x}

O(q, s1)={vq}
OR(q, s1)={vq}

let S=(p0:s0, p1:s1, p2:s1), since vq(p1)=&y, vq(p2)=&z, we can conclude that AMP(S)={{p0}, {p1}, {p2}}, i.e., each process forms an ample set at S.

Note: See TracWiki for help on using the wiki.