| Version 13 (modified by , 10 years ago) ( diff ) |
|---|
Alias Analysis
Goal: help POR to be more precise
Concepts:
Assume that all statements are "atomic" actions, for simplicity.
dyscope: a dynamic instance of a lexical scope, with values assigned to each variable declared in the lexical scope
call frame: a call frame contains a function and a location of the function (and the corresponding dyscope of the location can be obtained)
call stack: a stack of call frames
process: the state of a process includes its call stacks and its current program location (and thus the current dyscope)
state: a state contains a dyscope tree and a number of processes
independent statements: two statements s1 and s2 are said to be independent if for any state S where s1 and s2 are enabled,
s2is still enabled inexec(S, s1)and vice versa; andexec(exec(S, s1), s2) = exec(exec(S, s2), s1)
where exec(S, s) denotes the state resulted by executing statement s at state S.
Two independent statements commute.
ample set: a set of processes at a certain state such that for all processes not in the ample set, all their reachable statements commute with any enabled statement of any process in the ample set
The key of the POR is to find out the minimal ample set for a state. Usually, the dependency of statements
Examples
Let s be a statement, and p be a pointer, let L(s) be the source location of s, let O(p, s) be the set of objects that p may alias to at L(s).
Let LR(s) be the set of locations reachable from s, let OR(p, s) be the set of objects that p may alias to at any location of LR(s).
Note that each statement has a unique source location.
Let a state S be a tuple (p0: s0, p1: s1, ...), which contains the statement that maybe enabled for each process.
For simplicity, the dyscope tree and the call stacks are omitted and we assume that the execution of each process is always deterministic.
Let AMP(S) be the set of possible ample sets of S.
Example 1: decides purely local statements statically
int x=0, y=0;
void f(int *q) {
*q = 1; //s1: would like to know p!=q here
}
int main() {
int *p = &x;//s0'
$proc p1 = $spawn f(&y);
*p = 99; //s0: would like to know p!=q here
$wait(p1);
}
Suppose p0 is the process executes the main function, and p1 is the spawned process that executes f.
Then the alias analysis result of statements s0 and s1 would be:
O(p, s0)={x}
O(q, s1)={y}
and
OR(p, s0)={x}
OR(q, s1)={y}
Suppose the current state is S=(p0: s0, p1:s1), i.e., the enable statement of p0 and p1 is s0 and s1, respectively. Since we know statically that O(p, s0) never intersect with O(q, s1), then we know that s0 and s1 commute. So the ample set at S is either {p0} or {p1}.
In fact, the ample set of any reachable state of this program is always {p0} and {p1}.
For example, if the current state is S=(p0: s0', p1: s1), the POR algorithm looks at all reachable statements of p0 and finds that OR(p0, s0') is {x} and similarly OR(p1, s1) is {y}, thus the {p0} or {p1} is an ample set.
Example 2
Now let's look at a more complicated example which is obtained by modifying Example 1.
int x=0, y=0;
void f(int *q) {
*q = 1; //s1: would like to know p!=q here
q = &x; //s2
*q = 1; //s3
}
int main() {
int *p = &x;//s0'
$proc p1 = $spawn f(&y);
*p = 99; //s0
$wait(p1);
}
alias analysis result:
O(p, s0')={}
OR(p, s0')={x}
O(p, s0)={x}
OR(p, s0)={x}
O(q, s1)={y}
OR(q, s1)={x,y}
O(q, s2)={y}maybe this should be {} because the value of q is going to be changed and s2 never access the memory unit pointed to by q.
OR(q, s2)={x, y}maybe this should be {x}
O(q, s3)={x}
OR(q, s3)={x}
case 1: S=(p0:s0', p1:s1), then we have AMP(S)={{p0}, {p1}}.
rationale: p is local variable of p0 and is never aliased, which implies that {p0} can be an ample set.
For {p1}, since OR(q, s1)={x,y}, and s0' only writes to p, {p1} can be an ample set.
case 2: S=(p0:s0, p1:s1), then {p1} \in AMP(S) but not {p0}.
The current referenced object of p1 at L(s1) is y and y can never be aliased by p0.
case 3: S=(p0:s0, p1:s2), then {p1} \in AMP(S) but not {p0}.
case 4: S=(p0:s0, p1:s3), then {p0, p1} \in AMP(S).
Example 3: decides two independent transitions with thread-specific variables
process-specific variables: process ID, arguments used when creating the process
int x,y,z;
void f(int *q){
*q = 2; // s1
}
int main(){
int *p=&x;
$proc p1=$spawn f(&y);
$proc p2=$spawn f(&z);
*p=1; //s0
$wait(p1);
$wait(p2);
}
process-specific variables: q, let vq be the value of q when the process is created
alias analysis result:
O(p, s0)={x}
OR(p, s0)={x}
O(q, s1)={vq}
OR(q, s1)={vq}
let S=(p0:s0, p1:s1, p2:s1), since vq(p1)=&y, vq(p2)=&z, we can conclude that AMP(S)={{p0}, {p1}, {p2}}, i.e., each process forms an ample set at S.
