Consider using a different font, like small caps or sansserif for the name MiniMP.  Define it as a macro so if you decide to change it, you only have to change it in once place.  \minimp

>A model of MiniMP is first given in §4.2, then we describe its semantics in §4.3.
What is a "model" of a programming language?


>To distinguish between mathematical functions and MiniMP functions,
Except, you use FunctionDecl in the grammar.  Why not ProcedureDecl.  Also, what about procedure that doesn't return a value?   The grammar doesn't allow that?  Maybe rephrase like
We use the work \emph{procedure} for the MiniMP language construct, and reserve the word \emph{function} for a mathematical function.  Note that a MiniMP procedure may return a value (must?), and it may modify global variables.

>A MiniMP program is a set of procedures and variables.

>For simplicity, we define that...
For simplicity, we require that local variable declarations precede all statements in a procedure body.

>A MiniMP program is run by a number of processes, each of which executes a copy of the program. There is no shared variable among processes.
A \minimp{} program is executed by specifying a number of processes, each of which executes a separate copy of the program starting from the main procedure.   (This is similar to the usual usage of MPI.)  There are no variables that are shared by multiple processes.

>an unique ID,
a unique ID,
(the choice of a/an goes by sound, not spelling.   When you say "unique" the sound starts with the consonant "y", so you must use "a")

<which is enumerated from 0.
The IDs are the consecutive integers 0, \cdots, n-1, where n is the number of processes.

>PID and NPROCS
PID and NPROCS, respectively.

>through the communicate statements.
... communication statements.?

>The send(buf , dest ) and recv(buf , src) is a pair o
The subject is plural so the verb must be plural (are).  But you can't say two things are a pair.  You can say two things *form* a pair.  I would just say

The statement send(buf, dest) sends the value obtained by evaluating buf to the process with ID dest.
The statement...

However, why are we using buf?   There is no buffer, which would be a region of memory in the C style.  It is simply a value.  I would use expr or something like that.

Fig. 4.1:
I would add the operators for and, or in the list in Operation, since they are probably important.


---

Instead of map, I recommend saying, for any type T, there is a type T[].   If D is the domain of T, the domain of T[] is D^* (refer back to the notation section).

Fig. 4.4 uses operators in a MiniMP program that are not in the grammar.

"let var=expr in expr" should be defined in Notation.

In Fig 7.5, there is a program with procedure "int[] exchange(...) ..."
However, "int[]" is not in the grammar.

In grammar: Instead of FunctionDecl (or ProcedureDecl), why not just Procedure?

Why can't a Compound statement be empty: {} ?   It is useful to have a no-op statement.

Is it worth it to define a subtype of Expression called LExpr (or something like that) to consist of the expressions of the form Identifier and LExpr[Expr]?   Then you can use LExpr in the Assign rule and in the recv rules.

Grammar says "Constants" but it should be "Constant".  Are there array constants?   (Why not)

What do you mean BLANK stands for nothing?   Why not just delete it?

Sometimes in the grammar there is space between the ; and preceding token, sometimes there isn't.
I would always put a space there.

Why is there a blank line after List-of-T?

Why do you have true and false constants, but no boolean type?  I would either add a boolean type, or get rid of true and false.  Either you are going to have a real boolean type and constants, or you are going to the stupid C thing and not have the boolean type.
I wish you would have the boolean type, it is just more logical and MiniMP is a logical language.
You can call it bool.

I would add void type just for defining procedures that don't return a value.


----

In the grammar, is there any reason to factor out Assignment, Cond, Call, While, Return, Compound, Communicate?  They could all just be inlined into the Statement rule.

If a VariableDecl is a Statement, it can occur anywhere a Statement can occur?  Not just at the beginning of the function body?

An Operation usually means something like "+" or "addition", not the expression "a+b".

I don't see the need for the Operation rule.  That can be inlined into the Expression rule.  However, I would factor out bin-op and un-op (binary operators and unary operators), since those are also parameters to your language (along with Type).

You say Identifier is a non-terminal, but isn't it a terminal?

You say Constant is just the integer literals, but this should be another parameter to the language, along with Type BinaryOp and UnaryOp.  But it must include at least integer constants and array literal constants { List-of-Expr}.


----

Fig. 4.2. comments:

In Fig 4.2, are these Programs?  I thought a program had to have a main procedure?   Maybe they don't have to be complete programs.  You can call them partial programs and just say they are missing a main procedure.

You can get rid of the return 0;.

int i, x, y; is not in the grammar.  It should be int i; int x; int y;
Or change the grammar to accommodate this.

Which words are in blue type?    I thought they would be all the keywords in the language, i.e., those
terminal symbols in the grammar.   But then, whe are send, recv, and any not blue?  I think they should be.

I think "dat" should be a parameter to bcast.  Root (proc 0) will call bcast with an array of length NPROCS.  Non-root procs will call bcast with the empty array {} (or any array they want --- it will just be ignored).  The value returned, an int, will be useful to all procs.

Dually, I think gather should consume an int and return an array of int.  On root, the array returned will have length NPROCS.   On non-roots, the array returned will be the empty array (the value returned should just be ignored).   The int argument will be needed from all procs.

Also, I might call them ibcast and igather (i for int).


----

Sec 4.2.

This should start with some intro that you are defining an abstract representation of
a program. (Like an IR in a compiler).

The defn of vocabulary is not very clear.  First, it should be an ordered pair (F, Var), not a set {F, Var}.  Second, after you say it is this ordered pair consisting of these two components, you then talk about more detailed information in the form of Var_f and Var_g.   Is this data part of the vocabularly or not?  If it is part of the vocabularly, then it must have more components than just the two sets F and Var.   Next, Var_g is not a good choice of notation because g might stand for a procedure name (i.e. g is in F), then you don't know if Var_g means the global variables or the local variables of procedure g.   Finally, you say the union of Var_f and Var_g is empty, but you must mean the intersection.   But in addition to the fact that local variables are distinct are from globals, you should also state that local variables of one procedure are distinct from those of other procedures.    My reformulation below corrects these issues:

\begin{definition} % note it is important to use the defn environment to make clear exactly
  % what is part of the defn and what is not, to make it easy to find, and to give it a number
  % so it can be referenced later
  A \emph{\minimp{} vocabularly} % note the term being defined is italicized
  is a tuple $(Procedure, Var, Global, locvar)$, where
  \begin{itemize}
  \item Procedure is a set of \emph{procedure names}
  \item Var is a set \emph{variables}
  \item Global \subseteq Var is the set of \emph{global variables}
  \item	locvar: Procedure -> 2^{Var} is a function which, given a procedure name, returns
        the set of \emph{local variables} for that procedure
  \end{itemize}
  such that
  \begin{itemize}
  \item forall f\in Procedure, locvar(f)\cap Global = \emptyset
  \item forall f,g\in Procedure, f\neq g => locvar(f)\cap locavar(g)=\emptyset
  \end{itemize}	
\end{definition}

I would define all of these component names as macros
\newcommand{\Procedure}{\textsf{Procedure}}
etc.
I prefer sans-serif (textsf) for multi-character symbols like this because I think it just looks
better and makes it clear that this is not a product of separate mathematical symbols P, r, o, c,
etc.

Also, somewhere I would add
\begin{defn}
  Let V be a MiniMP vocabulary.  The set of program expressions over the variables in Var
  is denoted Expr_V.  The set of program expressions of boolean type over Var is denoted
  BExpr_V.
\end{defn}
I think this will be useful later.

>Let Act be a set of atomic actions, each of which is executed in an atomic step. We define that the assignment, return, communicate, call statements and an additional skip action form the Act set.

I think this is good, but the writing is a little confusing.  Also, it might help to repeat here the exact statement forms, even though it is a little repetitive.   Also, the definition of Act depends on the vocabulary V --- these are the variables and function names that can be used in the statements.  Perhaps something like this:

We next define the atomic actions that will be used to label transitions in a program graph representation.

\begin{definition}
  Let V=(Procedure, Var, Global, locvar) be a MiniMP vocabulary.
  The set of \emph{V-actions}, denoted \Act_V, consists of all
  MiniMP assignment, procedure call, return, and communication statements, as well as a special
  action named skip (a no-op action), in which the variables and procedures belong to V.
  Specifically, each action has one of the following forms:
  \begin{enumerate}
  \item x = expr;
  \item x = f(e1, ..., er) and f(e1, ..., er)
  \item	return expr;
  \item send(expr, dest);
  \item recv(expr, src);
  \item recv(expr, any src);
  \item skip
  \end{enumerate}
  where all variables occurring in those statements are in Var, and f is in Procedure.
\end{definition}


> Statements in a program are modeled as edges. An edge is a tuple...
Problem here is that the definition of edge depends on Loc, which has not yet been
defined.  You can't use something before it is defined!
I think the best solution is to fold this into the definition of program graph.

\begin{defn}
Let V=(...) be a MiniMP vocabularly. Let f\in Procedure.   A
\emph{procedure graph} for f is a tuple
\[
G=(Loc, l_0, T)
\]
where
  Loc is a set of \emph{program locations}
  l_0\in\Loc is the \emph{start location}
  T\subseteq Loc \times BExpr_V \times Act_V \times \Loc is the set of
     \emph{local transitions}.
Let  (l, \phi, a, l')\in T.   The location $l$ is called the \emph{origin}
of the transition; $l'$ is the \emph{target}.   The boolean expression
phi is the transition's \emph{guard}, and $a$ is the transition's action.
The variables occurring in the action and guard must all belong to
Global\cup\locvar(f).
\end{defn}

Question: do communication statements have a guard?   Or will the guard for communication
statements always be "true".  If that's right, then add it to the defn..

Explain that the intuition, from location l, the action is blocks until the guard becomes true.  For communication statements, there is an implicit guard .... (I guess sends are always enabled?  But recvs are only enabled when there is a message).   The action is executed which changes the global state, in one atomic step, and control moves to l'.

Then give the example.

Example (Fig. 4.3).  If every transition has a guard, then the figure should reflect
this.  Even if the guard is true.
I thinkt the diagram doesn't look very good.  The arrows are too thick, the fonts are completely different from the rest of the paper, the bold face font looks weird, etc.    I would redo this diagram in latex using one of the many nice packages for this kind of thing.  I will send you some examples shortly.

BTW, are == and != in the MiniMP grammar (check)?

I think you need to explain that there are standard ways to translate the program code for a procedure body to a program graph.   You don't have to get into that in this dissertation.  Refer to some places where this is done or something similar.   Then give some examples (maybe two).  You can also say there is not necessarily one unqiue program graph representation.

I guess the translation part (Edges(C,l,l')) is trying to explain the translation procedure.  I think it needs a little more explanation.   For example, obviously new locations need to be "created" in this procedures ---- where does that happen.  What is the meaning of the arguments l and l' in Edges(C,l,l').   I think with just a little more explanation, this will be nice.

I don't understand why the call statement x=f(...) is translated into two transitions.
You can't have an atomic action x=f(...)?    And what is var_{ret}?     Did I miss that definition?

Definition of MiniMP model:  your defn gives the program graph for main twice: once in fg_main, and once by FG(main).   How about:

...
M=(main, G)
where
   main\in Procedure is the \emph{main procedure} and
   G is a function f\mapsto G_f from Procedure to program graphs such that
      G_f is a program graph for f.


----

More on Sec 4.2...

>A location l in a model always belong to a procedure f. A variable var is said
to be visible to the location l if var ∈ Varf or var ∈ Varg .

Well, you didn't say that in your definition of "model" (that every location belongs to exactly one procedure).  But is it necessary?  The "visibility" of a variable does not depend on location, since there are only two scopes: global, and local to the whole procedure.    So visibility only depends on the procedure.

Sec 4.3: Semantics

This section needs a little introduction to summarize what's coming.

You can say a frame represents an "activation frame" in a call stack.

>A frame is a structure that stores the runtime information of a procedure where a process is currently in. Let Frame be the set of frames.
Grammar.  You can't say "where X is in".    Maybe
A frame is an entry in a process's call stack.   It stores information about the current procedure for that process: the location within the procedure's program graph, and the values of the procedure's local variables.

You can't say "Let Frame be the set of frames" because you haven't defined Frame yet.
Don't use a term before it is defined.

Def. 4.5 (frame): This definition is missing the context.  It should say something like this:

Def. Let V=(...locvar) be a MiniMP vocabulary and M=(main,G)... a model over v.  Let f\in\Procedure.  A \emph{frame for f} is a tuple...

A few other notes.   I don't think using the italicized types for multisymbol identifiers
in math mode looks good.  It is too close to the product.  I would use sansserif for all multisymbol names, and define them as macros.

l (the location) is in what set exactly?  It is a location in the program graph G_f.
Var_f is now \locvar(f)
var_{ret} cannot be a variable, it must be a value (an element of D).  A frame is supposed to represent the current local state --- that consists of values for variables, not the variables.

Also, what value is used for var_{ret} before the function returns (is there UNDEF?).

After the definition of Frame...Now you can define the set of Frames.  However, what exactly do you mean?  The set of frames for one procedure (f)?  The set of frames for all procedures in this MiniMP model?  The set of frames for all procedures in all models?  I think it only makes sense to talk about the set of frames for f, or possibly the union over all f in Procedure of those sets.

Similar comments apply to process state.   I'm not sure I like the use of subscript g in
mem_g because it makes it look like there is some parameter named "g" and mem is a function
of g.    When really g just stands for "global variables".  I would rather get rid of the
subscript and just call this function "globalmem" or something like that.

Also, how do I get the procedure from a frame in the call stack?  Maybe the Frame tuple should have a component (the first component?) which is the Procedure name.

>Let N<n denote a set of n process IDs that are numbered 0, 1, ... , n − 1.
Now that you've put this in the Notation chapter, you can just say
"Let $n$ be a positive integer.  Recall that N<n is the set of all integers between 0 and n-1, inclusive..."

I think you should explain that at this point you are given n and you are going to define
the state of a MiniMP model *with n processes*.   Remind the reader that the model can
be instantiated for any n>0.

Def. 4.7. Again, the context is necessary:
Let V be a MiniMP vocabulary and M a model over V.  Let n>=1.    An \emph{$n$-process
state of $M$} is an (n+1)-ary tuple...

NOTATION: instead of "D", I think you should use "Val" (sansserif) for the set of values.

> s[(p,q)!val]1 (enqueue):
I wouldn't put the footnote (1) at this point.  It looks like it might be part of the expression.  I would just put the text of the footnote after this itemized list.  No need for a footnote.

> s[(p,q)?var]1 (dequeue)
but var can be any LExpr, like a[i], right?

> The evaluation relation [e] ∈ State × P → D evaluates an expression e for a process on a state.
"on" is not the right preposition.  "at a state".

It seems like n (number of procs) should also be a parameter to this function somehow.   The evaluation function for n=2 is different than the evaluation function for n=3.  Maybe it could be written with a subscript:

[[e]]_n(s,p)

Then you can call it the "n-process evaluation function".
You can say you'll leave n out when it is clear, to simplify the notation.

Notation: throughout the dissertation, I think it is better to write
f:X->Y
rather than
f \in X->Y.

Since the evaluation relation is a function, why not call it the "evaluation function".

I'm a little unclear about what the evaluation function does in the undefined case.  You could say that the domain of the function is really a subset of State x P, i.e., the subset where it is defined.  (That's usually what it means when you say something is "not defined" somewhere).  But that's really inconvenient.  Why not just say it could return anything in the undefined case.  It doesn't matter what it returns.  I guess that means there are multiple possible evaluation functions, so you would define it as "an evaluation function" rather than "the".

Shouldn't the definition of the evaluation function be in a definition environment?

>An interpretation, I(s,p,α), for an edge α = (l,g,a,l′), depends on a state s and a process p.
Do you really want to call these things edges?  It sounds wrong.  I would call them transitions, or local transitions.

Since I is a function, please give its signature.  I believe it returns a set of states.

The plain English description of the I function is very good.  Except for some minor grammar issues.

You should say something about how the chan(i,j) holds the buffered messages sent from
process i to process j --- the messages that have been sent but not yet received.

>mem(s,p): the combination of the local memory in the frame top(s,p) and the global memory in procStatep(s)
What does "combination" mean?  This is not a mathemtically defined operation.  Maybe it's the union (if you think of a function as a set of ordered pairs).

>comm(s): the channel function in state s
Why use "comm" when in the defn. you called it "chan"?

>np(s): the number of processes in state s
This doesn't make sense to me.  The number of processes is fixed, so why is it stored in the state?  I think the number of processes should always be specified in any definition.  There is no reason to put it in the state.

>s[x′/x]p: substitutes all x with x′ in top(s, p)
what are x and x'?   I mean, what are their types?  Are they variables, values, ...?

All of these rules are good but a little too imprecise.  Since this is a dissertation, I would just take all the room needed to be precise.  For example, you can write out all the equations that say the other components of the state don't change when you pop p.

>nexts(S,Edge) = {I(s,p,α) | s ∈ S,p ∈ N<np(s),α ∈ Edge}
So nexts(S,Edge) is a set of sets of states?  I don't think that's what you mean.
You want to take the union of the sets I(s,p,a), not pack them into a new set.

> A process cannot execute a deterministic recv action if the message channel, which is identified by the order pair of the sender and the running process, if the channel is empty.
A process cannot execute a deterministic recv action if the message channel, which is identified by the sender and the running process, is empty.

NOTATION: I am beginning to not like \N^{<n}.  Too cumbersome.   Maybe [n]={0,...,n-1} is better.  Define it as a macro so we can change our minds later.

>Let nexts(S,Edge) be the set of states that can be obtained by executing one edge in the set Edge from a state in the set S...
It's a little weird to use Edge for this set of edges, when usually that kind of name is used for the set of all edges.   Maybe E.  But, I don't think you should call these things edges, I think you should call them transitions, so maybe T.

Again, if you are defining a function (in this case, nexts), please show its signature.
nexts: 2^State x 2^Trans -> 2^State.

>The bahaviors of a MiniMP model is described by a state space graph.
behaviors...are.   it is plural

>Let EdgeM be the set of edges translated from a MiniMP model M, a finite state space G for M is a tuple:
This is a run-on sentence.  The comma is actually separating two distinct sentences.   Each sentence has its own noun and verb.   So use a period.

There is a little issue with the definition of state space for undefined operations.   I think you have to fix the evaluation function.  That's part of the context.  Then (given n, ...) the state space is well-defined.

Is there any reason why you want to generalize the initial state s_0?  Why not just make it the state where every process has just called main.  I don't know what the initial values of global variables are supposed to be.

>a is a deterministic action corresponding to the action of α. Moreover, a only involves constants.
what is the defn of a "deterministic action corresponding to α"?
Why can't this just be α?

>For convenience, we write a transition tp to indicate that the process of t is p.
that's a little weird.  It looks like p is a parameter and t_p is a function that given p, returns a transition t_p.

>a G is a directed
"a G": delete "a".

>States(ρ) = ...􏰅{src(t),dest(t)}.
\rho is a sequence, not a set, so t\in\rho is not right.  (If you think of \rho as a set, it is actually a set of ordered pairs of the form (i,t^i).)   This should be union over 0<=i<m of {src(t^i),dest(t^i)}.

So you don't allow infinite executions? (Can't something loop forever?)