This web page presents challenge problems for the (EC)2 Challenge Problem session of the 2009 (EC)2 (Exploiting Concurrency Efficiently and Correctly) workshop. The objective is to collect specific problems that illustrate more general, difficult challenges that arise in attempting to develop or verify correct, efficient concurrent programs. Everyone is welcome to contribute problems by emailing ec2-org@googlegroups.com. Even more welcome are solutions to the problems, or ideas or insights concerning them. These may be presented and discussed at the workshop at 4:00 PM on June 26, 2009.
One of the challenging aspects of the problem is the combinatorial blow-up resulting from the different orders in which the workers can return their results to the manager. This can lead to extreme state explosion in explicit state enumeration techniques.
Sequential version:
#include <stdlib.h>
#include <stdio.h>
/* define N, L, M before compiling */
void printMatrix(int numRows, int numCols, double *m) {
int i, j;
for (i = 0; i < numRows; i++) {
for (j = 0; j < numCols; j++) {
printf("%6.1f ", m[i*numCols + j]);
}
printf("\n");
}
printf("\n");
fflush(stdout);
}
void vecmat(double vector[L], double matrix[L][M], double result[M]) {
int j, k;
for (j = 0; j < M; j++)
for (k = 0, result[j] = 0.0; k < L; k++)
result[j] += vector[k]*matrix[k][j];
}
int main(int argc, char *argv[]) {
int i, j, k;
double a[N][L], b[L][M], c[N][M];
/* read a, b somehow */
for (i = 0; i < N; i++) vecmat(a[i], b, c[i]);
printMatrix(N, M, &c[0][0]);
return 0;
}
Parallel version:
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>
/* define N, L, M before compiling */
#define comm MPI_COMM_WORLD
void printMatrix(int numRows, int numCols, double *m) {
int i, j;
for (i = 0; i < numRows; i++) {
for (j = 0; j < numCols; j++) {
printf("%6.1f ", m[i*numCols + j]);
}
printf("\n");
}
printf("\n");
fflush(stdout);
}
void vecmat(double vector[L], double matrix[L][M], double result[M]) {
int j, k;
for (j = 0; j < M; j++)
for (k = 0, result[j] = 0.0; k < L; k++)
result[j] += vector[k]*matrix[k][j];
}
int main(int argc, char *argv[]) {
int rank, nprocs, i, j;
MPI_Status status;
MPI_Init(&argc, &argv);
MPI_Comm_rank(comm, &rank);
MPI_Comm_size(comm, &nprocs);
if (rank == 0) {
double a[N][L], b[L][M], c[N][M], tmp[M];
int count;
/* read a, b somehow */
MPI_Bcast(b, L*M, MPI_DOUBLE, 0, comm);
for (count = 0; count < nprocs-1 && count < N; count++)
MPI_Send(&a[count][0], L, MPI_DOUBLE, count+1, count+1, comm);
for (i = 0; i < N; i++) {
MPI_Recv(tmp, M, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG, comm, &status);
for (j = 0; j < M; j++) c[status.MPI_TAG-1][j] = tmp[j];
if (count < N) {
MPI_Send(&a[count][0], L, MPI_DOUBLE, status.MPI_SOURCE, count+1, comm);
count++;
}
}
for (i = 1; i < nprocs; i++) MPI_Send(NULL, 0, MPI_INT, i, 0, comm);
printMatrix(N, M, &c[0][0]);
} else {
double b[L][M], in[L], out[M];
MPI_Bcast(b, L*M, MPI_DOUBLE, 0, comm);
while (1) {
MPI_Recv(in, L, MPI_DOUBLE, 0, MPI_ANY_TAG, comm, &status);
if (status.MPI_TAG == 0) break;
vecmat(in, b, out);
MPI_Send(out, M, MPI_DOUBLE, 0, status.MPI_TAG, comm);
}
}
MPI_Finalize();
return 0;
}
LockFreeList implementation in Chapter 9
is buggy (i.e. it is not linearizable). The bug was revealed
by students during discussion of the algorithm in my
course. It would be great if a model checker can discover the
bug. The bug can be revealed by a bounded and small instance
of the problem. However, the checker would need to model the
semantics of various Java features in a reasonably detailed
manner.
LockFreeHashTable in Chapter 13 seems
correct. It is wonderfully interesting and complex. In
particular, it allows resizing to proceed concurrently with
search using a nontrivial ordering of keys. The challenge is
first to come up with a compelling manual proof of the
correctness in its most general form. The next step would be
to come up with a more detailed proof in an automated theorem
prover.
A challenging problem is to design programming languages that assure race-freedom by design. Since this seems hard, without severe restrictions on programmability, we could at least try to verify data-race freedom of programs that use relatively simple communication and coordination.
In particular, can we build manual (or automatic) proofs of race-freedom for at least non-trivial "data-parallel" programs? By data-parallel programs, I mean programs that exploit parallelism embarrassingly by computing in parallel on mostly disjoint parts of data. There is a rich class of implementations of concurrent algorithms available (JavaGrande, BarnesHut, Sorting, Triangulation, etc.); what do proofs for data-race freedom on these programs look like?
Here are many general challenges and some specific ones: