Subdividing a computation and solving the pieces in parallel is best approached through the following steps, since "the computations of any parallel algorithm can be viewed as a task dependency graph" [1].

1. Create a task-dependency graph.
a. the programmer decomposes the problem into units of computation (tasks) that, when assembled, produce the correct output.
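As a minimal sketch of step 1, the dependency graph can be represented as a mapping from each task to its prerequisites; the task names and structure below are invented for illustration. Simulating a schedule where every ready task runs in the same round shows how the graph constrains execution order:

```python
# Hypothetical task-dependency graph: each node is a task, and a task
# may start only after all of its listed prerequisites have finished.
deps = {
    "load_a": [],                      # no prerequisites
    "load_b": [],
    "multiply": ["load_a", "load_b"],  # needs both inputs
    "reduce": ["multiply"],            # produces the final output
}

def ready_tasks(deps, done):
    """Tasks not yet run whose prerequisites are all finished."""
    return [t for t, pre in deps.items()
            if t not in done and all(p in done for p in pre)]

# Simulate rounds of execution: each round runs everything ready.
done, order = set(), []
while len(done) < len(deps):
    batch = sorted(ready_tasks(deps, done))
    order.append(batch)
    done.update(batch)

print(order)  # [['load_a', 'load_b'], ['multiply'], ['reduce']]
```

The rounds also hint at step 2 below: two tasks run concurrently in the first round, one in each later round.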


2. Identify the degree of concurrency of the entire program; it generally increases as grain size decreases.

a. determine the amount of work per task; this is the weight of each node, and it defines fine- vs. coarse-grained tasks.
b. identify the maximum degree of concurrency, “the maximum number of concurrent tasks in execution at any one time”.
c. identify the average degree of concurrency, “the average number of tasks that can run concurrently over the entire execution of the program”.
d. avg. degree can be obtained via a critical path assessment; the critical path is "the longest directed path between any pair of start and finish nodes". The sum of the node weights along this path is the critical path length, and the average degree of concurrency equals the total amount of work divided by the critical path length.
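The critical-path calculation in 2d can be sketched for a small weighted DAG; the graph and node weights here are hypothetical:

```python
# Hypothetical weighted task-dependency DAG: weights[t] is the amount
# of work in task t, deps[t] lists the tasks t depends on.
weights = {"t1": 10, "t2": 10, "t3": 10, "t4": 10, "t5": 20}
deps = {"t1": [], "t2": [], "t3": ["t1"], "t4": ["t2"], "t5": ["t3", "t4"]}

def critical_path_length(deps, weights):
    """Length of the heaviest directed path through the DAG."""
    memo = {}
    def longest(t):  # heaviest path ending at task t (inclusive)
        if t not in memo:
            memo[t] = weights[t] + max((longest(p) for p in deps[t]), default=0)
        return memo[t]
    return max(longest(t) for t in deps)

total_work = sum(weights.values())   # 60
cpl = critical_path_length(deps, weights)
print(cpl)               # 40  (t1 -> t3 -> t5: 10 + 10 + 20)
print(total_work / cpl)  # 1.5 = average degree of concurrency
```

With 60 units of total work and a critical path of 40, no schedule finishes in fewer than 40 units, so on average at most 1.5 tasks are ever running at once.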

3. Identify the task-interaction/communication graph, a "pattern for task interaction".

a. node weights can be set proportional to the amount of computational work a task performs, and edge weights proportional to the amount of interaction along that edge.
b. an undirected graph whose edge set is a superset of the edge set of the task-dependency graph, since every dependency implies an interaction.
c. good candidate for experimentation!
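One way to experiment with 3a-3b is to start the interaction graph from the (undirected) dependency edges and then add interaction-only edges, e.g. for tasks that exchange data while running concurrently. The edges below are hypothetical:

```python
# Hypothetical directed dependency edges (source must finish before
# destination starts).
dep_edges = {("t1", "t3"), ("t2", "t4"), ("t3", "t5"), ("t4", "t5")}

# The interaction graph is undirected, so store edges as frozensets.
# Start from the dependency edges...
interaction = {frozenset(e) for e in dep_edges}
# ...then add interactions with no precedence relation, e.g. t3 and t4
# exchanging boundary data while running concurrently.
interaction.add(frozenset(("t3", "t4")))

# The interaction edge set is a superset of the dependency edge set.
assert {frozenset(e) for e in dep_edges} <= interaction
print(len(interaction))  # 5
```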


4. Measure communication costs between tasks.

a. determine the diameter and cost of the network; for a fat tree (built on a complete binary tree), diameter = 2 log2((p+1)/2), where p = number of nodes, and cost (number of links) = p - 1.
b. comm_cost = Ts + Tn + Tw, where Ts = startup time, Tn = per-hop node latency, Tw = per-word transfer time.
c. Tw = 1/r, where r = channel bandwidth (words/second).
d. T_comm = Ts + Tw*m, where m = length of message in words.
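The formulas in 4a-4d can be turned into small helper functions; the parameter values passed below (startup time, bandwidth, node counts) are invented for illustration, and real values come from benchmarking the target machine:

```python
import math

def fat_tree_diameter(p):
    """Diameter = 2*log2((p+1)/2) for a tree with p nodes."""
    return 2 * math.log2((p + 1) / 2)

def per_word_time(r):
    """Tw = 1/r, where r is channel bandwidth in words/second."""
    return 1.0 / r

def t_comm(t_s, t_w, m):
    """T_comm = Ts + Tw*m for an m-word message."""
    return t_s + t_w * m

print(fat_tree_diameter(15))          # 2*log2(8) = 6.0 hops
t_w = per_word_time(r=4.0)            # 0.25 time units per word
print(t_comm(t_s=10.0, t_w=t_w, m=100))  # 10 + 0.25*100 = 35.0
```

Note how, per 4d, the startup term Ts dominates for short messages while the Tw*m term dominates for long ones, which is why aggregating many small messages into one large one usually pays off.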

[1] A. Grama, A. Gupta, G. Karypis, and V. Kumar. Introduction to Parallel Computing, 2nd ed., Addison-Wesley, 2003.