In other words, asynchronous operations often act like
functions. Of course, tasks can also do other things, such as reordering
values in an array, but calculating new values is common enough to
warrant a pattern tailored to it. It’s also much easier to reason about
pure functions, which don’t have side effects and therefore exist purely
for their results. This simplicity becomes very useful as the number of
cores becomes large.
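As a quick illustration (the function names here are hypothetical), compare a pure function with one that reads and updates shared state:

```csharp
using System;

int total = 0;

// Impure: reads and updates shared state, so concurrent calls can interfere
// and the result depends on the order of earlier calls.
int AddToTotal(int x) { total += x; return total; }

// Pure: the result depends only on the argument, so it can safely run on
// any core at any time.
static int Square(int x) { return x * x; }

Console.WriteLine(Square(5));      // always 25
Console.WriteLine(AddToTotal(5));  // depends on prior calls to AddToTotal
```

Because Square has no side effects, a scheduler is free to run many calls to it in parallel; AddToTotal would require synchronization.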
1. Futures
The following example is from the body of a sequential method.
var b = F1(a);
var c = F2(a);
var d = F3(c);
var f = F4(b, d);
return f;
Suppose that F1, F2, F3, and F4
are processor-intensive functions that communicate with one another
using arguments and return values instead of reading and updating shared
state variables.
Suppose, also, that you
want to distribute the work of these functions across available cores,
and you want your code to run correctly no matter how many cores are
available. When you look at the inputs and outputs, you can see that F1 can run in parallel with F2 and F3 but that F3 can’t start until after F2 finishes. How do you know this? The possible orderings become apparent when you visualize the function calls as a graph. Figure 1 illustrates this.
The nodes of the graph are the functions F1, F2, F3, and F4.
The incoming arrows for each node are the inputs required by the
function, and the outgoing arrows are values calculated by each
function. It’s easy to see that F1 and F2 can run at the same time but that F3 must follow F2.
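For concreteness, the snippets that follow can be exercised with placeholder bodies for F1 through F4. The arithmetic below is purely illustrative; in practice these would be processor-intensive computations:

```csharp
using System;

// Hypothetical stand-ins for the processor-intensive functions F1..F4.
static int F1(int x) { return x + 1; }
static int F2(int x) { return x * 2; }
static int F3(int x) { return x - 3; }
static int F4(int b, int d) { return b + d; }

int a = 10;
var b = F1(a);     // needs only a
var c = F2(a);     // needs only a, so it can overlap with F1
var d = F3(c);     // must wait for F2's result
var f = F4(b, d);  // must wait for both branches
Console.WriteLine(f);
```

The comments trace the data dependencies the figure depicts: the b branch and the c/d branch are independent until F4 joins them.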
The following code shows how to create a future for this
example. For simplicity, it assumes that the values
being calculated are integers and that the value of variable a has already been supplied, perhaps as an argument to the current method.
Note:
The Result property returns a precalculated value immediately or waits until the value becomes available.
Task<int> futureB = Task.Factory.StartNew<int>(() => F1(a));
int c = F2(a);
int d = F3(c);
int f = F4(futureB.Result, d);
return f;
This code creates a future that begins to asynchronously calculate the value of F1(a). On a multicore system, F1 will be able to run in parallel with the current thread. This means that F2 can begin executing without waiting for F1. The function F4 will execute as soon as the data it needs becomes available. It doesn’t matter whether F1 or F3 finishes first, because the results of both functions are required before F4 can be invoked. (Recall that the Result property does not return until the future’s value is available.) Note that the calls to F2, F3, and F4
do not need to be wrapped in futures, because a single
additional asynchronous operation is all that is needed to take
advantage of the parallelism in this example.
Of course, you could equivalently have put F2 and F3 inside a future instead, as shown here.
Task<int> futureD = Task.Factory.StartNew<int>(
    () => F3(F2(a)));
int b = F1(a);
int f = F4(b, futureD.Result);
return f;
It doesn’t matter which branch of the task graph shown in the figure runs asynchronously.
An important point of this example is that exceptions that occur during the execution of a future are deferred and rethrown, wrapped in an AggregateException, when the Result property is read. This makes exception handling straightforward, even in cases with many futures and complex chains of continuation
tasks. You can think of futures as either returning a result or
throwing an exception. Conceptually, this is very similar to the way any
.NET function works. Here is an example.
Note:
Futures, as implemented by the Task<TResult> class, defer exceptions until the Result property is read.
Task<int> futureD = Task.Factory.StartNew<int>(
    () => F3(F2(a)));
try
{
    int b = F1(a);
    int f = F4(b, futureD.Result);
    return f;
}
catch (AggregateException ae)
{
    if (ae.InnerException is MyException)
    {
        Console.WriteLine("Saw MyException exception");
        return -1;
    }
    throw;
}
If an exception of type MyException were thrown in F2 or F3, it would be deferred and rethrown, wrapped in an AggregateException, when the Result property of futureD is read. Because the read of the Result property occurs within a try block, the wrapped exception can be examined and handled in the corresponding catch block.
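Here is a self-contained sketch of the deferral. The failing body of F2 is hypothetical, and InvalidOperationException plays the role of the MyException type in the text:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins: F2 always fails; F3 is never reached.
// InvalidOperationException plays the role of MyException from the text.
static int F2(int x) { throw new InvalidOperationException("F2 failed"); }
static int F3(int x) { return x - 3; }

string seen = "none";
Task<int> futureD = Task.Factory.StartNew<int>(() => F3(F2(10)));
try
{
    // No exception surfaces until this read of Result.
    Console.WriteLine(futureD.Result);
}
catch (AggregateException ae)
{
    // The original exception is preserved as an inner exception.
    seen = ae.InnerException is InvalidOperationException ? "deferred" : "other";
}
Console.WriteLine(seen);  // prints "deferred"
```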
2. Continuation Tasks
It’s very common
for one asynchronous operation to invoke a second asynchronous
operation and pass data to it. Continuation tasks make the dependencies
among futures apparent to the run-time environment that is responsible
for scheduling them. This helps to allocate work efficiently among
cores.
For example, if you want to update the user interface (UI) with the result produced by the function F4 from the previous section, you can use the following code.
TextBox myTextBox = ...;
var futureB = Task.Factory.StartNew<int>(() => F1(a));
var futureD = Task.Factory.StartNew<int>(() => F3(F2(a)));
var futureF = Task.Factory.ContinueWhenAll<int, int>(
new[] { futureB, futureD },
(tasks) => F4(futureB.Result, futureD.Result));
futureF.ContinueWith((t) =>
myTextBox.Dispatcher.Invoke(
(Action)(() => { myTextBox.Text = t.Result.ToString(); }))
);
This code structures the computation into four tasks. The system understands the ordering dependencies between continuation tasks and their antecedents. It makes sure that the continuation tasks will start only after their antecedent tasks complete.
The first task, futureB, calculates the value of b. The second task, futureD, calculates the value of d. These two tasks can run in parallel. The third task, futureF, calculates the value of f; it can run only after the first two tasks are complete. Finally, the fourth task takes the value calculated by F4 and updates a text box on the user interface.
The ContinueWith method creates a continuation task with a single antecedent. The ContinueWhenAll<TAntecedentResult, TResult> method of the Task.Factory object allows you to create a continuation task that depends on more than one antecedent task.
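To see the same dependency structure outside of a UI, here is a console variant. The arithmetic bodies for F1 through F4 are placeholders, and a console write stands in for the text-box update:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical placeholder bodies for the processor-intensive functions.
static int F1(int x) { return x + 1; }
static int F2(int x) { return x * 2; }
static int F3(int x) { return x - 3; }
static int F4(int b, int d) { return b + d; }

int a = 10;
var futureB = Task.Factory.StartNew<int>(() => F1(a));
var futureD = Task.Factory.StartNew<int>(() => F3(F2(a)));

// futureF starts only after both antecedents complete.
var futureF = Task.Factory.ContinueWhenAll<int, int>(
    new[] { futureB, futureD },
    (tasks) => F4(futureB.Result, futureD.Result));

// A console write stands in for the UI update from the text.
var done = futureF.ContinueWith((t) => Console.WriteLine(t.Result));
done.Wait();  // keep the console process alive until the chain finishes
```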