Parallel Programming with Microsoft .Net : Pipelines - An Example

10/12/2010 9:28:10 AM

The online samples include an application named ImagePipeline. This application takes a directory of JPEG images and generates thumbnail versions, which are also post-processed with several image-enhancing filters. The resulting processed images are displayed as a slideshow. Each image in the slideshow is displayed in alphabetical file name order.

Note: You can’t use a parallel loop for this example because the application has a requirement that images must be processed in sequence. Parallel loops do not guarantee any particular processing order.

1. Sequential Image Processing

Each image is processed in four stages: the large color image is loaded from a file, a small thumbnail with a picture frame is generated from it, noise is added to the image to create a speckling effect, and then the processed image displays as the next picture in the slideshow. Figure 1 illustrates this sequence.

Figure 1. Sequential image processing

Here’s the code for the sequential version.

string sourceDir = ...
IEnumerable<string> fileNames = ...
int count = 0;
ImageInfo info = null;
foreach (var fileName in fileNames)
{
  /// ...
  info = LoadImage(fileName, sourceDir, count, ...);
  ScaleImage(info);
  FilterImage(info);
  // ...
  DisplayImage(info, count + 1, displayFn, ...);
  // ...
  count += 1;
  info = null;
}

The four steps are performed by the LoadImage, ScaleImage, FilterImage, and DisplayImage methods. This example is slightly abridged for clarity. The code that deals with cancellation, error handing (the disposal of handles to unmanaged objects), and the capture of performance measurements is omitted. You can refer to the online samples to see these details.

2. The Image Pipeline

The sequential loop can process only one image at a time; each image must complete all four stages before work can begin on the next image, and the stages themselves are sequentially linked. In fact, this example seems intractably sequential—the top-level loop has the restriction that images must be displayed in order, and within each step are substeps that require inputs from previous substeps. You can’t display an image until after the filter is applied to it. You can’t apply the filter until after the image is scaled to thumbnail size. You can’t do the scaling until after the original image loads.

However, the pipeline pattern can introduce parallelism into this example. Each image still passes through all four stages in sequence, but the stages can execute in parallel on different images. Figure 2 illustrates the image pipeline.

Figure 2. Image pipeline

The following code shows the parallel version.

IEnumerable<string> fileNames = ...
string sourceDir = ...
Action<ImageInfo> displayFn = ...
int limit = ...
var originalImages = new BlockingCollection<ImageInfo>(limit);
var thumbnailImages = new BlockingCollection<ImageInfo>(limit);
var filteredImages = new BlockingCollection<ImageInfo>(limit);
try
{
  var f = new TaskFactory(TaskCreationOptions.LongRunning,
                          TaskContinuationOptions.None);
  // ...
  
  var loadTask = f.StartNew(() =>
                    LoadPipelinedImages(fileNames, sourceDir,
                                        originalImages, ...));
  var scaleTask = f.StartNew(() =>
                    ScalePipelinedImages(originalImages,
                                         thumbnailImages, ...));
  var filterTask = f.StartNew(() =>
                     FilterPipelinedImages(thumbnailImages,
                                           filteredImages, ...));
  var displayTask = f.StartNew(() =>
                    DisplayPipelinedImages(
                        filteredImages.GetConsumingEnumerable(),
                        displayFn, ...));
  Task.WaitAll(loadTask, scaleTask, filterTask, displayTask);
}
finally
{
   // ... release handles to unmanaged resources ...
}

(Some details of error handling, cancellation, and the collection of performance data have been omitted here for clarity. Refer to the online sample for the complete implementation.)

There are three blocking collections that act as buffers between the stages of the pipeline. The four stages are the same as in the sequential version. A call to the task factory StartNew method launches each processing stage as a long-running task.

The code calls Task.WaitAll to defer cleanup until after all stages complete processing all images.

3. Performance Characteristics

To understand the performance characteristics of the sequential and pipelined versions, it’s useful to look at a scheduling diagram such as the one that Figure 3 illustrates.

Figure 3. Image pipeline with stages of equal speed

Figure 3 shows how the tasks in the image pipeline example execute over time. For example, the top row shows that stage 1 processes image 1 starting at time t₀ and image 2 starting at time t₁. Stage 2 begins processing image 1 at time t₁. Assume for a moment that the pipeline is perfectly balanced; that is, each stage of the pipeline takes exactly the same amount of time to do its work. Call that duration T. Therefore, in the figure, t₁ occurs after T units of time elapse, t₂ after 2 x T units of time elapse, and so on.

If there are enough available cores to allow the pipeline’s tasks to run in parallel, the figure shows that the expected execution time for six images in a pipeline with four stages is approximately 9 x T. In contrast, the sequential version takes approximately 24 x T because each of the 24 steps must be processed one after another.

The average performance improves as more images are processed. The reason for this, as Figure 7-4 illustrates, is that some cores are idle as the pipeline fills during startup and drains during shutdown. With a large number of images, the startup and shutdown time becomes relatively insignificant. The average time per image would approach T.

Note: If there are enough available cores and if all stages of a pipeline take an equal amount of time, the execution time for the pipeline as a whole is the same as the time for just one stage.

However, there’s one catch: the assumption that all the pipeline steps take exactly the same amount of time isn’t always true. Figure 4 shows the scheduling diagram that occurs when the filter stage takes twice as long as the other stages.

Figure 4. Image pipeline with unequal stages

When one of the stages takes 2 x T units of time while the other stages take T units of time, you can see that it’s not possible to keep all of the cores completely busy. On average (with a large number of images), the time to process an images is 2 x T. In other words, when there are enough cores for each pipeline stage, the speed of a pipeline is approximately equal to the speed of its slowest stage.

Note: When the stages of a pipeline don’t take the same amount of time, the speed of a pipeline is approximately equal to the speed of its slowest stage.

If you run the ImagePipeline application in the Microsoft® Visual Studio® development system debugger, you can see this effect for yourself. The ImagePipeline sample has a user interface (UI) feature that reports the average length of time in milliseconds for each of the stages of the pipeline. It also reports the overall average length of time that’s needed to process each image. When you run the sample in sequential mode (by selecting the Sequential radio button), you’ll notice that the steady-state elapsed time per image equals the sum of all the stages. When you run in pipeline mode, the average elapsed time per image converges to approximately the same amount of time as slowest stage. The most efficient pipelines have stages of equal speed. You won’t always achieve this, but it’s a worthy goal.

Other

Parallel Programming with Microsoft .Net : Pipelines - The Basics

Parallel Programming : Concurrent Collections

Debugging Tools For Parallel Tasks

Parallel Programming : Parallel Loops

Parallel Programming : Understanding and Using Tasks

Parallel Programming : Introducing Parallel Computing

Using Remote Assistance to Resolve Problems in Vista

Introducing Microsoft Business Intelligence

Ten Keys to Successful Microsoft Business Intelligence (cont.)

Ten Keys to Successful Microsoft Business Intelligence

Visual Design of Workflows with WCF and WF 4

Enterprise Patterns with WCF RIA Services