The
online samples include an application named ImagePipeline. This
application takes a directory of JPEG images and generates thumbnail
versions, which are also post-processed with several image-enhancing
filters. The resulting processed images are displayed as a slideshow.
Each image in the slideshow is displayed in alphabetical file name
order.
Note:
You can’t use a parallel loop for this example because the application
has a requirement that images must be processed in sequence. Parallel
loops do not guarantee any particular processing order.
1. Sequential Image Processing
Each image is processed
in four stages: the large color image is loaded from a file, a small
thumbnail with a picture frame is generated from it, noise is added to
the image to create a speckling effect, and then the processed image
displays as the next picture in the slideshow. Figure 1 illustrates this sequence.
Here’s the code for the sequential version.
string sourceDir = ...
IEnumerable<string> fileNames = ...
int count = 0;
ImageInfo info = null;
foreach (var fileName in fileNames)
{
/// ...
info = LoadImage(fileName, sourceDir, count, ...);
ScaleImage(info);
FilterImage(info);
// ...
DisplayImage(info, count + 1, displayFn, ...);
// ...
count += 1;
info = null;
}
The four steps are performed by the LoadImage, ScaleImage, FilterImage, and DisplayImage methods. This example
is slightly abridged for clarity. The code that deals with
cancellation, error handing (the disposal of handles to unmanaged
objects), and the capture of performance measurements is omitted. You
can refer to the online samples to see these details.
2. The Image Pipeline
The sequential loop can process
only one image at a time; each image must complete all four stages
before work can begin on the next image, and the stages themselves are
sequentially linked. In fact, this example
seems intractably sequential—the top-level loop has the restriction
that images must be displayed in order, and within each step are
substeps that require inputs from previous substeps. You can’t display
an image until after the filter is applied to it. You can’t apply the
filter until after the image is scaled to thumbnail size. You can’t do
the scaling until after the original image loads.
However, the pipeline
pattern can introduce parallelism into this example. Each image still
passes through all four stages in sequence, but the stages can execute
in parallel on different images. Figure 2 illustrates the image pipeline.
The following code shows the parallel version.
IEnumerable<string> fileNames = ...
string sourceDir = ...
Action<ImageInfo> displayFn = ...
int limit = ...
var originalImages = new BlockingCollection<ImageInfo>(limit);
var thumbnailImages = new BlockingCollection<ImageInfo>(limit);
var filteredImages = new BlockingCollection<ImageInfo>(limit);
try
{
var f = new TaskFactory(TaskCreationOptions.LongRunning,
TaskContinuationOptions.None);
// ...
var loadTask = f.StartNew(() =>
LoadPipelinedImages(fileNames, sourceDir,
originalImages, ...));
var scaleTask = f.StartNew(() =>
ScalePipelinedImages(originalImages,
thumbnailImages, ...));
var filterTask = f.StartNew(() =>
FilterPipelinedImages(thumbnailImages,
filteredImages, ...));
var displayTask = f.StartNew(() =>
DisplayPipelinedImages(
filteredImages.GetConsumingEnumerable(),
displayFn, ...));
Task.WaitAll(loadTask, scaleTask, filterTask, displayTask);
}
finally
{
// ... release handles to unmanaged resources ...
}
(Some details of
error handling, cancellation, and the collection of performance data
have been omitted here for clarity. Refer to the online sample for the
complete implementation.)
There are three blocking
collections that act as buffers between the stages of the pipeline. The
four stages are the same as in the sequential version. A call to the
task factory StartNew method launches each processing stage as a long-running task.
The code calls Task.WaitAll to defer cleanup until after all stages complete processing all images.
3. Performance Characteristics
To understand the
performance characteristics of the sequential and pipelined versions,
it’s useful to look at a scheduling diagram such as the one that Figure 3 illustrates.
Figure 3 shows how the tasks in the image pipeline example execute over time. For example, the top row shows that stage 1 processes image 1 starting at time t0 and image 2 starting at time t1. Stage 2 begins processing image 1 at time t1.
Assume for a moment that the pipeline is perfectly balanced; that is,
each stage of the pipeline takes exactly the same amount of time to do
its work. Call that duration T. Therefore, in the figure, t1 occurs after T units of time elapse, t2 after 2 x T units of time elapse, and so on.
If there are enough
available cores to allow the pipeline’s tasks to run in parallel, the
figure shows that the expected execution time for six images in a
pipeline with four stages is approximately 9 x T. In contrast, the
sequential version takes approximately 24 x T because each of the 24
steps must be processed one after another.
The average performance improves as more images are processed. The reason for this, as Figure 7-4
illustrates, is that some cores are idle as the pipeline fills during
startup and drains during shutdown. With a large number of images, the
startup and shutdown time becomes relatively insignificant. The average
time per image would approach T.
Note: If
there are enough available cores and if all stages of a pipeline take
an equal amount of time, the execution time for the pipeline as a whole
is the same as the time for just one stage.
However, there’s one catch: the assumption that all the pipeline steps take exactly the same amount of time isn’t always true. Figure 4 shows the scheduling diagram that occurs when the filter stage takes twice as long as the other stages.
When one of the stages takes
2 x T units of time while the other stages take T units of time, you
can see that it’s not possible to keep all of the cores completely
busy. On average (with a large number of images), the time to process
an images is 2 x T. In other words, when there are enough cores for
each pipeline stage, the speed of a pipeline is approximately equal to
the speed of its slowest stage.
Note: When
the stages of a pipeline don’t take the same amount of time, the speed
of a pipeline is approximately equal to the speed of its slowest stage.
If you run the ImagePipeline
application in the Microsoft® Visual Studio® development system
debugger, you can see this effect for yourself. The ImagePipeline
sample has a user interface (UI) feature that reports the average
length of time in milliseconds for each of the stages of the pipeline.
It also reports the overall average length of time that’s needed to
process each image. When you run the sample in sequential mode (by
selecting the Sequential
radio button), you’ll notice that the steady-state elapsed time per
image equals the sum of all the stages. When you run in pipeline mode,
the average elapsed time per image converges to approximately the same
amount of time as slowest stage. The most efficient pipelines have stages of equal speed. You won’t always achieve this, but it’s a worthy goal.