Data parallelism typically begins by unrolling a sequential
loop. For example, the code below defines a loop that runs
sequentially. In this example, the code iterates over the loop body 100
times. The duration of the for loop is the aggregate duration of those 100 operations. Assuming an average of 100 milliseconds per operation, the for loop will take 10 seconds to complete (100 × 100 milliseconds).
for (int count = 0; count < 100; ++count)
{
    Operation();
}
Most importantly, the preceding code does not fully utilize the
available processor cores. Look at a view of the processor cores for a
computer with eight cores.
Note
To display the Windows Task Manager, press Ctrl+Shift+Esc. You can then double-click anywhere in one of the CPU usage graphs to switch to the dedicated CPU usage view.
Notice that the average utilization of the cores is relatively low.
Except for the first processor core, the other cores are not very busy.
A lot of processor power is being wasted!
Here’s the complete code for you to try.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace ParallelLoop
{
    class Program
    {
        static void Operation()
        {
            Thread.SpinWait(int.MaxValue);
        }

        static void Main(string[] args)
        {
            for (int count = 0; count < 100; ++count)
            {
                Operation();
            }
        }
    }
}
Fortunately, changing from a sequential loop to a parallel loop
requires minimal changes in most circumstances. Here’s the parallel
version of the serial loop in the previous example.
Parallel.For(0, 100, (count) =>
{
    Operation();
});
That was an easy change! Look at the processor usage graph for the
parallel version. Nirvana! The graph shows 100 percent processor
utilization with the parallel loop. The utilization line is barely
visible only because it runs straight across the top of the graph.
You’ll delve into the syntactical nuances of Parallel.For
shortly. The important point now is that this code successfully
parallelizes the operation as a basket of tasks scheduled across the
available cores. There are enough tasks to keep the processor cores
busy, which is an important goal of parallel programming.
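As a quick way to confirm that the loop really is spread across cores, you can record which managed threads execute the iterations. The following is a minimal sketch (the class name, the spin count, and the exact output will vary by machine):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class ThreadSpreadSketch
{
    static void Main()
    {
        var iterationsPerThread = new ConcurrentDictionary<int, int>();

        Parallel.For(0, 100, i =>
        {
            // Count how many iterations each managed thread runs.
            iterationsPerThread.AddOrUpdate(
                Thread.CurrentThread.ManagedThreadId, 1, (_, n) => n + 1);
            Thread.SpinWait(100000);   // simulate a small amount of work
        });

        Console.WriteLine($"Iterations ran on {iterationsPerThread.Count} thread(s); " +
            $"{Environment.ProcessorCount} core(s) available.");
    }
}
```

On a multicore machine you should see the iterations distributed over more than one thread, which is the scheduling behavior the text describes.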
The key to high-performing data parallelism is independent
iterations, which avoid dependencies between the resulting parallel
tasks. Only with independent iterations can you unroll a loop
successfully. You can handle some dependencies, such as reduction,
with minimal effect on performance. Not all dependencies are obvious. Here is a partial list of possible
dependencies:
- Indexes
- Shared memory
- Shared files
- Order restrictions
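The reduction mentioned earlier can be handled efficiently with the overload of Parallel.For that maintains thread-local state, so each task accumulates a private subtotal and the subtotals are combined once per task. Here is a minimal sketch, assuming a sum of squares as the reduction:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ReductionSketch
{
    static void Main()
    {
        long total = 0;

        Parallel.For(0, 100,
            () => 0L,                                      // localInit: per-task subtotal
            (i, state, subtotal) => subtotal + (long)i * i, // body: accumulate privately
            subtotal => Interlocked.Add(ref total, subtotal)); // localFinally: combine once

        Console.WriteLine(total);   // prints 328350, the sum of squares 0..99
    }
}
```

Because each task touches only its own subtotal until the final combine, the iterations remain effectively independent and contention is minimal.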
Can you identify the dependency in the next loop? The following code
is a parallel loop that squares the current index, which is then
written into a file. The program generates unexpected results. This is
because of a not-so-subtle dependency.
StreamWriter sw = new StreamWriter(filename);
Parallel.For(0, 100, (i) => sw.WriteLine(i * i));
sw.Close();
The
problem is the shared file reference. The code could attempt to write
to the same file simultaneously from two or more parallel tasks, and
files are not implicitly thread safe. In addition, the order of the
writes is not assured, because the Parallel.For
method does not guarantee the order in which tasks are executed. The
largest reported value should be 9,801 (99 × 99). This is a partial
listing from the file.
6084
4900
705662415041
7225
6400
9216
73966561
Several of the preceding values exceed 9,801, which should be
impossible! The problem occurs because simultaneous writes to the file
interleave and corrupt the output, which demonstrates the potential
problem that dependencies pose for data parallelism. The shared file
was an obvious dependency, but unfortunately, most dependencies are
more subtle and often go undetected. This underscores the importance
of performing rigorous unit testing, stress testing, and concurrency
testing when converting routines from sequential to parallel.
Converting the syntax is simple; maintaining correctness is the
greater challenge.
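One way to remove the shared-file dependency is to compute the squares in parallel into an array indexed by the loop counter, and then write the file sequentially. Here is a minimal sketch (the file name is hypothetical):

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

class SafeWriteSketch
{
    static void Main()
    {
        int[] squares = new int[100];

        // Each iteration writes only its own array slot, so the
        // iterations are independent and safe to run in parallel.
        Parallel.For(0, 100, i => squares[i] = i * i);

        // The file itself is written sequentially, which also
        // preserves the original ordering of the results.
        using (StreamWriter sw = new StreamWriter("squares.txt"))
        {
            foreach (int square in squares)
            {
                sw.WriteLine(square);
            }
        }
    }
}
```

The parallel portion now has no shared state, and the largest value written is 9,801, as expected.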
There are techniques and tricks for identifying
dependencies. One technique is to reverse the iteration order of a
sequential loop. In other words, change the iteration from ascending
to descending. If the reversal causes the application to generate
different results or crash, that change indicates the likelihood of a
dependency.
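As a minimal sketch of the reversal test, using a side-effect-free stand-in for Operation (the class and method names here are illustrative):

```csharp
using System;

class ReversalSketch
{
    // A stand-in for Operation that has no hidden dependencies.
    static long Operation(int count) => (long)count * count;

    static void Main()
    {
        long ascending = 0, descending = 0;

        // Original ascending iteration.
        for (int count = 0; count < 100; ++count)
            ascending += Operation(count);

        // Reversed, descending iteration.
        for (int count = 99; count >= 0; --count)
            descending += Operation(count);

        // With truly independent iterations, the results match.
        Console.WriteLine(ascending == descending);   // prints True
    }
}
```

If the two results differed, or the reversed loop crashed, that would suggest a hidden dependency on iteration order.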