REVIEW: Intel Parallel Studio Helps Developers Exploit Multiple Cores - Page 2

Parallel Inspector

During tests, the Parallel Inspector tool helped me spot some of the most common problems in parallel programming, particularly deadlocks and data races.

The tool runs your program and monitors it, looking for these problems, rather than simply inspecting the code itself. A program takes much longer to run while it is being analyzed: my test case took more than 10 times as long as usual. The payoff was a comprehensive list of the errors found, including data races, in the form of a to-do list. I could then click on an error and go right to the source code line that produced the problem.

Although the Inspector finds errors as a program is running and can show you where in your source code the problems occurred, it gives only hints on fixing them. Ultimately it is up to you, as a good software engineer, to understand your code well enough to recognize the problems the Inspector found and to fix them correctly.

Parallel Composer

In the sample case I tried, the Inspector discovered that multiple threads were trying to write to the same memory location simultaneously, which suggested I needed a critical section. Creating a critical section was easy. The Intel C++ compiler that's provided as part of the Parallel Composer component fully supports the OpenMP standard, an API specification for C, C++ and Fortran that lets you use pragmas in your code to specify multithreaded features such as critical sections.

That simplifies your job: Instead of calling into the operating system to create a critical section, you just place a pragma (like so: #pragma omp critical) before the statement or block that is to be the critical section.
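As a rough illustration of that pragma, here is a minimal sketch in which several threads add into one shared variable, the kind of simultaneous write described above. The function name sum_with_critical and the data are hypothetical and not taken from my test case. It assumes a compiler with OpenMP enabled; without OpenMP the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not from the reviewed sample): adds every element
// of `values` into one shared total from a parallel loop.
long sum_with_critical(const std::vector<int>& values) {
    // The loop below uses a signed int index, so the size must fit in an int.
    assert(values.size() <=
           static_cast<std::size_t>(std::numeric_limits<int>::max()));

    long total = 0;  // shared by all threads
    const int n = static_cast<int>(values.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        // Only one thread at a time may execute this block, so the
        // read-modify-write of `total` cannot interleave between threads.
        #pragma omp critical
        {
            total += values[i];
        }
    }
    return total;
}
```

Serializing every addition like this removes the race but also most of the parallel speedup for such a small loop body; for a plain sum, OpenMP's reduction clause is usually the better fit.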

In addition to directives such as these pragmas, the Intel C++ compiler also includes its own language extensions that you can use, such as this:

__par for (i = 0; i < size; i++)
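For comparison, a loop of the same shape can be written with the standard OpenMP pragma instead of the Intel-specific keyword. The sketch below is an assumption-laden illustration, not the Intel extension itself: the function scale_all and its arguments are hypothetical, and it relies on each iteration touching a different element so that no critical section is needed.

```cpp
#include <vector>

// Hypothetical example: multiplies every element of `data` by `factor`.
// Each iteration writes only data[i], so iterations are independent.
void scale_all(std::vector<double>& data, double factor) {
    const int size = static_cast<int>(data.size());

    // With OpenMP enabled, the iterations are divided among threads.
    // Without it, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int i = 0; i < size; i++) {
        data[i] *= factor;
    }
}
```

Calling scale_all on a vector holding 1.0, 2.0 and 3.0 with a factor of 2.0 leaves it holding 2.0, 4.0 and 6.0, whether or not the loop actually runs on multiple threads.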

Additionally, the compiler comes with Intel IPP (Integrated Performance Primitives), a library of optimized, multithreaded routines, and Intel TBB (Threading Building Blocks), a template-based threading library. All of these are powerful approaches to writing parallel programs that make use of multicore processors. And, if you do your job right, the code created by the Intel compiler will make use of all the cores in the processor, including those in non-Intel processors.