Monday, June 29, 2009

A foray into OpenMP

(This post has been moved from my other blog on the grounds that it is quite general, and not all that technical).

This is sort of an intro to how easy it is to write multi-threaded programs these days. Most people don't bother parallelizing their applications because the prospect of managing threads and synchronization is too daunting. This is where OpenMP steps in. It is a simple system to use, based on preprocessor directives, and it works portably on both Windows and Linux. And best of all, if you do not invoke the OpenMP option while compiling the code, the parallelization directives are simply ignored and the program builds as a single-threaded app. Below, you will find a quite pointless program, but it serves well to test my dual-core, dual-socket setup :)

It computes the sum of sin(i/128) + cos(i/128), where i ranges from 0 to 98765432... don't ask me why :-/



#include <iostream>
#include <cmath>
using namespace std;

#define MAXNUM 98765432


int main(){
        double sum=0;

        // Split the iterations of the loop across the available cores.
        // Each thread accumulates into a private copy of 'sum', and the
        // reduction(+:sum) clause adds the partial sums together at the end.
        #pragma omp parallel for reduction(+:sum)
        for (int i=0;i<MAXNUM;i++)
                sum+=sin(i/128.0)+cos(i/128.0);

        cout<<sum<<endl;
        return 0;
}
          


The #pragma statement automatically parallelizes the for loop. And since 'sum' is updated on every iteration, synchronization is taken care of by the reduction(+:sum) clause: it tells the compiler that the loop is performing a reduction (like summing n numbers) and that the operator used is addition (+), so each thread works on its own private copy of sum and the private copies are combined once the loop finishes.
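
To get a feel for what the reduction clause saves you from writing, here's a rough hand-rolled equivalent (just a sketch of the idea, not what the compiler literally generates): every thread accumulates into its own private total, and the private totals are folded into the shared sum under an atomic update.

#include <iostream>
#include <cmath>
using namespace std;

#define MAXNUM 98765432


int main(){
        double sum=0;

        #pragma omp parallel
        {
                // Each thread gets its own private running total.
                double local=0;

                // The iterations of the loop are divided among the threads.
                #pragma omp for
                for (int i=0;i<MAXNUM;i++)
                        local+=sin(i/128.0)+cos(i/128.0);

                // Fold the private totals into the shared sum, one update
                // at a time, so the additions do not race with each other.
                #pragma omp atomic
                sum+=local;
        }

        cout<<sum<<endl;
}

The reduction clause gets you this whole pattern in a single line, which is why it's the idiomatic way to write such loops.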

Pretty neat! And very very simple to use!

To compile this, all I needed to do was:

g++ -fopenmp main.cpp

In case I wanted a version without any of the parallel threading, a single-threaded build is easily produced by simply omitting the -fopenmp compiler flag:

g++ main.cpp

Amazing simplicity! And this works for all the complicated constructs that OpenMP provides.
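
As an aside, if you want the program itself to report whether it was actually built with OpenMP (and how many threads it can use), OpenMP-enabled compilers define the _OPENMP macro and <omp.h> provides omp_get_max_threads(). A minimal sketch along those lines, which still compiles cleanly without -fopenmp:

#include <iostream>
#ifdef _OPENMP
#include <omp.h>        // OpenMP runtime library, only linked in with -fopenmp
#endif
using namespace std;

int main(){
#ifdef _OPENMP
        cout<<"OpenMP enabled, up to "<<omp_get_max_threads()<<" threads"<<endl;
#else
        cout<<"Compiled without OpenMP, running single threaded"<<endl;
#endif
}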

So, how well did we fare in this parallelization endeavour? Let's find out:

Single-threaded run (1 core): 16 seconds
Multi-threaded run (4 cores): 4.7 seconds

That's almost linear scaling from one to four cores. Now I can't wait to see what more OpenMP can do for me :)
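
If you'd rather time the loop from inside the program instead of with an external tool, OpenMP's omp_get_wtime() returns wall-clock time in seconds. A quick sketch of that approach (this variant does need -fopenmp, since it calls the OpenMP runtime directly):

#include <iostream>
#include <cmath>
#include <omp.h>
using namespace std;

#define MAXNUM 98765432


int main(){
        double sum=0;

        // Wall-clock time just before the parallel loop starts.
        double start=omp_get_wtime();

        #pragma omp parallel for reduction(+:sum)
        for (int i=0;i<MAXNUM;i++)
                sum+=sin(i/128.0)+cos(i/128.0);

        // Elapsed wall-clock time for the loop, in seconds.
        double elapsed=omp_get_wtime()-start;

        cout<<sum<<" (computed in "<<elapsed<<" seconds)"<<endl;
}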

Learn more about OpenMP at openmp.org or at this comprehensive tutorial.

5 comments:

  1. Thanks, specifically for the reduction example. I actually need something like this. I didn't know how to perform reduction using OpenMP. This will help.

    And yes, you should check out Chrome on Fedora now.

    http://spot.livejournal.com/308900.html

    It rocks.

  2. You may want to check out asciidoc to format code. Blogger has very poor support for code formatting.

  3. Check out http://openmp.org/ for resources on programming with OpenMP. There's also a very good book, Using OpenMP, with examples that you can now download.

  4. @rpg: Hmm, asciidoc? I picked up some cpp2htm code from elsewhere and modified it for my own purposes...

  5. Well, the code doesn't seem to be formatted (a.k.a. pretty-printed) here. :)

    Is there a way to have follow-up comments emailed to the person?
