Monday, June 29, 2009

A foray into OpenMP

(This post has been moved from my other blog on the grounds that it is quite general, and not all that technical).

This is sort of an intro to how easy it is to write multi-threaded programs these days. Most people don't bother parallelizing their applications because the prospect of managing threads and synchronization is too daunting. This is where OpenMP steps in. It is a simple system to use, based on pre-processor directives. It works portably on both Windows and Linux. And best of all, if you do not invoke the OpenMP option while compiling the code, the parallelization part of your code is ignored, and it works as a single-threaded app. Below, you will find a quite pointless program, but it serves well to test my dual core, dual socket setup :)

It computes the sum of sin(i/128) + cos(i/128), where i ranges from 0 to 98765432... don't ask me why :-/



#include <iostream>
#include <cmath>
using namespace std;

#define MAXNUM 98765432


int main(){
        double sum=0;

        #pragma omp parallel for reduction(+:sum)
        for (int i=0;i<MAXNUM;i++)
                sum+=sin(i/128.0)+cos(i/128.0);

        cout<<sum<<endl;
}
          


The #pragma statement automatically parallelizes the for loop. Also, since 'sum' is updated in every iteration, synchronization is ensured by the reduction(+:sum) clause. This says that the for loop is performing a reduction operation (like a sum of n numbers), and that the operator used is addition (+).

Pretty neat! And very very simple to use!

To compile this, all I needed to do was:

g++ -fopenmp main.cpp

In case I wanted a version without any of the parallel threading, a single-threaded version is easily generated by dropping the -fopenmp compiler flag:

g++ main.cpp

Amazing simplicity! And this works for all the complicated constructs that OpenMP provides.

So, how well did we fare in this parallelization endeavour? Let's find out:

Single Threaded run (1 core): 16 seconds
Multi threaded run (4 cores): 4.7 seconds

That's almost linear scaling from one to four cores. Now I can't wait to see what more OpenMP can do for me :)

Learn more about OpenMP at openmp.org or at this comprehensive tutorial.

Sunday, June 28, 2009

Sprucing up gedit: Part 2

Hi, welcome back... Yesterday I described what you can do with some gedit plugins. Today, we look at one of the least used, but most powerful features of Gedit: "External Tools". It can be found in the "Tools -> External Tools..." menu.

The external tools option allows you to write a script and execute that script from within gedit, using a few shortcut keys. This script is a shell script, with a few extra variables exposed by gedit.

Without going into the gory details, I will describe how you can make the most of this feature with the least effort... (of course, if you are a little industrious, you can do really cool things with this feature)

So, gedit provides a few default scripts:
1. Build (default Ctrl+F9)
2. Remove Trailing Spaces
3. Run Command

These three should really be quite useful already.

Here is the one most useful builtin variable that you can read: $GEDIT_CURRENT_DOCUMENT_DIR.
This tells you the path to the currently open document in gedit.

If you are used to writing shell scripts, you don't need to read any further, really. Just get creative :) But if you are relatively new/uncomfortable with scripts, here goes...

The shell script that is part of the "Build" command does this:
1. Go to the $GEDIT_CURRENT_DOCUMENT_DIR, and check if there is a makefile there.
2. If yes, execute it.
3. If no, go one level up and try again (until the root directory is reached).

We will now go ahead and modify the build command so it executes the target of the makefile as well...
Assume your executable has the extension ".out". Then, assuming we know which directory it is in, we can write a simple script to execute this file:

cd ${DIR}
OUTFILE=`ls -1 *.out | head -n 1`
exec `echo ./$OUTFILE`


Apologies for the crappy script. With (a lot) more effort, we can figure out what the exact name of the executable created by make is... but that's a topic for more intellectual characters and best not broached by lesser beings such as myself...

For those who want to know what the script does, it goes to the directory where the executable file is present, lists the .out files in the directory, and picks the first one. This first .out file is executed by the 'exec' command.

Note that we no longer need to explicitly exit from this script because the entire instance of the shell has been replaced by the program that we 'exec-ed'. Basically, when we call this External Tool by pressing Ctrl+F9, gedit starts a shell, and runs this script in the shell. When we call 'exec', the shell process is replaced in memory by the program we executed.

Now, if we combine this with the original Build script given in gedit, we get this:


#!/bin/sh

EHOME=`echo $HOME | sed "s/#/\#/"`
DIR=$GEDIT_CURRENT_DOCUMENT_DIR
while test "$DIR" != "/"; do
  for m in GNUmakefile makefile Makefile; do
    if [ -f "${DIR}/${m}" ]; then
      echo "Using ${m} from ${DIR}" | sed "s#$EHOME#~#" > /dev/stderr
      make -C "${DIR}"

      cd "${DIR}"
      OUTFILE=`ls -1 *.out | head -n 1`
      exec `echo ./$OUTFILE`

    fi
  done
  DIR=`dirname "${DIR}"`
done
echo "No Makefile found!" > /dev/stderr


The changes we made are the three lines set apart in the middle: the cd, OUTFILE and exec lines.

Now, whenever you want to run your program, assuming you have a proper makefile ready, just hit Ctrl+F9 and voilà! Your program (compiles if required and) starts up. You may later want to make cosmetic changes like remapping this to Ctrl+F5 or whatever. To learn more about the External Tools plugin, just go Here

Sprucing up gedit

For 'true' Linux aficionados, there is Emacs...
For worshippers of the devil, there is VI, and VIM...
For us lesser (non-gui impaired) mortals, there is gedit.

But like many things open source, gedit can be transformed into something vastly more useful for programming purposes. Note that I am not saying that gedit will be better than Eclipse. It is not even close in terms of being a development platform. But once in a while, if you want a simple, lightweight GUI for coding, gedit can be a serious contender.

In today's (and maybe tomorrow's) article, I will show you how to go from this:

[screenshot: stock gedit]

to this:

[screenshot: gedit after the makeover]

GEDIT Plugins:
Most distributions of linux offer a package called gedit-plugins.
This will give you most of the essential plugins you need to spice up gedit.

On Ubuntu for example, "sudo apt-get install gedit-plugins" will get you this package.

This package will give you, among others, these useful plugins:
1. Code Comment: Comment or uncomment blocks of code (Ctrl+M, Ctrl+Shift+M by default). This plugin is quite clever. It understands various comment formats, and depending on whether you are editing HTML, C, C++, VB, Python or whatever else, it uses the appropriate comment style.
2. Session Saver: Allows you to bookmark working sessions in order to get them back for later use.
3. Terminal: A simple terminal widget accessible from the bottom panel. This, IMO, is the coolest feature. You get to use a full bash terminal from within gedit. Compile and run your programs, move things around, browse... perfect if you don't hate the command line, but need some GUI to maintain sanity.

Apart from this, a few of the original (default) plugins are very/somewhat useful too:
1. File Browser: A file browser plugin allowing easy access to your filesystem (includes remote mounts, creating new files/dirs, monitoring dirs for changes, etc.)
2. Indent: Indents or un-indents selected lines.
3. Spell: Checks the spelling of the current document.
4. Document Statistics: Reports the number of words, lines etc.

Finally, there are third party plugins. gedit plugins are written in python, so if you are a python geek, Here is how you can get started. There are a huge number of such plugins, but the one that I personally can't live without is the Autocomplete plugin. Download it here: http://sourceforge.net/projects/gedit-autocomp

Installing custom plugins. The .gnome2 directory:

gedit (and many other gnome programs) store their settings in a directory called '.gnome2'. This is a hidden directory in your home directory. gedit specifically uses the '.gnome2/gedit/plugins' directory to store its plugins. If this directory does not exist, simply make it.
ie:

mkdir -p ~/.gnome2/gedit/plugins

The -p option makes any directory that does not exist in the path that you specify. So, even if the gedit directory is not there, it will be created.

Now, simply download the plugin and unpack it into the plugins directory, so that you have the file "autocomplete.gedit-plugin" in your 'plugins' directory.

Enabling Plugins in gedit:

To enable the plugins, after installing all plugins, simply restart gedit. Go to the Edit Menu, and click on Preferences. In the Plugins tab, select the plugins you want to use. Also, while we are in the Preferences menu, we may as well do some other useful things, like enabling Line Numbers (the 'Display Line Numbers' option in the View tab), enabling Auto Indentation (in the 'Editor' tab), and (depending on your preference), using a cooler color scheme, like Oblivion (in Fonts and Colors tab). If you are annoyed with the ~ files that gedit keeps creating, here is your chance to disable backup creation as well (in the Editor tab).

Finally, to actually see these plugins that you have enabled, you need to go to the View menu, and enable 'Side pane' and 'Bottom pane'. This opens up the file browser, and the terminal plugin.

In tomorrow's post, we will see how to use another powerful feature, the "External Tools" option.

More plugins can be found Here.

Friday, June 26, 2009

The curious case of the CPU Bottleneck: Part 2

In continuation of my earlier blogs about the CPU bottleneck in games, today I will look at how the slow CPU affects some of the FPS and strategy titles out there.

As a refresher, here is the system configuration:

Opteron 144 @ 1.97 GHz, 1 GB DDR RAM @ 425 MHz, MSI GeForce N9800GT 512MB, Core: 715, memory: 800x2, shader: 1680 MHz

Stalker: Clear Sky: 19 FPS in towns, 30-35 FPS in the wild.
The FPS reported above is at slightly lower than maximum texture detail.
Note that at ultra-high detail, this game lags big time on my machine because of the lack of RAM (I have 1 GB of DDR memory). A lot of time is spent swapping from disk, which explains why one of my hard disks is dying :) . Reducing the texture detail to a little above medium really helps fix that problem.

Crysis:
Here we will look at 3 different configurations:
1. Ultra High, 1440x900, 4xAA: 5 to 12 fps
2. Shaders, Postprocess, Water, Volumetric Effects are Ultra High, rest High, 1440x900, 4xAA: 12 to 15 fps
3. Shaders, Postprocess, are Ultra High, rest High, 1024x768, No AA: 19 to 25 fps.

Note that the major issue with this game is that there is a lot of disk IO when everything is set to ultra. This reduces the actual FPS due to a lot of loading delays. In the second configuration, the disk IO is reduced but still a problem, which leads me to think that a lack of main memory is the culprit here. If I can find a few cheap DDR RAM modules, I can put this theory to the test. At the third setting, the game is actually playable, and by that I mean you can take headshots with a pistol from quite some distance :)

Unreal Tournament 3: 35 FPS
This game is highly sensitive to CPU speed, and is the only game here that seems to scale well with additional CPU cores. So the framerates we see are considerably lower than what is expected of a 9800GT when coupled with, say, a Core i7 (around 105 fps).
UT3-engine games like Gears of War and Mass Effect run well, which is good, because this is quite a popular engine these days.

World in Conflict: 50 FPS in non combat, 20-30 fps in heavy combat.
This is at ultra high detail, including cloud reflections etc.
Sadly, this is the only RTS that I have right now. Maybe later, I can do a small test on Company of Heroes, Dawn of War 2 etc...

Makefiles

I have been writing makefiles for quite some time now, but for some reason, I could never get them working correctly. i.e. if the files are already compiled, a makefile is supposed to do nothing. In case someone else is having the same issues, here goes:

Let's say I want to compile a single cpp file (main.cpp) with a header (main.h) and produce an output file (main.o).

This is the simplest makefile:

main.o: main.cpp main.h
    g++ -omain.o main.cpp

The first word, "main.o:", refers to the output (compiled) file name. This is called the Target in the literature.

The Target must be followed by a colon and a space-separated list of files. The files in this list (main.cpp main.h) are called the Dependencies. If you run make, and run it again without making any changes, it won't do any compilation the second time. But if you change any of the Dependencies and then run make again, the files will be recompiled.

Easy enough. Now let's move on to a program with multiple source files. If you have a project with 50 source files, and you change only one file, you don't need to recompile everything again. This is how you do it:

We have 2 source files, and 2 headers: main.cpp, main.h, blah.cpp and blah.h
Let us say we want the output executable name to be final.o
The following is the makefile for such a setup:

final.o: main.o blah.o
    g++ main.o blah.o -ofinal.o

main.o: main.cpp main.h
    g++ -c main.cpp

blah.o: blah.cpp blah.h
    g++ -c blah.cpp


This is what it means: make runs the first target. The first target is 'final.o'. final.o is dependent on main.o and blah.o, so both the Targets 'main.o' and 'blah.o' are built first. Note the usage of the compiler flag "-c", which tells g++ to compile without linking (since we are compiling the files independently). We can link them later, as we shall see.
Now that main.o and blah.o are up to date, we can go ahead and link the two object files (yes, if you pass only object files to g++, it links them... much better than invoking ld by hand).

And there we have it, a makefile!
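As a closing aside (a sketch using the same file names as above): make's pattern rules let you avoid repeating the g++ -c recipe for every source file. The '%' matches the common stem of the target and its sources, and the automatic variable '$<' stands for the first dependency:

```make
# Link step, same as before
final.o: main.o blah.o
	g++ main.o blah.o -ofinal.o

# One generic rule: builds any X.o from X.cpp and X.h
%.o: %.cpp %.h
	g++ -c $<
```

One gotcha worth knowing: the recipe lines must be indented with a tab character, not spaces. make is picky about this, and it is probably the most common reason a makefile mysteriously refuses to work.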

Thursday, June 25, 2009

The curious case of the CPU Bottleneck: Results

Test Setup:
AMD Opteron 144 (1.8 GHz, OC'd to 1970 MHz), 1 MB cache.
2x512 MB DDR 400 MHz RAM (dual channel, OC'd to 425 MHz)
MSI N9800GT 512 MB, clocks: core: 715, shader: 1680, memory: 800 (1600)

All benchmarks are at 1440x900, Maximum detail, 4x AA, 16x AF unless otherwise specified. Framerates reported are average FPS of actual game play.

Racing/Simulation games:

NFS-ProStreet: 12 fps
NFS-Undercover: 15 fps

In the NFS series of games, we see about a 10 to 15 fps hit due to the other cars and traffic. On a test track, while driving solo, we get around 25 to 50 fps. This means the CPU bottleneck is truly noticeable in these games. The AI for the cars seems to be quite CPU-intensive. A dual core machine could easily turn the numbers in our favor.

Note that these games aren't all that graphically demanding. My earlier 6800GS gave around 7-10 fps at high detail on NFS Pro-Street.

Trackmania United: 30 fps

This is a slightly more stable case for our benchmarks. There are no AI drivers, and the game is highly playable at these frame rates.

Racedriver GRID: 20 fps

Another nice game for this test setup. At maximum detail, the game runs smoothly, even with AI players on the tracks. Also, IMO, the best looking racing game out on the PC, and very convincing handling too.

Tom Clancy's HAWX: 50-60 fps

A slightly unconventional game. Tried it on a friend's recommendation. Again, the game runs very nicely on this machine. Note that the game has an inbuilt benchmark mode that's a lot more intensive than the actual game. The first part of the benchmark runs at 60 fps and the second, insanely cluttered one runs at 15 fps. The average, as reported, is 25 fps.

The curious case of the CPU bottleneck

People who bought machines way back in 2005 would probably have an Intel P4 or AMD Athlon 64 single core processor on their hands. Such people, like myself, would sometimes like to play games (yeah, right... sometimes...)

The price of upgrading to a new processor (a dual/tri/quad machine) is amplified by the fact that a CPU upgrade entails a motherboard and RAM upgrade in most cases (unless you are lucky enough to be stuck with the LGA775).

On the other hand, upgrading to a better graphics card is terribly cheap these days. Take a look at these prices (May 2009, Bangalore):
Card               Price in Rupees
ATI HD4670          5000
GeForce 9800 GT     6500
ATI HD4770          8000
GeForce GTS 250     8500


Another option is to upgrade the CPU, the motherboard and the RAM. Let's look at a reasonable configuration:
Component                          Price in Rupees
Intel E7400 (2.8 GHz Dual Core)     6500
Intel Q8200 (2.33 GHz Quad Core)    8500
Cheap motherboard                   4000
4 GB of DDR2 800 RAM                2600


So, no matter how you swing, your pockets will get lighter by around 12-15k for the entire deal.

I am one of those weirdos who has a Single Core 1.8 Ghz CPU with 1 GB DDR RAM, and a grossly overpowered (for this setup) Geforce 9800GT 512MB. Why? I went with option 1... buy a GPU, leave the rest as it was 3 years ago.
Earlier, I had a Geforce 6800 GS, which was by no means a bad card 3 years ago...

Now that the stage is set, this article will tell you the true story behind the CPU bottleneck in games. How bad is it, really, to have a crappy CPU coupled with a reasonably fast GPU? How much difference does the RAM make? These are things you will never see in regular GPU benchmarks, because those use a Core i7 (3 GHz) with 8 GB of 1600 MHz DDR3 RAM... crazy.

The numbers I will be reporting are not a benchmark in the true sense; they are just a bunch of readings. I don't really have comparison data in most cases. Maybe I will put in some numbers from my friend, who has a Pentium 4 Dual Core rig with an 8800GS.

Watch this space for actual results over the next few days...

A tale of 2 HDDs

I happen to have 2 SATA hard disks on my current machine. Both are 160 GB Seagate drives. One of them is dying... it wrote a song about how nice the world would be if the wars and bloodshed stopped. Kindly donate 1 dollar to the hard disk recovery fund... um, oh, wrong story...

They are identical except that one of them has 8 MB of cache, and the other has 16 MB. Note that you could arbitrarily get either of these hard disks when you go and buy one from the store. People don't really care about the cache size on HDDs; they just go by brand name and capacity. People are wrong!! Look below to see a benchmark of these hard disks in action. I am trying to be a little sneaky, so I won't tell you which one has how much cache... Let's see if you can figure that out on your own ;)


Wassup!!!

So, um... welcome, I guess....
Here is another blog I had to start. I guess there is stuff I sometimes want to write, but there is no easy way of putting it up on the web...

Not in the mood to write a long story about what this blog is going to contain (it's early morning and it's raining), so here is a simpler way of doing things:

Keywords: Computers, Programming, Gaming, Hardware, Benchmarks, and anything else I want to put up here.