I finished working through Chapman’s Introduction to Fortran 90/95, and it was an interesting and helpful read. My next step is to work through Chapman’s (no relation?) Using OpenMP, but there are some performance considerations I must first address.
Therefore, I looked into gprof, the GNU profiling tool. It shows me how long my code takes to run and which functions in the workflow are taking up the most time. Here is what the ifort man pages say about the gprof compiler flag (note that I have a 32-bit processor for this test!):
Compiles and links for function profiling with gprof(1).
Architectures: IA-32, Intel® 64 architectures
Default: Files are compiled and linked without profiling.
Description: This option compiles and links for function profiling with gprof(1).
Alternate Options: Linux and Mac OS X: -pg (only available on systems using IA-32 architecture or Intel® 64 architecture), -qp (this is a deprecated option)
That’s interesting, sure! With that bit of knowledge, I eventually want to apply gprof to a large code, but debugging there could be a pain. So for now I’m going to focus on a much simpler test case, taken from Chapman’s Fortran 90/95 book (Example 6-10, pg. 340).
gprof Example with Fortran Code
The example I consider has a function called “ave_value” which calculates the average value of a function between two points, first_value and last_value. The function being averaged is “my_function,” which is declared as external in the test driver program “test_ave_value” and handed to “ave_value” to evaluate. It’s a very simple program with three .f90 files.
I wrote these functions based on the example given in Chapman, and then I compiled them with the following command:
$ ifort -p ave_value.f90 my_function.f90 test_ave_value.f90 -o test_ave_value
As a reminder, the -p flag turns on gprof profiling, and the -o flag lets me name the executable.
Now that you have your executable, you can simply run it, as I did:
$ ./test_ave_value
And you’ll notice that it has generated a “gmon.out” file that gprof can interpret to show you your statistics! Each run overwrites any gmon.out that is already in the folder, so use caution. Now, run gprof to interpret the gmon.out file.
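Because each run replaces gmon.out, here is a minimal sketch of one way to keep the previous profile around. The helper name backup_gmon and the timestamped file name are my own inventions for illustration, not part of gprof or ifort.

```shell
# Hypothetical helper: move an existing gmon.out aside before the next run.
# The timestamp suffix is an arbitrary naming choice for this sketch.
backup_gmon() {
    if [ -f gmon.out ]; then
        mv gmon.out "gmon.out.$(date +%Y%m%d-%H%M%S)"
    fi
}
```

Calling backup_gmon right before ./test_ave_value would leave the older data under a name like gmon.out.20240101-120000, which can still be passed to gprof as the profile data file after the executable name.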
$ gprof test_ave_value > tav.output
gprof prints its report to standard output, so I redirected it into a file that I named tav.output. Now we can view the results in tav.output with any competent text editor.
Looking at the Numbers
The gprof documentation explains how to read these numbers in detail, but I’ll hit some critical points. The output is separated into the Flat Profile and the Call Graph. The Flat Profile conveys how much time your program has spent executing each function. The Call Graph conveys how much time was spent in each function and its children.
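If you would rather look at the two sections separately, gprof can print them one at a time: -p selects the flat profile, -q selects the call graph, and -b drops the explanatory text. A small sketch, assuming gmon.out is in the current folder; the helper name split_profile and the two output file names are placeholders I made up.

```shell
# Hypothetical helper: write the flat profile and the call graph to separate files.
# Assumes gmon.out sits in the current directory next to the executable.
split_profile() {
    exe="$1"
    gprof -b -p "$exe" gmon.out > flat_profile.txt   # flat profile only
    gprof -b -q "$exe" gmon.out > call_graph.txt     # call graph only
}
```

For the test case in this post, that would be split_profile ./test_ave_value.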
Visualization of gprof results
A quick way to put a visualization together (per the documentation of gprof2dot):
$ gprof path/to/your/executable | gprof2dot.py | dot -Tpng -o output.png
Here, gprof does not execute your program; it reads the executable (which you’ve already compiled and linked with the appropriate flag!) together with the gmon.out file from an earlier run. That report is piped to a program called gprof2dot, which converts it to a graph description and pipes that to dot, which writes an output file that you can view in any competent image display tool!
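The pipeline only produces something useful if both the executable and a gmon.out from an earlier run are present, so a quick check beforehand can save some head-scratching. This is just a sketch; check_profile_inputs is a name I made up, and it assumes the profile data is in the current folder unless you pass a second argument.

```shell
# Hypothetical helper: confirm gprof has its inputs before building the graph.
check_profile_inputs() {
    exe="$1"
    data="${2:-gmon.out}"   # default profile data file name
    if [ ! -f "$exe" ]; then
        echo "executable not found: $exe" >&2
        return 1
    fi
    if [ ! -f "$data" ]; then
        echo "no profile data: $data (run the program first)" >&2
        return 1
    fi
    return 0
}
```

One way to use it is to chain it in front of the pipeline:
$ check_profile_inputs ./test_ave_value && gprof ./test_ave_value | gprof2dot.py | dot -Tpng -o output.png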
Note that if you download gprof2dot, you’ll need to change the permissions to ensure that it’s an executable. I first tried to run it as downloaded, but it would not execute because the file permissions were not set to executable.
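Setting the executable bit is a one-line chmod. Here is a small sketch that only changes the permissions when needed; the helper name ensure_executable is hypothetical, and it assumes the script was saved as gprof2dot.py in the current folder.

```shell
# Hypothetical helper: add the executable bit only if it is missing.
ensure_executable() {
    script="$1"
    if [ ! -x "$script" ]; then
        chmod +x "$script"
    fi
}
```

After ensure_executable gprof2dot.py, the script can be run as ./gprof2dot.py. Since gprof2dot is a Python script, another option is to leave the permissions alone and hand the file to the Python interpreter directly.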
Now that I’ve learned this, I’m going to try it on a bigger code. Happy profiling!