This is the first of two equally-weighted assessed coursework exercises. Submit your solution via CATE.
This exercise is about ``FullDiagOdd'', a simplified quantum chemistry application, given to us by Mike Bearpark in Imperial's Chemistry department. The program was written in Fortran 77 and has been automatically translated into C (using ``f2c'') so that it can be run under SimpleScalar, which only has a C compiler.
The methods implemented by the program are described in more detail in:
Excited states of conjugated hydrocarbon radicals using the molecular mechanics - valence bond (MMVB) method. Bearpark MJ, Boggio-Pasqua M. THEORETICAL CHEMISTRY ACCOUNTS 110 (2): 105-114 SEP 2003. (http://dx.doi.org/10.1007/s00214-003-0461-3)

It's a very stripped-down model quantum chemistry application, which does two things:
Copy the source code directory tree to your own directory:
    cd
    cp -r /homes/awb01/Teaching/ACA06/FullDiagOdd ./

Now compile the program:

    cd FullDiagOdd
    make

Now you can run the program:

    ./mat.x86 < 5a.dat

This reads input from the file 5a.dat, and writes its output to the screen.
You have been provided with a selection of input files of various sizes. The biggest, ``11a.dat'', takes a few tens of seconds to run. Use the small ones with the simulator!
The makefile also builds a binary for execution using the SimpleScalar simulator. You can execute it by typing:
    /homes/phjk/simplesim-3.0/sim-outorder ./mat.ss \
        < 5a.dat >/dev/null

This should take less than two minutes (15 seconds on a 3GHz Pentium 4).
Use the fastest machine you can find. On a 3GHz Pentium 4 each simulation run (for problem size 1) takes less than 1.5 minutes. Use the ``top'' command to make sure you're not sharing it.
The most important output from the simulator is ``sim_cycle'' - the total number of cycles to complete the run. It's often also useful to look at ``sim_IPC'', the instructions per cycle - provided you always execute the same number of instructions. The time taken to perform the simulation, ``sim_elapsed_time'', simply tells you how long the simulator took.
Other outputs from the simulator can be helpful in guiding your search - e.g. ``ruu_full'', the proportion of cycles when the RUU is full.
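If you want to pull a single statistic out of the simulator's report, a small shell helper may be convenient. The sketch below is not part of the supplied files: ``get_stat'' is a hypothetical name, and it assumes each statistic is printed on its own line as the name, then the value, then a ``#'' comment.

```shell
# get_stat NAME: read simulator statistics on stdin and print the value of NAME.
# Assumption: each statistic line looks like "NAME  VALUE  # description".
get_stat() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Example use (the 2>&1 >/dev/null idiom sends the statistics down the pipe
# and discards the program's own output):
#   /homes/phjk/simplesim-3.0/sim-outorder ./mat.ss < 5a.dat 2>&1 >/dev/null \
#       | get_stat sim_cycle
```

Matching on the first field exactly, rather than using ``grep'', avoids picking up other statistics whose names merely contain the one you asked for.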
To do this you need to write a shell script. You might find the following Bash script useful:
    #!/bin/sh -f
    for ((x=2; x <= 128 ; x *= 2))
    do
      echo -n "ruu-size" $x " "
      /homes/phjk/simplesim-3.0/sim-outorder -ruu:size $x \
        ./mat.ss < 5a.dat 2>&1 >/dev/null | grep sim_IPC
    done

You can find this script in scripts/vary_ruu.
Try using the gnuplot program. Run the script above, and save the output in a file ``table''. Type ``gnuplot''. Then, at its prompt type:
    set logscale x 2
    plot [][] 'table' using 2:4 with linespoints

To save the plot as a postscript file, try:

    set term postscript eps
    set output "psfile.ps"
    plot [][] 'table' using 2:4 with linespoints

Try ``help postscript'', ``help plot'' etc for further details.
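If you end up producing many plots, it may be easier to generate the gnuplot commands from the shell than to type them each time. The following is a minimal sketch, assuming the table has the same four-column layout as above; ``write_plot_script'' is a hypothetical helper name, not something supplied with the exercise.

```shell
# write_plot_script TABLE OUTPUT: print gnuplot commands on stdout that plot
# column 4 of TABLE against column 2 and save the result to OUTPUT.
write_plot_script() {
    cat <<EOF
set logscale x 2
set term postscript eps
set output "$2"
plot [][] '$1' using 2:4 with linespoints
EOF
}

# Example use (needs gnuplot installed and a file called 'table'):
#   write_plot_script table psfile.ps | gnuplot
```

The commands are the same ones shown above; only the table and output file names are filled in from the arguments.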
Compile the application using the ``-S'' flag.
    ~phjk/simplescalar/bin/gcc -O3 -S -g diasym.c

The assembler code generated by the compiler is delivered in ``diasym.s'' (the ``-g'' flag adds debugging information including ``.loc'' lines that relate assembly code back to C source line numbers).
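To find the assembly generated for one particular C source line, you could filter ``diasym.s'' on its ``.loc'' lines. The sketch below assumes each ``.loc'' directive has the form ``.loc FILE LINE'' (a file number followed by a source line number); check this against your own ``diasym.s'' before relying on it. ``asm_for_line'' is a hypothetical helper name.

```shell
# asm_for_line FILE LINE: print the assembly lines that follow each
# ".loc <file-number> LINE" directive in FILE, up to the next ".loc".
# Assumption: the third field of a .loc directive is the C source line number.
asm_for_line() {
    awk -v want="$2" '
        $1 == ".loc" { show = ($3 == want); next }
        show { print }
    ' "$1"
}

# Example use (42 is an arbitrary source line number):
#   asm_for_line diasym.s 42
```

With optimisation enabled, the code for one source line may be scattered across several places, so the output can consist of several separate fragments.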
The compiler accepts many parameters to control optimisation. Try ``man gcc'' to read about them. For example, ``-funroll-loops'' encourages the compiler to unroll loops. Modify the Makefile accordingly. Does this improve the performance of the simulated machine? What happens to the IPC?
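When comparing two builds, it can help to express the change as a ratio of their ``sim_cycle'' values. The helper below is only a sketch (``speedup'' is a hypothetical name): it divides the baseline cycle count by the new one, so values above 1 mean the new build is faster.

```shell
# speedup BASE_CYCLES NEW_CYCLES: print BASE_CYCLES / NEW_CYCLES to three
# decimal places. NEW_CYCLES must be non-zero.
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.3f\n", base / new }'
}

# Example use with made-up cycle counts (not measurements):
#   speedup 200 100      # prints 2.000
```

Compare cycle counts rather than IPC alone here: unrolling changes the number of instructions executed, so a change in IPC does not by itself tell you whether the run got faster.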
Paul Kelly and Ashley Brown, Imperial College London, 2006