CGI Scripts

These are some notes on CGI (Common Gateway Interface) scripts, how they work and possible problems you may have. The Common Gateway Interface is a standard agreed upon by webserver developers to allow servers to call user-written programs in a standardised way. Please remember that any scripts you run must comply with the usual computing regulations.

DoC runs the Apache web server, version 2.2.8. The Apache2 project CGI Howto covers some queries on CGI.

This page describes the basics of running CGI scripts in DoC. CGI scripts can be written in any language, but the majority of them tend to be written using either Perl or PHP.

CGI Scripts in your ~/public_html directory

Any script in your public_html directory having the suffix .cgi (or .php) can be run provided it meets the following conditions:

These restrictions are requirements of suexec (see below) - since the script runs with your user permissions, they are for your protection.

If any of these conditions are not met, then suexec will refuse to run your CGI script, and will generate the famous "500 Internal Server Error" page.

The best way to ensure that the permission and group ownership conditions are met is to run the following commands in Linux:

cd ~/public_html
chmod 755 scriptname
chmod 755 .
chown -R yourusername .

Then, to ensure that permissions are correct for scripts created in future, check that your umask (which determines the default permissions of new files) is set to 022; this is the default for most groups of students. Check it via:

umask

and, if you need to, set it via the command:

umask 022
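
As a quick check before trying the script through the web server, the sketch below reports whether a file or directory is writable by group or others, which is one of the things suexec is documented to reject. The function name is_loosely_writable is made up for this illustration, and it only looks at the write bits; it does not test every suexec condition.

```shell
# Hypothetical helper (not part of suexec): succeeds if the given path
# has the group-write or other-write permission bit set.
is_loosely_writable() {
    # -prune stops find descending into directories, so only the path
    # itself is tested; it is printed only if one of the bits is set.
    [ -n "$(find "$1" -prune \( -perm -020 -o -perm -002 \) -print 2>/dev/null)" ]
}

# Example: check the current directory and one script (scriptname is a
# placeholder, as in the commands above).
for path in . scriptname; do
    if [ -e "$path" ] && is_loosely_writable "$path"; then
        echo "$path is writable by group or others"
    fi
done
```

After chmod 755 the loop should print nothing for either path.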

CGI Scripts in a group project area

Note that if you are running CGI scripts from a group project area in /vol/project, the rules are slightly different:

Again, if any of these conditions are not met, then suexec will refuse to run your CGI script, and will generate the famous "500 Internal Server Error" page.

We recommend the following simple rules for keeping permissions in your group project directory sensible:

If the permissions have already gone wrong because you didn't read these pages first, you can fix it by running the commands:

cd /vol/project/YOUR/GROUP/PROJECT/DIR
chmod -R ug=rwX,o=rX .
chgrp -R YOUR_GROUP_PROJECT .

However, note that you will not be able to change the permissions and group of files that other members of the group have created. So each member of the group who broke things may need to run the above commands to achieve complete coverage. This is so inconvenient that we strongly recommend you follow the above rules all the time, to prevent the problems occurring.
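
To see which files you cannot fix yourself, a sketch along the following lines may help. It lists everything under a directory that is owned by somebody else, so you know which group members to ask. The function name is hypothetical and the directory path is the same placeholder used above.

```shell
# Hypothetical helper: list everything under a directory that is not
# owned by the user running the command.
list_files_owned_by_others() {
    # id -u gives the numeric user id, which find -user also accepts.
    find "$1" ! -user "$(id -u)" -print
}

# Example (path is a placeholder):
# list_files_owned_by_others /vol/project/YOUR/GROUP/PROJECT/DIR
```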

General CGI Notes

When a web request comes in that names one of your CGI scripts, and the above constraints have been satisfied, the script itself is executed as you in the normal Unix way. This means that, unless it is a true compiled executable, the first line of the script must be a #!/path/to/interpreter line. Hence all Perl CGI scripts should have a first line of #!/usr/bin/perl, and all PHP CGI scripts must have the first line #!/usr/bin/php.
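
A missing or mistyped first line is easy to overlook, so here is a minimal sketch of a check you could run yourself. It assumes nothing about the server; the function name check_shebang is invented for this example, and it only confirms that the first line starts with #! and names an executable interpreter.

```shell
# Hypothetical helper: check that a script starts with "#!" and that the
# interpreter named there exists and is executable.
check_shebang() {
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        '#!'*) ;;
        *) echo "$1: first line does not start with #!"; return 1 ;;
    esac
    # Drop the "#!", any spaces after it, and any interpreter arguments.
    interpreter=$(printf '%s\n' "$first_line" |
        sed 's/^#![[:space:]]*//; s/[[:space:]].*$//')
    if [ ! -x "$interpreter" ]; then
        echo "$1: interpreter $interpreter is missing or not executable"
        return 1
    fi
    echo "$1: ok ($interpreter)"
}

# Example (scriptname.cgi is a placeholder):
# check_shebang ~/public_html/scriptname.cgi
```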

See our PHP scripts guide for more information about making PHP scripts run.

Note that if you download CGI scripts from the Internet (whatever language they're written in), you are responsible for ensuring that the scripts really do what they say they will, do not misbehave and are secure. Bear in mind that once these scripts are allowed to run as CGI scripts on our web servers, they can be invoked by untrusted users across the Internet, but will run with your user privileges on our web server, and hence have the same access to files in your home directory as you do! If you choose to use a well known piece of web software (e.g. a wiki or a content management system), it is quite likely that particular versions of such software will have vulnerabilities which attackers know how to exploit. You are responsible for keeping your web-based software up to date and (as far as possible) secure.

The CGI Environment

Your CGI script will be called by the webserver with a number of environment variables set. If the URL that invokes the CGI script contains a '?' then anything following the '?' is placed into the QUERY_STRING environment variable; if that text contains no '=' character, it is also given to the script as command line arguments. Similarly, if the invoking URL has an extra section starting with '/' after the name of a valid CGI script, that section is extracted and placed into the PATH_INFO environment variable.
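
As a rough illustration of what a script might do with QUERY_STRING, the sketch below splits a query string on '&' and prints each name and value. The function name split_query is made up for this example, and it makes no attempt to URL-decode the values; a helper library is the better choice for real scripts.

```shell
# Hypothetical helper: print each "name=value" pair of a query string as
# "name: value", one per line. No URL-decoding is attempted.
split_query() {
    old_ifs=$IFS
    IFS='&'
    set -f                      # do not glob-expand characters such as *
    for pair in $1; do
        case "$pair" in
            *=*) name=${pair%%=*}; value=${pair#*=} ;;
            *)   name=$pair; value= ;;
        esac
        printf '%s: %s\n' "$name" "$value"
    done
    set +f
    IFS=$old_ifs
}

# Inside a CGI script this would be called as:
# split_query "$QUERY_STRING"
```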

Try the following WWW pages to see what environment variables are available to your CGI program. The program is a shell script and is available here.

The program has to return not only the contents of the page, but also the WWW header information for the page. Normally this only requires printing out the Content-type: line followed by two newlines, to mark the end of the page's header information.

Some Examples

For example to return a plain, unformatted, text document, you simply make your CGI script print out the following text:

Content-type: text/plain

 1: Some boring text.
 2: Some boring text.
 3: Some boring text.
...
20: Some boring text.

To do this, you could write the following CGI shell script (if you want to make your life hard!):

#!/bin/sh

echo Content-type: text/plain
echo
i=1
while [ $i -le 20 ]
do
   echo "$i: Some boring text."
   i=`expr $i + 1`
done

Or the following Perl script:

#!/usr/bin/perl
print "Content-type: text/plain\n\n";
foreach my $i (1..20)
{
   print "$i: Some boring text.\n";
}

If (more typically) you wanted to return a page of formatted HTML, you might want your CGI script to print out the following text:

Content-type: text/html

<html><body><ol>
 <li>we can do <b>Bold</b>,<i>Italic</i> and much much more!</li>
 <li>we can do <b>Bold</b>,<i>Italic</i> and much much more!</li>
 ...
 <li>we can do <b>Bold</b>,<i>Italic</i> and much much more!</li>
</ol></body></html>

To do this, you could write the following Perl script:

#!/usr/bin/perl
print "Content-type: text/html\n\n";
print "<html><body><ol>\n";
foreach my $i (1..20)
{
    print "<li>we can do <b>Bold</b>,<i>Italic</i> and much much more!</li>\n";
}
print "</ol></body></html>\n";

Or the following PHP script:

#!/usr/bin/php
<?php
echo "<html><body><ol>\n";
for( $i = 1; $i <= 20; $i++ )
{
    echo "<li>we can do <b>Bold</b>,<i>Italic</i> and much much more!</li>\n";
}
echo "</ol></body></html>\n";
?>

You should consider using helper modules such as the Perl CGI module to make your life easier. On Linux systems type:

perldoc CGI

for details.

You can run the script outside of the web server, and CGI.pm has ways of simulating argument passing.
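
One way to try a script by hand, independent of CGI.pm's own facilities, is to set a couple of the standard CGI environment variables yourself and look at what comes out. The sketch below does that and checks that the first line of output is a Content-type header. The function name check_cgi_output is hypothetical, and a real web server sets many more variables than the two used here.

```shell
# Hypothetical helper: run a CGI script by hand with a query string and
# check that its first line of output is a Content-type header.
check_cgi_output() {
    script=$1
    query=$2
    # REQUEST_METHOD and QUERY_STRING are standard CGI variables; a real
    # web server sets many more.
    output=$(REQUEST_METHOD=GET QUERY_STRING="$query" "$script") || {
        echo "$script exited with an error"
        return 1
    }
    first_line=$(printf '%s\n' "$output" | head -n 1)
    case "$first_line" in
        [Cc]ontent-[Tt]ype:*) echo "header found: $first_line" ;;
        *) echo "no Content-type header; first line was: $first_line"
           return 1 ;;
    esac
}

# Example (script name and query are placeholders):
# check_cgi_output ./scriptname.cgi 'name=value'
```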

To get you started, here's a typical Perl CGI script to play with: See the source code here, run it via:

To diagnose faults, check the error.log in /vol/wwwhomeslogs/wwwhomes.doc.ic.ac.uk/, or (for Perl CGI scripts) include the line:

use CGI::Carp qw(fatalsToBrowser);

at the top of your CGI script.

You can syntax check a Perl program by typing:

perl -cw script.pl
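
For CGI scripts written in shell, like the first example on this page, the shell's -n option gives a comparable check: it reads the script without executing it. The wrapper below is only a sketch and its name is invented for the example.

```shell
# Hypothetical helper: syntax check a shell script without running it.
# "sh -n" reads the commands but does not execute them.
syntax_check_sh() {
    if sh -n "$1"; then
        echo "$1 syntax OK"
    else
        echo "$1 has syntax errors"
        return 1
    fi
}

# Example (scriptname.cgi is a placeholder):
# syntax_check_sh scriptname.cgi
```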

Access and error logs on the server

There are various logs on the webserver to record accesses to web pages and error messages when things don't work. The most useful places to start when your CGI script refuses to run (usually generating the famous "500 Internal Server Error") are the suexec log - /vol/wwwhomeslogs/server-suexec.log - and the current error log - /vol/wwwhomeslogs/wwwhomes.doc.ic.ac.uk/error.log. These logs usually tell you which of the suexec conditions you're breaking, or occasionally tell you there's a syntax error in the script itself. (You did, naturally, syntax check the CGI script before you ran it via the web page as we mentioned above?)
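
These logs are shared and can be long, so it may help to pull out only the recent lines that mention your script or username. The sketch below does this with grep and tail. The function name is hypothetical, and since the exact layout of the log lines is not described here, it simply searches for a plain string.

```shell
# Hypothetical helper: show the last few lines of a log file that mention
# a given string, such as your username or the script's file name.
recent_log_lines() {
    log_file=$1
    needle=$2
    count=${3:-10}
    # -F treats the search text as a fixed string, not a regular expression.
    grep -F -- "$needle" "$log_file" | tail -n "$count"
}

# Examples, using the log paths named above (scriptname.cgi is a placeholder):
# recent_log_lines /vol/wwwhomeslogs/server-suexec.log scriptname.cgi
# recent_log_lines /vol/wwwhomeslogs/wwwhomes.doc.ic.ac.uk/error.log scriptname.cgi 20
```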

guides/web/cgi (last edited 2010-11-29 15:02:44 by dcw)