A common misconception about CGI is that you can send command-line
switches to your program, such as foobar -qa blorf. CGI uses the
command line for other purposes and thus this is not directly
possible. Instead, CGI uses environment variables to send your program
its parameters. The two major environment variables you will use for
this purpose are:
QUERY_STRING
QUERY_STRING is defined as anything which follows the first ? in the URL used to access your gateway. This information could be added either by an HTML ISINDEX document, or by an HTML form (with the GET method). It could also be manually embedded in an HTML anchor which references your gateway. This string will usually be an information query, i.e. what the user wants to search for in the archie databases, or perhaps the encoded results of your feedback GET form.
This string is encoded in the standard URL format of changing spaces to +, and encoding special characters with %xx hexadecimal encoding. You will need to decode it in order to use it.
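As a rough sketch of that decoding step, the Python below reads QUERY_STRING from the environment and undoes the + and %xx encoding. The function names are made up for this illustration, and the name=value&name=value layout handled by decode_form is an assumption about what a GET form sends; the text above does not spell it out.

```python
import os
from urllib.parse import parse_qsl, unquote_plus


def decode_query(raw):
    """Undo URL encoding: '+' becomes a space, %xx becomes the character it names."""
    return unquote_plus(raw)


def decode_form(raw):
    """Split form data into decoded (name, value) pairs.

    Assumes the name=value&name=value layout; that layout is an
    assumption of this sketch, not something stated in the text.
    """
    return parse_qsl(raw)


def current_query():
    """Return the decoded query string the server handed to this gateway."""
    # The server sets QUERY_STRING; default to empty when it is absent.
    return decode_query(os.environ.get("QUERY_STRING", ""))
```

For an ISINDEX-style query, decode_query is probably all you need; decode_form only makes sense when the string really is form output.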
If your gateway is not decoding results from a FORM, you will also get the query string decoded for you onto the command line. This means that each word of the query string will be in a separate element of argv. For example, the query string "forms rule" would be given to your program with argv[1]="forms" and argv[2]="rule". If you choose to use this, you do not need to do any processing on the data before using it.
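If you take the command-line route, picking up the words could look something like the sketch below. The helper name query_words is hypothetical; the only thing it relies on is that argv[0] is the program itself and the decoded words follow it.

```python
import sys


def query_words(argv=None):
    """Return the already-decoded query words from the command line."""
    if argv is None:
        argv = sys.argv
    # argv[0] is the program name; the query words start at argv[1].
    return list(argv[1:])
```

With the query string "forms rule", calling query_words(["foobar", "forms", "rule"]) gives back the two words with no further decoding needed.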
PATH_INFO
Much of the time, you will want to send data to your gateways which the client shouldn't muck with. Such information could be the name of the FORM which generated the results they are sending.
CGI allows for extra information to be embedded in the URL for your gateway which can be used to transmit extra context-specific information to the scripts. This information is usually made available as "extra" information after the path of your gateway in the URL. This information is not encoded by the server in any way.
To illustrate this, let's say I have a CGI program on my server
called /scripts/foobar. When I access foobar from a particular
document, I want to tell foobar that I'm currently in the English
language directory, not the Pig Latin directory. In this case, I
could access my script in an HTML document as:
<A HREF="/scripts/foobar/language=english">foobar</A>
When the server executes foobar, it will give me PATH_INFO of
/language=english, and my program can decode this and act accordingly.
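Here is one way foobar might pick that value apart, as a minimal sketch. The name=value convention comes from the example above; the function name parse_path_info, and the idea of allowing several such segments separated by slashes, are assumptions made for illustration.

```python
import os


def parse_path_info(path_info):
    """Turn '/language=english' into {'language': 'english'}."""
    settings = {}
    for segment in path_info.strip("/").split("/"):
        if not segment:
            continue  # empty PATH_INFO or doubled slash
        key, sep, value = segment.partition("=")
        if sep:  # keep only segments that actually contain '='
            settings[key] = value
    return settings


def current_settings():
    """Parse the PATH_INFO the server set for this request."""
    return parse_path_info(os.environ.get("PATH_INFO", ""))
```

A script could then check current_settings().get("language") and choose the English or the Pig Latin directory accordingly.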
CGI programs can return a myriad of document types. They can send back an image to the client, an HTML document, a plaintext document, or perhaps even an audio clip of your bodily functions. They can also return references to other documents. The client must know what kind of document you're sending it so it can present it accordingly. In order for the client to know this, your CGI program must tell the server what type of document it is returning.
In order to tell the server what kind of document you are sending back, whether it be a full document or a reference to one, CGI requires you to place a short header on your output. This header is ASCII text, consisting of lines separated by either linefeeds or carriage returns followed by linefeeds. Your program must output at least two such lines before its data will be sent directly back to the client.
The first line will be different depending on whether your program is returning a full document or a reference to one:
If you are returning a full document, you must know the MIME type of
your output. Common MIME types are things such as text/html for HTML,
and text/plain for straight ASCII text.
In order to tell the server your output's content type, the first line of your output should read:
Content-type: type/subtype
type/subtype is the MIME type for your output.
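Put together, the output of a gateway returning a full document might be built like the sketch below. The function name full_document is hypothetical. The two header lines are the Content-type line and the blank line that ends the header; everything after that is the document itself.

```python
def full_document(mime_type, body):
    """Build CGI output: Content-type line, blank line, then the document."""
    # The empty line after the header tells the server the header is done.
    return "Content-type: " + mime_type + "\n\n" + body


def hello_page():
    """A tiny HTML reply, ready to be written to standard output."""
    return full_document("text/html", "<H1>Hello from foobar</H1>\n")
```

A script would finish by writing the result to standard output, for example with sys.stdout.write(hello_page()).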
Let's say you want to send a file already available on your information server to the client, or perhaps you want them to retrieve a document from another server altogether.
If you want to reference another file on your own server, you should output a partial URL, such as the following:
Location: /dir1/dir2/myfile.html
In this case, the server will act as if the client had not
requested your script, but instead requested
http://yourserver/dir1/dir2/myfile.html. It will take care of all
access control, determining the file's type, and all the other ugly
stuff that servers do.
However, let's say you want to reference a file on your Gopher server. In this case, you should know the full URL of what you want to reference and output something like:
Location: gopher://httprules.foobar.org/0
In this case, the server will pass your reply on to the client as a redirect, and the client will fetch the URL automatically.
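Both kinds of reference use the same one-line header, so a sketch covering them can be short. The function names below are hypothetical, and telling the two cases apart by a leading slash is an assumption of this sketch based on the two examples above.

```python
def reference(url):
    """Build CGI output that points at another document instead of sending one."""
    # A Location line followed by the blank line that ends the header.
    return "Location: " + url + "\n\n"


def is_partial_url(url):
    """True for a path on your own server, such as /dir1/dir2/myfile.html."""
    return url.startswith("/")
```

So reference("/dir1/dir2/myfile.html") would hand a local file over to the server, while reference("gopher://httprules.foobar.org/0") would point at the Gopher document.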
Advanced usage: If you would like to output headers such as Expires or Content-encoding, you can if your server is compatible with CGI/1.1. Just output them along with Location or Content-type and they will be sent back to the client.
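For instance, a header carrying an extra line might be assembled as in the sketch below. The helper name header_block is made up, and the Expires date is just a sample value; whether the extra line reaches the client depends on your server supporting CGI/1.1, as noted above.

```python
def header_block(headers):
    """Join (name, value) pairs into header lines and end with a blank line."""
    lines = [name + ": " + value for name, value in headers]
    return "\n".join(lines) + "\n\n"


def expiring_page_header():
    """Content-type plus a sample Expires line (the date is illustrative only)."""
    return header_block([
        ("Content-type", "text/html"),
        ("Expires", "Thu, 01 Dec 1994 16:00:00 GMT"),
    ])
```

The document itself still follows immediately after the blank line.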
Rob McCool robm@ncsa.uiuc.edu