
12/96 How To: Programming Windows

Fine Form for the Web

Enhance your Web pages and add functionality with HTML forms.
Here's how to get started.

By Martin Heller

Ever since my column introducing CGI and ISAPI programming (see the June issue), I've been inundated with mail from people requesting help setting up forms on their Web pages. There's a business there for someone, maybe you.

To help budding CGI developers get started, I've written a sample CGI program for form handling in C++. I've posted it at http://www.winmag.com/people/mheller/formtest.htm along with some test forms. If you have Web access, I strongly encourage checking out this interactive online version. Don't forget to complete the online experience by viewing the C++ source code, downloading the zip file, trying both sample forms and viewing the HTML source code to see how I set up the forms.

The DUMPARGS code uses mostly C idioms even though it's a C++ program. That was deliberate: There would be little advantage to C++ idioms, and I didn't want to limit the audience for this code. If you're an object-oriented purist, by all means turn the code into C++ classes, replace all the printf calls with cout << argument, and replace the C library string manipulations with MFC CString or Standard C++ Library string calls.

You'll also notice that DUMPARGS handles both GET and POST inputs. That's important: Too often I fill out a form on a supposedly commercial Web site only to encounter an error message revealing that the CGI program only accepts POST methods and the form has specified a GET method. People sometimes prefer or require the POST method because the input doesn't wind up in a long URL like it does for the GET method. In effect, online surfers can't generate their own POST method inquiries from the browser's URL input box like they can GET method inquiries.

Critics might argue that I've opened a security loophole by allowing GET inquiries. But take a closer look. I've actually closed that loophole and also sewn shut a bigger one that allows a CGI program on your server to be called from any page on any server on the entire Net. My code tests whether a reference came from a local server or not. When you fake a GET inquiry by hand, the HTTP_REFERER environment variable won't be set. Also, if you invoke the program from another server, the HTTP_REFERER field won't match the SERVER_NAME. Try it with my sample program yourself if you like; you'll find it installed at http://www.winmag.com/cgi-shl/dumpargs.exe. If the reference isn't coming from the WinMag server, the program will say so.

To use this code as the starting point for your own form-handling program, add a call to your own processing function at the end of main just before the return statement, with the call perhaps conditional on the fFromLocalHost flag being set. All your processing function will need is the number of name/value pairs and a pointer or reference to the NameValuePair array nv. Once your code is debugged, you can change the Content-type header from text/plain to text/html and add your own HTML formatting.
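As a rough illustration of such a hook, here is a minimal sketch. The layout of NameValuePair, the findValue helper, the processForm name and the "email" field are all assumptions made for this example; the real declarations in DUMPARGS may differ.

```cpp
#include <cstdio>
#include <cstring>

// Assumed layout for illustration; the real NameValuePair may differ.
struct NameValuePair {
    const char* name;
    const char* value;
};

// Hypothetical helper: returns the first value whose name matches,
// or nullptr if no pair has that name.
const char* findValue(const NameValuePair* nv, int count, const char* name)
{
    for (int i = 0; i < count; ++i) {
        if (std::strcmp(nv[i].name, name) == 0)
            return nv[i].value;
    }
    return nullptr;
}

// Hypothetical processing function, meant to be called near the end of main.
// Returns 0 on success, 1 if the required "email" field is missing or empty.
int processForm(const NameValuePair* nv, int count)
{
    const char* email = findValue(nv, count, "email");
    if (email == nullptr || *email == '\0') {
        std::printf("Missing required field: email\n");
        return 1;
    }
    std::printf("Thanks, we will write to %s\n", email);
    return 0;
}
```

In main, the call would sit just before the return statement, guarded by something like if (fFromLocalHost), with the pair count passed in whatever variable the program already keeps it in.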

You might want to suppress or edit a few other outputs, which you'll find quickly as you debug your program. To debug, you can set environment variables locally to simulate the CGI environment. To actually test your CGI program, you'll need access to a live Web server. It needn't be on the Internet; a personal Web server on your development machine will do fine. If you share system resources like files, you'll have to deal with sharing modes and system mutexes. To test that, you'll eventually have to use a Web server with multiple simultaneous users. But start slowly: There's no need to crash repeatedly on your first day.

How it works

DUMPARGS first spits out a Content-type header, echoes its arguments and all its environment strings and then determines the CGI request method. If the method is GET, the program takes its input from the QUERY_STRING environment variable; if the method is POST, it reads its input from standard input, using the CONTENT_LENGTH environment variable to determine the length. GET method calls are URL-encoded by definition, while POST method calls may conform to one of several encodings, as determined by the CONTENT_TYPE environment variable. The program only parses application/x-www-form-urlencoded content.

Before parsing the request, DUMPARGS checks whether it came from the local server, by seeing if the HTTP_REFERER variable references the same machine as the SERVER_NAME variable. This test is useful for a CGI program that you want to be used only from your own server. For instance, many page access counting programs on the Web will count references for anyone. I consider that a bug, although some people probably consider it a feature.
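One way to write such a test is sketched below. It is an illustration under stated assumptions rather than the DUMPARGS code: the isLocalReferer name is hypothetical, and instead of a plain substring search it extracts the host part of the referring URL and compares it with the server name, ignoring case.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Lowercase a copy of s (ASCII only) so host names compare case-insensitively.
static std::string toLower(std::string s)
{
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Hypothetical helper: true when the host in the referring URL equals
// serverName. A missing referer or server name counts as "not local".
bool isLocalReferer(const char* referer, const char* serverName)
{
    if (!referer || !serverName || !*serverName)
        return false;
    const std::string url(referer);
    std::size_t start = url.find("://");
    if (start == std::string::npos)
        return false;
    start += 3;
    // The authority part ends at the first '/', '?' or '#'.
    const std::size_t end = url.find_first_of("/?#", start);
    std::string host = url.substr(
        start, end == std::string::npos ? std::string::npos : end - start);
    // Drop any user:password@ prefix, then any :port suffix.
    const std::size_t at = host.rfind('@');
    if (at != std::string::npos)
        host.erase(0, at + 1);
    const std::size_t colon = host.find(':');
    if (colon != std::string::npos)
        host.erase(colon);
    return toLower(host) == toLower(serverName);
}
```

In a CGI program the two arguments would be getenv("HTTP_REFERER") and getenv("SERVER_NAME"). Because the browser supplies the referer, a check like this filters out casual misuse; it is not a substitute for authentication.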

Parsing the URL-encoded strings you get from HTML forms is a bit messy, but it's still not that bad. Basically, it's a matter of changing +s to spaces in the whole buffer, separating name/value pairs at &s, splitting the name from the value of each pair at the = sign, and converting all %xx hex-encoded characters to the characters themselves. Except for the hex decoding, all the work was so simple I just wrote the code inline instead of writing functions. Once the variables are parsed, the program spits out their names and values.
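Those steps can be sketched with Standard C++ Library strings as follows. This is a reworking for illustration, not the inline C-style code in DUMPARGS; the names parseForm, urlDecode, hexValue and NameValue are assumptions. Note that the sketch splits at & and = before decoding %xx sequences, so an encoded %26 or %3D inside a value is not mistaken for a separator.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Value of one hex digit; the caller has already checked isxdigit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return std::toupper(static_cast<unsigned char>(c)) - 'A' + 10;
}

// Decode one name or value: '+' becomes a space, %xx becomes one character.
// A '%' not followed by two hex digits is copied through unchanged.
static std::string urlDecode(const std::string& s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '+') {
            out += ' ';
        } else if (s[i] == '%' && i + 2 < s.size() &&
                   std::isxdigit(static_cast<unsigned char>(s[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(s[i + 2]))) {
            out += static_cast<char>(hexValue(s[i + 1]) * 16 + hexValue(s[i + 2]));
            i += 2;
        } else {
            out += s[i];
        }
    }
    return out;
}

using NameValue = std::pair<std::string, std::string>;

// Split the raw data at '&', split each field at the first '=',
// then decode the name and the value separately.
std::vector<NameValue> parseForm(const std::string& data)
{
    std::vector<NameValue> pairs;
    std::size_t pos = 0;
    while (pos <= data.size()) {
        std::size_t amp = data.find('&', pos);
        if (amp == std::string::npos)
            amp = data.size();
        const std::string field = data.substr(pos, amp - pos);
        if (!field.empty()) {
            const std::size_t eq = field.find('=');
            const std::string name = field.substr(0, eq);
            const std::string value =
                eq == std::string::npos ? std::string() : field.substr(eq + 1);
            pairs.emplace_back(urlDecode(name), urlDecode(value));
        }
        pos = amp + 1;
    }
    return pairs;
}
```

For example, parseForm("name=Martin+Heller&note=C%2B%2B+%26+CGI") yields two pairs: name with the value "Martin Heller" and note with the value "C++ & CGI".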

Most of the actual code is pretty transparent. I use standard C library routines as much as possible. As you probably know, getenv retrieves environment variables by name, printf writes formatted stream output to stdout, stricmp lexically compares lowercased versions of two strings, atoi converts a string to an integer, malloc allocates a block of memory from the heap, fread reads a specified number of objects from a stream and strstr looks for occurrences of one string in another.

The only code I consider difficult to understand is in the hex2char function. I found this function on the Net in several places, generally called x2c. What's confusing at first reading is the use of (what[0] & 0xdf). A closer look reveals that this is a quick and dirty way to clear bit 5 (0x20) of the character, the one bit that distinguishes a lowercase ASCII letter from its uppercase form; it turns 'a' into 'A' and so on.
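For reference, a routine in the style of the widely circulated x2c looks roughly like this. Treat it as a sketch: the exact code in DUMPARGS may differ, and the version here assumes ASCII and two valid hex digits, with no validation.

```cpp
// Convert the two hex digits at what[0] and what[1] into the character
// they encode. Assumes ASCII and valid hex digits; nothing is checked.
char hex2char(const char* what)
{
    // For a letter digit, (c & 0xdf) clears bit 5, folding 'a'..'f' onto
    // 'A'..'F', so subtracting 'A' and adding 10 gives 10..15.
    // For '0'..'9', which sort below 'A', subtracting '0' gives 0..9.
    int digit = (what[0] >= 'A' ? ((what[0] & 0xdf) - 'A') + 10
                                : (what[0] - '0'));
    digit *= 16;
    digit += (what[1] >= 'A' ? ((what[1] & 0xdf) - 'A') + 10
                             : (what[1] - '0'));
    return static_cast<char>(digit);
}
```

Called with a pointer to the two characters after a % sign, hex2char("2b") and hex2char("2B") both return '+'.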

For more information

If you'd like to learn more about CGI programming, a good place to start is on the Web itself, with the authoritative introduction to and specification of CGI, at http://hoohoo.ncsa.uiuc.edu/cgi/. If you prefer a published text, The CGI Book by William E. Weinman (New Riders Publishing, 1996) offers a good introduction, and is clearly written with plenty of sample code and an included CD-ROM. Weinman gives his examples in UNIX sh code, Perl, C and pseudo-code. Many of his samples run on Windows NT Web servers. Weinman's Web page is http://www.cgibook.com/.

If you do your CGI programming in Perl, you'll eventually encounter cgi-lib.pl, a library for parsing CGI input and generating HTML pages from Perl programs. The author, Steven Brenner, has teamed with Edwin Aoki to produce Introduction to CGI/Perl (M&T Books, 1996). This book manages to fulfill the title's promise, even though it is primarily aimed at people learning to use Brenner's library. The Web page is http://www.mispress.com/introcgi/.

Have a blast!


Contact Senior Contributing Editor Martin Heller at his Web page at http://www.winmag.com/people/mheller, or via e-mail at mheller@cmp.com.
