Server Internals
The main routine for the HTTP server configures itself according to the
command line arguments, reads the rules file into memory, and starts
a thread that listens for TCP/IP connections to the specified port. When
a connection is received, the listener thread creates a new thread
with a start routine that performs the following actions:
- Reads the HTTP request from the WWW client and parses out the method, URL,
and protocol. Any header lines included with the request are read at this
time as well.
- Executes request (function http_execute()):
- Parse URL into scheme, host, ident, and search arg (if any).
- Apply rules file rules to ident portion of URL. Send error
message and return if 'fail' rule triggered.
- If translated URL matched on an exec rule:
- Punt to DECnet task (WWWEXEC) with subfunction "HTBIN"
- else if requested method is "GET" or "HEAD":
- Retrieve file/directory and send to client.
- else if requested method is "POST":
- Punt to DECnet task with subfunction "POST"
- Close connection and exit thread.
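The request parsing and URL split described in the steps above can be
sketched as follows. This is only an illustration in Python, not the
server's own code: the names parse_request_line, split_url, and ParsedUrl
are hypothetical, and the exact splitting rules (for example, treating a
request with no protocol field as valid) are assumptions.

```python
from typing import NamedTuple, Optional


class ParsedUrl(NamedTuple):
    scheme: str            # e.g. "http"; empty if the URL carries no scheme
    host: str              # host[:port]; empty for a path-only URL
    ident: str             # path portion that the rules file is applied to
    search: Optional[str]  # text after "?", or None if absent


def parse_request_line(line):
    """Split 'METHOD URL PROTOCOL' into its three parts.

    Assumption: a two-field request (no protocol) is accepted and reported
    with an empty protocol string.
    """
    parts = line.strip().split()
    if len(parts) == 2:
        method, url = parts
        protocol = ""
    elif len(parts) == 3:
        method, url, protocol = parts
    else:
        raise ValueError("malformed request line: %r" % line)
    return method, url, protocol


def split_url(url):
    """Break a URL into scheme, host, ident, and search argument."""
    search = None
    if "?" in url:
        # The search argument is everything after the first "?"
        url, search = url.split("?", 1)
    scheme = host = ""
    if "://" in url:
        scheme, rest = url.split("://", 1)
        host, slash, path = rest.partition("/")
        ident = slash + path if slash else "/"
    else:
        ident = url
    return ParsedUrl(scheme, host, ident, search)
```

For example, split_url("/www/index.html?abc") yields an ident of
"/www/index.html" and a search argument of "abc"; only the ident is handed
to the rules file.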
Note the repeated use of the operation "punt to DECnet task", in which the
server makes a DECnet connection to object WWWEXEC and lets the object
control the processing of the request.
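The decision between retrieving a file and punting to the DECnet task can
be sketched as a small function. This is a hedged illustration in Python,
not the logic of http_execute() itself: choose_action and the rule_result
values "fail", "exec", and "pass" are hypothetical names, and the handling
of methods other than GET, HEAD, and POST is an assumption, since the text
does not cover them.

```python
def choose_action(rule_result, method):
    """Pick the action for one request as a (kind, subfunction) pair.

    rule_result is a hypothetical summary of the rules-file lookup:
    "fail", "exec", or "pass".
    """
    if rule_result == "fail":
        # A 'fail' rule sends an error message and returns
        return ("error", None)
    if rule_result == "exec":
        # Exec rules are tested before the method is examined
        return ("punt", "HTBIN")
    if method in ("GET", "HEAD"):
        return ("retrieve", None)
    if method == "POST":
        return ("punt", "POST")
    # Assumption: any other method is rejected
    return ("error", None)
```

Because the exec rule is checked first, a request whose URL matches an exec
rule is punted with subfunction "HTBIN" regardless of its method.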
In most environments, the primary function of the HTTP server is the retrieval
of files residing on the server machine. File-oriented URLs are processed with
the following steps:
- Check access, abort if access not granted.
- Determine if local port is no-cache port, which will inhibit placing
retrieved file in cache.
- Determine content-type and encoding of document:
- Make list of content-types that client will accept by parsing request
headers.
- Extract suffix and match content-type defined for suffix with
those accepted by client. A suffix of "/" is treated specially and
forced to type text/file-directory.
- If previous step found no match, abort request unless method is HEAD.
- See if content-type matches presentation rule in configuration file
and punt to DECnet task with subfunction CONVERT if match.
- See if search argument (?xxx) present in request and punt to DECnet task
with subfunction SEARCH if so.
- Check request headers for if-modified-since and convert time to binary
if found. If no if-modified-since, check document cache for requested URL
and complete request using data in cache if found.
- Open file; how it is opened depends upon the case:
- Encoding is Binary
- Attempt open using fopen mode "rb".
- URL is directory
- Open directory using special mode "d" and check
for existence of welcome file (index.html) in directory. If welcome
file present, set status to re-direct request to welcome file.
- Encoding is 8Bit (text)
- Attempt open using fopen mode "r"
- If open failed, check for errant directory request (missing final slash)
and set status for re-direct if directory found.
- If open successful, make HTTP header and send contents. If
caching allowed and contents fit in cache buffer (4K bytes), save in document
cache.
- If open failed, abort request with 404 status.
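The content-type selection, the choice of open mode, and the cache test
from the steps above can be sketched together. This is a minimal
illustration in Python under stated assumptions, not the server's
implementation: the SUFFIX_TYPES table and all function names are
hypothetical, a client that sends no Accept headers is assumed to accept
anything, and "4K bytes" is taken to mean 4096. The mode strings are only
returned, never passed to Python's open(), because "d" is the server's own
special mode.

```python
import os

# Hypothetical suffix table: suffix -> (content-type, encoding)
SUFFIX_TYPES = {
    ".html": ("text/html", "8bit"),
    ".txt": ("text/plain", "8bit"),
    ".gif": ("image/gif", "binary"),
}
DIRECTORY_TYPE = "text/file-directory"
CACHE_BUFFER_SIZE = 4096  # "4K bytes" in the text, assumed to be 4096


def accepted_types(headers):
    """Collect content-types from (name, value) Accept header pairs."""
    types = []
    for name, value in headers:
        if name.lower() == "accept":
            for item in value.split(","):
                ctype = item.split(";", 1)[0].strip()  # drop parameters
                if ctype:
                    types.append(ctype)
    return types


def type_matches(ctype, pattern):
    """True if ctype is covered by pattern, allowing */* and major/*."""
    if pattern in ("*/*", ctype):
        return True
    major, _, minor = pattern.partition("/")
    return minor == "*" and ctype.startswith(major + "/")


def choose_type(ident, headers):
    """Return (content_type, encoding), or None if nothing matches."""
    if ident.endswith("/"):
        # Directory URLs are forced to the directory type
        return (DIRECTORY_TYPE, None)
    suffix = os.path.splitext(ident)[1].lower()
    candidate = SUFFIX_TYPES.get(suffix)
    if candidate is None:
        return None
    accepts = accepted_types(headers)
    if not accepts:  # assumption: no Accept headers means accept all
        return candidate
    if any(type_matches(candidate[0], p) for p in accepts):
        return candidate
    return None


def open_mode(content_type, encoding):
    """Map the three cases in the text to their open mode strings."""
    if content_type == DIRECTORY_TYPE:
        return "d"
    return "rb" if encoding == "binary" else "r"


def may_cache(no_cache_port, length):
    """Cache only when the port allows it and the contents fit."""
    return (not no_cache_port) and length <= CACHE_BUFFER_SIZE
```

A request for "/a/pic.gif" from a client that accepts "image/*" resolves to
("image/gif", "binary") and open mode "rb", while the same client asking
for "/a/notes.txt" gets no match and the request is aborted unless the
method is HEAD.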
The WWWEXEC procedure first opens SYS$NET, making it a PPF (Process
Permanent File). It then reads the basic request parameters from the
server (sub-function, method, ident, protocol) and continues processing based
upon the sub-function:
- HTBIN:
- Parse script name out of URL ident and search for matching
script in htbin directory (directory that was named in the rule
file). If found, invoke script.
- CONVERT:
- Query server for name of converter script (defined by presentation rule
in configuration file) and invoke script to convert specified URL ident.
- SEARCH:
- Send message indicating generic search capability not available.
Note that scripts can handle searches because they are
tested for first by the server.
- POST:
- Save request to temporary file and send acknowledgement to
client. Nothing is done with the saved file; I was testing
the request_to_file program for future development.
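The branch on sub-function can be sketched as below. WWWEXEC itself is a
command procedure, so this Python function only mirrors the decision
structure; handle_subfunction, its arguments, the "/htbin/" prefix in the
example, and the error results are all hypothetical, and the way the
script name is taken from the ident (first path component after the
prefix) is an assumption.

```python
def handle_subfunction(subfunc, ident, htbin_prefix, scripts, converter=None):
    """Return the action WWWEXEC would take as a (kind, detail) pair.

    scripts is the set of script names found in the htbin directory;
    converter is the converter script name obtained from the server.
    """
    if subfunc == "HTBIN":
        # Assumption: the script name is the first path component
        # following the htbin prefix
        rest = ident[len(htbin_prefix):] if ident.startswith(htbin_prefix) else ident
        name = rest.lstrip("/").split("/", 1)[0]
        if name in scripts:
            return ("invoke", name)
        return ("error", "script not found")  # assumption: not in the text
    if subfunc == "CONVERT":
        return ("invoke", converter)
    if subfunc == "SEARCH":
        return ("message", "generic search capability not available")
    if subfunc == "POST":
        return ("save", "temporary file")
    return ("error", "unknown sub-function")  # assumption: not in the text
```

For example, handle_subfunction("HTBIN", "/htbin/finger/user", "/htbin/",
{"finger"}) selects the script "finger".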
For a more detailed description of the protocol used between the server and
WWWEXEC, see the comments at the bottom of WWWEXEC.COM.