FYI: I developed this system before I learnt PHP and MySQL. It could be much simpler with these tools. Anyway, development of the whole system took less time than writing this document.
This document describes my own project of network computing via WWW. The project was developed for my private needs and is based upon my limited knowledge. There are certainly other (maybe better) ways to achieve the same goal but I do not have time to learn them. I present it here in the hope that it might be interesting for others. If you find the document useful or if you would like to see further information, please let me know.
The programs described below run on OS/2 Warp Connect 3.0 and OS/2 Warp 4.0 (Merlin), but some of them could easily be ported to other operating systems which support an HTTP daemon with CGI scripts in C++, REXX and Perl, an FTP daemon, and a Telnet daemon.
The document describes how programs residing on different computers may
communicate with each other. The communication is based upon existing protocols using
TCP/IP, namely HTTP and FTP. The programs make use of existing tools and services, so it
is not necessary to program new daemons.
Imagine that you have a program which reads an input data file, does some calculation and writes the results to another file. A great many programs work like that. Now consider that the program is very flexible and the algorithms for calculation may vary. Depending upon the selected type of calculation and the kind and size of the input file, the calculation may require a few seconds, a few hours or a few weeks.
An example of such a program is GEPARD which was briefly presented in Czech at the conference in Casta Papiernicka in 1997 and (in English) at the CHISA'98 conference. (GEPARD means GEneralized PARameter Determination and it is a flexible object oriented program with dynamic linking at run-time and plug-in technique. The program can fit any kind of experimental data to any equation.)
If the calculation takes a long time, the operating system could crash due to some failure or may be restarted by the operator due to installation of some hardware or software component. It is therefore necessary to build some mechanism for storing intermediate results. GEPARD may be configured in such a way that the intermediate results are written to a series of files. Upon each start GEPARD automatically tries to read these so-called restart files. If the files are not present or are corrupted, GEPARD starts the calculation from the beginning.
We may also have lots of data files and we would like to have them processed automatically on a computer which runs permanently. We do not wish to make changes to the program, therefore we design supporting batches for controlling all particular tasks.
It would of course be possible to develop a new internetworking protocol for handling
all actions. This would require registration with the Internet authorities, and I doubt
that they would assign a socket number and other resources to a private application. I
could also implement a private non-standard protocol without registration, but that is the
wrong way to do networking. Therefore communication makes use of existing protocols in new
ways which do not break anything.
Automatic data feeding
GEPARD, like most other programs, needs the name of the input file supplied as a command line parameter. This requires user intervention. If the computer runs permanently, a calculation may finish during the night and it would be nice if some mechanism could start the program again, specifying another data file name on the command line.
Adobe Acrobat Distiller uses a method which may be useful for GEPARD. Distiller converts PostScript files to PDF. There are several ways to instruct Distiller which files should be converted. The interesting method is that of watched folders. Distiller looks periodically into a specified input directory (or several input directories). If it finds a PostScript file there, it converts the file to PDF and moves both the PostScript and PDF files to the output directory.
A very similar method is used by GEPARD. However, it is not that straightforward. Implementing watched folders would require changes in the program and this was not desired. The mechanism was therefore moved to an infinite REXX batch which has the following parts:
Notice that the batch has no termination. After storing results the batch proceeds with
In this phase the batch reads the values of various environment variables defined in
config.sys. The batch thus finds locations of its watched folder, its working
directory and other settings which usually differ from one computer to another.
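As a rough illustration of this preparation phase, the following Python sketch (not the original REXX batch) collects the site settings from environment variables. The variable names GEPARD_INBOX, GEPARD_WORK, GEPARD_OUTBOX and GEPARD_POLL are hypothetical; the 600-second default is only an assumption matching the usual 10-minute wait of the data scanning step.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SiteConfig:
    inbox: Path        # watched folder for new data files
    workdir: Path      # directory where GEPARD runs
    outbox: Path       # folder for finished result archives
    poll_seconds: int  # delay between two scans of an empty inbox


def load_config(env=os.environ) -> SiteConfig:
    """Build the site configuration from environment variables.

    All variable names are hypothetical; the original batch reads
    its own variables defined in config.sys.
    """
    return SiteConfig(
        inbox=Path(env["GEPARD_INBOX"]),
        workdir=Path(env["GEPARD_WORK"]),
        outbox=Path(env["GEPARD_OUTBOX"]),
        poll_seconds=int(env.get("GEPARD_POLL", "600")),
    )
```

Passing a plain dictionary instead of os.environ makes the function easy to try out without touching the real environment.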
This part of the batch serves two different purposes. It was mentioned earlier that the calculation may be interrupted for various reasons and can later be restarted using intermediate results. Data scanning therefore starts in the working directory. The restart file contains, among other things, the name of the input data file. This is the essential information which must be given to GEPARD on the command line. Thus if we find the restart files, we know that the old calculation should be restarted.
If the restart files are not present, we look into our watched inbox. If a file is present there, we move it to the working directory so that GEPARD can use it. If the inbox is empty, we wait for some time (usually 10 minutes) and then try again.
If we have bad luck, calculation may be interrupted immediately after the start,
before GEPARD writes the first restart file. If the batch is then started again, the
working directory will not have any restart file and the data file will be rejected.
There is, fortunately, a simple remedy. After moving the data file from inbox to
the working directory the REXX batch makes a restart file of its own. The first line must
contain the name of the input file so that the REXX batch can find it later. The rest of
the restart file is unimportant. GEPARD will think that the file is corrupted and will not
use it. Since no other restart information is available, GEPARD will start calculation
from the beginning.
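The data scanning step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the original REXX code: the restart file name RESTART_NAME is hypothetical, and the sketch simply takes the alphabetically first file from the inbox.

```python
import shutil
from pathlib import Path

RESTART_NAME = "restart.000"  # hypothetical name of the first restart file


def find_input(workdir: Path, inbox: Path):
    """Return the input file name for GEPARD, or None if there is no work.

    1. If a restart file exists, its first line names the input file
       and the interrupted calculation is resumed.
    2. Otherwise one file is moved from the inbox to the working
       directory and a seed restart file is written, so that the name
       survives a crash that happens before GEPARD writes its own.
    """
    restart = workdir / RESTART_NAME
    if restart.exists():
        lines = restart.read_text().splitlines()
        if lines and lines[0].strip():
            return lines[0].strip()

    candidates = sorted(p for p in inbox.iterdir() if p.is_file())
    if not candidates:
        return None  # the caller waits and tries again later

    data = candidates[0]
    shutil.move(str(data), str(workdir / data.name))
    # Only the first line matters; GEPARD is expected to treat the
    # rest as a corrupted restart file and start from the beginning.
    restart.write_text(data.name + "\n")
    return data.name
```

When the function returns None, the surrounding loop would sleep for the configured interval and call it again.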
Invocation of the GEPARD program
This part is very easy. Data scanning gave us the name of the input file. Now we call
GEPARD and give the file name on the command line. GEPARD then does everything that is needed.
When GEPARD ends, we collect all files produced by it into a single ZIP archive which
is then moved into the outbox folder. The working directory must be cleaned,
otherwise the REXX batch would decide to calculate the same data again and again. The batch
then returns to data scanning.
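A minimal Python sketch of the result-storing step, assuming that every regular file left in the working directory belongs to the finished calculation. The function name and the archive name are hypothetical and the original batch is written in REXX, so this only illustrates the idea.

```python
import shutil
import zipfile
from pathlib import Path


def store_results(workdir: Path, outbox: Path, archive_name: str) -> Path:
    """Pack all files of the working directory into one ZIP archive,
    move the archive to the outbox and clean the working directory."""
    # The list is taken before the archive is created, so the archive
    # itself is never packed.
    files = sorted(p for p in workdir.iterdir() if p.is_file())

    archive = workdir / archive_name
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in files:
            zf.write(p, arcname=p.name)

    final = outbox / archive_name
    shutil.move(str(archive), str(final))

    # Clean up only after the archive has safely reached the outbox;
    # otherwise the next scan would find the restart files again.
    for p in files:
        p.unlink()
    return final
```

Deleting the files only after the move succeeds means that a failure in between leaves the restart files in place, so nothing is lost.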
Simple network control
The above description may give the impression that the program can easily be controlled just by FTP and Telnet. The user (or operator) can ftp the data files to the watched inbox directory and retrieve the results from the outbox directory. The intermediate files can also be retrieved by FTP or can be directly examined by Telnet. This approach, although workable, has three pitfalls:
more is time consuming because it is not possible to return to the previous screen, and Telnet clients do not allow the listing to be interrupted in the middle by sending Ctrl-Break (instead it breaks the Telnet client). Telnet clients, mainly in MS Windows, do not correctly pass cursor movement commands and control keys. Therefore viewing the file in a text editor is not possible. The only useful way is to fetch the intermediate file via FTP and examine it locally, but this requires extra work.
As a solution to the problems posed in the previous section we set up a server which will coordinate all GEPARDs. There are several ways to design the server-client system:
I decided to use the third possibility. The control system consists of a few dynamically changing web pages. They enable the user to obtain basic information about the status of calculation on all GEPARD sites. CGI scripts available as hyperlinks let the user perform various actions. Other CGI scripts are accessible only from the programs which support GEPARD. These programs may thus download input data for GEPARD and send both the intermediate files and results to the server. All changes are then reflected on the web pages.
The greatest advantage of this method is that a widely supported protocol can be used
and the most important tools are either distributed with the operating system or are
GEPARD cannot communicate with the server directly because it is not programmed this way. Such communication is not needed anyway. Everything necessary can easily be incorporated in the REXX batch which controls GEPARD.
The communication is written in Perl and the client accesses specialized CGI scripts. These scripts verify the identification of the client. They do not allow access from WWW browsers and return status "403 Forbidden" without any explanation. Thus ordinary users cannot guess how to access these scripts and possibly do some damage.
Server-client communication modifies the REXX batch in three areas:
They are described in greater detail in the next sections.
The server must know all sites where GEPARD is running. First the server starts FTP
from time to time and downloads the files from working directories of all sites. Second,
the server must prepare web pages for the site. Registration is added to the preparation stage of the REXX batch. The server either creates the
pages for a newly registered site and returns "201 Created", or acknowledges that it
already knows the site by "204 No Content".
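The registration exchange could look roughly like the following Python sketch; the original client is written in Perl. The query parameter name "site" and the URL passed to register() are assumptions made for illustration. Only the two status codes come from the description above.

```python
import urllib.parse
import urllib.request


def interpret_registration(status: int) -> str:
    """Map the HTTP status of the registration script to its meaning."""
    if status == 201:
        return "created"  # the server has just created pages for the site
    if status == 204:
        return "known"    # the server already knew the site
    raise RuntimeError(f"unexpected registration status {status}")


def register(script_url: str, site_name: str) -> str:
    """Announce this site to the server (hypothetical URL and parameter).

    urlopen raises urllib.error.HTTPError for 4xx and 5xx replies,
    so only successful statuses reach interpret_registration.
    """
    query = urllib.parse.urlencode({"site": site_name})
    with urllib.request.urlopen(f"{script_url}?{query}", timeout=30) as resp:
        return interpret_registration(resp.status)
```

Because both answers mean that the server knows the site, the batch can call the registration on every start without keeping any local state.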
Network data scanning
This is an extension to the data scanning described earlier. Here we add another step which takes place when the inbox does not contain any files. The REXX batch then invokes a small Perl script which tries to obtain a data file from the server. Thus, if the user wishes to calculate certain data on a particular site, he or she can move them directly to the correct inbox. If this is not important, coordination is left to the server.
The input data file is retrieved via the CGI script. The script must also give the file name under which the file should be known to GEPARD. This can be done by a simple method. The CGI script does not send the file itself. Instead it replies with status "302 Moved Temporarily" and the "Location" field contains the URL of the data file. This response must not be cached according to RFC 2068 and the next query should be directed to the original URL of the CGI script. Thus it is exactly what we need.
If the server does not have any data file, the CGI script returns "404 Not Found". RFC
2068 does not specify whether this condition is permanent or temporary. Therefore, it is
legal to try it again after some delay -- and the data scanning
algorithm in the REXX batch will do it.
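The client side of this exchange might be sketched in Python as shown below; the original script is written in Perl. The sketch assumes that the file name for GEPARD is the last path segment of the "Location" URL, which is an interpretation of the description above, and the host and path arguments are hypothetical. http.client is used because it does not follow redirects automatically, so the "Location" field can be inspected.

```python
import http.client
import posixpath
from urllib.parse import unquote, urlsplit


def plan_fetch(status: int, location=None):
    """Decide what to do with the reply of the data-request script.

    Returns ("download", url, file_name) for a 302 reply and
    ("retry", None, None) for a 404 reply.
    """
    if status == 302 and location:
        name = posixpath.basename(unquote(urlsplit(location).path))
        if not name:
            raise ValueError("Location does not name a file")
        return ("download", location, name)
    if status == 404:
        return ("retry", None, None)  # no data now, ask again later
    raise RuntimeError(f"unexpected status {status}")


def ask_server(host: str, path: str):
    """Send one request to the CGI script (hypothetical host and path)."""
    conn = http.client.HTTPConnection(host, timeout=30)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return plan_fetch(resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

For example, a reply "302" with Location "http://server.example/data/run17.dat" would yield the file name "run17.dat".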
Network results notification
The results should be sent to the server so that the user can access them all from a single host. We could use form based file upload but this would require a greater amount of programming. The client could start FTP to the server but the server would not know that the results are there and some clever synchronization would be necessary. Thus the client just sends a notification to the server. Later the server itself retrieves all result files via FTP and, if they arrive safely, the server deletes the files from the site's outbox.
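One possible shape of such a notification is sketched below in Python. It is an assumption, not the original Perl code: the parameter names "site" and "file" are hypothetical, and the sketch lists every file currently waiting in the outbox so that a single notification is enough even if an earlier one was lost.

```python
from pathlib import Path
from urllib.parse import urlencode


def build_notification(site: str, outbox: Path) -> str:
    """Build the query string announcing all files waiting in the outbox."""
    names = sorted(p.name for p in outbox.iterdir() if p.is_file())
    pairs = [("site", site)] + [("file", name) for name in names]
    return urlencode(pairs)
```

The resulting string can be appended to the URL of the notification script or sent as a request body.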
It may happen that notification fails or the server cannot obtain the results by FTP.
Thus the client always sends the names of all files which are present in its
The user controls all GEPARD sites through a few web pages. The data files must be placed in a directory on the server via FTP. The user then accesses a corresponding WWW page and specifies the requested order of calculation.
Other WWW pages make it possible to retrieve the results and examine the intermediate files directly in the WWW browser.
I usually bring the results home on my ZIP diskette. One of the WWW pages hyperlinks to a CGI script which does it for me.
Of course, these pages are password protected in order to prevent unauthorized access.
GEPARD administration and upgrades
Sometimes the user might wish to perform special actions for various reasons. One of these actions is upgrading GEPARD itself or the DLLs containing some equations. In such cases the GEPARD program must be stopped and, after administration or upgrading, restarted.
Sometimes the calculation does not converge and it is better to interrupt it. If GEPARD dies, all intermediate files will automatically be collected by the REXX batch and moved to outbox, and the batch looks for the next input data file. Thus GEPARD automatically continues with its tasks.
All these actions may be performed from Telnet but it is necessary to have a Telnet
daemon on each GEPARD site. I have a program which can kill GEPARD but leave its session
running, or the whole session can be killed. Another program will start a separate
session with GEPARD so that it can be safely invoked from a Telnet client.
GEPARD and related programs now work on OS/2 Warp Connect 3.0 and OS/2 Warp 4.0. The computers must have at least 64 MB RAM. The client may even run on a server site because many tools and services are common and the requirements are therefore not additive.
Both the clients and the server contain programs written in C++. The C++ compiler should have support for templates. I wrote the programs with IBM VisualAge C++ 3.0 but GCC/EMX can probably be used instead.
TCP/IP Protocol must be installed on both the server and clients. It is supplied with
the above mentioned operating systems. Dial-in access is not sufficient because all sites
must work permanently as daemons. The sites must have fixed IP addresses, therefore DHCP
cannot be used.
The server must have the following tools and services:
The client must run on an operating system which supports DLLs or a similar mechanism where modules can be loaded at run-time on request from the program. GEPARD cannot work without this feature.
The client also needs:
After such a lengthy explanation it might be useful to show some pages in practice. You can view a simple demonstration but you must have a frame-enabled browser.