There are the following programs:
cgigate
: CGI that connects a remotely-authenticated (or anonymous) user to their own local JVM.
webgate
: CGI that connects a remotely-authenticated user to their own local JVM running as themselves. See OdysseyWebsuid.
odyssey
: Command-line program that connects a user's Unix command line to their own local JVM running as themselves. If this is invoked on a web server it will actually be the same JVM as they get through webgate. Referred to as cmdgate in a generic context; in our installation it is called odyssey.
restart
: Program invoked by the other ones via ssh to start the central JVM that connects to the database.
These are in the odyssey-2 package on capo, with source in /usr/source in the usual XHier way. All four are very similar, so they actually share their source code: the few differences are created by #ifdefs in the code, controlled by the makefile.
The wire protocols used are documented in OdysseyWireProtocols.
Every system call is checked for error responses. With a few specific exceptions, an error response results in a call to error(), which logs the error, attempts to report it to the user, and exits. Each call site passes error() a different number, which is recorded in the log and displayed in the error message along with errno. When the system call is made from within a procedure, that procedure itself takes a where parameter, which is used to construct the value passed to error().
Only numbers divisible by 10 are used literally. Numbers are assigned in the following ranges:
adapter.c
process.c
log.c
The IO routines use where, where+1, etc. when they make more than one system call. Numbers of 20000 or more are reserved for errors related to logging. Reporting one of these errors turns off logging before attempting to report the error to the user.
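The numbering scheme can be illustrated with a minimal sketch. It is not the actual adapter source: the bodies of error() and write_and_close(), the logging_enabled flag, and the message formats are assumptions made for illustration. Only the where/where+1 convention and the 20000 threshold come from the description above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* From the text: numbers of 20000 or more are reserved for logging errors. */
#define LOGGING_ERROR_BASE 20000

/* Hypothetical flag; the real adapter's logging state is not shown here. */
static int logging_enabled = 1;

/* Nonzero if `where` lies in the range reserved for logging errors. */
static int is_logging_error(int where)
{
    return where >= LOGGING_ERROR_BASE;
}

/* Log the failure, try to tell the user, and exit (formats are assumed). */
static void error(int where)
{
    int saved_errno = errno;

    if (is_logging_error(where))
        logging_enabled = 0;    /* do not log a failure of the logger itself */
    if (logging_enabled)
        fprintf(stderr, "error %d, errno %d (%s)\n",
                where, saved_errno, strerror(saved_errno));
    printf("Internal error %d (errno %d)\n", where, saved_errno);
    exit(EXIT_FAILURE);
}

/* An IO routine making two kinds of system call:
 * write() failures report `where`, close() failures report `where + 1`. */
static void write_and_close(int fd, const void *buf, size_t len, int where)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0)
            error(where);
        p += n;
        len -= (size_t)n;
    }
    if (close(fd) < 0)
        error(where + 1);
}
```

A caller would pass a literal that is divisible by 10, for example write_and_close(fd, buf, len, 1230); the value 1230 is made up and does not correspond to any real call site.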
This section describes the sequence of steps carried out by the adapter programs cgigate, webgate, cmdgate, and restart. For the most part they simply carry out the same steps in sequence, with some variation between the different adapter versions.
The log file is opened and the userid determined. The log file is named #webgate, #cmdgate, #restart, or #cgigate as appropriate, and is saved in logs/YYYY/MM/DD/ if today's date is YYYY-MM-DD.
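As a sketch of the naming rule only, the path could be built as follows. The function name log_path and its signature are hypothetical, and whether the real adapter uses local time or UTC for "today" is not stated, so the date is passed in by the caller.

```c
#include <stdio.h>
#include <time.h>

/* Format "logs/YYYY/MM/DD/#<adapter>" into buf.
 * Returns 0 on success, -1 if buf is too small.
 * `adapter` is one of "webgate", "cmdgate", "restart", "cgigate". */
static int log_path(char *buf, size_t size,
                    const struct tm *day, const char *adapter)
{
    int n = snprintf(buf, size, "logs/%04d/%02d/%02d/#%s",
                     day->tm_year + 1900, day->tm_mon + 1, day->tm_mday,
                     adapter);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For 13 April 2006 and the adapter name webgate this yields logs/2006/04/13/#webgate.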
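The userid is determined in one of three ways, depending on the adapter: from the real user ID (getuid()), unless that is equal to POWERLESS_USER, in which case $REMOTE_USER is used instead; from the real user ID (getuid()) unconditionally; or from the first command-line argument (argv[1]).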
The "log userid" is equal to the userid, unless the userid is null, in which case the string "%%anon%%" is used instead. The "log userid" is meant to be used precisely wherever it is not possible to record a null value, for example when formatting a file name. This rule is not followed perfectly in all code in the system, especially in Java where Java's null is not used everywhere it should be.
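The substitution rule is small enough to show directly. This is a sketch; the function and macro names are hypothetical.

```c
#include <stddef.h>

/* Placeholder used wherever a null userid cannot be recorded.
 * Note: never pass this string as a printf format, since "%%" would
 * then be collapsed to "%". */
#define ANON_LOG_USERID "%%anon%%"

/* Return the userid itself, or the placeholder if the userid is null. */
static const char *log_userid(const char *userid)
{
    return userid != NULL ? userid : ANON_LOG_USERID;
}
```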
webgate and cgigate read standard input in its entirety and then close it. The bytes read are recorded for later use.
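Reading a descriptor to end of file might look like the sketch below. The function name, the growth strategy, and the choice to return NULL rather than call error() are assumptions made to keep the example self-contained.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Read fd until end of file, then close it.
 * On success returns a malloc'd buffer and stores its length in *len;
 * on failure returns NULL (fd is left open if read() failed). */
static char *read_all_and_close(int fd, size_t *len)
{
    size_t cap = 4096, used = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    for (;;) {
        if (used == cap) {                  /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        ssize_t n = read(fd, buf + used, cap - used);
        if (n < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted: retry */
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;                          /* end of file */
        used += (size_t)n;
    }
    if (close(fd) < 0) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

An adapter would call it as read_all_and_close(STDIN_FILENO, &len) and keep the returned bytes until the request is sent.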
First the fifoName is determined. For restart this is #server. For cgigate this is the log userid. For webgate and cmdgate this is the log userid with ".=" appended. Note that there should be a one-to-one correspondence between these names and running JVMs, except that some of the JVMs may not be running at any given moment. The ".=" suffix can be thought of mnemonically: it reminds us that the corresponding JVM is running as the indicated user rather than as the odyssey user.
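The naming rules can be summarized in a short sketch. The real adapters select their behaviour at compile time with #ifdef; the runtime enum used here, like the function name, is only a device to show all four cases side by side.

```c
#include <stdio.h>

/* Hypothetical runtime stand-in for the compile-time adapter variants. */
enum adapter {
    ADAPTER_CGIGATE,
    ADAPTER_WEBGATE,
    ADAPTER_CMDGATE,
    ADAPTER_RESTART
};

/* Write the fifoName for the given adapter and log userid into buf.
 * Returns 0 on success, -1 if buf is too small or kind is invalid. */
static int fifo_name(char *buf, size_t size,
                     enum adapter kind, const char *log_userid)
{
    int n;

    switch (kind) {
    case ADAPTER_RESTART:
        n = snprintf(buf, size, "#server");
        break;
    case ADAPTER_CGIGATE:
        n = snprintf(buf, size, "%s", log_userid);
        break;
    case ADAPTER_WEBGATE:
    case ADAPTER_CMDGATE:
        /* ".=" marks a JVM running as the user themselves. */
        n = snprintf(buf, size, "%s.=", log_userid);
        break;
    default:
        return -1;
    }
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```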
Corresponding to the fifoName there are two files, both residing in the connect directory. fifoName.fifo is the actual FIFO, while fifoName.lock is a regular file used for locking. The sole reason for having two separate files is that FIFOs cannot be locked. The lock file is always a 0-byte file.
We open the lock file. If necessary, it and the corresponding FIFO are created. Next we lock the entire lock file. This begins the critical section.
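One way to enter the critical section is sketched below. It assumes fcntl() record locking with l_len set to 0 to cover the whole file; the text does not say which locking call the adapter actually uses. The function name and the 0600 permissions are likewise assumptions.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open (creating if needed) <connect_dir>/<fifo_name>.lock and the
 * matching .fifo, then take an exclusive lock on the whole lock file.
 * Returns the lock file descriptor, or -1 on failure. Closing the
 * descriptor releases the lock and so ends the critical section. */
static int enter_critical_section(const char *connect_dir,
                                  const char *fifo_name)
{
    char lock_path[4096], fifo_path[4096];
    struct flock fl = { 0 };
    int n, fd;

    n = snprintf(lock_path, sizeof lock_path, "%s/%s.lock",
                 connect_dir, fifo_name);
    if (n < 0 || (size_t)n >= sizeof lock_path)
        return -1;
    n = snprintf(fifo_path, sizeof fifo_path, "%s/%s.fifo",
                 connect_dir, fifo_name);
    if (n < 0 || (size_t)n >= sizeof fifo_path)
        return -1;

    fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (mkfifo(fifo_path, 0600) < 0 && errno != EEXIST) {
        close(fd);
        return -1;
    }

    fl.l_type = F_WRLCK;        /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file": entire file */
    while (fcntl(fd, F_SETLKW, &fl) < 0) {
        if (errno != EINTR) {   /* retry only if interrupted by a signal */
            close(fd);
            return -1;
        }
    }
    return fd;
}
```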
Now we open the FIFO for write, in non-blocking mode. If there is a JVM waiting for the request, the open succeeds and, after switching the FIFO to blocking mode, we move on to the next step. If there is no JVM waiting, the open should fail. In that case we start a new Java process and open the FIFO in blocking mode, so we block on the FIFO until the Java process has started up and is ready to accept input from it. Except for restart, we also SSH to the main database server to restart (if necessary) the main server process and obtain the connection key for our JVM to connect to the main server. In this case we wait until the SSH process terminates before proceeding, as its stdout is hooked to the FIFO to the JVM, so the JVM sees the output of the SSH process.
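The probe-then-block pattern relies on standard FIFO behaviour: opening a FIFO with O_WRONLY | O_NONBLOCK fails with ENXIO when no process has it open for reading. The sketch below shows only the two opens; starting the Java and SSH processes is omitted, and the function names and the NO_READER return code are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define NO_READER (-2)   /* hypothetical code: no JVM is waiting */

/* Try to open the FIFO for writing without blocking.
 * Returns a descriptor already switched back to blocking mode if a
 * reader is waiting, NO_READER if none is, or -1 on any other error. */
static int open_fifo_if_reader(const char *fifo_path)
{
    int flags;
    int fd = open(fifo_path, O_WRONLY | O_NONBLOCK);

    if (fd < 0)
        return errno == ENXIO ? NO_READER : -1;

    flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Blocking open, used after a new JVM has been started: returns only
 * once some process has opened the FIFO for reading. */
static int open_fifo_blocking(const char *fifo_path)
{
    return open(fifo_path, O_WRONLY);
}
```

A caller would try open_fifo_if_reader() first and, on NO_READER, start the Java process and then call open_fifo_blocking().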
Now we create the FIFOs which will be used to finish the communication with the JVM. These FIFOs are per request, and are created in the same connect directory as the request-submission FIFOs. Their names are formed by separating the fifoName, the process ID, and the "type" by periods. For cmdgate, the "type" is stdi for stdin, stdo for stdout, and stde for stderr. For the other adapters, the "type" is just resp.
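A sketch of the per-request naming rule follows. The function name is hypothetical, and whether the real names carry any further suffix is not stated above, so none is added here.

```c
#include <stdio.h>
#include <sys/types.h>

/* Build "<fifoName>.<pid>.<type>" into buf.
 * `type` is "stdi", "stdo" or "stde" for cmdgate, "resp" otherwise.
 * Returns 0 on success, -1 if buf is too small. */
static int request_fifo_name(char *buf, size_t size, const char *fifo_name,
                             pid_t pid, const char *type)
{
    int n = snprintf(buf, size, "%s.%ld.%s", fifo_name, (long)pid, type);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For a cmdgate process with process ID 1234 serving user alice, the stdout FIFO would be named alice.=.1234.stdo.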
If the FIFO already exists, it is considered an error. This should not occur since the previous adapter process with the same process ID should have removed the FIFO. However, this must be considered a defect in the adapter; if it finds the FIFO already existing, it should either just use it, or delete the existing one first. Any other process trying to use it would have to have the same process ID, so it cannot exist at the same time.
Now that we have the FIFO opened for write, we send the request to the JVM. This is described in OdysseyWireProtocolJVMRequest. Once the request is sent, we close the FIFO and then the lock file. This ends the critical section.
All that remains is to take the response sent by the JVM and send it to standard output, except for cmdgate, where we need to take input on stdin and send it to the JVM, and echo data sent by the JVM to stdout or stderr as appropriate.
For cmdgate, we use select() to monitor stdin and the stdo and stde FIFOs for available data. When data is available on one of them, we copy it to the stdi FIFO, stdout, or stderr respectively. For the other adapters, we simply send everything from the resp FIFO to stdout.
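The cmdgate copy loop could be sketched as follows. Two points are assumptions, since the description does not cover them: the loop ends once both the stdo and stde FIFOs have reached end of file, and the stdi FIFO is closed when stdin reaches end of file so that the JVM sees the end of input. All function names are hypothetical.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Write all of buf to fd. Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy one chunk from src to dst.
 * Returns 1 if src is still open, 0 at end of file, -1 on error. */
static int copy_chunk(int src, int dst)
{
    char buf[4096];
    ssize_t n = read(src, buf, sizeof buf);

    if (n < 0)
        return errno == EINTR ? 1 : -1;
    if (n == 0)
        return 0;
    return write_all(dst, buf, (size_t)n) < 0 ? -1 : 1;
}

/* Relay stdin -> stdi FIFO, stdo FIFO -> stdout, stde FIFO -> stderr.
 * Returns 0 when both JVM output FIFOs have closed, -1 on error. */
static int relay(int in_fd, int stdi_fd, int stdo_fd, int out_fd,
                 int stde_fd, int err_fd)
{
    int src[3] = { in_fd, stdo_fd, stde_fd };
    int dst[3] = { stdi_fd, out_fd, err_fd };
    int is_open[3] = { 1, 1, 1 };
    int i;

    while (is_open[1] || is_open[2]) {
        fd_set readable;
        int maxfd = -1;

        FD_ZERO(&readable);
        for (i = 0; i < 3; i++) {
            if (is_open[i]) {
                FD_SET(src[i], &readable);
                if (src[i] > maxfd)
                    maxfd = src[i];
            }
        }
        if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (i = 0; i < 3; i++) {
            int r;
            if (!is_open[i] || !FD_ISSET(src[i], &readable))
                continue;
            r = copy_chunk(src[i], dst[i]);
            if (r < 0)
                return -1;
            if (r == 0) {
                is_open[i] = 0;
                if (i == 0)
                    close(stdi_fd);   /* assumed: signal EOF to the JVM */
            }
        }
    }
    return 0;
}
```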
Once the response copying is done, we are done. We do not explicitly perform cleanup. Instead, on startup, an atexit() handler is installed which performs cleanup actions consisting of deleting the per-process FIFOs. In addition, a signal handler is installed so that hangups, interrupts, terminations, segmentation violations, and bus errors all result in cleanup occurring and a log entry being made prior to the exit occurring. Child exits are handled separately.
If the SSH process terminates, a flag is simply set which is tested in the code that initiates the SSH process. This is normal operation.
If the Java process terminates, this indicates a serious problem with the Java environment, such as the required class files being missing. We report the error and terminate.
If a different process terminates, it is an unexpected error since the SSH and Java processes are the only ones we ever spawn.
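The three cases can be sketched as a classification of reaped children. The variable and function names are hypothetical, and how the real adapter tracks its child process IDs is not described; this version only sets a flag and returns a code, leaving reporting and termination to the caller.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

enum child_kind { CHILD_SSH, CHILD_JAVA, CHILD_UNKNOWN };

/* Hypothetical bookkeeping: pids recorded when the children are spawned. */
static pid_t ssh_pid = -1;
static pid_t java_pid = -1;

/* Flag tested by the code that started the SSH process. */
static volatile sig_atomic_t ssh_done = 0;

/* Decide which of our children a reaped pid belongs to. */
static enum child_kind classify_child(pid_t pid)
{
    if (ssh_pid > 0 && pid == ssh_pid)
        return CHILD_SSH;
    if (java_pid > 0 && pid == java_pid)
        return CHILD_JAVA;
    return CHILD_UNKNOWN;
}

/* Reap every exited child without blocking.
 * Returns 0 if nothing abnormal happened (SSH exit is normal),
 * 1 if the Java process exited, 2 if an unexpected child exited. */
static int reap_children(void)
{
    int status;
    int result = 0;
    pid_t pid;

    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        switch (classify_child(pid)) {
        case CHILD_SSH:
            ssh_done = 1;          /* normal operation: just set the flag */
            break;
        case CHILD_JAVA:
            if (result < 1)
                result = 1;        /* serious problem with the Java setup */
            break;
        default:
            result = 2;            /* we never spawn anything else */
            break;
        }
    }
    return result;
}
```

A caller would treat a return value of 1 or 2 as fatal and report it through the usual error path.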
-- IsaacMorland - 13 Apr 2006