nohup nice +6 ...
.
Proposed improvements that would be irrelevant given CSCF involvement described below.
At the start of the term, each course is expected to take the following steps:
Run pub_test_kill and then pub_test_launch to ensure the test runners are in place and running fresh code.
bin directory and ensure that it is chmod 6555 so that students can request tests from the command line. This additionally requires that bin and every one of its parent directories be world executable.
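The world-executable requirement on bin and its parents is the part most easily missed, so a small check may help. The following is only a sketch: the function name `check_world_exec` is hypothetical and not part of the existing public test scripts, and it inspects the mode string printed by `ls -ld` rather than testing real access.

```shell
#!/usr/bin/env bash
# check_world_exec DIR  (hypothetical helper, not part of the existing scripts)
# Prints every directory from DIR up to / that lacks the world-execute bit.
# Returns 0 if all of them are world executable, 1 if any is not, 2 on error.
check_world_exec() {
    local dir status=0 bit
    dir=$(cd -- "$1" && pwd -P) || return 2
    while :; do
        # Character 10 of the `ls -ld` mode string is the world-execute slot.
        bit=$(ls -ld -- "$dir" | cut -c10)
        case "$bit" in
            x|t) ;;    # 't' is the sticky bit with world execute set (e.g. /tmp)
            *) echo "not world executable: $dir"; status=1 ;;
        esac
        [ "$dir" = "/" ] && break
        dir=$(dirname -- "$dir")
    done
    return "$status"
}
```

A course account could run `check_world_exec "$HOME/bin"` at the start of term; any directory it prints would need `chmod o+x`.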
While this is a relatively small burden, it is easily overlooked by staff in any given course, and it is unnecessary if there is additional CSCF support. Also, pending any additional insights in RT #71466, both pub_test_launch and distrst (a program that automatically spreads batch autotesting load across multiple servers) are locked to Solaris 8, which is particularly undesirable given the pending shift to Ubuntu for many courses.
CSCF has already copied a particular state of pub_test_request so that it is automatically available in the standard PATH for all students on student.cs systems.
This is currently being reviewed and may end up becoming a symlink in future terms.
Right now, pub_test_request calls a command that must be placed on each course account manually so that it can gain setuid status before calling pub_test_logger. Instead, it should be possible for this to run as some other privileged user that can set its uid and gid to those of any course account and cs-marks, and then drop down to course account permissions to run pub_test_logger, without an executable being placed on the course accounts. (The primary proposal is for this privileged account to be the isg account, and for sudo or ssh access to the course account to be used.) This would simplify use of the command from the instructional standpoint.
The other potential involvement would be for the pub_test_runner executables. Right now, every course launches N of these on each server, where N is the number of processors on that server. Most of the time, these processes are idle and polling unnecessarily. Instead, it should be possible for the privileged account to launch this pool of processes, and for each of them to drop down to the appropriate course to search for requests and service them if necessary. Again, this decreases the burden on the course accounts in terms of monitoring daemon status. However, it does mean that the pollers need to do more work: they must read configuration files and check the appropriate directory on each account dropdown, rather than reading configuration once at startup. The advantage is that configuration is refreshed automatically without restarting the test runners.
There is also the issue that the intent is for each course to choose a particular platform on which to launch the test runners. The privileged account would have to run them on all platforms, and the course would then need a way to list every server on which it wants requests serviced, so that only the appropriate runners take action.
"Privileged user X" launches public test daemons, one per “processor” on every server in the student.cs environment. This is done to prioritize the fast servers when requests are serviced. bin/util/numprocessors in the ISG subversion repository tries to obtain this count; at the time of writing, a current unstable checkout is available at /u2/isg/u/tavaskor/working/bin/util/numprocessors, which appears to count cores and hyperthreading on Linux in addition to simple physical processors (which, in this case, seems appropriate).
These daemons should possibly be checked on periodically by a cron job to refresh any that may have crashed or been killed; if this is done automatically, it means CSCF wouldn't need to handle gripes about the runners dying in any cases where this happens.
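One way the per-processor pool and the periodic cron check could fit together is a single idempotent script that starts a runner in every empty slot. This is a sketch under assumptions: the names `num_processors` and `ensure_pool`, the PID-file layout, and the use of `getconf _NPROCESSORS_ONLN` as a stand-in for bin/util/numprocessors are all hypothetical rather than taken from the ISG repository.

```shell
#!/usr/bin/env bash
# Hypothetical pool keeper; safe to run repeatedly (for example from cron).

# Processor count. getconf is only a stand-in for bin/util/numprocessors.
num_processors() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# ensure_pool PID_DIR COUNT COMMAND [ARGS...]
# For each slot 1..COUNT, start COMMAND unless the PID recorded for that
# slot still refers to a live process.
ensure_pool() {
    local pid_dir=$1 count=$2 i=1 pid pid_file
    shift 2
    mkdir -p -- "$pid_dir" || return 1
    while [ "$i" -le "$count" ]; do
        pid_file="$pid_dir/runner.$i.pid"
        pid=""
        if [ -f "$pid_file" ]; then pid=$(cat "$pid_file"); fi
        if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
            # Slot is empty or its runner has died: start a fresh one.
            nohup "$@" >/dev/null 2>&1 &
            echo "$!" > "$pid_file"
        fi
        i=$((i + 1))
    done
}
```

A crontab entry for the privileged account could then call something like `ensure_pool /some/pid/dir "$(num_processors)" <runner command>` every few minutes. Note that `kill -0` only confirms that some process with that PID exists, so a stricter check would also compare the command name.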
~isg/bin/public_test/pub_test_runner on any appropriate courses. There are various approaches to this; one may be:
~isg/bin/public_test/pub_test_runner is executable); then run it
bin/public_test/pub_test_runner in the ISG repository currently sleeps a random amount of time, with longer sleep periods if it's been “a while” since it last needed to run tests
pub_test_runner would check that `hostname` is in the allowable list for this course, then essentially follow the same algorithm it currently does, but without the infinite loop and sleeping, as the privileged-user wrapping process would now handle that.
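The division of labour described above, where the wrapper owns the loop and the sleeping while the per-course runner makes a single pass, might look roughly like the sketch below. Everything named here is an assumption for illustration: `next_sleep`, `poll_once`, `run_pool_member`, the 5-second step and 60-second cap, and the convention that a caller-supplied `service_course` function returns 0 only when it actually ran tests. None of it reflects the real sleep policy in the ISG repository.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper loop for one member of the privileged runner pool.

# next_sleep IDLE_POLLS
# Prints a random delay in seconds; the window widens with each idle poll.
next_sleep() {
    local idle=$1 max
    max=$(( 5 + 5 * idle ))                  # assumed growth: 5 s per idle poll
    if [ "$max" -gt 60 ]; then max=60; fi    # assumed cap: one minute
    echo $(( 1 + RANDOM % max ))
}

# poll_once COURSE...
# Calls service_course (supplied by the caller) once per course.
# Returns 0 if at least one course had work, 1 otherwise.
poll_once() {
    local course did_work=1
    for course in "$@"; do
        if service_course "$course"; then did_work=0; fi
    done
    return "$did_work"
}

# run_pool_member COURSE...
# Loops forever: poll every course, then sleep longer the longer it stays idle.
run_pool_member() {
    local idle=0
    while :; do
        if poll_once "$@"; then idle=0; else idle=$(( idle + 1 )); fi
        sleep "$(next_sleep "$idle")"
    done
}
```

Here `service_course` would be where the drop to the course account happens, for instance by invoking the runner through sudo as that course; that step is deliberately left to the caller.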
The net effect is that some scripting/maintenance weight is lifted from ISG and the courses, as every individual course does not need to know how to launch and maintain the public test runners, or figure out a way to launch them on only a particular selection of servers.
However, there would still be a need at the start of term for initial configuration. To simplify this so that a single configuration option can be used for both the public test runners and distrst, the most natural option will likely be a list of allowable servers in .rstrc.
As the configuration is read in by bash scripts, this would most naturally be an array; for example,
test_servers=( cpu16.student.cs cpu18.student.cs cpu20.student.cs )
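A runner could consult such an array roughly as follows. This is a sketch only: the function names `host_allowed` and `course_wants_host` are hypothetical, and it assumes the value printed by `hostname` has the same form as the entries in the list.

```shell
#!/usr/bin/env bash
# host_allowed HOST SERVER...  (hypothetical helper)
# Succeeds if HOST exactly matches one of the listed servers.
host_allowed() {
    local host=$1 server
    shift
    for server in "$@"; do
        [ "$host" = "$server" ] && return 0
    done
    return 1
}

# course_wants_host RSTRC_FILE HOST  (hypothetical helper)
# Sources the course's configuration in a subshell, so nothing it sets
# leaks into the caller, then checks HOST against test_servers.
course_wants_host() {
    (
        test_servers=()          # default: no servers listed
        . "$1" || exit 1
        host_allowed "$2" "${test_servers[@]}"
    )
}
```

For example, `course_wants_host ~/.rstrc "$(hostname)"` would succeed only on a listed server. Because sourcing the file executes whatever it contains, this check should run only after the process has dropped to the course account's permissions.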
The overall net effect is not a complete elimination of the start-of-term setup requirements for each course regarding public tests, but still a reduction to a single statement in a configuration file.