ISG Web>ISGScripts>ISGTechnicalDirection (2010-10-18, TerryVaskor)

This document is an archive of a proposal that was developed early in Winter 2010. Implemented portions of it are reflected in ISGAccountScriptConventions.

Technical Direction of the Instructional Support Group

This document summarizes the current state of the support scripts for the Instructional Support Group (ISG), the direction these scripts are currently intended to take, and the type of support from CSCF that would assist this.

Support Scripts Needs vs. Student Support Needs

There are two distinct types of software support from CSCF that are used by the testing scripts. The first is the student environment. This includes the versions of DrScheme, Python, and other programs that students use to complete their work, and which should also be used to test student submissions and to give examples in lecture. This requires synchronization across a diverse set of computing environments, including Windows, Mac OS X 10.5, and Solaris 8, and ideally the newer Ubuntu 8, Ubuntu 9, and Solaris 10 systems. This has proven difficult to accomplish on a term-by-term basis because the newest software version changes frequently (particularly in the case of DrScheme). An attempt is made to ensure that all first-year courses require exactly the same version of each required piece of software; this could also be done explicitly for second-year courses. Providing the same version everywhere is very important so that students can reasonably expect the programs they write to behave the same way at home, in the labs, and when the automated tests are run.

The second type of support is a completely stable packaging system that protects scripts from unanticipated changes in the default version of a package, and from interface or behaviour changes in package updates. This was highlighted during the Fall 2008 term, when the default version of the tetex package changed and broke a portion of rst. Xhier appears to provide a way to guarantee this stability across Solaris and Linux systems, since precisely-named packages do not change after being installed; only the defaults do. However, it should be possible to accommodate other packaging systems: the requirepackage function would just need to be modified to search multiple directories for each package, given the requirement that the package name must appear as a directory somewhere in each path.
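One way such a multi-directory search could look is sketched below. The function name `findpackage` and the variable `ISG_PACKAGE_ROOTS` are hypothetical and not part of the existing setup script. The sketch also assumes that every packaging system exposes a `<root>/<package>/bin` directory in the way Xhier does under /software; other systems may well need a different layout.

```bash
# Hypothetical sketch: search several package roots for the package "$1".
# ISG_PACKAGE_ROOTS is an assumed colon-separated list of roots; only
# /software (the Xhier root) is taken from the existing scripts.
: "${ISG_PACKAGE_ROOTS:=/software}"

findpackage () {
   # Print the first readable, executable <root>/<package>/bin directory.
   # Returns non-zero if no root provides the package.
   local pkg="$1" root oldifs="$IFS"
   IFS=':'
   for root in $ISG_PACKAGE_ROOTS; do
      IFS="$oldifs"
      if [ -d "$root/$pkg/bin" ] && [ -r "$root/$pkg/bin" ] && [ -x "$root/$pkg/bin" ]; then
         echo "$root/$pkg/bin"
         return 0
      fi
   done
   IFS="$oldifs"
   return 1
}
```

With a helper like this, requirepackage could add the printed directory to the PATH when the lookup succeeds and fall through to its existing warning when it does not.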

Core ISG Scripts

ISG provides a number of scripts for use on the course accounts, including scripts to process plagiarism checking results, perform autotesting, and generate announcements in various formats. These scripts need to be stable and, in light of the expanding number of supported architectures in the student.cs environment, cross-platform. To achieve this, the scripts are gradually being retrofitted to make use of a central setup script and a package-requiring function. The stability requirement suggests that package names should specify versions as precisely as possible, so script-writers can reasonably assume these packages will not change. The cross-platform requirement means that these packages must be installed on all supported architectures in the student.cs environment.

To accommodate this framework, all programs are required to have a bash-script entry point. This will exec and/or source a common setup program which sets up a basic environment and provides a requirepackage function; see Appendix A for an example. This function can be called for every required package before the program continues in bash or calls an executable written in another language. Every time requirepackage is called, it will either add the required package to the PATH environment variable, or output a warning to standard error to inform the user immediately that the package is not available. The setup script itself requires specific versions of bash and perl, as these are the two languages most commonly used by the ISG utilities.

Testing Environment

ISG provides a program called rst which is used in many courses to do autotesting. This program provides entry points for other code that will perform assignment-specific tests. It is at this point that the term-specific student environment requirements come into play. For example, many first-year courses hook into the ISG-supported BitterSuite framework to write their tests. These tests typically do not hard-code any particular package requirements. This makes them easier to reuse as software versions are updated and, in courses which create custom scripts for every assignment, means that test authors need to know as little as possible about the particulars of the student.cs environment or Xhier.

Future Directions: Virtual Hosts

To meet the stability requirement, scripts will continue to be written in a way that makes use of specific versions of required software. As long as packages are named as specifically as possible and are installed on every architecture in student.cs in specific predetermined locations, this should be sufficient.

As noted earlier, though, there has been some difficulty synchronizing software that changes every term across all of the student systems. Furthermore, autotesting is done on code in what is very likely a far different environment from the one students are using at home. Virtual machine software could offer a solution by allowing the software that changes term-by-term to be targeted only to a single platform. If we can guarantee a single required version of each piece of software across all of the first- and second-year courses, this greatly simplifies the required setup between terms.

An ideal algorithm would be as follows:

1. Load the testing setup in the stable scripted environment that will run on each supported platform.
2. Launch a “clean” virtual machine, containing nothing except the software pre-loaded before the start of term.
3. Copy over student files, testing files, and testing scripts.
4. Run the autotesting software on the virtual machine.
5. Copy back all generated files and shut down the virtual machine.
6. Perform all post-test analysis in the original environment.
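The virtual-machine portion of this cycle (launch, copy in, run, copy back, shut down) can be sketched as a small driver. Every `vm_*` function below is a placeholder that only prints what it would do; no particular virtual machine product or command-line interface is assumed, and `run_student_tests` is a hypothetical name. Loading the testing setup and the post-test analysis would happen in the calling environment, outside this function.

```bash
# Placeholder operations: each would be replaced by whatever the chosen
# virtual machine software actually provides.
vm_restore_clean () { echo "restore frozen image for $1"; }
vm_copy_in ()       { echo "copy in: $*"; }
vm_run_tests ()     { echo "run tests for $1"; }
vm_copy_out ()      { echo "copy results for $1 to $2"; }
vm_shutdown ()      { echo "shut down VM for $1"; }

run_student_tests () {
   # $1 = student id, $2 = directory that receives the generated files
   local student="$1" results="$2" status=0
   vm_restore_clean "$student" || return 1
   # Record the first failure, but always shut the machine down afterwards.
   { vm_copy_in "$student" submission tests scripts &&
     vm_run_tests "$student" &&
     vm_copy_out "$student" "$results"; } || status=$?
   vm_shutdown "$student"
   return "$status"
}
```

Restoring the frozen image at the start of every call, rather than reusing the state left by the previous student, is what keeps one student's run from affecting the next.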

The virtual machine would require the following properties to be executable in this fashion:

- It would be possible to load the machine from a pre-determined frozen state instead of just the state as of the last run.
- The virtual machine would need to be scriptable, or in some other way directly manipulable; for example, via a virtual network port that in reality is tied to a pipe in the testing account's filespace.

Ideally, it would also have the following features:

- Support for a graphical environment so students would be able to run it on their own home machines.
- Free, open-source software so students would be able to use it without additional cost and so there would be no direct cost for the university.
- Lightweight, so it can be booted, have files copied to it, and then execute the tests in a reasonably brief amount of time.

At the present time, it is not known whether any available virtual machine software meets all of these needs. If none does, VirtualBox is a GPLed virtual machine that would allow us to copy over files and run all of the autotesting, but possibly without the convenience of automated file copying on and off the virtual machine or resetting to a pristine virtual machine state between students.

Other advantages of a virtual host:

- We install only the software that's required, removing the ability for students to use all sorts of available software that we don't want to give them access to.
- If we reboot it for every test run for every student with a fresh image each time, we prevent students from being able to leave resident processes hanging around, or from leaving extra files behind or overwriting existing ones.

NB: The MarkUs team at the University of Toronto is planning to integrate virtual machines as a core part of their testing framework. For that system at least, the virtual machine implementation issue may be solved for us.

Appendix A – Script Entry Point

`requirepackage` function

This is the key function that specifies particular dependencies. It issues a warning to standard error if a package is not present, attempting to do so only once per package where it can. If it can find a package directory, that directory is prepended to the PATH, using showpath if that command is available to try to minimize the size of PATH.

Currently, this only looks for the Xhier-style directory /software/package/bin. It can be extended to look elsewhere once the conventions of other packaging systems are known.

# This pathadd function is here to allow additions to the PATH to use showpath
# (which handles path conflicts) if it exists, or to try to check and prepend
# to the PATH otherwise.
#
# NB: For this to work, grep must be in the PATH already.

pathadd () {
   local sp='/bin/showpath'

   # Make sure this is accessible before adding it to the PATH.
   if [[ -d "$1" && -r "$1" && -x "$1" ]]; then

      if [ -z "$PATH" ]; then
         # Base case
         export PATH="$1"

      elif [[ -x "$sp" ]]; then
         # Take advantage of showpath if possible...
         # If the PATH is empty, showpath seems to spit out '.' for current.
         # In early 2009, I think this was defaulting to standard... the new
         # behaviour is really *NOT* desired, so MAKE SURE IT NEVER HAPPENS!
         export PATH="$("$sp" "$1" current)"

      else
         # Try to handle repeats in the PATH
         # Not perfect; for example, trailing slashes are a known problem
         # Just an approximation in the absence of showpath.
         local pcache=$(echo $PATH | grep -v ":$1:" | grep -v "^$1:" | grep -v ":$1$")
         if [ ! -z "$pcache" ]; then
            export PATH="$1:$PATH"
         fi
      fi
   fi
}




# The following function is designed to be run from any auxiliary scripts.
#
# To find all required packages, try issuing the following command from the ISG
# base directory (which will still leave a small amount of garbage...):
# grep -r requirepackage * | perl -ne '/requirepackage (.*)$/; print "$1\n";' | sort | uniq

requirepackage () {
   local res=""

   # Now try to find the passed in package.
   local xhpath="/software/$1/bin"
   if [[ -r "$xhpath" && -x "$xhpath" ]]; then
      res="$xhpath"
   fi
   # TODO: Add elif clauses to check for package paths elsewhere.


   if [[ -z "$res" ]]; then

      # Format an appropriate error message, dumped to stderr; attempt only
      # to do this once.
      # Use an environment variable to try to accomplish this, which will handle
      # repeat requests from parent->child, but not amongst siblings.
      if ! echo -e "$ISG_CHECKED_PACKAGES" | egrep "^$1$" > /dev/null; then
         export ISG_CHECKED_PACKAGES="$ISG_CHECKED_PACKAGES\n$1"

         echo 'WARNING: Expected package' >&2
         echo "   $1" >&2
         echo -e 'could not be found; relying on default\n' >&2
      fi
   else
      # Add the package that was found to the PATH.
      pathadd "$res"
   fi
}
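Two details of the listing above could be tightened without changing its interface; the following is only a sketch, and neither helper exists in the ISG scripts. The fallback branch of pathadd filters PATH through grep, which needs grep on the PATH and treats the directory name as a regular expression; a `case` pattern avoids both and can also ignore a single trailing slash. Similarly, the warn-once check uses egrep, so the `.` in a name such as javajdk-1.5 acts as a wildcard; `grep -F -x` compares whole lines literally. The names `path_contains` and `warned_already` are hypothetical.

```bash
path_contains () {
   # Succeed if directory $1 is already an element of the colon-separated
   # list $2 (defaults to $PATH). Quoted case patterns match literally.
   local dir="$1" list="${2-$PATH}"
   # Ignore one trailing slash, but leave the root directory alone.
   if [ "$dir" != "/" ]; then dir="${dir%/}"; fi
   case ":$list:" in
      *":$dir:"*|*":$dir/:"*) return 0 ;;
      *) return 1 ;;
   esac
}

warned_already () {
   # Succeed if package $1 is already recorded in ISG_CHECKED_PACKAGES.
   # printf '%b' expands the "\n" separators that requirepackage stores;
   # grep -F -x matches a whole line as a fixed string, not a regex.
   printf '%b\n' "$ISG_CHECKED_PACKAGES" | grep -F -x -e "$1" > /dev/null
}
```

In pathadd the fallback branch would then reduce to something like `path_contains "$1" || export PATH="$1:$PATH"`, and requirepackage would test `if ! warned_already "$1"` before printing its warning.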

Standard Entry Code

The following sample code is also available at https://www.cs.uwaterloo.ca/twiki/view/ISG/ISGAccountScriptConventions

It uses setup as early as possible from a generic bash entry point, then executes a secondary script in the setup environment to ensure that the correct version of bash is being used. The newly-executed version then sources the setup script to obtain the requirepackage function, which it can then use to require particular software packages.

The common script that any utility meant to be run from the command line (rst, rsta, announce, etc.) symlinks to, isolating the initialization complexity in a single file:

#!/bin/bash -p

# This harness attempts to isolate generic bash code, so all 
# supplementary programs can assume they have access to a 
# decent version of bash (or perl).
#
# This will assume a file of the same name as the one invoked
# with _impl appended will exist in the ISG bin subdirectories.

fail () {
   echo 'Failed to execute setup; aborting' >&2 
   exit 125
}

if [ -z "$ISG_BIN_SETUP_DIRS" ]; then

   # If we reach this point, setup has *NOT* been run.
   # Guess at a default PATH to try to find dirname; if it can't be found, die.
   # Then rely on setup to find basename, etc.

   PATH='/bin:/usr/bin'; hash -r
   dirname "$0">/dev/null || fail
   nextexe="$("$(dirname "$0")/setup" bash -c "eval echo '\$(basename "$0")_impl'")" || fail
   exec "$(dirname "$0")/setup" "$nextexe" "$@" || fail

else

   # If we reach this point, setup *HAS* been run.
   # Trust it to give us a proper basename executable, then defer to the auxiliary executable.
   basename "$0" >/dev/null || fail
   exec "$(basename "$0")_impl" "$@" || fail

fi
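The symlink arrangement this harness relies on can be tried in a scratch directory. The sketch below uses a stand-in harness (not the real one, and without setup) to show how a single file reached through differently named symlinks derives the `_impl` name from `$0`. The variable `demo_dir` and the file name `harness` are assumptions made for the demonstration.

```bash
demo_dir="$(mktemp -d)"

# Stand-in for the harness: report which _impl executable it would defer to.
cat > "$demo_dir/harness" <<'EOF'
#!/bin/sh
echo "$(basename "$0")_impl"
EOF
chmod +x "$demo_dir/harness"

# Each command-line utility is just a symlink to the one harness file.
ln -s harness "$demo_dir/rst"
ln -s harness "$demo_dir/announce"

"$demo_dir/rst"        # prints rst_impl
"$demo_dir/announce"   # prints announce_impl
```

The scratch directory can be removed afterwards with `rm -rf "$demo_dir"`.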

Use of `requirepackage`

Then, a secondary script that needs the package javajdk-1.5 would begin as follows:

#!/usr/bin/env bash

# Make sure requirepackage, absolute, etc. are defined if needed by sourcing setup...
. "$(dirname "$0")/setup"

# Any needed requirepackage statements now go here
requirepackage javajdk-1.5

Topic revision: r9 - 2010-10-18 - TerryVaskor

Instructional Support Group, David R. Cheriton School of Computer Science, University of Waterloo