This page describes in detail the handling of extracts from the PeopleSoft SA system.
Each extract email consists of a text body, which describes the parameters used to produce the extract, and a single attachment containing the actual data as a CSV file. The CSV file appears to use no escaping or quoting at all: every comma separates two fields, so a field value can never contain a comma. Commas occurring in the underlying data seem to be replaced by spaces.
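Because the file appears to have no quoting or escaping, a plain split on commas should be enough to read it. The following is a minimal Python sketch under that assumption; the function names are hypothetical, and nothing here assumes whether the first row is a header.

```python
def parse_extract_line(line: str) -> list[str]:
    """Split one line of an extract CSV on every comma."""
    # Assumption: there is no quoting or escaping, so quote characters
    # are treated as ordinary data and every comma ends a field.
    return line.rstrip("\r\n").split(",")


def parse_extract(text: str) -> list[list[str]]:
    """Parse the whole attachment body into rows of fields."""
    # Blank lines are skipped; all other lines are split as above.
    return [parse_extract_line(line) for line in text.splitlines() if line]
```

A general CSV parser with quoting enabled would treat a double quote as a field delimiter, which may not match what the extract actually contains, so the plain split is the more conservative choice here.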
The attachment is named [process]-[extract].csv, where process is the numeric process ID from the process viewer in Quest and extract is the code for the type of extract (e.g. uwxplans, uwxgpcon). The first step in processing an extract file is to save the text body and the attachment to separate files. The attachment is saved to [base]/[YYYY]/[MM]/[DD]/[filename]. The text message body is saved to the same filename but with .txt substituted for .csv at the end.
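The naming scheme above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the set of characters allowed in the extract code is an assumption, and the caller is left to supply the date, because this page does not say which date determines [YYYY]/[MM]/[DD].

```python
import re
from datetime import date
from pathlib import Path

# Assumption: the extract code consists of letters, digits and underscores.
ATTACHMENT_NAME = re.compile(r"(?P<process>\d+)-(?P<extract>[A-Za-z0-9_]+)\.csv")


def parse_attachment_name(filename: str) -> tuple[str, str]:
    """Return (process, extract) from a name like [process]-[extract].csv."""
    match = ATTACHMENT_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognized attachment name: {filename!r}")
    return match["process"], match["extract"]


def save_paths(base: str, filename: str, day: date) -> tuple[Path, Path]:
    """Return the paths for the CSV attachment and the text body."""
    directory = Path(base) / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"
    csv_path = directory / filename
    # The body goes to the same name with .txt substituted for .csv.
    return csv_path, csv_path.with_suffix(".txt")
```

For example, `parse_attachment_name("12345-uwxplans.csv")` yields the process ID `"12345"` and the extract code `"uwxplans"`.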
In the event an email arrives that cannot be understood, the entire raw email message will be saved to a file in the same directory as the attachments and message bodies. The name will be [hh]:[mm]:[ss].[unique], where unique is chosen to ensure uniqueness (i.e., the name will be the current time in ISO format plus a suffix that prevents overwriting an existing file). If this occurs, the script will also produce error output, which will be reported via xh-mail.
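One way to choose the unique part is to create the file exclusively and retry with a different suffix if the name is already taken. The sketch below assumes a POSIX file system (colons are allowed in file names) and uses a simple counter as the suffix; both the counter and the function name are illustrative choices, not something this page specifies.

```python
import os
from datetime import datetime
from pathlib import Path


def save_raw_message(directory: str, raw: bytes, now: datetime) -> Path:
    """Save an unparseable email as [hh]:[mm]:[ss].[unique] without overwriting."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = now.strftime("%H:%M:%S")
    counter = 0  # hypothetical choice for the [unique] part
    while True:
        path = target_dir / f"{stamp}.{counter}"
        try:
            # O_EXCL makes creation fail if the file already exists,
            # so an existing message is never overwritten.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            counter += 1
            continue
        with os.fdopen(fd, "wb") as handle:
            handle.write(raw)
        return path
```

Exclusive creation avoids the race between checking whether a name exists and writing to it, which matters if two messages arrive within the same second.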
Note that in addition to the regularly-scheduled extracts which form part of our business processes, there may be ad hoc reports requested through the Quest interface. There is no convenient way for procmail to distinguish reliably between the systematic extracts and the ad hoc ones. Therefore, all extracts requested by the staff member whose email account is used will end up here.
In order to decide what, if anything, to do with each saved extract file, the text message accompanying each extract is parsed to obtain the parameter values. The exact parsing process depends upon the extract type, which is determined from the extract portion of the attachment file name. The field values are compared against a list of expectation records, each of which contains the following information:
An extract file may fail to be processed for reasons ranging from forgetting to request the extract from Quest to a crash of the host just after the extract handling script has started. For this reason, a cron job should be used to verify each extract. Verification can be accomplished by a script that takes a list of extract types, checks the batch directories to ensure that all of the .expect files are gone, and then creates new .expect files.
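A verification pass of this kind might look roughly like the Python sketch below. It makes several assumptions that this page does not confirm: that there is one .expect file per extract type, that it is named [extract].expect, that it lives directly in the batch directory, and that the extract handling script removes it once the extract has been processed. The function name is hypothetical.

```python
from pathlib import Path


def verify_batch(batch_dir: str, extract_types: list[str]) -> list[Path]:
    """Report leftover .expect files, then create new ones for the next run."""
    directory = Path(batch_dir)
    # Any .expect file still present means an expected extract never arrived
    # or was never processed (assumed convention, see the lead-in).
    leftovers = sorted(directory.glob("*.expect"))
    for extract in extract_types:
        # Hypothetical naming: one empty marker file per extract type.
        (directory / f"{extract}.expect").touch()
    return leftovers
```

If the cron job prints the returned paths whenever the list is non-empty, the usual cron mail mechanism would report the missing extracts.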
-- IsaacMorland - 02 Jan 2008