PeopleSoft Extract Handling

This page describes in detail the handling of extracts from the PeopleSoft SA system.

Email Handling

Each extract email consists of a text body, which describes the parameters used to produce the extract, and a single attachment containing the actual data as a CSV file. The CSV file appears to have no escaping at all: commas unconditionally separate fields, so a comma can never occur inside a field value. Commas occurring in the underlying data seem to be replaced by spaces.
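Because no escaping is apparently in use, a plain split on commas should be enough to read an extract. The following is a minimal sketch under that assumption; the function name parse_extract_csv is hypothetical and not part of the actual script.

```python
def parse_extract_csv(text):
    """Split unescaped CSV text into a list of rows, each a list of fields.

    Assumes the format described above: every comma is a field
    separator, and quote characters have no special meaning.
    """
    rows = []
    for line in text.splitlines():
        if line:  # ignore blank lines, e.g. a trailing newline
            rows.append(line.split(","))
    return rows
```

A CSV library with quote handling enabled would treat quote characters as special, which this format does not appear to do, hence the plain split.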

The attachment is named [process]-[extract].csv, where process is the numeric process ID from the process viewer in Quest and extract is the code for the type of extract (e.g. uwxplans, uwxgpcon). The first step in processing an extract file is to save the text body and the attachment to separate files. The attachment is saved to [base]/[YYYY]/[MM]/[DD]/[filename]. The text message body is saved to the same filename but with .txt substituted for .csv at the end.
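The naming and saving rules above can be sketched as follows. This is an illustration only: the function names, the character set allowed in the extract code, and the choice of which date fills [YYYY]/[MM]/[DD] (the caller passes it in) are assumptions, not details taken from the actual script.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# [process]-[extract].csv; the allowed characters in the extract
# code are an assumption.
ATTACHMENT_RE = re.compile(r"^(?P<process>\d+)-(?P<extract>[A-Za-z0-9_]+)\.csv$")


def split_attachment_name(filename):
    """Return (process_id, extract_code) parsed from an attachment name."""
    m = ATTACHMENT_RE.match(filename)
    if m is None:
        raise ValueError(f"unrecognized attachment name: {filename!r}")
    return m.group("process"), m.group("extract")


def save_paths(base, filename, day):
    """Return (csv_path, txt_path) under [base]/[YYYY]/[MM]/[DD]/."""
    directory = PurePosixPath(base) / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"
    csv_path = directory / filename
    # The message body uses the same name with .txt in place of .csv.
    txt_path = csv_path.with_suffix(".txt")
    return csv_path, txt_path
```

For example, save_paths("/data/extracts", "123456-uwxplans.csv", date(2008, 1, 2)) yields paths ending in 2008/01/02/123456-uwxplans.csv and 2008/01/02/123456-uwxplans.txt (the base directory and process ID here are made up).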

If an email arrives that cannot be understood, the entire raw email message is saved to a file in the same directory as the attachments and message bodies. The file is named [hh]:[mm]:[ss].[unique]: the current time in ISO format, plus a suffix chosen so that no existing file is overwritten. When this happens, the script also produces error output, which is reported via xh-mail.
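One way to choose the unique suffix is to try successive counters and let the filesystem reject any name already in use. The sketch below assumes a POSIX filesystem (colons are not valid in Windows filenames); the function name save_raw_email and the numeric counter suffix are hypothetical choices, not the actual script's.

```python
import os
import sys
from datetime import datetime


def save_raw_email(directory, raw_bytes, now=None):
    """Save an unparseable message as [hh]:[mm]:[ss].[unique]; return its path."""
    now = now or datetime.now()
    stamp = now.strftime("%H:%M:%S")
    os.makedirs(directory, exist_ok=True)
    counter = 0
    while True:
        path = os.path.join(directory, f"{stamp}.{counter}")
        try:
            # O_EXCL makes creation fail if the name is already taken,
            # so an existing file is never overwritten.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            counter += 1
            continue
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        # Error output is what gets reported by the mail wrapper.
        print(f"could not parse email; raw message saved to {path}", file=sys.stderr)
        return path
```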

Note that in addition to the regularly-scheduled extracts which form part of our business processes, there may be ad hoc reports requested through the Quest interface. There is no convenient way for procmail to distinguish reliably between the systematic extracts and the ad hoc ones. Therefore, all extracts requested by the staff member whose email account is used will end up here.

Extract Disposition

To decide what, if anything, to do with each saved extract file, the text message accompanying the extract is parsed to obtain the parameter values. The exact parsing process depends on the extract type, which is determined from the extract portion of the attachment file name. The parameter values are then compared against a list of expectation records, each of which contains the following information:

  • Batch template — target directory for extract file
  • Filename template — filename to use for saving extract
  • Fixed parameter values — the value expected for each parameter not used in the filename template. A parameter can be omitted if it is expected to be blank.
  • Expected parameter values — a list of expected parameter value records. Each one gives, for each parameter used in the filename template, a list of parameter values.
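The comparison against an expectation record might look like the sketch below. It is one reading of the record layout above, not the actual implementation: the class and function names, the use of str.format placeholders in the templates, and the parameter names in the usage example are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Expectation:
    batch_template: str      # target directory, e.g. "plans/{term}"
    filename_template: str   # e.g. "{term}-{career}.csv"
    fixed: dict = field(default_factory=dict)     # parameter -> required value
    expected: list = field(default_factory=list)  # dicts: parameter -> allowed values


def match(expectation, params):
    """Return (directory, filename) if params satisfy the record, else None."""
    template_keys = set()
    for record in expectation.expected:
        template_keys |= record.keys()
    # Parameters not used in the filename template must equal their
    # fixed value; a parameter omitted from `fixed` must be blank.
    for name in (set(params) | set(expectation.fixed)) - template_keys:
        if params.get(name, "") != expectation.fixed.get(name, ""):
            return None
    # Template parameters must match at least one expected-value record.
    for record in expectation.expected:
        if all(params.get(k, "") in allowed for k, allowed in record.items()):
            return (expectation.batch_template.format(**params),
                    expectation.filename_template.format(**params))
    return None
```

With a record such as Expectation("plans/{term}", "{term}-{career}.csv", fixed={"institution": "X"}, expected=[{"term": ["1081"], "career": ["UG"]}]), an extract whose parameters are term 1081, career UG, and institution X would be filed as plans/1081/1081-UG.csv, and any other combination would be left unmatched. All of these values are invented for illustration.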

Arrival Verification & Initialization

An extract file may fail to be processed for reasons ranging from someone forgetting to request the extract from Quest to a crash of the host just after the extract handling script has started. For this reason, a cron job should be used to verify each extract. Verification can be accomplished by a script that takes a list of extract types, checks the corresponding batch directories to ensure that all of the .expect files are gone, and then creates new .expect files.
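A verification pass of this kind could be sketched as follows. The sketch assumes that the extract handling script removes an .expect file when the corresponding extract arrives, and that each marker is named after the expected file with .expect appended; both the naming and the function name verify_and_reset are assumptions.

```python
from pathlib import Path


def verify_and_reset(batch_dir, expected_names):
    """Report leftover .expect markers, then create markers for the next cycle.

    Returns the sorted names of markers that were still present, i.e.
    extracts that were expected but apparently never processed.
    """
    batch = Path(batch_dir)
    batch.mkdir(parents=True, exist_ok=True)
    leftover = sorted(p.name for p in batch.glob("*.expect"))
    for name in expected_names:
        # An empty marker file is enough; its existence carries the meaning.
        (batch / f"{name}.expect").touch()
    return leftover
```

A cron job could call this once per batch directory and print any non-empty result, so that missing extracts surface through the usual cron mail.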

-- IsaacMorland - 02 Jan 2008

Topic revision: r2 - 2008-01-02 - IsaacMorland
 