ST - Internals - Coding Conventions

The following describes the coding conventions used in any additions or changes.


Directory Structure

The directory structure in lib/*/st was motivated mostly by some degree of compatibility with that in lib/*/rt. The boundary between "support" and "database" is fuzzy in some cases. It's tempting to move support/{convert,precision} into "database", and rename that "data".

database/*:
implements data access to "internal fields" for the rest of the code. So it's about more than just the relational database, as some fields are computed. It also includes general information about fields, as a lot of it is in a single file (st/database/fields.pm).
support:
about the data itself, and help with doing various things with it. It includes text- and HTML-specific things, as they're often similar, e.g. in support/title.pm. TODO: consider a separate place for data-specific support, e.g. precision.pm, convert.pm, compare.pm, ..., so that "support" doesn't become too bloated. Maybe "data", as opposed to "database" above. The exact characterization is the challenge.
ui/web:
the top-level CGIs. Some HTML things live in ./support/ instead, when there are corresponding text things.
st::debug
exists at the top level only to save typing while debugging. Really. Otherwise it would likely be in ./support.

Handy Links

The directory /software/rt-math-1/data/handy-links contains symbolic links to various (mostly source) files in the package. When a filename is unique across the package, the link is directly to it. E.g.

cgi.pm → /software/rt-math-1/lib/debug/st/ui/web/cgi.pm

When a name appears in multiple places, the result is an inverted tree structure. E.g.

gateway.pm/support → /software/rt-math-1/lib/debug/st/ui/web/cgi.pm
gateway.pm/ui → /software/rt-math-1/lib/debug/st/ui/mail/gateway.pm

This can save typing when referencing files. That's all this is about.

So, when adding a new source file, or moving files around, make the corresponding change here.

Package Names

Unlike the .../lib/*/rt/ code, package names are used liberally, and match the directory structure described above. A reference to

    &st::xxx::yyy::zzz()
is found either in the file
    st/xxx/yyy.pm
or in
    st/xxx/yyy/zzz.pm
The choice between the two approaches is made to reduce occurrences of names of the form
    st::...::Name::Name
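
A minimal sketch of the first layout, assuming a hypothetical package st::xxx::yyy that would live in st/xxx/yyy.pm. The names and the return value are placeholders, not actual ST code:

```perl
use strict;
use warnings;

# Hypothetical contents of st/xxx/yyy.pm: the package name mirrors the path.
package st::xxx::yyy;

# zzz: placeholder function; returns a fixed string for illustration.
sub zzz {
  return "called st::xxx::yyy::zzz";
}

package main;

# The fully qualified call names both the file (st/xxx/yyy.pm) and the function.
my $where = &st::xxx::yyy::zzz();
print "$where\n";
```

In the second layout the same call would instead resolve to a function in st/xxx/yyy/zzz.pm.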

Error Handling Conventions

Unlike much of the ./rt code, the code here avoids writing error messages to STDERR (e.g. via &rt::logmsg). Instead, it returns error messages as part of its result, e.g.

   my ($result, $emsg) = &SomeFunction(...);

On success, the error message is "" rather than undef, to facilitate concatenation of messages across multiple function calls, e.g.

   ($thing, $emsg) = &FunctionA(...);
   $emsgs .= $emsg;

   ($amabob, $emsg) = &FunctionB(...);
   $emsgs .= $emsg;

   return (undef, $emsgs) if $emsgs ne "";

A variation is to use the presence of a result to determine whether a message is a warning or an error, e.g.

   ($thing, $msg) = &FunctionA(...);
   ${defined($thing) ? \$wmsgs : \$emsgs} .= $msg;

Or, when the presence of a result can't reveal the nature of the message, or when both error and warning messages are possible, both types of message can be returned and then optionally merged, e.g.

   ($thing, $emsg, $wmsg) = &FunctionA(...);
   $emsgs .= $emsg;
   $wmsgs .= $wmsg;
   if ($emsgs ne "") {
     return ($thing, $wmsgs . $emsgs);
   }
   ...

Ultimately the top layer(s) can display/log errors as they wish.
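
The convention can be sketched end to end as follows. This is an illustration only: parse_count and add_counts are hypothetical functions made up for the example, not part of ST.

```perl
use strict;
use warnings;

# parse_count: hypothetical helper following the (result, emsg) convention.
# Returns ($count, "") on success, (undef, $emsg) on failure.
sub parse_count {
  my ($text) = @_;
  return (undef, "not a non-negative integer: '$text'\n")
    unless $text =~ /^\d+$/;
  return ($text + 0, "");
}

# add_counts: calls parse_count twice, concatenating any messages.
# An empty $emsgs at the end means both calls succeeded.
sub add_counts {
  my ($a_text, $b_text) = @_;
  my ($x, $y, $emsg);
  my $emsgs = "";

  ($x, $emsg) = &parse_count($a_text);
  $emsgs .= $emsg;

  ($y, $emsg) = &parse_count($b_text);
  $emsgs .= $emsg;

  return (undef, $emsgs) if $emsgs ne "";
  return ($x + $y, "");
}

# The top layer decides how to present the messages.
my ($sum, $emsgs) = &add_counts("3", "x4");
print(defined($sum) ? "sum: $sum\n" : "errors:\n$emsgs");
```

Because a successful message is "", the caller never needs to test for undef before concatenating.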

When to Log

To avoid WWW server logs being filled with generally unimportant info, don't "log" something unless it really is an internal error that a maintainer is going to want to know about later. Errors should be displayed to the invoker.

The debug WWW version is an exception to the above. As an aid to debugging, and to avoid filling logs with junk, the debug version appends the contents of STDERR and an eval() to the end of the generated WWW page. So using STDERR is fine in the debug version.

Coding Style

Whitespace

Tabs are used for indenting, and only for indenting: anything aligned beyond the current indent level uses spaces. So intended alignments occur regardless of the choice of tab setting. Thus to produce something of this form:

    &SomeFunction($aaaaaaaa, $bbbbbbbbbbb, $ccccccccc,
                  $ddddd, $eeee);
tabs would be used like this:
^I&SomeFunction($aaaaaaaa, $bbbbbbbbbbb, $ccccccccc,
^I              $ddddd, $eeee);

Lines are folded so that a tab setting of 4 results in a maximum line length of 80. When forced to fold a line, instead of

  my ($x, $y)
    = &func(...)
we prefer to see
  my ($x, $y) =
    &func(...)

Documentation

Every function and external has an accompanying description. And in most cases, every my/local variable has a description as well. A function's description is of the form:

################################################################
#
# <Name>: <purpose>
#
# Input:
#  <arg1>: ...
#  ...
#  <argN>: ...
#
# Imports:
#  <external1>: ...
#  ...
#  <externalN>: ...
#
# Output:
#  <return_value>: ...
#
# Exports:
#  <externalN+1>: ...
#  ...
#  <externalN+M>: ...
#
sub <Name> {
  my ($a, $b, ...) = @_;

where every section except <Name> is omitted when it doesn't apply.

In addition, it's common to document code that affects a single variable in the form:

  # $variable_name: doing something or other ...
  #
  my $variable_name;
  ...
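
As an illustration of the template filled in, here is a sketch with made-up names ($default_sep and join_fields are hypothetical, not ST code). The Exports section is omitted because the function sets no externals:

```perl
use strict;
use warnings;

# $default_sep: separator used when the caller supplies none.
#
our $default_sep = ", ";

################################################################
#
# join_fields: join field values into a single display string
#
# Input:
#  $values: reference to an array of field values
#  $sep: optional separator
#
# Imports:
#  $default_sep: used when $sep is undefined
#
# Output:
#  $text: the joined string
#
sub join_fields {
  my ($values, $sep) = @_;

  # $text: the joined result
  #
  my $text = join(defined($sep) ? $sep : $default_sep, @$values);

  return $text;
}

print &join_fields(["open", "stalled"]), "\n";
```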

Barewords vs Hashes

Perl allows "barewords" in many places, most of which are not recommended. The one exception, which causes no warnings and improves readability, is in hash subscripts. So that's what's being done. E.g. $r{type} instead of $r{'type'}.
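
A short sketch of the equivalence, and of the one case where quotes are still needed. The helper name and the hash contents are made up for the example:

```perl
use strict;
use warnings;

# field_both_ways: hypothetical helper; fetches the same hash element
# using a bareword subscript and a quoted one.
sub field_both_ways {
  my (%r) = @_;
  return ($r{type}, $r{'type'});
}

my ($bare, $quoted) = &field_both_ways(type => "bug");
print "identical\n" if $bare eq $quoted;

# A key that is not a plain identifier still needs quotes: without them,
# X-trans-data would parse as a subtraction of three barewords.
my %record = ("X-trans-data" => "cached");
print $record{"X-trans-data"}, "\n";
```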

Warnings

Code should work quietly with Perl warnings enabled. That's not yet true for the RT part of the code. Warnings are enabled for the debug version, and will appear in the "Internal Error" section at the end of the regular display.

RT vs ST

As code meets standards, it is moved from

  /software/rt-math-1/lib/*/rt
to
  /software/rt-math-1/lib/*/st

As a result, code is slowly migrating from ./rt to ./st, as it's changed to follow current conventions, or simply rewritten. So most (maybe all?) of ./st was done here. Changes are recorded in ST #77799.

Data Structures

The following are some commonly used data structures. For a given structure, there is usually a set of reserved variable names. For example, if something is called $group_info, it is expected to always contain the structure originally associated with that variable name.

Record

A scalar variable with the name "record", i.e. $record, is always a reference to a hash mapping a subset of the fields of a single record. As described elsewhere, those are from the "each_req" database table together with fields computed from them. With the exception of when a new record is being created, it always contains the "serial_num" field. What other fields are present depends upon context. See &st::database::records::expand_record for the usual way of adding needed fields.

In addition, the transactions associated with a record are cached in

  $record->{"X-trans-data"}
with subsets selected via
  $record->{"X-trans-selections"}
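
A minimal sketch of what such a $record might look like and how the "serial_num" rule could be checked. The helper is_existing_record and the "subject" field are assumptions made for illustration; the contents of the cached transaction entries are not shown because they are not specified here.

```perl
use strict;
use warnings;

# is_existing_record: hypothetical check that $record is a hash reference
# carrying the "serial_num" field, as every non-new record does.
sub is_existing_record {
  my ($record) = @_;
  return ref($record) eq "HASH" && defined($record->{serial_num}) ? 1 : 0;
}

# $record: a subset of fields; "subject" is a made-up field name.
#
my $record = { serial_num => 1234, subject => "example" };
print &is_existing_record($record), "\n";
```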

ID

People are identified by a canonicalized email address called an "ID". It is of the form

   userid[@non_default_domain]
The default domain is
  $st::config::identification_domain
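
To make the form concrete, here is a sketch that reduces an email address to an ID by dropping the default domain. The helper canonical_id and the domain value "example.org" are assumptions for the example; whatever other canonicalization ST applies is not shown.

```perl
use strict;
use warnings;

# Assumed value for illustration; the real one is set in st::config.
$st::config::identification_domain = "example.org";

# canonical_id: hypothetical helper; drops the domain from an address
# when it is the default one, and keeps it otherwise.
sub canonical_id {
  my ($address) = @_;
  my ($userid, $domain) = split(/\@/, $address, 2);
  return $userid
    if !defined($domain) || $domain eq $st::config::identification_domain;
  return "$userid\@$domain";
}

print &canonical_id("jdoe\@example.org"), "\n";
print &canonical_id("jdoe\@elsewhere.example"), "\n";
```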

current_user

A "current_user" structure contains:

required:
optional:

One exception to identifying a person is

  $st::database::access::the_system

transactions

If it's a hash (reference) called transactions then the structure can be seen in &st::database::transactions::get. It contains both fields from the "transactions" database table and a few other fields computed from them for convenience.

Unions

A "field union" is a representation of the result of using the "+" syntax for a fieldname, to result in the "union" or "sum" of the values of a field across a selected subset of a record's dependencies.

The union is represented as an array of the values in the union, `bless`d as a "st::database::records::FieldUnion". `bless` is part of what Perl provides for object oriented programming. See &st::database::records::compute_field for details.
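
A sketch of how a blessed array lets code tell a union apart from an ordinary array of values. The helpers make_union and is_union are hypothetical; only the class name comes from the text above, and the real construction is in compute_field.

```perl
use strict;
use warnings;

# make_union: hypothetical constructor; blesses an array of values into
# the class name given in the text.
sub make_union {
  my (@values) = @_;
  return bless([@values], "st::database::records::FieldUnion");
}

# is_union: distinguishes a union from a plain array reference.
sub is_union {
  my ($value) = @_;
  return ref($value) eq "st::database::records::FieldUnion" ? 1 : 0;
}

my $union = &make_union("alice", "bob");
print &is_union($union), " ", scalar(@$union), "\n";
print &is_union(["alice", "bob"]), "\n";
```

The blessed value can still be dereferenced as an array, so code that only needs the values does not have to care about the class.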

History

A "field history" is a representation of the result of using the "-" syntax for a fieldname, to result in the "history" of the values of a field across a selected subset of a record's transactions.

The history is represented as an array, `bless`d as a "st::database::records::FieldUnion". `bless` is part of what Perl provides for object oriented programming. See &st::database::transactions::field_history for details of the array representation used.

Data Representations

Each field value has multiple representations, depending upon where it's used and/or stored. Most of the details are in st::support::convert.

Database

This is the text that is used in an SQL statement to write the value to the database, and the text value we expect back.

Internal

This is whatever (Perl) data structure we find handy to represent the data. In some cases it's an array, when the data looks appropriate. There is no guarantee that an array is used in all multi-valued cases. An array is also used when grouping single-valued fields because of gathering values across relatives of selected items. See the "+" syntax of fieldnames.

Input

This is the text we expect to be given by UIs. We're currently assuming we can handle both CGI and CLI input with the same code, by handling the "\0" that a CGI can provide.
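
A sketch of that assumption: a CGI layer may hand over several values for one parameter joined with "\0", while a CLI hands over a single value, and the same split covers both. The helper input_values is hypothetical.

```perl
use strict;
use warnings;

# input_values: hypothetical helper; splits an input value into its
# individual values, whether it came from a CGI (possibly "\0"-joined)
# or from a CLI (a single value). Call it in list context.
sub input_values {
  my ($input) = @_;
  return split(/\0/, $input, -1);
}

my @from_cgi = &input_values("open\0stalled");
my @from_cli = &input_values("open");
print scalar(@from_cgi), " ", scalar(@from_cli), "\n";
```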

Text

This is a text form of a value, suitable for display in a text based environment. We often produce the HTML value by first converting to text. This is where the concepts of "precision" and "verbosity" appear.

HTML

This is an HTML form of a value, suitable for display in an HTML-based environment. See &rt::ui::web::text2html for some embellishments that this provides over text.