Standardized Administration Tools (SAT) Accounts Management Requirements

Proposed Requirements

Immediate Implementation

Database

  • The database relies on three sources housed within the Odyssey postgres database. See web diagram with architecture overview, and SATAccountsDatabase for database details.
    • data from Quest and watiam is campus-wide; our design can accomodate the rest of campus if they want to use this in the future.
      • includes data for Grad Students, Faculty, Researchers, Visitors, Staff and Undergraduates - anyone with a watiam account.
  • updates triggered by timed events following the watiam updates (every 4 hours on the :30; take approximately 7 minutes to complete)
  • new schemas cover sponsorship, groups, and other SAT data.
    • sponsorship covers accounts, mail aliases, and print quotas currently.
  • SAT Accounts API - "wrapper" postgres functions to perform updates.
    • Used by CSCF for populating Active Directory. Will eventually be used by MFCF for populating their accounts.

Web UI

  • UI will currently offer one administrative access level- eg., for technical staff in CSCF and MFCF.
  • use-cases (see-also SATAccountsUserStories)
    • add users to "groups" corresponding to course sessions and/or other purposes such as Research Groups. Group membership grants access.
    • maintain "sponsors" who are authoritative for budgeting related to groups. Sponsors have optional "billcodes" for each of their sponsored groups.
      • UI must intuitively allow management of sponsors without repetitive updates.

Command Line

  • replacement for userinfo command

Future Work

Database

  • using postgres triggers to kick off LDAP updates would be a useful enhancement so that updates that aren't from watiam will propagate more quickly than up-to-4-hours.
  • Grouper integration.
    • Could we use the Grouper API to pull Grouper-defined data into our database?
  • New types of sponsored resources other than accounts/email aliases/print quota.
    • What might we want to automatically set up for different groups of users?...
    • Daniel imagines: authorizing physical lab access; students in a course receiving shared git repo access per assigned group; shared LXC container under student.cs; access to any cloud system we can programmatically authorize
  • Out of scope to plan for accounts that do not have watiam correspondence. odyssey is driven by the campus user database (with unique internal IDs corresponding to them), and this would introduce another authoritative source (potentially conflicting) that hasn't been implemented or worked through.

Web UI

  • user preferences- such as an end-user's preferred/most-used bill codes
    • user preferences will allow data filtering, separate from access permissions, so that a user will see the most useful data first
    • will be necessary if MFCF is going to see different groups and sponsors than CSCF.
  • an eventual goal is self-service: people have access to the right level of permissions for them to grant resources.

Requirements Gathering notes

2019-01-31 - meeting with Adrian Pepper

Met with Adrian about his perspectives on automated and manual accounts management

  1. Isaac and clayton's tools handle automated cases but do not handle either of the following situations, which cause more work than the automated cases:
    • adding people who are auditing a course; (2-5 per term)
    • adding people who are not yet in registrar data but intend to take the course (~2 dozen per term)
    • So we need to consider undergrads as well as grads in this system.
  2. A current complexity is people not in quest (but are in watiam) who need manual recording in "Research Regions" (currently groups in AD)
    • visible from linux.cs > getent group | grep users_
    • 14 groups. list of users come from analysis of accounts_master/data/sponsors/research ...
  3. The importance of recording "sponsorship" - eg., MFCF identified tasks that turn out much easier if there are sponsors
    • (I didn't write down any examples)
    • sponsors data is organized around classes- including "fake" classes like research groups.
    • Q: are sponsors manually set? A: in the current system, sponsor of course accounts is "dean of math"...
    • (what kinds of sponsors are manually set?)
  4. many accounts are by year and we don't actually know if they left before the end... old system threw away what their last situation was?...
    • People change their year mid-term sometimes; they might be listed under CS1 and CS2 at the same time.
    • what happens on this transition from CS1 to CS2? should they have both resources, or just most recent? Overlap?
    • Could we record resources by effective-date/end-date? probably.
      • if there are triggers to automatically change the end-dates
  5. sponsored email aliases
    • do we keep doing this?
    • put this all into salt?
    • some kind of merge; don't want all aliases everywhere;
    • who is able to edit? salt is a shift from "administrative data" to "system administrative data" - possibly less easily editable by an end-user than an .aliases file on an archmater.
    • noting we run 2 mail servers; probably will keep doing so; raised question of internal-only email.
    • email .forward file created by accounts packages - how do we handle this?
  6. Adrian has written sponsors-range to turn resources files -> start/end dates
  7. potential ways forward given that we're shutting down 14.04 arch-master.
    • no 16.04 arch master. shutting down 14.04 by end of term.
    • Adrian says he could set up 18.04 arch master?...
    • accounts relies on libraries that would need to be compiled, but they can be on 18.04
    • sponsor_resources could be kludged to work on 18.04
    • Currently three xhier machines: cs-xh-admin cs-general cs-teaching

2019-03-22 - meeting with Dave Gawley

Met with Dave [update on 31 July 2019]. His summary of related data he wants to access:

Course accounts / CS-TEACHING

Changing each term, we track:

  • for each cs course
    • co-ordinator IDs
    • lecturer IDs
    • TA IDs
    • BIU (Business Income Units) - (CS Students account-names from odyssey db) - under the new budget model- funding units from the province
    • BTU (BIU Teaching Unit) - (non-CS Students account-names from odyssey db) - under the new budget model- funding units defined internal to UW, capacity of a faculty to generate operating grants, according to IAP. data might be from Quest or other UW database
    • ta-sponsored student IDs - (add period per calendar +1 or 2 days; auto-expires @ term drop deadline [???])
    • lecturer-sponsored student IDs - (add any time, end-of-term expires)
    • group (should receive extra resources, such as for project groups)
    • resources
      • disk quota
      • lab room (NOT currently tracked)
      • host list (NOT currently tracked)

CS_course_resources
  ${Course-ID}_account (account-name exists in both CS-GENERAL and CS-TEACHING)
  ${CS-course-resources}_host (list)
  ${CS-course-resources}_lab_room (list)
  ${CS-course-resources}_extra_quota (list)

CS_course_group (list)
  ${CS-course-group}_manager
  ${CS-course-group}_members
  ${CS-course_group}_resources (CS_course_resources)
  
The CS-Research domain needs this data for each term:

CS_research_group (list)
  ${CS-research_group}_manager
  ${CS-research_group}_members
  ${CS-research_group}_resources (CS_research_resources)

CS_reearch_resources
  ${CS-research-resources}_host (list)
  ${CS-research-resources}_lab_room (list)
  ${CS-research-resources}_extra_quota (list)

Additional to above, add concept of "group" within a course, that gets resources.

  • Using sub-groups, could give particular people higher disk quota, access to particular machines / labs.

Peoples' username unique across (uwdir + UID/GID)

  • Q: do we have accounts that don't have uwdir? A: yes:
    • course accounts don't;
    • role-based accounts such as programming contests don't

Non-course accounts / CS-GENERAL

  • for each "group" (generically defined), we track:
    • faculty co-ordinator
    • group members
    • resources (same as above).

Dave's notes: https://cs.uwaterloo.ca/cscf/internal/infrastructure/services/Authentication-Authorizatin-Resources/schema

Followup by Daniel

  • "group" is currently included in SAT design, with a "coordinator" or responsible person to reference; we need to add:
    • resources
    • other roles corresponding to TAs and lecturers? Or make these as sub-groups which inherit?

2019-03-22 - meeting with Adrian Pepper

  • concerns about composing a person from a set of groups... (I'm not sure why.)
  • we get the data from quest; says Isaac says we should be getting the data from OAT. concerned about making that transition smoothly. (Good to know; we can manage that risk).
  • Work flow question: "how do we make user account go away?"-
    • What do we do about transitional time periods. when someone transfers from CS1 to CS2- we don't want to reap their accounts/files and then create them again.
      • The current account system handles "what we want status to be right now" not "do this to this person"
      • perhaps handle transitions by ensuring system adds then removes?

2019-04-02 - meeting with Dave, Adrian, Clayton, Lawrence, and Isaac.

Agenda

  1. Better understanding of current system for accounts management
  2. what we need going forward: requirements; nice-to-haves..
  3. timing for requirements

1. Better understanding of current system for accounts management

  • involves manually-maintained text state files - with expiry dates
  • automatic additions/removals from certain groups (classes)
  • which indicate sponsorship of resources for individuals and groups.
  • creates and removes unix accounts directly on linux machines (with a quota)
    • user directory created on regional masters - how? to be discovered.
  • creates and removes AD-controlled accounts
  • handles mailing lists
    • sponsors for mailing lists @cs.uwaterloo.ca in CS-GENERAL
    • separate: aliases in cscf-specific
  • See: flow chart Nov 2018

2. what we need going forward: requirements; nice-to-haves..

  • post-processing software is not going away. The new system needs to kick off updates within 5 minutes.
    • see photo; tools being developed by Clayton to post-process.
  • as of a given day, what are the resources being sponsored?
  • as a minimum, we want to update for arbitrary prof's groups.
    • nice to have: prof/ta can update their own
  • security: using campus authentication as sufficient to create/remove accounts?
  • Isaac: exam management will have (modifiable) list of who is in course- we should drive this list of accounts from exam management.

3. timing for requirements

  • go-live summer 2019

2019-04-03 - Lawrence Folland / Adrian Pepper

  • related to Daniel by Lawrence.
Lawrence met with Adrian about accounts to discuss comparing the old and potential systems. The old and new systems both have RO and watiam inputs which are read-only; and the new can have groups-containing-groups which satisfy the same needs as the old.

registrar data:

  • including "Members" to source sub-files: /software/accounts-master/data/sponsors/REGISTRAR/cs includes:
========
Class: cs116
Description: CS 116
Load: high

Members: <.DATA/cs116

====
Computing: cs-teaching.cs.private
AssignTo: *MEMBERS*
====
Computing: canadenis.student.cs
Groups: student_cs  cs116_student
AssignTo: *MEMBERS*


========

  • And xhier.cs:/software/accounts-master/data/sponsors/REGISTRAR/.DATA/cs116 contains 706 lines each in the format of:
userid:nnnnnnnn

  • A question of understanding the old system: could postgres inputs generate "resources" file ouput? A: We don't know yet. Daniel to follow up with Adrian/Clayton.

2019-04-16 - Adrian Pepper

  • "could postgres inputs generate "resources" file ouput?" A: In theory, I believe yes it could, but with indeterminate, amounts of work. Non-trivial.
  • Adrian and I reviewed the white-board data which I transferred to a web diagram
  • Adrian says diagram of new system's "sponsor_computing" needs to also consider:
    1. ) resource allocations eg quota
    2. ) account deletion/creation: when student drops one CS course, and adds another, does data from registrar's office accurately reflect this?- Adrian says can't guarantee it'll be within the granularity of the system - eg., might be dropped, then added an hour later - we don't want to have their student accounts go away for an hour.
      • existing system will keep their resource allocations
        • possibly expire after a few days?
      • Retention of away students? co-op; medical leave; ...
        • old system: co-ops came back and asked for their accounts to be restored. This involves manual restore from backups.
    3. ) three xh-master machines: xh-admin , general, teaching - do these need recording in the diagram? I don't think so. Just to know these exist.

2019-04-17 - Sean Mason, Lawrence Folland, Isaac Morland, Daniel Allen on Grouper

Lawrence asked Sean to come answer our questions about the status of Grouper ( https://grouper.private.uwaterloo.ca/ ) Summary: we might be able to leverage their "groups" rather than storing them ourselves, however we would have to write database or API hooks to automatically populate them with course data. Worth looking further, next term.
  • Grouper offers many interactions with other campus systems. Might be useful for us to use; or to supply data.
    • They have Campus Data -> Quest including:
Academic Level Affiliation -> [4A] -> [ names ]
Class Enrollment -> [1191] -> [ARTS] -> [1191.ARTS.130.005.SEM]
Faculty Affiliation -> [MAT] -> [names]
Plan Affiliation -> [Accounting and Financial Management, Honours, Co-operative Program] [plus 700 more]
Program Affiliation -> [Acc & Financial Mgt,H][plus 146 more]
    • They have Campus Data -> Workday including:
Employee Affiliations -> (Casual Employee, Employee, Faculty, Retiree, Staff, Temporary Employee)
  • Includes everyone as recorded in Workday.
  • Supervisor data is weird, because workday data has weird workarounds.
  • Grouper does NOT have course instructors from Quest, or "course coordinators". odyssey has both- Isaac would be happy to get them this data.
    • Similarly, grouper has section membership but not Isaac's concept of "who's in the combined class" - only defined by faculty members in odyssey.
  • Is it supported?
    • considered production by IST; being used by Science to maintain SPSS license access for students, as well as other units. They will work with us on our needs. Code is freely available and documented
  • How would it be used?
    • populates campus AD directly
    • can consume data from a database view (such as odyssey).
    • can read and write grouper data via API
  • What time-lags are there for updates?
    • typical 5-minute synch to AD groups (eg., confluence).
    • Might be 1-2 hours at high load (once a term they have 100,000 changes at once)
    • sailpoint consumes its data and also sees 5-minute lag (with similar 1-2 hour during high load).
Discussion of our requirements. We don't want to give instructors direct grouper access; Isaac suggests instead instructors would update their course data including course coordinators within the Exam Management System. A Grouper or CS-maintaained database would then interact with that odyssey data.

Sean has given Daniel and Isaac full access to workday and quest data, for examination.

2019-04-21 - Clayton Tucker

  • Reviewed the web diagram and made clarifications about AD.

2019-04-22 - Adrian Pepper

  • Reviewed the web diagram and made further clarifications: no direct path from Quest/watiam sources to the resource files- always via sponsors files.

2019-05-14 - Lawrence Folland

  • Proposed development process:
    1. Clarify the requirements for the new system, including all of the pieces we want to preserve from old system.
      • Q: Are we expected to continue sponsored mail aliases and lists? Are we re-implementing "sponsor_resources" code to output these (along with UID/GID registry,
      • What is the automated process for students who go on leave or co-op? Adrian says currently students leave, return, find their home-dir files are gone, and ask for them to be restored (from backups- manually).
      • Q from Lawrence: will we/can we have a replacement for the "userinfo" command? command line or gui?
        • would Adrian require "userinfo" for his tasks?
        • are there other such command line tools we need to make work?
    2. Determine data models that will make this possible; and how existing systems will interact with the data models to: a) automate what we need and b) provide manual access to the parts that are not automated.
    3. Implement web interfaces which provide all necessary manual access.

2019-07-30 - Meeting with Dave, Adrian, Robyn, Lawrence, Isaac, Daniel. Regrets from Clayton.

Agenda:

  1. Clarify our understanding of requirements for the new system, including all of the pieces we want to preserve from old system.
  2. Current status
  3. Steps moving forward

1. Clarify our understanding of requirements

  • Daniel shared the web diagram which reflects his current understanding of data flow and pieces to be implemented. Key starting points for discussion:
    • a database, sponsor_computing, is being populated with all data currently produced by sponsor_resources - Isaac added hooks into sponsor_resources (C code) to output existing data structures to SQL in addition to the flat text files.
    • this database is used by psql_update to generate LDIF files and populate the new AD, in production since early this year.
    • this process relies on sponsor_resources to generate the data, which will be replaced. (With what? DA)
    • this database might have all the elements sufficient to kick off the additional tasks Dave says are under development- eg., quota allocation, account creation and removal, UID/GID registry.
  • Dave reiterated his request for a database with schema access to data described on 2019-03-22 - updated in https://cs.uwaterloo.ca/cscf/internal/infrastructure/services/Authentication-Authorization/DB-queries.shtml
    • And a request for comments on these requirements.
    • He would like an API for accessing the database, and methods for users to update the database.
      • minimally, administrative users in CSCF need to be able to update course and resource memberships.
      • ideally, course managers (faculty members? TA's?) can update their own courses to add special cases). Brief discussion about who should be; this is an administrative/Registrar kind of question for followup discussion.
  • Isaac proposes that course managers update special cases within the Exam Management System, which is already used by nearly all Math faculty courses. This would be a smallish addition.
  • Isaac says that producing the UID/GID registry falls under things he's already thought about and would be very simple to do in SQL.
  • Dave says we can stop being responsible for sponsored mail aliases - handing this task over to IST mail services. For followup discussion.
  • We might be able to fulfil the term-goal requirements without significantly revamping the database schema done by Isaac and used by Clayton. For followup discussion.
    • Adrian raises a concern about data representation of empty classes. Details to follow.
  • Some followup discussion I didn't capture, sorry... flying fast and furious.
  • Robyn shared these MFCF pages of documentation: http://www.math.uwaterloo.ca/~rblander/mfcf/old_site/resource.shtml and http://www.math.uwaterloo.ca/~rblander/mfcf/old_site/sponsorship.shtml
    • and also he shared: On an xhiered accounts master machine - mfcf.math or xhier.cs (?) see these man pages:

      How things work:

      accounts-master-package(7)
      accounts-master(8)
      accounts-client(8)
      sponsor_accounts(8)
      sponsor_resources(8)

      How the data are organized:
      sponsors(5)

2. Current status

  • Infrastructure has a data-path that populates the new AD, as described in early April onward
  • Isaac and Clayton have done schema/database work to output all the (uncommented) detail in the sponsors files.
  • Today we found that we might not need much schema change to meet the term goal accounts-management requirements, which would be helpful, though we will still discuss deeper changes.

3. Steps moving forward

  • This meeting pointed to a number of threads to be discussed; some were opened in email threads, details to follow below.
  • Another meeting to be held ASAP.

Followup emails

  • Summary: Isaac notes the database in question is transitional; minimally it will need additions when we determine what should be added via web UI.
  • Robyn notes Adrian's question about data in context- what if a course has no members for a month? We will probably need to test this.

2019-07-31 - Dave

  • Dave clarified that the existing data in the sponsor_computing schema appears to be sufficient to produce the new AD data (not needing additional contextual data)
  • The "schema" he described yesterday is the desired end-goal of data he would like to access, perhaps from a database view. This goes further than requirements for deployment for this Fall.
  • Most of the additions are already available in OAT/ASUS.
    • BIU/BTU (reporting CS vs non-CS students) is required by the School Director to report under the new budget model. This data would be available in ASIS for each student.
  • The most time-critical piece as far as he is concerned is the UID/GID registry revamps. He asked if Isaac might be able to have this ready for production within two weeks? (The alternative is deploying new UID/GID for end-of-December).
    • I asked Isaac; Isaac thinks this is doable in under two weeks. He will follow up with Dave. A consideration is whether MFCF can also switch-over UIDs at the same time.

2019-08-02 - Dave

  • Dave views the replacement for the sponsor_resources process, is a database transformation of the data sources, into the sponsor_computing schema. And that schema is used to run all of the infrastructure processes such as new quota allocation, account generation/removal, etc.
  • To be determined by talking with Clayton/Isaac: does Clayton's psql_update rely directly on OAT data, or is it only relying on the sponsor_computing data which was pulled by the old sponsors files? That tells us more about how far along we are, since Dave thinks we're not relying on much of the old sponsor files. From what I've heard, we are.
    • 2019-08-06: Clayton says psql_update does read directly from OAT, not just the sponsor data pulled from files. Good!
  • In either case, to eliminate sponsor_resources we'd need to identify what are the pieces of sponsorship data (from new sources, such as the Exam Management UI, and a resources-management UI) that can produce a reasonable output.
  • Next week I will replace the web diagram with updates.

2019-08-08 - Meeting with Dave, Adrian, Clayton, Robyn, Lawrence, Isaac, Daniel.

Agenda

  1. Reviewing diagram
  2. Steps moving forward
  3. what will we have working by end-of-term?

Diagram

  • Daniel shared the web diagram and made updates to make it accurate.
  • For future integration: Grouper.
    • Could we use the Grouper API to pull Grouper-defined data into an existing schema? (math_computing? _people?)

Steps moving forward

  • Replacing 14.04 (regional masters; collecting home directory paths)
    • Robyn will look into replacing /u[0-9] with /u
  • Replacing long userids with short userids- likely could be done for beginning of Fall
    • old long userids die by attrition
  • UID/GID registry replacement - NOT for beginning of Fall
    • timing isn't for beginning of Fall, because most new accounts were already registered near beginning of term.
    • since overall process isn't changing this term, we need to use the new registry with xhier - replace uid registry program
  • new account creation: create course resources for previous/current/next term
    • instead of running commands at end-of-term, can have updates at any time
    • Clayton, Isaac to decide between eg., ( netgroup = CS100, with netgroup_1199 vs. netgroup_YYYYMMDD )

what will we deliver by end-of-term?

  • /u[0-9] -> /u
  • short userids
  • existence of new UID/GID registry - but not being used for new F2019 accounts
  • a roadmap for Fall work

2019-08-15 - Meeting with Lawrence, Isaac, Adrian, Clayton, Robyn, Daniel.

Agenda

  1. Reviewing progress: a) short userids; b) UID/GID registry.
  2. Work remaining for Spring
  3. Roadmap for Fall

Reviewing progress: a) short userids; b) UID/GID registry.

a. short userids
  • accounts creation always will need to consider (old) long userids.
  • q: can we mv all long homedirs too short?
    • replacing old long homedir to symlink - one-time
  • q: have we decided where homedirs go? /u - assuming Dave agrees.
    • in 2015 a suggestion was implemented in code for mkhomes to allow /nethome/ [ 2019-10-16 : rblanders clarifies that nothing else generates these. mkhomes merely recognizes and accepts it the same as it does /u and /uN ]
  • q: can we replace long group names?
    • can we store the long data in the schema? don't have to; it's in _watiam schema.
    • we're OK with a one-time change of group-names
  • we think we have the policy figured out; and TBD who will do it. Isaac is happy to write script to do the symlink creation.
b. Isaac has started schema for UID/GID and has docs from Robyn. Feels confident about design. 1st, implement db; 2nd, will put interface onto it for access.
  • q: is there any chance a userid ever is reused?
    • possibly staff userids get reused
    • need to account for merged userids where old userid gets reused
  • looking at what tools directly call the uid/gid code, versus uidregistry program. may be simple.

Work remaining for Spring

  • Adrian putting expiry dates into registrar data- to have a better picture for data we're seeing
  • UID/GID registry: Isaac will be ready to demo UID/GID database, possibly not uidregister code.
    • Isaac and Adrian will look at existing code on idregistry.math
  • Daniel, Isaac, and Clayton will have another discussion of new schema.

  • Answering q of courses with no people in them?...
    • if a course (requesting extra resources/quota) has no members, do those extra resources persist?
      • existing tables record what are in the resource files - not the final tables.
      • we need new tables to record per-course information. Then revise tools to use these.
    • q: does sponsor_resources properly handle user-specified odd directories? Isaac can look at code to answer this.

Roadmap for Fall

    • deferred two weeks when all are back.

2019-09-26 Database Schema: to be discussed between Daniel and Isaac.

  • Draft description found in SATAccountsDatabase. Isaac has not had time to discuss; he and Daniel agree that Daniel will proceed with his draft, to be updated as necessary.

2019-10-11 and 2019-10-15 Database Schema: discussion between Daniel and Robyn

  • Robyn has helped clear up Daniel's understanding of the existing schema, both for: how some of the fields are used for both CS and MFCF, and what fields are used in MFCF but not CSCF. Details recorded in SATAccountsDatabase (particularly this diff )

2019-10-17 - Meeting with Dave, Lawrence, Lori, Robyn, Isaac, Adrian, Clayton, Daniel.

Agenda

  1. Reviewing progress: a) short userids (Isaac/Adrian); b) UID/GID registry (Isaac) c) database (Daniel/Isaac)
  2. Roadmap for Fall

Reviewing progress

a. short userids (Isaac/Adrian);
  • we handled most of setup of short userids.
  • short group names to match - and home directory name matches the short name.
  • will be handled by UID/GID registry.

b. UID/GID registry (Isaac)

  • started schema;
  • has conceptualized a new implementation that closely matches what we currently do, with an expectation of adding new stuff later.
  • RFC 2307 and extensions- covers AD extensions that Dave says we should include initially.
  • Isaac will review these and propose a solution. Daniel to talk to Isaac next week to see if he's started.

c. database (Daniel/Isaac)

Roadmap for Fall

  • we should have a web demo within a few weeks - to cover the user interface, and database as it stands.
  • we set up accounts in November; not going to have new system for November.
  • Can development proceed before we have the UID/GID registry finished?
    • Dave needs new accounts API, plus data (not just API).
    • at a cutover-point ldif code needs to switch over to new code.
    • Clarification to the accounts process document: UID/GID tables will be part of the accounts schema; they will be generated when data is available, rather than when it's requested for the user to first log in.
  • Dave wants this complete by end of February - by 15th for testing.
    • would like API and test data by Jan 2.
  • Next meeting in two weeks: Daniel to demo UI; and have details for Dave about API (likely JSON).
    • will also invite Lori Suess.

2019-10-21 Meeting with Caroline Kierstead

While doing an off-campus project, Daniel was sitting next to Caroline Kierstead and he asked her about about accounts management from her perspective.

These kinds of users need to be manually added to the system:

  • sessional instructors
  • grads (TAs or Research Assistants)
  • ISA's (co-op students working on the course).

Who should be making these decisions?

  • depends on the course; primarily depends on the instructor.
  • ISC's or instructors might be making the additions

Who should we talk with next, from an administrative end-user's perspective? Byron Weber Becker; and Gang Lu (Carolyn's replacement; ISG Undergraduate Operations Coordinator).

2019-11-07 - Meeting with Dave, Lori P, Robyn, Lori S, Isaac, Adrian, Anoushka, Daniel.

Agenda

  1. Reviewing UI
  2. Recording future directions

UI Questions and Feedback

  • Q: What happens at group end date, to members of the group?
    • They are still in the database, and listed in the UI; the API will report to Dave's code on active groups, not including inactive, so it can remove access when the group becomes inactive.

  • Q: Can we delegate authority to change particular data, to a group of admins? Follow up with Dave.
    • Yes, the database can support this, though it's not certain the business-logic and UI will have this complete for January.
    • And yes we will have a path for upgrading the code that doesn't take down the production system...
  • Q: Dave wants multiple sponsors- eg., two advisors, and the second one will remain if the first one is removed.
    • A group can only have one sponsor. However, this can be done in effect by having a group for Prof A, and a group for Prof B, both which contain the student; and they have a parent group, which is the one that we depend on for denoting resources.
  • Q: by Daniel: Are sponsorships always explicit, or do we sometimes want them inherited from groups?
    • Probably sometimes inherited. Will discuss further.
  • UI improvement: Groups: instead of +, "add new member"
  • UI concern by Lori S: not very straightforward workflow to her.
    • How can we support regular workflows? We walked through adding an arriving Faculty Member to a Group; but it's not clear to her where she'd start. We didn't try and walk through a more complicated example, such as giving the arriving Faculty Member proper sponsorships.
    • A basic answer to workflows would be a help page with instructions and links to the appropriate page(s). There aren't hints to the order of pages. (There's more than one way to do it).
    • A more complete answer would be application "wizards" to walk through common processes. A list of these starting-points could be put on the front page.
    • currently Lori trains co-op students to operate the existing system; spends three weeks teaching them, and then they are able to do it in the proper ways. There isn't a written training document.
  • Basic improvements to the UI:
    • hover text;
    • pulldowns and/or validation, not free-field text for text that should be constrained.

Future features

  • user preferences- preferred/most-used bill codes, etc.
    • to offer a user filter for data, so that user won't see more than they should.
    • will be necessary if MFCF is going to see different groups and sponsors than CSCF.
      • I've written up how to do this for groups - displaying based on a person's Primary Appointment and that Appointment's default Group.
      • For resources and sponsors, do we need to add a field to constrain what group "owns" the records?
  • a goal is self-service: people have access to the right level of permissions for them to grant resources.
    • see Dave's first question about delegating authority.

Next steps

  • Continuing work from last group meeting (see 2019-10-17) by Adrian, Isaac, Daniel
  • UI work by Anoushka
  • API work by Daniel and Isaac
  • testing a more complete UI with a wider range of end-users - Anoushka and Daniel
  • interviews to determine who can make the decisions (see 2019-10-21)

2019-11-29 - Meeting with Dave, Lori, Robyn, Isaac, Adrian, Clayton, Daniel.

Agenda

  1. Reviewing progress: a) UID/GID registry (Isaac) b) short userids (Isaac/Adrian); c) database (Daniel/Isaac) d) API (Daniel/Dave)
  2. Plan for Fall
  3. Plan for Spring

Reviewing progress

a. UID/GID registry (Isaac)
  • Isaac has schema that exactly reproduces UID registry for general.math - normalized
    • analysis shows no data has weird incompatibilities
    • next step: interface, for lookups; looking at programs that refer to registry, for exactly what needs changing.
    • then, can change to use our new numbering scheme under the hood.
  • Isaac can concentrate on this after next Friday.

b. short userids (Isaac/Adrian)

  • new UID registry has long. Isaac is looking into the impact of short, when looking at code.
  • possibly: when requesting ID by number, return short/long? Nope, we don't do that currently.
  • other direction, by name, can accept short/long...
  • Adrian notes registry currently returns long. (yes)
  • group search for short returns unknown; long returns correct.
  • currently there are existing short groups that are ambiguous truncations

c. database (Daniel/Isaac)

  • Lori question: do we need separate AD from these databases?... yes. it's not separate; it's subordinate.
    • all one data-pool
  • Dave question: where are friendly-name, shell, homedir kept? homedir is important to be included in the database
    • shell and homedir not centralized, dependent on "realm" or "region"
    • at any time do we want LDAP rebuilt from data-pool?
    • Clayton and Isaac to discuss.
  • where do we store the list of realms/domains?
    • our "forest" is in AD, can we put it in our data-pool.

d. API (Daniel/Dave)

  • Anthony Brennan arriving Monday for a 5 month contract; will be available 1 week later, Dave hopes (one higher priority). Max. 2 projects.
  • Dave wants to defer API discussion until Anthony's here, so he's on-board.
  • API and test data: deadline: by Jan 2.

Plan for Fall

  • UID/GID registry completion
  • short userids
  • database completion (including stored procedures and triggers)
  • API complete (including stored procedures)
  • UI: hopefully with end-user privileges; possibly with only global administrator privileges
    • Interviews with end-users (Anoushka/Daniel)

Plan for Spring

  • Dave would like roadmap for how we get from Fall to done.
    • end-to-end how create users? - on whiteboard/sticky notes
    • written statement already for what data do we need to authorize an account?
      • What can we delegate?
    • Math: changing plan. needs design; they are not running AD, will be running salt or grouper.
      • Math has an accounts request webform.
    • with the end-to-end design, get admin check-off in followup.
  • [from previous meeting] Dave wants this complete by end of February - by 15th for testing.
    • this requires: API and test data by Jan 2.
    • needs to build: new quota allocation/account creation and removal; sponsored mail aliases; command-line tools (?); new version of AD tools that uses new database
  • Daniel asked Isaac: how can we satisfy Dave's request for "as-needed" updates, rather than on a timed schedule?
    • this would require completely revamping the data process from IST.
    • they currently provide _quest updates once a day. Isaac is in conversation with them about increasing to 3x/day, which would make a big difference.
    • _watiam updates are ?x/day but we don't know a new user is in courses until we get the next morning's _quest update.
    • we could know about name changes more quickly.
    • Isaac's proposal: our "trigger" is running after the data gets updated each morning. If that frequency updates, we automatically run more frequently.

2019-12-06 - Adrian

Question: is next week's meeting to discuss end-to-end how we currently create users, or how we hope to create users in the new system?
  • Our motivation is providing Dave with what he needs to produce the latter. We need to have the former, first, and that's not fully documented. So we are starting with discussing the current system end-to-end.

There are portions of the old system that we haven't discussed much (or at all).

  1. Adrian says he currently turns Undergrad Operations data into sponsor text files for sessionals (teaching).
  2. Grad data is turned into sponsor text files for TAs
    • via a script developed by Isaac and Adrian which queries grad office tables (psql -h postgres.odyssey -d odyssey ; set search_path to _instruct, _identity, _quest, public; [...]
    • produces sponsors file: cs-xh-admin:/software/accounts-master/data/sponsors/CLASSES/automatic/TA-cs2001
  3. Some sponsorship data is NOT in watIAM, such as role accounts (cs-pwset).
    • those are maintained in the "userids" database: cs-xh-admin:/software/accounts-userids/data
            cs-xh-admin:/software/accounts-userids/data# grep cs-pwset Userids
            x:cs-pwset:For Web password:2004/Oct/21 - UWdir
  • all entries where the second character is ":" are maintained outside of watiam, such as "x:" (the nomenclature is documented in a README; there are many formats).

We haven't discussed these non-watiam accounts. MFCF has intentionally stopped using them.

  • If we are going to decide to put them into watiam, we need to implement a process for that.
  • Adrian notes: "Compared to a few years ago, the creation in WatIAM of accounts with a uwuserid corresponding to a userid we are currently using (known to be using) has apparently become troublesome and somewhat time-consuming (for IST). E.g. a recent example of userid "isg"... /RT#1019570 /RT#1020383

2019-12-06 - Gang Lu

I called Gang to ask about her interactions with accounts creation.
  • she emails accounts@cs concerning sessionals and new courses. She updates SessionalHiringArchive to include a list of names and the courses they are teaching. Both (email and twiki) updates are incremental as she receives information.
    • Adrian waits until the list is complete to do them as a batch.
    • where does her list of sessionals come from? Her group goes through the hiring process, and for the successful hires, that information is passed along to HR. Tracy D is in contact with HR, but often HR doesn't tell us when they have created WatIAM accounts. With the new WatIAM creation process, HR sends the new hire a registration-link, and they complete their information, and then it falls into a black hole as far as she's concerned.

In general does Gang know WatIAM IDs for these hires?

  1. if they are recent graduates, yes.
  2. if they are brand new, she knows that they need them (via HR).
  3. others are grey areas; some have them already because they are alums.

Are there other accounts setup we should know about? I saw in RT that she asked Nick for course accounts creation?

  • She first emails accounts@cs, and eventually if needed she reaches out to other CSCFers such as Nick.

Are there other things we should know?

  • She wishes we could find out when someone's WatIAM accounts are created by HR; rather than Tracy having to bug them for it. She says recently a new hire emailed her a screen-shot as proof that they did have an account (that she didn't know about).

2019-12-12 - Meeting with Dave, Robyn, Lori S., Isaac, Adrian, Clayton, Daniel.

Agenda

  1. End-to-end walk through: how do we create users?
    • MFCF has an accounts-request webform. Is there a written statement for what data does CS need to authorize an account? No.
  2. Reviewing progress: a) UID/GID registry (Isaac) b) short userids (Isaac/Adrian); c) database (Daniel/Isaac) d) API (Daniel/Dave)

End-to-end walkthrough

Whiteboards filled, with existing system. Daniel to transfer to something we can potentially edit and markup collaboratively. Identified a number of questions to answer.
  • Can we eliminate reliance on non-watiam "userids" accounts, moving them into watiam?
    • We are in agreement we would like to eliminate that database. MFCF doesn't use them any more. Dave suggests we consider designing workarounds (eg., for visiting faculty who want a userid which cannot be a watiam account. He has seen three, over the years.) Daniel says watiam existence is fairly fundamental to our design- this would be a big deal.
    • Can we address this ASAP? (Dave?)
  • What validation happens from end-to-end; identifying errors that need to be manually fixed? MFCF relies on regular reports of errors. I believe CSCF debugs errors upon running sponsorships. (Adrian?)
  • What auditing/fixing do course-accounts need? Adrian cautions that we do a lot of auditing; we don't know if the course-account owners rely on this, or wouldn't care. These are questions for Omar. (Adrian/..Daniel?)
  • How do we use "regions"? There are various definitions in use. (discuss as a group)
  • Does CSCF need to consider machines that don't use A/D? Dave notes that Lori does not use A/D for accounts on all cluster machines; typically just for the head nodes (due to network fragility). (Dave?)
  • in MFCF, what precisely do "Jim's scripts" do? (Robyn to investigate.)

Reviewing Progress

a. UID/GID registry (Isaac)
  • Isaac demo'd a schema with custom functions; these replicate the current ID registry functions
  • next step: looking at programs that refer to registry, including reviewing source-code on capo/(etc?) and talking to Robyn and Adrian. (soon).
  • following: depending on the programs that need to continue, either tweaking their source-code and deploying; or replacing.
  • then: can change to use our new numbering scheme under the hood.

We ran out of time in this meeting for discussing other items. Daniel to schedule targeted follow-up meetings soon.

2020-01-23 - Meeting with Dave, Isaac, Daniel

  • To discuss: API and inputs for Dave's tools; timeline.

API

  • Dave's initial request for "views" on data
  • Isaac proposes postgres access for queries, and stored procedures, which make queries and updates.
  • We've established that Clayton's code is essentially doing all of this, already. Isaac thinks we only have minor work to go?
  • What does Dave need?
    • White board photo
    • SAT API will be read by scripts on linux.cscf, kicking off scripts which send data to the new 2019 CS Windows server, which will send data to LXC client-machines on a new Ubuntu 20.04 server (yet to be setup, likely Feb 1-Feb 15). That server will emulate both general.cs and student.cs for testing.
      • the LDIFs for the new Windows server will also be compared to the LDIFs of the current LDAP forest, to validate that new = old.
      • This part of the diagram is only an overview; further detail/correction is a good idea, but we didn't go into those details in that 1-hour meeting.
    • What does CSCF need, other than the AD-generation?
      • setting account quotas: will be supplied by SAT/accounts/"groups" - also read via database queries.
      • setting email aliases in general.cs and student.cs - under discussion by managers; might not be needed. We will assume we will implement it.
      • on the photo the dotted lines are the data path for setting account quotas and email aliases: data from odyssey -> linux.cscf -> lxc containers for testing general.cs and student.cs
      • We are NOT setting print quotas, despite that being in the spec. Dave says these are per-machine only.
      • Future potential for adding new types of sponsored resources other than disk quota/email aliases/print quota. What do we want to automatically set up for different groups of users?... (Daniel imagines: automatic grouper updates which authorize physical lab access; students in a course receiving shared git repo access per assigned group; shared LXC container under student.cs; access to any cloud system we can programmatically authorize)
      • Re-iterated in the meeting: it's currently out of scope to plan for accounts that do not have watiam correspondence. odyssey is driven by the campus user database (with unique internal IDs corresponding to them), and this would introduce another authoritative source (potentially conflicting) that hasn't been implemented or worked through.
        • existing process: picking uwdir IDs is handled by watiam onboarding. Dave's example of "CS Director asks for a visiting researcher's account at 7pm": answer is: self-service watiam for the researcher, which gets them as far as an auto-created account, we think. They won't get a corresponding CS account until an authorized SAT user adds them to an appropriate group.
        • creating campus-wide email aliases is handled by watiam.
        • Dave asked about Lori P's cluster accounts and off-campus users. Not clear how those accounts relate to campus accounts- not clear how odyssey can see.

Triggers

  • Two kinds: the timed events for Isaac's watiam updates, and via Postgres.
    • timed events for watiam updates: kick off every 4 hours, on the :30. (and take approx. 7 minutes to complete). Isaac can add a shell stanza that kicks off whatever Dave needs.
      • We agree we will use this for now. It covers watiam updates, as well as for UI changes which lead to group and quota updates (up to 4 hour lag)
    • vs. postgres triggers can kick off stored procedures whenever any data updates.
      • they can also send a "notify" event, which is used by an external listener to execute code - https://dzone.com/articles/notify-events-from-postgresql-to-external-listener
      • we agree that using postgres triggers to kick off LDAP updates is out of scope for now. But they would be a useful enhancement so updates that aren't watiam updates will propagate more quickly than up-to-4-hours.

Timeline

  • last month we said the following:
    • Dave wants this complete by end of February - Feb 15th for testing.
    • this requires: API and test data by Jan 2.
    • needs to build: new quota allocation/account creation and removal; sponsored mail aliases; command-line tools (?); new version of AD tools that uses new database
  • We're behind on having a fully vetted API and test data.
  • Isaac and I will work toward having the API and test data in the next few weeks AKA approx Feb 7.
    • Dave wants to have his test environment operational for Feb 15.

Followup with Lori P.

  • How are we involving MFCF in these discussions? Robyn L. should be kept up-to-date.
    • This time, Daniel to do followup with Robyn independently.
    • In the future Daniel to inform Robyn about the meeting in advance, as optional-invite.

2020-06-08

Meeting with Isaac, Daniel. Following a lot of work done in ticket /RT#1015583.

Agenda: review and revise draft database schema.

Sponsor Computing related questions:

  • If Clayton determines the "region" from sponsor_computing.computing_name it should be a lookup table, not fill-in text.
  • What part defines whether a unix group should be created?
    • It's definitely not "all SAT Groups become unix groups"
    • It's possibly not "all sponsored SAT Groups become unix groups."
    • So there might need to be a field in the group definition table

[ jumping forward to related topic:]

  • Isaac points out that group_group is best limited to general group definition, and another table is best used for the specifics of sponsorship (including the above, group sponsorship unix group)
    • Which answers an open question, "what should be the default sponsorship for new courses imported by quest?" - they shouldn't have a sponsorship yet; sponsorship details are separate and subsequent.
    • Should group sponsorship have a settable expiration extension time? Eg., person should lose access immediately, but files are not deleted for n months. We know this delay is a spec requirement, but where is the time set?
      • All else being equal, account expiration and removal parameters belong in the accounts expiration software (Dave's group), not the database.
      • Are unix groups addition/removal a different case? I suggest that no, removal can be set by the active-as-of end date, rather than a separate customizable parameter in the database for amount of time.

Group characterization.

  • Change the group_group table to add a "source" column, possible values:
    • Odyssey (automatically imported by script)
    • Manual (updated in UI or CLI)
    • Computed (this is new; discussed in depth):

Computed groups could be unions and intersections of other groups.

  • This may come up in some more complex situations, such as "this research group includes that research group minus this subset"
  • I can see the usefulness, but I'd be concerned about complexity of managing these. Noting end-users who I heard say Grouper's interface was too cluttered/complicated for our account-management, and I think they might have been confused with defining groups. (Which is a fair enough complaint; they are complicated).
  • My schema's proposed hierarchical trees are essentially "parent is union of its children"; but not allowing cyclic graph.
  • I suggested we don't add the complexity of computed at this stage, and could be added later when we had a use-case. Hierarchical meets all the identified use-cases.

Adding a "source" column neatly solves the question of "how do we manage/merge SAT groups and odyssey-supplied groups"?

  • the import process will update/add/delete those labeled "Odyssey"
  • the UI will show both; but only allow edits on "Manual"

Isaac will proceed with writing up another schema version.

2020-07-16

Meeting with Isaac, Daniel.

Following Isaac's work on the schema mostly recorded in /RT#910469

Prep for tomorrow's project meeting. Agenda: finalizing schema; planning for data loading.

  • implementing hierarchical groups. Agreed: implement "union" now, for hierarchical groups; other kinds, later.
  • he will share v1.0 schema later today; including sponsor_automatic but leaving off the sponsor booleans that were still uncertain (nocharge and nosubsidy).
    • we can later re-add those fields if needed
    • nocharge might be redunant
    • if in the future, MFCF needs to record subsidies, we can make specifications that meet their current needs, rather than whatever the needs were when the sponsorship database was originally defined.

  • planning for data loading
    • Dave will have a VM "accounts.dev.cscf.uwaterloo.ca" ready later today /RT#1080931.
    • Isaac will do some work today on 1st-draft queries to load data
    • if we don't have 100% coverage of old groups/classes data, we can work with that in upcoming weeks while Isaac is away.

Other items:

  • future work: making account number into a validated field.
  • current work: includes idregistry schema. So Clayton can request new idregistry values without an external tool.

2020-07-17

Meeting with Dave, Robyn, Lori P., Isaac, Adrian, Clayton, Daniel.

Agenda

  • SAT Accounts Project overall progress?
  • Status report and next steps
  • Timeline

SAT Accounts Project overall progress?

University funding freeze --> no co-op for Daniel this summer --> web UI won't be delivered until Fall.
  • Summer: will deliver command-line tools for querying and manipulating database
  • Summer: will deliver necessary stored procedures and views for technical client needs.
  • Fall: will deliver web UI to meet broader technical needs

Database is proceeding, though not as fast as we'd expected at start of year.

Status report and next steps

  • schema: will be complete as of today.
  • data import: a draft will be complete as of today.
  • stored procedures: idregistry = done. Other stored procedures for data updates are not done yet, but will follow.
  • Isaac is returning on the 28th but will hand over database details to Daniel today.
  • Isaac will coordinate administrative database permissions with Daniel; Daniel to set up access for others while Isaac is away.
    • Dave requests a read-only account

  • Q: will this include MFCF data? Not before end-of-term.
    • Robyn notes that Jim has written code that consumes the flat-files that will need re-implementing.
    • Daniel and Robyn will discuss and document what MFCF needs. (Adrian says there are differences within the accounts code; Robyn suspects it is a loosening of requirements so that's not an issue for the new database).

  • since we have stored procedures for idregistry, any idregistry replacement scripts can be fully implemented now
    • Daniel to set a meeting with ? from Dave's group to discuss implementation.
  • other front-end (admin user) scripts can be spec'd and v1 written while Isaac is away, for testing and finding missing pieces; and Isaac will provide stored procedures after he's back.
  • Qs about queries
    • process for updating resources for TAs, tutors, lecturers? will be command-line. Daniel to discuss the current needs, with Adrian.
      • Isaac notes that lots of updates are automatic via odyssey; ISAs and tutors aren't automatic yet.
    • can we query who has taught a course for the last 5 years? Yes. What courses a student has taken? Yes. Anything in odyssey could be constructed as a query...
    • do we handle ambiguous/conflcting information such as:
      • "student added as an exception; then Quest updates automatically." - yes.
      • "faculty member A requests privileges for a grad student; faculty member B requests other (or overlapping) privileges for the same student" - yes.
      • "faculty member A grants machine privileges for all grads in their research group; faculty member B restricts privileges for their personal workstation" - will need looking into; not yet.
        • a) could be done with separate regions, though it's not entirely clear how we will support overlapping regions.
        • b) current deliverables do not offer subtraction-from-groups, though that is an eventual feature that's within the realm of automatic update.

  • deployment timeline? Dave suggests betwen 18th and 31st, since courses are done on 17th. Let's say Tuesday Aug 25th. OK!

Timeline

  • database: Done July 17 (Isaac/Daniel)
  • dev AD forest server: Done < July 17 (Clayton)
  • dev AD client server : July 17ish (Dave/Anthony/Clayton)
  • data import, enough for Dave's group to proceed with developing tools: ASAP / either today or week of July 20-24 (Daniel)
  • queries for Clayton to programatically query database: July 20-25 (Daniel, with Clayton, possibly Anthony)
  • stored procedures to update database: start week of July 27 (Isaac)
  • maintenance tools to update database: start week of July 20; may depend on stored procedures. Through mid-August. (? Dave's group / Daniel)
  • discussion of MFCF database needs: July 20-25 (Robyn/Daniel)
  • test that the database is complete: start week of July 20, through mid-August (Clayton / Daniel / Isaac)
  • production-ready / flip the switch on the new AD: August 25th at the latest

2020-07-31

Meeting with Isaac, Daniel.

Agenda: What is top criticality? What are we blockers on?

  • process for auto-populating groups from class data
  • data import on S2020 accounts so Clayton can do ldif comparisons
  • rest of existing data import, for the rest of our requirements

Today Isaac is working on:

  • process for auto-populating groups from class data
  • re-deploying test database with more data (currently MIA due to his testing work)
  • being clear on the three queries "what users need to exist in the region; what groups need to exist; which users are in which groups"
  • possibly followup with Clayton about writing queries.

Today Daniel will work on:

  • process for converting data to populate sponsor_unixgroup and sponsor_login tables
  • schedule a group meeting- ideally for Wednesday afternoon, assuming people are available.

Next week we need to:

  • supply stored procedures, including to manually add a person to the database
  • confirm with Dave's group that we are supplying the necessary data/stored procedures/views for maintenance tools
    • including or excluding imported data for quotas? Depending on import, that may add complexity.
    • look at 'userinfo' command and figure out what pieces of data are missing -
      • (noting we will NOT be providing a precise replacement; hopefully there aren't maintenance scripts that try and parse userinfo output).
  • document what is not auto-importing from old data; as a precursor to discussing fallback procedures:
    • can we use the old system to do mail aliases? (this is mostly static data, since we think changes are largely made from include-files)
    • more generally if our best efforts at auto-importing data aren't good enough or we haven't given INF enough time to code and debug, is the old system an option for Fall?

2020-08-05

Meeting with Dave, Lori P., Isaac, Clayton, Daniel. Robyn is away today. Adrian is away for part of August.

Agenda: Progress report

  • Daniel has process for converting data to populate sponsor_unixgroup and sponsor_login tables.
    • Yet to be implemented in SQL.
  • Isaac has set up process for auto-populating groups according to class data;
    • re-deployed test database with more data (groups; sponsors; has a sample entry for ? unixgroups ?)
    • has done followup with Clayton about sample queries.

Clayton would like more data- a full set would allow comparison. We'll do what we can!

Discussed timeline. We are bumping to end-of-year for having a completed system. Drop-dead deadline is April 2021.

  • Dave notes that account creation for Fall term may have already happened in the old system- Adrian does it earlier than end-of-term.

Dave notes that Lori Suess is now retired- so UI requirements are going to be different.

  • Daniel to work on UI in the Fall, with co-op.

Confirmed, keep treating this as a top priority project.

Requests:

  • Dave: wants access to watiam hidden flag. Isaac can add to some view.
  • Dave: wants access to section, for Clayton to make netgroups for it. Isaac says he will expose all the groups that odyssey knows about, which will include sections.
    • use-case: different resource requirements for different instructors in the course. Another use-case: different resource requirements for undergrads in 440 and grads in 640 (currently in one section. Hmm.)

Discussion about account creation: Dave says ldap can create automatically, on-demand? But what about resource usage when 100 (1000) users all try to log in. To be discussed, but not today.

2020-08-07

Notes from Daniel.

Open questions, to discuss with Isaac.

  • group_depends population: how far are we from being able to record groups-of-groups?
    • necessary for recording computing_group memberships
  • from ticket: transactions table for auditing changes - pro/con?
  • the old system [as represented in postgres] will apparently create a new "class" or computing_group if you assign an individual user resources. Essentially a group of one. How should this work in the new system? Is it the same process under the hood, of an explicit "group" or is there part of the design that accepts user records in lieu of group?
  • the import from old data includes a number of 'away' classes, eg., cs2, cs6 ... I assume we already have that data automatically from quest?
    • We need to preserve (not deactivate or delete) accounts for students who are away. We will need a clean way to provide this information to Dave's group.
    • the import includes course groups, which are a duplicate from Quest. After we've analyzed the patterns, should we drop course groups?
      • keep it for historical info? Would likely be a lot of work. * For now let's assume we can drop it; could be a request for later.

Open questions from the ticket, ping Clayton:

  • feedback about current need for 9 vs 7 vs ? region names

Open tasks for me to dig into:

  • reverse-engineer person-to-group in the old system; manually identify exceptions to the known group memberships.
  • identified groups with different end-dates for different users: (17 in cs-general, 2 in teaching)
  • are there other user exceptions I can find?...
  • hopefully not important: resolve: 573 unique class_names, 580 unique (class_name,class_description)

2020-08-20

Meeting between Daniel and Isaac.

From open questions above:

  • group_depends population: how far are we from being able to record groups-of-groups?
    • Existing: how to record group dependency as union/difference. To do: populating a list of the dependent groups.
    • Isaac will work on this as a priority.
  • transactions table for auditing changes?
    • Agreed, we'll add a transaction table.
    • we want to record why something was done. Do we need a separate column for comments and for RT number? Or just RT?
    • Agreed: include both ticket_id column plus comment column. RT tickets are sufficient for CSCF updates- indeed, the web UI can look up the ticket subject. But comment column is necessary for updates by other users who don't use RT.
  • the import from old data includes a number of 'away' classes, eg., cs2, cs6 ... I assume we already have that data automatically from quest?
    • yes, there will be automated groups for who is in what plan; who is a co-op; so there will be automated groups for cs_away.
  • the import includes course groups, which are a duplicate from Quest.
    • convert? keep it for historical info? Would likely be a lot of work.
    • No; don't bother. Superfluous. So I will skip over importing cs_students into sponsor_unixgroups.

Further on sponsor_unixgroups data import and different end-dates for different users in the existing data:

  • do we need to relax the constraints, to allow a unixgroup to have different (start,end) dates for different group_groups?
    • people are automatically in courses; courses are automatically paired with unix groups. So teaching environmment shouldn't need different dates for different groups.
    • Are there legitimate reasons why a single unix_group might have different (start,end) for different group_groups?
      • example: research group X is given access to core www_logs for a limited time. Definitely this is different from the access of other users of www_logs.
      • Isaac suggests we can resolve this by assigning the research group X a time-limited group_group of www_logs. I think that's OK. I will look into the data further.

2020-09-09

/RT#910469 had an update by arpepper long-form which I attempt to summarize here.

Adrian: [ in the existing system, a non-math student who registers for cs100, and drops part-way into the term, will have a sponsor record only until they drop; there is no enduring record ]

Daniel: In addition to the group "active dates", the new system will also record the individuals' active date within the group- including the automatic joins and removals such as courses. So there will be a record left behind that they had been a member, with the actual join date and drop date. You can use the actual drop-date; or calculate plus n days, or start date of next term plus n days. Whatever you want.

Adrian: [ undergrad students who continue as grad students will unfairly have their student.cs accounts reaped, because the "last seen as" value would be cs8, and we only preserve accounts for people whose "last seen as" value is cs1..4 or math1..4 due to political pressure to not give new grad students accounts on the students system unless they were actively involved in classes. ]

Daniel: The problem of continuing CS8 students' student.cs accounts expiring more quickly than desired, could be resolved by not just relying on a single "last seen as". It will be easy to query for "was this student a member of group CS4 within the last two terms?"

Adrian: [ proper terminology for last-seen-as being difficult ]

Daniel: The question of "what do you call this student" is a good thing to keep in mind, though we can just look up their plan/program in odyssey.

2020-09-17

Webdev checkin: Isaac, Anna, Daniel, Niya
  • Discussed setup and postgress access.
  • Daniel will make Niya a dev postgres account as per Isaac's instructions (from elsewhere...)
  • Discussed psycopg2 stored procedures: callproc(procedure, parameters)

2020-09-25

Webdev checkin

Discussed latest version of schema, which Isaac has checked in.

  • Some areas yet to be filled in.
  • Isaac will work on the auto-update of computed groups
  • init_perm.sql includes the "API" which is functions granted to other roles, including public. This will be expanded.
  • Daniel noting: group_member_insert() is idimpotent - inserting a member into a group they are already in will silently succeed. UI will handle if it should have a warning in that case.
  • Isaac will rename account to billcode
  • Isaac to add auto-import so every user in the system automatically has a group
  • Isaac to add transaction table

Daniel to continue with import investigation.

Daniel/Niya/Isaac/Lori S/Robyn to meet on Tuesday to discuss current MFCF operations.

Daniel/Niya to continue development on the web UI which is not using this database yet, so they can do user testing.

  • We'll plan to begin implementing using the new schema soon.

2020-09-29

Meeting with Lori Suess, Robyn, Daniel, Niya, Isaac to discuss accounts/sponsorship workflow and process in MFCF.

Follow-up:

  • Daniel to write up notes from this meeting
  • Daniel and Niya to discuss UI involvement
  • Daniel/Niya/Isaac to discuss any necessary schema additions/changes
    • One obvious addition: We were going to remove sponsorship for printing, but Math uses it and will continue to use sponsorship for printing.
  • We will need a follow-up meeting for Daniel/Robyn/(Jim?)/Niya to discuss "Jim's Scripts" which operate on comments in files within the sponsorship file tree.
    • Initial thoughts: these scripts cover groups and not users. As long as the group names map directly to SAT groups, perhaps they can continue working unchanged.
  • We will schedule a follow-up meeting as soon as we have UI samples for Lori, Jim, and other users to look at.

Workflow demo and discussion

* A faculty member, or visitor, or staff member, will visit https://uwaterloo.ca/math-faculty-computing-facility/ and traverse -> Services -> Accounts -> Research and staff accounts -> Request for computer account.

  • This form contains all the information MFCF needs to make accounts.
    • requestors will fill in "Printer access: Applied Math printer" or rarely "Printer access: Yes" but Lori prefers "lpr_am"...
  • Form results are sent via email to Lori and the co-op, who either handles it or triages who can do it. The requests aren't sent to RT.
    • MFCF knows typical defaults for specific groups including nexus.
  • They determine where in the sponsor flat-files they should put the addition, such as for faculty / administrative.
  • They add the person to the appropriate file used by "Jim's scripts" - to add them to nexus groups. (eg., "am_other" for applied-math printers, access for others who are not staff or faculty.)
    • they use nexus groups to replace lp_quota; and increasingly for deployed linux machines for authentication.
  • Lori runs sponsor_resources and accounts-client manually, at whatever interval
    • whenever she knows there are changes to deploy;
    • month end for billing charges;
    • term-switch n-weeks before the end of term
  • Lori also runs accounts_master to pull down registrar data manually, at whatever interval (frequently at beginnings of terms).
    • Robyn: intentional design of a person monitoring the output of each step to look for syntax errors or new fields the Registrar added without telling anyone, which require parser changes. These are intentionally not automated.
  • They can infer from experience that not all the steps happened, need to fix and re-run...
  • sponsor_resources output warns about batches of expired sponsorships; Lori then needs to go into source files and remove the people manually. We walked through removing a sponsored research account for a grad student; while the account still exists, since it's expired, it automatically has been given a non-login shell.
  • NexusComputerGroups and NexusComplexGroups define the Nexus and SCCM groups and computers which associate the groups to account classes (groups). This suggests we can leave aside pulling this directly into the database; if they update SAT with equivalent group names, the pairing will still work.
    • they will use our API to pull results from SAT instead of extracting from the flat files.
  • They use classlist files to populate access lists in Active Directory - they would use our API to output an equivalent list of users (Isaac has already working on a ticket to report classlist data replacing the flat files).
    • it is unclear how the classlist files are turned into Active Directory Security Group members; the process is equivalent to Clayton's updates.
  • Sample error conditions:

Fatal error: "/software/accounts-master/data/sponsors/REGISTRAR/Undergrads" line 81: Unknown member "*"

Error: "/software/accounts-master/data/sponsors/REGISTRAR/Undergrads" line 27: Can't find userid for id number '*:20811267' - supplied student ID has no watiam associated.

  • When do they have to go back to IST for a fix? One is when someone has more than one watiam ID. eg., they were a student, and now they're staff, and HR doesn't use the old userid. They have to be merged by IST.
  • our uid registry will complain about non-standard uids.
==== checking id registry status ==== Fri Feb 12 13:15:22 EST 2016
Host has non-standard uids:
/etc/passwd: rstudio-server:82
  • I brought up how the new system will be much more automated.
    • Robyn raised: what if Lori is changing a number of class names related to one thing, and wants them to take effect when she's done, rather than immediately.
    • Or if edits introduce an inconsistent state, and has potential for causing damage. Jim's scripts run every 15 minutes.
    • suggests he wants a "pause" button.
    • term switch and month-end billing (pulling in data to create statements - which uses the data we've revamped; and they will need to rewrite in order to use).

Some takeaways and questions

  • new request: potential for transactional batch updates, which must happen at the same time. (none happen until ready).
    • would 1-day granularity help? No; some changes could conceivably be multi-day work.
    • how about prepping a batch update via a CSV instead?...
  • I think Jim's scripts can exist outside our code, for now. Rely on group names matching up.
  • MFCF does monthly billing which requires pulling stats from accounting system. Will require rewriting their scripts.
  • Lori and her student employees do a LOT of manual updates. eg., removing entries post-expiry.
  • how many of their error conditions would still happen if fed latest watIAM and classlist data? Already pre-vetted by OAT import?
  • they don't make tickets for requests- should I try and get a look at some sample emails from their users? Probably not necessary at this point.

2020-09-30

Webdev checkin

Discussed yesterday's meeting with Lori and Robyn.

  • It was useful to see the current workflows.
  • One question is about the request to "pause updates" to allow for batch changes all at once. What specifically would need this feature in the new system?
    • It seems likely that a lot of the use-cases are already covered by auto-imported data.
    • Robyn's request was "don't do an update tonight" - but it's not clear how that's compatible with the overall data model of frequently imported data and all-campus usage (what if one unit wants changes immediately and another wants changes deferred? The changes could cover the same CS/Math users...)
    • Discussed one possible solution: via batch-import from text-file, for the things which have to happen at the same time.
    • Another (more complex) solution is like a git commit, where batch changes are somehow staged.
  • It looks likely that Jim's scripts can exist outside our code, for now. Rely on group names matching up; and using our API to report the details to Jim's scripts.
  • MFCF does monthly billing which requires pulling stats from accounting system. Will require rewriting their scripts.

Isaac has updated the schema to rename account to billcode.

  • he will work on the auto-update of computed groups

Discussed transaction table; and comments; agreed that Daniel and Niya will try figuring out what this should look like before we add them to the schema.

  • First steps: Daniel and Niya looked at the web UI; which currently doesn't load the people section, or for assigning people to groups. This needs fixing.
  • Looked at the wireframe
    • Can we perform user stories using the wireframe? Possibly. Daniel to write up yesterday's meeting notes and then we can work on user stories.

-- DanielAllen - 2019-03-25

2020-10-01

User Stories discussion

Daniel and Niya created detailed user stories.

  • Wrote out user personas, what users want/need, and what we need to include in the UI

What we need to do (UI):

  • Currently existing
    • Adding single/multiple users to a group/groups
    • CSV imports
    • Viewing of user details
    • Automatically create groups
  • Need to create
    • View user history

Next Steps:

  • Daniel to fix web UI for viewing users and viewing group members, so we can examine them more closely. [ done ]
  • Learn MFCF monthly reports process- how does it interact with database? [ in process over email ]
  • Request more data from MFCF about user requests, so we can create additional accurate user stories. [ see further notes from Daniel on 2020-10-02 saved to SATAccountsUserStoriesData ]
  • Discuss UI changes that currently do not exist and look at existing UI

2020-10-05

Daniel and Niya: Discussion of Transactions / Comments / History

Core motivations:

  • As a user editing data, they should have every chance to record why they did something.
  • As a user looking up data, they should be able to easily check on why someone else did something, including if it was done by automated changes.
  • As a user looking up data, they should be able to see all transactions that might be related to their lookup, in all places that might be helpful.
    • At the same time, they should be able to search and filter out data they don't care about.

Comments on Comments

Comments are always attached to a transaction where a change was made in the database. Always store comments with the related transaction.

We can add two fields, prominent on edit pages. [ RT#___ ] [ Comments___ ]

  • Both are optional. Ideally only an RT# is necessary, but sometimes there isn't an RT#, and/or sometimes an additional comment is needed.
  • Possibly a 3rd button: [ clear comments ]
    • So that on saving a change, RT# and comments are kept for the next change as well.
      • keep on all saves (user-saves, group-saves, resource-saves, etc.) until cleared.
  • possibly, instead of 3rd button, a more complex "git commit" version:
    • replace 3rd button with "[ do as batch ]" and it somehow holds off before submitting them, tracks between comments, such that edits to the comment will update for all.
    • probably too complex and unecessary. Recording the idea in case we aren't happy with the other options.
  • we may learn that people don't want the 3rd button or remember to clear the comments... Prototype with/without 3rd button?

Transactions

2020-11-24 updates to this spec to remove "automatic" transactions. Those will be handled in a separate display of the historical values, details TO BE DETERMINED.

Transactions are visible on just about every page, to show manual update history that is relevant.

On user-display page, "transactions" section will include one row per transaction. Columns:

timestamp action action-on old-value new-value actor RT# Comments
  • Descriptions and details
    • timestamp is recorded to the 100th of second? but displayed only to the second. In case there are sub-second update occurances we're sure to see the correct order.
    • action is the stored procedure used - either just the stored procedure name, or a human-readable version.
      • (Do we need a table of stored procedures and their human-readable titles, such as "Create customer" instead of "sponsor_customer_create" ?)
    • action-on is the person's name, or group's name, or "group x resource y", or sponsor name ...
    • old-value is blank for additions to group; new-value is blank for removals from group.
    • optional RT# is a link, if it's been filled in. Possibly we can display the RT title text using our direct RT SQL lookup.
    • Comments are optionally supplied by the user

The transactions table can be re-sorted by any column.

  • Possibly can be filtered? Or filtering is only done on Transactions page (see below)

On the "Transactions" page visible from left-tab, we can see all transactions. Columns also as above.

  • table can be re-sorted and filtered.

On a group-display page, "transactions" section same as above.

  • by default, includes all transactions affecting the group, including manual additions and subtractions from that group.
  • checkbox for "show member changes" is checked by default
    • unchecking dynamically updates the page.

On the list-of-groups display page, "transactions" section same as above.

  • by default, includes all groups.

On Resources display pages, "transactions" section same as above.

  • implemented by doing a text search for action-on matching the resource name
  • this is arguably a questionable design-
    1. what if a resource name matched a group name?
      • But if the alternative is to replace "action-on" with hard-coded columns; or supplementing "action-on" with 1:many relational lookups for user, group, resource, sponsor, or billcode; both of these alternatives sound overly complex.
    2. this design relies on postgres efficiency of text searches.
      • But I think we want to allow users to do a text search of various fields.
      • So this feature isn't "worse" than the text search we want to offer them anyway.
    • If the choices are between a redesign for "action-on" with a more complex structure, or leaving out the display of Resource transactions, I'd leave it off and just allow users to to a text search on in the "Transactions" section.

On Sponsors display pages, "transactions" section same as above.

On Billcode display pages, "transactions" section same as above.

2020-10-06

Daniel and Niya: Discussion of Transactions / Comments / History (continued)

We reviewed "Comments" as described yesterday.

  • Niya will prototype code that illustrates two options for editing groups and group-members:
    1. two fields, Comments & RT#, which auto-blank after every change (added/edited group; added/edited/removed group member)
    2. three fields. Comments & RT# text stays filled in after every change and page-change; "Clear comment" button blanks them.
  • We will then review with some end-users to learn if there is consensus about the UI before we plan for database changes.

We reviewed "transactions" (which we will call "History" in the web UI).

  • Most pages should have a history section.
  • When we discussed transaction details, we raised more questions...
    • We need descriptions for an assortment of stored procedures for illustration purposes.
      • eg., we have group_member_insert("group","person") illustrates action = "Add Person to Group", action-on "group", new-value = "person"
      • we'll need an equivalent which includes parameters for active dates... how do we show the active dates in the transactions table?
    • Daniel to suggest some possibilities with Isaac.

2020-10-07

webdev checkin
  • Daniel to send Isaac possibilities for stored procedures in the API RT
    • discussed some options for recording transactions/history. One transactions table, including columns for various procedures' options as columns (as above).
      • Question: what about groups-of-groups; we're not recording nested changes, are we? No; but in the UI we're already traversing the tree to report to the user about nested groups, so we can look up history for each at the same time. Also, Isaac notes we are recording an explicit list of all group memberships.
    • Either Isaac or Daniel will make the stored procedures, for use by web UI and by command-line tools.
  • Niya to continue prototyping UI for user-testing on Comments

2020-10-14

webdev checkin
  • Niya and Daniel; further work on recording Comments/RTs in /RT#1105414 including meetings with Gord and Harsh for user-testing options.

2020-11-05

Notes from Daniel, for discussion with Isaac (along with stored procedures in the API RT).

See #2020-11-24 for updated version.

Draft Transaction Log table, based on SATAccountsRequirements#Transactions above.

CREATE TABLE transaction (
    transaction_id   serial NOT NULL,
    PRIMARY KEY (transaction_id),
    
    transaction_time timestamp NOT NULL,
    action           text NOT NULL,
    action_on        text NOT NULL,
    oldval           text,
    newval           text,
    actor            text NOT NULL,
    rt_id            integer,
    comment          text,
    automatic        boolean DEFAULT FALSE
    );

Comments

  • action is either the name of the caller stored procedure, or a human-readable version of it. (to be determined; I think we can go with just the name of the stored procedure. Such as "group_member_insert" rather than a more human-readable "Add member to group")
  • action_on is the field being updated
  • oldval is blank for additions to group; newval is blank for removals from group.
  • for automatic updates, I'd like to see a comment "Automatic update by [ process ]" along with automatic TRUE

Some questions about how we implement a transaction logging function.

  • how do we obtain the oldval? Does the caller need to supply it, or can the transaction logger look it up?
  • how do we obtain the actor? Sometimes it will be the postgres user (eg., calling the API function from a command-line application). But the web application will presumably be a single user.

Comments about using the transaction logging function.

  • In previous discussion with Isaac and Niya, I suggested a single line in the transaction log for updates. This is not sufficient- the right answer is a line in the transaction log for each change.
    • As a basic example: many of the proposed stored procedures have more than one parameter supplying a change. Either we need a pair of columns in the transaction log for each possible parameter (old and new value), or we store one transaction per changed parameter.
    • Looking at draft function sponsor_region_create_or_update, on "create" there would be two transactions:
      1. action_on = "region", oldval = "", newval = the new region name
      2. action_on = "admin", oldval = "", newval = the new region's admin name
    • On "update" there would be one transaction:
      1. action_on = "admin", with supplied oldval and newval
    • so in addition to the above two questions, there is a third question: how do we know whether our change is a create or update? There may be a clever postgres feature for this; or we may want to split apart the functions into "sponsor_region_create" and "sponsor_region_update"

2020-11-06

Meeting between Daniel, Isaac, Niya, and Anna

We went over most (all?) of the outstanding database questions.

Isaac clarified an important database feature: a group member can have multiple historical active-date ranges which means:

  1. Deleting someone from a group is setting the active ending date to today. Adding them back to the group adds a new row, with a default active starting date of today.
  2. that implies automatic updates of group membership don't need transactions; since all of the relevent information is stored in the group_member table. There isn't any point to recording comments or actors on group membership updates; in fact, trying to make an intelligent comment about why an update happened, is more difficult than Isaac wants to consider. Fair point.
  3. that implies the web UI for viewing group and person history will have two sections; a historical record of their group memberships as automatically updated, and a transaction log for manual updates.
    • Noting that there can be manual updates of group memberships, which will have a transaction recorded, and potentially RTs/comments.
  4. the transaction log might not need a flag to record "automatic" transactions, though perhaps it will be useful for batch updates by command-line users.
    • Daniel to ask potential users, what batch updates might you be doing from the command-line? (recognizing that student group updates are automated).
    • Noting for the record, it will be nice when we can access HR data for appointment start/end dates; this would automate a batch of additional tasks.

Examining Daniel's notes on stored procedures including deletion functions and specifying custom start/end dates, it becomes clear that many other tables will also need active dates; eg.: admin, region, filesystem, customer, bill-code. Everything that we could delete, we want to preserve the historical record that it had once existed.

  • Isaac will do these schema updates, after he's worked a bit more on automatic group changes.

Questions about transactions table:

  • how do we obtain the oldval? A: the transaction logger should look it up.
  • how do we obtain the actor when it's the web application? A: we can have a stored procedure to SET a custom variable used for the session (so, the entire web transaction, not just the postgres function call). It will set the REMOTE_USER at the beginning of the web transaction, which can be automatically included in subsequent function calls.
    • Agreement that command-line API-calls should use user-specific accounts, so we can record the actor.

Stored procedures:

  • We'll discuss implementation after we have the above schema additions.

2020-11-11

Meeting between Daniel, Lawrence, Niya

Gave Lawrence a demo of the latest changes since he last saw it.

  • Q: is it possible to fill in comments after they user has started saving changes?
  • A: we were looking at that, but it might be complex. For example, how do we know what actions to "back-date" the comments to? Everything they've done? What if they didn't mean everything?
    • followup after thinking about it: we could allow adding/editing comments on a user's own actions. It could be a special feature on the Transactions Viewing page. We can keep this as a wish-list item to look into later in available time.

  • Q: can we make it more likely they remember to fill in a comment? Empty comments/RT: prompt them? show as red?
  • A: yes, we had more or less the same idea!

  • Note for additional user-testing: options for adding start and end date specifics, for things like Sponsors and Billcodes. Offer them everywhere, or do we just give them "delete" and make the start/end dates an advanced edit function?

  • Q: when they are making updates from the command-line, how do they add comments? Could we support using the same comments on multiple command entries?
  • A: the initial design is a simple command parameter. Perhaps the end-user command could check a shell variable?

  • Q: in the old system, term additions happen in advance. If we're adding users to groups automatically, when do accounts get created? In particular, for TAs and profs?
  • A: this is controlled by the start dates. There is a different automatic group set up for instructors of each course, and the start date for instructors could be programmatically set for weeks or months before the term start. We need to go over this in detail with Infrastructure.

Daniel will set up a group meeting to go over details and status, particularly focusing on infrastructure questions. Ideally next week; though we should have more clarity about what the stored procedures look like.

2020-11-18

webdev checkin
  • UI demo from Niya; brainstormed RT/comments improvements.
  • Database update: Isaac's nearly done with schema updates, which add active dates. He will create one or more sample stored procedures, and we'll meet Friday afternoon to discuss.
  • After that, I will book a meeting with the project team to discuss infrastructure implementation
  • Discussed how web UI / stored procedures can handle username, RT/comment via SET and custom postgers variables as described above on 6-Nov. So we don't need to add them all as parameters.
    • Command-line tool can call a wrapper around stored-procedure, looks for: 1) command-line specified RT/comment; 2) otherwise check for unix env variable

Questions for Infrastructure users:

  • Given groups corresponding to all courses, for both students and instructors, which are automatically updated
    • What manual command-line updates will you need, particularly those that don't fall into the pattern of "add user X to group Y with these start/end dates"?

2020-11-20

Meeting between Daniel, Isaac, Niya, and Anna
  • To discuss progress on stored procedures and schema

As noted previously, the new database renames the entities that were named "sponsors" into "customers" to disambiguate since many tables are named "sponsor_something".

We spent a while discussing the new stored procedures for sponsor_customer_create and sponsor_customer_delete - chosen for their simplicity.

  • and they raise interesting questions, noted below.
  • including active dates will allow(but not require) multiple historical active-date ranges as described on 2020-11-06.

Isaac demoed helper-functions:

  • DOMAIN daterange_active which is a daterange type plus a default value (current_date, null) and constrained to NOT NULL
  • And a trigger which will automatically delete data rows which have an empty range (start and end dates are the same).
    • which will be the default for all SAT tables.

That raised a number of questions:

  • Q: what if a user set the start/end dates to the same thing accidentally? A: for some stored functions, the user can't- we're not exposing the dates everywhere, just where it makes sense. Isaac says it doesn't make sense for sponsor_admin or sponsor_customer tables; we just let the user issue "create" or "delete" and set the start and end date automatically.
    • As far as users are concerned, the customer exists or it doesn't. The dates are not exposed unless we're looking at an audit-history.
    • Q: how about the use-case of creating a customer who starts being active in the future? A: Isaac doesn't see why we ever would. What's the use-case in the real data? (creating a sponsor who isn't available yet?) TO BE DETERMINED.
    • Q: how about a use-case of creating a customer who became active at a time in the past, eg., importing? A: if necessary, we could manually update that data. Probably not useful/necessary beyond first setup. TO BE REVISITED if necessary.
    • Q: what about the stored functions where the user does need to set start/end dates? A: TO BE DETERMINED. Will discuss later (when we have examples of those functions).
    • Q: what happens if the customer is deleted (manually or automatically) and they have associated billcodes, which have associated sponsorships? A: TO BE DETERMINED.

We spent a while discussing the more complex example of group_member_insert and group_member_delete.

  • By default these functions set the start and end dates automatically (as above).
    • Isaac's suggestion is that _insert with a specified date-range will create or expand the date ranges and _delete will restrict the date ranges.
    • But we want to be able to allow the end user to assert "make this person a group member for this start and end dates."
  • Q: what happens if they assert a date range that overlaps with an existing range? Does it change the existing one, or fail because it conflicts? A: Since this is a fairly complex operation, we may need to use the 2nd option, for now; adding more flexibility later. TO BE DETERMINED.
  • Isaac notes that changing dates in the past (changing history) is different than changing dates in the future (changing plans).
    • Q: do we need to allow arbitrary history changes at the user level? Why? Do we need to accomodate all kinds of complex multiple active ranges in the future? TO BE DETERMINED

Comments from Daniel:

  • The design including multiple historical active-date ranges for all kinds of data is very helpful for not losing "deleted" data. It also adds complexity to any table where start/end dates are meant to be updated by administrative users: group membership, diskquota, login, user-group, mail-alias. Do we need more than one active row in each table? Taking one example...
  • Is there a requirement that we allow setting diskquota at different values at different current or future time-periods? Or is it sufficient to have one current value but record history?
  • Is there a requirement that we record the diskquota history in the same table, or can we rely on the transactions table for that, and simply record a single diskquota that is set by the user?

Unless I'm mistaken, the old system did not have arbitrary future-dated sponsorship changes; so why not do away with the multiple active-date ranges for those tables where dates are edited by end-users? It would certainly be simpler.

We will meet again on Monday.

2020-11-23

Meeting between Daniel, Isaac, Niya, and Anna

Isaac reported on automatic group-of-group membership updates. He plans to split group-membership into two tables, one for "current/history" and one for "planned"; such that computed groups will calculate more easily.

  • Q: do we need arbitrary future group-of-groups computation? Isaac doesn't plan to compute them in advance. We looked at a few examples.
    • "instructors for CSxxx for next term" is represented by a sub-group which is set up and auto-populated for the term. We do not need an "Instructors for CSxxx" supergroup where we specify the date in question.
    • But "who are the members of research group X as of date Y" is a question we will need to answer, regularly. (Noting: "research group X" will definitely contain sub-groups, since each sponsorship requires a different sub-group. So we do not completely avoid needing to compute future groups)

We will allow one active date-range as described on the 20th. This greatly simplies a number of questions that were "TO BE DETERMINED":

  • stored procedures where a user needs to set start/end dates: we can set the start/end dates directly, to create or update the existing entry.
  • asserting a date-range that overlaps with an existing range: can't happen. There is only one.

We discussed the transaction log. Daniel to send Isaac details on what we want to be able to query and display.

We briefly discussed what happens if a customer is deleted (manually or automatically) and they have billcodes and sponsorships.

  • In the general case, deleting is setting an end-date, not deleting the row of data; the billcode can still exist in the database without violating integrity constraints. Do we want to propagate the end-date change, to the billcode and then to the sponsorships? Probably.
    • As of right now, the web UI doesn't display effective-dates for customers or bill-codes. But it does for sponsorships.
  • In the special case of deleting a customer on the same day as it was created, it would actually delete the data, and presumably delete any sponsorships.
    • Or if someone goes to the trouble of creating a batch of bill-codes, then deletes the customer, should it instead refuse to delete the customer because they have active data?
  • To be discussed with more end-users.

Since we don't have more examples of stored procedures, we won't yet schedule the meeting with team members. Will revisit on Wednesday.

2020-11-24

Notes from Daniel on the Transaction Log

The UI will offer a relevant transactions log on many pages, based on SATAccountsRequirements#Transactions spec above which explains fields from a UI perspective.

Here's an updated draft table as a basis for discussion.

CREATE TABLE transaction (
    transaction_id   serial NOT NULL,
    PRIMARY KEY (transaction_id),
    
    transaction_time timestamp NOT NULL,
    action           text NOT NULL,
    action_on        text NOT NULL,
    oldval           text,
    newval           text,
    actor            text NOT NULL,
    rt_id            integer,
    comment          text
    );

Comments

  • action is either the name of the caller stored procedure, or a human-readable version of it. (to be determined; perhaps we can go with just the name of the stored procedure. Such as "group_member_insert" rather than a more human-readable "Add member to group")
  • action_on is the field being updated
  • oldval is blank for additions to group; newval is blank for removals from group.
    • oldval will be supplied by a lookup by the transaction logger (somehow!)
  • actor is intelligently determined as described on 2020-11-06
  • Very likely there needs to be a row in the log for each field change (not just one row for each stored-procedure call).

2020-11-25

Meeting between Daniel, Isaac, Niya, and Anna.

Isaac demoed working stored procedures on dev server port 5503:

odyssey=> select sponsor_customer_create ('WatFORM');
[returns numeric ID]

odyssey=> select sponsor_billcode_create ('[same numberic ID]', 3002, 'WatFORM billcode'); -- 3002 is a customer-id
[ returns 1 ]
odyssey=> table sponsor_billcode;
[returns populated billcode rows]

odyssey=> select group_create ('manual_3', 'Test Manual Group 3', 1020); -- 1020 is a maintainer-id
[ returns numeric ID]

odyssey=> select group_member_insert ([same numeric ID], [a person_id] ); -- adding member to the manual group

odyssey=> select group_auto_create ('auto_3', 'Test Auto Group 3', 10); -- 3rd parameter is the "level", which must be specified.
[ returns numeric ID]

odyssey=> select group_auto_member ([same numeric id]);
(0 rows)

-- manual insert into group_depends to make the manual group a member of the automatic group 

odyssey=> select group_auto_member ([same numeric id]);
[ returns the manually added member from the sub-group ]

odyssey=> select group_member_insert_future ([manual group numeric id], [a person_id], daterange ('2021-01-01', '2021-05-01')); 

  • reviewed group_member_future table, similar to group_member with no pkey restrictions on member_active dates, and does not have triggers that recalculate active memberships or delete on empty daterange

  • Daniel will book a team meeting.

  • discussed transaction logging, which is a follow-up step to implement. Isaac says possibly a draft by Friday. Briefly looked at SATAccountsRequirements#Transactions spec.
    • "action" column: record the stored procedure name, and if we need, we can add a lookup table afterward.
    • Q: what does it mean to display a list-of-groups? There will be thousands. The UI will need filtering before that list is useful. A: agreed; will need filtering.

2020-11-27

Meeting between Daniel, Isaac, Niya, and Anna.

Isaac walked through his test.sql to describe functions. Following are notes which will be more useful when we have the schema available in git, which may be Monday.

Customer:

  • customer is available across the database.
  • a change/feature from the old system is we can support one customer in both MFCF and CSCF data.
    • they would have different billcodes for each.

Groups:

  • group name is unique. group_description is optional and non-unique.
    • open question: group name formatting for all the automated groups. TO BE DETERMINED.

group_auto_create() requires "level"

  • level = 0 for manual groups.
  • automatic groups: can only depend on groups whose levels are lower.
    • (recommended increment by 10s to leave room for insertions)
    • update a group's level: cannot set it to higher number than any of the dependencies, or lower than any dependent.
  • automatic groups are only composed of other groups, not members directly.
    • Will require UI implementation
  • Q: do we ever need a singleton group named for one person?
    • A: TO BE DETERMINED talking with clients.
    • Isaac asserts most of these cases will be better satisfied by a group that's named for what it's doing, not the person currently doing it. So it can be assigned to someone else when they are doing that thing.
    • Daniel asked about a case of CSCF testing vault access. Isaac suggested group "vault load testing".
    • the question isn't whether it isn't possible in the UI; can manually create groups for a person. It's whether the process should be facilitated.

Q: querying groups by name? A: select from group_current where name = 'whatever';

2020-11-30

Group meeting with Dave, Adrian, Clayton, Robyn, Lawrence, Isaac, Niya, Daniel.

Agenda

Reviewed SATAccountsAPI. Discussion with the following questions and and answers.

  • Q: To identify who should have an account and "is a student" - what's the definition of "is a student?"
    • A: a computed group of all users in whatever Math plans make sense. The computed group can include all users in the past two terms to match the current 2-term grace period.
  • Q: Where do home directory paths come from?
    • A: Not currently in the odyssey/SAT database. Since this isn't likely to be something you'd modify for individuals, it's probably best done programmatically.
      • Noting: currently these are recorded on a regional server, and INF intends to move away from doing that.
      • Also noting: Robyn indicates Math (CS/MF) are the only campus groups currently doing our own disk storage. Others are using OneDrive which offers 10tb per student. We don't currently have plans to mount that storage in our environment, though.
  • Q: Are there SAT groups representing sections of a course?
    • A: Not currently. If the need is to separate courses by content, Isaac currently has the ability to separate them in Odyssey.. If the need is to split apart roughly equal sized sections, we can talk to Dave/Clayton about what kinds of data is needed...
  • Q: Should individuals' accounts get a SAT unix group named for them automatically?
    • A: no, the Active Directory system can create them automatically.
  • Q: a change/feature from the old system is we can support one customer in both MFCF and CSCF data. This is OK?
    • A: yes, as far as we can determine today. (considering that billing is actually done via the billcode and we don't have to worry about billing CS users).
  • Q: do we ever need a singleton group named for one person?
    • A: not automatically.
  • Q: what is Dave's requirements for deployment?
    • A: they are using parts of the odyssey lookups already; development will continue at as fast a pace as possible, but it's not needed to be deployed this December.

Daniel will set a demo meeting for approximately a week from now.

Topic attachments
I Attachment History Action Size Date Who Comment
JPEGjpg 2020-01-23-SAT-API-consumers-for-accounts.jpg r1 manage 1543.2 K 2020-01-28 - 12:08 DanielAllen white-board: how API will be used for CS accounts generation
JPEGjpg IMG_7091_IMG_7092.jpg r1 manage 2123.1 K 2019-12-12 - 17:09 DanielAllen white-board: current accounts creation
JPEGjpg IMG_7093.JPG r1 manage 1574.9 K 2019-12-12 - 17:09 DanielAllen white-board: current accounts creation
JPEGjpg IMG_7094_IMG_7095.jpg r1 manage 2467.9 K 2019-12-12 - 17:09 DanielAllen white-board: current accounts creation
JPEGjpg IMG_7096.JPG r1 manage 1320.3 K 2019-12-12 - 17:09 DanielAllen white-board: current accounts creation
Edit | Attach | Watch | Print version | History: r72 < r71 < r70 < r69 < r68 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r72 - 2020-11-30 - DanielAllen
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2020 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback