Meeting 3 March 2016, 2pm

Attended: drallen a2brenna fhgunn ldpaniak

Agenda:

  • Progress on Milestones
  • Timeline
  • Brief summary since the last meeting

Progress on Milestones:

  • Backups set up and tested: due date reset to next Wednesday. Remaining: testing restore.

Timeline

  • Fraser and Daniel are testing and re-moving inventory; likely not finished until the middle of next week (9th or so).
    • Schedule shows wrapup March 31 (!)
      • where the last 4 days are wrapup, and the 5 days before that are for finalizing monitoring, tuning, and documentation.
      • we're actually doing all three as we go, so they will take less time at the end. Good.
  • Aiming for:
    • by next Wednesday (9 Mar): finish inventory move
    • by following Thursday (17 Mar): finish move of remainder of databases
    • by Friday 25 Mar: parts involving Fraser (he is away the first week of April, and has trains work the last week in March).

Brief summary since the last meeting, and upcoming week ("this week")

  • slave grants: Anthony has updated to use hostnames.
  • Fraser worked on cleaning up permissions and the process for re-importing (prod changes)
  • Fraser/Daniel worked on failover testing, which led to a number of discoveries:
    • the default binlog_format=STATEMENT may respond to failover differently than MIXED ("best practice"). Switched to MIXED in the live setup; this will need failover testing (Daniel), and we will set up a frequent percona checksum (daily? hourly?) (Fraser)
    • slaves are not read-only, and updates made on a slave are NOT copied to the master; so we probably want to set read-only ("best practice") and skip networking (which would still allow mysql connections via localhost) (Fraser)
    • master/master/slave has failure modes, so we probably want to go with master/slave/slave. (Anthony)
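
A minimal sketch of how the settings discussed above could be checked on a slave, assuming they end up in the [mysqld] section of a my.cnf-style file. The function name check_replica_config and the RECOMMENDED table are hypothetical, and the "wanted" values reflect only what these notes lean toward, not a final decision.

```python
import configparser

# Assumed targets, taken from the discussion above (not a final decision).
# A value of None means "the flag only has to be present".
RECOMMENDED = {
    "binlog_format": {"MIXED"},
    "read_only": {"1", "ON", "TRUE"},
    "skip_networking": None,
}


def check_replica_config(text, section="mysqld"):
    """Return a list of problems found in a my.cnf-style string."""
    # configparser cannot parse "!include" / "!includedir" directives, so drop them.
    cleaned = "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("!")
    )
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    # MySQL option files accept "read-only" and "read_only" interchangeably.
    parser.optionxform = lambda key: key.lower().replace("-", "_")
    parser.read_string(cleaned)

    if not parser.has_section(section):
        return [f"missing [{section}] section"]

    problems = []
    for key, wanted in RECOMMENDED.items():
        if not parser.has_option(section, key):
            problems.append(f"{key} is not set")
            continue
        actual = parser.get(section, key)
        if wanted is not None and (actual or "").upper() not in wanted:
            problems.append(f"{key} is {actual!r}, expected one of {sorted(wanted)}")
    return problems
```

For example, check_replica_config(open("/etc/mysql/my.cnf").read()) would return an empty list when all three settings are in place. The sketch only reads one file and does not follow includes, so settings kept under /etc/mysql/conf.d/ would need to be checked separately.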

  • Anthony: restore from backups this week - using Fraser's documentation as base, and save edited docs.
  • Fraser will be updating configs this week
  • Anthony: update salt to reflect recent changes - this week
  • nagios checks: Lori to implement this week.
  • Anthony: switch the machine to 10Gb (requiring restarting containers) - probably ST#102302
  • Backups: Anthony will deb-ify Fraser's mysql-102:/usr/local/lib/mysql scripts and install on mysql{,.student}.cs and mysql-10{2,4,6}

  • Anthony: documentation will be sufficient that, given a dead master, someone can switch to another master.

Still to do later:

  • Fraser: running mysqldumpexcept is quick. Run it twice, 10 seconds apart; if the two outputs are identical (empty diff), we can use that timestamp.
    • how do we re-run import of permissions without clobbering mysql table?
      • percona command to output grants
        • but mysql.cs has lots of junk: grants for tables/users that don't exist, so Fraser is writing a script to trim these.
    • does nscd run? We don't think so; probably we want to run nscd. Correction: Fraser and Devon discovered that mysql has its own cache, which makes clear that we should NOT run nscd.
  • Anthony will review /etc/mysql/conf.d/ and merge into salt what he can.
    • Salt and mysql: best practices for password storage? Anthony suggests making a more secure password to be stored on slave... Anthony will make an ST item to do that before we're done. (/etc/mysql/conf/slave.conf ?)
  • Anthony will add marmoset to the slave "do NOT replicate" configs
  • Fraser will update the written list of instructions for doing stage two. He will put it in ST for comment.
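
The "run twice, 10 seconds apart" idea from the first item above can be sketched roughly as follows. This is an illustration only: stable_dump and its dump argument are hypothetical names, and how mysqldumpexcept is actually invoked is not covered here. The sketch also assumes the dump output contains no embedded timestamp (for plain mysqldump, --skip-dump-date avoids one), since otherwise two runs would never compare equal.

```python
import time
from datetime import datetime, timezone


def stable_dump(dump, wait_seconds=10, max_attempts=3, sleep=time.sleep):
    """Repeat a dump until two consecutive runs are byte-for-byte identical.

    `dump` is a hypothetical zero-argument callable returning the dump as bytes.
    Returns (timestamp, data) for the matching pair, or None if the data kept
    changing for max_attempts comparisons.
    """
    previous = dump()
    for _ in range(max_attempts):
        sleep(wait_seconds)
        # Assumption: any moment between two identical dumps is a usable
        # timestamp, so record the time just before the second run starts.
        stamp = datetime.now(timezone.utc)
        current = dump()
        if current == previous:
            return stamp, current
        previous = current
    return None
```

A caller might pass something like lambda: subprocess.run(cmd, capture_output=True, check=True).stdout as dump, where cmd is whatever command line the real script needs. The sleep parameter exists only so the wait can be replaced in tests.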

Next Meeting

  • Wed 9 Mar? (day before retreat)

-- DanielAllen - 2016-03-02

Topic revision: r3 - 2016-03-03 - DanielAllen