Inventory Monitoring Updates Spring 2019

Summary

Devon has a term goal of moving Nagios to Ubuntu 18.04. This includes the Nagios API layer written by Dennis, which consumes data from Inventory. Devon has made quick progress by starting with Icinga, a free, open-source tool based on Nagios4, with many external packages built in and with developer support. (For the rest of this spec, "Nagios" and "Icinga" both mean Icinga.) What remains is the API, which Devon would like to re-implement rather than port Dennis's code directly. Devon and Justin are working on the API with help from Daniel.

RT

/RT#977315 - Spring 2019 goal: Update Nagios-Inventory interface to 18.04
  • /RT#965909 - Inventory updates to assist this goal.

Design

Design details mostly depend on API specs. Details to be filled in later.

Meetings

2019-07-16

Meeting with Devon, Justin, Daniel to discuss basics.

2019-07-23

Devon reports that he has finished the setup that replaces Nagios4 with Icinga. Icinga includes packages for integrations, a modern UI with mobile support, and a very complete API.

2019-07-29

Meeting with Devon, Justin, Daniel to discuss API and inventory requirements. Nathan also attended.

Devon has three requests for inventory enhancements before end-of-term. #3 is the highest priority and the other two are nice-to-have by end of term.

  1. Inventory "Services" will pass host IP and host name to Nagios. Devon also wants to send arbitrary additional variable values for certain custom monitoring (which requires UI additions). An example would be confirming web results at a specific web address rather than just at the hostname of the web server.
  2. Can we make the notifications more flexible, to send to more than one email address/email list that's overloaded with DNS configuration? Possibly validate and use the "Auth User" field? Daniel says this would require much more discussion of requirements by all CSCF users, so it is likely out of scope for this term.
  3. Parenting/Dependencies integration with Inventory: Inventory's parenting features would be useful to propagate to Icinga, so monitoring can be aware that if a switch reports a problem, we don't need to report a connection problem for all child machines.
    • A complication is that multiple concepts of "parent" make sense in different contexts: "network parenting", "physical enclosure parenting", and "power supply parenting" are each potentially different.
    • Inventory supports multiple parents.
    • We are starting with network parenting. Power supply parenting is out of scope for now.
    • Note that getting this working simplifies requirements for #2; for example, it makes it easy to notify the right "owner" for a switch rather than sending emails for each connected host.
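
The parenting idea in #3 can be illustrated with a minimal Python sketch. It is not the planned implementation, and it does not use the Icinga or Inventory APIs. The record layout, the host names, and the function names `network_parents` and `hosts_to_notify` are all hypothetical. It also assumes the classic Nagios-style rule that a down host is treated as unreachable (and so not reported) only when it has network parents and every one of them is also down.

```python
# Hypothetical Inventory export: each host lists (kind, parent) pairs.
# The layout and host names are illustrative assumptions, not the real schema.
INVENTORY = {
    "switch-a": {"parents": []},
    "switch-b": {"parents": [("network", "switch-a")]},
    "web1": {"parents": [("network", "switch-a"), ("power", "pdu-1")]},
    "web2": {"parents": [("network", "switch-a"), ("network", "switch-c")]},
    "db1": {"parents": [("network", "switch-b"), ("enclosure", "rack-3")]},
}


def network_parents(inventory):
    """Keep only "network" parents; enclosure and power parents are ignored."""
    return {
        host: sorted(p for kind, p in record["parents"] if kind == "network")
        for host, record in inventory.items()
    }


def hosts_to_notify(parents, down):
    """Return the down hosts that should still be reported.

    A down host is suppressed when it has at least one network parent and
    all of its network parents are also down (assumed Nagios-style rule).
    """
    suppressed = {
        host
        for host in down
        if parents.get(host) and all(p in down for p in parents[host])
    }
    return set(down) - suppressed


if __name__ == "__main__":
    parents = network_parents(INVENTORY)
    down = {"switch-a", "switch-b", "web1", "web2", "db1"}
    print(sorted(hosts_to_notify(parents, down)))
```

In this sketch, web2 is still reported because its second network parent (switch-c) is not down, which shows how multiple parents interact with suppression. A host whose only parents are non-network ones is always reported, matching the decision to start with network parenting only.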

Other notes:

  • rename "Services" section to "Monitoring"
  • Icinga has integrations with Salt for receiving downtime info. Nathan sees this as useful for RSG, such as for scheduling downtime for servers about to be reconfigured under Salt. To be investigated further, later.
  • https://icinga.com/2010/11/03/a-lesson-in-zulu-icinga/

Devon is continuing his development work this week, with support from Justin. -- DanielAllen - 2019-07-29

Topic revision: r1 - 2019-07-29 - DanielAllen