UW SSEC Lustre Statistics How-To
Introduction
This guide will take the user step-by-step through the Lustre Monitoring deployment that the Space Science and Engineering Center uses for monitoring all of its Lustre file systems. The author of this guide is Andrew Wagner ([email protected]).
Building the Lustre Monitoring Deployment
Setting up an OMD Monitoring Server
The first thing that we needed for our new monitoring deployment was a monitoring server. We were already using Check_MK with Nagios on our older monitoring server but the Open Monitoring Distribution nicely ties all of the components together. The distribution is available at http://omdistro.org/ and installs via RPM.
On a newly deployed Centos6 machine, I installed the OMD-1.20 RPM. This takes care of all of the work of installing Nagios, Check_MK, PNP4Nagios, etc.
After installation, I created the new OMD monitoring site:
omd create ssec
This creates a new site that runs its own stack of Apache, Nagios, Check_MK and everything else in the OMD distribution. Now we can start the site:
omd start ssec
You can now nagivate to http://example.fqdn.com/sitename of your server, i.e. http://example.ssec.wisc.edu/ssec and login with the default OMD credentials.
We chose to setup LDAPS authentication versus our Active Directory server to manage authentication. There is a good discussion of how to do this here: https://mathias-kettner.de/checkmk_multisite_ldap_integration.html
Additionally, we setup HTTPS for our web access: http://lists.mathias-kettner.de/pipermail/checkmk-en/2014-May/012225.html
At this point, you can start configuring your monitoring server to monitor hosts! Check_MK has a lot of configuration options, but it's a lot better than managing Nagios configurations by hand.