UW SSEC Lustre Statistics How-To

From OpenSFS Wiki
Revision as of 11:55, 3 February 2015

Introduction

This guide will take the user step-by-step through the Lustre Monitoring deployment that the Space Science and Engineering Center uses for monitoring all of its Lustre file systems. The author of this guide is Andrew Wagner ([email protected]).

Building the Lustre Monitoring Deployment

Setting up an OMD Monitoring Server

The first thing that we needed for our new monitoring deployment was a monitoring server. We were already using Check_MK with Nagios on our older monitoring server, but the Open Monitoring Distribution (OMD) nicely ties all of the components together. The distribution is available at http://omdistro.org/ and installs via RPM.

On a newly deployed CentOS 6 machine, I installed the OMD-1.20 RPM. This takes care of all of the work of installing Nagios, Check_MK, PNP4Nagios, etc.
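
As a rough sketch of the install step, something like the following could be used. The RPM file name in OMD_RPM is a placeholder assumption (use the name of the file you actually downloaded), and the run helper with its DRY_RUN switch is purely illustrative: by default it only prints the commands rather than executing them.

```shell
#!/bin/sh
# Sketch: install the OMD RPM on a CentOS 6 host.
# OMD_RPM is a hypothetical file name, not the real package name.
OMD_RPM="${OMD_RPM:-./omd-1.20.rpm}"
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them as root.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

install_omd() {
    # Let yum install the local RPM and resolve its dependencies from the configured repos.
    run yum -y install "$OMD_RPM"
    # Confirm that the omd command is available afterwards.
    run omd version
}

install_omd
```

Depending on the repositories configured on the host, some dependencies may need an additional repository before yum can resolve them.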

After installation, I created the new OMD monitoring site:

omd create ssec

This creates a new site that runs its own stack of Apache, Nagios, Check_MK and everything else in the OMD distribution. Now we can start the site:

omd start ssec
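
The two commands above can be collected into a small script, sketched here with an added omd status call to check that the site's services came up. The SITE variable, the run helper and the DRY_RUN switch are illustrative assumptions, not part of the original deployment; with the default DRY_RUN=1 the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: create, start and check an OMD site.
SITE="${SITE:-ssec}"
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them as root.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_site() {
    run omd create "$SITE"   # create the site and its own service stack
    run omd start "$SITE"    # start Apache, Nagios, Check_MK, etc. for the site
    run omd status "$SITE"   # report which of the site's services are running
}

setup_site
```

Once the site is running, its web interface should normally be reachable under the site name on the monitoring server, for example http://your-server/ssec/ (the host name here is a placeholder).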


Deploying Agents to Lustre Hosts

Writing Local Checks to Run via Agents

Check_MK RRD Graphs

Deploying Graphite/Carbon

Deploying Grafana

Using Graphios to Redirect Lustre Stats to Carbon