Lustre Monitoring and Statistics Guide: Difference between revisions

From OpenSFS Wiki
(Created page with "== DRAFT IN PROGRESS == == Introduction == what this is about == Reading /proc vs lctl == 'cat /proc/fs/lustre...' vs 'lctl get_param' With newer lustre versions, 'lctl ...")
 
(Replaced content with "This content has moved to the [http://wiki.lustre.org/Lustre_Monitoring_and_Statistics_Guide lustre.org Wiki].")
 
(41 intermediate revisions by one other user not shown)
== DRAFT IN PROGRESS ==
This content has moved to the [http://wiki.lustre.org/Lustre_Monitoring_and_Statistics_Guide lustre.org Wiki].
 
 
== Introduction ==
 
what this is about
 
 
== Reading /proc vs lctl ==
 
'cat /proc/fs/lustre...' vs 'lctl get_param'
With newer Lustre versions, 'lctl get_param' is the standard and recommended way to get these stats; this ensures portability. I will use this method in all examples. A bonus is that the syntax is often a little shorter.
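
As a rough illustration of how the two forms relate, here is a minimal Python sketch. It assumes the usual convention that an 'lctl get_param' name is the path below /proc/fs/lustre/ with slashes replaced by dots. The helper names (proc_path_to_param, get_param) and the example path are hypothetical, made up for this illustration.

```python
import subprocess

# Assumed root of the Lustre /proc tree.
PROC_ROOT = "/proc/fs/lustre/"


def proc_path_to_param(path):
    """Convert a /proc/fs/lustre/... path into an lctl parameter name."""
    if not path.startswith(PROC_ROOT):
        raise ValueError("not a Lustre /proc path: " + path)
    # Drop the root, then turn path separators into dots.
    return path[len(PROC_ROOT):].strip("/").replace("/", ".")


def get_param(name):
    """Run 'lctl get_param NAME' and return its text output.

    This only works on a node where Lustre and lctl are installed.
    """
    result = subprocess.run(
        ["lctl", "get_param", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical example path, converted without calling lctl.
print(proc_path_to_param("/proc/fs/lustre/obdfilter/scratch-OST0000/job_stats"))
```

Only get_param() needs a live Lustre node; the name conversion can be tried anywhere.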
 
== Data Formats ==
The format of the statistics files varies from one type to another (and I'm not sure there is any reason for this). The format names used here are entirely *my invention*; they are not a Lustre standard.
 
=== jobstats ===
Jobstats are multi-line records. Each OST or MDT has one entry per job ID (or hostname, or whatever identifier is used to collect job stats). Example:
<pre>
obdfilter.scratch-OST0000.job_stats=job_stats:
- job_id:          56744
  snapshot_time:  1409778251
  read:            { samples:      18722, unit: bytes, min:    4096, max: 1048576, sum:    17105657856 }
  write:          { samples:        478, unit: bytes, min:    1238, max: 1048576, sum:      412545938 }
  setattr:        { samples:          0, unit:  reqs }
  punch:          { samples:          95, unit:  reqs }
- job_id: .... etc
</pre>
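
A multi-line record like the one above can be turned into a data structure with a few lines of code. The following is a minimal sketch, not an official parser: it assumes each record starts with a "- job_id:" line and that every counter sits on its own line in the "{ key: value, ... }" form shown in the example. The function names (parse_counter, parse_job_stats) are my own.

```python
import re

# Sample text copied from the jobstats example, one field per line.
SAMPLE = """\
obdfilter.scratch-OST0000.job_stats=job_stats:
- job_id:          56744
  snapshot_time:  1409778251
  read:            { samples:      18722, unit: bytes, min:    4096, max: 1048576, sum:    17105657856 }
  write:          { samples:        478, unit: bytes, min:    1238, max: 1048576, sum:      412545938 }
  setattr:        { samples:          0, unit:  reqs }
  punch:          { samples:          95, unit:  reqs }
"""

# Matches "key: value", with an optional leading "- " that starts a record.
FIELD_RE = re.compile(r"^\s*(?:-\s+)?(\w+):\s*(.*?)\s*$")


def parse_counter(text):
    """Turn '{ samples: 18722, unit: bytes, ... }' into a dict."""
    counter = {}
    for item in text.strip("{} ").split(","):
        key, _, value = item.partition(":")
        value = value.strip()
        counter[key.strip()] = int(value) if value.isdigit() else value
    return counter


def parse_job_stats(text):
    """Return a list with one dict per job_id record found in text."""
    jobs = []
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue  # e.g. the "obdfilter...job_stats=job_stats:" header
        key, value = match.groups()
        if key == "job_id":
            # Keep the id as a string: it may be a hostname, not a number.
            jobs.append({"job_id": value})
        elif jobs and value.startswith("{"):
            jobs[-1][key] = parse_counter(value)
        elif jobs and value:
            jobs[-1][key] = int(value) if value.isdigit() else value
    return jobs


jobs = parse_job_stats(SAMPLE)
print(jobs[0]["job_id"], jobs[0]["read"]["sum"])
```

Keeping each counter as its own dict makes it easy to pull out a single value, such as the bytes read by one job, for graphing or alerting.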

Latest revision as of 15:14, 3 June 2015

This content has moved to the lustre.org Wiki.