Lustre Monitoring and Statistics Guide

== Introduction ==
 
There are a variety of useful statistics and counters available on Lustre servers and clients. This guide attempts to detail some of them.

The intended audience is system administrators who want to better understand and monitor their Lustre file systems.
 
== Lustre Versions ==
 
This information is based mostly on experience with Lustre 2.4 and 2.5.
 
== Reading /proc vs lctl ==
 
There are two ways to read these statistics: 'cat /proc/fs/lustre/...' or 'lctl get_param'. With newer Lustre versions, 'lctl get_param' is the standard and recommended way, since it ensures portability if the underlying file locations change between versions. I will use this method in all examples; as a bonus, the syntax is often a little shorter.
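
For example, the same per-OST stats file can be read either way (the target name here is illustrative):
<pre>
# Reading the /proc file directly (path layout can vary between versions):
cat /proc/fs/lustre/obdfilter/scratch-OST0000/stats

# The portable equivalent:
lctl get_param obdfilter.scratch-OST0000.stats
</pre>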
 
== Data Formats ==
The format of the various statistics files varies (and I'm not sure there is any reason for this). The format names used here are entirely *my invention*; they are not any kind of Lustre standard.

It is useful to know the various formats of these files so you can parse the data and collect it for use in other tools.
 
=== Stats ===
 
What I consider a "standard" stats file presents each OST or MDT as a multi-line record: a header line naming the target, followed by one line per counter.
 
Example:
<pre>
obdfilter.scratch-OST0001.stats=
snapshot_time            1409777887.590578 secs.usecs
read_bytes                27846475 samples [bytes] 4096 1048576 14421705314304
write_bytes              16230483 samples [bytes] 1 1048576 14761109479164
get_info                  3735777 samples [reqs]
</pre>
 
* snapshot_time = when the stats were written

For read_bytes and write_bytes:
* First number = number of times (samples) the OST has handled a read or write
* Second number = the minimum read/write size in bytes
* Third number = the maximum read/write size in bytes
* Fourth number = the sum of all the read/write requests in bytes, i.e. the total quantity of data read/written
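
As a minimal parsing sketch (assuming the output layout shown above), the average read size per OST can be derived from the samples and sum fields:
<pre>
# $2 = samples, $7 = sum in bytes; the header line carries the target name.
lctl get_param obdfilter.*.stats | awk '
  /\.stats=/    { target = $1 }
  /^read_bytes/ { printf "%s avg read: %.0f bytes\n", target, $7 / $2 }'
</pre>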
 
=== Jobstats  ===
 
Jobstats are slightly more complex multi-line records. Each OST or MDT has an entry for each jobid (or procname_uid, depending on how jobstats is configured), followed by the data.
 
Example:
<pre>
obdfilter.scratch-OST0000.job_stats=job_stats:
- job_id:          56744
  snapshot_time:  1409778251
  read:            { samples:      18722, unit: bytes, min:    4096, max: 1048576, sum:    17105657856 }
  write:          { samples:        478, unit: bytes, min:    1238, max: 1048576, sum:      412545938 }
  setattr:        { samples:          0, unit:  reqs }
  punch:          { samples:         95, unit:  reqs }
- job_id: . . . ETC
</pre>
 
Notice this is very similar to 'stats' above. But there's a lot of extra: { bling: }! Why? The output is YAML, so it can be handed to any standard YAML parser.
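
If you just want a quick look rather than a full YAML parse, here is a rough sketch (field positions assume the layout shown above):
<pre>
# Print total bytes written per job on one OST (not robust YAML parsing).
lctl get_param obdfilter.scratch-OST0000.job_stats | awk '
  /job_id:/ { job = $NF }
  /write:/  { gsub(/[,}]/, ""); print "job", job, "wrote", $NF, "bytes" }'
</pre>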
 
=== Single ===
 
These really boil down to just a single number in a file. But if you use "lctl get_param", the output includes the parameter name, which is nice for parsing. For example:
<pre>
# lctl get_param osd-ldiskfs.*OST*.kbytesavail
osd-ldiskfs.scratch-OST0000.kbytesavail=10563714384
osd-ldiskfs.scratch-OST0001.kbytesavail=10457322540
osd-ldiskfs.scratch-OST0002.kbytesavail=10585374532
</pre>
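
Since each line is a simple name=value pair, aggregating across targets is easy. A sketch that sums available space over all OSTs:
<pre>
# Sum kbytesavail across OSTs; 2^30 KiB = 1 TiB.
lctl get_param osd-ldiskfs.*OST*.kbytesavail | awk -F= '{ kb += $2 } END { printf "total available: %.1f TiB\n", kb / 2^30 }'
</pre>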
 
=== Histogram ===
 
Some stats are histograms; these types aren't covered here. Typically they're useful on their own, without further parsing.
 
 
* brw_stats
* extent_stats
 
 
 
== Interesting Statistics Files  ==
 
This is a collection of various stats files that I have found useful. It is *not* complete or exhaustive; for example, you will notice these are mostly server stats. There is a wealth of client stats too, not detailed here. Additions or corrections are welcome.
 
* Host Type = MDS, OSS, or client
* Target = the parameter name passed to "lctl get_param"
* Format = the data format, as discussed above
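
Most of these counters are cumulative, so rates come from sampling twice and differencing. A minimal sketch for metadata opens per second, assuming md_stats contains an "open" counter in the stats format shown earlier:
<pre>
# Metadata opens per second across all MDTs, sampled over 10 seconds.
a=$(lctl get_param -n mdt.*.md_stats | awk '/^open/ { n += $2 } END { print n+0 }')
sleep 10
b=$(lctl get_param -n mdt.*.md_stats | awk '/^open/ { n += $2 } END { print n+0 }')
echo "open/sec: $(( (b - a) / 10 ))"
</pre>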
 
{| class="wikitable"
|-
!Host Type !! Target !! Format !! Discussion
|-
| MDS || mdt.*MDT*.num_exports || single || Number of exports per MDT. These are clients, including other Lustre servers.
|-
| MDS || mdt.*.job_stats || jobstats || Metadata jobstats. Note that with Lustre DNE you may have more than one MDT; even if you don't, it may be wise to design any tools with that assumption.
|-
| OSS || obdfilter.*.job_stats || jobstats || The per-OST jobstats.
|-
| MDS || mdt.*.md_stats || stats || Overall metadata stats per MDT
|-
| MDS || mdt.*MDT*.exports.*@*.stats || stats || Per-export metadata stats. Exports are clients, including other Lustre servers. The exports are named by interface, which can be unwieldy. See "lltop" for an example of a script that uses this data well. The sum of all the export stats should provide the same data as md_stats, but having md_stats is still very convenient; "ltop" uses it, for example.
|-
| OSS || obdfilter.*.stats || stats || Operations per OST. The read and write data are particularly interesting.
|-
| OSS || obdfilter.*OST*.exports.*@*.stats || stats || per-export OSS statistics
|-
| MDS || osd-*.*MDT*.filesfree or filestotal || single || available or total inodes
|-
| MDS || osd-*.*MDT*.kbytesfree or kbytestotal || single || available or total disk space
|-
| OSS || obdfilter.*OST*.kbytesfree or kbytestotal, filesfree, filestotal || single || Inodes and disk space, as in the MDS versions.
|-
| OSS || ldlm.namespaces.filter-*.pool.stats || stats || Lustre distributed lock manager (LDLM) stats. I do not fully understand all of these stats. It also appears that the same values are duplicated as individual single-value files (see the rows below); perhaps this is just a convenience.
|-
| OSS || ldlm.namespaces.filter-*.lock_count || single || Lustre distributed lock manager (LDLM) locks.
|-
| OSS || ldlm.namespaces.filter-*.pool.granted || single || LDLM granted locks. Normally this matches lock_count; I am not sure what the difference is, or what it means when they don't match.
|-
| OSS || ldlm.namespaces.filter-*.pool.grant_plan || single || Planned number of granted locks (see the 'glossary' in http://wiki.lustre.org/doxygen/HEAD/api/html/ldlm__pool_8c_source.html).
|-
| OSS || ldlm.namespaces.filter-*.pool.grant_rate || single || LDLM lock grant rate, a.k.a. 'GR'.
|-
| OSS || ldlm.namespaces.filter-*.pool.grant_speed || single || LDLM lock grant speed = grant_rate - cancel_rate. You can use this to derive the cancel rate 'CR' (see the sketch below the table), or presumably read 'CR' directly from the pool stats file.
|}
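
As a sketch of the grant_speed relationship in the last row (parameter names as in the table):
<pre>
# cancel_rate = grant_rate - grant_speed, derived per LDLM namespace.
for ns in $(lctl list_param ldlm.namespaces.filter-*); do
  gr=$(lctl get_param -n $ns.pool.grant_rate)
  gs=$(lctl get_param -n $ns.pool.grant_speed)
  echo "$ns cancel_rate: $((gr - gs))"
done
</pre>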
 
== Tools and Techniques ==
 
 
 
== References and Links ==
 
 
* Daniel Kobras, "Lustre - Finding the Lustre Filesystem Bottleneck", LAD2012. http://www.eofs.eu/fileadmin/lad2012/06_Daniel_Kobras_S_C_Lustre_FS_Bottleneck.pdf
* Florent Thery, "Centralized Lustre Monitoring on Bull Platforms", LAD2013. http://www.eofs.eu/fileadmin/lad2013/slides/11_Florent_Thery_LAD2013-lustre-bull-monitoring.pdf
* Daniel Rodwell and Patrick Fitzhenry, "Fine-Grained File System Monitoring with Lustre Jobstat", LUG2014. http://www.opensfs.org/wp-content/uploads/2014/04/D3_S31_FineGrainedFileSystemMonitoringwithLustreJobstat.pdf
* Gabriele Paciucci and Andrew Uselton, "Monitoring the Lustre* file system to maintain optimal performance", LAD2013. http://www.eofs.eu/fileadmin/lad2013/slides/15_Gabriele_Paciucci_LAD13_Monitoring_05.pdf
* Christopher Morrone, "LMT Lustre Monitoring Tools", LUG2011. http://cdn.opensfs.org/wp-content/uploads/2012/12/400-430_Chris_Morrone_LMT_v2.pdf
 
* lltop: https://github.com/jhammond/lltop
* LMT (Lustre Monitoring Tool): https://github.com/chaos/lmt
* Cerebro: https://github.com/chaos/cerebro
* Graphite: http://graphite.readthedocs.org/en/latest/
* Check_MK: https://mathias-kettner.de/check_mk
* Graphios: https://github.com/shawn-sterling/graphios
