== DRAFT IN PROGRESS ==
This content has moved to the [http://wiki.lustre.org/Lustre_Monitoring_and_Statistics_Guide lustre.org Wiki].
== Introduction ==

A variety of useful statistics and counters are available on Lustre servers and clients. This guide is an attempt to document some of them.

The intended audience is system administrators who want to better understand and monitor their Lustre file systems.
== Lustre Versions ==

This information is based mostly on experience with Lustre 2.4 and 2.5.
== Reading /proc vs lctl ==

'cat /proc/fs/lustre...' vs 'lctl get_param'

With newer Lustre versions, 'lctl get_param' is the standard and recommended way to read these stats, because it ensures portability. I will use this method in all examples. As a bonus, the syntax is often a little shorter.
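For collecting these values from a script, a thin wrapper around 'lctl get_param' may be enough. The sketch below is only an illustration: the function name <code>lctl_get_param</code> and the injectable <code>run</code> argument are my own assumptions, not part of Lustre. It assumes 'lctl' is on the PATH and that the caller has permission to read the parameter.

```python
import subprocess


def lctl_get_param(pattern, run=subprocess.run):
    """Return the text printed by 'lctl get_param <pattern>'.

    'run' is injectable (hypothetical convenience) so the function can be
    exercised on a machine without Lustre installed.
    """
    result = run(
        ["lctl", "get_param", pattern],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Because the pattern is passed as a single argument without a shell, wildcards such as <code>osd-ldiskfs.*OST*.kbytesavail</code> reach lctl unexpanded, which is what you want here since lctl matches parameter patterns itself.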
== Data Formats ==

The format of the various statistics files varies (and I'm not sure whether there is any reason for this). The format names used here are entirely *my invention*; they are not a Lustre standard.

It is useful to know the formats of these files so you can parse the data and collect it for use in other tools.
=== Stats ===

In what I consider a "standard" stats file, each OST or MDT, for example, appears as a multi-line record: a line naming the target, followed by just the data.

Example:
<pre>
obdfilter.scratch-OST0001.stats=
snapshot_time 1409777887.590578 secs.usecs
read_bytes 27846475 samples [bytes] 4096 1048576 14421705314304
write_bytes 16230483 samples [bytes] 1 1048576 14761109479164
get_info 3735777 samples [reqs]
</pre>
snapshot_time = when the stats were written

For read_bytes and write_bytes:

* First number = number of times (samples) the OST has handled a read or write
* Second number = minimum read/write size
* Third number = maximum read/write size
* Fourth number = sum of all the read/write requests in bytes, i.e. the quantity of data read/written
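The field layout described above can be turned into a small parser. This is a minimal sketch in Python, assuming the input looks like the 'stats' example shown earlier (a header line ending in "=", then one counter per line); the function name <code>parse_stats</code> and the dictionary keys are my own choices for illustration.

```python
def parse_stats(text):
    """Parse 'stats'-format output into {target: {counter: fields}}."""
    targets = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith("="):
            # Header such as 'obdfilter.scratch-OST0001.stats='
            current = targets.setdefault(line[:-1], {})
            continue
        if current is None:
            continue
        parts = line.split()
        if parts[0] == "snapshot_time" and len(parts) >= 2:
            current["snapshot_time"] = float(parts[1])
            continue
        # Expected shape: <name> <count> samples [<unit>] [<min> <max> <sum>]
        if len(parts) < 4 or parts[2] != "samples":
            continue  # skip anything that does not look like a counter
        entry = {"samples": int(parts[1]), "unit": parts[3].strip("[]")}
        if len(parts) >= 7:
            entry["min"], entry["max"], entry["sum"] = (
                int(p) for p in parts[4:7]
            )
        current[parts[0]] = entry
    return targets
```

With the example above, <code>parse_stats(text)["obdfilter.scratch-OST0001.stats"]["read_bytes"]["sum"]</code> gives the total bytes read. Counters that only report a sample count, such as get_info, simply lack the min, max, and sum keys.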
=== Jobstats ===

Jobstats are slightly more complex multi-line records. Each OST or MDT also has an entry for each jobid (or perhaps procname_uid), followed by the data.

Example:
<pre>
obdfilter.scratch-OST0000.job_stats=job_stats:
- job_id: 56744
  snapshot_time: 1409778251
  read: { samples: 18722, unit: bytes, min: 4096, max: 1048576, sum: 17105657856 }
  write: { samples: 478, unit: bytes, min: 1238, max: 1048576, sum: 412545938 }
  setattr: { samples: 0, unit: reqs }
  punch: { samples: 95, unit: reqs }
- job_id: . . . ETC
</pre>
Notice that this is very similar to 'stats' above, but with a lot of extra: { bling: }! Why? Perhaps just because it got coded that way; the layout resembles YAML.
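A jobstats record can be parsed with a regular expression and no YAML library. The sketch below is an assumption-laden illustration, not a reference implementation: it assumes the input looks like the job_stats example in this section, and the names <code>parse_job_stats</code> and <code>FIELD</code> are hypothetical.

```python
import re

# Matches one '<op>: { key: value, ... }' group; a line may hold several.
FIELD = re.compile(r"(\w+):\s*\{([^}]*)\}")


def _convert(value):
    """Return an int for purely numeric text, otherwise the stripped text."""
    value = value.strip()
    return int(value) if value.isdigit() else value


def parse_job_stats(text):
    """Parse job_stats output into {target: {job_id: {op: fields}}}."""
    targets = {}
    jobs = None
    job = None
    for line in text.splitlines():
        line = line.strip()
        if ".job_stats=" in line:
            # Header such as 'obdfilter.scratch-OST0000.job_stats=job_stats:'
            jobs = targets.setdefault(line.split("=", 1)[0], {})
            job = None
            continue
        if jobs is None:
            continue
        if line.startswith("- job_id:"):
            # Job ids are kept as text, since they may not be numeric.
            job = jobs.setdefault(line.split(":", 1)[1].strip(), {})
            continue
        if job is None:
            continue
        if line.startswith("snapshot_time:"):
            job["snapshot_time"] = _convert(line.split(":", 1)[1])
            continue
        for op, body in FIELD.findall(line):
            fields = {}
            for item in body.split(","):
                key, sep, value = item.partition(":")
                if sep:
                    fields[key.strip()] = _convert(value)
            job[op] = fields
    return targets
```

Using <code>findall</code> per line means the parser copes whether each operation is on its own line or several share one line.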
=== Single ===

These really boil down to a single number in a file. If you use "lctl get_param", though, the output is convenient to parse. For example:
<pre>[COMMAND LINE]# lctl get_param osd-ldiskfs.*OST*.kbytesavail

osd-ldiskfs.scratch-OST0000.kbytesavail=10563714384
osd-ldiskfs.scratch-OST0001.kbytesavail=10457322540
osd-ldiskfs.scratch-OST0002.kbytesavail=10585374532
</pre>
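Output in this 'single' format is just name=value pairs, one per line, so parsing is short. A minimal sketch, assuming output like the example above; the function name <code>parse_single</code> is my own.

```python
def parse_single(text):
    """Parse 'name=value' lines into a dict, converting numeric values to int."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep or not value:
            continue  # skip blank lines and headers with nothing after '='
        values[name] = int(value) if value.isdigit() else value
    return values
```

For example, <code>sum(parse_single(text).values())</code> over the kbytesavail output above adds up the available space across the listed OSTs.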
=== Histogram ===

Some stats are histograms; these types are not covered here. They seem to be useful on their own, typically without further parsing(?)

* brw_stats
* extent_stats
== Scripts to Parse Data Formats ==

Here are some example Perl modules to help parse the various data formats. Better, faster, stronger scripts and methods are welcome.

== Interesting Statistics Files ==

This is a collection of stats files that I have found useful. It is *not* complete or exhaustive. Additions or corrections are welcome.

The table columns are host type, target, format, and discussion:

* Host Type = MDS, OSS, client
* Target = the parameter to pass, as in "lctl get_param target"
* Format = data format discussed above
{| class="wikitable"
|-
!Host Type !! Target !! Format !! Discussion
|-
| MDS || mdt.*MDT*.num_exports || single || Number of exports per MDT. These are clients, including other Lustre servers.
|-
| MDS || mdt.*.job_stats || jobstats || Metadata jobstats. Note that with Lustre DNE you may have more than one MDT, so even if you have only one it may be wise to design any tools with that assumption.
|-
| OSS || obdfilter.*.job_stats || jobstats || The per-OST jobstats.
|-
| MDS || mdt.*.md_stats || stats || Overall metadata stats per MDT.
|-
| MDS || mdt.*MDT*.exports.*@*.stats || stats || Per-export metadata stats. Exports are clients, including other Lustre servers. The exports are named by interface, which can be unwieldy. See "lltop" for an example of a script that used this data well. The sum of all the export stats should provide the same data as md_stats, but it is still very convenient to have md_stats; "ltop" uses them, for example.
|-
| OSS || obdfilter.*.stats || stats || Operations per OST. Read and write data is particularly interesting.
|-
| OSS || obdfilter.*OST*.exports.*@*.stats || stats || Per-export OSS statistics.
|-
| MDS || osd-*.*MDT*.filesfree or filestotal || single || Available or total inodes.
|-
| MDS || osd-*.*MDT*.kbytesfree or kbytestotal || single || Available or total disk space.
|-
| OSS || obdfilter.*OST*.kbytesfree or kbytestotal, filesfree, filestotal || single || Inodes and disk space, as in the MDS version.
|-
| OSS || ldlm.namespaces.filter-*.pool.stats || stats || Lustre distributed lock manager (ldlm) stats. I do not fully understand all of these stats. It also appears that the same stats are duplicated as individual 'single' files. Perhaps this is just a convenience.
|-
| OSS || ldlm.namespaces.filter-*.lock_count || single || Lustre distributed lock manager (ldlm) locks.
|-
| OSS || ldlm.namespaces.filter-*.pool.granted || single || ldlm granted locks. Normally this matches lock_count. I am not sure what the differences are, or what it means when they don't match.
|-
| OSS || ldlm.namespaces.filter-*.pool.grant_plan || single || ldlm planned number of granted locks (see 'glossary' in http://wiki.lustre.org/doxygen/HEAD/api/html/ldlm__pool_8c_source.html).
|-
| OSS || ldlm.namespaces.filter-*.pool.grant_rate || single || ldlm lock grant rate, aka 'GR'.
|-
| OSS || ldlm.namespaces.filter-*.pool.grant_speed || single || ldlm lock grant speed = grant_rate - cancel_rate. You can use this to derive cancel_rate, aka 'CR'.
|}