BWG File System Monitoring
Latest revision as of 11:55, 6 March 2015
Lead: Liam Forbes
Members: Alan Wild, Andrew Uselton, Ben Evans, Cheng Shao, Jeff Garlough, Jeff Layton, Mark Nelson, Nic Henke
The task of the BWG File System Monitoring group is to:
- develop a list of existing parallel filesystem monitoring tools.
- identify their capabilities and add any others we think should exist.
- compare and contrast the tools to each other and the capabilities we think should exist.
Scenario:
You've just deployed a copy of Spider II (http://www.hpcwire.com/hpcwire/2013-08-16/spider_ii_emerges_to_give_ornl_a_big_speed_boost.html) at your site. Now you need to instrument it a) to detect if a component has failed, b) to ensure you are meeting your target performance numbers, and c) to determine what future improvements can be made.
- Where do you start?
- What information do you think you should collect?
- What tools/utilities/commands do you reach for?
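One possible starting point for goals a) and b) is sketched below, purely as an illustration. It assumes some collector has already produced one sample per component; the `Sample` type, its fields, the component names, and the target value are all hypothetical and do not come from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of a file system component (hypothetical schema)."""
    component: str          # illustrative name, e.g. "ost-a"
    responding: bool        # did the component answer a health probe?
    throughput_mb_s: float  # measured throughput in MB/s


def evaluate(samples, target_mb_s):
    """Sort component names into failed, below-target, and healthy groups."""
    report = {"failed": [], "below_target": [], "healthy": []}
    for s in samples:
        if not s.responding:
            # Goal a): the component did not answer, so treat it as failed.
            report["failed"].append(s.component)
        elif s.throughput_mb_s < target_mb_s:
            # Goal b): alive, but not meeting the performance target.
            report["below_target"].append(s.component)
        else:
            report["healthy"].append(s.component)
    return report


if __name__ == "__main__":
    # Made-up samples, only to show the call shape.
    samples = [
        Sample("ost-a", False, 0.0),
        Sample("ost-b", True, 50.0),
        Sample("ost-c", True, 150.0),
    ]
    print(evaluate(samples, target_mb_s=100.0))
```

Goal c) would need the same samples kept over time so that trends can be examined; that part is left out of this sketch.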
The Features List - what to monitor.
The Tools List - how to monitor.
An IO Process Model - to give an idea of the range of variables worth monitoring.
Status-ey Stuff
- LUG 2013 Report (final ppt): http://wiki.opensfs.org/images/e/ec/LUG_2013_OpenSFS_BWG_update.pdf
Return to Benchmarking Working Group page.