BWG File System Monitoring
Revision as of 09:19, 20 September 2013
The task of the BWG File System Monitoring group is to:
- develop a list of existing parallel filesystem monitoring tools.
- identify their capabilities, and any others we think should exist.
- compare the tools with each other and against the capabilities we think should exist.
Scenario:
You've just deployed a copy of Spider II (http://www.hpcwire.com/hpcwire/2013-08-16/spider_ii_emerges_to_give_ornl_a_big_speed_boost.html) at your site. Now you need to instrument it a) to detect if a component has failed, b) to ensure you are meeting your target performance numbers, and c) to determine what future improvements can be made.
- Where do you start?
- What information do you think you should collect?
- What tools/utilities/commands do you reach for?
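One possible starting point, given here only as a minimal sketch, is to reduce the three goals in the scenario to a single summary over collected samples: which components look failed, whether aggregate throughput meets the target, and which healthy component is slowest (a candidate for future improvement). Everything in it is a hypothetical assumption for illustration: the `Sample` record, the `summarize` function, the component names, and the idea that a site-specific probe has already produced a health flag and a throughput figure per component. It does not reflect any particular tool from the Tools List.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One observation of one storage component (hypothetical schema)."""
    component: str         # assumed component name, e.g. "oss01"
    healthy: bool          # result of whatever health probe the site uses
    throughput_gbs: float  # measured throughput in GB/s


def summarize(samples, target_gbs):
    """Summarize samples against the three goals in the scenario."""
    # a) Failure detection: any component with at least one unhealthy sample.
    failed = sorted({s.component for s in samples if not s.healthy})

    # Average throughput per component, using healthy samples only.
    per_component = {}
    for s in samples:
        if s.healthy:
            per_component.setdefault(s.component, []).append(s.throughput_gbs)
    averages = {name: mean(vals) for name, vals in per_component.items()}

    # b) Performance target: compare the aggregate of the averages to the target.
    aggregate = sum(averages.values())

    # c) Future improvements: the slowest healthy component is a first suspect.
    slowest = min(averages, key=averages.get) if averages else None

    return {
        "failed": failed,
        "aggregate_gbs": aggregate,
        "meets_target": aggregate >= target_gbs,
        "slowest": slowest,
    }


if __name__ == "__main__":
    demo = [
        Sample("oss01", True, 10.0),
        Sample("oss01", True, 12.0),
        Sample("oss02", True, 5.0),
        Sample("oss03", False, 0.0),
    ]
    print(summarize(demo, target_gbs=20.0))
```

In practice the samples would come from whichever tools the site adopts; the sketch only illustrates how the three questions map onto the collected data.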
The Features List - what to monitor.
The Tools List - how to monitor.
An IO Process Model to get an idea of the space of variables that are usefully monitored.
Status-ey Stuff
- LUG 2013 Report (final ppt).
- SC13 Report (in development).
Return to Benchmarking Working Group page.