BWG File System Monitoring
The task of the BWG File System Monitoring group is to:
- develop a list of existing parallel filesystem monitoring tools.
- identify their capabilities and add any others we think should exist.
- compare and contrast the tools with each other and against the capabilities we think should exist.
Scenario:
You've just deployed a copy of Spider II (http://www.hpcwire.com/hpcwire/2013-08-16/spider_ii_emerges_to_give_ornl_a_big_speed_boost.html) at your site. Now you need to instrument it to a) detect whether a component has failed, b) confirm you are meeting your target performance numbers, and c) determine what future improvements can be made.
- Where do you start?
- What information do you think you should collect?
- What tools/utilities/commands do you reach for?
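As one possible starting point, the three goals in the scenario can be framed as three small checks over whatever samples a monitoring tool returns. The sketch below is illustrative only: the `Sample` record, its fields, the component names, and the throughput figures are all hypothetical and do not come from Spider II or from any particular tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One hypothetical monitoring sample for a single storage component."""
    component: str     # assumed component name, e.g. "oss01"
    responding: bool   # did the component answer the last probe?
    read_mb_s: float   # measured read throughput in MB/s


def failed_components(samples: List[Sample]) -> List[str]:
    """Goal a): list components that did not respond."""
    return sorted({s.component for s in samples if not s.responding})


def aggregate_read_mb_s(samples: List[Sample]) -> float:
    """Sum the throughput of the components that are responding."""
    return sum(s.read_mb_s for s in samples if s.responding)


def meets_target(samples: List[Sample], target_mb_s: float) -> bool:
    """Goal b): compare aggregate throughput with a site-chosen target."""
    return aggregate_read_mb_s(samples) >= target_mb_s


def slowest(samples: List[Sample], n: int = 1) -> List[str]:
    """Goal c): name the n slowest responding components as candidates to examine."""
    alive = [s for s in samples if s.responding]
    return [s.component for s in sorted(alive, key=lambda s: s.read_mb_s)[:n]]


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    samples = [
        Sample("oss01", True, 500.0),
        Sample("oss02", False, 0.0),
        Sample("oss03", True, 300.0),
    ]
    print("failed:", failed_components(samples))
    print("meets 1000 MB/s target:", meets_target(samples, 1000.0))
    print("slowest:", slowest(samples))
```

Which fields a real sample should carry is exactly the question the Features List addresses, and how to fill them in is the question for the Tools List.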
The Features List - what to monitor.
The Tools List - how to monitor.
An IO Process Model - the space of variables that are useful to monitor.
Status-ey Stuff
- LUG 2013 Report (final ppt).
- SC13 Report (in development).
Return to Benchmarking Working Group page.