Hero Run Best Practices






Revision as of 11:00, 3 April 2015


Leads: Ben Evans, Andrew Uselton

When testing a parallel file system in a computing cluster environment, a Hero Run is an experiment that seeks to drive bulk I/O to and from the file system at the best possible rate. The run is "heroic" in two ways: it can require a lot of work to get the test to show what you want to see, and it can take a lot of I/O to show it.

The Hero Run team is tasked with:

  • Explaining sound methodology
  • Describing the mechanics of running a test for one or more benchmark applications
  • Giving guidance on what sort of information one should capture during a hero run (about servers, targets, interconnect, clients, etc.)
  • Establishing an algorithm to combine streaming read+write and random read+write results into a single number

The Hero Run team is not tasked with:

  • Creating a Top500 list
  • Reporting results of particular tests
  • Comparing vendors' offerings, or maintaining a database of results

Zero to Hero

Hero-Run-Zero-to-Hero

What goes into a hero run? A list of terms and concepts

Single shared file performance

Tools

For testing standard POSIX-compliant file systems, IOR will be used along with an MPI infrastructure. IOR is available Here. Client allocation in the cluster is up to the team performing the test.
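
As a rough illustration of how IOR is typically launched under MPI, the sketch below assembles a command line for a streaming write-then-read test. The helper name build_ior_command, the launcher name mpirun, the task count, the block and transfer sizes, and the test file path are all assumptions chosen for illustration, not values prescribed by the Hero Run team. The script only builds and prints the command; it does not run IOR.

```python
import shlex


def build_ior_command(num_tasks, test_file, block_size="4g",
                      transfer_size="1m", file_per_process=False,
                      launcher="mpirun"):
    """Return an argv list for a streaming IOR run (hypothetical helper).

    All default values are illustrative assumptions, not recommendations.
    """
    cmd = [
        launcher, "-np", str(num_tasks),  # number of MPI tasks (clients)
        "ior",
        "-a", "POSIX",        # use the POSIX I/O API
        "-w", "-r",           # write phase, then read phase
        "-b", block_size,     # contiguous bytes written per task
        "-t", transfer_size,  # size of each individual I/O request
        "-e",                 # fsync after the write phase
        "-C",                 # reorder tasks for read-back to avoid client cache
        "-o", test_file,      # path of the test file on the file system under test
    ]
    if file_per_process:
        cmd.append("-F")      # one file per task instead of a single shared file
    return cmd


# Hypothetical mount point and task count, for illustration only.
cmd = build_ior_command(64, "/mnt/lustre/hero/testfile")
print(shlex.join(cmd))
```

Without -F, IOR writes a single shared file, which matches the single-shared-file case listed above; passing file_per_process=True switches to one file per task.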

Hero-run-IOR-tests