Hero Run Best Practices

From OpenSFS Wiki
Revision as of 12:01, 14 April 2015 by BenEvans (talk | contribs) (→‎Tools)


Leads: Ben Evans, Andrew Uselton

When testing a parallel file system in a computing cluster environment, a Hero Run is an experiment that seeks to drive bulk I/O to and from the file system at its best possible rate. The run is "heroic" in two ways: it can require a lot of work to get the test to show what you want to see, and it can take a large volume of I/O to show it.

The Hero Run team is tasked with:

  • Explaining sound methodology
  • Describing the mechanics of running a test for one or more benchmark applications
  • Giving guidance on what sort of information one should capture during a hero run (about servers, targets, interconnect, clients, etc.)
  • Establishing an algorithm that combines streaming read+write and random read+write results into a single number
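
This page does not say how the combined number is to be computed. Purely as an illustration of what such an algorithm might look like, the sketch below takes the geometric mean of four bandwidths, so that a weak result in any one mode pulls the score down. The function name `combined_score`, the choice of a geometric mean, the assumption that reads and writes are measured separately, and the sample figures are all hypothetical; none of this is the team's chosen method.

```python
import math

def combined_score(stream_write, stream_read, random_write, random_read):
    """Geometric mean of four bandwidths given in the same unit (e.g. GB/s).

    Assumption for illustration only: the page does not define the
    combining algorithm; a geometric mean is one possible choice.
    """
    rates = [stream_write, stream_read, random_write, random_read]
    if any(r <= 0 for r in rates):
        raise ValueError("all bandwidths must be positive")
    # Average the logarithms, then exponentiate: the n-th root of the product.
    return math.exp(sum(math.log(r) for r in rates) / len(rates))

# Hypothetical figures in GB/s, not measured results.
print(round(combined_score(16.0, 16.0, 1.0, 1.0), 3))
```

With a geometric mean, strong streaming numbers cannot fully hide poor random I/O, which an arithmetic mean would allow. Whether that property is desirable is a decision for the team.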

The Hero Run team is not tasked with:

  • Creating a Top500 list
  • Reporting results of particular tests
  • Comparing vendors' offerings or maintaining a database of results

Zero to Hero

Hero-Run-Zero-to-Hero

What goes into a hero run? A list of terms and concepts

Single shared file performance

Tools

For testing standard POSIX-compliant file systems, IOR will be used together with an MPI infrastructure. IOR is available here. Client allocation in the cluster is up to the team performing the test.
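
As a rough sketch of how IOR might be launched under MPI, the helper below assembles a command line without running it. The helper name `ior_command`, the `mpirun -np` launcher syntax, the default sizes, and the path `/mnt/testfs/ior.dat` are assumptions for illustration; the launcher and its options depend on the MPI installation and job scheduler at your site. The IOR options shown (`-a`, `-w`, `-r`, `-e`, `-C`, `-b`, `-t`, `-o`, `-F`, `-z`) are standard IOR flags.

```python
import shlex

def ior_command(num_tasks, test_file, block_size="4g", transfer_size="1m",
                file_per_process=False, random_offsets=False):
    """Build (but do not run) an mpirun + IOR command line as a list."""
    cmd = [
        "mpirun", "-np", str(num_tasks),  # launcher syntax is site-specific
        "ior",
        "-a", "POSIX",        # use the POSIX I/O API
        "-w", "-r",           # write phase, then read phase
        "-e",                 # fsync after the write phase
        "-C",                 # reorder tasks so reads avoid the client cache
        "-b", block_size,     # contiguous bytes written per task
        "-t", transfer_size,  # size of each I/O call
        "-o", test_file,      # test file on the file system under test
    ]
    if file_per_process:
        cmd.append("-F")      # one file per task instead of a shared file
    if random_offsets:
        cmd.append("-z")      # random rather than sequential offsets
    return cmd

# Hypothetical path and task count, for illustration only.
print(shlex.join(ior_command(64, "/mnt/testfs/ior.dat")))
```

Leaving `file_per_process` unset targets a single shared file; setting it gives the file-per-process case. Returning a list rather than a string makes it straightforward to pass the result to `subprocess.run` once the parameters have been checked.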

Hero-run-IOR-tests

Benchmarking_Basics