Hero Run Best Practices

From OpenSFS Wiki

Leads: Ben Evans, Andrew Uselton

When testing a parallel file system in a computing cluster environment, a Hero Run is an experiment that seeks to drive bulk I/O to and from the file system at its best possible rate. The run is "heroic" in two ways: it can take a lot of work to get the test to show the result you are looking for, and it can take a lot of I/O to show it.

The Hero Run team is tasked with:

  • Explaining sound methodology
  • Describing the mechanics of running a test for one or more benchmark applications
  • Giving guidance on what sort of information one should capture during a hero run (about servers, targets, interconnect, clients, etc.)
  • Establishing an algorithm to combine streaming read+write and random read+write results into a single number
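
As an illustration of what "a single number" could look like, the sketch below combines four measured rates with a geometric mean. This is only one possible choice and is an assumption made here for illustration; it is not an algorithm the team has established. The function name `combined_score` and its parameters are hypothetical, and all four rates are assumed to be in the same unit (for example GiB/s).

```python
from statistics import geometric_mean


def combined_score(streaming_read, streaming_write, random_read, random_write):
    """Combine four I/O rates into one figure (illustrative sketch only).

    Assumption: a geometric mean is used, so a single weak result pulls
    the combined figure down more than an arithmetic mean would.
    All rates must be positive and in the same unit.
    """
    rates = [streaming_read, streaming_write, random_read, random_write]
    if any(rate <= 0 for rate in rates):
        raise ValueError("all rates must be positive")
    return geometric_mean(rates)


if __name__ == "__main__":
    # Hypothetical example rates, not measured results.
    print(combined_score(100.0, 80.0, 5.0, 4.0))
```

A geometric mean keeps very fast streaming results from hiding poor random I/O results, which is one reason it might be considered; other weightings are equally plausible.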

The Hero Run team is not tasked with:

  • Creating a Top500 list
  • Reporting results of particular tests
  • Comparing vendors' offerings, or maintaining a database of results.

Zero to Hero

Hero-Run-Zero-to-Hero

What goes into a hero run? A list of terms and concepts

Single shared file performance

Tools

For testing standard POSIX-compliant file systems, IOR will be used along with an MPI infrastructure. IOR is available here. Client allocation in the cluster is up to the team performing the test.
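
A minimal sketch of how an IOR run under MPI might be assembled is shown below. It only builds the command line and does not run it. The helper name `build_ior_command`, the `mpirun -np` launcher syntax, the default sizes, and the example path are assumptions for illustration; the right values depend on the site's MPI installation and on the file system under test.

```python
import shlex


def build_ior_command(num_tasks, test_file, block_size="4g",
                      transfer_size="1m", file_per_process=True,
                      launcher="mpirun"):
    """Return an argv list that launches IOR under MPI (sketch only)."""
    cmd = [
        launcher, "-np", str(num_tasks),  # assumed launcher syntax
        "ior",
        "-a", "POSIX",        # use the POSIX I/O interface
        "-w", "-r",           # write the file(s), then read them back
        "-b", block_size,     # contiguous bytes written per task
        "-t", transfer_size,  # size of each individual I/O request
        "-e",                 # fsync when the write phase closes the file
        "-C",                 # shift task order for read-back to avoid client cache hits
        "-o", test_file,      # path of the test file on the file system under test
    ]
    if file_per_process:
        cmd.append("-F")      # one file per task instead of a single shared file
    return cmd


if __name__ == "__main__":
    # Hypothetical mount point and task count.
    print(shlex.join(build_ior_command(64, "/mnt/lustre/hero/testfile")))
```

Passing `file_per_process=False` drops `-F`, so all tasks write to one shared file, which corresponds to the single shared file case listed above.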

Hero-run-IOR-tests

Benchmarking_Basics