Zero to Hero

From OpenSFS
Revision as of 12:57, 3 October 2014 by BenEvans (talk | contribs) (First attempts)

Starting from scratch

HPC systems provide value through their ability to run big jobs faster than other systems. It is natural to look for performance metrics: users need to know whether your system can run their job, managers want to know whether they received good value from their vendor, and so on.

When a system is procured, the vendor often provides peak performance numbers. These are based on the theoretical maximum for each component. Storage benchmarks can often run at 80% or more of the theoretical peak, but good results are almost never achieved on the first attempt. What follows is a guide to improving benchmark results until they reach the desired level.
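As a rough illustration of the arithmetic, the sketch below compares a measured rate with a vendor-quoted peak. The function name and all of the numbers are hypothetical, chosen only to show the calculation.

```python
def fraction_of_peak(measured_gbs: float, peak_gbs: float) -> float:
    """Return the measured rate as a fraction of the theoretical peak."""
    if peak_gbs <= 0:
        raise ValueError("peak must be positive")
    return measured_gbs / peak_gbs


# Hypothetical numbers: a 100 GB/s quoted peak and a 62 GB/s first run.
peak = 100.0
first_run = 62.0
print(f"first run reached {fraction_of_peak(first_run, peak):.0%} of peak")

# Rate needed to reach 80% of the quoted peak.
target = 0.80 * peak
print(f"target rate: {target:.1f} GB/s")
```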

First attempts

A normal test of read and write rates begins with a program that creates random data and writes it to a file. The new file is then read by the processor and sent to /dev/null.
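A minimal sketch of such a test might look like the following. It is an illustration rather than a recommended benchmark: the function name, sizes, and temporary directory are assumptions, one random block is reused for every write so that generating data is not timed, and the read-back is likely to be served from the client's page cache unless the file is much larger than memory or the cache is dropped first.

```python
import os
import tempfile
import time


def write_then_read(path, total_bytes, block_size):
    """Write random data to path, read it back, return (write_MBps, read_MBps)."""
    block = os.urandom(block_size)  # generated once so the RNG is not timed
    blocks = total_bytes // block_size

    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # include the time to push data out of the cache
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block_size):
            pass  # discard the data, like sending it to /dev/null
    read_s = time.perf_counter() - start

    megabytes = blocks * block_size / 1e6
    return megabytes / write_s, megabytes / read_s


# Small demonstration run; a temporary directory stands in for the
# file system under test, and 8 MiB is far too small for a real benchmark.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "testfile")
    write_rate, read_rate = write_then_read(demo_path, 8 * 1024 * 1024, 4096)
    print(f"write {write_rate:.1f} MB/s, read {read_rate:.1f} MB/s")
```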

The first time you try this, expect to be underwhelmed. The result is far from the peak that you expected. What could be wrong? Where is the bottleneck? Is the problem with IO parameters such as the block size, or is there a problem with parallelism?

Start over and restrict your test to one processor and one IO device. Determine the optimal parameters for this configuration. Then see whether a second processor helps, whether two IO devices give you double the speed, and whether one processor can drive two IO devices at twice the speed of one device.
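One way to look for the best parameters on a single processor and a single device is to sweep the block size while holding everything else fixed. The sketch below assumes a single writer and uses a temporary directory as a stand-in for the file system under test; the function names and the sizes are illustrative only.

```python
import os
import tempfile
import time


def time_write(path, total_bytes, block_size):
    """Write total_bytes of zeros in block_size chunks; return MB/s."""
    block = bytes(block_size)
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:  # unbuffered: one write() per block
        for _ in range(total_bytes // block_size):
            f.write(block)
        os.fsync(f.fileno())
    return total_bytes / 1e6 / (time.perf_counter() - start)


def sweep_block_sizes(directory, total_bytes, block_sizes):
    """Return {block_size: MB/s} for one process writing to one target."""
    results = {}
    for bs in block_sizes:
        path = os.path.join(directory, f"sweep_{bs}.dat")
        results[bs] = time_write(path, total_bytes, bs)
        os.remove(path)  # keep each run independent of the previous file
    return results


# total_bytes is a multiple of every block size, so each run writes the same amount.
with tempfile.TemporaryDirectory() as sweep_dir:
    rates = sweep_block_sizes(sweep_dir, 8 * 1024 * 1024, [4096, 65536, 1024 * 1024])

best = max(rates, key=rates.get)
for bs, rate in rates.items():
    print(f"bs={bs:>8} B  {rate:10.1f} MB/s")
print("best block size:", best)
```

Once a good block size is known, the same sweep can be repeated with a second process or a second device to see whether the rate actually doubles.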


You have heard of the dd command and may have used it on occasion to test disk arrays, so you might as well give it a shot. Reading the test data from your local drive may itself be causing problems, so this time the input comes from /dev/zero. You run something like:

dd if=/dev/zero of=/mnt/filesystem/testdir/file bs=4k count=1M

You get something that is better than your previous attempts, but it still isn't near what you should get.

More threads? It can't hurt. You see some improvement as you start to add threads, but the returns are diminishing, and you're still not seeing the numbers you were promised. You're pretty sure you've run out of network bandwidth on the client. If you've worked with networked filesystems before, this may have happened to you already.
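The effect of adding threads on a single client can be sketched as below. This is an illustration under several assumptions: each thread writes its own file, the helper names are made up for this example, the sizes are far too small for a real measurement, and it relies on Python releasing its interpreter lock during blocking file writes.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def write_file(path, total_bytes, block_size):
    """Write total_bytes of zeros to path and return the byte count."""
    block = bytes(block_size)
    with open(path, "wb", buffering=0) as f:
        for _ in range(total_bytes // block_size):
            f.write(block)
        os.fsync(f.fileno())
    return total_bytes


def threaded_write_rate(directory, threads, bytes_per_thread, block_size):
    """Aggregate MB/s with each thread writing its own file."""
    paths = [os.path.join(directory, f"thread_{i}.dat") for i in range(threads)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        written = sum(
            pool.map(lambda p: write_file(p, bytes_per_thread, block_size), paths)
        )
    elapsed = time.perf_counter() - start  # wall time for the whole group
    return written / 1e6 / elapsed


with tempfile.TemporaryDirectory() as thread_dir:
    for n in (1, 2, 4):
        rate = threaded_write_rate(thread_dir, n, 4 * 1024 * 1024, 1024 * 1024)
        print(f"{n} thread(s): {rate:.1f} MB/s aggregate")
```

If the aggregate rate stops growing as threads are added, the client itself (its network link, memory bandwidth, or CPU) is a likely suspect.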

More clients

Time to add another client to the mix. Taking the scripts and configurations from the single-client dd runs, you expand them to 2 clients. You get roughly double the performance. Now we're getting somewhere.

So you start adding clients: 4, then 8, and so on, watching the numbers creep up. However, when you started this whole thing you weren't really thinking of running dozens, hundreds, or even thousands of threads across many different clients. Things are starting to get messy.
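Part of the mess is simply combining numbers from many clients. A common approach, sketched here with a hypothetical record type and made-up figures, is to divide the total bytes moved by the time from the first client starting to the last client finishing, rather than adding up per-client rates. This assumes the clients' clocks are synchronised closely enough for their timestamps to be compared.

```python
from dataclasses import dataclass


@dataclass
class ClientResult:
    host: str
    start_s: float    # when the client began its IO
    end_s: float      # when the client finished its IO
    bytes_moved: int


def aggregate_rate_mbs(results):
    """Total MB moved divided by the span from first start to last finish."""
    if not results:
        raise ValueError("no results to aggregate")
    span = max(r.end_s for r in results) - min(r.start_s for r in results)
    if span <= 0:
        raise ValueError("results span no time")
    return sum(r.bytes_moved for r in results) / 1e6 / span


# Made-up results: the second client started half a second late.
results = [
    ClientResult("client01", 0.0, 10.0, 4_000_000_000),
    ClientResult("client02", 0.5, 10.0, 4_000_000_000),
]
aggregate = aggregate_rate_mbs(results)
naive_sum = sum(r.bytes_moved / 1e6 / (r.end_s - r.start_s) for r in results)
print(f"aggregate: {aggregate:.1f} MB/s, sum of per-client rates: {naive_sum:.1f} MB/s")
```

The sum of per-client rates comes out higher than the aggregate because it ignores the time during which only some clients were running.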

Someone else has got to have done this before, right? Maybe tools exist that can help sort all of this out.

Clustered runs

Hopefully you've got your compute cluster up and running, and you can run jobs on it. Time to go hunting for tools to use. IOR, IOzone and xdd are the tools that the Hero Run task group uses (which is why you're here). So you grab one, find the basic scripts to run it on your cluster, and start firing away: tweaking variables in the scripts, recording numbers, creeping upwards in performance, and hopefully reaching your goal.
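Keeping track of which variables were tweaked is easier if the runs are generated rather than typed by hand. The sketch below only builds and prints candidate IOR command lines; it does not run them. The mpirun launcher, the IOR options shown (-a, -w, -r, -F, -b, -t, -o), the parameter values, and the output path are assumptions to check against your own MPI installation, the help output of your IOR version, and your scheduler.

```python
import itertools
import shlex


def ior_commands(task_counts, transfer_sizes, block_size, test_file):
    """Yield (ntasks, transfer_size, argv) for each parameter combination."""
    for ntasks, xfer in itertools.product(task_counts, transfer_sizes):
        argv = [
            "mpirun", "-np", str(ntasks),
            "ior",
            "-a", "POSIX",      # IO interface
            "-w", "-r",         # write phase, then read phase
            "-F",               # one file per task
            "-b", block_size,   # data written per task
            "-t", xfer,         # size of each transfer
            "-o", test_file,    # path on the file system under test
        ]
        yield ntasks, xfer, argv


commands = list(
    ior_commands([16, 64], ["64k", "1m"], "1g", "/mnt/filesystem/testdir/ior_file")
)
for ntasks, xfer, argv in commands:
    print(shlex.join(argv))
```

Recording the task count and transfer size next to each measured rate makes it possible to see later which change actually moved the numbers.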