The number of tasks (processes) used in a test affects the measured performance in several ways.
- Each IO request spends some period of time in flight. While one process is waiting for its IO to complete, another process can be forming its own IO requests. This makes better use of the CPU and other system resources.
- Overlapping IO requests from different processes may cause lock contention, both locally and on the MDS and OSS. Lock contention increases IO latency.
Depending on your usage of IOR and filesystem striping, IOR may assign IO offsets such that different processes on the same client send requests to different OSSs and/or OSTs. If the total number of processes (client-node-count * processes-per-client) is evenly divisible by the number of OSTs, it's likely that each process will communicate with only one OST and therefore only one OSS.
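The divisibility effect can be illustrated with a small model. This is only a sketch under simplifying assumptions that are not taken from IOR or Lustre themselves: a single shared file, an IOR block size equal to the stripe size, stripes laid out round-robin starting at OST 0, and a segmented layout in which task `rank` writes block `segment * num_tasks + rank`. The function name `osts_per_task` and the example counts are hypothetical.

```python
def osts_per_task(num_tasks, num_osts, segments=8):
    """Map each task rank to the set of OST indices it would touch.

    Assumed model (not read from a real filesystem): one shared file,
    block size == stripe size, round-robin striping starting at OST 0.
    """
    touched = {}
    for rank in range(num_tasks):
        osts = set()
        for segment in range(segments):
            # Position of this task's block, counted in stripe-size units.
            stripe_index = segment * num_tasks + rank
            osts.add(stripe_index % num_osts)
        touched[rank] = osts
    return touched


# Hypothetical example: 4 client nodes * 4 processes per client, 8 OSTs.
divisible = osts_per_task(num_tasks=4 * 4, num_osts=8)
# Hypothetical example: 12 tasks is not a multiple of 8 OSTs.
not_divisible = osts_per_task(num_tasks=12, num_osts=8)

print(max(len(osts) for osts in divisible.values()))
print(max(len(osts) for osts in not_divisible.values()))
```

In this model, 16 tasks over 8 OSTs leaves every task on a single OST, while 12 tasks over 8 OSTs makes each task alternate between two. Real layouts depend on the actual stripe count, stripe size, and IOR options in use.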
The goal in choosing a task count is to generate enough IO requests to keep the network interfaces busy. If you choose a task count that is too high, IO latency and CPU utilization will increase with no corresponding increase in throughput.
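One way to apply this is to run the same test at several task counts and pick the smallest count whose throughput is close to the best observed. The helper below is a minimal sketch of that selection step. The function name, the 5% tolerance, and the throughput figures are all placeholders for illustration, not measurements from any system.

```python
def smallest_saturating_task_count(throughput_by_tasks, tolerance=0.05):
    """Return the smallest task count within `tolerance` of peak throughput.

    `throughput_by_tasks` maps task count -> measured throughput (any unit).
    """
    peak = max(throughput_by_tasks.values())
    for tasks in sorted(throughput_by_tasks):
        if throughput_by_tasks[tasks] >= (1.0 - tolerance) * peak:
            return tasks


# Placeholder numbers (MiB/s) standing in for your own measurements.
example = {1: 400.0, 2: 780.0, 4: 1500.0, 8: 2300.0, 16: 2380.0, 32: 2400.0}
chosen = smallest_saturating_task_count(example)
print(chosen)
```

With these placeholder values, throughput flattens beyond 8 tasks, so adding more tasks would mostly add latency and CPU load.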
A separate page covers the filesystem tunables that control the number of RPCs and the amount of memory each client can have outstanding at any time.