The terms "node" and "client" are sometimes used interchangeably; however, servers and routers are nodes as well. In this context we are only interested in the number of client nodes, where a client is an independent instance of the software used to perform filesystem requests. That software may take the form of a kernel module or a user-space implementation of the filesystem.
The number of client nodes involved in a test is important because the probability of lock contention and network traffic contention increases as the number of client nodes increases.
If the number of client nodes is greater than the number of LNET Routers, then you might be measuring the throughput capacity and efficiency of the LNET Routers. As an alternative, you could use the LNET Self Test to exercise the LNET layer on its own. This would be a list of tunable parameters related to LNET.
If the number of client nodes is greater than the number of OSSs, but your performance isn't constrained by LNET Routers, then you might be measuring the capacity of the OSSs. This is a list of resources that multiple clients compete for when communicating with the same OSS.
If the number of client nodes is greater than the number of OSTs, then it's likely that requests from different clients will require resources from the same OST. The MDS and OSS manage these conflicts using locks. Lock exchanges between the client, MDS, and OSS can affect IO latency as well as throughput.
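The three comparisons above can be folded into a quick sanity check to run before a test. The following is only a sketch: the function name, its parameters, and the wording of the returned labels are hypothetical, and treating a router count of zero as "no routers in the path" is an assumption. It flags what a test might be stressing; it does not prove where the bottleneck is.

```python
def possible_bottlenecks(clients, lnet_routers, oss_count, ost_count):
    """Return the components a test may end up measuring, based only on
    how the client node count compares with each server-side count."""
    findings = []
    # Assumption: lnet_routers == 0 means no routers sit between the
    # clients and the servers, so that comparison is skipped.
    if lnet_routers > 0 and clients > lnet_routers:
        findings.append("LNET Router throughput")
    if clients > oss_count:
        findings.append("OSS capacity")
    if clients > ost_count:
        findings.append("OST lock contention")
    return findings


# Hypothetical layout: 64 clients, 4 routers, 8 OSSs, 32 OSTs.
print(possible_bottlenecks(clients=64, lnet_routers=4, oss_count=8, ost_count=32))
```

An empty list only means that none of the ratios is exceeded; contention can still occur for other reasons.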
While it's not a set-in-stone rule, using a 1-to-1 ratio of client nodes to OSTs is a good place to start. You will likely need to reduce the number of client nodes from this 1-to-1 ratio if performance is lower than expected. Before doing so, measure IO latency, since elevated latency would be an indicator of resource (lock) contention.
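One way to organize the downward adjustment is to decide in advance which client counts to try. The sketch below starts at the 1-to-1 ratio and halves the count at each step; the halving step and the function name are assumptions made for illustration, not a recommendation from this text.

```python
def client_count_schedule(ost_count):
    """List the client node counts to test, starting at one client per
    OST and halving (an assumed step size) until one client remains."""
    counts = []
    n = ost_count
    while n >= 1:
        counts.append(n)
        n //= 2  # integer halving; stops after the single-client run
    return counts


# Hypothetical filesystem with 16 OSTs.
print(client_count_schedule(16))
```

Recording IO latency alongside throughput at each count makes it easier to see where contention stops being the limiting factor.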
Client node count is also important because an individual client is limited in the number of IOs it can generate, which is the reason for using a parallel filesystem in the first place. By increasing the number of client nodes, you take advantage of the fact that the performance of non-conflicting IO aggregates across clients.
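To judge how well performance is aggregating, a measured result can be compared with the ideal case in which every client contributes its full single-client rate. This is a minimal sketch under that assumption; the function name and the sample figures are hypothetical, and real results will usually fall short of the ideal once any contention is present.

```python
def scaling_efficiency(measured_aggregate, single_client_rate, clients):
    """Ratio of measured aggregate throughput to the ideal aggregate,
    where the ideal assumes perfectly non-conflicting IO."""
    if clients <= 0 or single_client_rate <= 0:
        raise ValueError("clients and single_client_rate must be positive")
    ideal = single_client_rate * clients
    return measured_aggregate / ideal


# Hypothetical figures: 8 clients, 1000 MB/s each alone, 7200 MB/s together.
print(scaling_efficiency(7200.0, 1000.0, 8))
```

A ratio close to 1 suggests the clients are not interfering with each other; a ratio that drops as clients are added points back to the contention sources described earlier.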