Lustre Technical FAQ

From OpenSFS

This FAQ answers technical questions from developers about how Lustre works and the mechanisms involved.

Timeouts

Why do I see a client timeout greater than at_max?

The clients allow for additional network latency (ptlrpc_at_get_net_latency). Under some conditions, e.g. when a network cable is pulled, the network latency estimate can grow to a high value. The client waits for the server to time the request out (the server-side timeout is capped at at_max) and then allows additional time, equal to the network latency, for the reply to be delivered. The total client timeout can therefore exceed at_max.
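The arithmetic can be illustrated with a small sketch. This is a simplified model of the description above, not the actual ptlrpc code: the function name client_wait_bound and the example numbers are hypothetical, and the at_max value of 600 seconds is only an assumed setting for illustration.

```python
def client_wait_bound(service_estimate_s: int, net_latency_s: int, at_max_s: int) -> int:
    """Simplified model of how long a client may wait for a reply.

    The server-side portion is capped at at_max; the network latency
    estimate is then added on top, so the sum can exceed at_max.
    """
    server_part = min(service_estimate_s, at_max_s)
    return server_part + net_latency_s


AT_MAX = 600  # assumed example value, in seconds

# A service estimate above at_max is capped, then latency is added.
print(client_wait_bound(700, 50, AT_MAX))  # 650, which is greater than at_max
```

In this model the client timeout exceeds at_max whenever the server-side portion has reached the cap and the network latency estimate is nonzero.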

You can see both values in the "timeouts" stats on the clients:

lctl get_param osc.*.timeouts
osc.lustre-OST0000-osc-ffff88000a8e3400.timeouts=
last reply : 1340951220, 0d0h00m03s ago
network    : cur   1  worst   1 (at 1340951095, 0d0h02m08s ago)   1   0   0   0 
portal 28  : cur   1  worst   1 (at 1340951095, 0d0h02m08s ago)   1   0   0   0
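
For scripting, the current and worst estimates can be pulled out of this output with a short parser. The following is a minimal sketch that assumes the line layout shown above (it may differ between Lustre versions); parse_timeouts is a hypothetical helper, not part of any Lustre tool, and the sample text is copied from the output above.

```python
import re

# Sample text copied from the "lctl get_param osc.*.timeouts" output above.
SAMPLE = """\
osc.lustre-OST0000-osc-ffff88000a8e3400.timeouts=
last reply : 1340951220, 0d0h00m03s ago
network    : cur   1  worst   1 (at 1340951095, 0d0h02m08s ago)   1   0   0   0
portal 28  : cur   1  worst   1 (at 1340951095, 0d0h02m08s ago)   1   0   0   0
"""

# Matches the "network" row and any "portal N" row, capturing cur and worst.
LINE_RE = re.compile(r"^\s*(network|portal\s+\d+)\s*:\s*cur\s+(\d+)\s+worst\s+(\d+)")


def parse_timeouts(text):
    """Return {row name: (cur, worst)} for the network and portal rows."""
    estimates = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            # Collapse repeated spaces so "portal   28" becomes "portal 28".
            name = " ".join(match.group(1).split())
            estimates[name] = (int(match.group(2)), int(match.group(3)))
    return estimates


print(parse_timeouts(SAMPLE))
# {'network': (1, 1), 'portal 28': (1, 1)}
```

On a client, the same function could be fed the captured output of the lctl command instead of the embedded sample.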