pNFS: Handle NFS4ERR_RECALLCONFLICT correctly in LAYOUTGET

Instead of giving up altogether and falling back to doing I/O
through the MDS, which may make the situation worse, wait for
2 lease periods for the callback to resolve itself, and then
try destroying the existing layout.

Only if this was an attempt at getting a first layout, do we
give up altogether, as the server is clearly crazy.

Fixes: 183d9e7b11 ("pnfs: rework LAYOUTGET retry handling")
Cc: stable@vger.kernel.org # 4.7
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
This commit is contained in:
Trond Myklebust 2016-07-14 14:28:31 -04:00
parent e85d7ee420
commit 66b53f3258

View File

@ -1505,7 +1505,7 @@ pnfs_update_layout(struct inode *ino,
struct pnfs_layout_segment *lseg = NULL;
nfs4_stateid stateid;
long timeout = 0;
unsigned long giveup = jiffies + rpc_get_timeout(server->client);
unsigned long giveup = jiffies + (clp->cl_lease_time << 1);
bool first;
if (!pnfs_enabled_sb(NFS_SERVER(ino))) {
@ -1649,9 +1649,18 @@ pnfs_update_layout(struct inode *ino,
if (IS_ERR(lseg)) {
switch(PTR_ERR(lseg)) {
case -EBUSY:
case -ERECALLCONFLICT:
if (time_after(jiffies, giveup))
lseg = NULL;
break;
case -ERECALLCONFLICT:
/* Huh? We hold no layouts, how is there a recall? */
if (first) {
lseg = NULL;
break;
}
/* Destroy the existing layout and start over */
if (time_after(jiffies, giveup))
pnfs_destroy_layout(NFS_I(ino));
/* Fallthrough */
case -EAGAIN:
break;