NFS v4, HA Migration, and stale handles on clients

Posted by Karl Katzke on Server Fault
Published on 2009-06-05T12:55:49Z Indexed on 2012/09/11 3:40 UTC

Filed under: migration | nfs | high-availability | nfs-client | pacemaker

I'm managing a server running NFS v4 with Pacemaker/OpenAIS. NFS is configured to use TCP. When I migrate the NFS server to another node in the Pacemaker cluster, connections from the clients 'hang' and eventually time out after 90 seconds, even though the metadata is persisted. After those 90 seconds, the old mountpoint becomes 'stale' and the mounted files can no longer be accessed.

The 90-second grace period seems to be part of the server configuration and not the client configuration. I see this message on the server:

kernel: NFSD: starting 90-second grace period
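
To check which value a given node is actually using after a migration, the number can be pulled out of that log line. This is a minimal sketch, not part of the original setup. The helper names are hypothetical, and the /proc/fs/nfsd/nfsv4gracetime path is an assumption about a Linux kernel NFS server; older kernels may not expose it, in which case the reader simply returns None.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the kernel message quoted above, e.g.
# "kernel: NFSD: starting 90-second grace period"
GRACE_RE = re.compile(r"NFSD: starting (\d+)-second grace period")


def grace_seconds_from_log(line: str) -> Optional[int]:
    """Return the grace period in seconds from a log line, or None if absent."""
    match = GRACE_RE.search(line)
    return int(match.group(1)) if match else None


def read_seconds(path: str) -> Optional[int]:
    """Read an integer number of seconds from a file, or None if unreadable.

    Assumption: on a Linux NFS server the grace time may be exposed at
    /proc/fs/nfsd/nfsv4gracetime; that path is not taken from the post.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    line = "kernel: NFSD: starting 90-second grace period"
    print(grace_seconds_from_log(line))  # 90
    print(read_seconds("/proc/fs/nfsd/nfsv4gracetime"))
```

Running this on both cluster nodes would show whether they agree on the grace period, which is worth ruling out before looking at the clients.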

If I restart the NFS client on the client nodes after I migrate (unmounting and then remounting the share), then I don't experience the problem, but connections and file transfers are still interrupted.
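
To measure the symptom from a client while testing a migration, a small probe can time a stat call on the mountpoint and report whether it failed with a stale handle. This is only a sketch under assumptions. The function name is hypothetical, /mnt/share is a placeholder mountpoint, and on a hard mount the call may simply block while the server is unreachable, so the elapsed time is the interesting number.

```python
import errno
import os
import time
from typing import Callable, Tuple


def probe(path: str, stat: Callable[[str], object] = os.stat) -> Tuple[str, float]:
    """Stat a path and classify the result as ok, stale, missing, or error.

    Returns (status, elapsed_seconds). The stat argument exists only so the
    stale case can be simulated without a real NFS mount.
    """
    start = time.monotonic()
    try:
        stat(path)
        status = "ok"
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            status = "stale"      # "Stale file handle"
        elif exc.errno == errno.ENOENT:
            status = "missing"
        else:
            status = "error"
    return status, time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical mountpoint; replace with the real client mount.
    status, elapsed = probe("/mnt/share")
    print(f"{status} after {elapsed:.1f}s")
```

Run in a loop during a migration, this would show how long the client hangs and at what point the handle is reported as stale.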

Three questions:

1. What is the 90-second grace period? What's it there for?
2. How can I keep the files from going stale on the clients without restarting them after I migrate the NFS server to another node?
3. Is it actually possible to migrate the NFS server without having large file uploads drop?

© Server Fault or respective owner
