Why does writing a file to an NFS share send a COMMIT operation to the NFS server?

Posted by Antonis Christofides on Server Fault See other posts from Server Fault or by Antonis Christofides
Published on 2012-09-26T14:44:08Z Indexed on 2012/09/27 9:39 UTC
Read the original article Hit count: 233

Filed under:
|

I have a Debian squeeze (2.6.32-5-amd64) which is at the same time a NFS4 server and client (it mounts itself through NFS4). The local directory that leads directly to disk is /nfs4exports/mydir, whereas /nfs4mounts/mydir is the same thing mounted through NFS, using the machine's external IP address. Here is the line from fstab:

192.168.1.75:/mydir   /nfs4mounts/mydir      nfs4    soft  0 0

I have an application that writes many small files. If I write directly to /nfs4exports/mydir, it writes thousands of files per second; but if I write to /nfs4mounts/mydir, it writes 4 files per second or so. I can greatly increase speed if I add async to /etc/exports. (Writing a single large file to the NFS-mounted directory goes at more than 100 MB/s.)

I examine the server statistics and I see that whenever a file is written, it is "committed" (this also happens with NFSv3):

root@debianvboxtest:~# mount -t nfs4 192.168.1.75:/mydir /mnt
root@debianvboxtest:~# nfsstat|grep -A 2 'nfs v4 operations'
Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 10        4% 1         0% 1         0% 
root@debianvboxtest:~# echo 'hello' >/mnt/test1056
root@debianvboxtest:~# nfsstat|grep -A 2 'nfs v4 operations'
Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 11        4% 2         0% 2         0% 

Now in the RFC, I read this:

The COMMIT operation is similar in operation and semantics to the POSIX fsync(2) system call that synchronizes a file's state with the disk (file data and metadata is flushed to disk or stable storage). COMMIT performs the same operation for a client, flushing any unsynchronized data and metadata on the server to the server's disk or stable storage for the specified file.

I don't understand why the client commits. I don't think that the "echo" shell built-in command runs fsync; if echo wrote to a local file and then the machine went down, the file might be lost. In contrast, the NFS client appears to be sending a COMMIT upon completion of the echo. Why?

I am reluctant to use the async NFS server option, because it would apparently ignore COMMIT. I feel as if I had a local filesystem and I had to choose between syncing every file upon close and ignoring fsync altogether. What have I understood wrong?

© Server Fault or respective owner

Related posts about linux

Related posts about nfs