Weird nfs performance: 1 thread better than 8, 8 better than 2!

Posted by Joe on Server Fault
Published on 2010-12-20T23:30:33Z; indexed on 2011-01-04 11:55 UTC

I'm trying to determine the cause of poor NFS performance between two Xen virtual machines (client and server) running on the same host. Specifically, the speed at which I can sequentially read a 1GB file on the client is much lower than I would expect from the measured network speed between the two VMs and the measured speed of reading the file directly on the server. The VMs are running Ubuntu 9.04, and the server uses the nfs-kernel-server package.

According to various NFS tuning resources, changing the number of nfsd threads (in my case kernel threads) can affect performance. Usually this advice is framed in terms of increasing the number from the default of 8 on heavily-used servers. What I find in my current configuration:

RPCNFSDCOUNT=8 (default): 13.5-30 seconds to cat a 1GB file on the client, so 35-80MB/s

RPCNFSDCOUNT=16: 18 seconds to cat the file, so 60MB/s

RPCNFSDCOUNT=1: 8-9 seconds to cat the file (!!?!), so 125MB/s

RPCNFSDCOUNT=2: 87 seconds to cat the file, so 12MB/s
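For anyone who wants to check the MB/s figures against the timings, here is a rough sketch of the arithmetic. It assumes "1GB" means 2**30 bytes and 1MB means 2**20 bytes, which may not be exactly how the figures above were rounded; the function and variable names are just for illustration.

```python
def throughput_mb_per_s(size_bytes: int, seconds: float) -> float:
    """Average throughput in MB/s, taking 1 MB as 2**20 bytes."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes / (1024 * 1024) / seconds


FILE_SIZE = 1024 ** 3  # assumption: the "1GB" test file is 2**30 bytes

# Wall-clock times in seconds, copied from the measurements above,
# keyed by the RPCNFSDCOUNT value in effect for each run.
timings = {8: (13.5, 30.0), 16: (18.0,), 1: (8.0, 9.0), 2: (87.0,)}

if __name__ == "__main__":
    for threads, secs in timings.items():
        rates = ", ".join(
            f"{throughput_mb_per_s(FILE_SIZE, s):.0f}" for s in secs
        )
        print(f"RPCNFSDCOUNT={threads}: {rates} MB/s")
```

The results land close to the numbers quoted above; small differences come down to rounding and to whether a gigabyte is counted as 1000 or 1024 megabytes.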

I should mention that the file I'm exporting is on a RevoDrive SSD mounted on the server using Xen's PCI passthrough; on the server I can cat the file in under 4 seconds (> 250MB/s). I am dropping caches on the client before each test.
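In case it helps anyone reproduce the measurement, the test amounts to something like the sketch below: drop the page cache, then time a sequential read of the whole file. This is a stand-in for timing cat, not the exact commands I ran; the function names and the 1MB block size are arbitrary choices. Dropping caches writes to /proc/sys/vm/drop_caches, which is Linux-only and needs root.

```python
import os
import time


def drop_page_cache() -> None:
    """Flush dirty pages, then ask the kernel to drop clean caches.

    Linux only and requires root; same effect as
    `sync; echo 3 > /proc/sys/vm/drop_caches`.
    """
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def timed_sequential_read(path: str, block_size: int = 1024 * 1024):
    """Read `path` from start to end; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer on top of the kernel's.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty read means end of file
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

On the client this would be called as drop_page_cache() followed by timed_sequential_read() on the file under the NFS mount point, and the same read on the server against the local path gives the baseline.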

I don't really want to leave the server configured with just one thread as I'm guessing that won't work so well when there are multiple clients, but I might be misunderstanding how that works. I have repeated the tests a few times (changing the server config in between) and the results are fairly consistent. So my question is: why is the best performance with 1 thread?

A few other things I have tried changing, to little or no effect:

increasing /proc/sys/net/ipv4/ipfrag_low_thresh and /proc/sys/net/ipv4/ipfrag_high_thresh to 512K and 1M, from the defaults of 192K and 256K
increasing /proc/sys/net/core/rmem_default and /proc/sys/net/core/rmem_max to 1M, from the default of 128K
mounting on the client with the options rsize=32768, wsize=32768
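To confirm that the kernel settings listed above actually took effect, the current values can be read back from /proc/sys. A small sketch, assuming a Linux host; the helper name read_tunables and the root parameter are made up for illustration, and any file that is missing on a given kernel is reported as None rather than raising.

```python
from pathlib import Path

# The tunables mentioned above, relative to /proc/sys.
TUNABLES = [
    "net/ipv4/ipfrag_low_thresh",
    "net/ipv4/ipfrag_high_thresh",
    "net/core/rmem_default",
    "net/core/rmem_max",
]


def read_tunables(names=TUNABLES, root="/proc/sys"):
    """Return {name: value-as-string, or None if the file cannot be read}."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(root) / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(f"{name} = {value}")
```

The values are reported in bytes, so 1M shows up as 1048576. The rsize and wsize the client actually negotiated are not in /proc/sys; they appear in the mount options listed in /proc/mounts on the client.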

From the output of sar -d I understand that the actual read sizes going to the underlying device are rather small (<100 bytes) but this doesn't cause a problem when reading the file locally on the client.

The RevoDrive actually exposes two "SATA" devices /dev/sda and /dev/sdb, then dmraid picks up a fakeRAID-0 striped across them which I have mounted to /mnt/ssd and then bind-mounted to /export/ssd. I've done local tests on my file using both locations and see the good performance mentioned above. If answers/comments ask for more details I will add them.

Tags: linux, server, performance, xen, nfs
