How to tune system settings for MongoDB on Linux?
- by jsh
Trying to squeeze a lot out of one question here -- please bear with me.
Although the MongoDB manual makes several useful recommendations about system settings like ulimit (http://docs.mongodb.org/manual/reference/ulimit/) and other production factors (http://docs.mongodb.org/manual/administration/production-notes/), it seems mysteriously silent on things like virtual memory and swap settings.
The closest we get to a hint is that "...the operating system’s virtual memory subsystem manages MongoDB’s memory..." (http://docs.mongodb.org/manual/faq/fundamentals/#does-mongodb-require-a-lot-of-ram).
I ran the same job (heavy writes and heavy reads on about 10,000,000 records in a single collection) on my 4-processor, 4GB RAM MacBook and on an 8-core Ubuntu box with 64GB RAM. Read performance was dramatically WORSE on the Linux box with factory settings, and I could hear the disk working constantly, indicating high I/O and presumably swapping. Yes, other things were happening on the box, but there was plenty of free RAM, disk space, etc. Furthermore, I did not see evidence that Mongo was expanding to take advantage of all that free RAM, as it is touted to do.
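The swapping guess above can be checked directly instead of inferred from disk noise. A minimal sketch, assuming a Linux kernel that exposes the cumulative `pswpin` and `pswpout` page counters in /proc/vmstat; the function names are made up for illustration:

```python
import time
from pathlib import Path


def read_swap_counters(vmstat_text):
    """Parse the pswpin/pswpout counters out of /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two samples."""
    return {
        key: after.get(key, 0) - before.get(key, 0)
        for key in ("pswpin", "pswpout")
    }


def sample_swap_activity(seconds, path="/proc/vmstat"):
    """Sample the counters twice, `seconds` apart, and return the difference."""
    before = read_swap_counters(Path(path).read_text())
    time.sleep(seconds)
    after = read_swap_counters(Path(path).read_text())
    return swap_delta(before, after)
```

Calling `sample_swap_activity(10)` while the read/write job is running and getting zeros back would suggest the disk activity is something other than swapping, for example dirty-page writeback.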
Linux box default settings were as follows:
vm.swappiness = 60
vm.dirty_background_ratio = 10
vm.dirty_ratio = 20
vm.dirty_expire_centisecs = 3000
vm.dirty_writeback_centisecs = 500
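For anyone wanting to compare their own box against the values listed, here is a rough sketch that reads the same five settings from /proc/sys/vm. It assumes a Linux system where each setting is a file holding a single integer; `read_vm_settings` and its `base` parameter are illustrative names, not part of any tool.

```python
from pathlib import Path

# The five settings discussed in this post.
VM_KEYS = [
    "swappiness",
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
]


def read_vm_settings(base="/proc/sys/vm", keys=VM_KEYS):
    """Return {"vm.<key>": int} for every key that exists under `base`."""
    settings = {}
    for key in keys:
        entry = Path(base) / key
        if entry.is_file():
            settings["vm." + key] = int(entry.read_text().strip())
    return settings


if __name__ == "__main__":
    for name, value in read_vm_settings().items():
        print(f"{name} = {value}")
```

Keys that are missing on a given kernel are skipped rather than raising an error, so the result may contain fewer than five entries.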
I hazarded some guesses looking at docs and blogs for other types of databases (Oracle, MySQL, etc.), experimented, and adjusted as below.
vm.swappiness = 10
vm.dirty_background_ratio = 5
vm.dirty_ratio = 5
vm.dirty_writeback_centisecs = 250
vm.dirty_expire_centisecs = 500
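To make the experiment repeatable, the adjusted values can be rendered as a sysctl.conf-style fragment. This is only a sketch: the two dictionaries copy the numbers from this post (they are the values tried here, not recommendations), and the helper names are hypothetical.

```python
# Factory values and experimental values, copied from the post.
DEFAULTS = {
    "vm.swappiness": 60,
    "vm.dirty_background_ratio": 10,
    "vm.dirty_ratio": 20,
    "vm.dirty_expire_centisecs": 3000,
    "vm.dirty_writeback_centisecs": 500,
}
TUNED = {
    "vm.swappiness": 10,
    "vm.dirty_background_ratio": 5,
    "vm.dirty_ratio": 5,
    "vm.dirty_writeback_centisecs": 250,
    "vm.dirty_expire_centisecs": 500,
}


def changed_settings(defaults, tuned):
    """Keep only the tuned entries whose value differs from the default."""
    return {k: v for k, v in tuned.items() if defaults.get(k) != v}


def to_sysctl_conf(settings):
    """Render settings as 'key = value' lines, sorted by key."""
    return "".join(f"{k} = {v}\n" for k, v in sorted(settings.items()))


if __name__ == "__main__":
    print(to_sysctl_conf(changed_settings(DEFAULTS, TUNED)), end="")
```

The output could be saved to a file and loaded with `sysctl -p <file>` as root; a single value can also be tried temporarily with `sysctl -w vm.swappiness=10`, which does not survive a reboot.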
I saw some immediate apparent improvements in read time. However, when I ran my test jobs again, read performance continued to be painfully sluggish during heavy writes.
Then I REBUILT the collection from an available data source, and suddenly I could read at 1ms or less per record WHILE doing the write job!
So the question is really two-fold:
1) What are appropriate VM settings for MongoDB on Linux?
2) (bonus) Does Mongo do some checking or optimization with the OS while data is being built? In other words, if I have built a large data set with suboptimal VM or I/O settings, does Mongo make assumptions during the memory-mapping process that keep it from taking advantage of better settings down the road?
Obviously I don't fully grok memory mapping under the hood (I was hoping I wouldn't have to).
Any help appreciated...thanks!  -j