Potential problems with preforking Ruby on Rails
In my previous blog entry, I wrote about using fork() and copy-on-write semantics to reduce memory usage in Ruby on Rails. Saimon Moore suggested that I contact Zed Shaw, author of Mongrel, so I asked him for his opinion and about potential problems. Unfortunately I don’t have permission to quote him, so I’ll just summarize the issues (with preforking in Rails) and my own findings.
Leaking I/O handles
It is said that Ruby leaks I/O handles when it forks. I really don’t know how in the world that is possible: when the child exits, all of its resources are freed, so there is no way for it to leak anything unless the parent process forgets to clean up something that it created before forking.
I wrote a script to test this:
require 'socket'

# Open a server socket in the parent before forking.
serv = TCPServer.new(2202)
puts "*** File descriptors in parent process:"
system("ls --color -l /proc/#{Process.pid}/fd")

pid = fork do
  # The child inherits the parent's socket and opens one of its own.
  serv = TCPServer.new(2203)
  puts "*** File descriptors in child process:"
  system("ls --color -l /proc/#{Process.pid}/fd")
  exit
end

Process.waitpid(pid)
# Close the parent's own server socket before listing again.
serv.close
puts "*** File descriptors in parent process:"
system("ls --color -l /proc/#{Process.pid}/fd")
The script creates a TCP server socket, then lists the process’s file descriptors. It then forks; the child creates another TCP server socket and lists its own file descriptors. The parent process waits for the child, closes its own server socket, then lists its file descriptors again. The output is:
*** File descriptors in parent process:
total 4
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 0 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 1 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 2 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 3 -> socket:[2300615]
*** File descriptors in child process:
total 5
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 0 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 1 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 2 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 3 -> socket:[2300615]
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 4 -> socket:[2300628]
*** File descriptors in parent process:
total 3
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 0 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 1 -> /dev/pts/0
lrwx------ 1 hongli hongli 64 2007-04-05 21:48 2 -> /dev/pts/0
Conclusion: Everything looks perfectly normal to me. I have no idea what “leaking IO handles” means.
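The same check can be made without reading /proc. What follows is only a minimal sketch under a few assumptions: the helper names fd_open? and fork_fd_report are made up for this example, the null device stands in for the TCP sockets so that no free ports are needed, and fork must be available on the platform. It reports whether the child sees a descriptor the parent opened before forking, and whether a descriptor opened in the child ever shows up in the parent.

```ruby
# Returns true if descriptor number +fd+ is open in the current process.
# IO.for_fd raises Errno::EBADF for a descriptor that is not open.
def fd_open?(fd)
  IO.for_fd(fd, autoclose: false)
  true
rescue Errno::EBADF
  false
end

# Forks one child and reports which descriptors each side can see.
def fork_fd_report
  inherited = File.open(File::NULL)      # opened before fork: the child gets a copy
  inherited_fd = inherited.fileno
  reader, writer = IO.pipe               # used by the child to report back

  pid = fork do
    reader.close
    extra = File.open(File::NULL)        # opened after fork: exists only in the child
    writer.puts("#{extra.fileno} #{fd_open?(inherited_fd)}")
    writer.close
    exit!(0)
  end

  writer.close
  child_fd, child_saw_inherited = reader.gets.split
  reader.close
  Process.waitpid(pid)
  inherited.close                        # the parent cleans up what it opened

  {
    child_sees_parent_fd: child_saw_inherited == "true",
    parent_sees_child_fd: fd_open?(child_fd.to_i),
    parent_fd_open_after_close: fd_open?(inherited_fd)
  }
end

puts fork_fd_report.inspect if $PROGRAM_NAME == __FILE__
```

If the reasoning above is right, only the first entry should be true: the child inherits what the parent had open, and nothing the child opens comes back to the parent.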
Reconnecting to the database
It is said that preforking will cause issues with database reconnections. I gave it a try.
- I preforked 2 Rails processes with my script.
- I set up lighttpd to only proxy to the first Rails process.
- I then visited a page in my Rails app which lists a bunch of records in the database.
- I stopped the MySQL server.
- I reloaded the page, and it threw an exception, which is to be expected.
- I started the MySQL server and reloaded the page. The page displayed fine.
- I set up lighttpd to only proxy to the second Rails process, and reloaded the page. The page still displayed fine.
Conclusion: I have no idea what database reconnection issues people are talking about. I can’t find any.
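For anyone who wants to be sure that a forked process never reuses a connection opened by its parent, one common guard is to remember which PID opened the connection and reconnect when the PID changes. The following is a minimal sketch, not what Rails or Mongrel actually do: ForkAwareConnection is a hypothetical wrapper, and the connector block returns a plain string as a stand-in for a real MySQL connection.

```ruby
# Hypothetical wrapper: hands out a connection that belongs to the current
# process, opening a fresh one the first time it is used after a fork.
class ForkAwareConnection
  def initialize(&connector)
    @connector  = connector   # block that opens a new connection
    @connection = nil
    @owner_pid  = nil
  end

  def connection
    if @connection.nil? || @owner_pid != Process.pid
      # Either nothing is open yet, or the connection was inherited from
      # the parent process. Open a new one for this process.
      @connection = @connector.call
      @owner_pid  = Process.pid
    end
    @connection
  end
end

if $PROGRAM_NAME == __FILE__
  # Stand-in connector; a real application would open a database connection here.
  db = ForkAwareConnection.new { "connection opened by PID #{Process.pid}" }
  puts "parent: #{db.connection}"
  pid = fork { puts "child:  #{db.connection}" }
  Process.waitpid(pid)
end
```

The sketch only decides when to reconnect. What to do with the inherited connection object in the child is a separate question that depends on the database driver.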
Sharing issues with pstore and SQLite
I don’t use SQLite, and don’t plan on using it any time soon, so I didn’t test this. I use SQLSessionStore for storing session data in MySQL, so pstore issues don’t affect me directly. Pstore is the default session storage in Rails.
Pstore stores session data in files. Imagine two HTTP clients, with the same session ID, accessing two different Rails processes. Both Rails processes write session data to disk. What happens? Will the pstore session file be corrupted? Zed said that even Mongrel (without preforking) has problems with pstore sharing, so it’s possible that Rails doesn’t lock the pstore session file.
I tested my own Rails application, which uses SQLSessionStore:
- I launched 2 Rails processes.
- I added the following functions to a controller:
def read
  if session[:rand].nil?
    render :text => "No random number set."
  else
    render :text => session[:rand]
  end
end

def write
  session[:rand] = rand
  read
end
The write method generates a random number and saves it in the session. The read method reads the last saved number.
- I set up lighttpd to only use Rails process 1.
- I visited the ‘write’ page, then set up lighttpd to use Rails process 2. I then visited the ‘read’ page. The number is still correct.
- I repeated this a few times, and couldn’t find any problems.
Conclusion: I don’t know whether pstore has problems, but SQLSessionStore seems to work fine. It’s a good idea to use SQLSessionStore anyway, as pstore slows down when you have a lot of sessions, and SQLSessionStore makes it easy to wipe idle session data.
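To illustrate what locking a shared session file would involve, here is a minimal sketch that uses File#flock to serialize writes from several forked processes. It is a generic illustration, not how pstore or Rails is implemented: the names locked_increment and run_writers are made up for this example, and a plain counter file stands in for a session file.

```ruby
require 'tmpdir'

# Adds +amount+ to the integer stored in +path+ while holding an exclusive
# lock, so concurrent writers cannot overwrite each other's updates.
def locked_increment(path, amount = 1)
  File.open(path, File::RDWR | File::CREAT, 0644) do |f|
    f.flock(File::LOCK_EX)        # blocks until no other process holds the lock
    value = f.read.to_i + amount  # an empty file counts as 0
    f.rewind
    f.write(value.to_s)
    f.flush
    f.truncate(f.pos)
    value
  end                             # closing the file releases the lock
end

# Forks +processes+ children that each increment the counter +writes+ times,
# then returns the final value stored in the file.
def run_writers(path, processes: 2, writes: 100)
  pids = Array.new(processes) do
    fork do
      writes.times { locked_increment(path) }
      exit!(0)
    end
  end
  pids.each { |pid| Process.waitpid(pid) }
  File.read(path).to_i
end

if $PROGRAM_NAME == __FILE__
  Dir.mktmpdir do |dir|
    puts run_writers(File.join(dir, "counter"))
  end
end
```

With the lock in place, the final value should equal the number of processes times the number of writes. Without it, two processes can read the same old value and one of the updates is lost.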
Garbage collection makes pages dirty
According to this page, Ruby’s mark-and-sweep garbage collection makes all memory pages dirty, causing almost all of the child process’s memory to be copied. In my previous blog entry, I ran httperf to test preforked Rails. Rails creates a new ActionController object every time an HTTP request comes in, so using httperf will definitely activate garbage collection. Yet the memory usage didn’t increase as much as the page predicted it would.
I have a Perl application which uses about 35 MB of memory. 25 MB of that is spent on storing the parsed Perl optree, and only 10 MB is spent on storing runtime data. I suspect that Ruby is similar: most of the memory is spent on storing Rails code, not variable data. Code is probably never garbage collected (why would it be? In a dynamic language one cannot predict whether a function will be used in the future), so the garbage collector probably wouldn’t mark the pages containing Ruby opcodes as dirty. This would explain why memory usage doesn’t go up much after a number of HTTP requests.
Conclusion: I can’t find the problem. Nothing to worry about.
Final conclusion
I couldn’t find any large problems that were relevant to me. In the future I will test this preforking technique on a busy (non-commercial) website to see how well it works.

Ezra said,
April 5, 2007 @ 11:01 pm
With the rails config settings you showed in your last post:
config.cache_classes = false
config.whiny_nils = true
config.breakpoint_server = false
config.action_controller.consider_all_requests_local = true
config.action_controller.perform_caching = false
config.action_view.cache_template_extensions = false
config.action_view.debug_rjs = false
You are basically running in development mode which means rails does a lot of reloading of classes and db handles on each request. In production mode this does not happen. Please use a standard rails production mode config with caching turned on and try your experiments again. I’m interested to see what you find out.
Hongli said,
April 6, 2007 @ 12:40 am
Oops, you’re right, I forgot to disable reloading of classes. I’ve fixed my blog post now, thank you for pointing this out.
The memory savings turned out to be far better than expected: from 55% to 75%!
Marcin Raczkowski said,
May 29, 2007 @ 9:41 pm
problem with database connections is that when you try to do 2 queries CONCURRENTLY two processes use same descriptor which can cause unexpected behavior - i’m currently working on fixing that issue with my mongrel modification - it should allow mongrel to fork on request, i’m using your script for preloading - and i’m going to release my modifications on compatible OS license
greets
Marcin Raczkowski
Hongli said,
May 30, 2007 @ 12:02 am
What I do is forking before any requests (but after Rails has been loaded). Wouldn’t that solve all problems?
Ruby Developer said,
December 6, 2007 @ 8:16 pm
I don’t think the potential risks are worth it. Let’s stay away from Forking people, unless you want to spend endless nights of Christmas fixing stuff.
Jenn
Hongli said,
December 6, 2007 @ 9:57 pm
Jenn, what “potential risks” are you talking about? Forking is a well-known and well-understood concept. Whatever problems that may arise are simply implementation issues that can be fixed. It’s not black magic.