Making Ruby’s garbage collector copy-on-write friendly, part 8
Hi folks, it has been a while since the last “Making Ruby’s garbage collector copy-on-write friendly” post. Many things have happened in the mean time, and my copy-on-write work is now usable (and used) in production environments, but it seems that there is still confusion. So I’ve decided to write a new post which explains the situation.
Copy-on-write updates
In March I submitted my work to the Ruby core mailing list. There has been some discussion. As a result, various people, including myself, have made a number of improvements.
The improvements are as follows:
- The copy-on-write friendly garbage collector is now a few % faster thanks to various micro-optimizations.
- The mark table implementation is now pluggable.
On Windows, a copy-on-write friendly garbage collector is totally useless because fork() is not supported on Windows. Furthermore, not all Ruby applications call fork(). So I’ve made two mark table implementations: one based on the old one (which marks objects directly by setting a flag on the object) and a copy-on-write friendly one. It is now possible to change the mark table implementation during runtime by calling GC.copy_on_write_friendly = (boolean value).
This has huge performance implications. The copy-on-write friendly mark table makes the garbage collector about 0%-20% slower, depending on the application and the workload. However, the non-copy-on-write friendly mark table is enabled by default, so by default there is only a 1% performance penalty. This performance penalty comes from the fact that marking an object now requires a function call which sets the mark flag, instead of setting the mark flag directly. But I think 1% is acceptable.
- Various little bugs in the debugging code have been fixed.
- Me and Ninh are working on a scientific paper regarding the copy-on-write work.
Unfortunately the discussion stranded. Matz had some concerns about performance, which is why I made the mark table implementation pluggable. I will re-submit the patch for further evaluation when the time is right.
Ruby Enterprise Edition
Many of you have probably heard of Ruby Enterprise Edition. There has been, and still is, a lot of fuss about the name. But that’s intentional and is all part of the plan — if people make a fuss about the name then it means we’re not in the Zone of Mediocrity.
What is Ruby Enterprise Edition? People thought it’s a closed source product, but in fact the website’s front page and download page has the following huge sticker:

(We actually added this sticker after we’ve seen that people think it’s going to be closed source.)
In one sentence:
Ruby Enterprise Edition is an easy to install Ruby interpreter that includes, among other things, my copy-on-write work.
Facts and myths:
- It’s open source, not closed source. It’s freely available to all.
- It’s not an entirely new Ruby implementation. It’s based on the official Ruby interpreter (MRI), version 1.8.6-p286. This means that all your existing Ruby applications are compatible with Ruby Enterprise Edition.
- It does not only include my copy-on-write work. There’s more. Read on.
- It is not a hostile fork, but a friendly one. The work included in Ruby Enterprise Edition is meant to be merged back to upstream at some point in the future.
- The copy-on-write work has been submitted to the Ruby core team in the past.
- Phusion Passenger is very well-integrated with Ruby Enterprise Edition. If you use Phusion Passenger in combination with Ruby Enterprise Edition, then your Rails applications will transparently use 33% less memory and will be faster, as if it’s magic. You don’t need to do anything special, it just works.
The only condition is that you must not be using conservative spawning in your application. But if you don’t know what conservative spawning is then you’re not using it, and you’ll have nothing to worry about.
Why was Ruby Enterprise Edition made?
Consider the following facts:
- My copy-on-write work can potentially save a lot of memory in Rails applications.
- The patch has been submitted to upstream, but hasn’t been accepted yet.
- There is a demand for lower memory usage in Rails applications, right now, not X months/years in the future.
Given the circumstances, and to satisfy the demands (including that of ourselves), we have decided that it would be best to maintain our own Ruby fork which includes these patches.
You might be wondering: Why not just release the patch? Why create a fork?
The answer is user friendliness. Telling people to download Ruby’s source code and apply a patch is not user friendly. In fact, to many people, it’s downright scary. Imagine that you want a transparent and easy way to make your Rails applications “magically” use 33% less memory. Which of the following instructions would you prefer?
Use Phusion Passenger to deploy your application. Then download the Ruby interpreter source code from www.ruby-lang.org. Download it and extract the tarball. Then, download this patch and apply it with this and that command. Then, run ‘./configure –prefix=/somewhere’. Make sure that /somewhere is not /usr in order to prevent overwriting your old Ruby installation, you don’t want that to happen. Then type ‘make’, and then ’sudo make install’. Then download RubyGems, extract it, and type ’sudo /somewhere/bin/ruby setup.rb’ in the RubyGems source folder. Then type ‘/somewhere/bin/gem install rails’ to install Ruby on Rails and whatever other gems you might need.
or:
Use Phusion Passenger to deploy your application. Then download Ruby Enterprise Edition. Run the installer and follow the instructions. Done.
The first one contains a lot of caveats. Many many things can go wrong. Many many people aren’t experienced in installing Ruby from source. It’s just easier if there’s a vendor that takes care of everything for you. And we are that vendor.
We want Phusion Passenger and everything surrounding it to have a “just works” experience.
So if it’s not just the copy-on-write work, then what else does Ruby Enterprise Edition include?
- This one is huge: by using Google’s tcmalloc, an alternative memory allocator, Ruby becomes 20% faster even with the copy-on-write friendly garbage collector! Furthermore, tcmalloc seems to be more copy-on-write friendly than ptmalloc2, Linux’s default memory allocator, so by using tcmalloc we can save even more memory!
We discovered this shortly after submitting the patch to the Ruby core mailing list. So Ruby Enterprise Edition also includes tcmalloc. - Ruby Enterprise Edition includes an easy-to-use installer which takes care of installing tcmalloc, Ruby, RubyGems and important/useful gems for you. It also teaches you how to tell Phusion Passenger to use Ruby Enterprise Edition instead of normal Ruby.
- In the future we might include more patches that might be useful in production environments.
Who’s already using Ruby Enterprise Edition?
I’m not sure because we’ve never asked our users. But the Ruby on Rails Wiki is running on it, and it has been great. I’ve been monitoring the Wiki for a while now, and ever since we’ve switched it to Phusion Passenger + Ruby Enterprise Edition, it has been rock-solid (before, it used to crash often). We also observed a great reduction in memory usage.
Michael Koziarski, a Rails core developer, runs Phusion Passenger with Ruby Enterprise Edition on his blog. He said that he downgraded his server because Phusion Passenger + Ruby Enterprise Edition saved him so much memory.
Final words
I hope this post has shed some light on matters. I’m just a little surprised that there’s all this confusion going on because all of this is also documented on the Ruby Enterprise Edition website’s FAQ. eustaquiorangel.com recently interviewed me and asked similar questions. You should check it out.
I’m also a little surprised that people seem to be reluctant about installing Ruby Enterprise Edition. If I have the choice between two products A and B, and B is the same as A but is much more efficient and is easy to install, then I’d choose B.
It is that people are suspicious about our claims? We’ve published a performance and memory usage comparison. Anybody can read this comparison, perform it himself, and check whether our claims are true. Everything we claim is verifiable so I don’t understand what there is to be suspicious about.
Please feel free to post your thoughts on this, I’d really like to hear what people have to say.

