Phusion Passenger 2.0.4 released; 37signals’s Ta-da List now using Passenger

Phusion Passenger is an Apache module for deploying Ruby on Rails web applications, and is mainly focused on ease of use and stability. Since its first release in April 2008, it has gained quite a lot of attention from the Rails community, and nowadays it has become a very popular deployment tool.

Tobias Lütke has recently announced that Shopify is now running Phusion Passenger:

“In conclusion: I cannot see any reason to choose a different deployment strategy at this point. Its simple, complete, fast and well documented.” — Tobias Lütke

Even 37signals is now using Phusion Passenger. They’ve recently announced that they’ve switched Ta-da List to Phusion Passenger:

“We’re really impressed with the ease of deployment and stability under Passenger. The app now requires less than 10 lines of configuration to launch and deploy.” — Joshua Sierles

Recent changes

Phusion Passenger is under constant maintenance and development. We are pleased to announce Phusion Passenger version 2.0.4. This is mainly a bugfix release, but it contains one new feature. Please read on for the full announcement.


Phusion Passenger is provided to the community for free. If you like Phusion Passenger, please consider sending us a donation. Thanks!

Hongli Lai & Ninh Bui


Upcoming Ruby Enterprise Edition improvements thanks to sponsorship campaign

Wow, the community has been on fire lately. 6 months after the first introduction of Phusion Passenger (our Rails deployment utility) and Ruby Enterprise Edition (which, in combination with Phusion Passenger, allows one’s Rails applications to use 33% less memory), people are still saying good things about us. 🙂

Tobias Lütke from Shopify has given us a lot of praise:

“At the same time Passenger introduced some tangible improvements. We switched to enterprise ruby to get the full benefit of the [Copy-On-Write] memory characteristics and we can absolutely confirm the memory savings of 30% some others have reported. This is many thousand dollars of savings even at today’s hardware prices.”

Not only that, 37signals has recently switched Ta-da List to Phusion Passenger. According to DHH, their system administrators have been very content with Phusion Passenger.

But there’s more.


We’ve been talking with DHH from 37signals about a sponsorship campaign to support the development of REE. We just received word that all funds have been secured. In the meantime, we had been working hard on developing REE, so we will be releasing the improvements, as well as announcing the sponsors, in the very near future. The improvements are, in a nutshell:

  • Integration with the RailsBench GC patches, allowing one to tweak the garbage collector for maximum performance.
  • Better Mac OS X support.
  • Better 64-bit support.
  • Better Solaris support.

Thank you, 37signals and other sponsors!

Stay tuned for more news.


Who’s using Ruby Enterprise Edition in production?

A while ago, someone asked who’s using Phusion Passenger in production. The positive responses were overwhelming; thanks to all who replied!

Ruby Enterprise Edition seems to get a bit less attention than Phusion Passenger, so we’re wondering how many people use Ruby Enterprise Edition in production. We need this information for marketing purposes, and seeing that we’re providing REE for free, we’d be really grateful if you could take some time to tell us. Please send replies to the mailing list.

And if you can, please also tell us which of your websites are powered by REE and how much traffic they get.

Thank you.


Blog compromised

Roger Pack just told me that my blog had been compromised: a popup would open when viewing it in Internet Explorer. I’ve just gotten rid of it. Thanks for the heads up, Roger.

My apologies to those who saw the popup. It seemed to be some kind of XSS vulnerability in WordPress. Sigh, I guess I’ll have to upgrade WordPress again. 🙁


Long-running requests, now a problem of the past

Do you have long-running requests in your web applications? Say, requests that can take 30 seconds or longer. Until now, these could run into web server queuing problems.

However, this is now a problem of the past. 37signals has sponsored the development of a new feature in Phusion Passenger called global queuing. Read more about this on


Optimizing RDoc

When it comes to documentation, there’s always room for improvement. I’ve been contributing to docrails for quite a while now. However, one issue that has always annoyed me is how slow RDoc is.

In particular, generating RDoc output for ActiveRecord is painfully slow. It takes 34.2 seconds on my machine. Using JRuby helps a little, seeing that it can do JIT and all, but not much: it still takes 29 seconds (including JVM startup time). I had hoped for a bigger performance improvement.

So a few days ago I started optimizing RDoc. And not without result. Here are the results of generating RDoc for the Ruby 1.8 standard library with RDoc running in MRI 1.8:
  • Before optimizing: 1624.8 seconds (about 27 minutes). I had to close all applications because RDoc ate almost 1.5 GB of memory.
  • After optimizing: 531.2 seconds (8 minutes and 51 seconds). RDoc now uses only about half the memory it used to need.
  • Performance improvement: 206%!


The results for generating RDoc for ActiveRecord are:

  • Before optimizing (MRI 1.8): 34.2 seconds
  • After optimizing (MRI 1.8): 29.9 seconds
  • Performance improvement: 14%

  • Before optimizing (MRI 1.9): 16.8 seconds
  • After optimizing (MRI 1.9): 13.2 seconds
  • Performance improvement: 27%

  • Before optimizing (JRuby): 29.0 seconds
  • After optimizing (JRuby): 24.5 seconds
  • Performance improvement: 18%


So what did I do? Read on!

Multi-threaded parsing

My laptop has a dual core CPU. It’s kind of a waste to see RDoc utilizing only a single core. RDoc can be parallelized by running 2 rdoc processes at the same time. But I’m never generating 2 different sets of RDoc documentation at the same time, so running 2 rdoc processes in parallel won’t do me any good. So I had to search for ways to parallelize a single rdoc process.

It turns out that RDoc’s code parsing phase is pretty easy to parallelize. It just creates a new parser object for every input file, and parses files sequentially. So I modified RDoc to create multiple worker threads during the parsing phase. The main thread will offer a list of filenames, and the worker threads will each consume filenames and parse the corresponding file as fast as they can.

What I did not try to do however, is making the parser itself multi-threaded. That’s just asking for problems. Those of you who are interested in multi-threaded programming, but aren’t experienced with it, should keep this in mind: try to keep your threading code as simple as possible, and make sure that your threads share as little data as possible. In my case, the only things that the threads share are the filename queue and the result aggregation array.
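The worker-thread pattern described above can be sketched like this. This is a minimal illustration, not RDoc's actual code; `parse_file` is a hypothetical stand-in for RDoc's per-file parser invocation. Note how the only shared objects are the two queues:

```ruby
require 'thread'

# Hypothetical stand-in for RDoc's per-file parser.
def parse_file(filename)
  "parsed #{filename}"
end

def parse_in_parallel(filenames, num_threads = 2)
  queue   = Queue.new   # shared: filenames waiting to be parsed
  results = Queue.new   # shared: aggregated parser output
  filenames.each { |f| queue << f }
  num_threads.times { queue << nil }  # one end-of-work sentinel per worker

  workers = num_threads.times.map do
    Thread.new do
      while (filename = queue.pop)    # nil sentinel terminates the loop
        results << parse_file(filename)
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end
```

`Queue` is thread-safe, so the workers need no explicit locking; each thread pulls the next filename as soon as it finishes the previous one.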

MRI 1.8 implements userspace threads, so having multiple threads doesn’t increase CPU core utilization. MRI 1.9 uses kernel threads, but it has a global interpreter lock, so it can’t utilize multiple CPU cores either. Luckily, JRuby doesn’t have a global interpreter lock. The parsing phase is now about 35% faster when run on JRuby on my dual-core machine.

This patch has been submitted upstream:

Cache template pages

For larger projects, the slowest phase of RDoc is probably the HTML generation phase. The latest version of RDoc uses ERB templates for HTML output. RDoc generates an HTML file for every class, every module and every source file. However, after some profiling with ruby-prof, I found out that RDoc recompiles the ERB template for every output file, which is slow.

I modified RDoc so that it caches compiled ERB templates. This made it about 19% faster.
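The idea can be sketched as follows (hypothetical names; RDoc's actual generator code differs). The compiled `ERB` object is stored in a hash keyed by template source, so each template is compiled once instead of once per output file:

```ruby
require 'erb'

# Cache of compiled ERB objects, keyed by template source.
TEMPLATE_CACHE = {}

# Compiling an ERB template (parsing it into Ruby code) is the
# expensive part; rendering a compiled template is cheap.
def render_template(template_source, caller_binding)
  erb = TEMPLATE_CACHE[template_source] ||= ERB.new(template_source)
  erb.result(caller_binding)
end
```

With this, generating a thousand output files from the same template compiles it only on the first call; every subsequent call is a pure render.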

This patch has been submitted upstream:

Reduce the number of relative URL generations

For every output file, RDoc generates a list of URLs to this output file, relative to the output filenames of all classes, files and methods. This is done by calling the function index_to_links, which runs in O(n) time. For every output file, index_to_links is called 3 times, each time with the full list of classes, files and methods. Suppose there are K output files, L classes, N files and M methods. Then this will result in K * (L+N+M) operations, which is quadratic! Ouch!

The RDoc authors already tried to optimize this by only calling index_to_links when the current output filename’s directory is different from the last one. For example, if the last output file was classes/active_support/cache/memcache_store.html and the current output file is classes/active_support/cache/memory_store.html, then index_to_links doesn’t need to be called again, and its last result can be reused. The problem is that the output filename list isn’t sorted, and so a directory change is detected much more often than it needs to be.

I optimized this by sorting the list. This resulted in an 8% performance improvement.
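The effect of sorting can be sketched with a small helper (an illustration, not RDoc's code): once the filenames are sorted, files in the same directory become adjacent, so the expensive recomputation runs once per directory instead of once per directory *change* in an arbitrary visiting order.

```ruby
# Count how often the cached result would have to be recomputed,
# i.e. how often the directory differs from the previous file's.
def count_recomputations(filenames)
  last_dir = nil
  count = 0
  filenames.each do |f|
    dir = File.dirname(f)
    if dir != last_dir
      count += 1        # index_to_links would be called here
      last_dir = dir
    end
  end
  count
end
```

For four files alternating between two directories, the unsorted order triggers four recomputations while the sorted order triggers only two; with hundreds of files per directory, the savings add up.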

This patch has been submitted upstream:

Reduce garbage and redundant operations in index_to_links

But the story doesn’t end there. index_to_links accepts two arguments: a filename and an array. index_to_links sorts its input array every time it is called. Remember that I said that index_to_links is O(n)? It’s not: it’s actually O(n log(n)) because of the sorting.

The arrays that are passed to index_to_links remain the same every time. So it’s very inefficient to keep sorting them over and over. I optimized this by:

  1. sorting the arrays only once.
  2. passing the sorted arrays to index_to_links.
  3. modifying index_to_links to not perform any sorting.
  4. modifying index_to_links to use in-place array manipulation methods as much as possible, to avoid generating unnecessary garbage objects.
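The before/after shape of steps 1–3 can be sketched like this (hypothetical function names; RDoc's real index_to_links does far more than build strings):

```ruby
# Before: the callee sorts its input on every call,
# paying O(n log n) and allocating a new array each time.
def links_sorting_inside(name, entries)
  entries.sort.map { |e| "#{name}/#{e}" }
end

# After: callers sort once up front and pass the already-sorted
# array; the callee assumes sorted input and does no extra work.
def links_presorted(name, sorted_entries)
  sorted_entries.map { |e| "#{name}/#{e}" }
end

entries = %w[zebra apple mango]
sorted  = entries.sort   # sort exactly once
# links_presorted("doc", sorted) can now be called many times cheaply.
```

Moving the sort out of the hot function is a classic hoisting optimization: the invariant work is done once instead of on every call.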

The result is a 14% overall performance improvement.

This patch has been submitted upstream:

Failed: parser output caching

Seeing that the parser isn’t very fast, I thought about caching its output. So I modified RDoc in the following manner:

  1. It Marshal.dumps the parser output to a cache file.
  2. It consults the cache file whenever possible, instead of parsing the input file.
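The two steps above can be sketched as follows (a minimal illustration under my own naming, with `parse` as a hypothetical stand-in for RDoc's parser; the cache is keyed by a digest of the source so stale entries are never reused):

```ruby
require 'digest'
require 'fileutils'

# Hypothetical stand-in for RDoc's parser.
def parse(source)
  { length: source.length, lines: source.lines.count }
end

# Parse with an on-disk Marshal cache: unchanged sources are
# loaded from the cache file instead of being re-parsed.
def parse_with_cache(source, cache_dir)
  FileUtils.mkdir_p(cache_dir)
  cache_file = File.join(cache_dir, Digest::SHA1.hexdigest(source))
  if File.exist?(cache_file)
    Marshal.load(File.binread(cache_file))          # warm run: hit
  else
    result = parse(source)
    File.binwrite(cache_file, Marshal.dump(result)) # cold run: fill
    result
  end
end
```

This only pays off when loading the marshalled data is cheaper than re-parsing the source, which, as described below, turned out not to hold for large inputs.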

This made the parsing phase about 20% slower on a cold run, but 75% faster on a warm run.

Unfortunately, this approach totally fails on large projects. When run on the Ruby 1.8 standard library, it results in cache files that are 6 MB or larger. Loading such a file is much slower than parsing the original source file, and uses more memory too. My laptop crashed during this experiment because Ruby used 2 GB of memory. So I abandoned this effort.

Final words

Take the performance improvement numbers that I’ve given with a grain of salt. These numbers were taken by running RDoc on its own sources, under MRI 1.8. The performance gain really depends on the input. You’ve already seen the difference in performance improvement between running RDoc on the Ruby 1.8 standard library, and running it on ActiveRecord.


This blog now officially Safe For Work(tm)

Latin America Rails Summit has been a very nice experience. It has been about a week since I got back home, and the jetlag I was experiencing is finally disappearing. I’ve talked to many interesting people, such as Chris Wanstrath from Github, Obie Fernandez from Hashrocket and Chad Fowler.

I heard an… uhm… “interesting” story from Chris. There was a time when he was reading my blog. His boss saw my blog’s banner and thought that he was viewing non-safe-for-work content. I heard similar things from several other people. Or, to quote Chad:

“When I met you at RailsConf, I was pleasantly surprised that your personality doesn’t match your pr0n website.” — Chad Fowler, referring to this blog

Or, as Tinco said, “the entire Ruby community wants to read your blog, but they can’t”.

I asked, why not use Adblock? “Too much of a hassle” was the answer I got.

I also pointed out that this is just my personal blog, and that it is clearly separated from the company blog. But the response was “But there’s too much interesting stuff on your blog!”

No more excuses!

Starting from today, there is no excuse anymore! I’ve added a “Censor banner” button, as you can see in this screenshot:


The state will be saved into a cookie, so even if you press Reload, the banner will stay hidden. Therefore, this blog is now officially Safe For Work(tm).

Update: it is now also possible to hide the banner by appending “?hide_banner=yes” to the URL, e.g.:

What’s with your blog’s banner anyway?

The girls you see in the banner are actually friends of mine. Yes, I know them in real life. Those who live in Japan or who watch anime are probably familiar with the term “cosplay”. These girls like cosplaying, and I like this particular photo that I shot of them, so I turned it into a blog header. With permission.


Eating our own dog food


“ doesn’t eat its own dog food. They’re using nginx/0.6.32, so does that mean they think Phusion Passenger sucks?”

That’s right, our website is behind Nginx. But that’s because the entire website consists only of static HTML! We didn’t use PHP, or Rails, or anything dynamic. The website is generated from a few input files with Webgen and rsync’ed regularly.

Our Rails apps do run on Phusion Passenger… behind Nginx.


default_value_for Rails plugin: declaratively define default values for ActiveRecord models

We’ve just released default_value_for, a plugin for declaratively defining default values for ActiveRecord models.


Who has experience with Qt 4 on OS X?

I’ve had pretty bad experiences with wxWidgets. Not that the toolkit itself is bad, but it’s lacking functionality and polish in various unexpected and awkward ways. The biggest turn-off for me is that it doesn’t support buttons with icons. Not only do I want my GUI apps to work well, I also want them to look nice and to integrate well into the environment. On Linux/GNOME, having buttons with icons is almost essential – not having icons just makes the GUI look plain and ugly. On Windows it can make a big difference as well when it comes to UI aesthetics. The wxWidgets developers commented that they won’t implement this because not all platforms (e.g. Windows) support it. I personally think this is nonsense – Delphi has supported buttons with icons since version 1.0 (for Windows 3.1). Besides, why not just implement it on platforms that do support it, and document it as such? This is already the case for things such as the flat button style.

Another thing I don’t like about wxWidgets is how it forces one to build the GUI top-down. One must first construct a parent container before one can construct child widgets. It’s not possible to construct an invisible child widget and then later on attach it to a parent. This seems to be a design decision influenced by limitations in Windows.

wxWidgets also has the tendency to lay out the GUI differently on different platforms. I usually develop wxWidgets applications on Linux, and port them to Windows later on. What usually happens is that the GUI looks fine on Linux, but totally breaks on Windows – buttons being laid out differently, controls that have the wrong size, etc. I usually end up having to fix the GUI code for Windows. Apparently wxWidgets has different layout implementations for different platforms, and they behave in subtly different ways.

The list can go on and on. But generally, wxWidgets feels clunky and awkward except for simple and standard user interfaces without a lot of dynamics. The differences in layout and resize behavior on different platforms seem to be bigger than the differences in CSS implementations in different browsers (with the exception of IE of course).

Qt 4 seems to be a good cross-platform GUI toolkit and doesn’t suffer from these issues. It looks very nice on Linux. However, I’ve seen Mac people flaming Qt for looking “totally miserable” on OS X. I couldn’t find any screenshots of Qt 4 on OS X so I can’t confirm whether that’s true. Does anybody have experience with Qt on OS X, and can show me some screenshots?

