Archive for Ruby

Optimizing RDoc

When it comes to documentation, there’s always room for improvement. I’ve been contributing to docrails for quite a while now. However, one issue that has always annoyed me is how slow RDoc is.

In particular, generating RDoc output for ActiveRecord is painfully slow. It takes 34.2 seconds on my machine. Using JRuby helps a little, seeing that it can do JIT and all, but not much: it still takes 29 seconds (including JVM startup time). I had hoped for a bigger performance improvement.

So a few days ago I started optimizing RDoc. And not without result. Here are the results of generating RDoc for the Ruby 1.8 standard library with RDoc running in MRI 1.8:
Before optimizing: 1624.8 seconds (27 minutes). I had to close all applications because RDoc ate almost 1.5 GB of memory.
After optimizing: 531.2 seconds (8 minutes and 51 seconds). RDoc now uses only about half the memory it used to need.
Performance improvement: 206%!

[Chart: RDoc generation time for the Ruby 1.8 standard library]

The results for generating RDoc for ActiveRecord are:

Before optimizing (MRI 1.8): 34.2 seconds
After optimizing (MRI 1.8): 29.9 seconds
Performance improvement: 14%

Before optimizing (MRI 1.9): 16.8 seconds
After optimizing (MRI 1.9): 13.2 seconds
Performance improvement: 27%

Before optimizing (JRuby): 29.0 seconds
After optimizing (JRuby): 24.5 seconds
Performance improvement: 18%

[Chart: RDoc generation time for ActiveRecord]

So what did I do? Read on!

Multi-threaded parsing

My laptop has a dual core CPU. It’s kind of a waste to see RDoc utilizing only a single core. RDoc can be parallelized by running 2 rdoc processes at the same time. But I’m never generating 2 different sets of RDoc documentation at the same time, so running 2 rdoc processes in parallel won’t do me any good. So I had to search for ways to parallelize a single rdoc process.

It turns out that RDoc’s code parsing phase is pretty easy to parallelize. It just creates a new parser object for every input file, and parses files sequentially. So I modified RDoc to create multiple worker threads during the parsing phase. The main thread will offer a list of filenames, and the worker threads will each consume filenames and parse the corresponding file as fast as they can.
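
A minimal sketch of that worker-thread structure, assuming a hypothetical parse_file method and a filenames array (this is not RDoc’s actual code):

Ruby
  require 'thread'

  THREAD_COUNT = 2                       # e.g. one worker per CPU core
  queue   = Queue.new                    # thread-safe filename queue
  results = Queue.new                    # thread-safe result aggregation

  filenames.each { |f| queue << f }      # filenames: the input files (placeholder)
  THREAD_COUNT.times { queue << nil }    # one "stop" sentinel per worker

  workers = (1..THREAD_COUNT).map do
    Thread.new do
      while (filename = queue.pop)       # the nil sentinel ends the loop
        results << parse_file(filename)  # parse_file is hypothetical
      end
    end
  end

  workers.each { |t| t.join }
  parsed = []
  parsed << results.pop until results.empty?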

What I did not try to do, however, is make the parser itself multi-threaded. That’s just asking for problems. Those of you who are interested in multi-threaded programming, but aren’t experienced with it, should keep this in mind: try to keep your threading code as simple as possible, and make sure that your threads share as little data as possible. In my case, the only things that the threads share are the filename queue and the result aggregation array.

MRI 1.8 implements userspace threads, so having multiple threads doesn’t increase CPU core utilization. MRI 1.9 uses kernel threads, but it has a global interpreter lock, so it can’t utilize multiple CPU cores either. Luckily, JRuby doesn’t have a global interpreter lock. The parsing phase is now about 35% faster, when run on JRuby on my dual core machine.

This patch has been submitted upstream: http://rubyforge.org/tracker/index.php?func=detail&aid=22555&group_id=627&atid=2474

Cache template pages

For larger projects, the slowest phase of RDoc is probably the HTML generation phase. The latest version of RDoc uses ERB templates for HTML output. RDoc generates an HTML file for every class, every module and every source file. However, after some profiling with ruby-prof, I found out that RDoc recompiles the ERB template for every output file, which is slow.

I modified RDoc so that it caches compiled ERB templates. This made it about 19% faster.
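
The fix is essentially memoization. A simplified sketch of the idea (not the actual patch):

Ruby
  require 'erb'

  # Compiled ERB objects are cached per template text, so each template is
  # parsed and compiled only once instead of once per generated page.
  TEMPLATE_CACHE = {}

  def compiled_template(template_text)
    TEMPLATE_CACHE[template_text] ||= ERB.new(template_text)
  end

  # Rendering an output page then reuses the compiled template:
  #   html = compiled_template(page_template).result(binding)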

This patch has been submitted upstream: http://rubyforge.org/tracker/index.php?func=detail&aid=22556&group_id=627&atid=2474

Reduce the number of relative URL generations

For every output file, RDoc generates a list of URLs to this output file, relative to the output filenames of all classes, files and methods. This is done by calling the function index_to_links, which runs in O(n) time. For every output file, index_to_links is called 3 times, each time with the full list of classes, files and methods. Suppose there are K output files, L classes, N files and M methods. Then this will result in K * (L+N+M) operations, which is quadratic! Ouch!

The RDoc authors already tried to optimize this by only calling index_to_links when the current output filename’s directory is different from the last one. For example, if the last output file was classes/active_support/cache/memcache_store.html and the current output file is classes/active_support/cache/memory_store.html, then index_to_links doesn’t need to be called again, and its last result can be reused. The problem is that the output filename list isn’t sorted, and so a directory change is detected much more often than it needs to be.

I optimized this by sorting the list. This resulted in an 8% performance improvement.
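
Roughly, the change amounts to the following, where output_files, all_entries, index_to_links and generate_page stand in for the real code:

Ruby
  cached_links = nil
  last_dir     = nil

  # Sorting first means consecutive files usually share a directory, so the
  # expensive index_to_links call can be skipped for most of them.
  output_files.sort.each do |path|
    dir = File.dirname(path)
    if dir != last_dir
      cached_links = index_to_links(path, all_entries)
      last_dir     = dir
    end
    generate_page(path, cached_links)
  end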

This patch has been submitted upstream: http://rubyforge.org/tracker/index.php?func=detail&aid=22565&group_id=627&atid=2474

Reduce garbage and redundant operations in index_to_links

But the story doesn’t end there. index_to_links accepts two arguments: a filename and an array. index_to_links sorts its input array every time it is called. Remember that I said that index_to_links is O(n)? It’s not: it’s actually O(n log(n)) because of the sorting.

The arrays that are passed to index_to_links remain the same every time. So it’s very inefficient to keep sorting them over and over. I optimized this by:

  1. sorting the arrays only once.
  2. passing the sorted arrays to index_to_links.
  3. modifying index_to_links to not perform any sorting.
  4. modifying index_to_links to use in-place array manipulation methods as much as possible, to avoid generating unnecessary garbage objects.
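
Sketched with placeholder names (link_for and the entry lists are hypothetical), the idea looks like this:

Ruby
  # Sort each shared list a single time, up front...
  sorted_classes = classes.sort_by { |c| c.name }
  sorted_files   = files.sort_by   { |f| f.name }
  sorted_methods = methods.sort_by { |m| m.name }

  # ...and let index_to_links assume sorted input, filling a reusable buffer
  # in place instead of allocating and sorting a new array on every call.
  def index_to_links(output_path, sorted_entries, buffer = [])
    buffer.clear
    sorted_entries.each do |entry|
      buffer << link_for(output_path, entry)   # link_for is hypothetical
    end
    buffer
  end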

The result is a 14% overall performance improvement.

This patch has been submitted upstream: http://rubyforge.org/tracker/index.php?func=detail&aid=22557&group_id=627&atid=2474

Failed: parser output caching

Seeing that the parser isn’t very fast, I thought about caching its output. So I modified RDoc in the following manner:

  1. It Marshal.dumps the parser output to a cache file.
  2. It consults the cache file whenever possible, instead of parsing the input file.
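
The caching itself was straightforward; roughly like this (the cache directory, parse_file and the result type are placeholders):

Ruby
  require 'digest/md5'

  CACHE_DIR = ".rdoc_cache"
  Dir.mkdir(CACHE_DIR) unless File.directory?(CACHE_DIR)

  def parse_with_cache(filename)
    cache_file = File.join(CACHE_DIR, Digest::MD5.hexdigest(filename))
    if File.exist?(cache_file) && File.mtime(cache_file) >= File.mtime(filename)
      # Warm run: load the previously serialized parser output.
      Marshal.load(File.read(cache_file))
    else
      # Cold run: parse normally, then serialize the result for next time.
      result = parse_file(filename)              # parse_file is hypothetical
      File.open(cache_file, "wb") { |f| f.write(Marshal.dump(result)) }
      result
    end
  end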

This made the parsing phase about 20% slower on a cold run, but 75% faster on a warm run.

Unfortunately, this approach totally fails on large projects. When run on the Ruby 1.8 standard library, it results in cache files that are 6 MB or larger. Loading such a file is much slower than parsing the original source file, and uses more memory too. My laptop crashed during this experiment because Ruby used 2 GB of memory. So I abandoned this effort.

Final words

Take the performance improvement numbers that I’ve given with a grain of salt. These numbers were taken by running RDoc on its own sources, under MRI 1.8. The performance gain really depends on the input. You’ve already seen the difference in performance improvement between running RDoc on the Ruby 1.8 standard library, and running it on ActiveRecord.


Making Ruby’s garbage collector copy-on-write friendly, part 8

Hi folks, it has been a while since the last “Making Ruby’s garbage collector copy-on-write friendly” post. Many things have happened in the meantime, and my copy-on-write work is now usable (and used) in production environments, but it seems that there is still confusion. So I’ve decided to write a new post which explains the situation.

Copy-on-write updates

In March I submitted my work to the Ruby core mailing list. There has been some discussion. As a result, various people, including myself, have made a number of improvements.

The improvements are as follows:

  • The copy-on-write friendly garbage collector is now a few percent faster thanks to various micro-optimizations.
  • The mark table implementation is now pluggable.

    On Windows, a copy-on-write friendly garbage collector is totally useless because fork() is not supported there. Furthermore, not all Ruby applications call fork(). So I’ve made two mark table implementations: one based on the old one (which marks objects directly by setting a flag on the object) and a copy-on-write friendly one. It is now possible to change the mark table implementation at runtime by calling GC.copy_on_write_friendly = (boolean value); a short usage sketch follows this list.

    This has huge performance implications. The copy-on-write friendly mark table makes the garbage collector about 0%-20% slower, depending on the application and the workload. However, the non-copy-on-write friendly mark table is enabled by default, so by default there is only a 1% performance penalty. This performance penalty comes from the fact that marking an object now requires a function call which sets the mark flag, instead of setting the mark flag directly. But I think 1% is acceptable.

  • Various little bugs in the debugging code have been fixed.
  • Ninh and I are working on a scientific paper about the copy-on-write work.
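
To give an idea of how the pluggable mark table is meant to be used: a forking application server that preloads its code could do something like the following (the preload and worker steps are placeholders):

Ruby
  # Only available in Ruby builds that ship the pluggable mark table (such as
  # Ruby Enterprise Edition), hence the respond_to? guard.
  if GC.respond_to?(:copy_on_write_friendly=)
    GC.copy_on_write_friendly = true   # switch to the COW-friendly mark table
  end

  # load_application   # hypothetical: preload the app in the parent process

  fork do
    # Child processes now share most heap pages with the parent, because the
    # GC keeps its mark bits outside the objects instead of writing into them.
    # run_worker       # hypothetical request-handling loop
  end
  Process.wait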

Unfortunately, the discussion stalled. Matz had some concerns about performance, which is why I made the mark table implementation pluggable. I will re-submit the patch for further evaluation when the time is right.

Ruby Enterprise Edition

Many of you have probably heard of Ruby Enterprise Edition. There has been, and still is, a lot of fuss about the name. But that’s intentional and is all part of the plan — if people make a fuss about the name then it means we’re not in the Zone of Mediocrity. :)

What is Ruby Enterprise Edition? People thought it was a closed source product, but in fact the website’s front page and download page have the following huge sticker:

(We actually added this sticker after we saw that people thought it was going to be closed source.)

In one sentence:
Ruby Enterprise Edition is an easy-to-install Ruby interpreter that includes, among other things, my copy-on-write work.

Facts and myths:

  • It’s open source, not closed source. It’s freely available to all.
  • It’s not an entirely new Ruby implementation. It’s based on the official Ruby interpreter (MRI), version 1.8.6-p286. This means that all your existing Ruby applications are compatible with Ruby Enterprise Edition.
  • It does not only include my copy-on-write work. There’s more. Read on.
  • It is not a hostile fork, but a friendly one. The work included in Ruby Enterprise Edition is meant to be merged back to upstream at some point in the future.
  • The copy-on-write work has been submitted to the Ruby core team in the past.
  • Phusion Passenger is very well-integrated with Ruby Enterprise Edition. If you use Phusion Passenger in combination with Ruby Enterprise Edition, then your Rails applications will transparently use 33% less memory and will be faster, as if it’s magic. You don’t need to do anything special, it just works.

    The only condition is that you must not be using conservative spawning in your application. But if you don’t know what conservative spawning is then you’re not using it, and you’ll have nothing to worry about.

Why was Ruby Enterprise Edition made?

Consider the following facts:

  • My copy-on-write work can potentially save a lot of memory in Rails applications.
  • The patch has been submitted to upstream, but hasn’t been accepted yet.
  • There is a demand for lower memory usage in Rails applications, right now, not X months/years in the future.

Given the circumstances, and to satisfy the demands (including our own), we have decided that it would be best to maintain our own Ruby fork which includes these patches.

You might be wondering: Why not just release the patch? Why create a fork?

The answer is user friendliness. Telling people to download Ruby’s source code and apply a patch is not user friendly. In fact, to many people, it’s downright scary. Imagine that you want a transparent and easy way to make your Rails applications “magically” use 33% less memory. Which of the following instructions would you prefer?

Use Phusion Passenger to deploy your application. Then download the Ruby interpreter source code from www.ruby-lang.org and extract the tarball. Then, download this patch and apply it with this and that command. Then, run ‘./configure --prefix=/somewhere’. Make sure that /somewhere is not /usr in order to prevent overwriting your old Ruby installation, you don’t want that to happen. Then type ‘make’, and then ‘sudo make install’. Then download RubyGems, extract it, and type ‘sudo /somewhere/bin/ruby setup.rb’ in the RubyGems source folder. Then type ‘/somewhere/bin/gem install rails’ to install Ruby on Rails and whatever other gems you might need.

or:

Use Phusion Passenger to deploy your application. Then download Ruby Enterprise Edition. Run the installer and follow the instructions. Done.

The first one contains a lot of caveats. Many many things can go wrong. Many many people aren’t experienced in installing Ruby from source. It’s just easier if there’s a vendor that takes care of everything for you. And we are that vendor.

We want Phusion Passenger and everything surrounding it to have a “just works” experience.

So if it’s not just the copy-on-write work, then what else does Ruby Enterprise Edition include?

  • This one is huge: by using Google’s tcmalloc, an alternative memory allocator, Ruby becomes 20% faster even with the copy-on-write friendly garbage collector! Furthermore, tcmalloc seems to be more copy-on-write friendly than ptmalloc2, Linux’s default memory allocator, so by using tcmalloc we can save even more memory!
    We discovered this shortly after submitting the patch to the Ruby core mailing list. So Ruby Enterprise Edition also includes tcmalloc.
  • Ruby Enterprise Edition includes an easy-to-use installer which takes care of installing tcmalloc, Ruby, RubyGems and important/useful gems for you. It also teaches you how to tell Phusion Passenger to use Ruby Enterprise Edition instead of normal Ruby.
  • In the future we might include more patches that might be useful in production environments.

Who’s already using Ruby Enterprise Edition?

I’m not sure because we’ve never asked our users. But the Ruby on Rails Wiki is running on it, and it has been great. I’ve been monitoring the Wiki for a while now, and ever since we’ve switched it to Phusion Passenger + Ruby Enterprise Edition, it has been rock-solid (before, it used to crash often). We also observed a great reduction in memory usage.

Michael Koziarski, a Rails core developer, runs Phusion Passenger with Ruby Enterprise Edition on his blog. He said that he downgraded his server because Phusion Passenger + Ruby Enterprise Edition saved him so much memory.

Final words

I hope this post has shed some light on matters. I’m just a little surprised that there’s all this confusion going on because all of this is also documented on the Ruby Enterprise Edition website’s FAQ. eustaquiorangel.com recently interviewed me and asked similar questions. You should check it out.

I’m also a little surprised that people seem to be reluctant about installing Ruby Enterprise Edition. If I have the choice between two products A and B, and B is the same as A but is much more efficient and is easy to install, then I’d choose B.
Is it that people are suspicious of our claims? We’ve published a performance and memory usage comparison. Anybody can read this comparison, reproduce it, and check whether our claims are true. Everything we claim is verifiable, so I don’t understand what there is to be suspicious about.

Please feel free to post your thoughts on this, I’d really like to hear what people have to say.


Readable test names in Rails 2.1

I’m currently working on a Rails application. It has some non-trivial business rules, so I ended up writing test methods along the lines of:

Ruby
  def test_a_message_with_a_password_protected_channel_as_recipient_will_be_delivered_to_a_users_mailbox_if_that_user_is_subscribed_to_said_channel

Okaaaay….. This is what I think of that method name:
Nice boat!

Other than the mental-psychological stress as well as an unexplainable impending feeling of doom that such a long method name brings forth in our minds, one has to ask oneself whether it is morally justified to unintentionally punish whoever will ever read this code by presenting them with such a NICE BOAT, even if said person deserves it.

As much as I love RSpec, it doesn’t feel totally appropriate to use it, because then everything must start with “it”. Unfortunately not all rules in my application can be described with sentences that start with “it”. Rails edge has a solution though. You can define test methods in a declarative style ala RSpec, like this:

Ruby
  1. test "an anime should be invalid if any of its characters are invalid" do
  2.   # Your usual test code here.
  3. end

After staring at my test cases for the 2^32th time in a futile attempt to understand what the test method names are actually trying to tell me, I gave up and decided that it’s time to dig up a series that I’ve been trying to finish for the past 3 months. Surely only this will save me from going completely insane.
[Image: Horo looking surprised. Caption: “The face of un-insanity (?)”]

There Chu Yeow, I did what you asked me to do. ;)

[Image: Horo throwing a tantrum. Caption: “Baka baka baka! ….or maybe not.”]

Anyway, under the hood, the test method translates that block to:

Ruby
  def test_an_anime_should_be_invalid_if_any_of_its_characters_are_invalid

That’s nice. But it could be nicer. I don’t want to upgrade to Edge, so I decided to copy & paste the ‘test’ method from ActiveSupport edge (it’s only 6 lines). And yesterday I found out that it’s apparently possible for method names to contain arbitrary binary data, except “\0”. So you can do this:

Ruby
  Object.send(:define_method, "omg\1wtf\n!@$%^&*()") do
    "abc"
  end
  Object.new.send("omg\1wtf\n!@$%^&*()")   # => "abc"

Cool! So this means we can have RSpec-style test method names even when using Test::Unit. :)

So I modified the test method a little bit. Copy and paste this into your test/test_helper.rb to use it:

Ruby
  def self.test(name, &block)
    test_name = "test: #{name.squish}".to_sym
    defined = instance_method(test_name) rescue false
    raise "#{test_name} is already defined in #{self}" if defined
    define_method(test_name, &block)
  end

My test method now becomes:

Ruby
  1. test "a message with a password protected channel as recipient will be delivered to a user’s mailbox, if that user is subscribed to said channel" do
  2.    …
  3. end


Ruby 1.8.6-p230/1.8.7 broke your app? Ruby Enterprise Edition to the rescue!

Ruby 1.8.6-p230/1.8.7 include fixes for the recently discovered security vulnerabilities, but they also break some apps. We’ve backported the security patches to 1.8.6-p111 and made a Ruby Enterprise Edition release based on that. See http://blog.phusion.nl/2008/06/23/ruby-186-p230187-broke-your-app-ruby-enterprise-edition-to-the-rescue/ for details.


Does Ruby have a distribution problem?

Luke Kanies wrote an article, in which he says that Ruby has a distribution problem. It criticizes Rails applications for vendoring a lot of stuff, and criticizes RubyGems for not being able to handle native package dependencies, among other things.

I beg to disagree. The described problem is not a pure Ruby problem: it’s a general software distribution problem. I’m concerned that people would use this as another “reason” to oppose Ruby, even though it’s not specific to Ruby.

My response

I posted a reply to Luke’s article, and it was as follows:

Luke, I don’t really understand what you’re expecting from the Ruby/Rails community. You’re dealing with cross-platform distribution issues. That’s hard by its very nature. I was a developer in the Autopackage project (www.autopackage.org), a software packaging system that works across multiple Linux distributions. I’m currently a developer of Passenger (www.modrails.com). We’ve run into many of the issues that you mention here.

If I understand it correctly, you’re claiming that vendoring stuff is bad because:

  • You cannot vendor everything (glibc, web browser, etc).
  • As someone else has mentioned, vendoring stuff creates potential security issues. This is not unlike static linking in C/C++ projects.

On the other hand, vendoring stuff does have benefits:

  • Guaranteed compatibility. Suppose you rely on GTK 2.2. One fateful day, GTK 2.2.5 was released, but accidentally introduced a regression, and now your application fails left and right. Uh oh. If you vendored GTK then that wouldn’t have happened. This is not a theoretical possibility: GTK 2.2.15 or something actually broke AbiWord. (I’m not saying that vendoring GTK is a good idea, as GTK is quite large. I’m just pointing out a benefit of vendoring.)
  • Less installation hassle. Not all platforms have good packaging systems. As far as I know, Debian-based distros are the only ones. RedHat-based YUM repositories tend to be quite small compared to Debian’s APT repositories. MacOS X and Windows don’t have native package management systems at all. If your app is cross-platform, then vendoring stuff is a lot easier for both the developer and the end user.

Vendoring stuff is not only common in the Java world, but also in the Windows and MacOS X world. Windows apps tend to bundle all their dependencies (with the exception of obvious stuff, such as Internet Explorer). How many games have you seen that bundle DirectX? Actually I’d say that not vendoring stuff is only common in Linux, and languages that have strong ties to Linux, such as Perl. Here’s where Debian’s package management system and huge package repositories really shine.

There’s almost no common ground in the world of package management. We at Autopackage had to invent our own dependency resolution mechanism (similar to APT) because there’s no lowest common denominator, even amongst Linux distros. RubyGems is probably created for the same reason: not all Ruby-supported platforms have (decent) package management, so they just wrote their own. Autopackage experimented with native package management integration (i.e. being able to use the system’s native package manager to resolve dependencies) but that proved to be much, much harder than initially expected, and up until today that feature still isn’t finished. So I don’t think you can reasonably expect the Ruby community to do something about this. It’s not a pure Ruby problem: it’s a general software distribution problem.

As for being FHS-compliant: I can only say “don’t bother” if your application is cross-platform. It seems that only hardcore Linux users care about that. Outside the Linux world, FHS is being criticized by pretty much everyone. Windows and OS X users complain that application files in Linux are scattered everywhere, instead of being self-contained. Scattering files isn’t a problem in Linux because of package management, but it is a problem in all other platforms that don’t have decent package management. I’ve found that being FHS-compliant is more trouble than it’s worth.

So how do you solve this problem? I don’t think it’s possible to come up with a general silver bullet solution. So that leaves the following choices to you, the developer:

  • Create a native package for every platform that you support, i.e. .deb for Ubuntu/Debian, .rpm for RedHat, another .rpm for Mandriva and SuSE because their package names are different, a .exe for Windows, .dmg for MacOS X, .tgz for Slackware, .??? for Solaris, etc.
  • Write a cross-platform installer. Passenger/mod_rails chose this option because it’s a lot easier than the first one. Passenger depends on Ruby packages (Rails, fastthread, Rake, etc.) as well as native packages (GCC, Apache, APR). It checks whether all dependencies are available, and if not, it tells the user how to install those dependencies. We’ve put platform autodetection and Linux distro autodetection code in the installer. So on Debian/Ubuntu it would tell you to run “apt-get install apache2-prefork-dev” while on Fedora/RHEL/CentOS it would tell you to run “yum install apache-devel”. We’ve found that this approach works extremely well.
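
To make the last point concrete: the kind of hint logic involved can be as simple as the following toy version with made-up checks (this is not Passenger’s installer code):

Ruby
  def apache_dev_install_command
    if File.exist?("/etc/debian_version")
      "apt-get install apache2-prefork-dev"
    elsif File.exist?("/etc/redhat-release")
      "yum install apache-devel"
    end
  end

  # apxs/apxs2 is the Apache extension tool; its absence suggests that the
  # Apache development headers are missing.
  unless system("which apxs > /dev/null 2>&1") || system("which apxs2 > /dev/null 2>&1")
    puts "Dependency missing: Apache development headers."
    if (command = apache_dev_install_command)
      puts "Please install them with: #{command}"
    else
      puts "Please install the Apache development headers for your platform."
    end
  end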

Finally, we at Autopackage fully recognized the pros and cons of vendoring/static linking. Autopackage recommends the following: dynamically depend on stuff that is common, but vendor/statically link stuff that is uncommon. We believe that this is a good trade-off between the pros and cons. Passenger follows this recommendation as well: we vendor the Boost C++ library. Few people have Boost installed, and when they have it installed it often isn’t the version that Passenger requires. Installing Boost is a huge, huge pain on MacOS X. In this case, the benefits that vendoring Boost gives us outweigh the cons by far.

On the other hand, Apache is fairly common, and easy to install on most platforms. Rake, fastthread, etc. are also easy to install because of RubyGems. That’s why we dynamically depend on those things instead of vendoring them.

So it all boils down to making the right choices and correctly balancing the pros and cons of vendoring. There’s no silver bullet.

Gunnar Wolf’s follow-up

Luke’s article was quickly followed by a blog post by Debian developer Gunnar Wolf:

By using Ruby Gems, you dramatically increase entropy and harm your systems’ security.

To this, I say nonsense. It’s pretty well-known that Debian, and Linux distros in general, dislike foreign packaging systems, regardless of their merits. I see quite a lot of conversations on #rubyonrails @ irc.freenode.net that are somewhat like this:

Person A: hi, I’m using Debian/Ubuntu/some-other-Debian-derived-distro. I typed “gem install rails”. But when I type “rails foo”, it says “command not found”. what’s going on?
Person B: type “gem update --system && gem install rails”
Person A: wow, it worked! thank you!

It is painfully obvious that Debian did something to RubyGems. I was bitten by this very issue as well. Debian’s RubyGems package places binaries in /var/gems (or something like that, I don’t remember the exact location) instead of /usr/bin. Fine, I understand that they don’t want foreign packages to pollute /usr/bin, which is a managed directory. But the least they could have done is add /var/gems to $PATH by default, which they didn’t. As a result, many people who installed Rails via Debian thought that Rails was broken, when it was actually Debian that crippled RubyGems.

“Increase entropy”? I don’t even know what this is supposed to mean in the context of Ruby software distribution.
“Harm your systems’ security”? As I’ve already stated, vendoring has both pros and cons. If one dynamically depends on a library, then it means that the system administrator and library author are responsible for security updates, but it also means that a security update can potentially break the application. By vendoring stuff, the responsibility of security updates is mostly shifted to the application developer. This is a trade-off: there’s nothing wrong with it, and whether it’s the best thing to do depends on the situation. Gunnar however has an extremely purist view, along the lines of “if you don’t agree with us then you’re an idiot, regardless of the circumstances”.

It seems more likely that Gunnar’s words were carefully picked, with the goal of creating knee-jerk reactions. Debian said pretty much the same thing about Autopackage in the past, and now they’re doing it again with RubyGems.

Luke, Gunnar, will you realize that all you’re doing is ranting about a problem, without offering any solutions? And no, creating native Debian packages is not a solution, the world isn’t comprised of just Debian.


Wanted: Passenger testers using 64-bit OS X and 64-bit Linux

Passenger 1.0.2 fixed MacOS X compatibility. Unfortunately, the OS X fix broke compatibility with 64-bit Linux. The 1.0.4 release was supposed to fix it for both platforms. Before we released 1.0.4, we had an OS X user test it, and the test result was positive. But soon after the release we got bug reports from people, telling us that 1.0.4 broke OS X compatibility.

The main problem lies in the file descriptor passing code. We’re struggling to get it right:
1. The original code was copied from Ruby. Apparently that didn’t work well on 64-bit MacOS X. This means that file descriptor passing in Ruby itself is also broken on 64-bit MacOS X, not just in Passenger.
2. There is generally very little documentation about file descriptor passing. We found a bunch of code samples, but it turns out that all of the ones that we’ve tried have portability problems. The developer man pages don’t provide a lot of information on file descriptor passing.
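
Passenger’s descriptor-passing code is written in C++, but Ruby’s own UNIXSocket#send_io and #recv_io (which wrap the same sendmsg()/SCM_RIGHTS machinery) illustrate what we’re trying to get right:

Ruby
  require 'socket'

  parent_sock, child_sock = UNIXSocket.pair

  if fork
    # Parent: pass an open file descriptor to the child over the socket pair.
    child_sock.close
    file = File.open("/tmp/fd_passing_demo.log", "w")   # arbitrary example file
    parent_sock.send_io(file)
    file.close
    Process.wait
  else
    # Child: receive the descriptor and use it as a normal IO object.
    parent_sock.close
    io = child_sock.recv_io
    io.write("hello through a passed file descriptor\n")
    io.close
  end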

We want to get this right, but we don’t have any 64-bit machines at all. So we’re looking for people who can help us with testing Passenger on 64-bit OS X and 64-bit Linux. We’d like to have at least 3 people from both categories. If you’re interested, please send a message to our discussion board, or join #passenger on irc.freenode.net.

Thank you.


Phusion Passenger (mod_rails) version 1.0.2 released, and more

Passenger version 1.0.2 has been released. :) We’ve finally finished our corporate blog now, so visit http://blog.phusion.nl/ for the release announcement, overview of changes, upgrade instructions, and more.

Now that we finally have an official corporate blog, people can stop complaining about my blog template. ;) Note that our corporate blog’s template isn’t finished yet: it still needs some polish.


The hidden corners of Passenger

There are a few technical achievements in Passenger that we haven’t actively marketed. But we’d like the community to be aware of them, so we’ve written this blog post. :)

Unix domain sockets

Not too long ago, Thin announced support for Unix domain sockets. This gave Thin an incredible speed boost. Switchpipe soon followed, with alpha support for Unix domain sockets.

It has always surprised me that the Ruby/Rails web containers didn’t support Unix domain sockets until early 2008. Unix domain sockets aren’t exactly rocket science or exotic: the X Window System has used them for client-server communication on localhost for as long as I can remember. Database systems such as MySQL prefer to use Unix domain sockets on localhost as well. It’s also well-known that Unix domain sockets are faster than TCP sockets.

There’s one down side to Unix domain sockets though: you have to remove them. Usually, server processes remove them during exit. But if the system crashes, or if those processes crash, then the socket files are never removed, and the system administrator will have to do that manually. Ouch, maintenance overhead.
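
For a filesystem-based Unix socket, that cleanup burden looks roughly like this in Ruby (the path is made up):

Ruby
  require 'socket'

  SOCKET_PATH = "/tmp/example_app.sock"   # hypothetical location

  # A leftover socket file from a crashed run makes the next bind() fail,
  # so the server has to clean up both before starting and when exiting.
  File.unlink(SOCKET_PATH) if File.exist?(SOCKET_PATH)
  server = UNIXServer.new(SOCKET_PATH)

  at_exit do
    server.close unless server.closed?
    File.unlink(SOCKET_PATH) if File.exist?(SOCKET_PATH)
  end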

Passenger uses Unix sockets extensively, and has done so since the first release. But there’s one difference compared to Thin and Switchpipe: Passenger uses Unix sockets in the abstract namespace (as opposed to on the filesystem) whenever possible. Abstract namespace Unix sockets have filenames, just like regular Unix sockets, but they do not appear as files on the filesystem. This means that after a reboot, or if Apache crashes, no stale files will be left on the filesystem. This way, the system administrator doesn’t have to manually remove stale files if something went wrong.

Passenger strives for a concept that we call “zero maintenance”. We believe that software should Just Work(tm), and system maintenance overhead should be as low as possible. The system administrator shouldn’t have to worry about stuff like stale files. Ideally, the system administrator should be able to forget that he ever installed Passenger at all. The usage of abstract namespace Unix sockets is one of the mechanisms that we use to achieve that goal.

PID files (or the lack thereof)

Mongrel and Thin write so-called PID files to the filesystem. PID files contain the PIDs of background processes, and are used for shutting down those processes. But if the system crashes, or if those background processes crash, then the PID files will never be removed, and they become stale PID files. Early Mongrel versions refused to start if there were stale PID files, forcing the system administrator to remove them manually. Passenger on the other hand doesn’t use PID files at all. If one shuts down Apache, then all Rails processes are shut down as well.

So what happens if Apache crashes? The Rails processes will exit as well. We use so-called “owner pipes” in Passenger, pipes that are shared between Apache and the Rails processes. Owner pipes are essentially a mechanism for reference counting processes. If Apache crashes, then the owner pipe is automatically closed. The Rails processes will detect this, and will shut down gracefully.
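
A bare-bones illustration of the owner pipe pattern (not Passenger’s actual implementation):

Ruby
  reader, writer = IO.pipe

  if fork
    # Parent (the "owner"): holds the write end open for as long as it lives.
    reader.close
    sleep 2            # simulate the parent doing work, then exiting/crashing
    # writer is closed automatically when this process goes away
  else
    # Child: closes its copy of the write end, then blocks on the read end.
    # EOF can only arrive once every process holding the write end is gone.
    writer.close
    reader.read
    puts "owner went away, shutting down gracefully"
  end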


Passenger (mod_rails): community response, donations, InfoQ interview

Community response: one week after the release

It has been a little more than one week since the first public release of Passenger. In this one week, lots and lots of people have blogged about us, including even O’Reilly. The feedback has been overwhelmingly positive! :)

A few nice quotes from the community:

  • “As many of you have already done, I hopped on board to see how it worked and was amazed on how easy it was to get up and going.” — O’Reilly
  • “So far I’m happy to report that, as advertised, it’s dead easy to use… and the performance seems solid.” — zerosum.com
  • “Mod_rails seems to be a surprising stable and finished product, and this for a first public release! I’m anxious to check out their Ruby Enterprise Edition.” — Handermann.be

Thanks for the praise, everybody. :)

“Enterprise licenses” (a.k.a. donations)

One may wonder why our “Enterprise”/Donation page was taken down so soon. It’s because people have been very, very generous with regard to donating to us. :) In fact, so generous that Paypal decided to block our account under the guise of “anti-money laundering investigation”. But everything has been cleared up now, so the donations page is once again online.

If you’d like to help us survive the coming winters, or just want to support Passenger’s development, then please consider getting an “enterprise” license. ;)

InfoQ interview

We’ve been interviewed by InfoQ. This interview elaborates on the name and the license, as well as on our upcoming memory optimizations for Passenger/Ruby, which allow them to use less memory. More information about this soon, honestly. ;)

On a side note, we’re preparing the release of Passenger 1.0.2. It will soon be available.


Phusion Passenger Enterprise Licensees, check your mail

Hi there,

Our apologies for the later-than-usual update, but things are getting really busy here at Phusion, and this is the first opportunity we’ve had to blog about the latest developments again! First off, we’d like to take the opportunity to ask all enterprise licensees to check their mail, as they’re bound to find something nice there ;-)

Also, the next batch of enterprise licenses will become available very soon. We just wanted to let you guys know that we’re still alive and working around the clock on some very interesting projects for our clients. Regardless of this, we want you guys to know that we’re still actively involved in the development of Phusion Passenger and Ruby Enterprise Edition. For those who want to know a bit more about them, we’ve recently done a mini interview with the fine people at InfoQ, and we encourage you to go check it out over here.

Even though we try to stay as actively involved as possible, we would also like to encourage the community to submit patches for the improvement of Phusion Passenger. We’re confident that we can make Passenger even better together (wow, that even rhymes so it must be true right? ;-)).

Also, it has come to our attention that some of you have expressed concerns about the memory use of Phusion Passenger, and we have reason to believe that the memory wasn’t always measured in an accurate way. This is for the greater part the result of a well-known problem with the inaccurate display of memory usage by ps and/or top.

We’ll try to write an article on this as soon as possible, in which we’ll address this ‘issue’ with the appropriate measuring tools. Hopefully, this will set the record straight and take away any reluctance to try out Phusion Passenger, as we’re pretty confident that Phusion Passenger uses a lot less memory than some people seem to have inferred. :-)

We’d like to leave it at this for now, and we’ll keep you guys posted and would love to stay in touch with you guys :-).

Cheers,
Hongli Lai
Ninh Bui

