Making Ruby’s garbage collector copy-on-write friendly, part 5

It turns out that the garbage collector doesn’t make pages dirty after all; malloc() seems to be the problem.

I inserted the following test code at the beginning of the garbage_collect() function:

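C
  if (debugging) {
      /* Pause so that memory usage can be inspected before the allocation. */
      printf("Before 1 KB is allocated.\n");
      getchar();
      calloc(1024, 1);    /* intentionally leaked; only the allocation's side effect matters */
      printf("After 1 KB is allocated.\n");
      getchar();
  }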

My Ruby test script looks like this:

Ruby
  load_ruby_on_rails
  ObjectSpace.garbage_collect    # garbage collect before forking to make sure stuff isn’t freed after forking
  pid = fork do
      ObjectSpace.start_debugging    # sets the ‘debugging’ variable to true
      ObjectSpace.garbage_collect
      exit!
  end
  Process.waitpid(pid)

Between the messages “Before 1 KB is allocated.” and “After 1 KB is allocated.”, memory usage in the child process jumps from 125 KB to 7 MB!!!

I wrote a test program in C to see whether I can reproduce it outside Ruby:

C
  #include <stdio.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>
  #include <stdlib.h>

  int
  main(void) {
          int i;

          for (i = 0; i < 20000; i++) {
                  // Allocate some stuff so that the heap isn’t empty.
                  calloc(1024, 1);
          }

          pid_t pid = fork();
          if (pid == 0) {
                  // Child: pause, allocate 1 KB, then pause again so memory usage can be compared.
                  getchar();
                  calloc(1024, 1);
                  getchar();
                  _exit(0);
          } else {
                  // Parent: wait for the child to finish.
                  int status;
                  waitpid(pid, &status, 0);
          }
          return 0;
  }

And I couldn’t. In the above test program, memory usage in the child process only rises from 30 KB to 40 KB.
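The pause-and-inspect measurement used in these tests could probably be automated. Below is a minimal sketch, assuming Linux, where /proc/self/smaps reports a Private_Dirty figure for every mapping. The helper names private_dirty_kb and child_dirty_growth_kb are hypothetical and are not part of Ruby or of the test programs above. The sketch forks, lets the child sum its private dirty memory before and after a single calloc(), and sends the difference back to the parent through a pipe.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sum every "Private_Dirty:" entry in
 * /proc/self/smaps (Linux-only). Returns KB, or -1 if unreadable. */
static long private_dirty_kb(void) {
    FILE *f = fopen("/proc/self/smaps", "r");
    if (f == NULL)
        return -1;

    char line[256];
    long total = 0, kb;
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "Private_Dirty: %ld kB", &kb) == 1)
            total += kb;
    }
    fclose(f);
    return total;
}

/* Hypothetical helper: fork, then measure how many KB of private dirty
 * memory the child gains from one calloc(alloc_size, 1).
 * Returns 0 and stores the difference in *delta_kb, or -1 on failure. */
static int child_dirty_growth_kb(size_t alloc_size, long *delta_kb) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: measure, allocate, measure again. */
        close(fds[0]);
        private_dirty_kb();              /* warm-up so stdio's own allocations happen first */
        long before = private_dirty_kb();
        void *p = calloc(alloc_size, 1); /* the allocation under test */
        if (p != NULL)
            ((volatile char *)p)[0] = 1; /* touch it so the call cannot be optimized away */
        long after = private_dirty_kb();

        int ok = (before >= 0 && after >= 0 && p != NULL);
        long delta = ok ? after - before : 0;
        if (!ok || write(fds[1], &delta, sizeof delta) != (ssize_t)sizeof delta)
            _exit(1);
        _exit(0);
    }

    /* Parent: read the child's result, then reap the child. */
    close(fds[1]);
    long delta = 0;
    ssize_t n = read(fds[0], &delta, sizeof delta);
    close(fds[0]);

    int status;
    waitpid(pid, &status, 0);
    if (n != (ssize_t)sizeof delta)
        return -1;

    *delta_kb = delta;
    return 0;
}
```

Calling child_dirty_growth_kb(1024, &delta) from a process that has already filled its heap would roughly mirror the experiment above. The numbers should be read with care: reading /proc/self/smaps itself goes through stdio, which allocates memory in the child, so the warm-up call only reduces that noise rather than removing it.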

Is there anyone with intimate knowledge about the Linux/glibc malloc() implementation who can tell me what’s going on?

2 Comments

  1. Kevin Watt said,

    December 10, 2007 @ 6:35 pm

    Since this got mentioned in the Slashdot post about RoR recently, I wanted to reply and say that this is a huge deal for me, and I am super-looking forward to a solution.

    I know Ruby allocates memory in “slabs” of increasing size. For example, the next one after 80 MB or so is like 40 MB, which pushes your memory to 120 MB, and that sucks. Perhaps the one after 30 KB is 7 MB.

    The real problem, in my mind, is that ruby’s garbage collector kills the crap out of copy-on-write… something Eric Hodel+others are working on. At least they were, I hope they’re successful soon :)

  2. Hongli said,

    December 10, 2007 @ 10:37 pm

    Hi Kevin Watt. Good that you show interest. It has already been solved though, please click on the “Optimizing Rails” category and see part 6.
