Garbage Collection and the Ruby Heap

leftnode · on Feb 23, 2010

I'm not fluent in Ruby at all, but I think we need more content like this on HN. Incredibly informative and in depth. I probably won't ever learn Ruby in depth, but I love reading about this type of stuff, and I learned a lot.

Thanks for the link.

ice799 · on Feb 23, 2010

thanks, glad you liked the slides.

cullenking · on Feb 22, 2010

Was this a slidestack from a bit ago? I remember reading a slidestack of yours a couple months back on the same topic, and found it VERY informative.

In any event, keep up the free education! I've learned a lot going through your site. Oh, and thanks for some great ruby libraries!

tmm1 · on Feb 22, 2010

This is a new presentation specifically about the MRI GC, but it covers some of the same tools mentioned in our Debugging Ruby and Threaded Awesome talks.

mrduncan · on Feb 22, 2010

Seriously excellent slides.

Any chance of a video of this talk being released? I'd love some commentary from you guys.

ice799 · on Feb 22, 2010

yup, it was recorded. not sure when the video will be out, but i'll tweet it or something as soon as i hear.

kg · on Feb 22, 2010

Very cool, detailed slides.

I think his notes demonstrate how wildly the 'optimal' settings for a scripting runtime can vary based on use case - 500K slots in a slab makes perfect sense for rails, but 10K also makes pretty good sense for running ruby from the command line to execute a tiny script.

One thing that caught my eye - I think according to slide 52, at startup a rails app is using over 3625840 bytes of heap just to represent newline nodes generated from source code? Am I interpreting the data right, or is he just counting the actual nodes and not the RVALUEs attached to them? Kind of funny to think about optimizing memory usage in a rails app by stripping out superfluous newlines.

tmm1 · on Feb 22, 2010

You're right, there are about 90k NEWLINE nodes on the heap, and at 40 bytes apiece that's taking up about 3.6 MB of slots on the heap that could be used by other ruby objects instead.
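
A quick sanity check of that arithmetic, as a minimal sketch. The 40-byte slot size and the 3625840-byte total come from the comments above; the exact node count of 90,646 is not quoted anywhere and is only derived here by dividing one by the other, and `slot_bytes` is a hypothetical helper name.

```ruby
# Back-of-the-envelope check of the figures quoted in the thread.
SLOT_SIZE_BYTES    = 40      # bytes per heap slot, as stated above
NEWLINE_NODE_COUNT = 90_646  # assumption: derived as 3_625_840 / 40, not quoted

# Bytes of heap occupied by `count` slots of `slot_size` bytes each.
def slot_bytes(count, slot_size = SLOT_SIZE_BYTES)
  count * slot_size
end

total = slot_bytes(NEWLINE_NODE_COUNT)
puts "#{NEWLINE_NODE_COUNT} nodes * #{SLOT_SIZE_BYTES} bytes = #{total} bytes"
puts format("about %.1f MB", total / 1_000_000.0)
```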

Unfortunately, removing newlines from your codebase will not help since NODE_NEWLINE is used for separators like semicolons as well.

Ruby 1.9 gets rid of NODE_NEWLINE altogether by adding a NEWLINE flag to the next node instead. A similar patch will be in the next release of REE.
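
To see why the flag saves slots, here is a toy model in plain Ruby. It is not MRI's actual node layout: `Node`, `wrap_with_newline_nodes`, `flag_newlines` and `count_nodes` are all hypothetical names, and the only thing it illustrates is that wrapping each statement in a separate newline node costs one extra node per statement, while a flag on the statement node costs none.

```ruby
# Toy model only; the real interpreter represents nodes in C.
Node = Struct.new(:type, :child, :newline)

# 1.8-style: every statement is wrapped in its own :newline node.
def wrap_with_newline_nodes(statements)
  statements.map { |type| Node.new(:newline, Node.new(type, nil, false), false) }
end

# 1.9-style: the statement node itself carries a newline flag.
def flag_newlines(statements)
  statements.map { |type| Node.new(type, nil, true) }
end

# Count every node reachable from the given list, children included.
def count_nodes(nodes)
  nodes.sum { |node| 1 + (node.child ? count_nodes([node.child]) : 0) }
end

statements = [:call, :assign, :return]
puts "separate newline nodes: #{count_nodes(wrap_with_newline_nodes(statements))}"
puts "newline flag:           #{count_nodes(flag_newlines(statements))}"
```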

darkhelmetlive · on Feb 22, 2010

I always learn so much from you and Damato.

ice799 · on Feb 22, 2010

glad you liked the slides.

chuhnk · on Feb 22, 2010

Whoa, extremely informative slides, thanks for those. A lot of stuff in there I had no idea about. I love learning.

wingo · on Feb 23, 2010

Sigh. Does anyone have a PDF link?

tmm1 · on Feb 24, 2010

http://dl.dropbox.com/u/635/gc-export.pdf

kristianp · on Feb 22, 2010

It's a pity memprof doesn't support 32-bit Linux yet. I would like to try out the method in the slides on my 32-bit machine.

ice799 · on Feb 23, 2010

working on it.

jcnnghm · on Feb 22, 2010

If you haven't, tune the garbage collector for rails. By tuning, I was able to improve the performance of my production application by about 60%.

munctional · on Feb 23, 2010

Can you elaborate a bit on this? Did you simply patch the GC? I've seen Evan Weaver's post (http://blog.evanweaver.com/articles/2009/04/09/ruby-gc-tunin...) on it, but that's it.

tmm1 · on Feb 23, 2010

GC tuning is easy with REE: you can set environment variables to change the behavior of the mark and sweep collector. See slides 50-53 for more info.
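
A minimal sketch of what that can look like. The variable names are the ones REE documents for GC tuning, but the values below are illustrative assumptions only (the 500K figure echoes the slab size discussed elsewhere in this thread), and `REE_GC_SETTINGS`, `gc_env` and `export_lines` are hypothetical helper names. The settings have to be in the environment before the interpreter starts.

```ruby
# Illustrative values only; measure against your own app before adopting any.
REE_GC_SETTINGS = {
  "RUBY_HEAP_MIN_SLOTS"           => 500_000,    # initial number of heap slots
  "RUBY_HEAP_SLOTS_INCREMENT"     => 250_000,    # slots added when the heap first grows
  "RUBY_HEAP_SLOTS_GROWTH_FACTOR" => 1,          # multiplier for later increments
  "RUBY_GC_MALLOC_LIMIT"          => 50_000_000  # malloc threshold that triggers a GC
}.freeze

# Environment variables must be strings, so convert the values.
def gc_env(settings = REE_GC_SETTINGS)
  settings.each_with_object({}) { |(name, value), env| env[name] = value.to_s }
end

# Render the settings as shell export lines, e.g. for a startup script.
def export_lines(settings = REE_GC_SETTINGS)
  gc_env(settings).map { |name, value| "export #{name}=#{value}" }
end

puts export_lines
```

The hash from `gc_env` can also be passed as the first argument to `Kernel#system` or `Process.spawn` to start a child interpreter with those settings.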

munctional · on Feb 23, 2010

Thanks for the info! I'll check out REE again.

jpr · on Feb 22, 2010

Ruby is known as the slowest and memoryhungriest language in common use today. Is there any reason why people should know how it achieves that? How does this advance the state of the art?

tmm1 · on Feb 22, 2010

"Now you know, and knowing is half the battle."

FooBarWidget · on Feb 22, 2010

Uh what? Are you saying something like "X has problem Y, why should anybody study problem Y"?

> and memoryhungriest language in common use today

I think you've never heard of Perl.

telemachos · on Feb 22, 2010

I didn't realize that Perl was memory intensive compared with Ruby (or other comparable languages more generally).

Any links for that? (Not trolling: I use both Perl and Ruby day to day, and I'm curious. I certainly haven't noticed significant differences in memory usage. Speed certainly - Perl being faster in many cases.)

_ivvf · on Feb 23, 2010

http://shootout.alioth.debian.org/u32/benchmark.php?test=all.... With the exception of one benchmark, ruby's memory usage is about the same as perl's. As for speed, ruby is no more than twice as slow as perl for most benchmarks. It does poorly on pidigits because there aren't gmp bindings available for ruby 1.9, and it does significantly worse on regex-dna because perl's regex library is much higher quality.

Both are so much slower than C that a 2x difference is insignificant.

FooBarWidget · on Feb 23, 2010

Benchmarks don't say anything when it comes to real memory usage. In my experience Perl is a lot more memory hungry than Ruby. All the data structures take so much memory. Defining a single, empty function eats 1.5 KB (!) of memory. I had a Perl program of about 30k lines, and it ate 25 MB during startup, most of which is consumed just by storing the Perl optree. During runtime memory usage would jump to 35 MB as it loads all kinds of data into hash tables and stuff.

jakedouglas · on Feb 22, 2010

These slides are _ILL_ bro.

munctional · on Feb 23, 2010

Avoiding local slang when writing is also "ill", my brother.