time to bleed by Joe Damato

technical ramblings from a wanna-be unix dinosaur

What is a ruby object? (introducing Memprof.dump)


After Joe released memprof a few days ago, I started thinking about ways to add more functionality.

The initial Memprof release only offered a simple stats api, inspired by the one in bleak_house:

require 'memprof'
Memprof.start
o = Object.new
Memprof.stats
      1 test.rb:3:Object

With the help of lloyd’s excellent yajl json library, I’ve slowly been building a full-featured heap dumper: Memprof.dump.

require 'memprof'
Memprof.start
[]
Memprof.dump
[
  {
    "address": "0xea52f0",
    "source": "test.rb:3",
    "type": "array",
    "length": 0
  }
]

Where can I find it?

This new heap dumper will be in the next release of Memprof. If you want to play with it, check out the heap_dump branch on github.

What else is planned?

Over the next few days, I’m going to add a Memprof.dump_all method to dump out the entire ruby heap. This full dump will contain complete knowledge of the ruby object graph (what objects point to other objects), and its json format will allow for easy analysis. I’m envisioning a set of post-processing tools that can find leaks, calculate object memory usage, and generate various visualizations of memory consumption and object hierarchies.
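
As a rough idea of what such a post-processing tool could look like, here is a minimal sketch that tallies dump entries by type. It assumes the dump is a JSON array of entries shaped like the Memprof.dump output above. The count_by_type helper and the sample data are made up for illustration and are not part of Memprof.

```ruby
require 'json'

# Hypothetical helper (not part of Memprof): tally dump entries by "type".
# Assumes the dump is a JSON array of objects like the output shown above.
def count_by_type(dump_json)
  counts = Hash.new(0)
  JSON.parse(dump_json).each { |entry| counts[entry["type"]] += 1 }
  counts
end

# Made-up sample data in the same shape as a Memprof.dump result.
SAMPLE_DUMP = <<~JSON
  [
    {"address": "0xea52f0", "source": "test.rb:3", "type": "array", "length": 0},
    {"address": "0xea52c8", "source": "test.rb:4", "type": "string", "length": 3},
    {"address": "0xea52a0", "source": "test.rb:5", "type": "array", "length": 2}
  ]
JSON

p count_by_type(SAMPLE_DUMP)
```

Because every entry is plain JSON, the same approach should extend to grouping by source line or class_name.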

Why should I care?

In building and testing Memprof.dump, I’ve learned a lot about different types of ruby objects. The rest of this post covers interesting details about common ruby objects, with examples of how they’re created and what they look like inside the MRI VM.

Objects and Floats

o = Object.new
o.instance_variable_set(:@pi, 3+0.14159)
  {
    "address": "0x1823dd8",
    "source": "test.rb:3",
    "type": "object",
    "class": "0x1854b38",
    "class_name": "Object",
    "ivars": {
      "@pi": "0x1823da0"
    }
  }

This ruby object points to its class (Object 0x1854b38) and has some instance variables. Here there is only one, named @pi, which points to another object at 0x1823da0.

The address 0x1823da0 belongs to a float object. This float was created on the heap when MRI executed the code 3 + 0.14159.

  {
    "address": "0x1823da0",
    "source": "test.rb:4",
    "type": "float",
    "data": 3.14159
  }

The float 0.14159 used in the addition also lives on the heap, but it is created upfront once when the ruby source is parsed.

Strings

Unlike floats, new string objects are created every time ruby encounters a string in its execution path.

1.times{"abc"}
  {
    "type": "string",
    "shared": "0x15136a0",
    "flags": ["elts_shared"]
  }

This newly created string object has no character data associated with it; instead, it is marked elts_shared and points to 0x15136a0. In this case, 0x15136a0 is another string object, one that holds the actual data “abc” and was created earlier when the ruby source was parsed.
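
The per-evaluation allocation can be observed from plain ruby by comparing object ids. This is only a sketch: the shared character buffer is not visible from ruby code, and the count of distinct objects depends on the interpreter and on whether frozen string literals are enabled, so no expected number is given. The literal_strings helper is a made-up name.

```ruby
# Evaluate the same string literal n times and collect the results.
def literal_strings(n)
  Array.new(n) { "abc" }
end

strs = literal_strings(3)
puts strs.map(&:object_id).uniq.size  # number of distinct String objects
puts strs.all? { |s| s == "abc" }     # contents are equal either way
```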

Arrays and Fixnums

[1,2,3,"hello"]
  {
    "type": "array",
    "length": 4,
    "data": [
      1,
      2,
      3,
      "0x12aa0c0"
    ]
  }

The fixnums 1, 2 and 3 in the array are immediates, so they live in the array itself and do not occupy slots on the ruby heap [1]. The fourth member is the string object “hello” that lives at 0x12aa0c0.
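
The difference between immediates and heap objects also shows up in identity comparisons. In this small sketch, same_object? is a made-up helper around equal?, which compares identity rather than value.

```ruby
# Identity comparison: true only if both references are the very same object.
def same_object?(a, b)
  a.equal?(b)
end

# Small integers are immediates, so equal values are always the same "object".
puts same_object?(1, 1)                                      # prints true

# Strings are heap objects, so two separately built strings are distinct.
puts same_object?(String.new("hello"), String.new("hello"))  # prints false
```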

Hashes and Symbols

{:a=>1,"b"=>:c}
  {
    "type": "hash",
    "length": 2,
    "default": null,
    "data": {
      "0xd13378": ":c",
      ":a": 1
    }
  }

The symbols :a and :c are also immediates, so they live directly inside the hash’s data table. The key for “b” is a pointer to that string object at 0xd13378.
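
When post-processing a dump, it is useful to separate references to heap objects from immediates. The sketch below assumes that references always appear as "0x..." hex strings, as in the examples above, and that anything else is an immediate. The referenced_addresses helper is made up for illustration and is not part of Memprof.

```ruby
require 'json'

# Assumption: heap references are serialized as "0x..." hex strings.
ADDRESS = /\A0x[0-9a-f]+\z/

# Hypothetical helper: list the heap addresses an entry's "data" refers to.
def referenced_addresses(entry)
  data = entry["data"]
  values = case data
           when Hash  then data.keys + data.values  # hash keys can be references too
           when Array then data
           else []                                   # e.g. string or float data
           end
  values.select { |v| v.is_a?(String) && v =~ ADDRESS }
end

hash_entry = JSON.parse('{"type":"hash","length":2,"default":null,"data":{"0xd13378":":c",":a":1}}')
p referenced_addresses(hash_entry)  # prints ["0xd13378"]
```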

Blocks and Data

Hashes can also be created with a default block.

Hash.new{|h,k| h[k] = k; h }
  {
    "type": "hash",
    "length": 0,
    "default": "0xcca208"
  },
  {
    "address": "0xcca208",
    "type": "data",
    "class": "0xcced80",
    "class_name": "Proc"
  }

In this case, the block is converted to a new Proc data object that holds a reference to an internal struct BLOCK [2]. The new hash’s default field points to the address of the Proc.

Data objects are commonly created by C extensions to point to external memory that needs to be marked and freed using ruby’s garbage collector.
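
The Proc behind a hash’s default block is also reachable from ruby through Hash#default_proc. The sketch below uses a slightly simplified block that returns the stored value instead of the hash; identity_hash is a made-up helper name.

```ruby
# Build a hash whose default block stores each missing key as its own value.
def identity_hash
  Hash.new { |hash, key| hash[key] = key }
end

h = identity_hash
p h.default_proc.class  # prints Proc (the block, wrapped in a Proc object)
p h.default             # prints nil (there is no plain default value)
p h[:missing]           # prints :missing (the block runs and stores the key)
p h.length              # prints 1
```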

Classes

A simple class definition creates many objects on the heap.

class MyClass; end

First is the class itself, along with the class’s string representation (pointed to by an internal ivar __classpath__). Notice the class object holds a reference to its superclass.

  {
    "address": "0x29f3228",
    "type": "class",
    "name": "MyClass",
    "super": "0x2a23b28",
    "super_name": "Object",
    "ivars": {
      "__classpath__": "0x29f31b8"
    }
  },
  {
    "address": "0x29f31b8",
    "type": "string",
    "length": 7,
    "data": "MyClass"
  }

The class definition also creates two more objects: an internal CREF node and an unnamed singleton class that is __attached__ to MyClass.

  {
    "type": "node",
    "node_type": "CREF"
  },
  {
    "type": "class",
    "name": null,
    "super": "0x2a23a80",
    "super_name": null,
    "singleton": true,
    "ivars": {
      "__attached__": "0x29f3228"
    }
  }

This singleton is MyClass’s metaclass, where singleton methods and instance variables are added.

MyClass.instance_variable_set(:@a, 123)
  {
    "type": "class",
    "name": null,
    "singleton": true,
    "ivars": {
      "__attached__": "0x29f3228",
      "@a": 123
    }
  }
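
The link between a class and its metaclass can also be inspected from ruby. This sketch covers only singleton methods and uses a made-up DemoClass; it assumes a ruby new enough to provide Object#singleton_class.

```ruby
class DemoClass; end

# A singleton method: defined on DemoClass itself, not on its instances.
def DemoClass.greet
  "hi"
end

# Open the singleton class and return it.
DEMO_META = class << DemoClass; self; end

p DEMO_META.instance_methods(false).include?(:greet)  # prints true
p DEMO_META.equal?(DemoClass.singleton_class)         # prints true
```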

Constants, Class and Instance Variables

Classes store both constants and class variables along with the instance variables.

class MyClass
  A=1
  @@b=2
  @c=3
end
  {
    "type": "class",
    "name": "MyClass",
    "ivars": {
      "@@b": 2,
      "A": 1,
      "@c": 3
    }
  }
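
From ruby code, these three kinds of entries are exposed through separate reflection methods, even though the dump above shows them side by side. A short sketch with a made-up DemoVars class:

```ruby
class DemoVars
  A = 1     # constant
  @@b = 2   # class variable
  @c = 3    # instance variable of the class object itself
end

p DemoVars.constants(false)     # prints [:A]
p DemoVars.class_variables      # prints [:@@b]
p DemoVars.instance_variables   # prints [:@c]
```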

Methods

Methods are stored in a separate method table and represented by METHOD node objects which hold the method body.

class MyClass
  def d() end
end
  {
    "type": "class",
    "name": "MyClass",
    "methods": {
      "d": "0xb7ec30"
    }
  },
  {
    "address": "0xb7ec30",
    "type": "node",
    "node_type": "METHOD"
  }

Method Invocation

def test()
  a=1
  b=:b
  c='c'
  Memprof.dump  # see footnote 3
end
test()
  {
    "type": "scope",
    "node": "0xa9bdd0",
    "variables": {
      "_": null,
      "~": null,
      "a": 1,
      "b": ":b",
      "c": "0xb60ce8"
    }
  }

During method invocation, a new scope object is created on the heap. This scope points to the node object representing the method body, and has a list of all local variables.

The local variables include the perl-style ruby magic variables $_ and $~.
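
Part of this is visible from ruby itself. In the sketch below, local_variables lists the named locals of the current scope (the magic variables are not included in that list), and a match made inside one method does not leak $~ into its caller, which fits with these variables being stored per scope. All method names here are made up for illustration.

```ruby
def demo_locals
  a = 1
  b = :b
  c = 'c'
  local_variables  # names of the locals in this method's scope
end

def match_inside
  "abc" =~ /b/     # sets $~ for this method's scope only
  $~[0]
end

def caller_sees_match?
  match_inside
  !$~.nil?         # $~ here belongs to the caller's own scope
end

p demo_locals          # prints [:a, :b, :c]
p match_inside         # prints "b"
p caller_sees_match?   # prints false
```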

Modules and IClasses

Modules in ruby are similar to classes and have the same associated strings and CREF nodes created with them.

module MyModule; end
  {
    "address": "0xe82248",
    "type": "module",
    "name": "MyModule",
    "super": false,
    "ivars": {
      "__classpath__": "0x208eda8",
      "__classid__": ":MyModule"
    }
  }

When a module is included into a class, an extra iclass object is created:

class MyClass
  include MyModule
end
  {
    "address": "0x208ecc8",
    "source": "-e:1",
    "type": "iclass",
    "super": "0x20bfb40",
    "super_name": "Object",
    "ivars": {
      "__classpath__": "0x208eda8",
      "__classid__": ":MyModule"
    }
  }

This new iclass points to MyClass’s old superclass, and shares its instance variable and method tables with MyModule. Once created, this iclass becomes MyClass’s new superclass.

  {
    "type": "class",
    "name": "MyClass",
    "super": "0x208ecc8",
    "super_name": "MyModule"
  }
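
The iclass itself is hidden from ruby code, but its effect on method lookup is visible through Module#ancestors, while Class#superclass skips over it. A short sketch with made-up names:

```ruby
module DemoModule; end

class DemoIncluder
  include DemoModule
end

# The included module sits between the class and its real superclass.
p DemoIncluder.ancestors.take(3)  # prints [DemoIncluder, DemoModule, Object]

# superclass reports the next real class, not the internal iclass.
p DemoIncluder.superclass         # prints Object
```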

And more...

Ruby has various other internal object types, including Regexps, Matches, Bignums, Structs, Files, Varmaps, and almost 130 different types of Nodes. Memprof will eventually be able to dump out all these objects in individual detail.

  1. Fixnums can, however, still have instance variables.
  2. Future versions of memprof will print out struct BLOCKs in more detail, to show all references held by ruby procs.
  3. Memprof.dump was called in the method body because the scope is freed explicitly when the method ends (unless it is referenced by a block).

Written by Aman Gupta

December 14th, 2009 at 5:59 am

  • Bkulbida

    thanks, great article.

  • Nice tool.
    But is there any lib code or plugin to use it with Rails easily?

  • Very impressive tool! It would also be super useful if it could spit out the number of bytes an object used. Your thoughts?

  • tmm1

    I plan on adding a Memprof.dump_all which will produce a full heap dump which can be analyzed to calculate the size of any given object. It's tricky because the size of an Object has to take into account all the instance variables and singleton methods attached to it, etc.

  • Awesome stuff guys, looking forward to playing with it once it's merged into memprof.

    Question: CREF nodes?
