Nested include has major memory leak (Rails 2.0.1).

Posted by Bart ten Brinke Wed, 02 Jul 2008 14:00:33 GMT

Our mongrels were using up quite a lot of memory, so I tried to figure out what was causing this.

When running the app locally I found out that one particular page caused the mongrel to grow from 60 to 190 megabytes. A whopping 130 megabytes!

After commenting out some of the code, I found out that a single line was causing all this memory usage:

contracts = Contract.find( :all, 
  :conditions => ['contracts.employee_id IN (?) ', employees ],
  :include => [:expertise_profile => :qualifications ] )

Ouch! The nested include of Rails somehow leaks a large amount of memory. The fix was of course a piece of cake: drop the nested part of the include.

contracts = Contract.find( :all, 
  :conditions => ['contracts.employee_id IN (?) ', employees ],
  :include => :expertise_profile )

Now my mongrel stays at a nice 60 megabytes. I don't know if this issue persists in the new Rails 2.1, but I'll check that soon!
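For narrowing down a suspect line like this, it can help to count allocations around a block of code instead of commenting things out. The snippet below is only a rough sketch: the helper name `allocated_objects_during` is made up for illustration, the `:total_allocated_objects` key of `GC.stat` assumes a reasonably modern Ruby (2.2 or later, so not the Ruby this post was written against), and it counts allocated objects rather than resident memory.

```ruby
# Hypothetical helper (not part of Rails): returns the number of objects
# allocated while the given block runs.
def allocated_objects_during
  GC.start                                        # start from a clean slate
  before = GC.stat[:total_allocated_objects]
  yield
  GC.stat[:total_allocated_objects] - before
end

# Stand-in workload; in the app you would put the suspect finder call here.
count = allocated_objects_during do
  Array.new(10_000) { |i| "contract #{i}" }
end

puts "allocated roughly #{count} objects"
```

Running both variants of the finder through such a helper should make a difference of this size stand out immediately.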

1 comment

libXML and activeresource revisited

Posted by Pieter Bos Thu, 26 Jun 2008 08:32:58 GMT

A while ago, Bart posted an article on this blog showing how activeresource can be sped up with libXML. However, that patch was incomplete and broke in Rails 2.1, so Brian Guthrie made a new patch. That patch has an issue of its own:

If you parse an XML document with repeated elements, such as a response from the Google weather API:

<weather> <current_conditions></current_conditions> <forecast_conditions><low data="10"/></forecast_conditions> <forecast_conditions><low data="12"/></forecast_conditions> </weather>

it is converted to an object with a single forecast_conditions object with the low temperature at 12 degrees. The first forecast is silently overwritten by the second.

So I created a simple patch:

    def to_hash(hash={})
      # If there is already an entry with the given name, switch to an array
      # (see insert_name_into_hash below)
      if text?
        hash[CONTENT_ROOT] = content
      else            
        sub_hash = insert_name_into_hash(hash, name)            
        attributes_to_hash(sub_hash)
        if array?
          children_array_to_hash(sub_hash)
        else
          children_to_hash(sub_hash)
        end
      end
      hash
    end

    protected

    def insert_name_into_hash(hash, name)
      sub_hash = {}
      if hash[name]                        
        if !hash[name].kind_of? Array 
          hash[name] = [hash[name]]              
        end
        hash[name] << sub_hash
      else
        hash[name] = sub_hash
      end   
      sub_hash
    end
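To see what insert_name_into_hash does in isolation, here is a small standalone sketch. The method body follows the patch above; the `weather` hash and the `'low'` values are only illustration data modelled on the weather example, and no libxml is involved.

```ruby
# Same logic as in the patch: the first entry for a name is stored directly,
# later entries turn the value into an array.
def insert_name_into_hash(hash, name)
  sub_hash = {}
  if hash[name]
    hash[name] = [hash[name]] unless hash[name].kind_of?(Array)
    hash[name] << sub_hash
  else
    hash[name] = sub_hash
  end
  sub_hash
end

weather = {}
insert_name_into_hash(weather, 'forecast_conditions')['low'] = '10'
insert_name_into_hash(weather, 'forecast_conditions')['low'] = '12'

p weather
# => {"forecast_conditions"=>[{"low"=>"10"}, {"low"=>"12"}]}
```

Both forecasts survive, where the old code would have kept only the last one.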

This replaces the line that simply overwrote the previous hash with one that creates an array when required. Also, libxml doesn't like empty files: it segfaults on empty input. So we add a guard to the from_xml method:

        def from_xml(xml)              
          xml.gsub!(/\s*\n\s*/, '')
          if(xml.blank?)
            return {}
          else              
            typecast_xml_value(undasherize_keys(XML::Parser.string(xml).parse.to_hash))
          end
        end

Download the patch!

1 comment

Campfire Notifier for cruisecontrol.rb

Posted by Andre Foeken Thu, 17 Apr 2008 11:39:36 GMT

We added a campfire notifier to our build of cruisecontrol. Click on the link to download the builder plugin. It uses tinder as an API.

2 comments

Optimizing math

Posted by Andre Foeken Thu, 10 Apr 2008 21:19:19 GMT

As you might have guessed from Bart's previous article we've been looking at ways to speed up our Rails app. We've been profiling and query optimizing but at some point we reached a dead end.

Our app needs to calculate a lot of distances between geo locations. Until now we've been happy using a home-grown Ruby method to calculate these distances but our profiling showed that it was (as one may suspect) horribly slow.

We now have several options:

  • We could try to build a faster Ruby method (but that would be hard since it's pure math and not really a lot can be done here)

  • We could use the database (mysql in our case) to calculate our distances (A lot more db connections in our case, since we need distances between lots of points. Not the standard stuff that gems like acts_as_mappable can handle)

  • Use RubyInline to create a faster C based method

We decided to look at RubyInline, a gem that enables C code to be used right inside a Ruby script. We rewrote the method in C. A simple benchmark showed that our inline C method was 2.3 times faster!

require 'inline'
inline do |builder|
    builder.include '<math.h>'
    builder.c "double calc_distance_between(...) { ... }"
end

Although this result is very good, it does complicate your app and makes it less readable. These inline methods have to be used with care. But in our simple (and very localized) case we decided to keep the C method instead of the pure Ruby call.
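For reference, this is roughly what a pure Ruby baseline and a benchmark harness could look like. Note the assumptions: the post does not say which formula our method uses, so the haversine formula, the argument list, the earth radius constant and the sample coordinates below are all illustrative, not our actual code.

```ruby
require 'benchmark'

EARTH_RADIUS_KM = 6371.0 # assumed mean earth radius

# Great-circle distance in kilometres between two points given in degrees
# (haversine formula; an assumption, the original method is not shown).
def calc_distance_between(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180.0
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  # Clamp to 1.0 so rounding errors can never push asin out of its domain.
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt([a, 1.0].min))
end

elapsed = Benchmark.realtime do
  100_000.times { calc_distance_between(52.22, 6.89, 52.37, 4.90) }
end
puts format('100,000 pure Ruby calls took %.3f seconds', elapsed)
```

Timing the inline C version with the same harness gives the kind of ratio mentioned above; the exact number will depend on the machine and the Ruby version.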

1 comment

Optimizing the queries of your rails app

Posted by Bart ten Brinke Wed, 09 Apr 2008 19:51:20 GMT

When you are developing your application, you should always look for output like the following in your development console.

Processing EmployeesController#index[GET]
Employee Load (0.055003) 
SELECT * FROM `people` WHERE `people`.`type` = 'Employee'

 select_type | key_len | type | Extra       |
---------------------------------------------  =>
 SIMPLE      |         | ALL  | Using where |

| id | possible_keys | rows | table  | ref | key
--------------------------------------------------
| 1  |               | 6965 | people |     |

The type ALL means that you are performing a full table scan in a query. This is usually not a problem when you are in development mode, but what if you have millions of people in your database?
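The check itself is simple enough to sketch in a few lines of Ruby. This is not how the query_analyzer plugin is implemented; it only illustrates the rule of thumb, using hand-written rows shaped like the EXPLAIN output above. The second row and the helper name `full_table_scans` are made up for the example.

```ruby
# Hypothetical EXPLAIN rows, written out by hand for illustration.
explain_rows = [
  { 'table' => 'people',    'type' => 'ALL', 'key' => nil, 'rows' => 6965 },
  { 'table' => 'contracts', 'type' => 'ref', 'key' => 'index_contracts_on_employee_id', 'rows' => 12 }
]

# Returns the rows for which MySQL has to scan the whole table.
def full_table_scans(rows)
  rows.select { |row| row['type'] == 'ALL' }
end

full_table_scans(explain_rows).each do |row|
  puts "Full table scan on #{row['table']} (about #{row['rows']} rows examined)"
end
```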

Usually it is pretty straightforward to find out where the query was called from, as you will probably remember which request you made. If you are having a hard time, install the query_trace plugin. This gives the following result:

vendor/plugins/query_analyzer/lib/query_analyzer.rb:38:in `select'
app/controllers/employees_controller.rb:71:in `find'
app/controllers/employees_controller.rb:71:in `index'
vendor/plugins/browser-prof/lib/browser-prof.rb:32:in `process'

Looking at line 71 of the employees controller is a good idea here, as you might be doing something stupid. As line 71 just reads @employees = Employee.find(:all), we have to turn to the database.

mysql -u root --database myapp_development

mysql> EXPLAIN SELECT * FROM `people` WHERE `people`.`type` = 'Employee';
+----+-------------+--------+------+------------------+
| id | select_type | table  | type | possible_keys    |
+----+-------------+--------+------+------------------+ =>
|  1 | SIMPLE      | people | ALL  |                  |
+----+-------------+--------+------+------------------+

+------+---------+------+------+-------------+
| key  | key_len | ref  | rows | Extra       |
+------+---------+------+------+-------------+
| NULL | NULL    | NULL | 6873 | Using where |
+------+---------+------+------+-------------+
1 row in set (0.00 sec)

As you can see, we are not hitting any indexes. Let's try adding an index.

mysql> create index people_type_test on people (type);
Query OK, 6715 rows affected (1.38 sec)
Records: 6715  Duplicates: 0  Warnings: 0

Now we run the explain again:

mysql> EXPLAIN SELECT * FROM `people` WHERE `people`.`type` = 'Employee' ;
+----+-------------+--------+------+------------------+
| id | select_type | table  | type | possible_keys    |
+----+-------------+--------+------+------------------+ =>
|  1 | SIMPLE      | people |range | people_type_test |
+----+-------------+--------+------+------------------+

+------------------+---------+-------+------+-------------+
| key              | key_len | ref   | rows | Extra       |
+------------------+---------+-------+------+-------------+
| people_type_test | 768     | const | 2496 | Using where |
+------------------+---------+-------+------+-------------+
1 row in set (0.00 sec)

That's more like it. Now we need to add this to our app through a migration.

class CreatePeopleIndices < ActiveRecord::Migration
  def self.up
    add_index :people, :type
  end

  def self.down
    remove_index :people, :type    
  end
end

After a db:migrate and a restart of the server, we now see the following in the development console:

Employee Load (0.027666)
SELECT * FROM `people` WHERE `people`.`type` = 'Employee'
Analyzing Employee Load

 select_type | key_len | type | Extra       |
---------------------------------------------  =>
 SIMPLE      | 768     | ref  | Using where |

| id | possible_keys        | rows | table  | ref   | key
----------------------------------------------------------------------
| 1  | index_people_on_type | 2496 | people | const | people_type_test

Success! Also note that the load time reported for the query has been roughly cut in half (from 0.055003 to 0.027666).

2 comments

Gettextgenerators update

Posted by Bart ten Brinke Mon, 31 Mar 2008 18:29:05 GMT

Gettext generators for Rails 2.0 is out. It has been available from the trunk for a long time, but now it has actually been tagged. Note that it should work with the latest Gettext (1.90), but I have not yet tested this myself.

no comments

Old style alarm clock

Posted by Andre Foeken Fri, 07 Mar 2008 13:58:42 GMT

We wanted to show how many hours were going through our application on a big flat screen at our office. So I whipped up a small javascript/html page that could show a neat animation. I borrowed the raw images from this blog. The end result can be downloaded from here.

no comments

attr_accessor_with_default

Posted by Andre Foeken Wed, 05 Mar 2008 08:53:39 GMT

What a marvelous feature, but be wary! It can cause some unexpected behavior if you don't know what you are doing.

This morning we found a rather suspicious bug that led to unexpected things. Here is an example:

class Person ; attr_accessor_with_default :things, {} ; end
john = Person.new
john.things[:table] = true
# ...
jim = Person.new
jim.things # => {:table => true}  huh??

It would seem that attr_accessor_with_default has some problems with collections, since the default instance is shared over the entire class. Peter Williams noted this problem several months ago in his article (which we didn't read until it was too late); however, the solution he provided still left us with some very undesirable behaviour.

class Person ; attr_accessor_with_default(:things) { {} } ; end # note the block: the extra brackets!

john = Person.new
john.things[:table] = true

# ...

jim = Person.new
jim.things  # => {}  okay!
john.things # => {}  uhm... not okay!

The variable would only stick if we actually assigned it.

john.things = {:table => true}
john.things # => {:table => true}  yay!

But doing this every time is not only a pain, it also introduces errors that are very hard to debug, since the in-place modification does not fail... it just doesn't stick!

We solved it by going old-school. Back to the normal accessor for collections.

class Person
  attr_accessor :things

  def initialize(attributes = nil)
    super
    self.things = {}
  end
end

Now the code works as expected and we can use all operators (like <<, []) from the get-go.
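The underlying pitfall can be reproduced without ActiveSupport. The following plain Ruby sketch uses two made-up classes (SharedDefaultPerson and PerInstancePerson are names invented for this example) to contrast a default object shared by the whole class with a default created per instance. It mimics the behaviour described above; it is not the actual implementation of attr_accessor_with_default.

```ruby
class SharedDefaultPerson
  DEFAULT_THINGS = {} # one Hash object for every instance of the class

  attr_writer :things

  def things
    @things || DEFAULT_THINGS
  end
end

class PerInstancePerson
  attr_accessor :things

  def initialize
    self.things = {} # a fresh Hash for each instance
  end
end

john = SharedDefaultPerson.new
john.things[:table] = true # mutates the shared default
jim = SharedDefaultPerson.new
p jim.things               # => {:table=>true} on older Rubies, {table: true} on newer ones

anne = PerInstancePerson.new
anne.things[:table] = true
p PerInstancePerson.new.things # => {}
```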

Update: It seems this had no effect on ActiveRecord objects that were created using finders, so here is the fix for those:

class Person < ActiveRecord::Base
  attr_accessor :things

  def after_initialize
    self.things = {}
  end
end

no comments

Javascript testing problems

Posted by Steven van der Vegt Tue, 04 Mar 2008 16:03:16 GMT

A few months ago I was given the assignment to set up and build JavaScript tests for the scheduling view of moves. First I had to select a testing framework in which the tests would be written. The framework had to be able to run both unit and integration tests. It would also be very nice if we could run these tests automatically from the command line instead of clicking around in the browser (autotest).

After reviewing some frameworks, I chose the Crosscheck framework (http://www.thefrontside.net/crosscheck). Crosscheck is a JavaScript testing framework written in Java. It is cross-platform and you can control it from the command line. It is even able to emulate the behavior of multiple popular browsers like ie6, Firefox 1.0 and Firefox 1.5. Sounds like the ultimate testing framework!

So after my decision I started playing around and implementing tests. However, when the tests became more complex I ran into trouble. Some basic browser features like document.write() (ie6) and the Option object were missing. I was able to work around these problems, but the real trouble began with the integration tests. As our application relies heavily on ajax through the prototype framework, testing this functionality is crucial. However, I was not able to do this. After performing one hack after another, I finally gave up.

The crosscheck framework was clearly not mature enough to satisfy my needs. My conclusion: using crosscheck for unit testing is doable, but the framework is not mature enough for integration tests. So what are the problems with crosscheck? As mentioned earlier, it is written in Java, so it tries to emulate browser behavior based on each browser's specifications. The advantage of this approach is that you can emulate more than one browser; the disadvantage is that the emulation is only an approximation, so you never really get the real behavior of the browser.

If a browser changes its implementation, the framework is immediately outdated. You don't get the browser's quirks, so if your tests pass in the test framework, there is no guarantee the code will work in the real-browser world. Another problem with crosscheck is that it seems to be abandoned. The last changes are from the end of 2007 and, as far as I can see, none of the reported bugs have been fixed yet. Firefox 3.x and ie7 are becoming standard, but these browsers are not available as test browsers in crosscheck (and there are no signs that they will be soon).

What can we expect from the continuity of crosscheck? If you, like us, want to be able to test a long-term project, then it is important that your test-framework is long-term too. So what does my ideal javascript-test-framework look like?

It should:

  • Run from the command line
  • Represent the browser realistically
  • Emulate ajax responses
  • Mock objects
  • Run integration tests

As far as I know, no such framework exists. However, in my view, it shouldn't be too hard to realize. We could embed the mozilla tree in an application. That way we always have the latest version of the mozilla engine and an exact copy of its behavior, including its quirks. Through the Mozilla API we can access the DOM and other functions. The framework should be able to easily load tests and run them (for this part it's important to take a good look at the x-unit patterns).

At this moment I don't have a clue how easy it is to emulate the ajax responses through mozilla API calls. Furthermore, the framework should provide a rich toolkit for testing, including the ability to mock objects. You would start the framework from the command line for easy integration with test runners like autotest. Of course this application should be released under the GPL licence. Now we only need someone to implement this for us!

Steven van der Vegt (s.vandervegt TA student.utwente.nl)

P.S. We are offering an internship for the development of the described plugin at Nedap healthcare. Interested? bart.tenbrinke@movesonrails.com International students welcome!

4 comments

libXML for Active Resource 2.0

Posted by Bart ten Brinke Mon, 25 Feb 2008 08:19:39 GMT

I received an email from Stevie Clifton today, asking about our libXML patch for rails 2.0. As we have been running 2.0 for quite some time now, I never realised I forgot to post the new overrides.

The file below goes into /config/initializers/libxml.rb

# This is actually a fix for activeresource, as it
# will behave incorrectly when it encounters
# complex xml files. This override fixes this,
# but it should be submitted to rails trunk.
module ActiveResource
  module Formats
    module XmlFormat
      private

      def from_xml_data(data)
        if data.is_a?(Hash) && data.keys.size == 1
          from_xml_data(data.values.first)
        else
          data
        end
      end
    end
  end
end

module Nedap #:nodoc:
  module Hash #:nodoc:
    module Conversions

      def self.included(klass)
        require 'xml/libxml'
        klass.extend(ClassMethods)
      end

      module ClassMethods

        # Hash from_xml mixin that uses libxml.
        # This ensures a 20x speed increase
        # compared to the default (REXML-based) parser. Plus it is less ugly.
        def from_xml(xml) 
          result = XML::Parser.string(xml).parse 
          return { result.root.name.to_s => xml_node_to_hash(result.root)} 
        end 

        def xml_node_to_hash(node)
          # If we are at the root of the document, start the hash
          if node.element?
            if node.children?
              result_hash = {}

              node.each_child do |child|
                result = xml_node_to_hash(child)

                if child.name == "text"
                  # A lone text child means this element only holds text
                  if !child.next? and !child.prev?
                    return result
                  end
                elsif result_hash[child.name]
                  # Repeated element names are collected in an array
                  if result_hash[child.name].is_a?(Object::Array)
                    result_hash[child.name] << result
                  else
                    result_hash[child.name] = [result_hash[child.name]] << result
                  end
                else
                  result_hash[child.name] = result
                end
              end

              return result_hash
            else
              return nil
            end
          else
            return node.content.to_s
          end
        end

      end        
    end
  end
end 

Hash.send :include, Nedap::Hash::Conversions
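The from_xml_data override at the top of the file is easy to try out on its own. The sketch below copies the method as a plain top-level method (outside ActiveResource) and feeds it hand-written hashes; the sample data is invented for illustration.

```ruby
# Same logic as the override above: keep unwrapping as long as the data is a
# hash with exactly one key.
def from_xml_data(data)
  if data.is_a?(Hash) && data.keys.size == 1
    from_xml_data(data.values.first)
  else
    data
  end
end

nested = { 'people' => { 'person' => [{ 'name' => 'john' }, { 'name' => 'jim' }] } }
p from_xml_data(nested)
# prints the inner array of two person hashes

record = { 'person' => { 'name' => 'john', 'age' => '30' } }
p from_xml_data(record)
# prints the inner hash with the name and age keys
```

One thing to keep in mind with this approach: a record with only a single attribute is unwrapped all the way down to that attribute's value.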

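There you go. You now have a blazingly fast active resource! If you want some more bang out of your resource, add the following mixins to the initializer too. The first adds zlib decompression of the response body, the second raises the read timeout and buffer size: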

# Add inflate! to Net::HTTPResponse (zlib support)
module Net
  class HTTPResponse
    def inflate!
      require 'zlib'
      @body = Zlib::Inflate.inflate(@body)
    end
  end
end

# Increase timeout and buffersize for big XML files
module Net
  class BufferedIO 
    def rbuf_fill
      timeout(3000) { 
        @rbuf << @io.sysread(32768) 
      }
    end 
  end
end
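The inflate! mixin relies on Zlib::Inflate.inflate from Ruby's standard library. A quick round trip shows what it expects as input; the XML string here is just filler for the example. Note that this handles zlib-format data as produced by Zlib::Deflate.deflate. A gzip-encoded body would need Zlib::GzipReader instead.

```ruby
require 'zlib'

# Repetitive filler XML, standing in for a large response body.
xml = '<forecast_conditions><low data="10"/></forecast_conditions>' * 1000

compressed = Zlib::Deflate.deflate(xml)        # what the server would send
restored   = Zlib::Inflate.inflate(compressed) # what inflate! does to @body

puts "original:   #{xml.bytesize} bytes"
puts "compressed: #{compressed.bytesize} bytes"
puts "round trip ok: #{restored == xml}"
```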

no comments
