Tips to avoid RMagick memory leaks

RMagick has a bad reputation because of memory leaks. I ran into a memory leak issue while working on DISCO.IO, where Ruby processes kept bloating. The issue was really annoying since millions of pictures are resized. I finally managed to fix my code, and memory usage is now stable. Here is how I did it.

TL;DR for people in a hurry:

  1. Call Magick::Image#destroy! (an ensure block is a damn good idea)
  2. Use methods with an exclamation mark as much as possible

Here is the code before improvements:

def resize_picture(data, size)
  picture = Magick::Image.from_blob(data).first

  if (min_size = [picture.rows, picture.columns].min) >= 360
    picture = picture.crop(Magick::CenterGravity, min_size, min_size)
  end

  picture = picture.resize_to_fill(size, size) if picture.rows > size
  result = normalize_picture(picture)
end

Here is the code after fixing the memory leaks. The changes are the calls to crop! and resize_to_fill!, and the ensure block that calls destroy!:

def resize_picture(data, size)
  picture = Magick::Image.from_blob(data).first

  if (min_size = [picture.rows, picture.columns].min) >= 360
    picture.crop!(Magick::CenterGravity, min_size, min_size)
  end

  picture.resize_to_fill!(size, size) if picture.rows > size
  result = normalize_picture(picture)
ensure
  picture && picture.destroy!
end

Calling crop! instead of crop saves a lot of memory, because the latter creates a copy of the picture. Following the Ruby convention used by Array and Hash, RMagick provides both versions for many of its methods.

Finally destroy! is called from an ensure block to free memory in case of any error.
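
The ensure pattern can also be wrapped in a small helper. The following is only a sketch: with_image is a hypothetical helper name, and FakeImage is a stand-in for Magick::Image so that the example runs without RMagick installed.

```ruby
# FakeImage is a stand-in for Magick::Image (assumption for this sketch).
class FakeImage
  attr_reader :destroyed

  def initialize
    @destroyed = false
  end

  # Mimics Magick::Image#destroy!, which releases the image memory.
  def destroy!
    @destroyed = true
    self
  end
end

# Hypothetical helper: yields the image and always frees it afterwards,
# even when the block raises.
def with_image(image)
  yield image
ensure
  image && image.destroy!
end

image = FakeImage.new
begin
  with_image(image) { raise ArgumentError, 'corrupted picture' }
rescue ArgumentError
  # The error still propagates, but the image has been released.
end
puts image.destroyed
```

The value of the block is still returned by with_image, because the result of an ensure clause is discarded.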

It is not really complicated, and I am really happy to have stopped Ruby processes from eating all the memory. I hope it will work for you too!

Good naming for a beautiful code

Naming variables, methods and classes well helps teammates understand your code easily. When code is perfectly named, reading it feels almost like reading English. It is then not necessary to read the documentation or the body of a method to understand what it does. That saves a lot of time and makes developers happier. I am going to explain how I try to choose good names when I am writing Ruby.

First I use a kind of grammar for naming variables, methods and classes. I use nouns for classes and variables. I start all my methods with a verb, except when it’s an attribute accessor. When choosing a verb, I avoid set and get, because they are catch-all. Let’s take an example with a server instance providing data.

server.data        # Read an attribute?
server.get_data    # Better but still bad
server.fetch_data  # Good name, I understand it goes through the network

It’s pretty clear that the last example is the most explicit. I understand that it retrieves data from the network, so it is obvious that the call could be slow. However, don’t go into too much detail. If the name exposes all the internal behavior and the algorithm, then you went too far. When calling a method, we are interested in its result, not in how it has been implemented.

server.fetch_data_via_http # Callers usually don't care about internal behavior

It’s also important to be consistent. Let’s say you have a cart object and the payment process has two steps. First it initializes the payment, then it confirms.

# Not consistent, because the calling order is not obvious

# Consistent naming, because it makes clear that those methods are linked
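
To illustrate with a hypothetical Cart class (none of these method names come from a real library): a pair such as cart.prepare followed by cart.validate_payment gives no hint about the calling order, whereas initialize_payment and confirm_payment share a suffix and read as two steps of one process. A minimal sketch of the consistent version:

```ruby
# Hypothetical cart with a two-step payment; all names are illustrative.
class Cart
  attr_reader :payment_state

  def initialize
    @payment_state = :none
  end

  # Step 1: the shared "payment" suffix links the two methods.
  def initialize_payment
    @payment_state = :initialized
  end

  # Step 2: the verb makes it clear that this comes after the first step.
  def confirm_payment
    raise 'payment not initialized' unless @payment_state == :initialized
    @payment_state = :confirmed
  end
end

cart = Cart.new
cart.initialize_payment
cart.confirm_payment
puts cart.payment_state
```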

For variables, I don’t prefix or suffix them with their type, because it generates noise. Most of the time the name already hints at the variable’s type. I try to be accurate about the naming. If the code is dealing with a banned user, I name the variable banned_user instead of just user.

The order of arguments must not be neglected. I usually put the most important argument first, because it is directly related to the verb used by the method.

# Which one makes more sense?
email.send_to(headers, users)
email.send_to(users, headers)

I try to avoid boolean parameters because they don’t make any sense when written literally.

# What does the last argument mean?
email.send_to(users, headers, true)
email.send_to(users, headers, signature: true)
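
On the method side, this can be written with a keyword argument. The sketch below assumes Ruby 2.0 or later; the Email class is hypothetical and only builds a description of the delivery instead of sending anything.

```ruby
# Hypothetical Email class used only to illustrate the signature.
class Email
  # signature is a keyword argument, so callers have to name it.
  def send_to(users, headers = {}, signature: false)
    summary = "to=#{users.join(',')} headers=#{headers.keys.join(',')}"
    signature ? "#{summary} +signature" : summary
  end
end

email = Email.new
puts email.send_to(['bob@example.com'], { 'Reply-To' => 'me@example.com' }, signature: true)
```

Passing a bare true as a third positional argument raises an ArgumentError here, which forces the call site to stay readable.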

I hope those small tips have been interesting.

Uncouple JavaScript and CSS

I used to couple my JavaScript tightly to my CSS. My JavaScript was also too coupled to the backend of the web application. Tightly coupled code is bad because refactoring becomes harder and more costly. I am going to show you how I solved that problem.

Here is a basic snippet where CSS and JavaScript are heavily coupled.

<span class="delete">Delete</span>

$('.delete').on('click', deletePicture);

This is bad code because when you refactor some CSS you can break your JavaScript code. For example, if I rename .delete to .remove in my CSS files, I may forget to do the same in my JavaScript files. It happens often because CSS files are edited by designers and JS files are written by programmers. Even if communication is good between developers and designers on your team, you can be sure this situation will happen.

Fortunately HTML5 has introduced the very convenient “data-anything” attributes. If the selector is based on a data attribute, then JavaScript won’t be coupled to the CSS anymore.

<span class="delete" data-action="delete-picture">Delete</span>

$('[data-action="delete-picture"]').on('click', deletePicture);

I really enjoy this approach, and it makes me more confident when changing JS or CSS. However, my JS is still coupled to the backend of the web application. Check the following snippet, where I added a picture ID to the HTML.

<span class="delete" data-picture-id="123">Delete</span>


function deletePicture(event) {
  var id = $(event.target).attr('data-picture-id');
  $.ajax('/pictures/' + id, {type: 'DELETE', ...});
}

My code is coupled to the backend because the JavaScript code builds the URL on its own. The URL scheme is shared by both frontend and backend. If the URL changes on the backend, then a frontend developer has to update the JavaScript code. That’s exactly the same problem as the previous one. My solution is to include the picture’s full path in the HTML document.

<a href="/pictures/123" data-action="delete-picture">Delete</a>


function deletePicture(event) {
  event.preventDefault();
  $.ajax($(event.target).attr('href'), {type: 'DELETE', ...});
}

Now, the JS is neither coupled to the CSS nor the backend side. The code is nicer, it breaks less often and is easier to refactor.
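
On the backend side, the idea is that the server stays the only place that knows the URL scheme. Here is a minimal sketch in plain Ruby. Both picture_path and delete_picture_link are hypothetical helpers written for this example; in a Rails application the routing and view helpers would play this role.

```ruby
require 'erb'

# Hypothetical route helper: the only place that knows the URL scheme.
def picture_path(id)
  "/pictures/#{id}"
end

# Hypothetical view helper: renders the link that the JavaScript reads.
def delete_picture_link(id, label = 'Delete')
  href = ERB::Util.html_escape(picture_path(id))
  text = ERB::Util.html_escape(label)
  %(<a href="#{href}" data-action="delete-picture">#{text}</a>)
end

puts delete_picture_link(123)
```

If the URL scheme changes, only picture_path has to be updated; the JavaScript keeps reading the href.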

I switched the span to an anchor for semantic reasons. That’s why I call preventDefault, to tell the browser not to follow the link. If you prefer a span with a data-url attribute instead of the href, that’s perfectly fine too.

Setup Rcov with Ruby on Rails on Ubuntu

Rcov is a code coverage tool for Ruby. First of all, we install it with RubyGems.

sudo gem install rcov

Then we create the file lib/tasks/rcov.rake in the Rails project.

require 'rcov/rcovtask'

Rcov::RcovTask.new do |t|
  t.test_files = FileList['test/unit/*.rb'] + FileList['test/functional/*.rb']
  t.rcov_opts = ['--rails', '-x /var/lib', '--text-report', '--sort coverage']
  t.output_dir = 'doc/coverage'
  t.libs << "test"
  t.verbose = true
end

You can change the options to suit your own usage. I like these options because the worst covered files appear at the top of the list. The HTML output is written to the folder doc/coverage. Then, to run rcov, you just enter:

rake rcov

No implicit conversion from nil to integer

If you get the following error:

no implicit conversion from nil to integer (TypeError)

You should edit the file /usr/lib/ruby/1.8/rexml/formatters/pretty.rb and replace this line

place = string.rindex(' ', width) # Position in string with last ' '

with this one

place = string.rindex(' ', width) || width # Position in string with last ' '
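
To see why the fallback helps, here is a small standalone sketch. It does not use REXML; it only assumes that the result of rindex is later used as a slice length, which is presumably where the TypeError comes from.

```ruby
string = 'averyveryverylongwordwithoutspaces'
width = 10

# There is no space at or before index 10, so rindex returns nil.
place = string.rindex(' ', width)
p place   # prints nil

# Using nil as a length is what raises the TypeError.
begin
  string[0, place]
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

# With the fallback, the word is simply cut at the requested width.
safe_place = string.rindex(' ', width) || width
puts string[0, safe_place]
```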

Stack level too deep

If you get the following error:

/usr/lib/ruby/1.8/rexml/formatters/pretty.rb:129:in `wrap': stack level too deep (SystemStackError)

You should edit the file /var/lib/gems/1.8/gems/rcov- and replace the following line.

if RUBY_VERSION == "1.8.6" && defined? REXML::Formatters::Transitive

with this one

if RUBY_VERSION == "1.8.7" && defined? REXML::Formatters::Transitive

Control download with Net::HTTP::Stats

Downloading a file is a source of problems. The host can be down, the bandwidth can be too low, the file can be too big, the download can take too much time… This matters for a program that downloads many files. Furthermore, it’s often useful to tell users how much of a file has been downloaded and when the download will finish.

That’s why I wrote a little module named Net::HTTP::Stats. This module counts the number of bytes read, and then sets values such as the rate, the estimated remaining time, the percentage of bytes read and so on. I used this module for a web bot that downloads a great many web pages, and it works really well.

The following code downloads Ruby and prints the percentage downloaded, the rate and the estimated remaining time.

require 'net/http'
require 'net/http/stats'
content = ''
uri = URI.parse('')
response = Net::HTTP.get_response_with_stats(uri) do |resp, bytes|
  content << bytes
  puts "#{resp.bytes_percent}% downloaded at #{(resp.bytes_rate / 1024).to_i} Ko/s, remaining #{resp.left_time.to_i} seconds"
end

This script will print something like:

13% downloaded at 59 Ko/s, remaining 65 seconds

The Net::HTTP::Stats module adds the get_response_with_stats method. It works like get_response, except that the block takes a second argument containing the bytes just read. It’s not possible to get the bytes via response.read_body, because they have already been read to compute the stats.
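
The arithmetic behind such stats is simple. The sketch below is not the module’s implementation: DownloadStats is a hypothetical class, it assumes the total size is known in advance (for example from the Content-Length header), and time is passed in explicitly as seconds to keep the example deterministic.

```ruby
# Hypothetical tracker, not the actual Net::HTTP::Stats code.
class DownloadStats
  attr_reader :bytes_read

  def initialize(total_bytes, started_at)
    @total_bytes = total_bytes
    @started_at = started_at
    @bytes_read = 0
  end

  # Record a chunk of the body that has just been read.
  def add(chunk_size)
    @bytes_read += chunk_size
  end

  # Percentage of the body read so far.
  def percent
    100.0 * @bytes_read / @total_bytes
  end

  # Average rate in bytes per second since the download started.
  def rate(now)
    elapsed = now - @started_at
    elapsed > 0 ? @bytes_read / elapsed.to_f : 0.0
  end

  # Estimated remaining time in seconds, or nil while the rate is unknown.
  def left_time(now)
    current_rate = rate(now)
    current_rate > 0 ? (@total_bytes - @bytes_read) / current_rate : nil
  end
end

stats = DownloadStats.new(1_000_000, 0)
stats.add(130_000)
puts "#{stats.percent.to_i}% at #{(stats.rate(2) / 1024).to_i} KB/s, #{stats.left_time(2).to_i} s left"
```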

From this work it’s easy to write rules:

  • Minimum rate
  • Maximum file size
  • Maximum time

That’s why I wrote a second method called get_response_with_rules. Rules are stored in a hash. Like get_response_with_stats, the block takes the same two arguments.

rules = {
  # Download is interrupted if spent time is greater (in sec).
  :max_time => 5 * 60,
  # Download is interrupted if estimated time is greater (in sec).
  :max_left_time => 5 * 60,
  # Download is interrupted if body is greater (in bytes).
  :max_size => 50 * 1024 * 1024,
  # Wait some time before checking max_time and max_left_time (in sec).
  # Something between 5 and 20 seconds should be good.
  :min_time => 15
}
content = ''
uri = URI.parse('')
response = Net::HTTP.get_response_with_rules(uri, rules) do |resp, bytes|
  content << bytes
end
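
How such rules could be evaluated can be sketched in a few lines. This is a guess at the logic and not the module’s code: broken_rule is a hypothetical function, and the way min_time gates the two time-based rules is inferred from the comments in the hash above.

```ruby
# Hypothetical rule check: returns the name of the first broken rule, or nil.
def broken_rule(rules, elapsed, bytes_read, left_time)
  return :max_size if rules[:max_size] && bytes_read > rules[:max_size]

  # Time-based rules are skipped until min_time has elapsed,
  # because early estimates are unreliable.
  return nil if elapsed < rules.fetch(:min_time, 0)

  return :max_time if rules[:max_time] && elapsed > rules[:max_time]
  if rules[:max_left_time] && left_time && left_time > rules[:max_left_time]
    return :max_left_time
  end
  nil
end

sample_rules = { :max_time => 300, :max_left_time => 300, :max_size => 100, :min_time => 15 }
p broken_rule(sample_rules, 20, 50, 9999)
```

A download loop would call this after each chunk and stop reading as soon as the result is not nil.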

To use Net::HTTP::Stats you have to copy the “lib” folder into your project or any location listed in your $LOAD_PATH. Any feedback is welcome.

Download Net::HTTP::Stats.

Ruby Inline

Sometimes applications need to be fast. Unfortunately, scripting languages like Ruby are slow. However, rewriting an entire application isn’t the most efficient solution. Furthermore, the part that needs to be faster is usually small in terms of code. That’s why Ruby Inline is an excellent solution.

The goal is to replace the slow Ruby code with C code. In most cases that is 5 to 20 lines. Thanks to Ruby Inline, it’s possible to rewrite methods in C directly in the Ruby source. The C code is compiled on the fly, only once. In less than 10 lines, I decreased the execution time by 40% in the example of this article. The most impressive part is that it took me 15 minutes, without any prior knowledge of Ruby Inline. I think Ruby Inline is a really good tool.

As an example, we will try to optimise the class Hash32. This class computes the hash of a string by mixing characters in blocks of 4 bytes. The result is a 32-bit integer. The following sample code is extracted from the file hash32.rb.

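class Hash32
  def compute(str)
    i = 0
    sum = 0
    size = str.size
    # Work on a copy so that the padding does not modify the caller's string.
    str = str.dup
    missing = 4 - (size % 4)
    str << '_' * missing if missing != 4
    while i < size
      # str[i] is a byte value on Ruby 1.8; on 1.9+ use str.getbyte(i).
      sum ^= mix(str[i], str[i+1], str[i+2], str[i+3])
      i += 4
    end
    sum
  end

  def mix(a, b, c, d)
    (a << 24) + (b << 16) + (c << 8) + d
  end
end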
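
To make the packing done by mix concrete, here is a small worked example. It is a standalone sketch that uses String#bytes, so it assumes Ruby 1.9 or later, where str[i] no longer returns a byte value.

```ruby
# Same arithmetic as Hash32#mix: pack four bytes into one 32-bit integer.
def mix(a, b, c, d)
  (a << 24) + (b << 16) + (c << 8) + d
end

bytes = 'ABCD'.bytes.to_a   # [65, 66, 67, 68], i.e. 0x41 0x42 0x43 0x44
value = mix(*bytes)
puts value.to_s(16)         # prints 41424344

# The same number can be read directly as a big-endian 32-bit integer.
puts value == 'ABCD'.unpack('N').first   # prints true
```

Each byte lands in its own 8-bit slot, which is why the result always fits in 32 bits.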

I decided to optimise only the mix method because it doesn’t require a lot of knowledge about the Ruby C API. I decided to write the fast version in another file, called hash32.c.rb, which is more elegant and practical.

class Hash32
  require 'inline'
  c_inline = <<__INLINE__
VALUE mix(unsigned int a, unsigned int b, unsigned int c, unsigned int d)
{
  return UINT2NUM((a << 24) + (b << 16) + (c << 8) + d);
}
__INLINE__
  inline do |builder| builder.c(c_inline) end
rescue LoadError => e
  # Ruby Inline is not available: keep the pure Ruby version of mix.
end

First of all we have to load the inline library. Then we write the C code for the mix method. Ruby Inline translates most basic types. However, in this case we need the UINT2NUM macro. I advise you to read the chapter Extending Ruby of the Programming Ruby book. You don’t have to be a Ruby C API guru to optimize with Ruby Inline, but it’s better to have some basic knowledge about it.

Ruby Inline is not included in the Ruby standard library. Moreover, Ruby Inline requires a C compiler like gcc and the Ruby headers to work. That’s why it may not work on all platforms. The best solution is to use the C version where Ruby Inline is available, and the slow version of mix otherwise. This job is done by the rescue LoadError clause, which is triggered if require 'inline' fails.
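
The fallback idea can be tried without Ruby Inline. In the sketch below, hypothetical_fast_mixer is a made-up library name that is assumed not to be installed, and Mixer is a hypothetical module; the point is only to show what happens when the require fails.

```ruby
module Mixer
  # Pure Ruby version, always defined first.
  def self.mix(a, b, c, d)
    (a << 24) + (b << 16) + (c << 8) + d
  end

  begin
    # Made-up library name standing in for the optional C-backed version.
    require 'hypothetical_fast_mixer'
    IMPLEMENTATION = :native
  rescue LoadError
    # The optional dependency is missing: keep the Ruby method defined above.
    IMPLEMENTATION = :ruby
  end
end

puts Mixer::IMPLEMENTATION
puts Mixer.mix(0, 0, 1, 2)
```

Callers use Mixer.mix in both cases and never need to know which version is running.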

You can install Ruby Inline via gem.

gem install RubyInline

And don’t forget to install Ruby headers too. You can download the complete source code. There is a file named collisions.rb that tests the hash sum with about 90 000 words. The script is executed in 7.2 seconds in pure Ruby and in 4.3 seconds with the inline version of mix.

Wishing you nice optimisations with Ruby Inline.

The end of insecure authentications

When I log in on the web over an insecure connection, I always think that anybody could discover my password. The attacker just needs to listen to the network using some Ethereal-like software.

One solution is to use the HTTPS protocol instead of HTTP. Unfortunately, too few web sites use it because it requires an SSL certificate.

To get a certificate you have two possibilities: buy one or generate one yourself. Even though this isn’t really hard, too many web sites don’t encrypt passwords between the client and the server. We need a solution that doesn’t have the constraints of HTTPS. Here is an answer that doesn’t need any configuration on the web server.

First, let’s assume that good web sites don’t store passwords in plain text in their database. Passwords have to be stored as a hash sum mixed with a seed that is unique to each user. It should look like this.

password_sum = md5(password + seed)

We only store password_sum in the database. Thus, if an attacker gets a copy of the database, he can’t read the passwords directly.

For each login attempt, the web site generates a temporary seed. This seed changes the bytes sent for the password on every attempt, so a value captured by a sniffer cannot be replayed for a later login.

The client will compute the following formula before sending the connection request.

encoded_password = md5(md5(password + seed) + temp_seed)

And the server will compute this one to check the connection attempt.

encoded_password = md5(password_sum + temp_seed)

If both values are equal, then the user typed in the correct password.
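
Both formulas can be checked quickly in plain Ruby. This is a sketch of the arithmetic only: the password and the seed lengths are illustrative values, and the md5 helper is defined here on top of Digest::MD5 from the standard library.

```ruby
require 'digest/md5'
require 'securerandom'

# Small helper matching the md5(...) notation used in the formulas.
def md5(text)
  Digest::MD5.hexdigest(text)
end

# Registration: only the salted sum is stored (values are illustrative).
seed = SecureRandom.hex(8)
password_sum = md5('s3cret' + seed)

# Login attempt: the server issues a fresh temporary seed.
temp_seed = SecureRandom.hex(8)

# Client side: what the browser sends instead of the password.
client_value = md5(md5('s3cret' + seed) + temp_seed)

# Server side: recomputed from the stored sum and the same temporary seed.
server_value = md5(password_sum + temp_seed)

puts client_value == server_value   # prints true
```

A wrong password, or a value captured with a previous temporary seed, produces a different sum and is rejected.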

That’s it for the theory part. We will now build our own secure authentication system with Ruby on Rails.

The form will contain the following fields: login, password, seed, temp_seed and encoded_password.

The main difficulty is to get the seed of the specified user. Indeed, the seed is unique to each user. It’s not possible to provide it when the form is created, because we don’t know the user’s login yet. Fortunately, Rails (actually Prototype) has an AJAX helper method called observe_field. It sends a request when a field is edited. Thus, when the login field is modified, a request is sent in the background to load the user’s seed into the seed field. The form looks like this.

<%= javascript_include_tag 'md5' %>
<%= javascript_include_tag 'connection' %>
<%= javascript_include_tag 'prototype' %>
<p style="color: red"><%= flash[:error] %></p>
<%= start_form_tag 'check' %>
  Login: <%= text_field_tag :login %> <br/>
  Password: <%= password_field_tag :password %> <br/>
  Seed: <span id='seed_container'><%= text_field_tag :seed %></span> <br/>
  Temp seed: <%= text_field_tag :tmp_seed, @tmp_seed %> <br/>
  Encoded password: <%= text_field_tag :encoded_password %> <br/>
  <%= submit_tag 'Connection', :onClick => "hash_password('password', 'seed', 'tmp_seed', 'encoded_password')" %> <br/>
  <%= observe_field :login, :frequency => 0.5, :url => {:action => 'seed'}, :update => 'seed_container' %>
<%= end_form_tag %>

Finally, we need to compute the password sum before sending the form. As you can see in the source code, the function hash_password (in connection.js) is called when the user clicks on the submit button.

function hash_password(pwd_id, pwd_seed_id, tmp_seed_id, pwd_sum_id) {
    var pwd = document.getElementById(pwd_id);
    var pwd_seed = document.getElementById(pwd_seed_id);
    var tmp_seed = document.getElementById(tmp_seed_id);
    var pwd_sum = document.getElementById(pwd_sum_id);
    // Compute md5(md5(password + seed) + temp_seed)
    // And don't forget to blank the original password field
    pwd_sum.value = md5_hex(md5_hex(pwd.value + pwd_seed.value) + tmp_seed.value);
    pwd.value = '';
}

This function is quite simple. The last two lines are the most important ones. The first computes the sum as explained before, and the last blanks the password field so that it is not sent over the network. As you may notice, I included an md5 file which isn’t mine and can be found here. You can also use SHA-1 instead of MD5.

The connection attempt is checked in the ‘check’ action of the connection controller.

def check
  tmp_seed = session[:tmp_seed]
  raise if tmp_seed.nil? or tmp_seed != params[:tmp_seed]
  # The temporary seed is single-use: clear it before going further.
  session[:tmp_seed] = nil
  usr = get_user(params[:login])
  raise if usr.nil?
  client_password = params[:encoded_password]
  server_password = hash_sum(usr.password_sum + tmp_seed)
  raise if client_password != server_password
  flash[:notice] = "Connected as #{usr.login}"
rescue => e
  flash[:error] = 'Bad login'
  redirect_to(:action => :open)
end

I hope that this note will help you build your own secure authentication and/or made you realize the importance of not sending passwords in clear text over the internet.