Friday, December 5, 2008

JRuby vs. Java speed (re: Stochastic Simulation with SSJ)

I just read Ali Rizvi's blog post, "JRuby: Stochastic Simulation with SSJ", and decided to take a quick look at the implementation. Of course the Java version will be much faster, but the numbers given seemed a little strange. After setting up the environment and running the tests, I saw the major slowdown as expected, so I took a look at the Ruby code to see if I could speed things up a little. This was done rather quickly; if you get better results, please share!

First, the Java version:
real    0m7.960s
user    0m7.839s
sys     0m0.056s

The JRuby version took too long to run as-is (I'm pressed for time), so I added a few options to the runtime:

jruby -J-Djruby.compile.fastest=true --server collision.rb
real 3m55.834s
user 3m53.684s
sys 0m2.355s


Not as bad as reported in the blog post referenced above, but still pretty bad. I had a hunch that the loops were what was killing the time, so I made the following change to the collision.rb file:

@k.times { |i| @used[i] = false}
changed to:
@used.fill(false)


That brought the time down from just under four minutes to about one minute:


real 1m1.430s
user 1m1.344s
sys 0m0.538s
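
To look at the two forms in isolation, here is a rough micro-benchmark sketch using Ruby's standard Benchmark module. The array size matches the k used in collision.rb; the repetition count (reps) and the variable names are my own assumptions, picked only to keep the run short. Timings will vary by runtime and version, so treat the output as a hint rather than a measurement of the full simulation.

```ruby
require 'benchmark'

k    = 10_000   # array size, same k as in Collision.run
reps = 1_000    # hypothetical repetition count, chosen only to keep the run short

used_loop = Array.new(k, true)
used_fill = Array.new(k, true)

Benchmark.bm(14) do |bm|
  # Original form: one block call per element
  bm.report('times + block:') do
    reps.times { k.times { |i| used_loop[i] = false } }
  end

  # Replacement: a single call that resets the whole array
  bm.report('Array#fill:') do
    reps.times { used_fill.fill(false) }
  end
end
```

Both forms leave the array in the same state; only the place where the loop runs differs.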


It would seem loops are a sore spot. (Kind of expected? The loop for setting happens in the Java impl.) I decided to make one last change to see what would happen: I folded the "generate_c" method back into the "simulate_runs" method to get the loops in one place. My tests showed some (not much) benefit:


real 0m59.081s
user 0m59.027s
sys 0m0.502s


CONCLUSION: Yes, straight Java is faster. Yes, there are ways to tune your Ruby to make things a bit more efficient. Yes, I'm sure there are other ways to make this run faster. I think prototyping this sort of thing in JRuby is just fine; then, when you need more performance (say, if this code will be called from a JRuby on Rails app), you can code it up in Java (Collision.java) and just call that from the JRuby layer.

Here is the final collision.rb I ended up with:

require 'java'
require 'ssj.jar'

import 'umontreal.iro.lecuyer.rng.RandomStream'
import 'umontreal.iro.lecuyer.stat.Tally'
import 'umontreal.iro.lecuyer.rng.MRG32k3a'

class Collision
  def initialize(k, m)
    @k = k
    @m = m
    @lambda = m * m / (2.0 * k)
    @used = Array.new(k, false)
  end

  def simulate_runs(n, stream, stat_c)
    stat_c.init
    n.times do
      c = 0
      @used.fill(false)
      @m.times do
        loc = stream.nextInt(0, @k-1)
        if @used[loc]
          c += 1
        else
          @used[loc] = true
        end
      end
      stat_c.add(c)
    end
    stat_c.setConfidenceIntervalStudent()
    puts stat_c.report(0.95, 3)
    puts " Theoretical mean: #{@lambda} "
  end

  def self.run
    stat_c = Tally.new("Statistics on collision")
    col = Collision.new(10000,500)
    col.simulate_runs(100000, MRG32k3a.new, stat_c)
  end

end

Collision.run
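
If you want to poke at the inner loop without SSJ on the classpath, a stripped-down sketch like the one below runs on plain Ruby as well. This is not the code that was timed above: it swaps MRG32k3a for Ruby's built-in Random, replaces Tally with a simple average, and uses far fewer runs. The helper name mean_collisions, the seed, and the run count are all assumptions made for illustration.

```ruby
# Hypothetical helper, not part of the original collision.rb.
# Throws m items into k slots, n times over, and returns the
# average number of collisions per run.
def mean_collisions(k, m, n, rng)
  used  = Array.new(k, false)
  total = 0
  n.times do
    used.fill(false)
    m.times do
      loc = rng.rand(k)   # uniform integer in 0...k (stands in for stream.nextInt)
      if used[loc]
        total += 1
      else
        used[loc] = true
      end
    end
  end
  total.to_f / n
end

k = 10_000
m = 500
theoretical = m * m / (2.0 * k)   # same formula as @lambda in Collision
mean = mean_collisions(k, m, 200, Random.new(42))   # seed and run count are arbitrary

puts "Simulated mean:   #{mean}"
puts "Theoretical mean: #{theoretical}"
```

With only a couple hundred runs the simulated mean should land in the neighborhood of the theoretical value, though not exactly on it.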
