Sunday, November 25, 2007

Simple file IO in different dynamic languages

Was just messing around and thought I would look at the different ways to skin a cat... in this case, the cat being "read each line of a file and spit out its contents". I'm using TCL, Python, PHP, Ruby, Groovy, Scala and Javascript (Rhino) for this example. Let's take a look at what I came up with:



TCL

[sourcecode language="css"]
set fp [open "test.txt" r]

while {[gets $fp line] != -1} {
    puts stdout $line
}

close $fp[/sourcecode]
Here we need to open the file pointer and loop over the file using a call to the 'gets' procedure, which sets a local variable 'line' for us. ('gets' is wrapped in brackets so we can test its return value: the number of characters read, or -1 at end of file.) Finally, we need to close our file pointer.
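For comparison, the same explicit read-until-EOF loop can be sketched in Python, where 'readline' signals end of file with an empty string rather than -1. The helper name and the throwaway sample file are made up for illustration:

```python
import os
import tempfile


def read_lines_explicitly(path):
    """Collect lines with an explicit readline() loop, closing the file by hand."""
    lines = []
    f = open(path)
    try:
        while True:
            line = f.readline()
            if line == "":  # an empty string means end of file
                break
            lines.append(line.rstrip("\n"))
    finally:
        f.close()
    return lines


# Throwaway sample file (hypothetical contents, not the post's test.txt)
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(demo_path, "w") as out:
    out.write("one\ntwo\n")

print(read_lines_explicitly(demo_path))
```

A blank line in the middle of the file comes back as "\n", not "", so it does not end the loop early.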


Python

[sourcecode language="python"]
f = open("test.txt")
for line in f:
    print(line.strip())
f.close()[/sourcecode]
Similar to TCL, we get our file pointer (in this case an object), loop over the file object and close the pointer. Version 2.6 will support the 'with' keyword, allowing you to wrap the file IO in a block (therefore closing the file pointer for you after executing the block). Here's what it will look like:
[sourcecode language="python"]
from __future__ import with_statement # <-- only use this line for Python 2.5!

with open("test.txt") as f:
    for line in f:
        print(line.strip())[/sourcecode]
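A quick way to convince yourself that the block really closes the file is to check the file object's 'closed' attribute inside and after the block. This is only a small sketch using a throwaway sample file (the path and contents are made up):

```python
import os
import tempfile

# Throwaway sample file (hypothetical contents, not the post's test.txt)
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(demo_path, "w") as out:
    out.write("one\ntwo\n")

with open(demo_path) as f:
    lines = [line.strip() for line in f]
    closed_inside = f.closed  # still open while we are in the block

closed_after = f.closed  # closed as soon as the block exits

print(closed_inside, closed_after)
```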


PHP

[sourcecode language="php"]
<?php
foreach (file('test.txt') as $line) {
    echo $line;
}[/sourcecode]
This code looks 'blockish', but we're actually calling a function 'file' which returns the file's contents as an array... the file pointer is created and closed within this function so we don't have to worry about closing it in this example.
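The practical difference is that the whole file sits in memory before the loop starts. A rough sketch of that trade-off in Python, with hypothetical helper names: one function loads every line up front, the other hands lines out one at a time.

```python
import os
import tempfile


def read_all_lines(path):
    """Eager: load every line into a list before returning."""
    with open(path) as f:
        return f.readlines()


def iter_lines(path):
    """Lazy: yield one line at a time; the file stays open while iterating."""
    with open(path) as f:
        for line in f:
            yield line


# Throwaway sample file (hypothetical contents, not the post's test.txt)
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(demo_path, "w") as out:
    out.write("one\ntwo\n")

eager = read_all_lines(demo_path)
lazy = list(iter_lines(demo_path))
print(eager == lazy)
```

For a tiny test file the two are interchangeable; the eager form only starts to hurt when the file is large.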


Ruby

[sourcecode language="ruby"]
#'test.txt', 'r').each do |line| <-- BAD: open file pointer
# CORRECTION below (no 'r': foreach's second argument is the line separator)
File.foreach('test.txt') do |line|
  puts line
end[/sourcecode]
UPDATE: was leaving file pointer open... thanks for the correction Sebastian Hungerecker.

Now we are getting OO with blocks. With the correction, we call the class method 'File.foreach', which opens the file, passes each line to the supplied block, and closes the file when it is done.
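What a block-taking method does on the caller's behalf can be sketched in a few lines of Python. This is an illustration of the pattern, not Ruby's actual implementation, and 'each_line' is a made-up name:

```python
import os
import tempfile


def each_line(path, block):
    """Open path, pass every line to block, and always close the file."""
    f = open(path)
    try:
        for line in f:
            block(line)
    finally:
        f.close()  # runs even if block raises


# Throwaway sample file (hypothetical contents, not the post's test.txt)
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(demo_path, "w") as out:
    out.write("one\ntwo\n")

collected = []
each_line(demo_path, collected.append)
print(collected)
```

The caller never sees the file pointer, which is why there is nothing left to forget to close.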


Groovy

[sourcecode language="java"]
new File("test.txt").eachLine { line ->
    println line
}[/sourcecode]
Again, OO with blocks... very similar to Ruby.


Scala

[sourcecode language="css"]
import scala.io.Source

for {
  (line) <- Source.fromFile("test.txt").getLines
} print(line)[/sourcecode]
Procedural-style programming using a language that 'goes both ways.' Similar to the PHP code above, the file IO is wrapped up in the 'Source.fromFile' method call, which returns a 'Source' object already populated with the file's contents. We then call the getLines method, which returns an iterable collection.

Javascript (Mozilla Rhino)

[sourcecode language="javascript"]
var lines = readFile("test.txt").split("\n");
lines.pop(); // <-- last item is empty... EOF?

for (var i in lines) {
    print(lines[i]);
}[/sourcecode]
Using the Rhino built-in 'readFile' to get the contents of the file, splitting on line breaks using an OO method call, and finally looping over the array and printing its contents. Note the extra 'pop' I needed to get rid of the empty item in 'lines': the file ends with a newline, so 'split' yields an empty string after the last line break.
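The trailing empty item is easy to reproduce in Python. A small sketch with made-up sample text: splitting on "\n" leaves an empty string at the end, while 'splitlines' does not.

```python
# Made-up sample text that ends with a newline, like most text files
text = "one\ntwo\n"

naive = text.split("\n")   # the final "\n" produces a trailing empty string
clean = text.splitlines()  # treats "\n" as a terminator, so no empty tail

print(naive)
print(clean)
```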


Update: I failed to mention that these times include JVM startup, so this is not a true speed check for IO; it reflects the command-line experience (total execution time of one call). This can cast the JVM implementations in a bad light, but keep in mind the JVM is built for long-running rather than one-shot processes.

Of course I was curious... what kind of speed do these languages have for this basic IO? The test file contained only the following:


I don't have the latest and greatest for a few of these... but here are the versions and real time taken. (using 'time' on Mac OS X 10.4.10)
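To get a feel for how much of a 'real' time is just startup, one option is to time a do-nothing run alongside the full run. A rough Python 3 sketch, timing the Python interpreter itself as a stand-in; the helper name and sample file are made up, and the numbers it prints are not comparable to the ones in this post:

```python
import os
import subprocess
import sys
import tempfile
import time


def wall_clock_seconds(argv):
    """Run argv as a child process and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


# Throwaway sample file (hypothetical contents, not the post's test.txt)
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(demo_path, "w") as out:
    out.write("one\ntwo\n")

read_script = "import sys\nfor line in open(sys.argv[1]):\n    print(line.strip())"

# Startup only: an interpreter that does nothing and exits
startup = wall_clock_seconds([sys.executable, "-c", "pass"])
# Startup plus the line-by-line read
total = wall_clock_seconds([sys.executable, "-c", read_script, demo_path])

print("startup: %.3fs, startup + read: %.3fs" % (startup, total))
```

On a file this small the two figures will be close, which is the point: one-shot timings mostly measure how long the runtime takes to start.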

Native Impls

[sourcecode language="css"]
RUBY: 0m0.009s (1.8.6)
TCL: 0m0.012s (8.5)
PHP: 0m0.024s (4.4.7)
Python: 0m0.034s (2.5.1)[/sourcecode]

JVM Impls

[sourcecode language="css"]
RHINO: 0m0.310s (1.6R2)
GROOVY: 0m0.885s (1.0)
SCALA: 0m1.312s (2.6.0)
JRUBY: 0m1.668s (1.1b1)[/sourcecode]

What does all of this mean? Not much really... just that there are a lot of different ways (many of which I didn't cover here) to skin a cat. Just choose the one that fits your needs and have fun with it.

Did I miss your favorite language here? Did I not use the most optimal code for accomplishing this task? Let me know!


Mike said...

FYI: just ran Jython... it comes in between Groovy and Scala with 0m1.118s...

Anders said...

There's also an alternative way to output the file content in Ruby:


Less OO, but less verbose.