Scala: for a/vs while

After posting about the (almost non-existent) performance impact of implicit conversions, I noticed something that blew my mind somewhat: if I used a for expression instead of while, the code ran much longer. How much? About a hundred times. Here’s a simple example, and again it’s such an example that brings out the difference between two “primitive” things taken out of context, so it doesn’t really tell us anything about real-life application performance:

object ComprehensionPerfImpact {
  def main(args : Array[String]) : Unit = {
    val n = 100000000

    Time("for-comprehension") {
      for (x <- 1 to n) {}
    Time("while") {
      var x = 0
      while (x <= n) { x += 1 }
The output:<code># Block "for-comprehension" completed, time taken: 16797 ms (16.797 s)
# Block "while" completed, time taken: 125 ms (0.125 s)</code>

<b>Update:</b> I forgot to mention that this test was made with Scala version 2.6.0. I have also now run this on Scala 2.7.0 RC1, where the results are about <a href="" title="for a/vs while {update}">2.5 times faster</a>.

<span id="more-11"></span>

The difference is more than a 100 times! This surprised me. I expected it to be significantly slower, but not this much. Lets take a look at where this impact comes from by examining what the library code and the compiler actually do here, as I did in a <a href="" title="Compiler Surprises">previous post</a>. The imperative style <code>while</code> expression will be compiled into some pretty efficient byte-code, which is similar to Java compiler output and looks like this (if you can read Java byte-code):
0:   iconst_0
1:   istore_1
2:   iload_1
3:   aload_0
4:   getfield        #12; //Field n$0:I
7:   if_icmpgt       17
10:  iload_1
11:  iconst_1
12:  iadd
13:  istore_1
14:  goto    2
17:  return

But the <code>for</code> comprehension will actually be turned into a functional expression that iterates over a range and for each element calls the body of the for expression as an anonymous function.

First <code>1 to n</code> is translated to <code>new RichInt(1).to(n)</code>, which through several method calls ends up creating <code>new Range(1, n + 1, 1)</code>, which is a range from 1 until n + 1, stepping by 1 (the default). Note that the difference between "to" and "until" in this context is that "to" is inclusive, while "until" is exclusive (i.e. doesn't include the last element). <code>Range extends RandomAccessSeq.Projection</code>, which in turn <code>extends Seq.Projection with RandomAccessSeq</code> (RandomAccessSeq here is a trait that is mixed in. If you don't know what that means, <a href="" title="Traits and Types">Daniel Spiewak has written about this</a>). <code>Seq.Projection extends Seq with Iterable.Projection</code>, the latter of which <code>extends Iterable</code>... Quite a complicated hierarchy we've got here.

Anyway, <code>Iterable</code> defines among others the following method

<code>def foreach(f: A =&gt; Unit): Unit = elements.foreach(f)</code>

The compiler will actually translate the <code>for</code> expression into a call to this method! Now, this method calls the <code>foreach</code> method of it's <code>elements</code>, which is of type <code>Iterator</code> (is your head spinning from all the types yet?), and so the definition of <code>elements.foreach</code> comes from there:

<code>def foreach(f: A =&gt; Unit) { while (hasNext) f(next) }</code>

The implementation that defines <code>hasNext</code>, however, is in <code>RandomAccessSeq</code>, as an anonymous subtype of <code>BufferedIterator.Advanced</code>, and implements <code>hasNext</code> in the following way:

<code>override def hasNext = idx &lt; RandomAccessSeq.this.length</code>

So <code>hasNext</code> will be called before applying the function <code>f</code> to each element in the collection. And that in turn calls the <code>length</code> method, which is implemented in <code>Range</code>:

def length : Int = {
  if (this.step == 0) throw new Predef.UnsupportedOperationException
  if (start < end && this.step < 0) return 0
  if (start > end && this.step > 0) return 0

  val base = if (start < end) end - start
             else start - end    assert(base >= 0)
  val step = if (this.step < 0) -this.step else this.step
  assert(step >= 0)
  base / step + (if (base % step != 0) 1 else 0)

Notice that there’s actually quite a lot of calculations being done in this method. And this method is being called for each iteration! Actually, if we’d investigate deeper, we would find that it is called not one, but several times per iteration! But looking deeper into that is out of scope for this blog post. While without proof it’s only a suspicion that this method is the real culprit for the slowdown, it should be quite obvious by now that at least it is doing much much more than a simple while cycle.

It’s possible that other Iterables in Scala are actually more efficient than Range, though. I see this as a bug in Range actually. And wanting to report the bug, I went searching in the Scala Trac if something like this was already recorded, and lo and behold! They have already fixed this! Range‘s length is still defined similarly, but as a lazy value instead of a function:lazy val length : Int = { ... }

A lazy value is evaluated the first time it is accessed, and the computed value will be used in subsequent accesses. So here the several-times-per-iteration cost of this method will be reduced to one time for the entire for expression. Side note: notice that the lazy val length used to be a function earlier (def length). Assuming that the interfaces haven’t changed, an abstract function definition is being overwritten with a value definition! How can that be? Well, simply because the awesomeness of Scala allows such overwrites: a value can actually be considered a method that always returns the same thing (constant).

So, seeing as this is already fixed in trunk, I guess I’ll have to do an update on this after the next Scala release and hopefully it will reduce the performance penalty of using for expressions over Ranges considerably. The for expressions are certainly very useful and I would say that go ahead and use them where they result in more readable code than while, but avoid them if you iterate over huge collections or need top performance in a frequently invoked piece of code.

Below is a more complex example that demonstrates how for expressions can make iterations more concise and how easy it is to run queries on XML files in Scala — it compares the dependencies in two Maven POM files, and prints those dependencies which have the same group and artifact ID-s. If you want to learn more about how for expressions get translated into functional calls (which also relates to monads), I recommend going over to James Iry’s blog.

val pom1 = xml.XML.loadFile(“pom1.xml”)
val pom2 = xml.XML.loadFile(“pom2.xml”)

for {
dep1 <- pom1 \ "dependencies" \ "dependency" dep2 <- pom2 \ "dependencies" \ "dependency" if (dep1 \ "groupId") == (dep2 \ "groupId") if (dep1 \ "artifactId") == (dep2 \ "artifactId") } { println("groupId: " + dep1 \ "groupId") println("artifactId: " + dep1 \ "artifactId") println("versions: " + (dep1 \ "version" text) + ", " + (dep2 \ "version" text)) println() }[/sourcecode]This shows Scala's great potential as an XML manipulation language, and some tests that I’ve run actually indicate that Scala can be much faster here than equivalent Ruby or Java code. I may blog about that later.

One thought on “Scala: for a/vs while

  1. Pingback: Scala: for a/vs while {update} « Villane

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s