Slang language goals for 2012

I’m setting myself some goals for the Slang language (I’m renaming Klang to Slang) for the coming year. By the end of 2012 I should have:

  • a usable and useful language
    • programs must not crash, unless you purposefully make them crash
    • compiler must always provide nice error messages (and not crash)
    • must be able to do IO
    • must be able to do OpenGL
    • must have a small standard library
      • IO, Unicode strings, basic collections
      • parser combinators (maybe)
  • programs that are mostly pure by default, except when they’re not (doing IO etc.)
  • allocation in pools or a garbage-collected heap
  • performance mostly on par with C
  • open source compiler written in Scala
  • a comprehensive test suite
    • all language features must have tests
    • all library functions must have tests
    • all compiler errors and warnings must have tests
    • some use case tests, maybe a Project Euler based test suite
  • all the things mentioned in the post where I outlined the first language (most of those exist already)

There are a few additional things that would be nice to have, but are not my explicit goals for the year. The first is a self-hosting compiler. The current compiler spews out .ll files, which then get parsed and compiled by LLVM tools such as llc, but it would be nice to work with LLVM directly. Another thing is separate compilation, which I’m probably not going to implement in the current compiler. The same goes for JIT compilation. And lastly, tooling: support for debugging, syntax highlighters, and maybe a bare-bones Eclipse IDE.

I should be able to pull off the things listed here, some sooner and some later. I still have a lot to learn about language design, type systems and compilers and am not sure at which point I’m going to open source it, but I’m aiming for June 2012 or so.

Klang: How To Solve Overloading?

I haven’t had much time to work on Klang since my vacation ended, but I’ve been thinking about how to solve overloading. Note that this post contains pseudo-code that bears resemblance to, but is not actual Klang code — these are just thoughts so far.

Given types Double and Vector2(x: Double, y: Double), there are mathematical operations that take both types as operands in different combinations, but have the same name. For example, multiplication:

a: Double * b: Double = // intrinsic operation a * b
a: Double * v: Vector2 = Vector2(a * v.x, a * v.y)
v: Vector2 * a: Double = Vector2(a * v.x, a * v.y)

Solution 1: Allow overloading of class methods

package klang
class Double {
  def *(that: Double) = // intrinsic this * that
}

package klang.vecmath
extending Double {
  def *(v: Vector2) = Vector2(this * v.x, this * v.y)
}

Solution 1a: Allow overloading of package level functions

package klang
def *(a: Double, b: Double) = // intrinsic a * b

package klang.vecmath
def *(a: Double, v: Vector2) = Vector2(a * v.x, a * v.y)

Solution 2: Allow for Haskell-style type classes

package klang
typeclass Multiplication[A, B, C] {
  def *(a: A, b: B): C
}
instance Multiplication[Double, Double, Double] {
  def *(a: Double, b: Double) = // intrinsic a * b
}

package klang.vecmath
instance Multiplication[Double, Vector2, Vector2] {
  def *(a: Double, v: Vector2) = Vector2(a * v.x, a * v.y)
}
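
For comparison, here is a rough sketch of the same idea in Rust, whose traits play a role similar to type classes. This is not Klang code: the trait Multiplication, its method times and the Vector2 struct are names made up for illustration, and Rust's f64 stands in for Double. As in the pseudo-code above, the result type is an ordinary third parameter (Self takes the place of A).

```rust
// Hypothetical stand-in for Klang's Vector2.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector2 {
    x: f64,
    y: f64,
}

// Multiplication[A, B, C]: Self plays the role of A, and the result
// type C is a plain type parameter.
trait Multiplication<B, C> {
    fn times(self, b: B) -> C;
}

// instance Multiplication[Double, Double, Double]
impl Multiplication<f64, f64> for f64 {
    fn times(self, b: f64) -> f64 {
        self * b
    }
}

// instance Multiplication[Double, Vector2, Vector2]
impl Multiplication<Vector2, Vector2> for f64 {
    fn times(self, v: Vector2) -> Vector2 {
        Vector2 { x: self * v.x, y: self * v.y }
    }
}

fn main() {
    let v = Vector2 { x: 1.0, y: 2.0 };
    // The annotations on the left spell out the expected result type C.
    let scaled: Vector2 = 2.0_f64.times(v);
    let product: f64 = 3.0_f64.times(4.0_f64);
    println!("{:?} {}", scaled, product);
}
```

The call sites are annotated here because the result type is just another parameter of the trait, so nothing ties it to the operand types.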

Pros and Cons

The first solution, allowing method or function overloading as in Java or Scala, will certainly complicate things, especially once I introduce some form of inheritance (currently there is none). However, I’m not sure the second solution, which is closer to the Haskell way of handling overloading, is any better. The third type class argument (the return type) is especially confusing: am I then also allowed to provide an

instance Multiplication[Double, Vector2, String]

which differs only in the last argument? Obviously it shouldn’t be allowed: selecting a type class instance based on the method’s return type wouldn’t work with Klang’s Scala-style type inference. Maybe the type class could have an abstract type member instead:

typeclass Multiplication[A, B] {
  type Result
  def *(a: A, b: B): Result
}
instance Multiplication[Double, Vector2] {
  type Result = Vector2
  def *(a: Double, v: Vector2) = Vector2(a * v.x, a * v.y)
}
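
The type-member variant can be sketched in Rust as a trait with an associated type. Again this is only an illustration, not Klang: Multiplication, times and Vector2 are hypothetical names, f64 stands in for Double, and Self takes the place of A. Rust's own std::ops::Mul trait is defined along the same lines, with an associated type called Output.

```rust
// Hypothetical stand-in for Klang's Vector2.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector2 {
    x: f64,
    y: f64,
}

// Multiplication[A, B] with a Result type member: Self is A, and each
// pair of operand types fixes exactly one Result.
trait Multiplication<B> {
    type Result;
    fn times(self, b: B) -> Self::Result;
}

// instance Multiplication[Double, Double]
impl Multiplication<f64> for f64 {
    type Result = f64;
    fn times(self, b: f64) -> f64 {
        self * b
    }
}

// instance Multiplication[Double, Vector2]
impl Multiplication<Vector2> for f64 {
    type Result = Vector2;
    fn times(self, v: Vector2) -> Vector2 {
        Vector2 { x: self * v.x, y: self * v.y }
    }
}

// instance Multiplication[Vector2, Double]
impl Multiplication<f64> for Vector2 {
    type Result = Vector2;
    fn times(self, a: f64) -> Vector2 {
        Vector2 { x: a * self.x, y: a * self.y }
    }
}

fn main() {
    let v = Vector2 { x: 1.0, y: 2.0 };
    // No annotations needed: the result type follows from the operand types.
    let scaled = 2.0_f64.times(v);
    let flipped = v.times(2.0_f64);
    let product = 3.0_f64.times(4.0_f64);
    println!("{:?} {:?} {}", scaled, flipped, product);
}
```

With this shape, a second instance for the same operand types but a different Result (the String case above) is rejected by the Rust compiler as a conflicting implementation, so the return type never takes part in instance selection.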

But I haven’t really decided whether to even have type classes or type members. Type classes seem to require a lot more ceremony than simply allowing overloaded methods or module-level functions. And I’m not sure that ceremony buys any real advantage…

I could say that defining a function *(a: Double, b: Vector2) creates an implicit type class *[A, B] and an implicit instance of it *[Double, Vector2]. It would be equivalent to the manually defined type classes, wouldn’t it? Except in the manual case you could put division into the same type class with multiplication, but that would still require more ceremony than simple overloading.

And what of functions like max(a: Double, b: Double) and max(a: Collection[Double])? It would be nice to have the 2-argument version for performance reasons. Type classes wouldn’t necessarily solve that unless one could abstract over method argument arity.
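
One way to approximate abstraction over arity is to let the type class range over the whole argument list, so that the two-argument call passes a pair. Below is a rough sketch in Rust (not Klang); Max, max_of and max are made-up names, and whether the pair version is really as cheap as a hand-written two-argument function depends on the compiler.

```rust
// Hypothetical type class over the whole argument list.
trait Max {
    fn max_of(self) -> f64;
}

// Two-argument version: a single comparison, no collection involved.
// (NaN handling is ignored in this sketch.)
impl Max for (f64, f64) {
    fn max_of(self) -> f64 {
        if self.0 >= self.1 { self.0 } else { self.1 }
    }
}

// Collection version over a slice. An empty slice yields negative
// infinity here, which is an arbitrary choice for the sketch.
impl<'a> Max for &'a [f64] {
    fn max_of(self) -> f64 {
        self.iter().copied().fold(f64::NEG_INFINITY, f64::max)
    }
}

// One name for both shapes; the argument type picks the instance.
fn max<T: Max>(args: T) -> f64 {
    args.max_of()
}

fn main() {
    let pair = max((1.5_f64, 2.5_f64));
    let values = [3.0_f64, 7.0, 5.0];
    let many = max(&values[..]);
    println!("{} {}", pair, many);
}
```

The cost is the extra pair of parentheses at the call site, which a language could hide by treating an argument list as a tuple.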

I should probably read more literature to make better-informed decisions about these things; I just haven’t gotten around to those parts yet :)

What do you think? Do you have an opinion about which way to go or any pointers to reading material?