<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Villane</title>
	<atom:link href="https://villane.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://villane.wordpress.com</link>
	<description>Thoughts on software development</description>
	<lastBuildDate>Sun, 29 Jan 2012 18:16:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='villane.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>https://s-ssl.wordpress.com/i/buttonw-com.png</url>
		<title>Villane</title>
		<link>https://villane.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="https://villane.wordpress.com/osd.xml" title="Villane" />
	<atom:link rel='hub' href='https://villane.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Mixfix Operators &amp; Parser Combinators, Bonus Part 2a</title>
		<link>https://villane.wordpress.com/2012/01/21/mixfix-operators-parser-combinators-bonus-part-2a/</link>
		<comments>https://villane.wordpress.com/2012/01/21/mixfix-operators-parser-combinators-bonus-part-2a/#comments</comments>
		<pubDate>Fri, 20 Jan 2012 23:26:01 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Parsers]]></category>
		<category><![CDATA[Slang]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=249</guid>
		<description><![CDATA[This is a short bonus post in the Mixfix Operator series. Part 1 was an introduction to mixfix operators and in part 2 we looked at them more closely in the context of a grammar for a boolean algebra and arithmetic language. The implementation of a parser for the language is coming next, but before [...]]]></description>
			<content:encoded><![CDATA[<p>This is a short bonus post in the Mixfix Operator series. <a href="http://villane.wordpress.com/2012/01/16/mixfix-operators-parser-combinators-part-1/">Part 1</a> was an introduction to mixfix operators and in <a href="http://villane.wordpress.com/2012/01/17/mixfix-operators-parser-combinators-part-2/">part 2</a> we looked at them more closely in the context of a grammar for a boolean algebra and arithmetic language. The implementation of a parser for the language is coming next, but before that I thought it would be interesting to see what the grammar would look like if we removed the mixfix abstraction and mechanically converted the precedence graph to <a href="http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form">BNF</a> notation.</p>
<p>It turns out this is not that hard if we turn each operator group (graph node) into a separate production and leave out the irrelevant productions for the types of operators we don&#8217;t have in those groups. This is especially easy in our case since each group contains operators of only one type. I&#8217;ll use shorthand names here for brevity, and write <code>g↑</code> for the alternatives that bind tighter than group <code>g</code>.</p>
<blockquote><p><code><br />
expr ::= or | and | not | eq | cmp<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | add | mul | exp | neg | tightest</p>
<p>or&nbsp;&nbsp; ::= (or | or↑) "|" or↑<br />
or↑&nbsp; ::= and | not | eq | cmp | tightest</p>
<p>and&nbsp; ::= (and | and↑) "&amp;" and↑<br />
and↑ ::= not | eq | cmp | tightest</p>
<p>not&nbsp; ::= "!" (not | not↑)<br />
not↑ ::= eq | cmp | tightest</p>
<p>eq&nbsp;&nbsp; ::= (eq | eq↑) ("=" | "≠") eq↑<br />
eq↑&nbsp; ::= cmp | add | mul | exp | neg | tightest</p>
<p>cmp&nbsp; ::= (cmp | cmp↑) ("&lt;" | "&gt;") cmp↑<br />
cmp↑ ::= add | mul | exp | neg | tightest</p>
<p>add&nbsp; ::= (add | add↑) ("+" | "-") add↑<br />
add↑ ::= mul | exp | neg | tightest</p>
<p>mul&nbsp; ::= (mul | mul↑) ("*" | "/" | "mod") mul↑<br />
mul↑ ::= exp | neg | tightest</p>
<p>exp&nbsp; ::= (exp | exp↑) "^" exp↑<br />
exp↑ ::= neg | tightest</p>
<p>neg&nbsp; ::= "-" (neg | neg↑)<br />
neg↑ ::= tightest</p>
<p>tightest ::= ("(" expr ")") | value<br />
</code></p></blockquote>
<p>To be honest, encoding the whole graph as BNF is a lot simpler than I initially thought, and so is translating this into a combinator parser. It makes me wonder whether the mixfix grammar abstraction could be overkill. Of course, this is so easy only because we have relatively few kinds of operators: only left-associative infix, prefix and closed. If we had more operators, with more holes in them, and different types of operators in one group (which is probably unusual, though), perhaps we wouldn&#8217;t find the conversion to be that simple any more.</p>
<p>Plainly this simplified scheme won&#8217;t help much with user-defined operators and precedence, so I think the mixfix parser abstraction is still useful. However, in cases where there are only a few operators/operator groups, maybe the straightforward translation of this BNF form into parser combinators is preferable. If there&#8217;s room left in the next post, I&#8217;ll include an alternative implementation based on this scheme.</p>
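<p>To give a feel for how mechanical the translation is, here is a minimal hand-written sketch of the arithmetic productions only (<code>add</code>, <code>mul</code>, <code>exp</code>, <code>neg</code> and <code>tightest</code>). It is an illustration under assumptions, not the implementation planned for the next post: it uses plain Scala functions rather than the parser combinator library, expects tokens separated by single spaces, replaces the left-recursive productions with a loop, and returns a fully parenthesized string instead of an AST. The names <code>BnfSketch</code>, <code>leftAssoc</code> and <code>parse</code> are made up for this example.</p>
<pre><code>object BnfSketch {
  // (fully parenthesized expression, tokens not yet consumed)
  type Result = (String, List[String])

  // tightest ::= "(" expr ")" | value
  // Any other token is accepted as a value; this sketch does no validation.
  def tightest(ts: List[String]): Result = ts match {
    case "(" :: rest =>
      val (e, after) = expr(rest)
      after match {
        case ")" :: tail => (e, tail)
        case _           => sys.error("expected )")
      }
    case v :: rest => (v, rest)
    case Nil       => sys.error("unexpected end of input")
  }

  // exp↑ ::= neg | tightest, where neg ::= "-" (neg | neg↑)
  def expUp(ts: List[String]): Result = ts match {
    case "-" :: rest =>
      val (e, after) = expUp(rest)
      (s"(-$e)", after)
    case _ => tightest(ts)
  }

  // g ::= (g | g↑) op g↑, written as a loop instead of left recursion.
  // Also accepts a bare g↑, so it covers the alternatives "g | g↑".
  def leftAssoc(ops: Set[String], up: List[String] => Result)(ts: List[String]): Result = {
    def loop(left: String, rest: List[String]): Result = rest match {
      case op :: tail if ops(op) =>
        val (right, after) = up(tail)
        loop(s"($left $op $right)", after)
      case _ => (left, rest)
    }
    val (first, rest) = up(ts)
    loop(first, rest)
  }

  def mulUp(ts: List[String]): Result = leftAssoc(Set("^"), expUp)(ts)
  def addUp(ts: List[String]): Result = leftAssoc(Set("*", "/", "mod"), mulUp)(ts)
  def expr(ts: List[String]): Result  = leftAssoc(Set("+", "-"), addUp)(ts)

  def parse(input: String): String = {
    val (e, rest) = expr(input.split(" ").toList)
    if (rest.nonEmpty) sys.error("unexpected token: " + rest.head)
    e
  }
}</code></pre>
<p>With these definitions, <code>BnfSketch.parse("a + b * c")</code> gives <code>(a + (b * c))</code> and <code>BnfSketch.parse("- 5 ^ 6")</code> gives <code>((-5) ^ 6)</code>, matching the groupings from part 2.</p>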
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2012/01/21/mixfix-operators-parser-combinators-bonus-part-2a/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>Mixfix Operators &amp; Parser Combinators, Part 2</title>
		<link>https://villane.wordpress.com/2012/01/17/mixfix-operators-parser-combinators-part-2/</link>
		<comments>https://villane.wordpress.com/2012/01/17/mixfix-operators-parser-combinators-part-2/#comments</comments>
		<pubDate>Tue, 17 Jan 2012 10:00:35 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Parsers]]></category>
		<category><![CDATA[Slang]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=216</guid>
		<description><![CDATA[In the previous post I introduced the notion of mixfix operators. In this post we will look at them more closely, in the context of an actual grammar. In the next part we will implement the parser for this grammar, look at performance issues and try to fix them with packrat parsers. We will implement [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a href="https://villane.wordpress.com/2012/01/16/mixfix-operators-parser-combinators-part-1/">previous post</a> I introduced the notion of mixfix operators. In this post we will look at them more closely, in the context of an actual grammar. In the next part we will implement the parser for this grammar, look at performance issues and try to fix them with packrat parsers.</p>
<p>We will implement a simple language that consists of boolean algebra and integer arithmetic expressions. The grammar for the language looks like the following (we’re only considering tokens here and assume that a lexer has already identified literals, identifiers and delimiters in the text):</p>
<blockquote><p><code>statement &nbsp; ::= expression | declaration<br />
declaration ::= variable ":=" expression<br />
expression&nbsp; ::= ??? | value<br />
value &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ::= literal | variable<br />
literal &nbsp;&nbsp;&nbsp; ::= booleanLiteral | integerLiteral<br />
variable &nbsp;&nbsp; ::= identifier</code></p></blockquote>
<p>What should the expression productions look like, though? In examples of parsers and grammars we can commonly find an arithmetic expression language described with concepts of ‘factor’ and ‘term’ to create a precedence relation between addition and multiplication:</p>
<blockquote><p><code>expression ::= (term "+")* term<br />
term &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ::= (factor "*")* factor<br />
factor &nbsp;&nbsp;&nbsp; ::= constant | variable<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | "(" expression ")"</code></p></blockquote>
<p>This seems simple, but when we add more precedence rules, it can get quite complex, especially if we are writing a parser for a general purpose programming language instead of a simple expression language, and we also do semantic actions (create AST nodes) in the parser. This also makes the set of operators rather fixed: you might have to change several grammar productions to add a new operator with a new precedence level. I didn’t even try building Slang’s precedence rules into the grammar in this fashion.</p>
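<p>For comparison, here is a rough sketch of how the term/factor grammar above maps onto code. It is a hand-written evaluator, not a combinator parser, and it makes some assumptions: tokens are separated by single spaces, only integer constants are supported (no variables), and the repetition <code>(term "+")* term</code> is written as right recursion, which accepts the same input. The names <code>TermFactor</code> and <code>eval</code> are invented for this example.</p>
<pre><code>object TermFactor {
  // (computed value, tokens not yet consumed)
  type Result = (Int, List[String])

  // factor ::= constant | "(" expression ")"
  def factor(ts: List[String]): Result = ts match {
    case "(" :: rest =>
      val (v, after) = expression(rest)
      after match {
        case ")" :: tail => (v, tail)
        case _           => sys.error("expected )")
      }
    case n :: rest => (n.toInt, rest)
    case Nil       => sys.error("unexpected end of input")
  }

  // term ::= (factor "*")* factor
  def term(ts: List[String]): Result = {
    val (first, rest) = factor(ts)
    rest match {
      case "*" :: tail =>
        val (next, after) = term(tail)
        (first * next, after)
      case _ => (first, rest)
    }
  }

  // expression ::= (term "+")* term
  def expression(ts: List[String]): Result = {
    val (first, rest) = term(ts)
    rest match {
      case "+" :: tail =>
        val (next, after) = expression(tail)
        (first + next, after)
      case _ => (first, rest)
    }
  }

  def eval(input: String): Int = {
    val (v, rest) = expression(input.split(" ").toList)
    if (rest.nonEmpty) sys.error("unexpected token: " + rest.head)
    v
  }
}</code></pre>
<p>Here <code>TermFactor.eval("2 + 3 * 4")</code> is 14 and <code>TermFactor.eval("( 2 + 3 ) * 4")</code> is 20. Every precedence level is its own function, so each new level means another function and a change to its neighbours.</p>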
<p>Mixfix parsers still make the precedence part of the grammar, but there is a layer of abstraction there: we describe operators and their precedence rules as a directed graph, where (groups of) operators are the nodes and precedences are the edges. Then we instantiate the grammar with that particular precedence graph.</p>
<p>Before getting to the precedence rules in the language we are about to create, let’s look at the operators it will have. In the list below, <code>_</code> means a hole in the expression that can contain any other expression that “binds tighter” than the operator in question. In the case where the hole is closed on both left and right, it can contain any expression at all. Only a pair of parentheses forms a closed operator in this language.</p>
<blockquote><p><code>( _ )</code> – parentheses<br />
<code>_ + _</code> – addition<br />
<code>_ - _</code> – subtraction<br />
<code>&nbsp; - _</code> – negation<br />
<code>_ * _</code> – multiplication<br />
<code>_ / _</code> – division<br />
<code>_ ^ _</code> – exponent<br />
<code>_ mod _</code> – modulo/remainder<br />
<code>_ = _</code> – equality test<br />
<code>_ ≠ _</code> – inequality test<br />
<code>_ &lt; _</code> – less than<br />
<code>_ &gt; _</code> – greater than<br />
<code>_ &amp; _</code> – conjunction<br />
<code>_ | _</code> – disjunction<br />
<code>&nbsp; ! _</code> – logical not</p></blockquote>
<p>This doesn’t include many common operators in real programming languages, but it is enough to demonstrate some interesting aspects of mixfix operators and using a DAG to describe their precedence relations. I used <code>mod</code> instead of <code>%</code> to show that operators don’t have to be symbols.</p>
<p>Before defining the precedence rules, let’s look at some sample expressions and how we want them to be interpreted, mostly sticking with existing well-known precedence rules, such as those in C, Java or Scala, but occasionally deviating from them:</p>
<blockquote><p><code>a + b * c &nbsp;&nbsp;&nbsp;&nbsp; = a + (b * c)<br />
a &lt; b &amp; b &lt; c&nbsp; = (a &lt; b) &amp; (b &lt; c)<br />
-5 ^ 6 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = (-5) ^ 6<br />
a &amp; !b | c &nbsp;&nbsp;&nbsp; = (a &amp; (!b)) | c<br />
5 &lt; 2 ≠ 6 &gt; 3&nbsp; = (5 &lt; 2) ≠ (6 &gt; 3)<br />
1 &lt; x &amp; !x &gt; 5 = (1 &lt; x) &amp; !(x &gt; 5)</code></p></blockquote>
<p>I think that’s enough examples for now. Let’s try to describe the rules behind these somewhat intuitive expectations as a precedence graph. First, we’ll put the operators into groups where all operators in one group bind just as tightly as the others in the same group. For example, <code>1 + 2 - 3</code> will be <code>(1 + 2) - 3</code> and <code>1 - 2 + 3</code> will be <code>(1 - 2) + 3</code>.</p>
<blockquote><p><code>parentheses&nbsp;&nbsp; : ()<br />
negation&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : - (prefix)<br />
exponent&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : ^<br />
multiplication: *, /, mod<br />
addition&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : +, -<br />
comparison&nbsp;&nbsp;&nbsp; : &lt;, &gt;<br />
equality&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : =, ≠<br />
not&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : ! (prefix)<br />
and&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : &amp;<br />
or&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : |</code></p></blockquote>
<p>Negation (prefix <code>-</code>) is in its own group so that we can do: <code>-2 + 1</code>. If it were in the same group with infix <code>-</code> and <code>+</code>, then it couldn’t appear next to them without parentheses because prefix operators are treated as right-associative, but most infix operators, such as <code>-</code> and <code>+</code>, are left-associative. And we can’t mix left-associative and right-associative operators of the same precedence level! Why? Take the expression</p>
<blockquote><p><code>1 + 2 - 3</code></p></blockquote>
<p>If <code>+</code> and <code>-</code> are left-associative, it means <code>(1 + 2) - 3</code>.</p>
<p>If <code>-</code> were right-associative instead, then both <code>(1 + 2) - 3</code> and <code>1 + (2 - 3)</code> would be valid parses, and the expression would be ambiguous!</p>
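<p>The example above happens to evaluate to 0 under both groupings, so the cost of the ambiguity is easier to see with the operators swapped. A tiny worked example (the name <code>Grouping</code> is only for illustration):</p>
<pre><code>object Grouping {
  // Left-associative reading of 1 - 2 + 3
  val leftReading: Int = (1 - 2) + 3   // 2

  // Right-associative reading of the same tokens
  val rightReading: Int = 1 - (2 + 3)  // -4
}</code></pre>
<p>The same three tokens give two different results, so a grammar that admits both readings cannot assign the expression a single meaning.</p>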
<p>We could read the list of operator groups above as an order of precedence, where the first group (<code>parentheses</code>) binds tightest and the last group (<code>or</code>) binds least tight. This would be mostly compatible with many programming languages and we would have a good enough set of precedence rules right there.</p>
<p>However, as mentioned earlier, Danielsson’s mixfix grammar scheme describes precedence relations as a directed graph. Each of the groups above is a node in the graph, and a directed edge from one node to another <code>a -&gt; b</code> means: <code>b binds tighter than a</code>. So let’s describe these relations as a graph instead. It will be in reverse order compared to the above list, where we started from the most tightly binding:</p>
<blockquote><pre><code>or             -&gt; and, not, equality, comparison, parentheses
and            -&gt; not, equality, comparison, parentheses
not            -&gt; equality, comparison, parentheses

equality       -&gt; comparison, addition, multiplication, exponent, negation, parentheses
comparison     -&gt; addition, multiplication, exponent, negation, parentheses

addition       -&gt; multiplication, exponent, negation, parentheses
multiplication -&gt; exponent, negation, parentheses
exponent       -&gt; negation, parentheses
negation       -&gt; parentheses</code></pre>
</blockquote>
<p>Notice that from each group we draw the edge not into a single group, but into all of the groups that bind tighter. This is because of the non-transitivity of precedence in this scheme: each pair of operator groups that is to have a precedence relation must have an edge between them in the graph. The advantage of this is that we don’t need to describe the precedence between operators that aren’t related at all. This is one of the motivations for using a directed graph to represent operator precedence.</p>
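<p>One way to picture this is to store the graph as a plain map from each group to the groups that bind tighter, and to answer precedence questions with a direct lookup only. The following is a sketch under that assumption. It is not how the mixfix library represents the graph, and the names <code>PrecedenceGraph</code>, <code>bindsTighter</code> and <code>related</code> are made up here.</p>
<pre><code>object PrecedenceGraph {
  // An edge a -> b means: b binds tighter than a.
  val edges: Map[String, Set[String]] = Map(
    "or"             -> Set("and", "not", "equality", "comparison", "parentheses"),
    "and"            -> Set("not", "equality", "comparison", "parentheses"),
    "not"            -> Set("equality", "comparison", "parentheses"),
    "equality"       -> Set("comparison", "addition", "multiplication", "exponent", "negation", "parentheses"),
    "comparison"     -> Set("addition", "multiplication", "exponent", "negation", "parentheses"),
    "addition"       -> Set("multiplication", "exponent", "negation", "parentheses"),
    "multiplication" -> Set("exponent", "negation", "parentheses"),
    "exponent"       -> Set("negation", "parentheses"),
    "negation"       -> Set("parentheses"),
    "parentheses"    -> Set.empty[String]
  )

  // Direct lookup only: no transitive closure is computed.
  def bindsTighter(tighter: String, than: String): Boolean =
    edges.getOrElse(than, Set.empty[String]).contains(tighter)

  // Two groups have a precedence relation only if an edge connects them.
  def related(a: String, b: String): Boolean =
    bindsTighter(a, b) || bindsTighter(b, a)
}</code></pre>
<p>With this encoding, <code>related("and", "equality")</code> and <code>related("equality", "addition")</code> are both true, but <code>related("and", "addition")</code> is false: the two edges do not compose into a third one.</p>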
<p>I hope that from the names of the operators it was clear that some of them will apply only to booleans and some only to integers. For example, the <code>&amp;</code> operator isn’t defined as bitwise <code>&amp;</code>, only as logical conjunction. Thus, assuming that our language is strongly typed, some of the operators can’t appear in the holes of some other operators in a correct program.</p>
<p>A parser doesn’t do type checking of course, but with this mixfix grammar scheme, it does implicitly do precedence correctness checking. For example <code>4 + 5 &amp; 6 + 4</code> is not precedence correct, as we didn’t define a precedence relation between <code>addition</code> and <code>and</code>. And due to the parser’s precedence checking, this expression will not even parse.</p>
<p>If we had used a total precedence order instead, we would have <code>+</code> binding tighter than <code>&amp;</code>. The expression would be interpreted as <code>(4 + 5) &amp; (6 + 4)</code> but would probably yield a type error as <code>&amp;</code> works on booleans, but <code>+</code> works on integers. With the precedence graph we can still write <code>(4 + 5) &amp; (6 + 4)</code> ourselves, and that will parse because we made the precedence explicit. Strictly speaking, parentheses follow the same rules: remember that in our graph, <code>()</code> binds tighter than everything.</p>
<p>The fact that the parser only produces precedence correct expressions can be both a blessing and a curse.</p>
<p>On one hand, this allows us to view some unrelated groups of operators almost as sublanguages. In our case, boolean algebra and integer arithmetic. This might be good for implementing internal DSLs in the presence of extremely flexible user-defined mixfix operators. We could allow users to extend our precedence graph or even replace it completely with their own. If a DSL has boolean logic in it, but no arithmetic, it might have precedence relations to logical operators, but not to arithmetic operators. This would preclude arithmetic operators from appearing in the DSL without being surrounded by parentheses. Or the DSL could even disallow parentheses. Implementing this much flexibility in a host language is complicated, though. For example, the parser would have to know about any custom mixfix grammars defined in imported modules.</p>
<p>On the other hand, this puts some correctness checks at the wrong level. Arguably, a parser should only validate the syntax of a program and nothing else. If a simple mistake such as using a wrong operator (equal to calling a non-existing method in some languages) would prevent the whole program from being parsed, it would also prevent the compiler from doing <a href="http://james-iry.blogspot.com/2012/01/type-errors-as-warnings.html">other interesting and useful things</a>, or reporting better error messages.</p>
<p>So maybe this grammar scheme isn’t ideal for a general purpose programming language. I am sticking with it in Slang for now, because the scheme is relatively simple and works for me at least as long as I’m the only user of Slang :) And perhaps there are workarounds that would allow a precedence-incorrect expression to be accepted by the parser still. But I don’t have immediate plans to allow a wide variety of user-defined mixfix operators or operator precedence.</p>
<p>Anyway, for our simple language, I think this scheme works well enough as long as we don’t care whether it is the parser or the type checker that reports the errors in incorrect programs. There aren’t any useful direct precedence relations between boolean algebra operators and arithmetic operators here. Only by having <code>equality</code>, <code>comparison</code> or <code>parentheses</code> between them, can we put them in the same expression.</p>
<p>Let’s look at one of the consequences of our rules more closely. Many languages, including Java and C, would put most prefix (unary) operators such as <code>!</code> and <code>-</code> at the same level of precedence, binding tighter than all infix (binary) operators. In Java, <code>!6 == 5</code> is a type error because the operator <code>!</code> is bound to <code>6</code>, not to <code>6 == 5</code>, and <code>!</code> isn’t defined on integers. In our language, it isn’t necessary to have <code>!</code> at the same level as <code>-</code>, though. Since there is no (precedence) relation between logical and arithmetic operators, <code>!6 + 5</code> will not parse. But <code>!</code> does have a relation to comparison and equality tests (they bind tighter), so you can write <code>!6 = 5</code> and it will mean <code>!(6 = 5)</code>.</p>
<p>The precedence rule that has <code>=</code> binding tighter than boolean operators is based on the assumption that booleans are rarely compared to each other, but multiple comparisons of other types of values are often used in disjunctions, conjunctions and complements.</p>
<p>To get back to the question at the beginning of the post, what would the expression production in the grammar look like instead of <code>expression ::= ??? | value</code>? The short answer is that we replace <code>???</code> with the mixfix grammar scheme instantiated with our particular precedence graph. The long answer would probably take an entire blog post by itself. You can read more about this scheme in the <a href="http://www.cse.chalmers.se/%7Enad/publications/danielsson-norell-mixfix.html">Agda paper</a>, or look at the source code of my <a href="https://github.com/Villane/mixfix-parsers/tree/master/src/main/scala/mixfix">mixfix library</a>. The scheme looks somewhat like the parser combinators in the following pseudo-code (<code>~</code> means sequential composition):</p>
<pre><code>value = variable | literal
expression = mixfixGrammar(precedenceGraph) | value

mixfixGrammar(graph) = {
  // graph - the precedence graph
  // g - an operator group, node in the graph
  // op - an operator in a group

  ⋁(parsers) = // returns the result of the first parser in the list to succeed

  opsLeft(g)   = // all left-associative infix operators in g
  opsRight(g)  = // all right-associative infix operators in g
  opsNon(g)    = // all non-associative infix operators in g
  opsClosed(g) = // all closed operators in g
  opsPre(g)    = // all prefix operators in g
  opsPost(g)   = // all postfix operators in g

  operator(op) =
    if (op.internalArity == 0)
      op.namePart1
    if (op.internalArity == 1)
      op.namePart1 ~ expression ~ op.namePart2
      // expression is a recursive reference back to the "outer" production
      // these are the internal "holes" that can take any expression

  group(g)  = closed(g) | non(g) | left(g) | right(g)    // any ops in this group

  closed(g) = ⋁{ opsClosed(g) map operator }             // closed ops

  non(g)    = ↑(g) ~ ⋁{ opsNon(g) map operator } ~ ↑(g)  // non-associative ops

  left(g)   = (left(g) | ↑(g))                           // left-associative ops
              ~ ( ⋁{ opsPost(g) map operator }
                | ⋁{ opsLeft(g) map operator } ~ ↑(g) )

  right(g)  = ( ⋁{ opsPre(g) map operator }              // right-associative ops
              | ↑(g) ~ ⋁{ opsRight(g) map operator } )
              ~ (right(g) | ↑(g))

  ↑(g) = ⋁{ graph.groupsTighterThan(g) map group } // every group that binds tighter than g
         | value                                   // or the tightest "group" of values

  return ⋁{ graph.nodes map group }
}</code></pre>
<p>If you don’t understand this right now, no big deal — it’s late enough that I couldn’t come up with a better representation of the actual code that would fit in this post. And if you are not familiar with parser combinators I would recommend reading Daniel Spiewak’s <a href="http://www.codecommit.com/blog/scala/the-magic-behind-parser-combinators">post on the subject</a>, at least before continuing to the next part of this series.</p>
<p>You may have noticed that the <code>value</code> and <code>expression</code> productions are referenced inside the <code>mixfixGrammar</code>. This is no good if the mixfix library is to be a separate module, so I actually implemented that by introducing a pseudo operator group that has a custom parser. This pseudo-group is then added to the precedence graph along with edges from every other group into that “really tight” group.</p>
<p>This concludes part 2. In the next part we will forget this pseudo-code and use Scala’s parser combinators and my <a href="https://github.com/Villane/mixfix-parsers/tree/master/src/main/scala/mixfix">mixfix library</a> to implement an actual parser for the language, and maybe an AST and an interpreter as well.</p>
<p><em>Thanks to <a href="https://twitter.com/#!/milessabin">Miles Sabin</a> and <a href="https://twitter.com/#!/djspiewak">Daniel Spiewak</a> for reviewing drafts for this series of posts.</em></p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2012/01/17/mixfix-operators-parser-combinators-part-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>Mixfix Operators &amp; Parser Combinators, Part 1</title>
		<link>https://villane.wordpress.com/2012/01/16/mixfix-operators-parser-combinators-part-1/</link>
		<comments>https://villane.wordpress.com/2012/01/16/mixfix-operators-parser-combinators-part-1/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 08:00:19 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Parsers]]></category>
		<category><![CDATA[Slang]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=213</guid>
		<description><![CDATA[Until recently, Slang’s parser really sucked. It was a quick hack implemented with Scala’s parser combinator library. Nothing really wrong about that in particular, but there was a gaping hole in the grammar: no operator precedence. So to get an expression like a + b * c to mean a + (b * c) I [...]]]></description>
			<content:encoded><![CDATA[<p>Until recently, <a href="http://villane.wordpress.com/2011/12/24/slang-language-goals-for-2012/">Slang</a>’s parser really sucked. It was a quick hack implemented with Scala’s parser combinator library. Nothing really wrong about that in particular, but there was a gaping hole in the grammar: no operator precedence. So to get an expression like <code>a + b * c</code> to mean <code>a + (b * c)</code> I had to add the parentheses myself. In fact, there were even more problems — some things that should have been left-associative were right-associative. This resulted in very hairy test code, with lots of parentheses everywhere.</p>
<p>Although I think parsers are cool, I am actually not very good at writing one for a complex grammar. I feel that I just know too little about the theory behind them or how to put it to practical use. I’ve used parser combinators before and think they are probably the easiest way for newbies like me to implement parsers, so that’s what I used. The use of symbolic names in the library might be scary the first time, but actually I think parsing is one of the few contexts where use of lots of symbols and extremely concise code is desirable. It allows one to put a lot of code on a few lines, and when you are looking at or writing a parser, you want to see many productions of the grammar at the same time to understand what is going on. At least I do.</p>
<p>For Slang, I implemented something minimal that could parse the language. I had no idea how to solve operator precedence well with parser combinators, and I didn’t want to spend a lot of time studying parsers, because the next compiler phases seemed more interesting at first. But getting the parser right is important for actually using the language because it’s the first thing that processes the code and reports errors. A parser that only kind of works can be very annoying.</p>
<p>Thankfully <a href="https://twitter.com/#%21/milessabin">Miles Sabin</a> suggested that I should look into mixfix operator parsers, and I did. I don’t know exactly where the word mixfix comes from, so I’m assuming it means mixed fixity — operators can be prefix, infix, postfix or closed. Here are some samples:</p>
<ul>
<li>prefix : <code>-a</code></li>
<li>infix : <code>a + b</code></li>
<li>postfix: <code>n!</code></li>
<li>closed : <code>(a)</code></li>
</ul>
<p>Of course, most languages have operators with all of these fixities. The term mixfix actually refers to something more flexible than that — a mixfix operator can be seen as a sequence of alternating name parts and “holes in the expression”. A hole is where the operator’s arguments go.</p>
<blockquote><p><code>_ + _</code> has two holes and one name part <code>+</code> (and is infix)<br />
<code>if _ then _ else _</code> has three name parts <code>if</code>, <code>then</code>, <code>else</code> and three holes (and is prefix)</p></blockquote>
<p>From the mixfix viewpoint, many syntactic constructs might be seen as operators that can have precedence in relation to others, and this concept of many name parts can make it easier to let users define their own operators in a more flexible way than just a single prefix or infix word (as is allowed by Scala). I think this would be a really nice way of creating internal DSLs. In Slang, like in Scala, most operators are really methods. Slang doesn’t allow user-defined fixity or precedence for methods yet (or even multiple argument lists), but I may add this feature one day.</p>
<p>There are existing languages that support mixfix operators, such as <a href="http://wiki.portal.chalmers.se/agda/">Agda</a>, <a href="http://maude.cs.uiuc.edu/maude2-manual/html/maude-manualch3.html#x15-360003.9">Maude</a> and <a href="http://www.bitc-lang.org/browse/compiler/MixFixProcessing.html">BitC</a>. To my knowledge, all these languages assign numeric precedence values to operators, and no language currently uses the exact scheme we will look at, although it was proposed for Agda.</p>
<p>Mixfix operators can be implemented in many ways, but one of the first things I found was the paper <a href="http://www.cse.chalmers.se/%7Enad/publications/danielsson-norell-mixfix.html">Parsing Mixfix Operators</a> by Anders Danielsson and Ulf Norell that was a great help to me. I was able to implement the grammar scheme described in that paper on top of Scala’s parser combinators and patch that into Slang’s existing parser with minimal changes to existing productions. The characteristics of the grammar scheme described in Danielsson’s paper seemed like a good enough fit for what I wanted for Slang:</p>
<ul>
<li>operator name parts and holes alternate — there can’t be two subsequent name parts or two subsequent holes</li>
</ul>
<blockquote><p><code>if _ then _ else _</code> is ok, <code>if _ _ else _</code> is not</p></blockquote>
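<p>The alternation rule is easy to check mechanically. Here is a minimal Scala sketch of the idea. It is not taken from the library, and <code>Part</code>, <code>Name</code>, <code>Hole</code>, <code>MixfixOp</code> and <code>MixfixExamples</code> are hypothetical names used only for illustration:</p>
<p><pre class="brush: scala;">
// Hypothetical model of a mixfix operator: a sequence of name parts and holes.
sealed trait Part
case class Name(text: String) extends Part
case object Hole extends Part

case class MixfixOp(parts: List[Part]) {
  // Well-formed if no two neighbouring parts are of the same kind.
  def alternates: Boolean =
    parts.zip(parts.drop(1)).forall {
      case (Hole, Hole)       =&gt; false
      case (Name(_), Name(_)) =&gt; false
      case _                  =&gt; true
    }

  // Render in underscore notation, e.g. if _ then _ else _
  def show: String = parts.map {
    case Name(t) =&gt; t
    case Hole    =&gt; &quot;_&quot;
  }.mkString(&quot; &quot;)
}

object MixfixExamples {
  val ifThenElse = MixfixOp(List(Name(&quot;if&quot;), Hole, Name(&quot;then&quot;), Hole, Name(&quot;else&quot;), Hole))
  val broken     = MixfixOp(List(Name(&quot;if&quot;), Hole, Hole, Name(&quot;else&quot;), Hole))
}
</pre></p>
<p>In this sketch <code>MixfixExamples.ifThenElse.alternates</code> holds, while <code>MixfixExamples.broken.alternates</code> does not, because of the two adjacent holes.</p>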
<ul>
<li>operator precedence is described as a <a href="http://en.wikipedia.org/wiki/Directed_acyclic_graph">directed acyclic graph (DAG)</a>, not as a total or partial ordering. You only have to describe the precedence relations where they make sense (more about this in the next post)</li>
</ul>
<blockquote><p>a directed edge <code>'+' -&gt; '*'</code> means “<code>*</code> binds tighter than <code>+</code>”</p></blockquote>
<ul>
<li>operator precedence is not transitive</li>
</ul>
<blockquote><p><code>'=' -&gt; '+'</code> and <code>'&amp;' -&gt; '='</code> does not mean “<code>+</code> binds tighter than <code>&amp;</code>”</p></blockquote>
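<p>To make the non-transitivity concrete, here is a small Scala sketch that stores the graph as a set of edges and answers “binds tighter” by looking at direct edges only. <code>PrecedenceGraph</code>, <code>bindsTighter</code> and <code>PrecedenceExamples</code> are names I made up for this illustration; the library’s actual representation may differ:</p>
<p><pre class="brush: scala;">
// Hypothetical precedence graph: an edge (a, b) means that b binds tighter than a.
case class PrecedenceGraph(edges: Set[(String, String)]) {
  // Only direct edges count: the relation is deliberately not transitive.
  def bindsTighter(tighter: String, than: String): Boolean =
    edges.contains((than, tighter))
}

object PrecedenceExamples {
  val graph = PrecedenceGraph(Set(&quot;=&quot; -&gt; &quot;+&quot;, &quot;&amp;&quot; -&gt; &quot;=&quot;, &quot;+&quot; -&gt; &quot;*&quot;))
}
</pre></p>
<p>With these edges, <code>graph.bindsTighter(&quot;+&quot;, &quot;=&quot;)</code> and <code>graph.bindsTighter(&quot;=&quot;, &quot;&amp;&quot;)</code> are both true, but <code>graph.bindsTighter(&quot;+&quot;, &quot;&amp;&quot;)</code> is false, even though there is a path from <code>&amp;</code> to <code>+</code>.</p>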
<ul>
<li>prefix operators are treated as right-associative</li>
</ul>
<blockquote><p><code>!!a = !(!(a))</code></p></blockquote>
<ul>
<li>postfix operators are treated as left-associative</li>
</ul>
<blockquote><p><code>n!! = ((n)!)!</code></p></blockquote>
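<p>The two associativity rules above correspond to a right fold and a left fold over the operators. A rough Scala sketch, using a hypothetical expression tree (<code>Atom</code>, <code>Prefix</code>, <code>Postfix</code>) that is not part of the library:</p>
<p><pre class="brush: scala;">
// Hypothetical expression tree, for illustration only.
sealed trait Expr
case class Atom(name: String) extends Expr
case class Prefix(op: String, arg: Expr) extends Expr
case class Postfix(arg: Expr, op: String) extends Expr

object Fixity {
  // Prefix operators nest to the right: the operator nearest the operand applies first.
  def applyPrefix(ops: List[String], operand: Expr): Expr =
    ops.foldRight(operand)((op, acc) =&gt; Prefix(op, acc))

  // Postfix operators nest to the left: again the nearest operator applies first.
  def applyPostfix(operand: Expr, ops: List[String]): Expr =
    ops.foldLeft(operand)((acc, op) =&gt; Postfix(acc, op))
}
</pre></p>
<p>For example, <code>Fixity.applyPrefix(List(&quot;!&quot;, &quot;!&quot;), Atom(&quot;a&quot;))</code> builds <code>Prefix(&quot;!&quot;, Prefix(&quot;!&quot;, Atom(&quot;a&quot;)))</code>, which is the tree for <code>!(!(a))</code>.</p>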
<ul>
<li>left-associative and right-associative operators of the same precedence can’t appear next to each other</li>
</ul>
<blockquote><p>assuming <code>+:</code> is a right-associative <code>+</code>, <code>a + b +: c</code> would not be allowed</p></blockquote>
<ul>
<li>parses are precedence correct</li>
<li>implementation using left-recursion is possible, for example when using Scala’s Packrat parsers</li>
</ul>
<p>There weren’t any restrictions I couldn’t live with (in fact, we could relax some of the above requirements and the scheme would still work for some grammars), so I decided to implement this grammar scheme for Slang, pretty much as described in the paper. Although I didn’t really grok all of the Agda code samples, the principles were easily understandable. I implemented it as a separate library (<a href="https://github.com/Villane/mixfix-parsers/tree/master/src/main/scala/mixfix">available on GitHub</a>) that builds on top of the existing Scala parser combinator library. It might even be somewhat usable in its current state, but it needs improvement.</p>
<p>In the next post we’ll look at how to define a grammar for an arithmetic and boolean algebra language using mixfix operators. In the third part, we will actually implement the parser for the grammar, look at performance issues and whether we can solve them with packrat parsers.</p>
<p><em>Thanks to <a href="https://twitter.com/#!/milessabin">Miles Sabin</a> and <a href="https://twitter.com/#!/djspiewak">Daniel Spiewak</a> for reviewing drafts for this series of posts.</em></p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2012/01/16/mixfix-operators-parser-combinators-part-1/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>Slang language goals for 2012</title>
		<link>https://villane.wordpress.com/2011/12/24/slang-language-goals-for-2012/</link>
		<comments>https://villane.wordpress.com/2011/12/24/slang-language-goals-for-2012/#comments</comments>
		<pubDate>Sat, 24 Dec 2011 14:04:37 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Language Design]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[klang]]></category>
		<category><![CDATA[language design]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=204</guid>
		<description><![CDATA[I&#8217;m setting myself some goals for the Slang language (I&#8217;m renaming Klang to Slang) for the coming year. By the end of 2012 I should have a usable and useful language programs must not crash, unless you purposefully make them crash compiler must always provide nice error messages (and not crash) must be able to [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=villane.wordpress.com&amp;blog=820948&amp;post=204&amp;subd=villane&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m setting myself some goals for the Slang language (I&#8217;m renaming Klang to Slang) for the coming year. By the end of 2012 I should have</p>
<ul>
<li>a usable and useful language</li>
<ul>
<li>programs must not crash, unless you purposefully make them crash</li>
<li>compiler must always provide nice error messages (and not crash)</li>
<li>must be able to do IO</li>
<li>must be able to do OpenGL</li>
<li>must have a small standard library</li>
<ul>
<li>IO, Unicode strings, basic collections</li>
<li>parser combinators (maybe)</li>
</ul>
</ul>
<li>programs that are <em>mostly</em> pure by default, except when they&#8217;re not (doing IO etc.)</li>
<li>allocation in pools or a garbage collected heap</li>
<li>performance mostly on par with C</li>
<li>open source compiler written in Scala</li>
<li>a comprehensive test suite</li>
<ul>
<li>all language features must have tests</li>
<li>all library functions must have tests</li>
<li>all compiler errors and warnings must have tests</li>
<li>some use case tests, maybe a <a href="http://projecteuler.net/">Project Euler</a> based test suite</li>
</ul>
<li>all the things mentioned in the <a href="http://villane.wordpress.com/2011/07/11/my-language-experiment/">post where I outlined the first language</a> (most of those exist already)</li>
</ul>
<p>There are a few additional things that would be nice to have, but are not my explicit goals for the year. The first is a self-hosting compiler. The current compiler spews out <code>.ll</code> files, which then get parsed and compiled by <a href="http://llvm.org/docs/CommandGuide/index.html">LLVM tools</a> such as <code>llc</code>, but it would be nice to work with LLVM directly. Another thing is separate compilation, which I&#8217;m probably not going to implement in the current compiler. Same goes for JIT. And lastly: support for debugging; syntax highlighters; maybe a bare-bones Eclipse IDE.</p>
<p>I should be able to pull off the things listed here, some sooner and some later. I still have a lot to learn about language design, type systems and compilers and am not sure at which point I&#8217;m going to open source it, but I&#8217;m aiming for June 2012 or so.</p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2011/12/24/slang-language-goals-for-2012/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>Klang: How To Solve Overloading?</title>
		<link>https://villane.wordpress.com/2011/12/10/klang-how-to-solve-overloading/</link>
		<comments>https://villane.wordpress.com/2011/12/10/klang-how-to-solve-overloading/#comments</comments>
		<pubDate>Sat, 10 Dec 2011 15:37:57 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Language Design]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[klang]]></category>
		<category><![CDATA[language design]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=191</guid>
		<description><![CDATA[I haven&#8217;t had much time to work on Klang since my vacation ended, but I&#8217;ve been thinking about how to solve overloading. Note that this post contains pseudo-code that bears resemblance to, but is not actual Klang code &#8212; these are just thoughts so far. Given types Double and Vector2(x: Double, y: Double), there are [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=villane.wordpress.com&amp;blog=820948&amp;post=191&amp;subd=villane&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t had much time to work on <a title="My Language Experiment" href="http://villane.wordpress.com/2011/07/11/my-language-experiment/">Klang</a> since my vacation ended, but I&#8217;ve been thinking about how to solve overloading. <em>Note that this post contains pseudo-code that bears resemblance to, but is not actual Klang code &#8212; these are just thoughts so far.</em></p>
<p>Given types <code>Double</code> and <code>Vector2(x: Double, y: Double)</code>, there are mathematical operations that take both types as operands in different combinations, but have the same name. For example, multiplication:</p>
<p><pre class="brush: plain;">
a: Double * b: Double = // intrinsic operation a * b
a: Double * v: Vector2 = Vector2(a * v.x, a * v.y)
v: Vector2 * a: Double = Vector2(a * v.x, a * v.y)
</pre></p>
<h3>Solution 1: Allow overloading of class methods</h3>
<p><pre class="brush: scala;">
package klang
class Double {
  def *(that: Double) = // intrinsic this * that
}

package vecmath
extending Double {
  def *(v: Vector2) = Vector2(this * v.x, this * v.y)
}
</pre></p>
<h3>Solution 1a: Allow overloading of package level functions</h3>
<p><pre class="brush: scala;">
package klang
def *(a: Double, b: Double) = // intrinsic a * b

package klang.vecmath
def *(a: Double, v: Vector2) = Vector2(a * v.x, a * v.y)
</pre></p>
<h3>Solution 2: Allow for Haskell-style type classes</h3>
<p><pre class="brush: scala;">
package klang
typeclass Multiplication[A, B, C] {
  def *(a: A, b: B): C
}
instance Multiplication[Double, Double, Double] {
  def *(a: Double, b: Double) = // intrinsic a * b
}

package klang.vecmath
instance Multiplication[Double, Vector2, Vector2] {
  def *(a: Double, v: Vector2) = Vector2(a * v.x, a * v.y)
}
</pre></p>
<h3>Pros and Cons</h3>
<p>The first solution of allowing method/function overloading as in Java or Scala will certainly complicate things, especially when I introduce some form of inheritance (currently there is none). However, I&#8217;m not sure the second method, which is more like the <a href="http://learnyouahaskell.com/types-and-typeclasses#typeclasses-101">Haskell way of handling overloading</a>, is any better. Especially the third type class argument (return type) will be confusing: am I then allowed to also provide an</p>
<p><pre class="brush: plain;">
instance Multiplication[Double, Vector2, String]
</pre></p>
<p>which differs only in the last argument? Obviously it shouldn&#8217;t be so: if selecting a type class instance based on the method return type were allowed, that wouldn&#8217;t work with Klang&#8217;s <a href="http://docs.scala-lang.org/tutorials/tour/local-type-inference.html">Scala-style type inference</a>. Maybe the type class could have an abstract type member instead:</p>
<p><pre class="brush: scala;">
typeclass Multiplication[A, B] {
  type Result
  def *(a: A, b: B): Result
}
instance Multiplication[Double, Vector2] {
  type Result = Vector2
  def *(a: Double, v: Vector2) = Vector2(a * v.x, a * v.y)
}
</pre></p>
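<p>For comparison, the abstract type member variant can already be approximated in Scala itself with implicits. This is only a sketch of the encoding, not Klang code: the method is called <code>times</code> instead of <code>*</code> to keep it apart from <code>Double</code>&#8217;s own operator, and <code>mul</code> is a helper name I made up:</p>
<p><pre class="brush: scala;">
// Type class with an abstract result type, encoded with Scala implicits.
trait Multiplication[A, B] {
  type Result
  def times(a: A, b: B): Result
}

case class Vector2(x: Double, y: Double)

object Multiplication {
  implicit val doubleDouble: Multiplication[Double, Double] { type Result = Double } =
    new Multiplication[Double, Double] {
      type Result = Double
      def times(a: Double, b: Double): Double = a * b
    }

  implicit val doubleVector: Multiplication[Double, Vector2] { type Result = Vector2 } =
    new Multiplication[Double, Vector2] {
      type Result = Vector2
      def times(a: Double, v: Vector2): Vector2 = Vector2(a * v.x, a * v.y)
    }

  // The result type depends on which instance was selected (a dependent method type).
  def mul[A, B](a: A, b: B)(implicit m: Multiplication[A, B]): m.Result = m.times(a, b)
}
</pre></p>
<p>Here <code>Multiplication.mul(2.0, Vector2(1.0, 3.0))</code> is typed as <code>Vector2</code> and <code>Multiplication.mul(2.0, 3.0)</code> as <code>Double</code>. The instance is chosen from the argument types alone, so the return type never takes part in the selection.</p>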
<p>But I haven&#8217;t really decided whether to even have type classes or type members. There seems to be a lot more ceremony required with type classes compared to simply allowing overloading methods or module-level functions. And I&#8217;m not sure if that provides any real advantage&#8230;</p>
<p>I could say that defining a function <code>*(a: Double, b: Vector2)</code> creates an implicit type class <code>*[A, B]</code> and an implicit instance of it <code>*[Double, Vector2]</code>. It would be equivalent to the manually defined type classes, wouldn&#8217;t it? Except in the manual case you could put division into the same type class with multiplication, but that would still require more ceremony than simple overloading.</p>
<p>And what of functions like <code>max(a: Double, b: Double)</code> and <code>max(a: Collection[Double])</code>? It would be nice to be able to have the 2-argument version for performance reasons. Type classes wouldn&#8217;t necessarily solve that unless one could abstract over method argument arity.</p>
<p>I should probably read more literature to make better informed decisions about these things, just haven&#8217;t gotten around to those parts yet :)</p>
<p>What do you think? Do you have an opinion about which way to go or any pointers to reading material?</p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2011/12/10/klang-how-to-solve-overloading/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>Klang: Classes and Math Notation</title>
		<link>https://villane.wordpress.com/2011/07/18/klang-classes-and-math-notation/</link>
		<comments>https://villane.wordpress.com/2011/07/18/klang-classes-and-math-notation/#comments</comments>
		<pubDate>Sun, 17 Jul 2011 23:39:31 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Language Design]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[klang]]></category>
		<category><![CDATA[language design]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=175</guid>
		<description><![CDATA[Just got started with adding classes to Klang. At the moment, only immutable classes can be created and there is no inheritance. The fields and methods are more separated than usual &#8212; the data is defined in the beginning of the class body as a single tuple type. On the other hand, there is a [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=villane.wordpress.com&amp;blog=820948&amp;post=175&amp;subd=villane&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Just got started with adding classes to Klang. At the moment, only immutable classes can be created and there is no inheritance.</p>
<p>The fields and methods are more separated than usual &#8212; the data is defined in the beginning of the class body as a single tuple type. On the other hand, there is a uniform access principle and both field accesses and method calls are represented the same way internally and in syntax. Actually the field access is implemented similarly to an intrinsic method call. Intrinsics in Klang usually translate to a single <a href="http://llvm.org/docs/LangRef.html#instref">LLVM instruction</a> and I&#8217;m going to expose some of them to users as well.</p>
<p>Here&#8217;s a sample program with a class:</p>
<p><pre class="brush: scala;">
class Vector2 {
  data (x: Double, y: Double)

  // other methods skipped

  def +(v: Vector2) = Vector2(x + v.x, y + v.y)
  def length() = √(x² + y²)

  alias |…| = length
}

def computeVectorLength(): Double = {
  val v1 := Vector2(1, 1)
  val v2 := Vector2(2, 3)
  | v1 + v2 |
}

def main() = computeVectorLength().toInt
</pre></p>
<p><code>Vector2</code> is a class whose data is represented as <code>(x: Double, y: Double)</code>. Reusing tuples again :) The compiler also generates a no-op function named Vector2 that takes the same tuple type and returns an instance of the class. It&#8217;s a no-op because the instances of classes are still held in LLVM registers (LLVM has an infinite number of registers) simply as the data they represent.</p>
<h3>Mathematical syntax</h3>
<p>There is also some nice syntax resembling mathematical notation in the example. I want to allow as much mathematical syntax as can be represented in plain Unicode text, but also provide aliases for those who prefer plain English words. I took some ideas from Fortress, but I don&#8217;t want to go into that double-syntax stuff where the written syntax is weird, but can be rendered into readable mathematical notation.</p>
<p>I defined some specific characters as left/right braces that when surrounding an expression get turned into a specially named method call on that expression:<br />
<code>| v1 + v2 |</code> is turned into <code>(v1 + v2).|…|</code> (the ellipsis is a single character, not three dots). </p>
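<p>For readers who know Scala better than Klang, here is roughly what the sample program looks like after that rewrite. This is a Scala analog, not Klang: Scala has no <code>|…|</code> brace syntax, so the method call is written out by hand, and <code>VectorDemo</code> is just a wrapper name for the example:</p>
<p><pre class="brush: scala;">
// Scala analog of the Klang class above.
case class Vector2(x: Double, y: Double) {
  def +(v: Vector2): Vector2 = Vector2(x + v.x, y + v.y)
  def length: Double = math.sqrt(x * x + y * y)
}

object VectorDemo {
  // | v1 + v2 | in Klang corresponds to (v1 + v2).length here.
  def computeVectorLength(): Double = {
    val v1 = Vector2(1, 1)
    val v2 = Vector2(2, 3)
    (v1 + v2).length
  }
}
</pre></p>
<p>The sum of the two vectors is <code>(3, 4)</code>, so <code>VectorDemo.computeVectorLength()</code> evaluates to <code>5.0</code>.</p>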
<p>Unfortunately, I think using <code>|</code> as a left/right brace makes it hard to use <code>|</code> to mean <code>or</code>, or to allow it as a regular identifier. Maybe it&#8217;s possible, but I&#8217;m not yet good enough with grammars to know for sure. Anyway, I think actual words <code>and</code> and <code>or</code> might be better than <code>&amp;</code> and <code>|</code>. Certainly <code>^</code> as <code>xor</code> doesn&#8217;t make much sense. Of course, since I want to enable mathematical syntax, <code>∧</code>, <code>∨</code> and <code>⊻</code> are aliases for <code>and</code>, <code>or</code> and <code>xor</code> as well.</p>
<p>The above code example doesn&#8217;t actually parse yet &#8212; until I improve the parser and the lexer, there is some additional spacing and parentheses needed. For example <code>√((x ²) + y ²)</code> would compile.</p>
<p>If you know any programming languages which are good at mathematical notation in plain text, let me know in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2011/07/18/klang-classes-and-math-notation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>Klang: Program Structure</title>
		<link>https://villane.wordpress.com/2011/07/13/klang-program-structure/</link>
		<comments>https://villane.wordpress.com/2011/07/13/klang-program-structure/#comments</comments>
		<pubDate>Wed, 13 Jul 2011 21:30:29 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Language Design]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[klang]]></category>
		<category><![CDATA[language design]]></category>
		<category><![CDATA[Scala]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=158</guid>
		<description><![CDATA[I thought I&#8217;d keep my posts about Klang, the language I&#8217;m designing, shorter (at least I&#8217;ll try). So in this post I&#8217;ll describe the overall program structure and just a few constructs, focusing more on current status rather than goals. Version 1 of the Klang compiler is going to be pretty basic. I&#8217;m not even [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=villane.wordpress.com&amp;blog=820948&amp;post=158&amp;subd=villane&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I thought I&#8217;d keep my posts about Klang, <a href="http://villane.wordpress.com/2011/07/11/my-language-experiment/">the language I&#8217;m designing</a>, shorter (at least I&#8217;ll try). So in this post I&#8217;ll describe the overall program structure and just a few constructs, focusing more on current status rather than goals.</p>
<p>Version 1 of the Klang compiler is going to be pretty basic. I&#8217;m not even sure if it&#8217;ll have separate compilation (well, it kind of does have that via externs). As I started moving some stuff from the compiler into a library, I just made it prepend the library source code to the source being compiled. So at least right now, the compiler takes just one source file (or string) and turns it into a single <a href="http://llvm.org/docs/LangRef.html">LLVM assembly</a> module (.ll file), which can in turn be compiled to machine code using the LLVM toolchain.</p>
<h3>Functions</h3>
<p>A Klang module is currently made of function definitions and function declarations (externs). Classes will come later. A function named <code>main</code> is the entry point, as usual. A Hello World program in Klang might look like this:</p>
<p><pre class="brush: scala;">
// puts is defined in the C library
extern puts(string: Byte *): Int
def main() = puts(&quot;Hello, World!&quot;)
</pre></p>
<p><code>Byte *</code> is a pointer to a Byte, but I added pointer types only as a quick hack to interface with native libraries. I might keep them for that reason, but limit their use to extern declarations &#8212; extern declares a function defined in another module.<br />
You probably noticed that functions are defined similarly to Scala.</p>
<p><code>'def', name, argument list, '=', body expression</code></p>
<h3>Expressions</h3>
<p>In Klang, like in Scala, most things are expressions i.e. evaluate to a value. Here&#8217;s what the mandatory naive Fibonacci example looks like:</p>
<p><pre class="brush: scala;">
def fib(n: Int): Int =
  if n == 0 then
    0
  else if n == 1 then
    1
  else
    fib(n - 2) + fib(n - 1)
</pre></p>
<p>Specifying the return type is necessary here because the function is recursive, and the type inferencer can&#8217;t handle that. <code>if</code> is an expression, so both the <code>then</code> and <code>else</code> branches must be of the same type. An <code>if</code> without an <code>else</code> would evaluate to <code>()</code>, the empty tuple value, also known as <code>Unit</code>. It is similar to <code>void</code>, except void is the lack of a value.</p>
<p><pre class="brush: scala;">
def addOneThenSquare(n: Int) = {
  val nPlus1 := n + 1;
  nPlus1 * nPlus1
}
</pre></p>
<p>A block is a list of statements surrounded by <code>{</code> and <code>}</code>. Statements are usually separated by semicolons because the current parser is a quick hack implemented with <a href="http://www.codecommit.com/blog/scala/the-magic-behind-parser-combinators">parser combinators</a> and can&#8217;t do the semicolon inference well. The statements can be definitions as well as expressions, but the last statement is what is returned and needs to be an expression.</p>
<p><code>val</code> starts a local value definition. Local values cannot be reassigned after they are defined. I want to avoid mutable local variables for as long as I can, though in the end I will probably have to add them. <code>:=</code> is perhaps an unusual choice for assignment and I might revert to using <code>=</code> for that instead.</p>
<h3>Loops</h3>
<p>I haven&#8217;t even implemented Arrays yet, but loops are still useful, especially as there are no tail calls yet and the Fibonacci function above would probably exit with a segmentation fault for a large value when the stack grows too big. But how do you do loops without mutable variables? I&#8217;m talking about primitive loops that must be the basis for other constructs such as comprehensions. I thought about it for a couple of days and came up with something like this (there are probably similar constructs in other languages):</p>
<p><pre class="brush: scala;">
def helloHello(n: Int) =
  // loop takes a tuple of parameters similar to a function argument list
  // but it must always specify default (starting) values
  loop(i := 0)
    // and also an expression that ends with continue(...) or break()
    if (i &lt; n) then {
      puts(&quot;Hello?&quot;);
      // continue &quot;calls&quot; the loop again with the new values
      continue(i + 1)
    } else {
      // break returns from the loop
      break()
    }
  // yield can see the *last* values of the loop parameters
  yield i

def main() = helloHello(3)
</pre></p>
<p>This program will print &#8220;Hello?&#8221; three times and return 3. This kind of loop seems very verbose, though, and what it actually does is similar to tail recursion, implemented using LLVM&#8217;s <a href="http://en.wikipedia.org/wiki/COMEFROM">Phi nodes</a>. I might rethink what kind of loops to have in the future, but for now I&#8217;ll stick with this. It can be simplified a bit:</p>
<p><pre class="brush: scala;">
def helloHello(n: Int) =
  loop(i := 0) while (i &lt; n) {
    puts(&quot;Hello?&quot;);
    i + 1
  } yield i
</pre></p>
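<p>For comparison, here is a rough Scala sketch of the same loop written as a local tail-recursive function. This is only an analogy, not how the Klang compiler is implemented; the object name <code>LoopAsTailRecursion</code> and the inner function name <code>loop</code> are made up for illustration.</p>
<p><pre class="brush: scala;">
import scala.annotation.tailrec

object LoopAsTailRecursion {
  // Hypothetical Scala counterpart of the Klang helloHello loop.
  // The loop parameter i becomes the parameter of a local function.
  def helloHello(n: Int): Int = {
    @tailrec
    def loop(i: Int): Int =
      if (n > i) {
        println("Hello?")
        loop(i + 1) // plays the role of continue(i + 1)
      } else {
        i // plays the role of break() followed by yield i
      }
    loop(0) // corresponds to the starting value in loop(i := 0)
  }
}
</pre></p>
<p>The <code>@tailrec</code> annotation makes the Scala compiler reject the function unless the recursive call can be turned into a jump, so the stack does not grow with <code>n</code>.</p>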
<p>There are some potential programming errors that can easily arise from this kind of loop, but I&#8217;ll talk about that later (or maybe you can guess).</p>
<p><code>while</code> can only come directly after <code>loop(...)</code> and is defined in terms of <code>if</code>, <code>break</code> and <code>continue</code>:</p>
<p><code>while (condition) expression = if (condition) continue(expression) else break()</code></p>
<p>This translation is done immediately after parsing, to simplify the compiler phases.</p>
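<p>A minimal sketch of what such a rewrite could look like in Scala, assuming a heavily simplified AST. All the node names here (<code>Ref</code>, <code>If</code>, <code>Continue</code>, <code>Break</code>, <code>While</code>, <code>Loop</code>) are hypothetical and not taken from the actual compiler:</p>
<p><pre class="brush: scala;">
object WhileDesugar {
  // Hypothetical, heavily simplified AST; the node names are assumptions.
  sealed trait Expr
  case class Ref(name: String) extends Expr
  case class If(cond: Expr, thenBranch: Expr, elseBranch: Expr) extends Expr
  case class Continue(next: Expr) extends Expr
  case object Break extends Expr
  case class While(cond: Expr, body: Expr) extends Expr
  case class Loop(param: String, start: Expr, body: Expr, result: Expr) extends Expr

  // Rewrites every While node into the if/continue/break form, bottom-up,
  // so that later phases never have to know about While.
  def desugar(e: Expr): Expr = e match {
    case While(c, b)      => If(desugar(c), Continue(desugar(b)), Break)
    case If(c, t, f)      => If(desugar(c), desugar(t), desugar(f))
    case Continue(n)      => Continue(desugar(n))
    case Loop(p, s, b, r) => Loop(p, desugar(s), desugar(b), desugar(r))
    case other            => other // Ref and Break have nothing to rewrite
  }
}
</pre></p>
<p>Running a pass like this right after parsing means the type checker and code generator only need to handle <code>If</code>, <code>Continue</code> and <code>Break</code>.</p>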
<p>With these constructs, and passing functions around, one could already write a program that does something a little bit useful. I think my next focus is on adding Strings and Arrays into the mix, but my next post will probably go more into functions and tuples.</p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2011/07/13/klang-program-structure/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>My Language Experiment</title>
		<link>https://villane.wordpress.com/2011/07/11/my-language-experiment/</link>
		<comments>https://villane.wordpress.com/2011/07/11/my-language-experiment/#comments</comments>
		<pubDate>Mon, 11 Jul 2011 14:52:42 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Language Design]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[klang]]></category>
		<category><![CDATA[language design]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=142</guid>
		<description><![CDATA[Last time I posted about how it came to be that I started designing a new programming language and writing a compiler for it. I think I should continue by explaining a bit about how I started the project, how I&#8217;m implementing the compiler, and describe the project&#8217;s goals and the language I want to [...]]]></description>
			<content:encoded><![CDATA[<p>Last time I <a href="http://villane.wordpress.com/2011/07/05/holy-bitcode-batman-youre-writing-a-compiler/">posted</a> about how it came to be that I started designing a new programming language and writing a compiler for it. I think I should continue by explaining a bit about how I started the project, how I&#8217;m implementing the compiler, and describe the project&#8217;s goals and the language I want to create.</p>
<p>About the same time I started having ideas for a higher-level object-functional language allowing a data-oriented programming style and a more close-to-machine runtime than the JVM, <a href="http://twitter.com/#!/gereedy">Geoff Reedy</a>&#8217;s <a href="http://greedy.github.com/scala/">Scala on LLVM</a> appeared. I think that&#8217;s a really interesting project and I hope it turns into something awesome. But I think that is not quite what I was looking for, because it still needs the Java libraries and must carry some of the baggage that comes from the Scala/Java interoperability (such as <em>null</em>).</p>
<p>But <a href="http://llvm.org/">LLVM</a> caught my attention as I had been hearing more and more about it. I went through the <a href="http://llvm.org/docs/tutorial/index.html">Kaleidoscope tutorial</a> and shortly had my mind set that LLVM would be the library/toolset I would use as the back-end of my compiler to output machine code. By the way, I suggest that anyone who is new to implementing compilers like me should try that tutorial &#8212; it takes about a day to get through and in the end you have a simple toy language with a basic REPL and JIT.</p>
<p>I tried to continue adding stuff to Kaleidoscope, but realized yet again that I really don&#8217;t like C++ (which is what the tutorial used). I found no Java bindings for LLVM, so I thought to learn a language that has bindings: <a href="http://www.haskell.org/haskellwiki/Haskell">Haskell</a>. I <a href="http://learnyouahaskell.com/">Learned me a Haskell</a>, or part of it. It is another awesome language, but the syntax and purity scare me a bit and I just couldn&#8217;t imagine being as comfortable writing Haskell as I am when writing Scala. So, back to Scala. I initially thought to output LLVM Bitcode from Scala, but after reading some docs, it seemed easier to generate LLVM assembly text files (.ll), similarly to the Scala LLVM project (I even reused some bits of code from there). And indeed, it was much easier getting something running that way, and that&#8217;s how I&#8217;ll implement the first compiler.</p>
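<p>To give an idea of what generating LLVM assembly as text means, here is a minimal Scala sketch that renders one tiny function as <code>.ll</code> source using plain string interpolation. The object name <code>LlvmTextSketch</code> and the helper <code>binaryFunction</code> are hypothetical; a real code generator would walk an AST instead of filling in a fixed template.</p>
<p><pre class="brush: scala;">
object LlvmTextSketch {
  // Hypothetical helper: renders a two-argument i32 function whose body is
  // a single binary instruction such as "add" or "mul".
  def binaryFunction(name: String, op: String): String =
    s"""define i32 @$name(i32 %a, i32 %b) {
       |entry:
       |  %result = $op i32 %a, %b
       |  ret i32 %result
       |}
       |""".stripMargin

  // Prints the LLVM assembly text for a function that adds its arguments.
  def main(args: Array[String]): Unit =
    print(binaryFunction("add", "add"))
}
</pre></p>
<p>The printed text can be saved to a <code>.ll</code> file and handed to the LLVM tools (for example <code>llc</code>) to produce machine code, so the compiler itself needs no native bindings.</p>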
<p>But let&#8217;s talk about the language and the project goals as well. Actually, I want to split this project into separate phases.</p>
<ul>
<li>The first phase is to create a relatively simple language that has a nice Scala-like syntax.</li>
<li>The second phase is trying out various designs and implementations based on that small base language.</li>
<li>The third phase, if I get that far, is unknown at the moment: taking what I&#8217;ve learned in the previous phases and applying it to actually implement the language I want.</li>
</ul>
<p>At some point, I will try to bootstrap a compiler in the language itself. The project might end up creating yet another esoteric language, but I hope it will turn out to have some usefulness.</p>
<p>The separation into phases is necessary because I still have a lot to learn and don&#8217;t want to rush into creating a full-blown general programming language. And also because I don&#8217;t want to think too far ahead at the moment. I started reading some books on programming languages and compilers: <a href="http://www.cs.rochester.edu/~scott/pragmatics/">Programming Language Pragmatics</a> for the introduction, next will be the <a href="http://dragonbook.stanford.edu/">Dragon Book</a> and <a href="http://www.cis.upenn.edu/~bcpierce/tapl/">Types and Programming Languages</a> (thanks to <a href="http://twitter.com/#!/djspiewak">Daniel Spiewak</a> for the suggestions). I have a lot of theory to go through and progress may be slow.</p>
<p>The language from the first phase is codenamed</p>
<h3>Klang</h3>
<p>The name comes from Kaleidoscope + language (no relation to <a href="http://twitter.com/#%21/viktorklang">Viktor Klang</a> :)), because Kaleidoscope was what I started with, although I altered the syntax to be more like Scala. What features will make it into Klang is not yet set in stone, but I&#8217;ll list some of them and describe the language in more detail in the next post, because this one is already pretty long. Anyway, some of the intended features:</p>
<ul>
<li>A Scala-like (but not always identical) syntax</li>
<li>Type safety</li>
<ul>
<li>Type inference (somewhat similar to Scala&#8217;s)</li>
</ul>
<li>Functions</li>
<ul>
<li>Named and default arguments</li>
<li>Passing functions as arguments to other functions</li>
<li>Implicit conversions (but they can&#8217;t be used for pimping)</li>
<li>Anonymous functions</li>
<li>Extern function declarations (for linking to native C libraries)</li>
</ul>
<li>Primitive types: Byte, Short, Int, Long, Float, Double, Boolean</li>
<ul>
<li>Considering having Byte be unsigned, but not sure</li>
</ul>
<li>Tuple types with optionally named elements</li>
<ul>
<li>function argument lists are tuples</li>
<ul>
<li>can call a function with any expression that returns a tuple (if the argument list matches)</li>
</ul>
<li>multiple named return values from functions</li>
</ul>
<li>Arrays and UTF-8 Strings</li>
<li>Blocks and local values (no mutable local variables)</li>
<li>Control structures</li>
<ul>
<li>If expressions</li>
<li>Some kind of for-loop like structure, specifics not decided yet</li>
<li>&amp;&amp; and ||</li>
</ul>
<li>Packages or namespaces</li>
<li>A tiny standard library</li>
<li>&#8230; maybe a few more things</li>
</ul>
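<p>The idea of treating function argument lists as tuples has a rough counterpart in Scala, where a function value can be converted so that it takes a single tuple. The sketch below is plain Scala, not Klang, and the names <code>area</code> and <code>dimensions</code> are made up for the example:</p>
<p><pre class="brush: scala;">
object TupledCallSketch {
  def area(width: Int, height: Int): Int = width * height

  // Some expression that returns a tuple matching the argument list of area.
  def dimensions(): (Int, Int) = (3, 4)

  // In Scala the conversion is explicit: take area as a function value,
  // then use tupled to get a function of one (Int, Int) argument.
  val areaFn: (Int, Int) => Int = area
  val areaOfTuple: ((Int, Int)) => Int = areaFn.tupled

  def main(args: Array[String]): Unit =
    println(areaOfTuple(dimensions())) // prints 12
}
</pre></p>
<p>In Klang the intent is that no such conversion step would be needed: any expression returning a matching tuple could be passed directly.</p>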
<p>A feature that will probably not make it is polymorphism. There will likely not be an Any or Object type that all other objects inherit from. And you will not be able to ask for the type of a value at runtime.</p>
<p>At least the above is roughly what I think will be in the first version of Klang. Once that version is done, I might make it available on Github under some liberal license. More in the next post.</p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2011/07/11/my-language-experiment/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>Holy Bitcode Batman, You&#8217;re Writing a Compiler!</title>
		<link>https://villane.wordpress.com/2011/07/05/holy-bitcode-batman-youre-writing-a-compiler/</link>
		<comments>https://villane.wordpress.com/2011/07/05/holy-bitcode-batman-youre-writing-a-compiler/#comments</comments>
		<pubDate>Mon, 04 Jul 2011 22:22:49 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Game programming]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Language Design]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Programming languages]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[klang]]></category>
		<category><![CDATA[language design]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=131</guid>
		<description><![CDATA[Recently I&#8217;ve taken an interest in programming language design and have started working on a compiler for a new language. The reasons for doing that are perhaps less practical than I&#8217;d like, because I&#8217;m a practical man, or sometimes pretend to be. At least I&#8217;m usually more interested in the application of mathematics than the [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I&#8217;ve taken an interest in programming language design and have started working on a compiler for a new language. The reasons for doing that are perhaps less practical than I&#8217;d like, because I&#8217;m a practical man, or sometimes pretend to be. At least I&#8217;m usually more interested in the application of mathematics than the beauty of mathematics itself. The same used to go for programming languages.</p>
<p>As the few readers of this blog may know, I&#8217;ve been experimenting with game programming now and then, but haven&#8217;t been driven enough to actually complete and publish a game. In early years I tried Pascal and C++. I did complete some stupid little games in Pascal, but C++ was more a hindrance to me than an enabler. At some point I learned <a href="http://en.wikipedia.org/wiki/UnrealScript">UnrealScript</a> while developing a <a href="http://en.wikipedia.org/wiki/Deus_Ex">Deus Ex</a> mod and kind of liked it. Actually I think that was part of what led me to Java and getting jobs in Enterprise Java programming. I never really thought to write games in Java, though.</p>
<p>When I discovered <a href="http://www.scala-lang.org/">Scala</a> I fell in love and thought it was the be-all and the end-all of programming languages and very applicable to games. Tried to write a couple of games in it, wrote some libraries for my own use and even ported a <a href="http://box2d.org/">physics engine</a> from Java and C++ to Scala. I loved doing it, loved the language features, the type system, the syntax, the preference of immutability over mutability. Perhaps most of all I loved Scala&#8217;s elegant mixture of object-oriented and functional programming.</p>
<p>Eventually something about making games in it started nagging me, though. The performance was good enough for the <a href="http://www.youtube.com/watch?v=hyIyZVXg-Mo&amp;feature=related">type of game</a> I was making, but the game was also using a lot of resources for the type of game it was. It started to feel wasteful. The idea that games are a big part of what is pushing hardware forward and have to take the most out of it was somehow stuck in my head.</p>
<p>Scala runs on the JVM, which has the nice abstraction of a big heap, and the memory management is done for you. My boss <a href="http://twitter.com/#!/ekabanov">Jevgeni</a> has given a <a href="http://www.con-fess.com/web/guest/sessions?p_p_id=trackOverview_WAR_portlets101_INSTANCE_tMn0&amp;p_p_lifecycle=0&amp;p_p_state=normal&amp;p_p_mode=view&amp;p_p_col_id=column-1&amp;p_p_col_count=2&amp;_trackOverview_WAR_portlets101_INSTANCE_tMn0_at.irian.confess2011.web.sessionToLoad=146">really awesome talk</a> on that topic titled &#8220;Do you really get memory?&#8221;. But whatever tricks the JVM does to make that abstraction work well enough for most applications, it does produce some amount of waste &#8212; extra CPU cycles and garbage objects which need to be collected to free the memory for new objects. And that is a big part of what the JVM engineers are continuously improving on. They are probably the most efficient at collecting (virtual) garbage in the world! But there are cases where that kind of heap abstraction doesn&#8217;t seem to hold well, and high-performance games are one of those cases.</p>
<p>My game engine used lots of immutable, short-lived objects. Things soon to become garbage, in other words. Garbage, dying a slow death in the heap. Every small <code>Vector2(x, y)</code> tracked by the collector, maybe living in a separate heap region from its closest friends. And looking up bits from here and there in the heap is really expensive from the CPU&#8217;s perspective. Even when the Sun Java VM started optimizing away heap allocation of very short-lived objects (enabled by escape analysis), that only gave me a small performance boost. The situation has improved, but back then I decided to try and avoid so much garbage being produced. I optimized some functions manually, doing scalar replacement of Vector and Matrix objects. That made the code look really ugly and unreadable because it hid the mathematical formulas.</p>
<p>I couldn&#8217;t stand it. Neither could I stand all these cycles wasted on GC. So I wrote an <a href="https://github.com/Villane/vecmath/tree/master/vecmath-optimizer">optimizer</a> that plugged into the Scala compiler and did the scalar replacement automatically. It worked, and gave a more significant improvement than the JVM&#8217;s escape analysis optimizations at that time; garbage production was being reduced, I was going green! But it was hard for me to maintain the optimizer as I knew almost nothing about compilers and was just going by my nose. There were some corner cases that were hard to handle correctly. It only worked on code written against a very specific library, optimizing away well-known constructor and method calls.</p>
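<p>To make the idea of scalar replacement concrete, here is a small Scala sketch. It is not code from the vecmath library or the optimizer; the <code>Vector2</code> class and both functions are simplified stand-ins written for this example.</p>
<p><pre class="brush: scala;">
object ScalarReplacementSketch {
  // Immutable vector: every operation allocates a new object.
  final case class Vector2(x: Float, y: Float) {
    def +(o: Vector2): Vector2 = Vector2(x + o.x, y + o.y)
    def *(s: Float): Vector2 = Vector2(x * s, y * s)
    def dot(o: Vector2): Float = x * o.x + y * o.y
  }

  // Readable version: the formula is visible, but it creates two temporaries.
  def readable(a: Vector2, b: Vector2, c: Vector2): Float =
    ((a + b) * 0.5f) dot c

  // Scalar-replaced version: same arithmetic on plain floats, no temporaries,
  // but the formula is much harder to recognize.
  def scalarReplaced(ax: Float, ay: Float, bx: Float, by: Float,
                     cx: Float, cy: Float): Float = {
    val mx = (ax + bx) * 0.5f
    val my = (ay + by) * 0.5f
    mx * cx + my * cy
  }
}
</pre></p>
<p>The point of doing this in a compiler plug-in is to keep writing code in the first style while getting something closer to the second style in the generated bytecode.</p>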
<p>Writing that optimizer got me somewhat interested in compilers, though. I remember saying to someone during a job interview a few years ago that I like complex problems, but am not interested in the really complex stuff like compilers. Sometimes things work out as the reverse of what you think.</p>
<p>Anyway, working on my game engine, I wanted to create a really powerful entity system. I wanted to use mix-ins and other nice Scala features. Reading some blog posts about game entity systems, I realized that most people seemed to be moving away from inheritance-based systems into component-based systems. Reading more about component-based systems made me run into the topic of <a href="http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf">data oriented design</a> (PDF), which is all about thinking about data first, and how the program processes it. A couple of presentations on that left an impression on me and made me realize just how expensive it actually is to make the CPU churn through megabytes of random memory.</p>
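<p>As a rough illustration of the difference in data layout, here is a Scala sketch that updates particle positions in two ways: once with one heap object per particle, and once with one primitive array per field. The names <code>Particle</code>, <code>Particles</code>, <code>stepObjects</code> and <code>stepArrays</code> are invented for this example and do not come from any of the linked presentations.</p>
<p><pre class="brush: scala;">
object DataLayoutSketch {
  // Object-per-entity layout: the array holds references to separate
  // heap objects, which may be scattered across memory.
  final class Particle(var x: Float, var y: Float, var vx: Float, var vy: Float)

  def stepObjects(ps: Array[Particle], dt: Float): Unit =
    ps.foreach { p =>
      p.x += p.vx * dt
      p.y += p.vy * dt
    }

  // Data-oriented layout: one primitive array per field, so the values
  // of a field sit next to each other in memory.
  final class Particles(val x: Array[Float], val y: Array[Float],
                        val vx: Array[Float], val vy: Array[Float])

  def stepArrays(ps: Particles, dt: Float): Unit =
    ps.x.indices.foreach { i =>
      ps.x(i) += ps.vx(i) * dt
      ps.y(i) += ps.vy(i) * dt
    }
}
</pre></p>
<p>Both functions compute the same result; the second one is arranged around how the data is processed rather than around individual objects, which is the style that data-oriented design argues for.</p>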
<p>But I didn&#8217;t want to switch to C++ or C to be able to take advantage of the kinds of optimizations that data-oriented programming can give. I had the idea that maybe there should be a language that was object-functional like Scala, but compiled down to very CPU- and cache-friendly data structures and functions, the kind of structures one would use when doing data-oriented programming manually. And I have huge respect for people who do the latter. But I noticed that some of them seemed to be wanting a better language than C++ as well.</p>
<p>So, a language as expressive and type safe as Scala, similarly object-functional, but with more efficient memory access, CPU cache and parallelization friendly, enabling a data-oriented programming style with less hassle than C++. Perhaps one that could even run some functions on the GPU. What could that look like? I started thinking that maybe I should find out first-hand. <strong>[Update</strong> to clarify my goals<strong>]</strong> Well, at least I want to find out what it would be like to try to get there. I&#8217;m sure combining the type safety and power of Scala with the raw performance of C is way too ambitious for me. So I&#8217;m setting the goal way lower, but more about that in the next post.<strong>[/Update]</strong></p>
<p>I&#8217;ve never really been a language geek. I&#8217;ve programmed in several different languages and at the moment am good friends with only two: Java and Scala. There are scores of other programming languages out there and I&#8217;m sure there is some language that is at least 2/3 of what I&#8217;m looking for.</p>
<p>But I think some bug bit me, because I couldn&#8217;t let go of the idea of creating one myself. And this blog post was a lengthy, boring preface to a series of hopefully less boring posts documenting my experiment. I have no illusions &#8212; I don&#8217;t think I&#8217;ll create the &#8220;next big language&#8221; or anything close &#8212; there are people much smarter than me who have been working on programming languages for years and decades. But this will be a fun learning experiment and maybe something useful will come out of it. Next time I will talk about the kind of language(s) I want to create. The (s) is because I want to create a really simple language first, to learn more about compilers during the process, and later expand from that base.</p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2011/07/05/holy-bitcode-batman-youre-writing-a-compiler/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>
	</item>
		<item>
		<title>JRebel 3.6 for Eclipse Released</title>
		<link>https://villane.wordpress.com/2011/02/01/jrebel-3-6-for-eclipse-released/</link>
		<comments>https://villane.wordpress.com/2011/02/01/jrebel-3-6-for-eclipse-released/#comments</comments>
		<pubDate>Tue, 01 Feb 2011 15:48:44 +0000</pubDate>
		<dc:creator>Erkki Lindpere</dc:creator>
				<category><![CDATA[Eclipse]]></category>
		<category><![CDATA[JRebel]]></category>

		<guid isPermaLink="false">http://villane.wordpress.com/?p=121</guid>
		<description><![CDATA[Today we (ZeroTurnaround) released version 3.6 of JRebel, our productivity tool for Java developers that eliminates many redeploys and restarts. Along with the core JRebel release, we made a major update to JRebel for Eclipse. With this release, we hopefully made getting started with and using JRebel for Eclipse super-easy, although I&#8217;m sure there are [...]]]></description>
			<content:encoded><![CDATA[<p>Today we (<a href="http://www.zeroturnaround.com/">ZeroTurnaround</a>) released version 3.6 of <a href="http://www.zeroturnaround.com/jrebel/">JRebel</a>, our productivity tool for Java developers that eliminates many redeploys and restarts. Along with the core JRebel release, we made a major update to <a href="http://marketplace.eclipse.org/content/jrebel-eclipse">JRebel for Eclipse</a>. With this release, we hopefully made getting started with and using JRebel for Eclipse super-easy, although I&#8217;m sure there are still things we can improve in the future.</p>
<p>There were quite a few things we did to make setting up JRebel for Eclipse easier. First, the plug-in is now available from the Eclipse Marketplace, making finding the plug-in and the installation process easier. Second, there is now a new plug-in that embeds JRebel itself &#8212; meaning that you will not have to install it separately, the Marketplace install contains everything you need. Third, with this release we also started signing our Eclipse plug-ins, eliminating one more step from the install process.</p>
<p>We also made other improvements to our Eclipse integration, including</p>
<ul>
<li>improvements to the debugger integration &#8212; stepping should now perform exactly as you expect</li>
<li>support for more launch configurations (WTP server editor sections and JRebel tabs for launch configurations), including OSGi and Virgo launch configurations</li>
<li>small UI improvements, making it easier to find logs, change JRebel&#8217;s settings, see licensing information and redeploy statistics</li>
<li>numerous other small improvements</li>
</ul>
<p>If you are not already using JRebel, we offer evaluation licenses to JRebel for Eclipse users and <a href="http://sales.zeroturnaround.com/">free licenses to open source and Scala developers</a>. But be warned: once you give it a try, you&#8217;ll never look back!</p>
<p style="text-align:center;"><a href="http://villane.files.wordpress.com/2011/02/install-marketplace.png"><img class="size-full wp-image-124 aligncenter" title="Install JRebel via Marketplace" src="http://villane.files.wordpress.com/2011/02/install-marketplace.png?w=450&#038;h=555" alt="" width="450" height="555" /></a></p>
]]></content:encoded>
			<wfw:commentRss>https://villane.wordpress.com/2011/02/01/jrebel-3-6-for-eclipse-released/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/983a1933160b289c4debbf3cd7820563?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">villane</media:title>
		</media:content>

		<media:content url="http://villane.files.wordpress.com/2011/02/install-marketplace.png" medium="image">
			<media:title type="html">Install JRebel via Marketplace</media:title>
		</media:content>
	</item>
	</channel>
</rss>
