Holy Bitcode Batman, You’re Writing a Compiler!

Recently I’ve taken an interest in programming language design and have started working on a compiler for a new language. The reasons for doing that are perhaps less practical than I’d like, because I’m a practical man, or sometimes pretend to be. At least I’m usually more interested in the application of mathematics than the beauty of mathematics itself. The same used to go for programming languages.

As the few readers of this blog may know, I’ve been experimenting with game programming now and then, but haven’t been driven enough to actually complete and publish a game. In early years I tried Pascal and C++. I did complete some stupid little games in Pascal, but C++ was more a hindrance to me than an enabler. At some point I learned UnrealScript while developing a Deus Ex mod and kind of liked it. Actually I think that was part of what lead me to Java and getting jobs in Enterprise Java programming. I never really thought to write games in Java, though.

When I discovered Scala I fell in love and thought it was the be-all and the end-all of programming languages and very applicable to games. Tried to write a couple of games in it, wrote some libraries for my own use and even ported a physics engine from Java and C++ to Scala. I loved doing it, loved the language features, the type system, the syntax, the preference of immutability over mutability. Perhaps most of all I loved Scala’s elegant mixture of object-oriented and functional programming.

Eventually something about making games in it started nagging me, though. The performance was good enough for the type of game I was making, but the game was also using a lot of resources for the type of game it was. It started to feel wasteful. The idea that games are a big part of what is pushing hardware forward and have to take the most out of it was somehow stuck in my head.

Scala runs on the JVM, which has the nice abstraction of a big heap, and the memory management is done for you. My boss Jevgeni has given a really awesome talk on that topic titled “Do you really get memory?”. But whatever tricks the JVM does to make that abstraction work well enough for most applications, it does produce some amount of waste — extra CPU cycles and garbage objects which need to be collected to free the memory for new objects. And that is a big part of what the JVM engineers are continuously improving on. They are probably the most efficient at collecting (virtual) garbage in the world! But there are cases where that kind of heap abstraction doesn’t seem to hold well, and high-performance games are one of those cases.

My game engine used lots of immutable, short-lived objects. Things soon to become garbage, in other words. Garbage, dying a slow death in the heap. Every small Vector2(x, y) tracked by the collector, maybe living in a separate heap region from its closest friends. And looking up bits from here and there in the heap is really expensive from the CPUs perspective. Even when the Sun Java VM started optimizing away heap allocation of very short-lived objects (enabled by escape analysis), that only gave me a small performance boost. The situation has improved, but back then I decided to try and avoid so much garbage being produced. I optimized some functions manually, doing scalar replacement of Vector and Matrix objects. That made the code look really ugly and unreadable because it hid the mathematical formulas.

I couldn’t stand it. Neither could I stand all these cycles wasted on GC. So I wrote an optimizer that plugged into the Scala compiler and did the scalar replacement automatically. It worked, and gave a more significant improvement than the JVM’s escape analysis optimizations at that time; garbage production was being reduced, I was going green! But it was hard for me to maintain the optimizer as I knew almost nothing about compilers and was just going by my nose. There were some corner cases that were hard to handle correctly. It only worked on code written against a very specific library, optimizing away well-known constructor and method calls.

Writing that optimizer got me somewhat interested in compilers, though. I remember saying to someone during a job interview a few years ago that I like complex problems, but am not interested in the really complex stuff like compilers. Sometimes things work out as the reverse of what you think.

Anyway, working on my game engine, I wanted to create a really powerful entity system. I wanted to use mix-ins and other nice Scala features. Reading some blog posts about game entity systems, I realized that most people seemed to be moving away from inheritance-based systems into component-based systems. Reading more about component-based systems made me run into the topic of data oriented design (PDF), which is all about thinking about data first, and how the program processes it. A couple of presentations on that left an impression on me and made me realize just how expensive it actually is to make the CPU churn through megabytes of random memory.

But I didn’t want to switch to C++ or C to be able to take advantage of the kinds of optimizations that data-oriented programming can give. I had the idea that maybe there should be a language that was object-functional like Scala, but compiled down to very CPU- and cache-friendly data structures and functions, the kind of structures one would use when doing data-oriented programming manually. And I have huge respect for people who do the latter. But I noticed that some of them seemed to be wanting a better language than C++ as well.

So, a language as expressive and type safe as Scala, similarly object-functional, but with more efficient memory access, CPU cache and parallelization friendly, enabling a data-oriented programming style with less hassle than C++. Perhaps one that could even run some functions on the GPU. What could that look like? I started thinking that maybe I should find out first-hand. [Update to clarify my goals] Well, at least I want to find out what it would be like to try to get there. I’m sure combining the type safety and power of Scala with the raw performance of C is way too ambitious for me. So I’m setting the goal way lower, but more about that in the next post.[/Update]

I’ve never really been a language geek. I’ve programmed in several different languages and at the moment am good friends with only two: Java and Scala. There are scores of other programming languages out there and I’m sure there is some language that is at least 2/3 of what I’m looking for.

But I think some bug bit me, because I couldn’t let go of the idea of creating one myself. And this blog post was a lengthy, boring preface to a series of hopefully less boring posts documenting my experiment. I have no illusions — I don’t think I’ll create the “next big language” or anything close — there are people much smarter than me who have been working on programming languages for years and decades. But this will be a fun learning experiment and maybe something useful will come out of it. Next time I will talk about the kind of language(s) I want to create. The (s) is because I want to create a really simple language first, to learn more about compilers during the process, and later expand from that base.

A Simple Permission Based Security Library in Scala

I started writing a simple security library in Scala that supports permission checks similar to the java.security package, but is a lot simpler to extend or use and meant for securing application code and model objects based on user roles and permissions, not for protecting malicious code from executing. I personally like the basic concept of how the java.security.Permissions work, but that implementation is not suitable for handling user permissions for several reasons. The programming model with PAW (current code name for the library, it may change) is simple:

Extend the Permission trait in objects or case classes to define your own permissions. The Permission trait defines only one method:

def implies(that: Permission): Boolean

This is very similar to how java.security works: permissions can imply other permissions. Testing if a permission is implied is the same as testing if a user has a permission. For your own permission classes you only have to implement the implies method. But if you have simple permissions that don’t imply each other, but only themselves, you can just do this:

object MyPermission extends SimplePermission // A SimplePermission only implies itself and nothing else

Coming versions of the library may provide convenience classes or traits for hierarchically named permissions, CRUD permissions etc.

Permissions are grouped into PermissionSets, and they can be nested because PermissionSets are Permissions as well. There are two special Permission sets: AllPermissions and NoPermissions. The former can be used for “superadmin” users or during development; the latter can be used for guest users for example.

When you have defined your permissions, you need to set up a SecurityContext. JAAS integration, which gets the user Principal from JAAS, but stores permissions in a ThreadLocal, is provided with the library, but other implementations (such as Spring Security) can be plugged in. To set up the JAAS Context, you need to make sure that JAASSecurityContext is loaded, by making your code refer to it somewhere during initial setup:

init {
// You could also skip this and just use JAASSecurityContext everywhere instead of SecurityContext

You can ask SecurityContext for the current user Principal (this is still java.security.Principal) and it’s permissions. But most of the time you should need just two methods:

SecurityContext has MyPermission // SecurityContext.has(MyPermission) if you prefer

will tell you if the user has the permission or not. However, this still requires you to write an if-else statement if you want to throw an exception in case the user doesn’t have the required permissions. You can also do this:

SecurityContext mustHave MyPermission // will throw AccessControlException() if the permission is not implied

SecurityContext mustHave List(MyPermission, YourPermission) // must have both of them

Mostly, that’s it. You can write one-liners to check for user permissions and IMHO, this is in most cases superior to using Java EE annotations like @RolesAllowed("storeManager", "truckDriver", "cleaner"). You can go into details as much as required and you can put the permission checks anywhere in your code, including unmanaged objects.

I personally strongly prefer this security programming model. When I’ve had to use the Java EE role-based stuff I’ve usually had to “hack in” permissions by representing them as roles, which seems conceptually wrong. I haven’t taken a look at Spring Security for a while, but it seemed to follow a model similar to Java EE when I last tried it (years ago).

Of course, my library currently doesn’t cover roles or authentication. These can be handled by Spring Security, JAAS or something else, and may require custom integration work. But some of these features may also make into the PAW library in the future.

Feedback about the library would be greatly appreciated. Go a head and take a look at the sources at GitHub, it’s only about 250 lines of code currently, half of which are ScalaDoc comments. The sources and binaries are also available in my Maven repo.

PS. One thing to consider is this: the implication of Permissions is not transitive by default. i.e. A implies B, B implies C does not mean A implies C. When you test: SecurityContext has C, but the user has only A, an instance of B may not even exist. A.implies(that) is coded so that if the argument is B, it will return true, but there is no way to ask A for a list of all permissions it implies and if it does not directly imply C, then C is simply not implied. Can you think of a good solution for this?

Baseline Eclipse for Java Game Development?

The Eclipse IDE as it is offered for download at eclipse.org has become quite large. Size in bytes is probably not the biggest issue, but I think some of the functionality showing up in the UI is unnecessary clutter for many purposes that Eclipse could be used for. One such purpose I’m personally interested in is game development on the Java Platform (using Scala, Java, or another JVM language).

Some Java game development tools are already being built as Eclipse plug-ins and most likely more will be in the future. But right now they must provide them as a plug-in for an existing Eclipse installation that has all the clutter or build a completely custom Eclipse-based application, possibly an RCP app that doesn’t even include the Java Tools. I think this shouldn’t have to be the case: users of Java game dev tools should be able to download a version of Eclipse that is freed (as much as possible) from the cruft that they will not need, and be able to install specific tools and engine specific libraries into that baseline game development environment.

So I’m thinking of creating a lighter distribution of Eclipse JDT + XML tools that removes rarely used features from the Eclipse for Java Developers distribution and is specifically geared towards game development rather than enterprise Java or application development. Perhaps this should also extend to C/C++, but I am not familiar with the CDT so I’m not considering that aspect right now.

This distribution shouldn’t come from eclipse.org of course, but rather from the open source game engine and tools developers. I am willing to work on such a project, as long as it provides actual value to game developers and doesn’t eat up too much of my time. The first version of it could be fairly basic, just a lighter distro of Eclipse. Later versions can start incorporating game dev specific tools and UI concepts. Is there anyone interested in contributing to such a project? Or using such a build of Eclipse? Please let me know.

The first step of the project should be to create a reduced version of Eclipse JDT + XML tools that is still updateable, so that game development plug-ins and plug-ins such as Scala IDE, M2Eclipse, Subclipse, EGit can be installed. I have done some initial experiments and found an approximate list of features that can be easily removed from Eclipse JDT. If you would be interested in using such a distribution, please comment what you would like to be kept:

  • APT or Annotation Processing Tool support — I think this is very rarely used and just clutters the UI in some screens.
  • Ant support — I think a lot of people would be against this, so it should probably stay
  • CVS support — CVS should not be used by anyone for other than legacy reasons, now that we have Git, Mercurial, SVN etc. And I expect that games development will not involve much poking around in legacy repositories. I think most open source game tools use SVN (correct me if I’m wrong).
  • Help — this is debatable, but I would remove it from the first version and decide later what to do with it
  • Welcome screen — same as help, can be added back later if it’s found to have some use
  • JUnit3 support — JUnit4 should be enough, I think
  • Various internal tools and plug-ins, backwards compatibility stuff

I’m estimating that the size of this distro would be somewhere between 50-70 MB. The P2 update manager included in Eclipse will increase the size on disk though, because it may create a lot of meta data and caches. But disk space is relatively cheap, and I don’t think there’s a better update manager available. My main concern is removing UI clutter so that game dev tools that plug into it will have more UI space and better visibility. Compared to the Eclipse IDE for Java Developers distribution, which lists the following “features” (feature in Eclipse terms is a collection of plug-ins):

  • org.eclipse.cvs
  • org.eclipse.epp.usagedata.feature
  • org.eclipse.equinox.p2.user.ui
  • org.eclipse.help
  • org.eclipse.jdt
  • org.eclipse.mylyn.bugzilla_feature
  • org.eclipse.mylyn.context_feature
  • org.eclipse.mylyn.ide_feature
  • org.eclipse.mylyn.java_feature
  • org.eclipse.mylyn.wikitext_feature
  • org.eclipse.mylyn_feature
  • org.eclipse.platform
  • org.eclipse.rcp
  • org.eclipse.wst.xml_ui.feature

Only these would remain (and even those not in complete form):

  • org.eclipse.equinox.p2.user.ui (this is the update manager)
  • org.eclipse.jdt (custom lite version with APT and maybe a couple of more small things removed)
  • org.eclipse.platform (custom, with various plug-ins removed)
  • org.eclipse.rcp
  • org.eclipse.wst.xml_ui.feature (possibly custom with a couple of plug-ins removed)

Later versions could add optional updates that bundle engine-specific tools and libraries, Scala, SVN, Mercurial, Git or Maven support etc.

What do you think? Is there need for such an Eclipse distribution? I’m especially interested of the opinion of Eclipse based game tool developers, if any of you happen to read this post. Would you contribute if someone does the initial work? Has someone already done something like this (not engine specific)?

Call to JVM library developers: upload sources to Maven repos

I wish all developers that release libraries for the JVM platform (Java, Scala and other libraries) and put their libraries on Maven repos would also put the sources there. It makes it so much easier to get sources automatically attached in an IDE without going digging for them. Maven doesn’t do this by default, but building (and deploying) the sources Jar along with the binary Jar should be a matter of adding these settings to your library’s POM (or better yet, to a parent POM if you use a parent which defines commons settings).


Ziggy Returns, Rewritten in Scala

For about three years already I’ve been working on and off on a server/bot that runs Interactive Fiction games on web forums, effectively turning the single-player text adventure games into cooperative. The beta version launched at Idle Thumbs more than a year ago, but unfortunately it had some problems that made it crash regularly, and I didn’t find the time to fix them.

We’ll, a couple of weeks ago I decided to take the time and rewrote the main server code in Scala. That took only about a week and a half, the bot is now live again and running Hitchhiker’s Guide to the Galaxy. There are still several Java libraries in use of course: Z-Machine from Zinc, Bulletin Board API and ECF, Velocity, HTTPClient etc. But the main server code is now written in Scala.

My main goal for this rewrite, besides switching to Scala, was to simplify the architecture (back to the roots, I guess): the previous architecture was becoming too enterprisey for such a small project. It had too many layers of abstraction, some multi-threading issues made the code hard to debug, and using an OSGi runtime proved to be not that useful in my context. I still think OSGi is great, but sometimes simplicity is even better :)

Thankfully, ECF still works outside of OSGi and only requires two libraries from Equinox. By dropping OSGi, I could simplify the build process (or at least move it to what I’m more comfortable with – Maven). On the server I’m simply launching a Scala object with a main method instead of an OSGi runtime and that is good enough for this project.

The management console application was dropped as well, but I’m exposing some MXBeans on the server so VisualVM or JConsole can be used for management. Web-based management is planned for the future.

Another simplification was dropping some layers of abstraction and removing code that was there only to support hypothetical future features (such as games on IRC and IM, many Interactive Fiction VM implementations etc.). I may be adding some of those layers back in the future, but for now the code should focus on making the core functionality (running a Z-machine game in a vBulletin or phpBB forum thread) as good as it can be.

Perhaps the most improvement that came from this rewrite was that Scala code tends to be about two times shorter than equivalent Java code, as I aslo experienced with my Scala port of Box2D, which I mentioned in a previous post.

But I would say that using Actors for communication between different threads has also gained me some simplicity and I was now better able to understand and fix some of the concurrency issues that used to make the server hang. However, I wasn’t able to move to a completely Actor-based communication model yet, as some of aspects of the messaging between the Z-machine and the forum bot still seemed easier done using other methods, such as blocking queues. Maybe I just don’t know enough about actors yet.

Overall, I’m happy with the result so far and since I find programming in Scala to be more enjoyable than Java, I think I will be more motivated to further develop Ziggy and keep it running smoothly from now on.

Scala: for a/vs while

After posting about the (almost non-existent) performance impact of implicit conversions, I noticed something that blew my mind somewhat: if I used a for expression instead of while, the code ran much longer. How much? About a hundred times. Here’s a simple example, and again it’s such an example that brings out the difference between two “primitive” things taken out of context, so it doesn’t really tell us anything about real-life application performance:

object ComprehensionPerfImpact {
def main(args : Array[String]) : Unit = {
val n = 100000000

Time(“for-comprehension”) {
for (x < - 1 to n) {} } Time("while") { var x = 0 while (x <= n) { x += 1 } } } }[/sourcecode] The output:# Block “for-comprehension” completed, time taken: 16797 ms (16.797 s)
# Block “while” completed, time taken: 125 ms (0.125 s)

Update: I forgot to mention that this test was made with Scala version 2.6.0. I have also now run this on Scala 2.7.0 RC1, where the results are about 2.5 times faster.

Continue reading