Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Writing a Compiler in Go (squanch.org)
171 points by dcw303 on May 3, 2016 | hide | past | favorite | 45 comments


4 years ago Notch (creator of Minecraft) released specs for DCPU-16 - imaginary 16-bit CPU that could run limited, but quite complete set of instructions. The goal was to put this into his new game - 0x10c. People went crazy about this idea and a whole community appeared with people building assemblers, compilers, specs for new hardware and other tools for creating software for this architecture.

I also got interested in it and found the same article from 1988 that OP is referencing, which resulted in me writing a simple BASIC to DCPU-16 ASM compiler, also in Go [0]. The code isn't of very high quality (it was just a pet project) but if anyone wants to take a look, go ahead.

[0] https://github.com/M4v3R/DCPU-Bas


It's a shame that Notch dropped the 0x10c project. The concept seemed to me like a great way to get kids more interested in actual computer engineering (as opposed to just using computer applications).


It seems TIS-100[0] would fill that gap pretty nicely (I've heard good things from friends). It's by the same author[1] that introduced the concept of block worlds (that Minecraft used for example).

[0]: http://www.zachtronics.com/tis-100/ [1]: https://en.wikipedia.org/wiki/Zachary_Barth


There is a similar type game called Human Resource Machine[1] which I can also recommend. It's made by the same people who made Little Inferno.

My one gripe with it however is that it takes the middle ground between teaching assembly to people unfamiliar with programming and those who already have some knowledge, and doesn't quite satisfy either.

[1] https://tomorrowcorporation.com/humanresourcemachine


Tech Compliant seems to have picked up the torch: https://www.reddit.com/r/techcompliant/


We're hoping to fill that gap with https://armoredbits.com - although admittedly at a higher level (any off the shelf programming language).


Yeah, I was very sad when it was shelved. The idea had a great potential, which was apparent by a very vibrant community that sparkled around it, with literally thousands of people creating content for a game that didn't even exist yet.


Through some off the record interviews with the music composer for 0x10c it seems like Microsoft owns it now. I'm really hoping that someone convinces them to make it.


He said he dropped it because he couldn't find a way to make it fun. I'd look to other games in the same vein instead of pushing for this specific one to be released.


I would like to mention that most, if not all, of the Go compiler and related utilities is written in Go itself, with high quality, open source code that can be found in the "go" package of the standard library:

https://golang.org/pkg/go/

I would recommend using that as a template for one's own compiler, because it's written in idiomatic Go (originally by the language creators) and is cleanly split across abstraction boundaries (packages "scanner", "parser", "ast", and so on.)

There's also a smaller, simpler parser in the "text/template/parse" package, that is used to parse the templating languages in "text/template" and "html/template". As an introduction to it, I recommend the talk "Lexical Scanning in Go" by Rob Pike:

https://www.youtube.com/watch?v=HxaD_trXwRE


Actually, after a quick examination of the Go libraries, I'm not sure I can agree that it's "idiomatic Go". I still see many things that strike me as "artifacts of the C conversion" more than "the way someone would have written this from scratch".

The careful package split is something to be copied, though.


The conversion from C to Go was done in 2014, so it is surprising that it is not complete yet.

See "Go, From C to Go by Russ Cox" (with video) at https://github.com/gophercon/2014-talks


it is complete, but it's a conversion, not a rewrite. i doubt anyone will spend the time to rewrite the compiler in idiomatic Go soon. bigger fish to fry :)


Where is a good place to read "idiomatic Go?"


Much of the rest of the standard library. The compiler portion is a special case.


The standard library.


>I would like to mention that most, if not all, of the Go compiler and related utilities is written in Go itself, with high quality, open source code

High quality? Isn't the compiler since 1.5 auto-generated from C for the most part, and quite crap with regards to optimizations/readability?


I can also highly recommend the Coursera Compilers class. Sign up for it after it's over, and just work through things at your own pace.

The provided scaffolding code is either Java or C++, but with a bit more work, it's perfectly reasonable to re-write it in whatever language you desire. Here's mine, in Go: https://github.com/zellyn/gocool

Admittedly, it is the world's dumbest compiler (uses a single register, and there's no optimization), but it's still fun to actually "slay the dragon" and stop believing that only wizards can build a compiler. :-)


When I did my compiler class at the university, our professor teached us a nice way to do dumb compilers, that I always suggest as learning process.

Generating bytecodes in a suitable text format, that can be either be easily validated and executed with a dumb interpreter, or translated into machine code via a Macro Assembler.

Surely the code performance sucks, but it gives the students the gratification of producing something that executes real machine code and that they can show off to their friends.


I want to second this. I'm in the middle of this course right now.

It has a lot of theory, which may not be terribly interesting if you just want to learn to build a compiler, but it's complete and the language you're building the compiler for has a little bit of everything.


> I’ve been picking up Go lately, and I wanted to apply it to a project that I could sink my teeth into.

> What I came up with is very-much-not-idiomatic-go, but it does map (basically) line for line with the original Turbo Pascal.

Why not use this as an exercise in how to write idiomatic Go? The backdrop of wanting to apply Go isn't exactly followed up on if you're just going to copy completely different idioms without any regard for your new language.

I know it's touched upon later in the piece, but it really doesn't make sense to make this an afterthought.


That's a fair point. It wasn't so much that I was deliberately trying to ignore the idioms, but more that I wanted to write a translation in a current language to accompany someone who was working through the tutorials. Some may find that useful, others may not. I'm totally fine with that.

As a separate project I would certainly like to write a compiler that takes advantages of go's features.


Agreed. Seems to be a lot of disregard for the idioms from people picking up Go. I've recently had myself removed from a pairing where the project was some sort of ghastly java oriented Go. Not enjoying it.


That's a real shame indeed as the biggest strength of Go IMHO is exactly the culture it emerged out of.

Reading the source of the Go stdlib is a good exercise in preparation of "letting go" of your idiosyncratic perspective. In that regard it doesn't matter if you are coming from Java or Haskell or any other "world".

"Perspective is worth 80 IQ points." - Alan Kay

(who probably hates Go but that's OK too ;)


I've read a decent amount of the std lib in the near 3 years I've been using Go. I think the idiosyncratic perspective is that of the people who are unable to let go of the language they're coming from and adapt to the new one. Bringing ideas to an ecosystem is one thing, disregard is another. While I'm here I'll throw in mention of npm-style 3 line function packages I've been encountering. I think you've presumed too much about my perspective, and lost out on that 80.


Please make this accessible! (contrast ratio of 2.1 falls below WCAG)

https://en.wikipedia.org/wiki/Web_Content_Accessibility_Guid...


This is awesome. Did you feel like Go is good enough for the task? I've always thought its lack of expresiveness make it very unsuitable for compilers and all kinds of pattern-matching/tree-representation-based code (opposite to e.g. Haskell/OCaml)


I'd call it average-ish at best. It's not as bad as C, but it's not as strong as Haskell, which is a high bar to meet.

However, if you write things idiomatically and use a bit of thought, it's not hopeless, either. You can write type-safe nodes with interfaces, for instance:

    type ArithExpr struct {
        Op ArithOp
        L  ArithNode
        R  ArithNode
    }
where ArithNode is definitely an interface (implemented by ArithExpr) and ArithOp may be, depending on your mood. "Pattern matching" can not be perfectly replicated, especially if you deeply match, but "run this on ArithNodes recursively" is actually not that hard to express. Invert the pattern matching into method calls; it isn't a perfect translation but it hits many of the use cases. Modest cleverness in the methods may be helpful, for instance, it may be helpful for all nodes to implement a "RunOnNodeIfArithNode" method that usually does nothing, and only works on ArithNodes, to avoid a lot of assertions and checking in the core algorithms, that sort of thing. Don't try to port a pattern-matching approach, use the native stuff you've actually got. In the end I can't promise you'd never have any type assertions, but it should be possible to write a compiler that isn't drenched in them.

Still not my first choice, though. Might prefer it over Python, though; it would certainly be more verbose but the way I'd use the strong typing would probably make up for that in safety.


Don't try to port a pattern-matching approach, use the native stuff you've actually got.

What you just said has got this old Smalltalker starting to think of pattern matching as "polymorphism on steroids." Is that what it actually is?


It bears certain similarities to multiple dispatch, as if a case statement was an inlined multiple-dispatch function, but I'd say that much as objects can theoretically be implemented with closures and closures can theoretically be implemented with objects but I don't think they're best thought of as the "same thing", polymorphism and various dispatch techniques aren't really the same thing as pattern matching.

My real point here is use what you've got; trying to use object polymorphism in Haskell would be as big a mistake there as trying to naively jam pattern matching into Go. Stick to the native idioms as much as possible. While I agree that Go's native idioms are limiting compared to some other languages I also think a lot of people grossly overestimate the limitations.


Everything in Smalltalk was pretty much polymorphism, lambdas, and reflection. I suspect that's why I like golang. (There are a few other tricks, but better not to go there unless you really know what you're doing.)


>Everything in Smalltalk was pretty much polymorphism, lambdas, and reflection. I suspect that's why I like golang.

Huh? While it has both, Go is as far from lambdas and reflection as you can go and still be modern language.


That the Visitor pattern, right?


Go's compiler is written in Go.


Which doesn't really say much about how suitable it is to write a compiler in go.

Most C compilers are written in C. I would not argue that C is a language particularly suited to writing compilers though.


That only proves it's technically possible


Go's compiler is written in Go that was automatically converted from C.


> Let’s Build a Compiler, by Jack Crenshaw

Uau! Great idea, I went through those tutorials back when I started hanging around USENET comp.compilers group.

I learned a lot with them, nice to see them being brought to newer generations.


Go also has a builtin yacc implementation: https://golang.org/cmd/yacc/

  It is largely transliterated from the Inferno version
  written in Limbo which in turn was largely transliterated 
  from the Plan 9 version written in C and documented at 
  http://plan9.bell-labs.com/magic/man2html/1/yacc
For a real example in action, see the mtail project: https://github.com/google/mtail/blob/master/vm/compiler.go


(some shameless self advertising)

I built e8vm.io (https://github.com/e8vm/e8vm, https://e8vm.io), which is all written in go, and it has a full-blown compiler, with no optimizations though. The language has go-like syntax and C-like run time.

shanhu.io (https://shanhu.io) provides an online playground for the language, which runs the compiler right inside the browser, thanks to gopherjs.


and for the DCPU fans, e8vm also has its own small mips-like instruction set, which is currently the only target "platform" that my compiler supports.


Gray text on black background is awfully unreadable.


Agreed. And the blue color of the links is even worse; I actually can't read them unless I move my head closer to the screen than is comfortable.


For an example of a language written in Go(besides Go), there's always https://github.com/ark-lang/ark


This is nice but I had just used freepascal, Crenshaw's series is not about a learning a programming language but about creating a compiler for a new one.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: