Go: Easy to Read, Hard to Compile Corner cases when compiling Go Ian Lance Taylor Google iant@golang.org * Introduction - To really learn a language, write a compiler for it - Compiler bugs imply language complexity - Or, compiler bugs imply differences from C/C++ - Sometimes simpler for users is harder for compilers - Fortunately Go is much simpler to compile than C++ or even C This talk is based on Go compiler bugs encountered over the years. * Recursive Types Names in Go packages are defined in the entire package, so Go types can refer to themselves recursively. .code compiling/rtype1.go /1 START OMIT/,/1 END OMIT/ This is not permitted in C/C++, except for the special case of a struct/union/class field which is a pointer/reference. All Go compiler code that walks over types has to be careful to avoid endless loops. * Recursive Types What good is a recursive pointer type? It can only be nil or a pointer to itself. That's enough for Peano arithmetic. .code compiling/rtype1.go /2 START OMIT/,/2 END OMIT/ * Recursive Types Actually, a recursive pointer can have a bit more information: it can have a finalizer. .play compiling/rtype1.go /3 START OMIT/,/3 END OMIT/ * Recursive Types Recursive function types are actually useful: they can implement a state machine. .code compiling/rtype2.go /1 START OMIT/,/1 END OMIT/ * Recursive types .play compiling/rtype2.go /2 START OMIT/,/2 END OMIT/ * Recursive Types Simple rule: all names at package scope are visible in the entire package. Complex consequence: compiler must handle recursive types (also recursive initializers). * Constants Go has both typed and untyped constants. They follow the same rules, except that a typed constant must be representable in its type. This is reasonably clear for integers, less so for floats. .play compiling/const1.go /1 START OMIT/,/1 END OMIT/ * Constants Go's floating point variables follow IEEE-754 rules. Constants do not. .play compiling/const2.go /1 START OMIT/,/1 END OMIT/ * Constants The special unsafe.Sizeof function returns a constant. .play compiling/const3.go /1 START OMIT/,/1 END OMIT/ * Constants Simple rule: constants are untyped; they are mathematically exact and do not require type conversions. Complex consequence: exact floating point behavior depends on the type. * Name Lookup Name lookup in a Go compiler is simple compared to many languages. For every name the scope in which to look it up is obvious. This makes parsing Go quite simple. With one exception. What is the scope for i? .code compiling/name1.go /1 START OMIT/,/1 END OMIT/ * Name Lookup One possibility. .play compiling/name1.go /2 START OMIT/,/2 END OMIT/ * Name Lookup Another possibility. .play compiling/name2.go /2 START OMIT/,/2 END OMIT/ * Name Lookup Simple rule: in a struct composite literal you can use field names as keys. Complex consequence: if you don't know the type of the composite literal, the lookup scope of names used as keys is unclear when parsing. * Methods Any named type can have methods. Any struct type can inherit methods from an embedded field. It follows that you can sometimes call methods on a variable even if it has an unnamed type. .play compiling/var1.go /1 START OMIT/,/1 END OMIT/ * Methods Simple rules: named types can have methods; structs can have embedded fields. Complex consequence: unnamed types can have methods. * Conclusion - Go is simpler to compile than most languages - There are still complexities for the compiler - Most complexities stem from making Go easier to write