in which the word column has several meanings, none of them architectural

Sometimes adding what appears to be the tiniest little feature to a program can lead you down a rabbit hole of edge cases and bugs. This time around it was an innocent-looking question about column numbers in the Fennel compiler error messages. The Fennel compiler has for a while now had a feature where errors show the line containing the error and pinpoint exactly where in the line the error occurred.

Compile error in test/bad/expected-even-bindings.fnl:1
  expected even number of name/value bindings

(let [x 1 y] x)
     ^^^^^^^
* Try finding where the identifier or value is missing.

You can see here that the ^^^^^ indicator highlights the entire binding vector that caused the error. We didn't invent this feature in Fennel; we surveyed existing compilers and stole ideas from the ones that people really seemed to love. In particular, Elm and Rust kept coming up as role models whose error reporting was worth emulating, and this kind of error pinpointing was one of the things people reported liking about both.

Now the bar is pretty low in general when it comes to compiler error messages; many compilers make only a token effort at providing error messages that are clear and understandable. On the bright side, this means you can put a small amount of effort into this and quickly place in a surprisingly high percentile of quality! Adding this error pinpointing to Fennel only took a couple weekends, and the payoff has been fantastic.

However, it's not all smooth sailing. In order to display this information, the Fennel compiler originally stored byte offsets in the AST and read the source file up to the line in question. But using byte offsets for this led to some issues. Take a look at this code:

(let [无为] nil)

In this case, we want to highlight the [无为] section of the code because it tries to introduce a new local without giving it a value. But its byte offsets are 6-13. If we use those byte offsets to draw the pinpoint indicator, we get this:

(let [无为] nil)
     ^^^^^^^^
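
That wrong output falls straight out of the byte arithmetic: the indicator line is just padding up to the start byte followed by one caret per byte. Roughly speaking (this is an illustrative sketch with a made-up byte-pinpoint helper, not the actual compiler code):

;; illustrative sketch: draw the indicator line from byte offsets alone
(fn byte-pinpoint [start-byte end-byte]
  (.. (string.rep " " (- start-byte 1))                 ; pad up to the start byte
      (string.rep "^" (+ (- end-byte start-byte) 1))))  ; one caret per byte

(print "(let [无为] nil)")
(print (byte-pinpoint 6 13)) ; prints 8 carets even though [无为] is only 4 characters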

Just because the identifier 无为 is 6 bytes doesn't mean it is 6 characters long; in fact it is only two. Now we don't have a foolproof solution for this problem in Fennel, but we can manage OK. Fennel's compiler runs on the Lua VM, and in versions of Lua from 5.3 onward, we have the utf8.len function which can correctly identify "无为" as a string containing two characters. We have to fall back to the byte offsets when running on older versions of Lua (unless the utf8 library has been backported), but when we can do better, we try our best.
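
Something along these lines is enough to prefer character counts when the utf8 library is around; this is a simplified sketch with a hypothetical char-count helper, not the compiler's actual code:

;; count characters with utf8.len when it exists, otherwise fall back to bytes
(fn char-count [s]
  (let [utf8 (. _G :utf8)] ; utf8 is only a global on Lua 5.3+ (or when backported)
    (if (and utf8 utf8.len)
        (utf8.len s)
        (length s))))

(char-count "无为") ; => 2 with the utf8 library, 6 when falling back to bytes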

Unfortunately that's not the only wrinkle we have to deal with. What about this code?

(print "วัด" unknown-var)
             ^^^^^^^^^^^

This code doesn't even do anything "fancy" like unicode identifiers, but it still throws off the error pinpoint indicator. That's because the word "วัด" consists of three characters, ว, ั, and ด. The second is attached to the first, because Thai is an abugida, meaning the vowels are often attached above or below their consonants rather than being written separately. So วัด is nine bytes and three characters, but it only takes up two columns, throwing off our indicators. At this point we have run up against an inherent limitation in this approach. As long as we assume our output is displayed on a terminal, there is no way to determine the width of the code that we're trying to draw attention to!
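
To put numbers on it, here is what the standard library can and cannot tell us about that string (the values below assume Lua 5.3+):

(print (length "วัด"))   ; 9: bytes
(print (utf8.len "วัด")) ; 3: codepoints, via the utf8 library on Lua 5.3+
;; columns on the terminal: 2, and nothing in the standard library reports that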

Even if we can determine the "width" of a character in columns, using that to produce an underline relies on the assumption that every column of characters is an equal width in pixels. The idea that a monospace font can reliably be monospace across all characters simply isn't true. In the best case, a larger character will be displayed in some width which is an integer multiple of the "normal" characters, but there's no guarantee that this will happen, and different fonts will make different decisions about how to render a given character. The only way to know how wide something will be is to attempt to render it in a specific font and measure it.

And Fennel is not the only compiler to run into this bug. Both Rust and Elm are easily confused by strings and identifiers that use characters outside the ASCII range, as are OCaml, Clang, and every other compiler we could find that tries to pinpoint errors this way.

In my opinion the root cause of this bug is that when a white American developer sees a terminal, they immediately interpret it as an idealized Cartesian plane where they can lay out any writing they want in neatly-spaced characters that behave in "predictable" ways (in other words, behave exactly like English). But the reality is a lot more complex than that! You can't assume that everything people write will fit into your assumptions and your neat abstract boxes. Programming, of course, is forgetting, but we need to at least try to be aware of the costs of the abstractions we choose and consider who it is that ends up being forgotten.

We did find a way[1] around the problem: putting the indicator inline with the code itself avoids having to know how wide it is. So the next version of Fennel will highlight errors using terminal escape codes, which will even allow the functionality to work with right-to-left languages, assuming your terminal supports them:

[screenshot: Fennel attempting to compile some Qalb code, which is written in Arabic]
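
The core of the trick looks something like this; it's a rough sketch with a made-up highlight-span helper rather than the actual implementation, but it shows the idea of wrapping the offending bytes in escape codes and letting the terminal do the layout:

(local esc (string.char 27)) ; ASCII escape

;; sketch: highlight a byte range inline instead of drawing a separate caret line
(fn highlight-span [line start-byte end-byte]
  (.. (line:sub 1 (- start-byte 1))
      esc "[7m" ; reverse video on
      (line:sub start-byte end-byte)
      esc "[0m" ; reset
      (line:sub (+ end-byte 1))))

(print (highlight-span "(let [无为] nil)" 6 13))

Because the highlighted text stays in the line itself, its on-screen width is whatever the terminal decides it is, which is the whole point.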

[1] I want to thank Reed Mullanix for his post about Unicode and source spans; I had come to the conclusion on my own that the old pinpointing approach was inherently impossible to do correctly from Fennel, but his article helped me see inline highlighting as a workaround that could still convey the same information reliably.
