PHOTOOG Photography writings by Olivier Giroux

26 Jul 07

A Gap in the Education of Programmers

Most universities have a course about this stuff, and most make this course mandatory. Yet somewhere, at some point, many students just didn't get it and they discarded that knowledge. To make room for learning Perl or PHP no doubt... sigh.

I'm talking about BNF of course! Backus-Naur Form. Optionally with an 'E-', for the Extended version.

It saddens me how woefully underutilized BNF is in the field. That people still choose to write complex character parsers by hand has got to be the single most glaring example of underestimated effort, and it is second only to memory mismanagement as a source of bugs. This is a great failure of the education system for programmers.

When confronted about it, programmers will often spin tales of how "BNF is for compilers" and that their parsers have magical properties like high performance or (OMG) simplicity. How could an imperative programming language like C or the C subset of C++ (their choice for this) possibly be any simpler than BNF when it comes to parsing?
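To make the comparison concrete, here is a minimal sketch. The four-rule EBNF grammar for integer arithmetic is my own illustration (it does not come from any of the tools discussed here), and it appears as comments above the C++ that implements each rule by hand. The names `Parser` and `evaluate` are hypothetical. Notice that the grammar is four lines; everything else is bookkeeping.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hand-written recursive-descent evaluator. Each method implements one
// rule of the (illustrative) EBNF grammar quoted in the comment above it.
class Parser {
public:
    explicit Parser(const std::string& text) : s_(text), pos_(0) {}

    // Parse the whole input; anything left over is an error.
    long parse() {
        long value = expr();
        skipSpaces();
        if (pos_ != s_.size()) throw std::runtime_error("trailing input");
        return value;
    }

private:
    // expr = term , { ("+" | "-") , term } ;
    long expr() {
        long value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term = factor , { ("*" | "/") , factor } ;
    long term() {
        long value = factor();
        for (;;) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                long divisor = factor();
                if (divisor == 0) throw std::runtime_error("division by zero");
                value /= divisor;
            } else {
                return value;
            }
        }
    }

    // factor = number | "(" , expr , ")" ;
    long factor() {
        if (accept('(')) {
            long value = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        return number();
    }

    // number = digit , { digit } ;
    long number() {
        skipSpaces();
        if (pos_ >= s_.size() || !std::isdigit(static_cast<unsigned char>(s_[pos_])))
            throw std::runtime_error("expected a number");
        long value = 0;
        while (pos_ < s_.size() && std::isdigit(static_cast<unsigned char>(s_[pos_])))
            value = value * 10 + (s_[pos_++] - '0');
        return value;
    }

    // Consume character c (after optional whitespace) if it is next.
    bool accept(char c) {
        skipSpaces();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skipSpaces() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }

    std::string s_;    // private copy of the input
    std::size_t pos_;  // index of the next unread character
};

// Convenience wrapper: evaluate an arithmetic expression held in a string.
long evaluate(const std::string& text) { return Parser(text).parse(); }
```

Calling `evaluate("(1 + 2) * 3")` walks the rules exactly as the grammar reads. The whitespace skipping, the bounds checks and the error paths are the parts that tend to go wrong when written by hand, and none of them appear in the grammar itself.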

Besides just scoffing at the state of the art-less, can we dig into why this is?

I propose the following reasons:

  1. BNF is only taught in compiler classes. It has nothing to do with compilers; it has everything to do with a machine understanding a chunk of text. Compilers just happen to take text files as their inputs most of the time. The BNF part of a compiler is the first micron of its painted exterior.

  2. Bison and Flex suck. Bison's grammar is plain BNF (no E-). The generated code looks like par for 1977. There's a whole 50 people on the planet who can pull off an object-oriented thread-safe Bison set-up. If I were a student I wouldn't want to touch Bison with a 10-foot pole (fyi, when I was a student I used LEMON for years, then switched to Spirit).

  3. Most people have only heard of Bison and Flex. Did you know about LEMON? What about Spirit? GNU is fixated on replicating what AT&T did in the 1970s - let's move on please, that's not where the state of the art is. In turn my university was fixated on GNU. >.<

  4. Boost Spirit is truly a revolution for BNF but it can only be used by a Grand C++ Master. Operator overloading, meta-programming, higher-order functions, lambdas, closures, oh my! Make a simple typo and you'll get spanked with 50 pages of template errors. Use it wrong and it can really burn a hole in your compiler or linker.

  5. There are no BNF tools that come with VC++. The leading IDE for the leading platform doesn't understand any dialect of BNF. You can download ports of BNF tools but then the hook-up into the IDE is second-rate (except for Spirit of course). It would of course be even worse if VC++ had its own dialect, please no.

  6. Most people don't program in C/C++. Of course this is the biggest issue of them all. All of the BNF frameworks mentioned here are meant to be used as part of a C/C++ environment and their level of support for other languages varies from poor to nonexistent. How ironic then that the wave of post-modern programming languages, where the line between language and library is hair-thin, did nothing for parsing beyond regular expressions.
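For readers who have never seen the operator-overloading trick mentioned in item 4, here is a toy sketch of the idea. To be clear, this is not Spirit's API: `Rule`, `ch`, `digit` and `matchesList` are hypothetical names invented for this illustration, and the sketch only recognizes input (no attributes, no recursive rules). It borrows just the notation Spirit uses, with `>>` for sequence, `|` for alternative and a prefix `*` for zero-or-more.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// A rule tries to match at position pos and advances pos on success.
struct Rule {
    std::function<bool(const std::string&, std::size_t&)> match;
};

// Match one specific character.
Rule ch(char c) {
    return Rule{[c](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }};
}

// Match one decimal digit.
Rule digit() {
    return Rule{[](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && s[pos] >= '0' && s[pos] <= '9') { ++pos; return true; }
        return false;
    }};
}

// Sequence: a >> b matches a followed by b, or nothing at all.
Rule operator>>(Rule a, Rule b) {
    return Rule{[a, b](const std::string& s, std::size_t& pos) {
        std::size_t start = pos;
        if (a.match(s, pos) && b.match(s, pos)) return true;
        pos = start;  // backtrack on failure
        return false;
    }};
}

// Alternative: a | b tries a first, then b from the same position.
Rule operator|(Rule a, Rule b) {
    return Rule{[a, b](const std::string& s, std::size_t& pos) {
        std::size_t start = pos;
        if (a.match(s, pos)) return true;
        pos = start;
        return b.match(s, pos);
    }};
}

// Zero or more: *a keeps matching a until it fails or stops consuming input.
Rule operator*(Rule a) {
    return Rule{[a](const std::string& s, std::size_t& pos) {
        for (;;) {
            std::size_t before = pos;
            if (!a.match(s, pos) || pos == before) { pos = before; return true; }
        }
    }};
}

// EBNF being modelled (illustrative):
//   number = digit , { digit } ;
//   list   = number , { "," , number } ;
bool matchesList(const std::string& text) {
    Rule number = digit() >> *digit();
    Rule list = number >> *(ch(',') >> number);
    std::size_t pos = 0;
    return list.match(text, pos) && pos == text.size();
}
```

The two lines inside `matchesList` read almost like the grammar in the comment above them, which is the whole appeal. The real thing does this with templates rather than `std::function`, which is where the 50 pages of errors come from.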

Personally I'm a devoted user of Spirit, but I don't see how I could have learned how to use it while working a day job. It took me two whole weeks of college spare time (which is to say, just time) to learn to do anything complex with it. And I had already been using BNF regularly for years at that point... my starting point wasn't the average baseline. Then it took another two years of use to become completely comfortable with it.

That said, I can just imagine how many bugs could be avoided if BNF were better incorporated into the collective meme pool...

