Source code and education

For a long time I’ve been interested in how good programmers get that way. Back in 2002 I posted a comment to a mailing list of hackers. This group is the original sort of hackers – people who program for love, not the modern sort who write viruses and try to crack systems. One of them was so taken by the comment that he posted it on his own website.

What it says is:

If we taught writing the way we try to teach programming …

Imagine if we tried to teach writing (in English or any other natural language) the way we try to teach programming.

We’d give students dictionaries and grammar books. We’d lecture them on the abstract structure of stories. We’d give them dreadful stuff to read – only things written by the most junior writers, like advanced underclassmen or young grad students (some of whom can indeed write well, but most of whom are dreadful). We’d keep the great literature secret. Shakespeare would be locked up in a corporate vault somewhere. Dickens would be classified Secret by the government. Twain’s manuscripts would have been burned by his literary executor to prevent them from competing with the executor’s own efforts.

And when people took jobs as writers (here the analogy begins to break down) their primary assignments for the first five to fifteen years of their working lives would be copy editing large works that they wouldn’t have time to read end-to-end, for which there was no table of contents or index, and which they received in a large pile of out-of-order, unnumbered pages, half of which were torn, crumpled, smudged, or otherwise damaged.

Is it any wonder that good programmers are so rare in the wild?

The thinking behind that statement developed back in the 1980s when I was a grad student at CMU. A group of grad students, me among them, met monthly to read code and drink wine. We all agreed that an important ingredient in learning to be a good programmer was reading good and bad code.

Unfortunately, in those days there was precious little code to read. Interestingly, all of the best software research and education institutions of the time were organized around repositories of software that all of the members contributed to and partook of. I include in this category organizations that I knew, or knew of, well enough to make this claim. They included MIT’s AI Lab, Stanford’s AI Lab, CERN, CMU’s Computer Science Department, IBM’s Research Lab, Princeton, Yale, and Berkeley. (There were certainly others, but I didn’t know people there or what sort of source code sharing went on there.) On reflection it’s interesting to realize that this is similar to how some of the earliest universities, Oxford and Cambridge in the UK, came about – a bunch of scholars pooling their most critical and precious resource, their books.

In the early days of software, being a programmer meant much more than writing code. Programming included working with users, designing user interfaces, and laying out the architecture, as well as writing the actual code. Nowadays we expect software people to be highly specialized and work in large teams, but we continue to believe deeply that a software person must be broadly educated and experienced to be valuable.

Educating a programmer in the ’80s was a challenge because our models of what software was and how it should be built were only beginning to gel. Thus you couldn’t learn design patterns because they hadn’t been invented. Object-oriented programming was implicit in the Simula work that dated from the late 1960s, but the OO intellectual movement didn’t really form until some key ideas escaped from Xerox’s Palo Alto Research Center. And the adoption of those ideas was delayed by the limitations of Smalltalk until C++ and later Java reached maturity.

Things have changed.

The source code to many interesting systems is now broadly available to anyone. As a result, we are better able to educate software people today than we were twenty or more years ago. And the opportunity to create great software education is no longer limited to the small number of institutions that managed to combine wealth and vision in the right mixture to produce comprehensive source code repositories. Anyone with an Internet connection can get tons of source code to study. Now the challenge is how to focus attention, given how much is available.

In addition, we’ve come a long way in building software that is composable. UNIX pipelines were a tantalizing hint of the power of composable modules, but they were never quite enough to make the leap. Now, however, the real action is in the composition of services into mashups, something that can be done rapidly and easily without any formal computer science training.

And with the source code sharing culture, it’s increasingly easy for great artists to compose instead of merely imitating.