Books versus Covers

Back when I was a young scholar, there were several things one learned that violated the “never judge a book by its cover” rule. One was that when you saw a disheveled fellow walking down the street talking to himself, you could reliably assume that he was disturbed and probably not taking his medication. Another was that you could assume a nicely typeset and printed article was worth reading.

Things have changed.

Now when you see an unshaven fellow in rumpled clothes walking down the street conducting an animated conversation, you can’t assume that he’s off his chlorpromazine. He might just as easily be an investment banker working on a big deal.

Why did typesetting signify quality writing? Dating from the days of Aldus Manutius, typesetting a book or an article attractively in justified columns using proportionally spaced fonts was a time-consuming task involving expensive skilled labor. Because of that high up-front cost, publishers insisted on strong controls over what made it to press. Thus we had powerful editors making decisions about what got into commercial magazines and books. And we had legions of competent copy editors reviewing and refining the text so that what did make it to press was spelled correctly, grammatically sound, and readable.

No one ever had to tell us explicitly that the nicely typeset stuff was generally the better stuff; we learned it subconsciously.

Some years ago, in the first blush of desktop publishing, someone handed me a beautifully typeset article. Shortly after starting to read it I realized that it was hopeless drivel. After a few repetitions of this experience I came to the realization that with Framemaker, Word, and similar systems, prettily typeset output could now be produced with less effort than a draft manuscript required in the bad old days. An important cultural cue was lost. The book could no longer be judged by its cover.

Fixing a bug in the TreeTable2 example

This New Year I resolved to run backups of our computers regularly in 2007. My vague plan was to dump the data to DVDs, since both of our newest machines, a Dell PC running Windows XP Pro and a Mac, have DVD burners.

What, to my dismay, did I learn when I examined the Properties of my home directory on the PC? It weighs in at over 140 gigabytes. The DVDs hold about 6 gigabytes, so it would take at least 24 DVDs to run a backup. Aside from the cost, managing 24 DVDs sort of defeats the purpose.

Before going to plan B, getting an outboard disk drive to use as the backup device, I thought I’d investigate all of this growth in my home directory. Last time I looked, my home directory was less than 10 gigabytes.

In the past I’ve used du from the bash command line to investigate the file system. This is powerful, but it’s slow and very painful. What I really wanted was a tree browser that was smart enough to show me the size of each subtree.

In a project that I’d worked on a couple of years ago, I learned that there’s a cool thing called a TreeTable that has just the right properties. The leftmost column is a hierarchical tree browser, while the columns to the right can hold data associated with the tree nodes on the left.
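
To make the idea concrete, here is roughly the shape such a model takes. The names below are illustrative only, not the actual API of the library I ended up using: the tree half answers “what are this node’s children?” and the table half answers “what goes in column n for this node?”

// Illustrative sketch only; not the real library's interface.
public interface SimpleTreeTableModel {
    Object getRoot();                           // root of the tree in the leftmost column
    int getChildCount(Object node);             // tree structure, browsed in column zero
    Object getChild(Object node, int index);

    int getColumnCount();                       // the data columns to the right
    String getColumnName(int column);           // e.g. "Name", "Size", "Modified"
    Object getValueAt(Object node, int column); // per-node value for each column
}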

Thought I, “let’s get a treetable class from somewhere and then marry it with some code that can inspect the file system.” So I googled for ‘treetable’ and found not only a very nice treetable library available for free, but also an example built with it that did exactly what I wanted.
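
The file-system half of that marriage is easy enough to sketch. Something along these lines (my own illustration, not the example’s actual code) does the job, and the detail that matters later is that the running total is a long:

import java.io.File;

// Illustration only: total the bytes under a directory, recursively.
// The total is a long; a home directory can easily exceed 2 gigabytes.
public class DirSize {
    public static long totalSize(File f) {
        if (f.isFile()) {
            return f.length();
        }
        long total = 0;
        File[] children = f.listFiles();
        if (children != null) {   // listFiles() returns null for unreadable directories
            for (File child : children) {
                total += totalSize(child);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalSize(new File(args[0])) + " bytes");
    }
}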

After downloading the source code and creating a project in Eclipse, I ran it. It worked nicely and was just what I wanted. But there was one small problem.

It reported that my home directory had a negative size:

[Screenshot: Tree Table showing a negative size]

That immediately told me that somewhere in the code a node size was being represented as an integer, a 32-bit quantity that can’t represent more than 2 gigabytes before wrapping around and showing a negative number. What I really wanted was an unsigned 64-bit number, though I suspected I’d have to settle for a long, a 64-bit signed number. That would be adequate for now, since my 140-gigabyte file system size could be represented comfortably in a 38-bit integer.
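
If you want to see the wrap in isolation, a standalone snippet (not taken from the example’s code) makes the point: cast a few gigabytes’ worth of long down to an int and you get nonsense.

// Standalone illustration: 3 gigabytes doesn't fit in a 32-bit int.
public class Wrap {
    public static void main(String[] args) {
        long threeGig = 3L * 1024 * 1024 * 1024;
        System.out.println(threeGig);          // 3221225472
        System.out.println((int) threeGig);    // -1073741824, wrapped negative
        System.out.println(Integer.MAX_VALUE); // 2147483647, about 2 gigabytes
    }
}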

The next step was to find and fix the problem with the code. My fear was that the underlying system call was returning an integer, which would have made the fix potentially quite painful. Fortunately, however, the problem turned out to reside in a single line of code in FileSystemModel2.java:

if (fn.isTotalSizeValid()) {
    return new Integer((int)((FileNode)node).totalSize());
}

Here you can see that the long returned by totalSize() on the FileNode is being forcibly converted (don’t you love the word “coerce”?) to an integer, throwing away the high-order bits.

Replacing the coercion with an appropriate Long object was the work of moments:

if (fn.isTotalSizeValid()) {
    return new Long(((FileNode)node).totalSize());
}

That had the desired result:

[Screenshot: Tree Table showing the correct size]

With this version I was able to navigate quickly to the directory where I had stored the video I’d made at the Bar Mitzvah of a friend’s son, files that I certainly didn’t need to back up and that accounted for the vast bulk of the 140 gigabytes.