All posts tagged computing

The Coming War on General Purpose Computing

I like Cory Doctorow’s principled, long view of things.

The Computer History Museum has a terrific video of Steve Jobs talking in the late seventies, early eighties about computing. Nice work.

April 1 Is Backup Day

April 1 is international backup day, which seems like an odd day to choose. I think it would be better, if also equally unfortunate for those of us who live in societies that celebrate April Fools, to mark it as open information, or open access, day. Today is the 200th birthday of Robert Bunsen, famous for his eponymous burner, which he chose not to patent and, in fact, pursued those who tried to patent it for themselves.

In celebration of open information day, I offer up this passage from Benjamin Franklin’s Autobiography which details his refusal to patent the Franklin stove:

In order of time, I should have mentioned before, that having, in 1742, invented an open stove for the better warming of rooms, and at the same time saving fuel, as the fresh air admitted was warmed in entering, I made a present of the model to Mr. Robert Grace, one of my early friends, who, having an iron-furnace, found the casting of the plates for these stoves a profitable thing, as they were growing in demand.

To promote that demand, I wrote and published a pamphlet, entitled “An Account of the new-invented Pennsylvania Fireplaces; wherein their Construction and Manner of Operation is particularly explained; their Advantages above every other Method of warming Rooms demonstrated; and all Objections that have been raised against the Use of them answered and obviated,” etc.

This pamphlet had a good effect. Gov’r. Thomas was so pleas’d with the construction of this stove, as described in it, that he offered to give me a patent for the sole vending of them for a term of years; but I declin’d it from a principle which has ever weighed with me on such occasions, viz., That, as we enjoy great advantages from the inventions of others, we should be glad of an opportunity to serve others by any invention of ours; and this we should do freely and generously.

An ironmonger in London however, assuming a good deal of my pamphlet, and working it up into his own, and making some small changes in the machine, which rather hurt its operation, got a patent for it there, and made, as I was told, a little fortune by it. And this is not the only instance of patents taken out for my inventions by others, tho’ not always with the same success, which I never contested, as having no desire of profiting by patents myself, and hating disputes. The use of these fireplaces in very many houses, both of this and the neighbouring colonies, has been, and is, a great saving of wood to the inhabitants. (From Franklin’s Autobiography.)

And I also note that my colleague Jason Jackson and the team at Open Folklore have exciting news of their own.

Weekend Watching

A lovely history of IBM by Errol Morris with music by Philip Glass. Worth it for the archival film footage alone:

Structure and Interpretation of Computer Programs

The influential computer-science text Structure and Interpretation of Computer Programs by Abelson, Sussman, and Sussman is available on-line, along with a range of teaching aids. Go MIT Press!

I am considering using some parts of the text to introduce, at the very least, the idea of computing in my seminar surveying the digital humanities. I know I want to focus on some basic tools, including perhaps some exposure to Python, and, yes, there is always John Zelle’s Python Programming: An Introduction to Computer Science, which is still available for download in its 2002 incarnation here (careful, that’s a link to a 1.3MB PDF). Still, it’s nice to have options and to be able to offer students different explanations for the same concepts. (I know I need it when it comes to some aspects of computer science.)

Here is the table of contents.

Machine Learning for Human Memorization

A machine learning researcher, Danny Tarlow, has come up with a way to describe his problem in competitive Scrabble in programming terms. Here’s a link to the post, and here’s his rough description of the problem:

As some of you know, I used to play Scrabble somewhat seriously. Most Tuesdays in middle school, I would go to the local scrabble club meetings and play 4 games against the best Scrabble players in the area (actually, it was usually 3 games, because the 4th game started past my bedtime). It’s not your family game of Scrabble: to begin to be competitive, you need to know all of the two letter words, most of the threes, and you need to have some familiarity with a few of the other high-priority lists (e.g., vowel dumps; short q, z, j, and x words; at least a few of the bingo stems). See here for a good starting point.

Anyway, I recently went to the Toronto Scrabble Club meeting and had a great time. I think I’ll start going with more regularity. As a busy machine learning researcher, though, I don’t have the time or the mental capacity to memorize long lists of words anymore: for example, there are 972 legal three letter words and 3902 legal four letter words.

So I’m looking for an alternative to memorization. Typically during play, there will be a board position that could yield a high-scoring word, but it requires that XXX or XXXX be a word. It would be very helpful if I could spend a minute or so of pen and paper computation time, then arrive at an answer like, “this is a word with 90% probability”. So what I really need is just a binary classifier that maps a word to probability of label “legal”.

Problem description: In machine learning terms, it’s a somewhat unique problem (from what I can tell). We’re not trying to build a classifier that generalizes well, because the set of 3 (or 4) letter words is fixed: we have all inputs, and they’re all labeled. At first glance, you might think this is an easy problem, because we can just choose a model with high model capacity, overfit the training data, and be done. There’s no need for regularization if we don’t care about overfitting, right? Well, not exactly. By this logic, we should just use a nearest neighbors classifier; but in order for me to run a nearest neighbors algorithm in my head, I’d need to memorize the entire training set!
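To make the trade-off concrete, here is a minimal sketch of the kind of compact classifier Tarlow is asking for. It is not his method: it is a plain naive Bayes model over letter positions, chosen only because its parameters (one small table per letter position) are the sort of thing one could imagine carrying around on paper. The word list `LEGAL` is a tiny made-up stand-in for the real 972 three-letter words, and the function names `train` and `prob_legal` are my own.

```python
import math
from itertools import product
from string import ascii_lowercase

# Hypothetical toy stand-in for the real list of legal three-letter words.
LEGAL = {"cat", "cot", "cut", "bat", "bit", "but", "hat", "hit", "hot", "hut"}


def train(legal, length=3):
    """Count letters by position for legal and illegal strings.

    Every possible string of the given length is labeled, because the
    full input space is known in advance (as in the Scrabble problem).
    """
    # Start every count at 1 (add-one smoothing) so no probability is zero.
    counts = {
        label: [dict.fromkeys(ascii_lowercase, 1) for _ in range(length)]
        for label in (True, False)
    }
    totals = {True: 0, False: 0}
    for letters in product(ascii_lowercase, repeat=length):
        word = "".join(letters)
        label = word in legal
        totals[label] += 1
        for position, letter in enumerate(word):
            counts[label][position][letter] += 1
    return counts, totals


def prob_legal(word, counts, totals):
    """Return the naive Bayes estimate of P(legal | word)."""
    scores = {}
    for label in (True, False):
        # Unnormalized log prior plus per-position log-likelihoods.
        score = math.log(totals[label])
        for position, letter in enumerate(word):
            # Each position's smoothed counts sum to totals[label] + 26.
            score += math.log(counts[label][position][letter] / (totals[label] + 26))
        scores[label] = score
    # Normalize in a numerically stable way.
    top = max(scores.values())
    weights = {label: math.exp(score - top) for label, score in scores.items()}
    return weights[True] / (weights[True] + weights[False])


if __name__ == "__main__":
    counts, totals = train(LEGAL)
    for candidate in ("cat", "hot", "zzz"):
        print(candidate, prob_legal(candidate, counts, totals))
```

With a toy list this small the absolute probabilities stay low, because almost every three-letter string is illegal and the prior dominates. The point is only the shape of the solution: a model small enough to memorize, traded against the accuracy that full memorization would give.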

At some point, I really do need to figure out why our Windows 7 machine won’t sleep. (Yes, we have a Windows machine: it’s in our kitchen as a place for our daughter to do occasional bits of homework that require a computer, for all of us to look things up, for cooking with Pandora playing, and, yes, for me to play the occasional game.) When I get around to figuring things out, I should probably start here.

The things you end up teaching yourself

One of the applications to which we were introduced at the NEH Institute on Networks and Networking in the Humanities — which goes by the hash tag nethums by the way — was a Carnegie-Mellon application called ORA. It and its companion application, AutoMap, are very useful tools for network analysis and visualization.

My difficulty with the applications was simply in getting them to run on my MacBook Pro. The problem was, and is, that ORA, AutoMap, and their installers require an older version of Java than is included with Mac OS X 10.6. With 10.6, Apple dropped the versions of Java 1.4 and 1.5 that they had been carrying and only provided 1.6. Java 1.4 is still available, but navigating Oracle’s site to get it, and getting it onto my MacBook, was a longer road than I wanted to travel.

Now that I am back home, I got the good word that ORA had been updated. Great news! I headed over to the site only to learn that the Windows and Linux versions had been updated to version 2.2.2 but the Mac version was still back at 1.6.9.

Sigh.

Two routes now lay open to me, if I wanted to run one of the newer versions on my Mac:

  1. Pick up a copy of VMWare Fusion or Parallels and run either Windows or Linux in a virtual machine, or
  2. Determine if there was a way to run the Linux application on Mac OS X (which is also a certified *nix now).

I had just spent a fair amount of money on corpus linguistics texts (I’m working on refining a notion of “corpus folkloristics”), and so the idea of spending more money on virtualization software as well as on a copy of Windows was less than appealing. (I am already about to buy a copy of Windows 7 for our home desktop, but Microsoft offers no family pack the way Apple does, and so multiple copies of Windows are a little out of my price range for now.)

So, let’s go with the second option: run Linux apps on my Mac.

A page on Simple Help promised me a complete walkthrough of the process, the first step of which is getting Fink on my MacBook. (I had been using MacPorts before upgrading to 10.6, but the upgrade had broken it and so I was okay switching to Fink.)

Oops, no binary installer for 10.6. I was going to have to install it from source. Luckily, the Fink Project has a page up that walks you through installing from source. It does a pretty good job of getting you through everything, and it even tells you to run:

/sw/bin/pathsetup.sh

which would suggest to a command-line novice like me (I’m not quite a noob!) that, well, my path is going to be set up for me, which makes it all the more maddening when you enter:

fink selfupdate

and get the command not recognized response. Uh oh. And so I double-checked my PATH environment:

echo $PATH

and got all the usual suspects:

/sw/bin:/sw/sbin:/opt/local/bin:/opt/local/sbin:  
/usr/local/mysql/bin:/opt/local/bin:/opt/local/sbin:  
/usr/local/bin:/usr/local/subversion/bin:/usr/bin:  
/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:  
/opt/local/bin:/usr/local/git/bin:/usr/X11R6/bin

What’s going on? I closed the terminal and started doing some reading up on editing my PATH when I decided to double-check my work and ran fink selfupdate again. What do you know, it worked! Here’s the trick: I forgot to follow the directions and open a new terminal window after the initial installation.
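For the curious, the two checks I was doing by eye (what is actually on my PATH, and can a given command be found there) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names `dedupe_path` and `locate` are made up for this post and have nothing to do with Fink itself, and the lookup only reflects the environment of the process running it, so it says nothing about what a newly opened terminal would see.

```python
import os
import shutil


def dedupe_path(path_string, sep=os.pathsep):
    """Return PATH entries in order, dropping empty and repeated directories."""
    seen = set()
    ordered = []
    for entry in path_string.split(sep):
        entry = entry.strip()
        if entry and entry not in seen:
            seen.add(entry)
            ordered.append(entry)
    return ordered


def locate(command, path_string=None):
    """Return the full path to `command`, or None if no PATH directory has it.

    When path_string is None, shutil.which falls back to the PATH of the
    current process.
    """
    return shutil.which(command, path=path_string)


if __name__ == "__main__":
    for directory in dedupe_path(os.environ.get("PATH", "")):
        print(directory)
    print("fink found at:", locate("fink"))
```

Run against the PATH above, the first helper would at least have shown how many times /opt/local/bin and /opt/local/sbin had been added over the years.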

And so I taught myself to follow directions.

Magic Is Now Here

Adobe’s John Nack posted the following video on his blog revealing a new “Content-Aware” healing and deletion feature in Photoshop CS5. I don’t do so much with Photoshop that I typically need to upgrade (I only went from CS1 to CS3 for the Intel compatibility), but this new functionality, no, this new magic, is amazing:

A Cyborg Composer?

While I am writing about new forms of creativity, I would also like to point out this terrific profile of UC Santa Cruz emeritus professor David Cope. Cope is the inventor of Experiments in Musical Intelligence (EMI, or “Emmy”), which was well received by some but made others uncomfortable with the questions it raised about human creativity. The short answer for me is that all the formulas Cope entered into Emmy were clearly based on work done by humans, but I don’t know entirely how Emmy works. Cope is about to release a successor to Emmy, known as Emily Howell. Two compositions by Emily are included in the article. They make for an interesting listen.

Emily Howell Sample Composition

Linguists Agree to Publish Data

My friend Jason Jackson passes on the news that at the annual meeting of the Linguistic Society of America, the following resolution was passed:

Whereas modern computing technology has the potential of advancing linguistic science by enabling linguists to work with datasets at a scale previously unimaginable; and

Whereas this will only be possible if such data are made available and standards ensuring interoperability are followed; and

Whereas data collected, curated, and annotated by linguists forms the empirical base of our field; …

Therefore, be it resolved at the annual business meeting on 8 January 2010 that the Linguistic Society of America encourages members and other working linguists to:

  • make the full data sets behind publications available, subject to all relevant ethical and legal concerns; …
  • work towards assigning academic credit for the creation and maintenance of linguistic databases and computational tools; and
  • when serving as reviewers, expect full data sets to be published (again subject to legal and ethical considerations) and expect claims to be tested against relevant publicly available datasets.

Goodbye, InfoBits, and Thanks

The last issue of InfoBits was published this month. While I was never a heavy user of the service/bibliography, it was always nice to know it was there, to have it there. Perhaps this marks the end of one era of computing/IT in the humanities, or perhaps it simply reveals how much such things are functions of particular individuals (to whom we later recognize we owe a debt), or perhaps it reveals only a particular moment in the funding of higher education in the U.S. No telling which way to read these tea leaves.

Tea Leaves

The Man Behind Fictional UIs

Mark Coleran designs UIs (user interfaces) for the movies. You’ve seen his work in the various Bourne movies, in the Lara Croft movie, and in a number of other places.

Coleran's UI for "Tomb Raider"

He has collected the various UIs on a single page on his website, and it’s a great place to go for inspiration, both when you are trying to design an interface and when you are just trying to sketch out the structure of a problem. (Sometimes how you look at data helps you to imagine what your data is.)

Vygotsky and Coding

At the recent Microsoft Professional Developers Conference, an all-star cast of coding greats was convened for a panel on “Microsoft Perspectives on the Future of Programming.” Among other things, Butler Lampson, Erik Meijer, Don Box, Jeffrey Snover, Herb Sutter, and Burton Smith discussed the improvement in IDEs (integrated development environments) and in various languages, and how making coding easier, or at least less likely to fail, also means people not knowing everything they should in order to become great. One contributor likened it to anti-lock brake systems: “Now you don’t have to be a great driver to perform well in snow. You just mash the brakes and the anti-lock system does all the heavy lifting for you and it pumps much faster than you ever could. It’s just, in my view, a case where computers actually help you think less. It’s like what Vygotsky in activity theory distinguishes between your performance and your competence.” The video is here, and the statement is right at 40:00 in. Check it out.

Tattoo You

The Text Analysis Developers Alliance has released an embeddable Flash widget which provides embedded TAPOR analytics for the page on which it resides.

Here’s an example of the embedded widget:

Oh, yeah, that tattoo is short for Text Analysis TOOls. (Actually, it gets even worse, but I’m too embarrassed to repeat their version.)