Seeing the XML Contents of a Word DOCX “File”

Ah, the file which is not one but is really a zipped directory of XML files and directories of XML files. In its defense, this isn’t uncommon for Mac applications, and I assume for other application-generated “files” on other operating systems as well. In case you have the bad luck of trying to understand what’s going on with a particular Word docx file, you can do the following in a terminal window:

cd path/to/your/
unzip file.docx -d file-content

Batch Converting DOCX Files

My students live in a Microsoft universe, for the most part. I don’t blame them: it’s what their parents and teachers know. And I blame those same adults in their lives for not teaching them how to do anything more powerful with that software, turning Word into nothing more than a typewriter with the ability to format things in an ad hoc fashion. Style sheets! Style sheets! Style sheets! As a university professor, I duly collect their Word documents, much as I would collect their printed documents, and I read them, mark on them, and hand them back. Yawn.[1]

Sometimes, just to play with them, I take all their papers and I mine them for patterns: words and phrases and topics that occur across a number of papers. You can’t do that with Word documents, so you need to convert them into something more useful. (And, honestly, much of what my students turn in could be done in plain text and we would all be better off.)

On a Mac, textutil does the trick nicely:

textutil -convert txt ./MyDocxFiles/*.docx

I generally then select all the text files and move them to their own directory, where, for some forms of mining, I simply lump them into one big file:

cat ./texts/*.txt > alltexts.txt

(I should probably figure out how to do the “convert to text” and “place in another directory” in one command line.)
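
One way to get there is a small loop that hands each file to `textutil` with an explicit `-output` path. This is only a sketch: `convert_docx_dir` is a hypothetical helper name, and the directory names in the usage line are the ones from the commands above.

```shell
# Convert every .docx in a source directory to plain text,
# writing the results into a separate destination directory.
# convert_docx_dir is a hypothetical helper name.
convert_docx_dir() {
  src=$1
  dest=$2
  mkdir -p "$dest"
  for f in "$src"/*.docx; do
    [ -e "$f" ] || continue            # skip if the glob matched nothing
    name=$(basename "$f" .docx)
    textutil -convert txt -output "$dest/$name.txt" "$f"
  done
}
```

With that defined, `convert_docx_dir ./MyDocxFiles ./texts && cat ./texts/*.txt > alltexts.txt` would do the conversion, the move, and the lumping in one line.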

pandoc can also do this, and I need to figure that syntax out.
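
As far as I can tell, the pandoc version comes down to naming the input format, the output format, and an output file for each document. A sketch under those assumptions, with `pandoc_docx_dir` as a hypothetical helper name:

```shell
# Convert each .docx to plain text with pandoc, one output file per input.
# pandoc_docx_dir is a hypothetical helper name.
pandoc_docx_dir() {
  src=$1
  dest=$2
  mkdir -p "$dest"
  for f in "$src"/*.docx; do
    [ -e "$f" ] || continue            # skip if the glob matched nothing
    pandoc -f docx -t plain -o "$dest/$(basename "$f" .docx).txt" "$f"
  done
}
```

Called as `pandoc_docx_dir ./MyDocxFiles ./texts`. pandoc hard-wraps plain-text output by default, so adding `--wrap=none` may be worth it if line breaks matter for the mining.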

  1. I also sit through their prettily formatted but also fairly substance-less PowerPoints — I’m not just picking on them here: I also work with them on making such presentations more meaningful. 


While Mac OS X’s Calendar and the iCloud service handle the back-end, I increasingly only use Fantastical for interacting with times, dates, appointments, and schedules. Not only is its natural language system very good — “Meet with this person next Friday from 2 to 2:30 in my office” — but it just looks good:

[Screenshot of Fantastical]

Shrink Preview Files

Macworld has a great tip on how to shrink Preview files without ruining image quality. Essentially, it entails navigating to:


and then copying the file `Reduce File Size.qfilter` to some place where you can edit it. No fear: it’s an XML file, which means you can produce multiple versions for different effects, making sure you give the versions different names so you can move them back into the `Filters` directory when you are done. (You will need to be able to authenticate to do so, just as you had to make a copy of the file elsewhere in order to work with it: Mac OS X does not want you editing its innards live.)

The parameters you are going to adjust are: Compression Quality and ImageSizeMax. The article has some good suggested values, which are a good place to start as you tweak things for your own benefit.

Cleaning Up Lion’s Launchpad

I confess that the application launcher I use most is [Quicksilver][qs]. Some of Quicksilver’s functionality has been available through Spotlight for years now, but if you have recently upgraded to Lion, then you know that Apple has also introduced a graphical app launcher, [LaunchPad][lp]. LaunchPad is very nice: it is accessible from the Dock and it has readily recognizable functionality precisely because it looks and acts just like iOS. The downside is that, also like iOS, it can get cluttered.

By default, Apple has placed all your installed applications in LaunchPad. One way to clean things up a bit is to pursue the iOS model of stuffing things into the iOS equivalent of folders. Another way is to wipe the slate clean and put exactly those apps you want in the LaunchPad.

**Warning**: *Command line goodness ahead.*

To proceed with the latter course, you will need to open a terminal window in your terminal app of choice (I use the built-in Terminal) and enter the following on one line:

sqlite3 ~/Library/Application\ Support/Dock/*.db "DELETE from apps; \
DELETE from groups WHERE title<>''; DELETE from items WHERE rowid>2;" \
&& killall Dock

The backslash `\` above simply denotes a line wrap.
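
Since those `DELETE` statements cannot be undone, it may be worth copying the database files somewhere safe first. A minimal sketch, assuming the same Dock directory as in the command above; `backup_dock_db` and the backup location are hypothetical choices of mine:

```shell
# Copy every Launchpad database file to a timestamped backup directory.
# backup_dock_db is a hypothetical helper name. The default source path is
# the one used in the sqlite3 command; the default destination is an
# arbitrary directory under $HOME.
backup_dock_db() {
  db_dir=${1:-"$HOME/Library/Application Support/Dock"}
  backup_dir=${2:-"$HOME/launchpad-backup-$(date +%Y%m%d-%H%M%S)"}
  mkdir -p "$backup_dir" || return 1
  # Print the backup location only if the copy succeeded.
  cp "$db_dir"/*.db "$backup_dir"/ && printf '%s\n' "$backup_dir"
}
```

Running `backup_dock_db` before the `sqlite3` line prints the directory holding the copies. Copying them back and running `killall Dock` should, in principle, restore the old layout.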