Mapping Metaphor

The Metaphor Map of English is now available. It’s a fascinating labyrinth, especially if you have any interest in conceptual metaphors. Here’s what the map’s developers have to say:

The Metaphor Map of English shows the metaphorical links which have been identified between different areas of meaning. These links can be from the Anglo-Saxon period right up to the present day so the map covers 1300 years of the English language. This allows us the opportunity to track metaphorical ways of thinking and expressing ourselves over more than a millennium; see the Metaphor in English section for more information.

The Metaphor Map was built as part of the Mapping Metaphor with the Historical Thesaurus project. This was completed by a team in English Language at the University of Glasgow and funded by the Arts and Humanities Research Council from 2012 to early 2015. The Metaphor Map is based on the Historical Thesaurus of English, which was published in 2009 by Oxford University Press as the Historical Thesaurus of the Oxford English Dictionary.

From CSV to Bipartite Network to One-Mode Projection

I am continuing my effort to develop my own stack of scripts that do exactly what I want and whose workings I understand. They are not, in all honesty, scalable like the work done by Tim Tangherlini, but my work here pleases me.

I don’t yet have weighting in place, so the projection carries less meaning than it could, but this code uses pandas and works quite well:

#! /usr/bin/env python

import pandas as pd, networkx as nx, matplotlib.pyplot as plt
from networkx.algorithms import bipartite

# Build lists of nodes and edges:

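# Assumes each row of the file is a text label followed by its words.
df = (pd.read_csv('tales-01.txt', header=None)
    .groupby(level=0) # One group per row, so the lambda sees a one-row frame
    .apply(lambda x : pd.DataFrame ([[x.iloc[0,0],v] for v in x.iloc[0,1:]]))
    .reset_index(drop=True)
    .dropna() # Drop the padding left by rows shorter than the longest
    .rename(columns={0: 'text', 1: 'word'}))
edges = df.values.tolist()
nodes_0 = list(set(df['text'].values.tolist()))
nodes_1 = list(set(df['word'].values.tolist()))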

# Build a bipartite graph:

B = nx.Graph()
B.add_nodes_from(nodes_0, bipartite=0) # Add the node attribute "bipartite"
B.add_nodes_from(nodes_1, bipartite=1)
B.add_edges_from(edges) # Connect each text to the words it contains

# Project one side of the graph:

G = bipartite.projected_graph(B, nodes_1)
nx.draw(G,
        with_labels = True,
        node_color = '#00CCFF')

# Choose your output:
plt.show()
# plt.savefig("graphing.png", dpi=300)

For those less familiar with bipartite networks, Wikipedia, as always, has a decent introduction.
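On the weighting I mentioned above, here is a minimal sketch of one way it might go, not something I have wired into the script yet. It assumes networkx's `bipartite.weighted_projected_graph`, which by default weights each projected edge by the number of neighbors the two nodes share (here, the number of texts in which two words both occur). The toy edge list and the `shared_texts` helper are made up for illustration; they stand in for the rows of the CSV file.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical toy data: (text, word) pairs standing in for the CSV rows.
edges = [
    ("tale-1", "ghost"), ("tale-1", "house"), ("tale-1", "light"),
    ("tale-2", "ghost"), ("tale-2", "house"),
    ("tale-3", "ghost"), ("tale-3", "light"),
]
texts = {t for t, _ in edges}
words = {w for _, w in edges}

# Build the bipartite graph: texts on one side, words on the other.
B = nx.Graph()
B.add_nodes_from(texts, bipartite=0)
B.add_nodes_from(words, bipartite=1)
B.add_edges_from(edges)

# Project onto the words; each edge weight counts the texts two words share.
W = bipartite.weighted_projected_graph(B, words)

def shared_texts(graph, a, b):
    """Return the number of texts two words share (0 if they never co-occur)."""
    return graph[a][b]["weight"] if graph.has_edge(a, b) else 0

# List the word pairs, strongest connections first.
for a, b, w in sorted(W.edges(data="weight"), key=lambda e: -e[2]):
    print(a, b, w)
```

The weights could then drive the drawing, for example by passing a list of them as the `width` argument to `nx.draw`, so that words sharing many texts are joined by thicker lines.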

re: Notebooks

Lab notebooks (323.365)

Over the years, I have made a number of posts about various dimensions of notebooks, but, really, the only point I want to make is: get one, keep one. I no longer try to keep projects in single notebooks (more on this in a moment), but I do keep a notebook with me at all times, and, when in doubt, whatever it is I need to record or I want to write/think about goes in there. I can always copy that material to some other location, but I cannot do that if it is lost to the vagaries of time.

When I am working in Python, I am working inside an IPython notebook, which, like the script pane in RStudio, allows you to write and run code in a way that also lets you keep track of what you have done. This is different from working in an IDE, where your sole focus might be developing a piece of code. In many instances, scientists and scholars are interested in what a particular piece of code does to a particular piece/stretch of data. In my case, I am still learning so much about the interaction of code and data, and I need to take notes about not only what I thought I was doing but also what I wish I could do.

This is a lot like a lab notebook. As Dutch data scientist Jeroen Janssens notes: “Doing research is hard. Recalling which steps you’ve taken, and why, is even harder. To be an effective researcher, you may want to keep a laboratory notebook. Besides having a record of your steps and results, this also allows you to improve reproducibility, share your research with others, and, yes, think more clearly. So, why wouldn’t you keep a notebook?” Lab notebooks are important for their ability to track your thinking.

That noted, and in contrast with the usual advice for lab notebooks, when I am working on a project, I tend to use pads of paper or looseleaf paper: for the record, I use either engineering paper or law-ruled paper with wide left margins, and I will almost always have a pad of one or the other out on my desk when I am working. When I am done for the moment, and I am prepared to take the project off my work surface, I take the various sheets of paper I have generated and place them in a folder.

I try to keep folders between a half-inch and three-quarters of an inch in thickness. Above that and it gets too hard to find something quickly, which is the whole point of putting things in a folder and of filing systems in general. If that means I have to spend five to fifteen minutes thinking about how to break a burgeoning folder into two smaller folders, I am okay with the time spent. Like the time spent filing things, I regard it as an opportunity to review what I have done so far, what remains to be done, and what, if any, changes in direction need to be undertaken. I’m at that point right now in a project: I didn’t realize it until I wrote that sentence, but there’s been some slight friction in getting things done, and it’s because the folder has gotten too big, too sloppy.

When it comes time to archive a project, I don’t mind big, sloppy folders. Sometimes, in fact, I’ll take several smaller folders and empty them into one, re-write the label (and this is why I write labels in pencil), and then put the thing in the box, or drawer, of projects done. If I ever need to work through that material again, then plowing through a pile of paper is just one way to refresh my memory.

For participants in my courses reading this, maybe because they’ve decided to find more about me or maybe because I’ve told them to look this post up, the TL;DR version of this post is this:

  • Spiral-bound notebooks are wrong for a number of reasons: it’s too tempting to tear out a page if you think you’ve made a mistake. (Keep your mistakes: it’s part of learning.) It’s also tempting to tear out a page when you need a blank piece of paper: no one wants your ratty tassels of paper. It’s also tempting to think that you can fit everything within a given space of a notebook, especially those big, stupid “multi-subject” notebooks. There is “no one notebook to rule them all.” (I’m betting even Sauron didn’t have a multi-subject notebook.) You’re going to feel like an idiot when you run out of paper halfway, two-thirds, or three-quarters of the way through the semester.
  • Life breathes in and out. Get a capture system that does too. Pads of paper, or loose-leaf paper, kept so that they don’t get beaten up as you take them in and out of your bag, are the way to go. Write as much as you want whenever, and wherever, you want during the day. Stack all your paper in a common folder for the day, and then, at the end of the day, you can parse it into folders that represent courses, subjects, or projects. (Maybe you can even have one called “just for me.” Think about it.) As you sort the notes from the various events that filled your day (classes, meetings, etc.), you also get a chance to review your day, go over what’s important, and remind yourself of the things that need to get done, even writing them down in a calendar or todo list. This review process, almost everyone agrees, is central to getting things done.

DH Positions

I’ve seen a number of digital humanities positions of late, some running centers, and a lot based in libraries. I’m interested, but I think I need more credentials, the old-fashioned kind that come with publishing in established venues, before I apply to them. I really like the look of the Penn State position, and the Purdue job is equally compelling. Both are good universities with track records for thinking about the long term.

Film Scanners

Having recently embarked upon the task of transferring all my MiniDV footage to iMovie and the MP4 format before the tapes themselves go bad or the little Sony camera no longer functions (more on the lost nature of my Sony MiniDiscs some other time), I find myself wondering about the boxes of slides and film negatives also in my possession, some of which hold either memories quite dear to me or material that could serve my own research and teaching or that of others.

To address this issue, I started looking into the current state of film scanners and what the pricing looks like. B&H has a nice survey, which runs the gamut in price and quality of scan (the two are closely tied, of course). I wish it were otherwise, but it looks like the lower-end Plusteks, which run about $300 or so, are about where I am headed. Does anyone have any advice? Some of my university’s units used to have film scanners, but I don’t know how well they’ve been maintained over the years, and in at least one case, it was a SCSI device. (Good luck finding a connector for that these days: I tried recently, out of curiosity, to revive the first external hard drive I ever owned, a LaCie with, I think, a whopping 10MB inside something the size of a cigar box.)