# Semi-Automated

As part of our hand editing of the TED talk data we had to retrieve missing information for, luckily, a small subset of the speakers. This meant Kinnaird splitting off two CSVs, one for the TED main event speakers and one for the other TED-sponsored event speakers, and then me trudging row by row and cell by cell, working back and forth between the CSV and a web page. Copied and pasted, and two CSVs filled in. Yes.

Then it was time to fold these filled-in rows back into the main CSVs from whence they came. Each smaller CSV had between 15 and 20 rows, so it didn’t seem like a task worthy of firing up a Python session and writing something in pandas to replace the rows with missing information with the filled-in rows.

I started doing the work by hand: copy a row from the missing.csv, paste it below the matching row in the speakers.csv, and then delete the matched row. Oi! Sure it was only 17 rows, but, still, there has to be a somewhat faster way!

So I decided to merge the two files using cat and then simply finding the dupes in Easy CSV Editor and deleting the row with missing data. Semi-automated?
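For the curious, here is roughly what the fully automated version I talked myself out of might look like. It is a minimal sketch using only Python's standard library rather than pandas, and it rests on assumptions: the column name `speaker`, the sample rows, and the function name `fold_in_rows` are all made up for illustration, and it assumes each speaker appears only once in each file.

```python
import csv
import io


def fold_in_rows(main_rows, filled_rows, key):
    """Return main_rows with any row whose key matches a filled-in row replaced by it."""
    filled_by_key = {row[key]: row for row in filled_rows}
    # Keep the original order; swap in the filled-in row where one exists.
    return [filled_by_key.get(row[key], row) for row in main_rows]


# Hypothetical stand-ins for speakers.csv and missing.csv.
speakers_csv = "speaker,occupation\nA. Example,\nB. Sample,Writer\n"
missing_csv = "speaker,occupation\nA. Example,Physicist\n"

speakers = list(csv.DictReader(io.StringIO(speakers_csv)))
filled = list(csv.DictReader(io.StringIO(missing_csv)))

merged = fold_in_rows(speakers, filled, key="speaker")
for row in merged:
    print(row["speaker"], "|", row["occupation"])
```

With real files you would swap the `io.StringIO` objects for opened CSV files and write the result back out with `csv.DictWriter`. For 17 rows, cat and a CSV editor is still probably faster.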

# Found note

Found with the date 26 July 2016:

Books/master notes for a class on “Folklore and Psychology” as well as, looking backwards, perhaps the same thing for Louisiana Folklore. The idea being that the book would also be interactive with questions and guided experiences as well as case studies.

# Splitting Wood

As I continue to observe the maelstrom of negativity and falsehoods that is Facebook, I still want to make notes about things that happen. And I want to be able to share those notes. And then I remember that I have this blog, which is what web logs, or blogs, were supposed to be before they turned into self-publishing platforms and the key to modern success.

I am not yet decided on how much I want to reclaim this particular domain — my own name (jl.o) — or some other space where I don’t feel responsible for hosting certain pages which have become mainstays, seemingly, on the web. On the one hand, this was once my “everything that doesn’t have any other place to go goes here” space. And a chunk of that stuff was about my daughter when she was young, but then the internet got creepy and I shifted from talking about her in what I now understood was probably an all too public forum. At the same time, as blogs “came of age” and became vehicles for the blossoming of personalities, some of whom became celebrities — e.g., John Gruber or Merlin Mann — I became increasingly concerned about “managing my brand.” That this blog was a space for me to demonstrate my professional abilities and to discuss professional interests.

And then I started tracking my experiments with computational matters and suddenly this thing got popular. Other people wanting to experiment with Python and/or with thinking about texts as data were searching for things and they found a post or two of mine that were helpful and they must have told people about them because suddenly this thing had something of a readership. It freaked me out so much that I froze like the proverbial deer in headlights and stopped publishing.

And now those pages that people found useful then are still being found useful, but I haven’t tracked my voyage, and discoveries, since then, and now it feels all weird to come back to this, especially since I have Evernote for web capture and Bear for everything else, including capturing all those stray thoughts that shoot through my head like neutrinos making their way across the solar system. But both of those applications somewhat obscure where your data is — in order, I think, to make sure you don’t mess with it outside the app and possibly corrupt the sync process.

There is, I think, something remarkably re-assuring about writing all my notes in plain text — structured with some version of markdown — and storing them in plain files or in a widely-known data structure like SQL. An ideal format, to my mind, would be something like FoldingText as the UI and MySQL on the backend with a blog an easy offshoot and one simply tags, or otherwise indicates which posts are public — it would have to be a choice each and every time.

Part of all this is, I admit, not only a response to the way matters are developing on Facebook but also my own preference not to give over my data to someone else so that they can then monetize it. That is, by using Facebook to stay in touch with family and friends, instead of other means, I’m allowing the company to profit from my relationships. That was acceptable, to some degree, when it was a happier place, but now I find that the dark side has emerged, and it has me not only walking away from the platform, but also considering walking away from some relationships.

So, this is not only about taking a break from at least one form of social media, but also about re-focusing my own energies and making my writing my own and finding positive places in which to publish it.

I did this thinking, by the way, while splitting wood, using a maul and wedge given to me by my stepfather and an old hatchet I had lying around the house. There’s no better time to focus than when trying to follow the grain of a log, especially when you find you’ve driven a wedge into an unsplittable natural joint in the wood:

The wispy shadows of hair in the lower left are my child, still finding her way into this blog, who took this photo for me as I stood nearby, somewhat hunched over and breathing hard …

… I guess I need to split more wood.

# Why I Hate Moodle

Or at least my university’s implementation of it.

Let me begin with two assertions about what I see as the strengths of the web, so that people who see things another way do not need to bother themselves with either reading further or arguing with me.

The first thing I want to note about the web is something I, and thousands of others, have observed in countless other ways and places and that is that the web is the platform without parallel for the delivery of content. Let me emphasize content, which I do over and against the delivery of an experience. The content itself may involve the user (or viewer or reader or listener) in some kind of experience, but the web itself is less about the delivery of experiences.

The second thing I want to observe about the web reveals my age: the web is at its best when it is semantic, when the way content is structured is part and parcel of its meaning. And I mean semantic in a deep sort of way, with UX/UI at the surface but reaching all the way down to <tag>s.

So, let me walk you through the way Moodle is set up at my university and you can begin to understand why I think it’s anathema to the promise of the web. And we can begin with the way I begin, which is to click on a link to a course that I am teaching in order to manage some aspect of it:

There are two things that I find difficult to accept with this: first, the content, the actual content of the course, is squashed between a whole lot of navigation and other matters that amount to little more than unnecessary cognitive overhead. Sure, I could customize the interface to get rid of all the extraneous blocks, but I use the default setup because it’s what I see most, if not all, of my students using and their experience of the course is my concern. If I design things based on my tweaked-out setup, and those things do not look the same for them, then I have failed them.

The second thing seems obvious to me: I’m a teacher. I’m coming to Moodle to do things, but in order to do things I have to click a button. And I can’t tell you the number of times I have scrolled down the page to start something only to realize I have to scroll back up, click on the edit button, and, then, scroll back down to the section I want to edit and click on the Add an activity or resource button.

And speaking of too much scrolling and clicking, when you do click on the Add button, you are greeted with the following pop-up:

Congratulations if you want to add one of a dozen of Moodle’s “activities” designed, one supposes, to “enhance the educational experience” — because what undergraduate doesn’t want to use Hangman, or a Hidden Picture!, to learn about speciation or topic modeling? So more scrolling in order to get to ways to add actual content: URLs, pages, files, etc.

Perhaps the most fundamental, the most basic form of content there is is a web page. Setting aside that this web page is a column squished between a whole lot of other material, if you attempt to paste it into a text box, your formatting options look like this:

Forget meaningful things like H1 headings or passages of code, because you aren’t getting them here. For a while, if you dug deep enough into Moodle’s bowels you could enable a Markdown filter, so that you could write and maintain pages as semantic plain text, but they have moved that switch around so much that it’s clear they don’t want you to write structured prose, just roll back to the 1980s and WordPerfect for DOS and stick to one-off formatting of text.

Moodle is ugly, takes too many clicks to do anything meaningful, and it undoes everything that was once semantic about the web. Which is kind of like Facebook, which I guess makes sense.

# David Rumsey Map Collection

The David Rumsey Map Collection is a pretty impressive accomplishment. According to the site, the collection “contains more than 150,000 maps. The collection focuses on rare 16th through 21st century maps of North and South America, as well as maps of the World, Asia, Africa, Europe, and Oceania. The collection includes atlases, wall maps, globes, school geographies, pocket maps, books of exploration, maritime charts, and a variety of cartographic materials including pocket, wall, children’s, and manuscript maps. Items range in date from about 1550 to the present.”

# Rooth 1980: “Pattern Recognition, Data Reduction, Catchwords and Semantic Problems”

If, like me, you are committed to finding prescient work in the realm of computational approaches to the humanities, it means you are often tracking down somewhat difficult to find volumes and quickly photocopying an article or two while you still have the volume in your hands. Anna Birgitta Rooth’s “Pattern Recognition, Data Reduction, Catchwords and Semantic Problems” is one such article, and the PDF I am making available has been OCRed.

# The Mathematics of Arches

Arches are part of the design feature set of our house — so is a mansard roof, but I am not as keen to replicate it — and as I add or replace various features on the house, I would like to add the same kind of flattened arch that features on the facade of the house and in some of the cabinetry. For that, I need math. In particular, given the width of a given opening and how high I would like the arch to be, I need to be able to calculate the length of material of the resulting arch.

For those who missed this particular part of geometry, here are the parts involved:

For the math, we need the following:

(x - x[0])^2 + (y - y[0])^2 = r^2


Or:

x[0] = c/2

y[0] = (s - x[0]^2/s) / 2

r^2 = x[0]^2 + y[0]^2

Y = y[0] + sqrt(r^2 - (x - x[0])^2)
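To make the formulas easier to play with, here is a small Python sketch. It assumes that c is the width of the opening and s is the height of the arch at its center, with the arch starting at (0, 0) and ending at (c, 0). The arc length step is my own addition to the equations above: it uses the standard relation that the arc's central angle is 4·atan(2s/c), so the length of material is r times that angle. The function names are just for illustration.

```python
import math


def arch_geometry(c, s):
    """Return (x0, y0, r, arc_length) for an arch of width c and rise s."""
    x0 = c / 2                      # center sits midway across the opening
    y0 = (s - x0**2 / s) / 2        # negative for a flattened arch (center below the springline)
    r = math.hypot(x0, y0)          # r^2 = x0^2 + y0^2
    theta = 4 * math.atan(2 * s / c)  # central angle subtended by the arch, in radians
    return x0, y0, r, r * theta


def arch_height(x, c, s):
    """Height Y of the arch above the springline at horizontal position x."""
    x0, y0, r, _ = arch_geometry(c, s)
    return y0 + math.sqrt(r**2 - (x - x0) ** 2)


# Example: a 36-inch opening with a 6-inch rise.
x0, y0, r, length = arch_geometry(36, 6)
print(f"radius: {r:.2f} in, arc length: {length:.2f} in")
print(f"height at center: {arch_height(18, 36, 6):.2f} in")
```

A quick sanity check is the semicircle: with c = 2 and s = 1 the center lands on the springline, the radius is 1, and the arc length comes out as π.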


# Bookends Not Importing RIS Files

I recently tried to import a RIS file I had downloaded from JSTOR into Bookends (13.1.5). I selected the RIS filter in the dialogue box and clicked okay:

But the RIS files are grayed out:

# The Room in Which I Work

The room in which I work is not part of our home’s heating and cooling system. It was once simply a space between the house and the detached garage that a previous owner of our forty-year-old house decided to enclose to make it possible to bring in groceries while not getting rained on. It measures 89 inches wide by 101 inches deep for a total of 8989 square inches or 62 square feet. (That’s a little under 6 square meters for my European friends.) The hallway between the garage and the house is about the same size. To be clear, whoever had this space built was no fool, for the space doesn’t seem small, thanks to a large skylight and a large sliding glass door, which open the space to the world. And being so small makes it fairly simple to heat on cold and gray winter days: a cheap little heater from a big box store usually does a reasonable job.

And, too, I am fortunate enough that I can work almost anywhere these days. All I really need are my computer, and, for noisier environments, a pair of headphones or earbuds that, plugged into my phone, can block out most distractions. I am not keen on fighting volume with volume, though, and I prefer quiet spaces over noisy ones for working.

Working in such a small space means I have a kind of physical limit to my impulse to collect things. As much as I might like to accumulate piles of books and papers and memorabilia, I cannot. There is no room for it. In fact, with so little room, a certain minimalist mindset has slowly crept into my aesthetic, which, to be fair, has long been shaped by the modernist impulses of my childhood homes. The result is a kind of slow inculcation of a resonance to this space that makes me want to work within it.

Over time, I have also slowly succumbed to the dictates of this space by dispensing with any of the ordinary furniture with which I might fill it. The only furniture here that I have not built is the chair. The shelves, the desk, the monitor stand were all custom built so as to take up as little room as possible, and even now I am considering taking the two shelf units that are currently vertical, and thus taking up floor space, and stringing them up along the top of the wall like the other two units, leaving only the long narrow desk at which I work, and the chair, on the floor.

The only real problem with that plan is … files. Oof, folders of paper. Paper, paper, paper.

One thing I could do, I must admit, is to go through all that paper to determine what actually needs to be kept and what might be better kept and what can be tossed. Things like records that have to be kept are easy. What’s hard is those things which force a decision: what are the projects that are going to move forward and what are those projects which will, in all honesty, never leave the Someday pile? That is hard, because it also reveals the reality of time, of death, and my own nature.

There are so many projects which I have marked as “someday” which I really should have done, if only I had been better disciplined. Not only scholarly projects, but the notes for stories that I have not written. Pulling those folders out is like having to revisit so many of my own worst regrets, facing all the things about myself that disappoint me.

At the same time, letting those projects go might free up physical, and thus also mental, space to get new projects done…

# pip installation with sudo

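Quick CLI tip for installing Python packages with pip: sudo -H pip install packagename. The -H flag sets HOME to the target user’s (here, root’s) home directory, so pip’s cache does not end up as root-owned files in your own home folder.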

# 24-liter Daypack Comparison

I recently decided to upgrade from my Deuter Speedlite 20 as the backpack I took for day-long outings with my family. The Speedlite has served me well for four years now, and remains my go-to pack for any number of other purposes, but for longer hikes, especially on warmer days, it clings to my back, collecting heat and sweat. (Thankfully, it dries quickly.) Its one-inch wide hip belt is not terribly comfortable, and its size means I can’t quite pack everything I would like, especially if I’d like to make it possible for a traveling companion to carry nothing. I already have a Deuter Futura 32, and so I know that it is more than I wanted, so I set my sights on something in the mid-twenties, 24 or 26 liters.

As fellow backpackers know, there are at least two categories of bags, if not really three, that occupy the 20 to 30 liter capacity range: the light packs, the standoff packs, and the technical packs. In the light category are Osprey’s Talon series, Deuter’s Speedlite series, and Gregory’s Miwok series. All fine packs, and reasonably comfortable with their various corrugated foam and mesh back panels, but not as comfortable as their slightly heavier cousins in Osprey’s Stratos series and Deuter’s Futura series. (Gregory’s offerings in the 20 liter range are Salvos, and then they shift to the Zulu line.) There are other backpack makers, I know, but I already owned both Osprey and Deuter packs and they have been super reliable for me: I use an Osprey Momentum 22 to commute to work and an Osprey Porter 46 for travel. I was familiar with Gregory, having looked at a Miwok pack before settling on the Osprey Momentum.

Deuter, Gregory, and Osprey also appear to be the only ones focused on making day packs with light metal frames and mesh backs that are comfortable on days you sweat. And so my comparison shopping came down to the Gregory Salvo 24, the Osprey Stratos 24, and the Deuter Futura 24. All good bags, but a couple of them are handicapped by recent design choices. In the case of the Deuter, the hip belt pockets have recently been dropped, and the hip belts themselves somewhat shrunken. In the case of the Osprey, they have gotten rid of the roomy outer stuff pocket in favor of some weird vertical zippered pocket that everyone agrees is useless when the pack is full.

That left the Gregory Salvo 24. It offered everything I wanted: 24 liter capacity, a large central compartment with panel access, a padded hip belt, with pockets, and a stuff pocket on the front. But I was not comfortable making a decision without some comparison, and so I added the Deuter Futura 26 into the mix: it’s a slightly taller bag, and one of the issues here is my torso. As a six foot plus tall man with 34″ legs, I have a longer torso, and finding a pack that gets the straps far enough up my back to reach my shoulders comfortably is unreasonably difficult. The Deuter Futura 26 is built like bigger packs: it has a spindrift collar, a brain, and while it doesn’t have a front stash pocket, it does offer easy access to the main compartment via a zippered panel.

I wore both packs around the house with a gallon bucket of paint stashed inside: it’s bulky and heavy (and it was handy). Both packs seemed fine. I then took them out to a nearby park, again with the can of paint handy, and walked around with them. While the Deuter was a bit taller, it also felt like it was fighting me a little bit, and the Gregory just seemed more comfortable, which may in part be a function of the pack staying a little closer to my back. (This feature may become a bug, since obviously there will be less air between my back and the pack, but I cannot know that within the window I have to make a decision.)

So, in breaking with a long tradition of only owning Osprey and Deuter packs, with a couple of Timbuk2 shoulder bags, it looks like a Gregory is joining the family. I’ll post a photo from an upcoming hike as soon as I have one.

# New Book Thursday

Julia Flanders and Fotis Jannidis. 2018. The Shape of Data in Digital Humanities: Modeling Texts and Text-based Resources. Routledge.

Data and its technologies now play a large and growing role in humanities research and teaching. This book addresses the needs of humanities scholars who seek deeper expertise in the area of data modeling and representation. The authors, all experts in digital humanities, offer a clear explanation of key technical principles, a grounded discussion of case studies, and an exploration of important theoretical concerns. The book opens with an orientation, giving the reader a history of data modeling in the humanities and a grounding in the technical concepts necessary to understand and engage with the second part of the book. The second part of the book is a wide-ranging exploration of topics central for a deeper understanding of data modeling in digital humanities. Chapters cover data modeling standards and the role they play in shaping digital humanities practice, traditional forms of modeling in the humanities and how they have been transformed by digital approaches, ontologies which seek to anchor meaning in digital humanities resources, and how data models inhabit the other analytical tools used in digital humanities research. It concludes with a glossary chapter that explains specific terms and concepts for data modeling in the digital humanities context. This book is a unique and invaluable resource for teaching and practising data modeling in a digital humanities context.

For those of you thinking “Oh, no, Routledge. I can’t afford it”: you are correct. This was not the promise the internet made to knowledge distribution.

# AFS 2018

For those who have asked, below are links to the paper I gave at this year’s meeting of the American Folklore Society along with the slides and the handout (which was a version of the slides, so you don’t need both). As I catch up with everything on which I have fallen behind, I will post my notes about the conference itself in some fashion.

Here are: the paper, the slides, and the handout for “It’s about Time: How Folk Narratives Manage Time in Discourse.”

Abstract: Concluding his consideration of “Time in Folk-Narrative,” Bill Nicolaisen noted that the nature of human experience is centrally of time and that what marked genres of folk narrative, perhaps as much, or more, than anything else, was their management of time: “What must be stressed, however, is that in contrast to the concepts and realization of an extended present and of narrated time in the folktale, the dramatic comparisons made in the legend are designed to demonstrate the incompatibility of the two time frames, which exist as parallel systems” (318). Much of Nicolaisen’s efforts are focused on a careful compilation of how time is signaled, and thus managed, within the discourse of ten fairy tales drawn randomly from Thompson’s One Hundred Favorite Folktales. This paper revisits and extends Nicolaisen’s work, taking as its central task the careful attention to words used. Where Nicolaisen focused principally on the folktale, with occasional references to legend, this paper, part of a larger examination of legends in the current moment, uses a number of legends taken, first, from oral discourse, and then a number of legends found online. It follows this examination with a look at what, the paper itself argues, is the adjacent genre of the personal anecdote, sometimes also known as the personal experience narrative, in order to determine how a close examination of the management of time, in discourse, might reveal where the two genres converge or diverge, in hopes of finding a better way to model both and reliable discursive cues. Some of the methodologies deployed are computational in nature, beginning with forms of markup first explored by computer scientists Pustejovsky et al. and followed up by recent attempts to automate temporal signals in texts by David Elson. The current work seeks to re-imagine the pioneering work of Bill Nicolaisen, and before him Benjamin Colby, in light of recent developments in computational modeling of narrative with an especial focus on what that means for the study of genre.

Nicolaisen, William. 1978. Time in Folk-Narrative. In Folklore Studies in the Twentieth Century, 314-319. Ed. Venetia Newall. Rowman and Littlefield. (Available as a PDF.)