As I work on my paper for this year’s annual meeting of the American Folklore Society, I find myself treasuring one of a collection of offprints once sent to me by Bill Nicolaisen. I am pretty sure that others will find his work compelling, and the conference proceedings in which it appeared, Journées d’Études en Littérature Orale: Analyse des contes, problèmes de méthodes, are probably pretty hard to find. Here’s a PDF version. (The OCR is okay, not great: I’m working on an improved scan.)
Text Analytics APIs 2018: A Consumer Guide is $895 for a single user license. At 299 pages, that’s about $3 per page. The blurb notes that:
Robert Dale is an internationally-recognized expert in Natural Language Processing, with three decades of experience in academia and industry. With a PhD from the University of Edinburgh, he’s worked for Microsoft and Nuance, and he’s driven the development of SaaS-based NLP software for a startup. He has taught at the University of Edinburgh in the UK and at Macquarie University in Sydney, and presented tutorials and summer school courses around the world. He has over 150 peer-reviewed publications, including a comprehensive Handbook of Natural Language Processing, and the de facto textbook Building Natural Language Generation Systems.
I remember watching Connections in the 70s and feeling like this was the kind of knowledge, the kind of facility, I wanted to possess. And now you can see Connections for yourself thanks to Archive.org.
In his keynote at JupyterCon, Paco Nathan discusses the connections between an open society and open science. He refers to work by Vannevar Bush (and Jorge Luis Borges?) and Karl Popper’s The Open Society and Its Enemies in particular.
Katherine Kinnaird is very smart: listen for yourself.
I was interested in the data for the age of European populations, but I found myself more taken with the color scheme used in the visualization:
A full-sized version of the image is available on request — it’s really big. But the map is part of an article in The Lancet.
WikiArt Emotions is a dataset of 4,105 pieces of art (mostly paintings)
that has annotations for emotions evoked in the observer. The pieces of art
were selected from WikiArt.org’s collection for twenty-two categories
(impressionism, realism, etc.) from four western styles (Renaissance Art,
Post-Renaissance Art, Modern Art, and Contemporary Art). WikiArt.org shows
notable art in each category in a Featured page. We selected ~200 items
from the featured page of each category. The art is annotated via
crowdsourcing for one or more of twenty emotion categories (including
neutral). In addition to emotions, the art is also annotated for whether it
includes the depiction of a face and how much the observers like the art.
We do not redistribute the art (images); we provide only the annotations.
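
Since the release contains only annotations, not images, working with it is mostly a matter of tallying labels. A minimal sketch of that kind of tally, using hypothetical records with made-up field names (the actual release has its own file format and column names):

```python
from collections import Counter

# Hypothetical annotation records standing in for the real WikiArt Emotions
# files; field names here ("title", "style", "emotions") are illustrative only.
annotations = [
    {"title": "Impression, Sunrise", "style": "Modern Art",
     "emotions": ["happiness", "trust"]},
    {"title": "The Scream", "style": "Modern Art",
     "emotions": ["fear", "sadness"]},
    {"title": "Mona Lisa", "style": "Renaissance Art",
     "emotions": ["trust", "neutral"]},
]

# Tally how often each emotion label appears across the collection.
emotion_counts = Counter(
    emotion for record in annotations for emotion in record["emotions"]
)
print(emotion_counts.most_common(3))
```

The same pattern extends naturally to the face and liking annotations the description mentions: each is just another field to group or count over.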
If you’d like to know where you are at all times, here’s one map:
The range of transcription options has opened up considerably since I last considered the possibility of turning over some, but not all, transcription to software. It appears to be largely done in the cloud, with offerings from the following:
- Transcribe appears to be simply an on-line version of the mechanical transcription machines I used to use: load the audio and then type. The “automagic” version allows you to listen to the audio through a headset and then dictate it to the site, which will then transcribe. That’s interesting.
- f4transkript is another on-line service where you load your audio and then you do the typing.
If you’re interested in these traditional forms of transcription, wherein you do the typing, then may I also suggest you check out the transcription options in Scrivener. It’s not a service, so you just buy the software and use it. And a license is very inexpensive.
For those interested in letting an AI of some kind transcribe the audio for you (ah, the future), there appears to be Descript. You either upload your files online or load them into an app installed on your local machine; it’s not quite clear, if you pursue the latter course, whether the transcription takes place entirely on your machine or whether the AI that does the heavy lifting lives in the cloud. The demos appear to work in real time, but the site suggests that perhaps you can load an audio file of whatever length and have a transcript back in less time than it takes to play it.
I’m going to see how much you can do with a free account and report back. This could be very, very useful. (And cool!)
Gridzzly is for those times you need graph paper ruled with lines or dots, or tiled with triangles or hexagons.
If you are interested in the digital humanities and don’t know where to start, you could do far worse than to browse the possibilities on TAPoR (Text Analysis Portal for Research), which offers 912 possible resources.