AFS 2019 Abstracts

Seymour Chatman’s diagram of narrative

The short abstract (97 words):

With an understanding that no text is composed, or received, in a single “mode of discourse” (description, narration, exposition, etc.), this paper explores the nature of non-narrative elements found within folk narrative, pursuing a path first begun by literary critic Meir Sternberg and linguist Carlota Smith. While Sternberg and Smith used literary texts as the basis for their study, this paper draws, like the previous one, upon folk narratives collected by a number of folklorists, including myself, in order to see if there are consistent structures of discourse present and at what level those structures lie.

The long abstract (492 words):

Save a few exceptions, folklorists have largely approached folk narrative as given, with occasional considerations of non-narrative elements. Our close readings of texts tend to focus on the topical and not the formal, on the contextually meaningful and not the structurally significant. This paper is part of a larger project to understand the nature of the components that make up a folk narrative text in order to explore what structures might emerge, and which, if any, are general and which might be cultural. The project is founded on the work of literary critic Meir Sternberg and linguist Carlota Smith, each of whom pursued parallel paths in trying to discern modes within a given text. Starting in the late seventies and working through the nineties, Sternberg attempted to extend narratological considerations to include non-narrative moments and passages in texts. Pursuing similar research but apparently unaware of Sternberg, Smith developed the notion of “modes of discourse,” based on her own work on temporal aspect, in which she explored how languages encode time and how they encode the way events happen over time. Both Sternberg and Smith, however, draw upon literary sources for the exploration and application of their ideas and methods. What would a consideration of folkloric texts bring to the table, and what role would dialogue—long established as a central feature in oral text-making—play in a possible revision of any typology of discourse modes? This paper only briefly outlines Sternberg’s work, as well as referencing the work of Labov and Waletzky, which has had some role in folkloristic considerations of narrative (as outlined in a previous paper), in order to provide a backdrop for an application of Smith’s work to folkloristic considerations of text. In a previous paper I argued that folklore studies is as guilty as other domains of proclaiming anything narrative.
In this paper, I explore other modes of discourse and then consider just how little the narrative mode has to be present for it to be received as narrative in its entirety. All examples are drawn either from my own fieldwork or from colleagues who have entrusted me with examples from their own work.

Labov, William, and Joshua Waletzky. 1967. Narrative Analysis: Oral Versions of Personal Experience. Proceedings of the 1966 Annual Spring Meeting of the American Ethnological Society, 12–44.

Smith, Carlota S. 2003. Modes of Discourse: The Local Structure of Texts (Cambridge Studies in Linguistics). Cambridge University Press.

Sternberg, Meir. 1981. Ordering the Unordered: Time, Space, and Descriptive Coherence. Yale French Studies (61, Towards a Theory of Description): 60–88.

———. 1982. Proteus in Quotation-Land: Mimesis and the Forms of Reported Discourse. Poetics Today 3 (2): 107–56.

———. 1990. Telling in Time (I): Chronology and Narrative Theory. Poetics Today 11 (4, Narratology Revisited II): 901–48.

———. 1992. Telling in Time (II): Chronology, Teleology, Narrativity. Poetics Today 13 (3): 463–541.

———. 2001. How Narrativity Makes a Difference. Narrative 9 (2, Contemporary Narratology): 115–22.

What We Talk about When We Talk about Stories

Rejected for a special issue of the Journal of Cultural Analytics, but, still, I think, an interesting project and one I will continue to pursue. If anyone else is interested, this is part of a larger project I have in mind and I am open to there being a working group.

Current efforts to treat narrative computationally tend to focus on either the very small or the very large. Studies of small texts, some only indifferently narrative in nature, have been the focus for those interested in social media, networks, and natural language technologies, which are largely dominated by the fields of information and computer sciences. Studies of large texts, so large that they contain many kinds of modalities with narrative the dominant, have largely been the purview of the field we now tend to call the digital humanities, dominated by the fields of literary studies, classics, and history.

The current work proposes to examine the texts that fall in the middle: larger than a few dozen words, but smaller than tens, or hundreds, of thousands of words. These are the texts that have historically been the purview of two fields that themselves line either side of the divide between the humanities and the human sciences, folklore studies and anthropology (respectively).

The paper profiles the knot of issues that keep these texts out of our scholarly-scientific systems. The most significant issue is the matter of “visibility”, of accessibility, of these texts as texts and thus also as data: largely oral by nature, most folk or traditional narratives (must) have been the product of a transcription process that cannot guarantee the same kind of textuality of a “born literary” text. (The borrowing of the notion of natality is somewhat purposeful here, since we often distinguish between texts that have been, sometimes laboriously, digitized and those that were “born digital.”) As scholarly fictions, if you will, they are largely embedded within the texts that treat them, only occasionally available in collections. With limited availability, and traditionally outside the realm of the fields that currently dominate the digital humanities, folk/traditional/oral narratives are not yet a part of the larger project to model narrative nor of efforts to consider the “shape of stories.”

This accessibility gap has overlooked both human and textual populations: most of the world’s verbal narratives are in fact oral in nature; millions upon millions are produced every day by millions and millions of people; and those narratives tend to range in size from somewhere around a hundred words to, perhaps, a few thousand words. The result is that any current model or notion of shape simply has allowed the wrong “figures figure figures.” Put another way, there can be no shape of stories without these stories.

Meir Sternberg on Narratology’s History

As part of a larger effort to think about the shape of small stories, I have begun to try to delineate more carefully the modes of oral discourse — e.g., description, narration, exposition, etc. Apart from the early work by Labov and Waletzky, whose work on narrative versus free clauses is foundational, the work I have found most compelling is that of Meir Sternberg. Re-reading his 1981 essay on “Ordering the Unordered: Time, Space, and Descriptive Coherence” is an exercise in wondering how one mind could anticipate so much of what was to come and what still needs to get done.

I’ll have more to say about Sternberg later, but in the meantime, I found this delightful excerpt from an interview in which he explains the difference, or the lack thereof, between classical and post-classical narratology.


LoC Blogs

In case you haven’t been keeping up, the [Library of Congress hosts a number of blogs][blogs]. While some of them only infrequently publish, the overall amount of material available is really impressive.


The Saffron Research Browser

I’m still trying to figure out what all I can do with the [Saffron][] browser/visualizer. It claims to analyze the research communities of natural language processing, information retrieval, and the semantic web through “text mining and linked data principles.”

The list of research domains is rather short and under-explained for the uninitiated:

Saffron's List of Research Domains


I clicked on [ANLP][], which is *applied natural language processing*, and got both a list of hot topics:

Hot Topics in ANLP


As well as a taxonomy network/tree that offers labels when you hover over nodes, which are themselves clickable links:

Taxonomy Network for ANLP


Clicking on one of the “hot topics,” in this case [natural language text][], gives you a bar chart of the frequency of the topic in documents for the past thirty years:

Frequency of Natural Language Text as a Topic over 30 Years


A list of similar topics:

Topics Similar to "Natural Language Text"

A list of experts:

Saffron's List of Experts Associated with "Natural Language Text"


And a list of publications:

The Top 5 Publications for "Natural Language Text"


As with a lot of browsers, this kind of static presentation of the results impoverishes the exploration that the tool encourages. I also haven’t explored what its inputs are: I wonder how complete its historical record is.

[natural language text]:

Why Count Words?

“Why count words?” It was a simple question[^cf1]. The person asking the question did not ask it in an overly skeptical, or hostile, fashion. He was honestly taken aback by a series of numbers I had rattled off that corresponded to a collection of texts, of legends, that I had assembled as my first step in my exploration of computational approaches to narrative. The illustration in front of the room had been a bar chart of sixteen legend texts, each collected by an established folklorist (and so the original oral texts were, I felt, reliably represented). The longest text in the collection was a little over one thousand words (1025); the shortest, only 150.

A multiplier of seven is not an order of magnitude in difference, but it is still enough of a spread that it bears further investigation. Mount Everest is, for example, seven times taller than Ben Nevis, the highest mountain in the British Isles. Climbing the former is considerably more prestigious than climbing the latter. The Gross Domestic Product of the U.S. is seven times greater than that of Brazil. The distance from New York to London is seven times greater than the distance from New York to Washington, D.C. The difference in the latter amounts to a change in continent and a trans-oceanic passage.

My initial answer to the question was simple: I counted words because I wanted to know if it is possible to create a story world using 150 words, and, if so, I wanted to understand how that can happen. Given the size of a great number of literary forms, one thousand words is already amazingly concise, but 150 words? Each word must pack an incredible amount of power: something made even more amazing when one realizes that only half that number of words are unique in their usage in this little text. That is, one word alone, he, gets used twelve times. The next nine words that get used most often in this little legend are also fairly uninteresting: and, a, was, the, it, his, said, to, they. So a list of the text’s top ten words doesn’t reveal anything about the story itself, except that, perhaps, there is a singular figure, he, who is counterposed against a group of some kind, they. (It is only when we get to the next ten most often used words, all of which appear only two or three times in the text, that we begin to get a sense of what the story might be about: man, dog, with, when, went, there, saw, off, horse, controller.)
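
For readers who want to try this kind of counting themselves, here is a minimal sketch in Python of the tally described above: total words, unique words, and the most frequent words. The tokenizer (lowercase runs of letters and apostrophes) and the names `tokenize` and `word_profile` are my own assumptions for illustration, and the sample sentence is made up; it is not one of the legends in the collection.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def word_profile(text, top_n=10):
    """Return (total words, unique words, top_n most frequent words)."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return len(tokens), len(counts), counts.most_common(top_n)


# A made-up stand-in text, NOT one of the legends discussed here.
sample = "He saw the dog and he said the dog was his. They said it was not."

total, unique, top = word_profile(sample, top_n=5)
print(f"{total} words, {unique} unique")
for word, count in top:
    print(f"{word:>6} {count}")
```

Different tokenizing choices (how to treat contractions, hyphens, or numerals) will shift the counts slightly, which is one reason two tallies of the same text can disagree by a few words.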

How is this possible? How can such a small subset of words from an already small text make a story go? That is, I think, the real question. Counting words is but one step along the way, but an important one, and one that we, as folklorists, have failed to undertake. Think for a minute of all the texts that are indexed in the great collection projects of the twentieth century. Add to them all the texts we have collected under the auspices of the ethnography of speaking. It’s an impressive amount of work, and while we have made some synthetic gestures, we have, by and large, mostly focused on differences. All of those differences are, of course, quite compelling, but in focusing on differences, we have also missed an opportunity to make attempts at larger kinds of claims about human nature and culture.

The impulse to count words, for me, is but one step towards a larger understanding of how humans think their way through the world through things of their own making. In the case of texts, they quite literally string one word after another, usually within the flow of a larger program of discourse that itself may or may not be conducive to text-making. Despite all the complexities, people in a variety of speech act contexts somehow decide to initiate a text, place one word upon another in a sequence they both anticipate and, at the same time, manipulate, until they are satisfied, in some fashion, with the result and, like a discursive Atropos, end the life of the string.

Counting words, then, is but one step towards a larger understanding of not only how many words, but which words, and in what order. Why these words and not others? And what is the relationship not only of the words used here to the story world they instantiate, but also of the actions within the story world to the human world within which they are embedded? In short, what can 150 words tell us about the relationship between words, ideas, and actions?

The great indices of the previous era of folklore scholarship took one step in this direction by attempting to map, mostly in bibliographic terms but indirectly in cartographic, the various texts that had been collected in the initial wave of the philological project. At the same time as Stith Thompson turned his great carousel to compile the Motif Index three-by-five card by three-by-five card, however, a few scholars and scientists were beginning to play with the idea of using computers, as slow and expensive as they were then, to compile statistics about texts[^cf2].

Statistics remains, for most humanists, either an enigma or an enemy. It represents, for many (and with good reason), a regime of mathematics, itself something of a mystery, which has been used too often to summarize a situation or a group of people when a more subtle form of analysis was needed. I will not, in this essay, defend its use in such contexts. Nor am I interested in defending, or capable of discussing, the larger statistical turn that so many forms of knowledge production have undertaken. I have only this, a reworking of a dite from my own childhood and perhaps yours too: just because others are doing it is not a reason for us to do it, too.

I understand very well the humanistic impulse to draw a line in the discursive sand and to cry out “the crunching of us into numbers ends here.” My suggestion here, at this metaphorical line lying before us, is that the crunching will go on and on, and it can do so either without us or with our efforts not only to humanize the crunching but also to stuff it so full of the human that it might very well turn into a new kind of science, a new kind of scholarship that will be interesting not only to others but also to us.

One of the central requirements of statistics is that you must convert information — perhaps a simple little story about a treasure buried somewhere, perhaps a few dozen such stories, or perhaps several thousand — into data. But such a transformation amounts simply to assigning values, most often numbers though they need not be, to the objects that are central to the problem. The analyst defines the problem, and the analyst assigns the values. Folklore studies has already done this in the form of tale type numbers and motif numbers, and even in the way we describe the contextualization of a particular text.

So why count words? Well, clearly one reason to do so is simply to explore texts and textuality, to satisfy our curiosity about the fundamental dimensions of human expressivity: the number of words in a text, the word clusters (or collocations) that occur within a text as well as the words that always appear in conjunction with others in particular kinds of texts (co-occurrences). A second reason to proceed in this fashion is to make it possible to discover relationships between texts that we have not yet discovered by more traditional means of study. Discovery, indeed the notion of indexing itself, is the chief reason behind so much of the effort in natural language processing, as we will discuss in a moment. The final reason is that seeing folklore texts in a new light, and seeing relationships between texts that we have not gleaned before, leads to new forms of knowledge, forms that need not displace but rather refine and extend current ways of knowing.
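
To make the distinction between collocations and co-occurrences concrete, here is a rough sketch in Python. It treats a collocation crudely as an adjacent word pair (a bigram) and a co-occurrence as two words appearing anywhere in the same text. Both simplifications, the function names, and the three toy "texts" are assumptions made for illustration; the toy texts are invented, not drawn from the collection.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def bigram_counts(text):
    """Count adjacent word pairs, a crude stand-in for collocations."""
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


def cooccurrence_counts(texts):
    """Count, for each unordered word pair, how many texts contain both."""
    pairs = Counter()
    for text in texts:
        vocabulary = sorted(set(tokenize(text)))
        pairs.update(combinations(vocabulary, 2))
    return pairs


# Invented toy texts, for illustration only.
toy_texts = [
    "the man saw a black dog",
    "a black dog followed the man",
    "the man rode a horse",
]

bigrams = Counter()
for text in toy_texts:
    bigrams.update(bigram_counts(text))
pairs = cooccurrence_counts(toy_texts)

print(bigrams[("black", "dog")])  # adjacent in two texts
print(pairs[("dog", "man")])      # together in two texts, never adjacent
print(pairs[("horse", "man")])    # together in one text
```

Note that pairs in `cooccurrence_counts` are stored in alphabetical order, so the lookup key is `("dog", "man")` rather than `("man", "dog")`.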

[^cf1]: The first public presentation of this research project was at the 2013 meeting of the International Society for Contemporary Legend Research. I would like to thank that group for their incredible generosity and hospitality.

[^cf2]: The image of Stith Thompson sitting in a building dedicated to housing a carousel forty feet in diameter is one that I owe entirely to Henry Glassie.

My Digital Humanities Wish

I almost posted this on the new [Bamboo DiRT Wishlist][bdw], but then I thought better of it. The wish list needs to function as a straightforward wish list for a while before someone goes “meta” with it.

But my wish remains: a brief scan of the list of tools already available reveals that a lot of them do much the same thing. That in itself is not a bad thing, but somewhere I think it would be useful if we had a list of the kinds of things that often get done and what they might mean for various kinds of approaches to humanistic objects and topics.

For example, a lot of tools can quickly break texts into word lists — *oh, those bags of words!* — and they can produce various kinds of outputs based on those lists: filter out function words, perhaps filter out proper names, or words that appear fewer than a certain number of times — or advanced users can filter out words by part of speech. But what does it mean to get at the rich “middle” of words that do not appear too frequently or too infrequently in a text? Can we have some discussion on what we are up to?
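
As a rough illustration of what such a filter does, here is a small Python sketch that drops a stop list and keeps only words within a chosen frequency band. The stop list, the thresholds, and the function name `middle_band` are all hypothetical choices of mine, not the behavior of any particular tool, and the sample sentence is invented.

```python
import re
from collections import Counter

# A tiny, hypothetical stop list; real tools ship much longer ones.
STOP_WORDS = frozenset(
    {"a", "and", "he", "his", "it", "said", "the", "they", "to", "was"}
)


def middle_band(text, min_count=2, max_count=3, stop_words=STOP_WORDS):
    """Keep words that occur between min_count and max_count times
    (inclusive) and that are not on the stop list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {
        word: count
        for word, count in counts.items()
        if min_count <= count <= max_count and word not in stop_words
    }


# A made-up stand-in text, for illustration only.
sample = "He saw the dog and he said the dog was his. They said it was not."

print(middle_band(sample))                          # with the stop list
print(middle_band(sample, stop_words=frozenset()))  # without it
```

Running the filter both ways shows exactly which words a stop list silently removes, which is worth checking before assuming that none of them matter to the pattern you are after.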

Let me give an example from my own current work: I have developed a small collection of narratives. They are all legends, all from this part of the world (Louisiana), and they have all been transcribed by folklorists from audio recordings, and so I can assume, at least for the time being, a fair amount of fidelity to what was actually said by the speaker. (Two of the texts are, in fact, from my own field work.) My goal with this small collection of texts is to explore them in various ways to see which computational methods tell me interesting things and which ones seem to be less fruitful in this context. That’s not hard to do when you have roughly two dozen texts which range in size from a hundred some odd words to a little over one thousand words.

In fact, this particular range in size is one of the things that I find interesting. The shortest legend weighed in at a mere 153 words. The longest was seven times its size at 1015. The two stories I had collected were firmly in the middle at 375 and 655.

Now, let’s leave off for the time being that where a story starts and where it stops is not something we can necessarily declare with absolute certainty. Parts of conversation tend to wash over both ends of a story. Indeed, some stories tend to invite conversation at certain points or throughout their performance. But let’s do leave that boundary issue and let’s instead assume that an analyst can reliably depend on fellow analysts working in the same discipline to make judgments of like enough nature to his that texts are comparable and thus so are their word counts.

One of the audience members at this year’s meeting of the International Society for Contemporary Legend Research put it rather succinctly: *Why the hell are you counting words?*

Good question. I didn’t have a terribly good answer at the time, but here’s what I offered up and I stand by it:

1. First, I am fascinated by the idea that a text as small as 153 words can be said to accomplish all the things we now believe happen within a unit of narrative: the holding of an audience’s attention in a discursive situation, the creation of a storyworld, the deployment of a sequence of events that listeners perceive as having continuity.
2. Second, we have to start somewhere. Folklorists have an approximate sense of the size of various kinds of texts, but we do not have any quantified sense of their size. I am not suggesting that we will ever arrive at a moment where our definition of legend includes something like “a narrative of x type that ranges in size from 153 to 1025 words.” Rather, following David Herman here, I think we will find ourselves with better definitions that describe a kind of centrality of the kind “most legends range in size from x to y.” There are always going to be outliers: if we introduce numbers, we typically are doing so for statistical purposes, and so that gives us the chance to look at mean, median, and mode as well as things that stand outside those dimensions.
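
To show what looking at centrality might involve, here is a minimal Python sketch using only the four word counts mentioned above (153, 375, 655, and 1015). These are just four texts out of the collection, so the resulting figures are illustrative and say nothing reliable about legends in general; the variable names are my own.

```python
import statistics

# Only the four word counts mentioned in this post, not the whole collection.
word_counts = [153, 375, 655, 1015]

mean = statistics.mean(word_counts)
median = statistics.median(word_counts)
spread = max(word_counts) - min(word_counts)
ratio = max(word_counts) / min(word_counts)

print(f"mean:   {mean}")
print(f"median: {median}")
print(f"range:  {spread}")
print(f"longest is {ratio:.1f} times the shortest")
```

With so few texts the mode is not meaningful, since every count occurs once; on a larger collection one would more likely bin the counts (say, by hundreds of words) before asking which size is most common.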

But, look, this is a long digression from the topic with which I began. Let me bring it back into the fold by noting that my point is that we do not yet have, within my own field of folklore studies let alone in the broader humanities, any sense of what kinds of quantification matter. We have lots and lots of tools that quantify things, sometimes in ways that appear almost magical: think of the moment you first glimpsed a word cloud, a projection (visualization) of a word frequency list that is so intuitively “right” that many people use them without really having any idea what lies behind them.

And think, too, of all the people using on-line word cloud generators that have built-in stop word lists and who never stop to think about what it means for a word to be on such a list, or that perhaps some of the words that are getting “stopped” are potentially part of a pattern they are seeking to discern. Humanists are, in some fashion, playing willy-nilly in a park created by linguists and computer scientists. It’s fun to jump on their various play sets, to keep the metaphor going for just a moment longer, but sometimes all you end up doing is getting dizzy and tumbling off the carousel. (And with that, the conceit went too far.)

Linguists built the tools with certain questions in mind. Linguistics as a discipline has historically been very focused on the sentence as its chief unit of analysis. Sentences and words. (I find this to be true as I explore various corpora, but this is my own auto-didactic impression and may be entirely out of line.) Folkloristics has historically operated one level of discourse up from linguistics and one level down from literary studies, focusing its efforts on single texts or series of single texts as discrete units of discourse in various forms of human behavior. The kind of precision that linguistics long ago achieved is still emergent in folkloristics, and we lag substantially behind in quantitative descriptions of the materials with which we work. (There was a moment in the 50s and 60s when a variety of metric efforts were attempted — e.g., labometrics, chorometrics, etc. — but those efforts were displaced with the turn towards performance. If you are a folklorist, keep an eye out for the essay from Jonathan Goodwin and me which attempts to quantify the “turn” within folklore’s intellectual history.)

I suspect we will need to wander about in quantities before we reach a moment where we can have a fuller discussion about what we are doing and why, but I would love to see that conversation sooner rather than later, if only because I really want to play with more ideas.

**Side note**: For those wondering why I am worried about a small collection of legends — I dare not call it a *corpus* yet — I am interested in being able to automate the process of determining a morphology for narratives. I’m not yet convinced that the CS solutions I’ve seen are very good. And I think they start with texts that are too big, too complex. My end goal here is to begin to map out the relationship between ideas (ideologies as networks of ideas as glimpsed through texts) and the narratives that contain / convey / shape / are shaped by them. Sort of network meets syntax. Parallel meets sequence.