Useful Pandas Posts

Please note that this post is, yes, “under construction” as I compile various notes from across my file system and decide what’s worth keeping here and what’s going into the virtual trash bin.

If, like me, you are not very familiar with R and thus you do not readily grasp how pandas brings much of R’s coolness to Python data analysis workflows, then having the occasional overview and/or cheat sheet on hand is useful.

For overviews, I found the following really helpful in understanding how pandas organizes data and the methods available for working with it:

For quick tips that border on being cheat sheets, there is Chris Albon’s “Technical Notes on Using Data Science & Artificial Intelligence to Fight for Something That Matters”, at the bottom of which is a compendium of great tutorials and tips on using pandas. (And as you scroll, you glimpse a lot of other really useful stuff as well.)

Rewiring Apple Speakers

Parts needed:

1 x Standard stereo Y cable with a standard mini plug attached.
1 x Pair of wirecutters/strippers
1 x Roll of electrical tape

The operation:

  1. Using the wirecutters, remove the Apple junction and all cabling behind it, including the Apple mini-plug. Please don’t worry, this is safe.
  2. Strip about 1 inch of insulation from each cable that remains attached to the Apple Pro Speakers. Do the same to the stereo Y cable we purchased.
  3. Inside each cable, one will notice a pair of wires. Strip about 1/2 inch of insulation from each wire. Do the same to the stereo Y cable.
  4. Join each wire from the Apple Pro Speaker stereo pair to its complement on the stereo Y cable. Anyone who has done wiring of a home stereo system will have no problem recognizing what to do from here on out.
  5. Tape up each join securely, or solder it if one likes and then tape it up. Heat sealing joiners can be used as well. I used electrical tape myself, but I may go back and get heat seals as they look nicer.
  6. Insert the stereo mini plug into the audio output of choice and enjoy!

The right speaker has: shielding on the outside, a layer of foil, a brown wire, a yellow wire.

The left speaker has: shielding on the outside, a layer of foil, a blue wire, a white wire.

  • Pull the foil away and when connecting these wires MAKE SURE THEY DO NOT TOUCH WIRES THAT THEY ARE NOT PURPOSELY ATTACHED TO.
  • Get a mini or headphones connector where the left and right wires are easily stripped and separated. I used one where the right wire is clearly red and the left is clearly white and both wires have shielding.
  • Make sure the stripped wires are stripped at least an inch above the shielding so there is less opportunity to touch.
  • Twist the shielding of all wires together and cover with electrical tape.
  • Twist the brown and the blue wires from the two speakers together and cover with electrical tape.
  • Twist the white wire from the Apple Pro speaker together with the left (in my case white) wire connected to the mini connector, cover with electrical tape.
  • Twist the Yellow wire from the Apple Pro speaker together with the right (in my case red) wire connected to the mini connector, cover with electrical tape.
  • The blue and brown wires, now connected, need to be connected to the shielding wire.

Science Fiction Stories by Keith Laumer in the Public Domain

If like me you found yourself very excited by the news on Boing Boing that possibly as much as 80% of the books published between 1924 and 1963 might now be in the public domain thanks to their copyrights not being renewed, then like me you also clicked on the links to the New York Public Library’s explanation as well as Leonard Richardson’s discussion. What was most exciting, to me, to discover was that a fair amount of science fiction from that period, which includes the so-called golden era, might be in the public domain.

That’s the good news. The bad news, or the news that requires patience, is that you still have to track down that work, much of which hasn’t been scanned. Some of it has been scanned, and is possibly available through the Hathi Trust, but it hasn’t been OCRed and curated into clean digital versions. But some of it has, and in the case of some work by Keith Laumer, a favorite of mine, it’s available on Gutenberg.

The list of texts is all sitting on one page, and it’s only 12 works, so writing a BeautifulSoup script seemed like overkill, especially when my preferred plain text note application, Bear, does a terrific job of turning HTML into easily edited markdown. From there, I edited the URLs following the pattern I gleaned from one of the texts using TextMate’s block edit functionality. I got the following list:

http://www.gutenberg.org/cache/epub/51258/pg51258.txt
http://www.gutenberg.org/cache/epub/53132/pg53132.txt
http://www.gutenberg.org/cache/epub/51509/pg51509.txt
http://www.gutenberg.org/cache/epub/51712/pg51712.txt
http://www.gutenberg.org/cache/epub/51267/pg51267.txt
http://www.gutenberg.org/cache/epub/26782/pg26782.txt
http://www.gutenberg.org/cache/epub/51781/pg51781.txt
http://www.gutenberg.org/cache/epub/21627/pg21627.txt
http://www.gutenberg.org/cache/epub/23028/pg23028.txt
http://www.gutenberg.org/cache/epub/52844/pg52844.txt
http://www.gutenberg.org/cache/epub/52855/pg52855.txt
http://www.gutenberg.org/cache/epub/21782/pg21782.txt

I saved it to a file, cd’d into my texts repo, and ran wget:

wget -w 2 -i ~/Desktop/laumer.txt

A half minute later it was done:

FINISHED --2019-08-14 19:01:18--
Total wall clock time: 24s
Downloaded: 12 files, 1.2M in 1.4s (874 KB/s)
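
If you would rather stay in Python than reach for wget, the same job can be sketched with the standard library. This is a minimal sketch, not what I actually ran: the function names (build_urls, download_all) and the destination folder are my own inventions for illustration, and the URL pattern is simply the one visible in the list above. The two-second pause mirrors wget’s -w 2 flag.

```python
import time
import urllib.request
from pathlib import Path

# Pattern gleaned from the list above; the ebook number appears twice.
URL_PATTERN = "http://www.gutenberg.org/cache/epub/{id}/pg{id}.txt"

# The twelve ebook numbers from the list above.
EBOOK_IDS = [
    51258, 53132, 51509, 51712, 51267, 26782,
    51781, 21627, 23028, 52844, 52855, 21782,
]


def build_urls(ids):
    """Turn ebook numbers into plain-text download URLs."""
    return [URL_PATTERN.format(id=i) for i in ids]


def download_all(urls, dest, wait=2.0):
    """Fetch each URL into dest, pausing between requests (like wget -w)."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for n, url in enumerate(urls):
        if n:  # no need to wait before the first request
            time.sleep(wait)
        target = dest / url.rsplit("/", 1)[-1]  # e.g. pg51258.txt
        with urllib.request.urlopen(url) as response:
            target.write_bytes(response.read())
        saved.append(target)
    return saved
```

Calling download_all(build_urls(EBOOK_IDS), "laumer") from inside the repo would then do the fetching; nothing is downloaded until you make that call.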

You can do the same, or you can grab the collection of plain text files out of my GitHub texts repo: Laumer stories.

Thanks for coming to my TED talk

If you’re curious about what I have been up to, it’s working with Katherine M. Kinnaird: “TED talks as Data” is the first of three planned installments of our collaboration (data, words, discourse). In the meantime, as they say, “thanks for coming to my TED talk”:

On Attentiveness

Wendell Berry, in an essay entitled “Preserving Wildness” collected in Home Economics, makes the case for what may be called an economy of attentiveness (as opposed to an economy of mere attention).

The good worker loves the board before it becomes a table, loves the tree before it yields the board, loves the forest before it gives up the tree. The good worker understands that a badly made artifact is both an insult to its user and a danger to its source. We could say, then, that good forestry begins with the respectful husbanding of the forest that we call stewardship and ends with well-made tables and chairs and houses, just as good agriculture begins with stewardship of the fields and ends with good meals.

How It Feels to Be Copied

I wrote this some time ago, in 2015, I think. I thought it was published here, and when I discovered it lying in an archive of notes on my computer, I thought it was only right to put it where I intended all those years ago.

It begins with whispers and occasional sideways glances among the people who know what is happening, and with very odd questions among the people who don’t know — I remember someone from across the university remarking that I should check out the other project, since it too was on the same topic. Then, someone finally steps forward and points out what others have known, or suspected, for a while. They show you a website, and I was confused because the prose, while not exactly my own, was so much like how I wrote, how I thought, and the title of the project was remarkably similar to my own, and, in fact, was fairly close to a phrase I had used in an essay that I had published out of the larger research project. Finally, when I kind of stumbled back to my office, unsure of what to think, a hallmate sticks his head in to say that the other person has been asking about the topic. My hallmate tells me that he kept telling the other person to talk to me, but …

But what? How do such things happen? As a student of culture, I am fully aware that there is such a thing as zeitgeist, that ideas have their moments. I have also chosen to pursue a scholar’s life in the humanities, which means I have chosen to sacrifice greater economic opportunities for the ability, I hope, to serve the greater good, to make not only a contribution to the domain of human knowledge but also a difference in the lives of individual students and the life of my community. And so, the first thing I feel is betrayal. Someone else has done something to me.

But, really, the other person doesn’t really need to care that much.

As for the other project. It takes my idea, which is to examine creativity through a clearly creative object, and focuses on an old wooden boat form that is only made by a few antiquarians for other antiquarians. It’s not a terrible thing to spend time with someone older than you making antiques, but call it that. Don’t call it scholarship. It’s a memoir.

The difficult part is when universities begin to confuse this kind of work with the actual work of scholarship and science, which is probably going to happen more often in more places as universities allow themselves to be run by professional managers and not academics.

This has always been a risk, of course. The great mass at the center of almost any university is the spread of abilities. One of the central tensions in the academy has always been between those who prefer to research, and do it well; those who prefer to teach, and do it well; and those who prefer to manage things, and maybe they do it somewhat competently. But the pay hierarchy goes: administration, research, teaching.

As bean counters take over, not only will they count butts in seats but they will also count publications, without any sense of what matters and what does not. A colleague of mine reported that in her conversation with our dean, when she pointed out that she felt like her work, published in the top journals in her field, was largely being undervalued, our dean replied that “quantity matters, not quality.” (If he was being ironic, there was no later action he took to reveal that subtle dimension: floggings continued with hopes of morale improving.)

Annual Performance Evaluation Response

At the end of the tedious online performance evaluation process, faculty are allowed to make a statement of sorts. Please note that my final evaluation was 4.8 out of 5: I was assessed at 5 for research (55% of my evaluation), 4.5 for teaching (35%), and 5 for service (10%). Here is what I submitted:

Why are we embarked upon a process which the provost himself has described publicly as “inane”? What does it mean for a faculty member to receive a grade of 4.8? Moreover, what’s the point of seeking to distinguish oneself when the only thing at stake is a cost of living adjustment misnamed as a “merit raise”? And this in a year when there will be no adjustments, no raises? Mr. LeBlanc can shake his head in sympathy that my salary will, in effect, be diminished by 3.5%, and maybe the dean will, too, but they both are comfortable in their 6-figure salaries and I have to decide, once again, how much of my savings I will divert to send my child to school, to effect meaningful home repair, and/or offset other expenses which grow each year even as our salaries do not. I am lucky to hold an endowed professorship, but it is only ever temporary, and I know that all I have to do is stumble professionally or annoy the wrong person and that will be taken away and things will be as bad as they are for everyone else in the department. (Or, worse, its removal could be the outcome of an under-considered process possessing no strategic focus nor even a sense of the variance in values of publications to an institution that claims to aspire to higher status.) The evaluation of performance in the absence of any meaningful reward and only the ever-present damoclean threat of punishment is not evaluation but a constant reminder that the administration always holds a knife to faculty’s throat.

Python and PDFs

Real Python has a tutorial on How to Work With a PDF in Python. I subscribe to Real Python because I find their tutorials well-written or, in the case of video tutorials, well-presented. The focus of this tutorial is the PyPDF2 module, which can get metadata from a PDF, rotate pages, merge or split a PDF, and/or encrypt it. While the tutorial mentions “extract information”, that does not mean PyPDF2 can get text from a PDF that does not have a text layer already embedded on its pages. (You could argue that the unintuitive nature of PDFs reveals their brokenness, but that’s for another time.) If you want to get text where there is no text layer, but you still want to use Python, it looks like you have to turn to PDFMiner, though a quick skim of its GitHub page doesn’t reveal if it has OCR capabilities baked in. Sigh.

HTML to PDF

In an ideal setup, my workflow would have me writing in some version of plain text (a flavor of markdown, in all probability) that could be quickly and easily output to a variety of formats and media. In most instances, that output gets printed, or at least paginated, which means it probably has to, at least for a moment, be instantiated as a PDF. (If I remember correctly, this is essentially how the macOS display and printing system works.) What that would mean would be a collection of CSS files that transformed the generated HTML into the various kinds of documents I regularly produce: essays, reports, letters, lectures, etc.

This function is what the Marked app does and does well, and it’s also functionality built into the Ulysses app, if I remember. Neither of those apps, I believe, offers pagination, which is often critical to what I output. And so, I have continued to search for my own solution in hopes of building it into a workflow. For the record, when I am working on long-form plain text, my editor of choice is FoldingText because it does a brilliant job of hiding the markdown unless you are working on that sentence and, as the name implies, it makes it possible to hide all but the section of the document on which you are working. It’s brilliant. (To be clear, I am a fan of all the apps mentioned here and of their developers.)

Getting from plain text via markdown or MultiMarkdown to HTML and then pairing that HTML with a page-media aware CSS file and then outputting to PDF is not as easy as it should be. The one app of which I have been aware up until recently was PrinceXML, which its creators have made free for non-commercial use, but with the imposition of a small watermark. That’s very generous, but it’s not quite what I want and I don’t have the kind of money to afford a desktop license.

And so it was a delightful surprise to discover that there are free software options to explore:

  • wkhtmltopdf is an “open source (LGPLv3) command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely headless and do not require a display or display service.”
  • WeasyPrint is a “visual rendering engine for HTML and CSS that can export to PDF. … It is based on various libraries but not on a full rendering engine like Blink, Gecko or WebKit. The CSS layout engine is written in Python, designed for pagination, and meant to be easy to hack on.”

Next up … trying WeasyPrint and an update/report here.
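
In the meantime, here is a rough sketch of the middle step of that pipeline: taking HTML that a markdown converter has already produced and pairing it with a page-aware stylesheet. Everything in it is an assumption for illustration: the stylesheet values, the function names (wrap_html, weasyprint_command), and the file names are mine, not anything from the apps above, and I have not yet run the result through WeasyPrint. It only uses the standard library, so it runs without any of the tools installed.

```python
from html import escape

# Hypothetical stylesheet for an essay. The @page rule is standard CSS
# paged media; the specific sizes and margins are placeholders.
ESSAY_CSS = """
@page {
    size: Letter;
    margin: 1in;
    @bottom-center { content: counter(page); }
}
h1 { break-before: page; }
p  { orphans: 2; widows: 2; }
"""


def wrap_html(body_html, title, css=ESSAY_CSS):
    """Return a complete HTML document with the page CSS inlined."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        f"<style>{css}</style>\n"
        "</head>\n<body>\n"
        f"{body_html}\n"
        "</body>\n</html>\n"
    )


def weasyprint_command(html_path, pdf_path):
    """Build the argument list for WeasyPrint's command-line tool."""
    return ["weasyprint", html_path, pdf_path]


if __name__ == "__main__":
    page = wrap_html("<h1>Draft</h1>\n<p>Body text.</p>", "Draft essay")
    print(page)
    # With WeasyPrint installed, saving `page` to essay.html and running
    # this command should produce a paginated PDF.
    print(" ".join(weasyprint_command("essay.html", "essay.pdf")))
```

The idea would be to keep one such stylesheet per kind of document (essay, report, letter, lecture) and swap it in through the css argument.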

Museum Anthropology Review in Transition(s)


Jason Jackson’s account of the rise and revision of Museum Anthropology Review may very well be as “inside baseball” as anything academic can get, but it is a detailed chronology of how, and why, he helped establish an open access journal that continues to thrive today. I recommend it to my students for its very clear articulation of the inner workings of scholarship: there are costs; there is labor.

AFS 2019 Abstracts

Seymour Chatman’s diagram of narrative

The short abstract (97 words):

With an understanding that no text is composed, or received, in a single “mode of discourse” (description, narration, exposition, etc.), this paper explores the nature of non-narrative elements found within folk narrative, pursuing a path first begun by literary critic Meir Sternberg and linguist Carlota Smith. While Sternberg and Smith used literary texts as the basis for their study, this paper draws, like the previous one, upon folk narratives collected by a number of folklorists, including myself, in order to see if there are consistent structures of discourse present and at what level those structures lie.

The long abstract (492 words):

With a few exceptions, folklorists have largely approached folk narrative as given, with occasional considerations of non-narrative elements. Our close readings of texts tend to focus on the topical and not the formal, on the contextually meaningful and not the structurally significant. This paper is part of a larger project to understand the nature of the components that make up a folk narrative text in order to explore what structures might emerge, and which, if any, are general and which might be cultural. The project is founded on the work of literary critic Meir Sternberg and linguist Carlota Smith, who pursued parallel paths in trying to discern modes within a given text. Starting in the late seventies and working through the nineties, Sternberg attempted to extend narratological considerations to include non-narrative moments and passages in texts. Pursuing similar research but apparently unaware of Sternberg, Smith developed the notion of “modes of discourse,” based on her own work on temporal aspect, in which she explored how languages encode time and how they encode the way events happen over time. Both Sternberg and Smith, however, draw upon literary sources for the exploration and application of their ideas and methods. What would a consideration of folkloric texts bring to the table, and what role would dialogue, long established as a central feature in oral text-making, play in a possible revision of any typology of discourse modes? This paper only briefly outlines Sternberg’s work, as well as referencing the work of Labov and Waletzky, which has had some role in folkloristic considerations of narrative (as outlined in a previous paper), in order to provide a backdrop for an application of Smith’s work to folkloristic considerations of text. In a previous paper I argued that folklore studies is as guilty as other domains of proclaiming anything narrative. In this paper, I explore other modes of discourse and then consider just how little the narrative mode has to be present for it to be received as narrative in its entirety. All examples are drawn either from my own fieldwork or from colleagues who have entrusted me with examples from their own work.

Labov, William, and Joshua Waletzky. 1967. Narrative Analysis: Oral Versions of Personal Experience. Proceedings of the 1966 Annual Spring Meeting of the American Ethnological Society, 12–44.

Smith, Carlota S. 2003. Modes of Discourse: The Local Structure of Texts (Cambridge Studies in Linguistics). Cambridge University Press.

Sternberg, Meir. 1981. Ordering the Unordered: Time, Space, and Descriptive Coherence. Yale French Studies (61, Towards a Theory of Description): 60–88.

Sternberg, Meir. 1982. Proteus in Quotation-Land: Mimesis and the Forms of Reported Discourse. Poetics Today 3 (2): 107–56.

Sternberg, Meir. 1990. Telling in Time (I): Chronology and Narrative Theory. Poetics Today 11 (4, Narratology Revisited II): 901–48.

Sternberg, Meir. 1992. Telling in Time (II): Chronology, Teleology, Narrativity. Poetics Today 13 (3): 463–541.

Sternberg, Meir. 2001. How Narrativity Makes a Difference. Narrative 9 (2, Contemporary Narratology): 115–22.

Plywood Graph Paper

I have been doing a bit of home renovation and construction of late, and some of it has involved breaking down sheets of plywood, which I now do using a piece of rigid foam lying on my garage floor and a Kreg Circular Saw Guide. As long as I am crawling around with a saw in my hand, I prefer to make as many cuts as I can. For that, I use a cut sheet, which I make using the incomparable Incompetech’s Graph Paper Generator. The settings captured in the image below produce a graph of 4 x 8 squares broken into 6-inch and then one-inch cells:

Incompetech Graph Paper Generator Settings