Top 10 Python libraries of 2016

Tryolabs is continuing its tradition of retrospectives about the best Python libraries of the past year. This year, it seems, it’s all about serverless architectures and, of course, AI/ML. A lot of cool stuff is happening in the latter space. Check out this year’s retrospective and also the discussion on Reddit. (And here’s a link to Tryolabs’ 2015 retrospective for those curious.)

FlowingData has a list of their own: Best Data Visualization Projects of 2016. If you haven’t seen the one about the evolution of bacteria that is a “live” visualization conducted on a giant petri dish, check it out.


Expertise matters. As Ezra Pound once noted at the beginning of the ABC of Reading, it’s a matter of having money in the bank. If I write you a check for a million dollars, that check is worthless. If Warren Buffett writes you a check for a million dollars, it’s worth it, quite literally. If I tell you something about texts, it’s worth it. Buffett? Not so much.

  1. We can all stipulate: the expert isn’t always right.
  2. But an expert is far more likely to be right than you are. On a question of factual interpretation or evaluation, it shouldn’t engender insecurity or anxiety to think that an expert’s view is likely to be better-informed than yours. (Because, likely, it is.)
  3. Experts come in many flavors. Education enables expertise, but practitioners in a field also acquire it through experience; usually the combination of the two is the mark of a true expert in a field. But if you have neither education nor experience, you might want to consider exactly what it is you’re bringing to the argument.
  4. In any discussion, you have a positive obligation to learn at least enough to make the conversation possible. The University of Google doesn’t count. Remember: having a strong opinion about something isn’t the same as knowing something.

Building a Corpus-Specific Stopword List

How do you go about finding the words that occur in all the texts of a collection or in some percentage of texts? A Safari Oriole lesson I took in recently did the following, using two texts as the basis for the comparison:

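[code lang=python]
from pybloom import BloomFilter

# text1_words and text2_words are the token lists of the two texts
bf = BloomFilter(capacity=1000, error_rate=0.001)

# Load every word of the first text into the filter
for word in text1_words:
    bf.add(word)

intersect = set()

# Keep the words of the second text that the filter (probably) contains
for word in text2_words:
    if word in bf:
        intersect.add(word)
[/code]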

UPDATE: I’m working on getting Markdown and syntax highlighting working. I’m running into difficulties with my beloved Markdown Extra plug-in, indicating I may need to switch to the Jetpack version. (I’ve switched before but not been satisfied with the results.)
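
The question that opens this post, which words occur in all the texts or in some percentage of them, can also be answered by counting document frequencies. What follows is only a sketch of that alternative, not what the lesson did: it uses the standard library's `Counter` rather than a Bloom filter, and the function name `corpus_stopwords`, the `min_share` parameter, and the toy texts are all made up for illustration.

```python
from collections import Counter


def corpus_stopwords(texts, min_share=1.0):
    """Return the words found in at least min_share of the texts.

    texts is a list of token lists; min_share=1.0 means "in every text".
    """
    doc_freq = Counter()
    for words in texts:
        # set() so that each text counts a given word only once
        doc_freq.update(set(words))
    threshold = min_share * len(texts)
    return {word for word, count in doc_freq.items() if count >= threshold}


# Toy token lists, invented for the example
texts = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "a bird sat in the tree".split(),
]

everywhere = corpus_stopwords(texts)               # words in all three texts
common = corpus_stopwords(texts, min_share=0.5)    # words in at least half
```

Unlike a Bloom filter, this keeps every distinct word in memory, so it trades space for exact answers and for the ability to set a percentage threshold.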

Towards an Open Notebook Built on Python

As noted earlier, I am very taken with the idea of moving to an open notebook system: it goes well with my interest in keeping my research accessible not only to myself but also to others. Towards that end, I am in the midst of moving my notes and web captures out of Evernote and into DevonThink — a move made easier by a script that automates the process. I am still not a fan of DT’s UI, but its functionality cannot be denied or ignored. It quite literally does everything. This also means moving my reference library out of Papers, which I have had a love/hate relationship with for the past few years. (Much of this move is, in fact, prompted by the fact that I don’t quite trust the program after various moments of failure. I cannot deny that some of the failings might be of my own making, but, then again, this move I am making is to foolproof systems from the fail/fool point at the center of it all, me.)

Caleb McDaniel’s system is based on Gitit, which itself relies on Pandoc to do much of the heavy lifting. In his system, BibTeX entries appear at the top of a note document and are, as I understand it, compiled as needed into larger, comprehensive BibTeX lists. To get the BibTeX entry at the top of the page into HTML for the wiki, McDaniel uses an OCaml library.

Why not, I wondered as I read McDaniel, attempt to keep as much of the workflow as possible within a single language? Since Python is my language of choice (mostly because I am too time and mind poor to attempt to master anything else), I decided to make the attempt in Python. As luck would have it, there is a bibtex2html module available for Python.

Now, whether the rest of the system is built on Sphinx or with MkDocs is the next matter — as is figuring out how to write a script that chains these things together so that I can approach the fluidity and assuredness of McDaniel.
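
As a first stab at the chaining script, here is a minimal sketch of the step where the BibTeX entry at the top of each note gets pulled out and gathered into one larger list. It is my own guess at the mechanics, not McDaniel's code, and it does not call Gitit, Pandoc, or bibtex2html. The function names `split_note` and `collect_bibliography` are hypothetical, and the sketch assumes each note opens with at most one entry whose braces balance.

```python
def split_note(text):
    """Split a note into (bibtex_entry, body).

    Assumption: a note may open with a single BibTeX entry whose braces
    are balanced. Returns ("", text) when no entry leads the note.
    """
    stripped = text.lstrip()
    if not stripped.startswith("@"):
        return "", text
    depth = 0
    for i, ch in enumerate(stripped):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # The entry ends at the brace that closes the first one
                return stripped[: i + 1], stripped[i + 1:].lstrip()
    return "", text  # unbalanced braces: treat everything as body


def collect_bibliography(notes):
    """Join the leading entries of all notes into one BibTeX string."""
    entries = [split_note(note)[0] for note in notes]
    return "\n\n".join(entry for entry in entries if entry)


# A made-up note, for illustration only
note = "@misc{example2016,\n  title = {An Example Note}\n}\n\nReading notes go here."
entry, body = split_note(note)
```

Counting braces by hand ignores escaped braces and comments, so a real BibTeX parser would be the safer choice once the workflow settles.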

I will update this post as I go. (Please note that this post will stay focused on the mechanics of such a system.)

Namespaces, Scopes, Classes

I’m still a babe in the programming woods, so Shrutarshi Basu’s explanation of namespaces, scopes, and classes in Python was pretty useful. I can’t tell if I had read around enough in preparation for final understanding or if Basu simply wrote about it in a fashion that I understood clearly, seemingly for the first time.
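
For my own future reference, here is a small sketch of the three ideas as I currently understand them. It is my example, not Basu's, and the names (`outer`, `inner`, `Tally`) are invented for the purpose.

```python
x = "global"              # a name in the module (global) namespace


def outer():
    x = "enclosing"       # a different x, local to outer()

    def inner():
        # inner() has no x of its own, so Python looks outward
        # and finds the one in the enclosing function's scope
        return x

    return inner()


class Tally:
    count = 0             # stored in the class namespace

    def bump(self):
        # Reads the class attribute the first time, then stores the
        # result on the instance, leaving Tally.count untouched
        self.count += 1
        return self.count


tally = Tally()
first = tally.bump()
```

After this runs, the module-level `x` is still `"global"`, `outer()` hands back the enclosing value, and the instance carries its own `count` while the class keeps its original one.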

Eye Contact

A recent study published in Cognition suggests that eye contact may interfere with verbal productivity. That is, one reason that people may look away is that the two processes, making eye contact and generating language, may be using the same resources.

Rogue Considerations

We took our daughter and a friend to see Rogue One a few nights ago, and it turned out to be one of the better offerings in the Star Wars franchise. Perhaps none of the films will live up to the promise, the hope (more on this in a moment), of A New Hope, but, in my current moment, I would put Rogue One up there with The Empire Strikes Back as offering the promise of a continuing story which has a nice mix of characters and world(s) and a larger story, or braided collection of stories, to be told.

That is not to say, however, that the film is without its problems, many of which are inherent in the Star Wars universe itself. I’m going to put aside the light saber, which is both ridiculous and cool at the same time, as a necessary fiction, much as faster-than-light travel and FTL communication are for much of science fiction. What I can’t put aside is the weird reliance that Star Wars has on things like crystals. Early on, we learn that Jedha is being mined for its crystals, because that’s the power behind the Death Star’s weapon. It felt like a throwback to the original Star Trek’s "dilithium crystals" as the basis for power. Even worse, at some point a minor character says something like "these crystals are the power behind the brightest suns" or some such nonsense. In a universe that already has ion cannons and some sort of fusion drive, we need to have crystals, too?

In the same vein, the SW franchise seems to be sticking with the need for its big bad weapon to be an energy weapon. At least with the Death Star, both the weapon and its target need to be in some kind of physical proximity, unlike The Force Awakens, where a radiation-based weapon, which we see as light, can cross trans-galactic distances at hyper-light speeds. Alas, in Rogue One our planet-killer is not quite up to speed and it can only kill cities, but, oh, this is impressive. So we are led to believe that a civilization based on advanced technology has no knowledge of an atom bomb, which is an effective city killer and at a fairly small cost, all things considered; nor is it aware that simply accelerating an asteroid of a decent size will accomplish the same thing. (This is something that Babylon 5 got exactly right, and the scene where one of its main characters stands and watches his civilization’s fleet shoot rocks at another civilization’s home planet is quite effective in capturing the mixed emotions of destroying a fellow civilization.) Worse, the empire functionaries, here in the form of a CGI-revenant Peter Cushing as Governor Tarkin, seem reasonably impressed with the results. In reality, if all you could do is destroy a city from space, given what the empire has spent, your empire overseer should be pretty pissed.

But let’s put under-imagined — and that’s what it is, just flat out under-imagined and uninspired — science and technology aside and discuss a few things that are central to the Star Wars universe, woven into the very fabric of its plots to the point where it becomes almost ideologically necessary, it seems. There are two, one major and one minor: the major device is orphans; the minor, revenants.

At this point my wife walked into my study and shook her head, noting that orphans are a very common narrative figure / trope and that I shouldn’t hold Star Wars responsible for killing off parents on such a scale that Obi-Wan Kenobi might very well feel a disturbance in the Force. If we begin with the fictional chronology, we have, of course, Anakin Skywalker, who is already sort of orphaned (the mother seems a pretty minor character) and who gets officially orphaned by the third episode of the series.

That leads us to the orphaning of both Luke and Leia, the former of whom is raised by an uncle who is never explained. We have to assume he’s either Anakin’s (older?) brother, who’s a complete bum for letting his mom rent his younger brother out to pay off debt, or he’s a great uncle by being the mom’s brother, or he’s just sort of an avuncular character that Obi-Wan knows, likes, and trusts with a kid because, well, Tatooine! (The SW need for desert planets is something we can discuss another time.)

So Darth Vader is an orphan, and, as it turns out, Luke isn’t really an orphan since he had a dad the whole time, but then dad gets killed, as does another orphan’s dad, Galen Erso, father of Jyn, and thus, as the father of a Star Wars hero, doomed to die. The heroine of The Force Awakens, Rey, is also an orphan, making her way through another desert world, Jakku, all alone, only to discover she has a bit of family, and, oh yeah, she may be the daughter of Luke?

While you may have your doubts about Jedi parenting (after all, one could argue that Qui-Gon and Kenobi do a terrible job of wrangling the terrible teenager Anakin, and the galaxy pays the price. Oh! What a millennial he is!), what you cannot doubt is that having a family means you aren’t going to be having a grand adventure any time soon. Granted, a character being orphaned is simply a way to dramatize that feeling of being alone that all of us encounter, and our loneliness may actually be considered a strength. But Star Wars has turned being orphaned into something like a fetish. If I were a kid in a galaxy far, far away, I’d want to get rid of my parents to increase my chances of getting in on some adventure, hopefully the evil-empire-ending kind of adventure, but whatever.

The minor fetish, er, regular plot device in Star Wars is, of course, the revenant. Darth Vader is our prime example, and was no one else torn between the sadness of old Kenobi remembering in A New Hope that Vader killed Luke’s father and the hacking at limbs of the young Kenobi? Maybe it was just me. The business of bringing people back from the dead was brought home to me when we got to witness Vader in one of the life support tubes, perhaps left over from the second or third Alien film, but it was highlighted even more watching the creepy CGI version of Peter Cushing. Wouldn’t another Grand Moff have done? He, or she, could simply have said, “When Grand Moff Tarkin gets here, he’s gonna be pissed.” This is something older films get right: oblique is better than the creepy computer zombie of a beloved character actor. That goes for zombie Princess Leia as well: just have a woman in the white costume glimpsed only from behind. The audience will get it. (Lucas, and now Disney, has never had much confidence in his audience.)

And how are they going to bring back Kylo Ren from the blowing up of the death planet thingy? He’ll come back, the same way that there’s some talk of the Rogue One characters re-appearing in the next Star Wars films as the Knights of Ren. Because why complicate things with new characters when you can recycle familiar ones?