One Meaning of “Statistical Analysis”

One of the things that interests me is all the ways that “statistical analysis” can be defined, even within the confines of a relatively nascent domain like text analytics. Of course, being nascent also means that things are not yet settled. Moreover, text analytics is emerging at the intersection of a number of fields. The differing assumptions about which dimensions of statistics, let alone mathematics, were applicable were quite striking at this year’s Culture Analytics program at UCLA’s Institute for Pure and Applied Mathematics.

Below is a recent request posted on The Humanist that I am capturing here as another entry in this area:

> The work will involve investigating the temporal relationships between
> spoken and gesture events, so experience with methods for conducting
> statistical analysis (correlation, t-test, anova, hypothesis testing) are expected.

In addition, the preferred workflow is as follows:

> Ideally, the work will be done in Python (ideally using pandas), but if people prefer using R, I’d be happy to hear from them.
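For a sense of what two of the listed methods look like in practice, here is a minimal sketch using only the Python standard library. The onset times are invented for illustration and the function names `pearson_r` and `welch_t` are hypothetical; a real project would more likely reach for pandas and SciPy, as the request suggests.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    diff = statistics.fmean(a) - statistics.fmean(b)
    return diff / math.sqrt(va / len(a) + vb / len(b))


# Invented onset times in seconds, purely for illustration.
speech_onsets = [0.5, 1.9, 3.2, 4.8, 6.1]
gesture_onsets = [0.3, 1.6, 3.0, 4.4, 5.9]

r = pearson_r(speech_onsets, gesture_onsets)
t = welch_t(speech_onsets, gesture_onsets)
print(f"r = {r:.3f}, t = {t:.3f}")
```

The correlation asks whether the two series move together; the t statistic asks whether their means differ by more than their spread would suggest. Turning `t` into a p-value needs the t distribution, which is where a library such as SciPy would come in.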

Simpson’s Paradox

In the middle of a great explanation of Simpson’s paradox comes this:

> fields of graduate study that are generally more crowded, less productive of completed degrees, and less well funded, and that frequently offer poorer professional employment prospects. [=humanities]

Thanks to Scott Weingart for the heads up. (Can’t find the original tweet right now.)
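For anyone who has not met the paradox before, a small worked example may help. The counts below are invented, and the two fields are hypothetical; the sketch only shows how a group can do better within every field and still do worse once the fields are pooled, because most of its applications go to the more crowded one.

```python
from fractions import Fraction

# Invented counts: (applicants, admitted) per group in two hypothetical fields.
applications = {
    "less crowded field": {"men": (80, 48), "women": (20, 14)},
    "more crowded field": {"men": (20, 4), "women": (80, 20)},
}


def rate(applied, admitted):
    """Admission rate as an exact fraction."""
    return Fraction(admitted, applied)


# Admission rates within each field.
per_field = {
    field: {group: rate(*counts) for group, counts in groups.items()}
    for field, groups in applications.items()
}

# Admission rates after pooling both fields.
overall = {}
for group in ("men", "women"):
    applied = sum(applications[f][group][0] for f in applications)
    admitted = sum(applications[f][group][1] for f in applications)
    overall[group] = rate(applied, admitted)

for field, rates in per_field.items():
    print(field, {g: float(r) for g, r in rates.items()})
print("overall", {g: float(r) for g, r in overall.items()})
```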

Comprehension in a Year

My goal is to be able to understand the software description below in a year’s time:

> The Altmann-Fitter is an interactive software for the iterative fitting of univariate discrete probability distributions to frequency data. It uses the Nelder-Mead Simplex Algorithm.
> In its present version it contains about 200 distributions and is one of the most voluminous collections of distributions at all. It aims at the analysis of data from all empirical domains, e.g. biology, economy, sociology, meteorology, ecology, linguistics, literary science, communication, technical sciences and production. It is indispensable for practitioners.
> Fitting is automatic, i.e. no initial estimators are necessary, and it improves iteratively. The goodness-of-fit test is performed by means of the chi-square test. A number of options and configurations enables the user to flexibly process data.
> Altmann Fitter runs under all Microsoft Windows® versions since Windows XP® and including Windows 8®. For best performance, the computer should be equipped with at least 512 MB of RAM. Different graphical outputs are available.

> Visit our web-site: (here you find a demo version and an user guide – free download).

Right now, I don’t. I understand, I think, the first sentence, but after that … not so much. This only confirms for me that statistics is foundational to being an effective part of the larger discussion.
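As a first step toward that goal, here is a rough sketch of what “fitting a discrete distribution to frequency data” and a chi-square goodness-of-fit test can amount to in the simplest case. It is not how the Altmann-Fitter works: the frequencies are invented, the distribution is a plain Poisson, and the single parameter is estimated directly from the sample mean rather than by an iterative Nelder-Mead search.

```python
import math

# Invented frequency data: observed[k] = number of units in which
# an event occurred exactly k times.
observed = [30, 36, 22, 9, 3]
n = sum(observed)

# For a Poisson distribution the maximum-likelihood estimate of the rate
# is the sample mean, so no iterative search is needed here.
lam = sum(k * f for k, f in enumerate(observed)) / n


def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Expected frequencies under the fitted model. The last class absorbs
# the tail (k >= 4) so that the expected frequencies sum to n.
expected = [n * poisson_pmf(k, lam) for k in range(len(observed) - 1)]
expected.append(n - sum(expected))

# Pearson's chi-square statistic: small values mean observed and
# expected frequencies are close.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"lambda = {lam:.3f}, chi-square = {chi_square:.3f}")
```

With five classes and one estimated parameter, the statistic would be compared against a chi-square distribution with three degrees of freedom. Distributions with several parameters and no closed-form estimate are presumably where an iterative optimizer earns its keep.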

A Little PMI

A little *pointwise mutual information* goes a long way, as Burr Settles makes clear in his discussion of the terms *geek* and *nerd*. How does he distinguish the noun from the verb, since *geek* is both (something he doesn’t really discuss and that I would, hmm, kinda geek out on)? He concludes:

> In broad strokes, it seems to me that geeky words are more about stuff (e.g., “#stuff”), while nerdy words are more about ideas (e.g., “hypothesis”). Geeks are fans, and fans collect stuff; nerds are practitioners, and practitioners play with ideas. Of course, geeks can collect ideas and nerds play with stuff, too. Plus, they aren’t two distinct personalities as much as different aspects of personality.

But you gotta really check out the graph. It’s really pretty astonishing.
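For reference, pointwise mutual information compares how often two things occur together with how often they would if they were independent: PMI(x, y) = log2( P(x, y) / (P(x) P(y)) ). The sketch below computes it from raw counts. The counts are invented and the `pmi` helper is hypothetical; this is not a reconstruction of Settles’s own calculation.

```python
import math


def pmi(joint_count, count_x, count_y, total):
    """Pointwise mutual information, in bits, from co-occurrence counts."""
    p_xy = joint_count / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log2(p_xy / (p_x * p_y))


# Invented counts over a hypothetical collection of 10,000 tweets.
total = 10_000
geek_tweets = 500   # tweets containing "geek"
word_tweets = 200   # tweets containing some other word of interest
both = 40           # tweets containing both

score = pmi(both, word_tweets, geek_tweets, total)
print(f"PMI = {score:.2f} bits")
```

A positive score means the word turns up alongside “geek” more often than chance would predict, a score near zero means the two look independent, and a negative score means they tend to avoid each other.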


Added Anxiety

As if concerns about the rising disparity between the haves and have-nots in the developed countries of the world, and most especially in the countries that led the industrial revolution, the U.K. and the U.S.A., were not enough, now comes a piece by Norm Augustine in Forbes magazine that adds to the anxiety: “America Is Losing Its Edge in Innovation.” Essentially, Augustine observes that while we still train many of the world’s scientists and engineers, they are no longer choosing to remain in the U.S.A. and work here. The result is that the application of their knowledge is happening elsewhere. Augustine blames American culture, parents especially, and educational institutions for not making it clear that science and engineering are great lives to lead. Along the way, he uses some interesting, if not particularly coherent, statistics:

  • U.S. consumers spend significantly more on potato chips than the U.S. government devotes to energy R&D.
  • In 2009, for the first time, over half of U.S. patents were awarded to non-U.S. companies.
  • China has replaced the U.S. as the world’s number one high-technology exporter.
  • Between 1996 and 1999, 157 new drugs were approved in the U.S. Ten years later, that number had dropped to 74.
  • The World Economic Forum ranks the U.S. #48 in quality of math and science education.

I’m with you, Mr. Augustine, that we have allowed, I don’t know, bankers and lawyers (and athletes) to become the praetorians of American capitalism, but I don’t think it’s education’s fault. I think the problem is much more complex and knotted, and it’s going to take the kind of serious long-view thinking that there doesn’t seem to be much interest in embracing at the moment. As a folklorist, I feel like much of my job is to take good notes and try to describe all this as best I can, in hopes of beginning to understand it over the next decade. As a citizen, I’m kind of okay with America “losing its edge,” if by edge we mean domination. I would rather see our fine country decline a bit, retreat from imposition of empire by might, and re-emerge as a world power in ideas and production of things that matter. As a parent, and someone looking at retirement twenty or so years from now, I don’t like that it is exactly during that span of time that my chance to save money for myself and my family is going to be most tested.

The Slow Wheel of Time

Yes, I work at a university, and so you would think that I’m some variety of liberal and that all my colleagues are liberals. At least that’s what some of the commentators on Fox News or on various radio shows would have you believe. The truth is that university faculty and staff come in about as wide a range of political persuasions as everyone else. It’s also the truth that university faculty and staff tend to lean toward what is considered the left in American politics. But that makes sense, doesn’t it? Like teachers and nurses and police officers and firefighters, they have decided that serving the public good is more important than making a lot of money. They have a different version of *richness* and *rewards*. That should be allowable, no?

At the same time that I don’t see the point in conservatives hectoring liberals, I also don’t see the point in liberals hectoring conservatives needlessly — in fact my chief concern with political discourse in our era is that it seems to be about winning and not governing.

Myself, I am a moderate with liberal leanings on social issues and conservative leanings on fiscal issues. After all, I don’t make that much money; I’d like to keep what little I have. (And, for the record, I think businesses, which are oriented toward a private good, should be regulated in view of the public good. That’s just plain old-fashioned common sense.)

I am also more prone to patience and to objectivity than a lot of my friends on either side of contemporary politics, mostly because politics in our time is just that: in our time. What the future holds is sometimes beyond the reach of politics, political discourse, and politicians. This illustration makes that case awfully well. What it reveals is that acceptance of gay marriage, even in the historically very conservative South, is inevitable in many ways. As older voters who are more troubled by it decline in terms of voting strength (which is a polite way of saying *die*), the electorate will increasingly move to the so-called “left” on this particular issue because, well, younger voters already are on the left on this issue, according to the data on which this illustration is based.

[Figure: a visual illustrating support for gay marriage by state and age]