From at least one perspective, if not many more, I’m an old man and a somewhat established scholar. What could be more bizarre than wanting to quit everything else and spend six months learning all the math you wish you already knew? Apparently, I am not alone. Warren Henning, a software engineer, is doing much the same, and he explains his motives as follows in an essay on Medium:
To me, math is raw, untapped power. Statistics is helpful in computer programming, period. My dream is to learn the statistics, probability, and linear algebra needed to really understand machine learning and computer vision, which has had a major spurt of activity in the past 5–7 years. To realize this goal, I need a solid foundation so that I can truly understand what’s going on: why something works, when it won’t work, and what to do differently if it doesn’t.
OpenRefine is a “tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases.” The link takes you to a page with lots of video tutorials. There is also Thomas Padilla’s Getting Started with OpenRefine.
Oliver Elliott has a pithily written introduction to the Unix command line. It occasionally serves as my cheatsheet when I need to remember how to use a command that I don’t use very often, but it can also serve as a primer on how to do things on the command line. And one of the greatest things about it, in this era of page views, is that it is all on one page. Nice.
There’s a new visual programming interface (language?) for text analysis in town and it’s Orange Textable: “Orange Textable is an open-source add-on bringing advanced text-analytical functionalities to the Orange Canvas visual programming environment (itself open-source). It essentially enables users to build data tables on the basis of text data, by means of a flexible and intuitive interface.” Looking through the documentation, it reminds me of something like the MEANDRE/SEASR infrastructure/application setup from the NCSA (National Center for Supercomputing Applications) a few years ago. (The project has disappeared from both the NCSA and the I-CHASS sites.)
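To make “build data tables on the basis of text data” a little more concrete, here is a minimal sketch in plain Python. It does not use Orange Textable or its API; the function name `build_table` and the two sample texts are made up purely for illustration, and the tokenization is deliberately crude.

```python
from collections import Counter
import re


def build_table(texts):
    """Return (vocabulary, rows): one row of word counts per input text."""
    # Lowercase each text and split it into rough word tokens.
    counts = [Counter(re.findall(r"[a-z']+", t.lower())) for t in texts]
    # The table's columns are every distinct word, in alphabetical order.
    vocab = sorted(set().union(*counts))
    # Each row holds one text's count for each column (0 if the word is absent).
    rows = [[c[w] for w in vocab] for c in counts]
    return vocab, rows


# Hypothetical sample texts, not taken from any real corpus.
texts = ["The cat sat.", "The cat saw the dog."]
vocab, rows = build_table(texts)

print(vocab)
for row in rows:
    print(row)
```

Each row is one text and each column one word, which is roughly the kind of table a visual tool would let you assemble by wiring widgets together rather than writing the loop yourself.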