As part of our hand editing of the TED talk data, we had to retrieve missing information for, luckily, a small subset of the speakers. This meant Kinnaird splitting off two CSVs, one for the TED main event speakers and one for the other TED-sponsored event speakers, and then me trudging row by row and cell by cell, working back and forth between the CSV and a web page. Copy and paste, copy and paste, and two CSVs filled in. Yes.

Then it was time to fold these filled-in rows back into the main CSVs from whence they came. Each smaller CSV had between 15 and 20 rows, so it didn’t seem like a task worthy of firing up a Python session and writing something in pandas to swap the rows with missing information for the filled-in rows.

I started doing the work by hand: copy a row from the missing.csv, paste it below the matching row in the speakers.csv, and then delete the matched row. Oi! Sure, it was only 17 rows, but still, there has to be a somewhat faster way!

So I decided to merge the two files using cat and then simply find the dupes in Easy CSV Editor and delete the rows with missing data. Semi-automated?
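
For the record, here is roughly what a fully scripted version might look like. This is a sketch, not what I actually ran: it uses Python's built-in csv module rather than pandas, and the `speaker` key column, the `occupation` column, and the sample rows are all made up for illustration. The idea is the same as the cat-and-dedupe routine, though: wherever a speaker appears in both files, keep the filled-in row.

```python
import csv
import io


def merge_filled_rows(main_file, filled_file, key="speaker"):
    """Replace rows in main_file with the same-key rows from filled_file.

    Both arguments are open file-like objects holding CSV text with a
    header row. Row order follows main_file. The default key column
    name is a hypothetical stand-in for whatever identifies a speaker.
    """
    # Index the filled-in rows by their key so lookups are cheap.
    filled = {row[key]: row for row in csv.DictReader(filled_file)}
    # Keep each main row unless a filled-in row exists for the same key.
    return [filled.get(row[key], row) for row in csv.DictReader(main_file)]


# Made-up sample data standing in for speakers.csv and missing.csv.
speakers_csv = io.StringIO(
    "speaker,occupation\n"
    "Ada Example,\n"
    "Ben Sample,Designer\n"
)
missing_csv = io.StringIO(
    "speaker,occupation\n"
    "Ada Example,Engineer\n"
)

merged = merge_filled_rows(speakers_csv, missing_csv)
for row in merged:
    print(row["speaker"], "|", row["occupation"])
```

With real files you would pass in the handles from `open(..., newline="")` instead of the `io.StringIO` objects, and write the result back out with `csv.DictWriter`. For 17 rows, though, the semi-automated route is probably still less typing.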