Mark Davies of BYU announced that the NOW (News on the Web) Corpus has arrived at its own word of the year: “Based on 1.7 billion words of text from 2017 in NOW, our Word of the Year for 2017 is fake news (more info), followed by the related phrase alternative facts.”
I am most drawn to the early entries, and I am especially curious about the gaps at various points in the timeline. Click the link at the bottom of the iframe to go to the page itself and see the timeline fill your browser window. (Sorry for the paned version below, but it’s a preview.)
Sometimes I do feel like [corpus linguists have already answered my question](http://andreadallover.com/2013/08/25/unsurprisingly-corpus-linguists-have-already-answered-your-question/), but sometimes I feel like they didn’t ask very interesting questions in the first place, and that’s why I continue to stumble along, covering ground they have already covered. My best example of this? The corpus linguistic notion of *lexical diversity* doesn’t seem very interesting — at least not in the forms I have encountered it. I am trying to use it as a way to think about the nature of speech genres. (Please don’t ask me to say more: I don’t have any answers at all. Only questions that CL hasn’t answered for me.)
I had never heard of [corpus pattern analysis][cpa], but the description sounds interesting: “The basic principle of this work is to attach meanings to patterns of usage (“constructions” or words in context), rather than to words in isolation.” Sounds like folklore to me. See [Patrick Hanks’ page][ph] for more information.