[Göttingen Centre for Digital Humanities is hosting a hackathon] focused on automatic detection of various kinds of text re-use. As you might imagine, text re-use comes in a wide variety of forms: “Text re-use can take the form of an allusion, a paraphrase or even a verbatim quotation, and occurs when one author borrows or re-uses text from an earlier or contemporary author.” Most of these re-uses are *intentional*. Scholars of text re-use also have a category of *unintentional* re-use, which, from a folklorist’s point of view, seems fairly familiar: “Unintentional text re-use can be understood as an idiom or a winged word, whose origin is unknown and that has become part of common usage.” Winged words seem to be a particular form of traditionalizing, since they are “words which, first uttered or written in a specific literary context, have since passed into common usage to express a general idea—sometimes to the extent that those using them are unaware of their origin as quotations” (okay, [Wikipedia]).
Curiously, there doesn’t seem to be any interest in, or awareness of, words or phrases that are uttered within the vernacular domain, become widespread in usage, and achieve stickiness purely that way, or that even get captured in a literary text. There is, however, a lovely illustration by Marco Büchler that graphs out the various possible kinds of text re-use:
[Göttingen Centre for Digital Humanities is hosting a hackathon]: http://etrap.gcdh.de/?p=669