Last week Science published an article introducing the term “culturomics”: the quantitative study of cultural trends. By constructing a database out of the 15 million books that Google have digitized so far, a Harvard-based research team led by Jean-Baptiste Michel have created a powerful searchable tool which makes it possible to generate quantitative data for analysing cultural trends. As they state in the abstract:
We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of ‘culturomics,’ focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
In short, this is a tool which has the potential to revolutionise research methods in a vast number of fields. The best part: Google Labs have made the tool (the Ngram viewer) publicly available. Before I had even started reading the article, I found myself thinking about a number of applications for my own research and field. Below follow some rough examples, and preliminary results which already seem to challenge established knowledge in the history of esotericism.
As the authors of the Science article note, the tool can be used to do a number of interesting things. The basic idea is this:
- Together with Google, libraries and publishers across the world have digitized a total of 15 million books (about 12 % of everything that has been published since the invention of the printing press), using optical character recognition (OCR) technology;
- These books have been provided with metadata;
- From this pool, the researchers created a data set of 5 million books (ca. 4 % of all books ever published), selected on the basis of the quality of the metadata and character detection. Due to these criteria, the data set is much more complete and representative for books published after 1800, and for books published in English (although German, French, Chinese, Russian and Hebrew books have also been given their own corpora).
- Finally, they developed a powerful tool for computational analysis, which can create frequency ratios for any specific word or cluster of words, or “n-grams” (one word = 1-gram; two words, 2-gram; etc.). The tool can divide the number of occurrences of a certain n-gram by the total number of words published in any given year, and thus find out how frequent it is.
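The frequency calculation in the last step can be illustrated with a minimal sketch. This is not the researchers' actual implementation: the function names, the `toy_corpus` dictionary and the naive whitespace tokenisation are all assumptions made for illustration. It only shows the basic ratio, occurrences of an n-gram in a year divided by the total number of words published that year.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams in a token list, each joined into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to (occurrences of phrase) / (total words that year).

    `corpus` is a hypothetical dict of year -> list of book texts.
    Tokenisation is a plain lowercase whitespace split (an assumption;
    a real pipeline would handle punctuation and OCR errors).
    """
    target = phrase.lower().split()
    n = len(target)
    key = " ".join(target)
    result = {}
    for year, texts in sorted(corpus.items()):
        hits = 0
        total_words = 0
        for text in texts:
            tokens = text.lower().split()
            total_words += len(tokens)
            hits += Counter(ngrams(tokens, n))[key]
        # Guard against years with no text at all.
        result[year] = hits / total_words if total_words else 0.0
    return result


# Invented toy data, purely to exercise the function.
toy_corpus = {
    1830: ["religion and the state", "on religion"],
    1990: ["sex and the city", "religion today"],
}

for year, freq in yearly_frequency(toy_corpus, "religion").items():
    print(year, round(freq, 4))
```

Dividing by the yearly word total rather than reporting raw counts is what makes different years comparable, since far more books were published in 1990 than in 1830.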
And that’s when the fun begins. Running multiple searches allows for comparisons, and choosing the time frame makes it possible to analyse trends.
Obviously this can be used for interesting research, for fun, or both. What better way to illustrate secularisation than by the graph below, for example?
From being a very frequent word in the 1810s and 1830s, “religion” has gradually lost popularity in the world of books, until, in the 1980s, it was finally surpassed by the word “sex”. Or is it quite so final? Apparently, from the year 2000, religion is again on the increase, while sex is getting less frequent (the word, that is). Is secularisation giving way to desecularisation (as Peter Berger would have it in 1999) and de-sexualisation (whatever that means)? Perhaps a more likely explanation is that more of the sexual discourse has moved away from print media, towards online forums – although that should also be the case for religion, which is still increasing in print (the most significant increase since the 1940s, at least).
And this was just a warm-up. What about applying the tool somewhat more seriously to my own sub-field, esotericism? One can do lots of interesting things just by checking the frequencies of the word itself.
Quite expectedly, the word “esotericism” has never been more frequent than during the past two decades. This may be explained by two parallel trends: first, the professionalisation of the academic study of esotericism (from ca. 1990), which has led to the production of many books and articles on the subject; second, the still growing “alternative” and “spiritual” publishing business, which often makes use of the word.
It is a little more surprising to notice that there was a drastic fall just before the year 2000, and that the frequency peak so far was in 1995. At least as far as the academic literature is concerned this is unexpected, since the field has certainly grown and become much more productive, visible and established from that moment until today. In other words, something else must be going on. Perhaps the rise of the academic study of esotericism coincided with a turn away from that term (towards one of the many alternatives) on the part of new agers, occultists, pagans, and alternativites? I don’t know, but searches for the corresponding German and French words show the same pattern, with German dropping only a few years later (figures below).
If we look backwards, still other relevant questions arise. For example, the genealogy of the term “esotericism” has been the object of research over the years (mentioned here previously), and the tool is well equipped to help researchers in the future. Indeed, it is especially in this kind of painstakingly precise work that the “culturomic” tool may revolutionise the way we work. Research that could previously fill a four-year PhD project can now be done in 20 minutes on a laptop.
It was for a long time thought that the first instance of a modern noun for esotericism was the French “ésotérisme”, appearing in Jacques Matter’s Histoire critique du gnosticisme from 1828 (Laurant 1992: 19; cf. Hanegraaff 2010). Last year it was shown that earlier instances indeed existed in the German “Esoterik” and the corresponding “Esoteriker” (Neugebauer-Wölk 2010). In English, the term is known to have been popularised with Theosophy as late as the 1870s and 1880s, but not much systematic philological scholarship exists.
Three searches and three minutes is all it takes to get a better picture.
In the French, it turns out that Jacques Matter is predated by two other references to ésotérisme. Both are (coincidentally, it would seem) from 1811: one in the second volume of Pierre Leroux’s De l’humanité, de son principe et de son avenir, the other in volume 9 of Henri Martin’s Histoire de France. Both references use esotericism dismissively about features of religion the authors don’t like: the esotericism of the Essenes and Pharisees in the case of Leroux, and that of the Papacy in the case of Martin (although the latter is more ambiguous, distinguishing between the esotericism of the “ancient Orient” and the “negative esotericism” of the “sceptical philosophers”).
When we try to search for the German “Esoterik”, we find that the word is not only much more frequent in the German data set than in the French during this period (i.e. the early 1800s), but also that it appears in a number of different sources already in the 1780s, confirming Monika Neugebauer-Wölk’s recent findings. The earliest German reference that we find is “Esoteriker” (“esotericist”) rather than Esoterik, appearing in association with Pythagoras (and apparently as synonymous with “Mathematiker”), in the first volume of Christoph Meiners, Geschichte des Ursprungs, Fortgangs und Verfalls der Wissenschaften in Griechenland und Rom (1781). The second reference, in vol. 2 of Archiv für Freimäurer und Rosenkreuzer, by Konrad Friedrich Uden, seems derived from the prior, because here as well we find “Esoteriker oder Mathematiker” connected to Pythagoras. And that is it. Similar references to esotericists and esotericism keep popping up in German histories of philosophy, being joined by theological literature in the early 19th century, until, by 1840, there is even a dictionary entry for “Esoteriker” in Vollständiges Wörterbuch der deutschen Sprache.
Here, however, we encounter a weakness, because Neugebauer-Wölk actually found references that were even earlier, references that do not show up in the database search. This reminds us that the tool is not yet perfect; in particular, we should recall that the Harvard researchers had already warned that the corpora are less accurate and complete before 1800, and especially so in the non-English corpora. Among other things, this has to do with the difficulty of applying OCR technology to these older prints; it seems that the letter recognition runs into particular trouble with the Gothic script used in German publications of this era. Indeed, making a separate search for the titles which Neugebauer-Wölk had found (e.g. Meiners’ Revision der Philosophie, 1772) shows that the books have been digitised and registered, but that they are not yet searchable. When the technology has improved, and the number of pre-1800 books increased, we can only imagine what kinds of findings may be achieved.
Returning now to English, we find no reference to the noun “esotericism” before 1838, when it shows up in a letter to the editor of The Christian Observer. The author connects esotericism to exotericism and uses it as a derogatory term. Yet another sense is found in the surprising second reference to “esotericism” in the English language. It appears in an article in the Quarterly Review in 1842, on the highly esoteric topic of gardening:
“To produce new seedling varieties of one’s own, by hybridizing and other mysteries of the priests of Flora, is indeed the highest pleasure and the deepest esotericism of the art.”
After this there are several other references, mostly in political history (“the esotericism of the High Whig Party”) and yet more theological polemical literature, until the occultist literature starts growing in the 1880s.
This quick analysis should already make it clear that central research questions in the field of esotericism can benefit greatly from the new culturomic tool, brought to you by Google Inc.
I will be back with more examples of relevant uses in a later installment.
- Jean-Baptiste Michel, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak and Erez Lieberman Aiden. “Quantitative Analysis of Culture Using Millions of Digitized Books”. Science, 14 January 2011: Vol. 331 no. 6014, pp. 176-182.
- Hanegraaff, Wouter J. 2010. “The Birth of Esotericism from the Spirit of Protestantism”. Aries 10.2 (2010), pp. 197-216.
- Laurant, Jean-Pierre. 1992. L’ésotérisme chrétien en France au XIXe siècle. Lausanne: L’Age d’Homme.
- Neugebauer-Wölk, Monika. 2010. “Der Esoteriker und die Esoterik: Wie das Esoterische im 18. Jahrhundert zum Begriff wird und seinen Weg in die Moderne findet”. Aries 10.2 (2010), pp. 217-231.
UPDATE: It has surfaced that the two French references are in fact later than 1811. For a full update and correction, see the new errata post.
This work by Egil Asprem was first published on Heterodoxology. It is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.