Category Archives: Digital Humanities

The OED and EEBO TCP

Last Christmas, a friend who happens to be an antiquarian bookseller posted on Facebook an image of what he took to be the first recorded instance of the expression “merry Christmas” in print. The book in question was An Itinerary VVritten by Fynes Moryson Gent (1617).

A basic search on the Early English Books Online Text Creation Partnership database shows that there were several occurrences of this expression prior to 1617, the first being Nicholas Breton, A Floorish vpon Fancie (1577).

My friend protested that he was going by the Oxford English Dictionary, which does indeed give the 1617 work as the first occurrence of the expression with modern spelling:

[Image: OED entry for “merry Christmas”]

But, clearly, OED has got it wrong!

I recently used this example to kick-start a workshop on EEBO, and followed it with another example, this time one provided by Kenji Go, one of the attendees of the workshop. He did some work on the origin of the cosmic sense of space, published in Notes and Queries, showing that the earliest use of the word “space” to denote the place where the heavenly bodies are located predates the first usage cited in the OED (which at that time was given as Milton, 1667) by some 85 years. In response, OED has updated its entry:

[Screenshot: updated OED entry for “space”]

While reading through Professor Go’s work, and checking through the OED entry, I couldn’t help but notice the close link between “space” as “cosmos” (sense 8 in OED) and space as physical extent or area (sense 7), especially Shakespeare’s usage in Hamlet:

[Screenshot: OED entry for “space”, sense 7, citing Hamlet]

Once again, a check on EEBO TCP shows that OED has missed a number of earlier references to “infinite space”, the earliest being A Sermon of Saint Chrysostome (1542). The usage that particularly interested me was in Sermons of Master Iohn Caluin, vpon the Booke of Iob (1574), “behold the heauen is of infinite space in cōparison, & yet we see it is borne vp by the only power of God” (p. 494). It seemed to me that the concept of space as physical extent or area was morphing here into the concept of space as cosmos; “heaven”, as used here, is not an abstraction, an idealized world unknowable while we are in this world, but something we can “behold” and “see”, that is, the place where the sun and the moon and the stars are.

A search on the Swiss database of early modern texts shows that the translation follows Calvin’s French exactly: “Or voila le ciel qvi a vne espace infinie”.

[Image: Calvin’s French text, “espace infinie”]

The page on the Swiss database is located here. In the French, as in the English, heaven is described as being of an infinite space, rather than as being, in itself, an infinite space, but it does begin to look as if the English concept of space as cosmos either owes something to French usage or developed in tandem with it.

Either way, the basic point is that the earliest usage of particular words and phrases is not something I have made a particular study of. My research interests are quite different, and these examples – “merry Christmas”, the cosmic sense of space and the expression “infinite space” – just happen to have crossed my radar. Doubtless there are many more out there, and substantial revision of the OED is going to be needed in the light of the EEBO TCP database.

UPDATE:

OK, so the friend who put up the merry Christmas Facebook post now tells me he’s pushed “merry Christmas” back to 1534, in a letter from John Fisher to Thomas Cromwell.

[Image: Fisher’s letter to Cromwell, “merry Christmas”]

I note though that here “merry” only has one “r”. Still, with the growth of online databases we are more and more being forced to acknowledge that whatever we think we know about the early modern period is provisional. I guess we’ll have to wait until early modern manuscript material goes text-searchable to get the real dope!
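The one-“r” spelling is a reminder that a search for the earliest occurrence has to tolerate variant spellings. Here is a minimal sketch of how that might look in Python, assuming you already have a list of dated plain-text transcriptions to hand. The `Record` type, the `PATTERN` regular expression and the sample strings are all my own illustrative assumptions (the strings are invented, not quotations from any book), and none of this is part of any EEBO TCP interface.

```python
import re
from typing import Iterable, NamedTuple, Optional


class Record(NamedTuple):
    """One transcribed text: publication year, title, and plain text."""
    year: int
    title: str
    text: str


# Hypothetical pattern: accepts mery/merry/merie/merrie followed by
# Christmas/Christmasse/Crystmas and similar. Extend it as new spellings turn up.
PATTERN = re.compile(r"\bmerr?(?:y|ie)\s+ch?r[iy]ste?mass?e?\b", re.IGNORECASE)


def earliest_attestation(records: Iterable[Record],
                         pattern: re.Pattern = PATTERN) -> Optional[Record]:
    """Return the earliest-dated record whose text matches, or None."""
    hits = [r for r in records if pattern.search(r.text)]
    return min(hits, key=lambda r: r.year) if hits else None


# Invented sample data, for illustration only.
SAMPLE = [
    Record(1600, "Placeholder text A", "and so wished them a merry Christmas"),
    Record(1580, "Placeholder text B", "a merie Christmasse to you all"),
    Record(1540, "Placeholder text C", "God send you a mery Christmas"),
    Record(1500, "Placeholder text D", "no relevant phrase here"),
]

first = earliest_attestation(SAMPLE)
print(first.year, first.title)  # 1540 Placeholder text C
```

A match found this way is only a candidate: each hit still has to be read in context, and the date is only as good as the catalogue record behind it.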

Damned if we do! Using the EEBO TCP Database

Can we use the EEBO TCP database?

This looks like a no-brainer – what would be the use of the Early English Books Online Text Creation Partnership if we can’t use it? – but it’s actually something of a minefield. How often, I wonder, has work citing the database been met with a response like the following?

I somewhat distrust the author’s generalizations because several of them appear to come from typing keywords into the Early English Books Online searchable database.

The starting point of any online database research will inevitably be typing keywords. If that is wrong in itself then all the money that has been spent on creating databases has clearly been misspent! Equally, any research that does no more than type in keywords and report the results is hardly worthy of the name.

Let’s start by taking a look at an example of a keyword search and the follow-up work it entails:

[This is the third in a series of three videos I posted a few weeks ago on the use of the EEBO and TCP databases. The complete series of videos is here.]

It should be clear from this that searches of this kind are pretty gruelling. Typing in keywords is the starting point, but after that a wide range of variables needs to be taken into account, from variant spellings to differences in the number of books published within a particular genre during a particular period. And, crucially, the process involves checking the results of the searches to ensure that the occurrences really are valid examples of the particular usage one is interested in.
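Two of the steps just described lend themselves to a small sketch: allowing for the number of books published in each period, and pulling out the surrounding text so that each hit can be checked by eye. The Python below is a minimal illustration under my own assumptions. The function names, the ten-year periods and the sample years are all invented, and it works on plain lists of years and strings rather than on anything the database itself provides.

```python
import re
from collections import Counter


def hits_per_thousand_texts(hit_years, corpus_years, span=10):
    """Texts containing the term per 1,000 texts in the corpus, by period.

    hit_years: publication year of each text that contains the term.
    corpus_years: publication year of every text in the corpus.
    """
    def period(year):
        return year - year % span

    hits = Counter(period(y) for y in hit_years)
    totals = Counter(period(y) for y in corpus_years)
    return {p: 1000 * hits[p] / totals[p] for p in sorted(totals)}


def contexts(text, pattern, width=30):
    """Return each match with `width` characters either side, for manual checking."""
    return [text[max(0, m.start() - width): m.end() + width]
            for m in pattern.finditer(text)]


# Invented figures: 4 texts in the 1580s, 10 in the 1590s.
corpus_years = [1581, 1583, 1585, 1588] + list(range(1590, 1600))
hit_years = [1581, 1583, 1592, 1595, 1599]

print(hits_per_thousand_texts(hit_years, corpus_years))
# {1580: 500.0, 1590: 300.0}
```

In the made-up figures the 1590s have more hits in absolute terms (three against two), but a lower rate once the larger number of texts is allowed for, which is exactly the kind of distortion raw counts can hide.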

For me, that’s just the starting point, the spadework before getting down to the job of analyzing usage in particular contexts, relating that to source texts (a lot of my work is with translations, so I want to know what the original text said), checking the background and views of authors, placing the usage in the context of other related texts and so on.

I’m a texty kind of guy, so I’m less interested in the statistical stuff than in seeing the results in context, but the raw figures can sometimes be of interest. EEBO TCP is still incomplete, but it nevertheless offers a much bigger – and more representative – sample than, say, a MORI poll, and it is unlikely that the general pattern of discourse usage picked out in the video above will alter very much once the gaps remaining in the database have been filled.

Last summer (2013) I attended a conference on early modern digital humanities. I could have done with that kind of input before embarking on Pain, Pleasure and Perversity; I might have escaped some of the more obvious pitfalls. I only cite the database eight times in 235 pages, and I don’t think the few claims I made based on it are wrong to any substantial degree, but even so I can see, in hindsight, ways I could have tightened up my approach/presentation.

What really interests me, though, is the discovery that, in acknowledging my use of the database, I appear very much to have stuck my neck out. A search for “EEBO TCP” on Google Books currently purports to turn up some 450 results, though in fact it dries up after page 7, giving fewer than 70 results (does anyone know why this happens on Google?). Astonishingly (to me), my book appears on the first page (at the bottom)! Most of the other books on that first page are specifically on the use of online databases in early modern studies. Can I really be so unusual as a researcher working in the field and giving credit to the database?

Apparently, yes. I searched again, specifying publications since 2010, and there is only one page of results!

So what is actually happening here? Are scholars just not using the database? I don’t think so. The impression I get from talking to people at conferences, etc., is that early modernists are logging in at about the same rate as other people have hot breakfasts. Is this such a recent development that it is not yet fully reflected in print? Probably, to some extent. About three years ago, after I had been working solidly on the database for about three weeks in the Rare Books Room at Cambridge University Library, one of the librarians came up and asked me what I was working on. I showed him the database and he was astounded; Cambridge was affiliated to it, but none of the library staff even knew it existed! A couple of months later they held a seminar on it, but prior to that it seems not to have been on anyone’s radar; I certainly didn’t see anyone else using it.

American scholars appear to have been quicker off the mark. I would frequently notice a marked slowdown in download times in the middle of the afternoon, which would be about the time people in the US would be logging on.

I could be wrong about this, but what it looks like to me is that lots of people are using the database, but not many are acknowledging it. Top marks on that score to Bruce R. Smith (in Christie Carson and Peter Kirwan, eds, Shakespeare and the Digital World: Redefining Scholarship and Practice, CUP, 2014), who writes:

I didn’t even have to rely on my recollections of just where the passages I wanted were located. I could simply enter a keyword as a search term, and there the desired text would be on my computer screen, ready for cutting and pasting directly into my draft … What effectively connected me to the texts I wanted was not just my possession of a computer but my university’s subscriptions to EEBO and EEBO-TCP. (Pp. 24-5)

Even then, though, Smith’s main point is how he was brought back to the reality of the printed book when one of the texts he wanted to access wasn’t on the database.

Many others, I suspect, are being less than candid about their use of the database. I could have done the same. How smart I would have looked, with all that intimate knowledge of such a wide range of texts!

I’m glad I was up-front about it, though. I would be the first to agree that there is nothing quite like the printed book, and uses of the database that took me away from reading and analyzing text just wouldn’t interest me. But, as it did for Smith, the database ‘connected me to the texts I wanted’ (or to many of them), and enabled me to find out things about the early modern printed corpus that simply would not have been discoverable by any other means.

Using the Early English Books Online and Text Creation Partnership Databases

(This post contains the substance of a presentation I gave at the Annual Conference of the Shakespeare Society of Japan in October, 2014.)

For those who are not familiar, here is an introduction to the use of the Early English Books Online database (EEBO):

EEBO requires a log-in, but many – if not most – universities subscribe to the database and access can be gained through them. Another way to gain access is by joining the Renaissance Society of America, which includes access to EEBO in the membership package.

EEBO gives access to PDF files of early modern books. These files are not text-searchable. The Text Creation Partnership (EEBO TCP) gives access in a text-searchable form, and its use is explained here, using the public access portion of Eighteenth Century Collections Online (ECCO):

The following is an example of the kinds of methods that can be used to incorporate searches on the database into an early modern studies research programme:

These videos were made in October 2014, and part of the TCP database came into the public domain in January 2015. This means that there is now public access to some 25,000 early modern texts in text-searchable form. The text-searchable files do not correlate with PDFs in the way the files viewable by subscription do, and there is the rather serious disadvantage that numbering is not given for books numbered by signature rather than by page. But, provided one has access to the PDFs through the EEBO database, one can work around this, and it is still a valuable resource.
