During a recent Supreme Court oral argument in ZF Automotive US, Inc. v. Luxshare, Ltd., counsel brought up a corpus linguistics article that discussed the statutory term at issue: “foreign tribunals.” Chief Justice John Roberts was dubious: “Yeah, I don’t quite know what to make of that. That’s – that’s something new. I mean, have we relied on that source before?” But Justice Amy Coney Barrett explained that, although “the Court has never used the Corpus Linguistics database before,” other courts had, and the Supreme Court itself had conducted similar informal surveys, including in an opinion written by the Chief Justice himself.
It remains to be seen whether (and how) the Supreme Court will address corpus linguistics later this summer, but this method of statutory interpretation is already appearing in other courts. In Health Freedom Defense Fund, Inc. v. Biden, Judge Kathryn Kimball Mizelle included corpus linguistics in her analysis of the “traditional tools of statutory interpretation.” Although corpus linguistics was one tool among many in that case, the court’s sua sponte corpus linguistics analysis highlights the need for attorneys to be familiar with this tool so they can either harness it or counter it successfully.
Why should we care? Aren’t dictionaries good enough?
Dictionaries have their uses, but they are not designed to identify a word’s ordinary meaning. To take just one example, Webster’s Third, favored by lawyers for its descriptive rather than prescriptive definitions, orders the senses of each word by “earliest ascertainable meaning.” In many cases, the earliest meaning is not the ordinary current meaning, and it may not be in common use at all.
So, what can dictionaries tell you? Dictionaries provide various possible definitions of a word. Sometimes, that is enough. If opposing counsel is arguing that a word means something that the dictionary does not include, then citing a dictionary makes sense. But what happens when the court decides that the same word has competing senses, as in Health Freedom Defense Fund, and you need to know which sense is the more ordinary? That is the function of corpus linguistics.
How do you conduct a corpus linguistics search?
Judge Mizelle’s order shows how one judge incorporated corpus linguistics analysis into an opinion. To run a search yourself, start by deciding which database to search. Once that decision is made, the search itself is similar to searching a legal database such as Westlaw.
The database that Judge Mizelle searched is called the Corpus of Historical American English (COHA). That database contains 475 million words from fiction, popular magazines, newspapers, and non-fiction books published between 1810 and the 2010s. COHA won’t have much to tell you about a legal term of art, but it’s the right tool for illuminating the ordinary meaning of a term in a given historical period.
Within that database, Judge Mizelle performed a simple search for the word “sanitation” and looked at the context in which each of those results appeared. The results of a corpus linguistics search are similar to the results of a Westlaw search, displaying your search terms as they appear in the broader context of a certain document. The difference is that Westlaw is searching court opinions and the corpus linguistics database is searching newspapers, books, transcripts of television shows, etc.
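For readers who want to see the mechanics, here is a minimal Python sketch of that kind of keyword-in-context display. It is not COHA's actual interface or Judge Mizelle's method. The function name `concordance` and the two sample sentences are invented for illustration, and a real corpus search would run over millions of words rather than two sentences.

```python
import re

def concordance(text, keyword, width=5):
    """Return each occurrence of keyword with up to `width` words of context per side."""
    # Split the text into simple word tokens (letters and apostrophes only).
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Bracket the keyword so it stands out from its context.
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Hypothetical sample text, invented for illustration (not drawn from COHA).
sample = (
    "The city improved sanitation by building new sewers. "
    "Good sanitation keeps a kitchen clean and free of pests."
)

for line in concordance(sample, "sanitation", width=4):
    print(line)
```

Each printed line shows one use of the word with a few words on either side, which is the raw material a reader then reviews to decide which sense of the word is in play. In this simplified version the context window ignores sentence boundaries.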
More advanced searches are also possible. For example, you could analyze how two words typically relate to each other. To offer an overly simplified example, say that a statute reads: “Do not sit on the bank.” It is unclear from any surrounding context whether the statute refers to river banks or financial banks. Both senses of “bank” appear in the dictionary. If this were a legal question, you might enter a Westlaw search for something like “bank /5 sit.” In a corpus, you can perform a collocate search to look at phrases where “bank” appears within five words of “sit.” Either Westlaw’s legal database or COHA’s ordinary language database would show you similar results: People do not ordinarily talk about sitting on a financial bank.
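The “bank” and “sit” example can be sketched in a few lines of Python. This is only an illustration of the idea behind a proximity search, under stated assumptions: the function name `within_window` and the sample sentences are hypothetical, the matching is exact (so “sat” or “sitting” would not count), and an actual corpus tool offers its own search interface rather than this code.

```python
import re

def within_window(sentences, word_a, word_b, window=5):
    """Return the sentences in which word_a appears within `window` words of word_b."""
    hits = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        positions_a = [i for i, t in enumerate(tokens) if t == word_a]
        positions_b = [i for i, t in enumerate(tokens) if t == word_b]
        # Keep the sentence if any pair of positions is close enough.
        if any(abs(i - j) <= window for i in positions_a for j in positions_b):
            hits.append(sentence)
    return hits

# Hypothetical sentences, invented for illustration (not drawn from any corpus).
sentences = [
    "We like to sit on the bank and watch the river.",
    "She went to the bank to deposit a check.",
    "Children sit quietly on the grassy bank of the stream.",
    "The bank raised its interest rates again.",
]

for hit in within_window(sentences, "bank", "sit", window=5):
    print(hit)
```

The code only finds the passages. Deciding whether each hit refers to a river bank or a financial bank remains a matter of reading the results in context, just as it would be with a Westlaw proximity search.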
Perhaps, like Chief Justice Roberts, you are still scratching your head about corpus linguistics, and especially about how to use it yourself or counter it if opposing counsel (or the court) brings it up. Even if the method is new to you, incorporating corpus linguistics into your practice might not be as challenging as you think. In some respects, the process is as familiar and simple as a Westlaw search. You may even have been using corpus linguistics methodology informally, as the Chief Justice has, without having a specific word for it. In a future post, I will discuss how that analysis might end up in front of the court, whether on your own initiative or at the prompting of opposing counsel or the court, and what to do with it when it does.