
Providing large-scale text corpora for research
The Stanford RegLab and the Stanford Literary Lab have both been processing and analyzing large text corpora for many years. Both recently received a batch of OCR content from Stanford Libraries, thanks to work that DLSS has undertaken to retrieve the digital files of more than 3 million items from the Stanford Libraries catalog that were scanned by Google.