DLSS has a new lab! In late September, under the roof of the Stanford Media Preservation Lab located at SULAIR's site on Page Mill Road, we installed equipment to support the digitization of video collections held at Stanford Libraries. Two digitization workstations, a host of analog video tape players and supporting system components, and tools for cleaning and repairing aging videotapes and other recording media are installed and in production.
Reposted from the Special Collections and Archives Exhibits Program listing:
The Monuments of Printing Exhibition highlights first 250 years of printing in the West
Johannes Gutenberg's printing of a Bible from movable type in Mainz, Germany, in 1455 marked the beginning of a communication revolution in the West. Printers were able to reproduce texts efficiently in quantities virtually unimaginable to a scribe. Monuments of Printing: from Gutenberg through the Renaissance, the first of two exhibitions spanning five hundred years of printing history, demonstrates the development of typography and printing in Europe over a 250-year period as seen in selected works in the rare book collections of the Stanford University Libraries. The exhibition will open Monday, August 1, in the Peterson Gallery and Munger Rotunda on the second floor of the Bing Wing of Green Library, Stanford University, and is free and open to the public.
I recently attended a workshop of the KEEP project (Keeping Emulation Environments Portable) in Rome. KEEP is an EU-funded project to develop software that virtualizes old computer hardware and software environments. This allows you to run old operating systems, and the applications that were designed for them, on modern computers. KEEP is a multi-partner project that includes a consortium of national libraries (BNF, Koninklijke Bibliotheek), the University of Portsmouth, a computer history museum (Computerspiele Museum), commercial partners (Tessella), and the European Game Developers Association.
The project is scheduled to end in February 2012 and has already released software version 1.0.0 on SourceForge (http://emuframework.sourceforge.net/). This version supports:
* 5 platforms: x86, C64, Amiga, BBC Micro, Amstrad
* 6 emulators included: Dioscuri, Qemu, VICE, UAE, BeebEm, JavaCPC
* 22 file formats supported: PDF, TXT, XML, JPG, TIFF, PNG, BMP, Quark, ARJ, EXE, disk/tape images and more
* Integration with the FITS format identification tool
* Web services for software and emulator archives
We've been examining whether or not to restore stopwords to the SearchWorks index. Stopwords are words ignored by a search engine when matching queries to results. Any list of terms can be a stopword list; most often the stopwords comprise the most commonly occurring words in a language, sometimes limited to function words (articles and prepositions, as opposed to verbs and nouns).
The original usage of stopwords in search engines was to improve index performance (query matching time and disk usage) without degrading result relevancy (and possibly improving it!). It is common practice for search engines to employ stopwords; in fact Solr (http://lucene.apache.org/solr), the search engine behind SearchWorks, has English stopwords turned on as the default setting.
In our implementation of SearchWorks, there was no compelling reason to change most of the default Solr settings; thus, since SearchWorks's inception we have been using the following stopword list: a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, s, such, t, that, the, their, then, there, these, they, this, to, was, will, with.
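To make the effect of this list concrete, here is a minimal Python sketch of stopword filtering. It is only an illustration: the tokenizer is a deliberate simplification, the function names `tokenize` and `analyze` are made up for this example, and none of it is Solr's actual analysis code.

```python
import re

# The stopword list quoted above.
STOPWORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or s "
    "such t that the their then there these they this to was will with".split()
)


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit.

    This is a simplification for illustration, not Solr's tokenizer.
    """
    return [token for token in re.split(r"[^a-z0-9]+", text.lower()) if token]


def analyze(text, use_stopwords=True):
    """Return the tokens that would remain available for matching."""
    tokens = tokenize(text)
    if use_stopwords:
        tokens = [token for token in tokens if token not in STOPWORDS]
    return tokens


if __name__ == "__main__":
    for query in ("To be or not to be", "The Who", "Gone with the Wind"):
        print(query, "->", analyze(query), "vs", analyze(query, use_stopwords=False))
```

With stopword filtering on, a query such as "To be or not to be" is left with no terms at all, and "The Who" is reduced to the single term "who". With filtering off, every word in the query remains available for matching.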
What follows is an analysis of how stopwords are currently affecting SearchWorks, and what might happen if we restore stopwords to SearchWorks, making every word significant for every search.
The Digital Production Group is very excited about an upcoming project featuring the personal papers of "Laura Bassi, a noted 18th-century Italian scientist and Europe's first female professor," with Project Manager Cathy Aster at the helm.
More information to come, but in the meantime take a look at this recent article in the Stanford University News.
The (meta)data underneath SearchWorks is largely based on our MARC records from Symphony. MARC records are exported from Symphony, then slurped up by an application called SolrMarc, which transforms the MARC data into an index for the Solr search engine used by SearchWorks.
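As a rough picture of what "transforming MARC data into an index" means, here is a small Python sketch that maps a few MARC fields to a flat document of the kind a search engine could index. It is not SolrMarc's code or configuration (SolrMarc is a Java application). The function name `marc_to_solr_doc`, the `FIELD_MAP` table, and the index field names `title_display` and `author_display` are assumptions made up for this example; the only facts borrowed from MARC are that 245 $a holds the title and 100 $a holds the main personal author.

```python
# Hypothetical mapping from (MARC tag, subfield code) to an index field name.
FIELD_MAP = {
    ("245", "a"): "title_display",   # MARC 245 $a: title proper
    ("100", "a"): "author_display",  # MARC 100 $a: main entry, personal name
}


def marc_to_solr_doc(record_id, fields):
    """Build a flat index document from (tag, subfield code, value) triples.

    Subfields that are not listed in FIELD_MAP are ignored.
    """
    doc = {"id": record_id}
    for tag, code, value in fields:
        index_field = FIELD_MAP.get((tag, code))
        if index_field:
            # Strip the leading/trailing punctuation that MARC cataloging
            # conventions often leave on subfield values (e.g. "Title /").
            doc.setdefault(index_field, []).append(value.strip(" /:,"))
    return doc


if __name__ == "__main__":
    record = [
        ("245", "a", "Moby Dick /"),
        ("100", "a", "Melville, Herman,"),
        ("260", "b", "Harper"),  # not mapped, so it is dropped
    ]
    print(marc_to_solr_doc("a1", record))
```

The real pipeline handles far more fields and many special cases; the sketch only shows the general shape of the step, which is structured MARC in and a flat set of named index fields out.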
SolrMarc is open source software made available by Bob Haschart of the University of Virginia Libraries. SolrMarc is used by all(?) VuFind sites as well as most Blacklight sites built on MARC data (e.g. SearchWorks). SolrMarc has been great for us -- it gave us an enormous jump start for SearchWorks. Bob is also a great guy, and made me a "committer" almost immediately -- so I can make contributions to the open source code.
Open Source Software does best when there is a critical mass of developers: group wisdom rocks, as does sharing the work. To date, SolrMarc is very much Bob's project, despite a number of committers such as myself. There are some ... interesting ... practices as to how SolrMarc is organized and how it is tested. I've even contributed a bit to some of its squirreliness. Occasionally, changes to the SolrMarc codebase break the code I've written especially for Stanford.