Improvements to SearchWorks in Summer and Fall 2018
During a four-month span between August and November of 2018, an interdepartmental team from the Stanford Libraries made a series of improvements to SearchWorks, Stanford’s online catalog and discovery system. The improvements touch nearly every aspect of SearchWorks, which is an essential tool for Stanford faculty and students in support of research and instruction. The work described below was carried out by experts drawn from several departments in the libraries. Special thanks to Chris Beer, Kam Chan, Shelley Doljack, Alissa Hafele, Christina Harlow, Jessie Keck, Jack Reed, Sarah Seestone, Camille Villa, Jennifer Vine, and Drew Winget.
The work was divided into two broad categories:
- Back-end work to improve performance, stability and monitoring
- Front-end work adding new user-facing features
Performance, stability, and monitoring
SearchWorks is used by faculty, students, and visitors from all corners of the globe, at all hours of the day and on every type of web-enabled device. Users rely on SearchWorks to be available 24 hours per day, 7 days a week as they search the catalog for materials critical to their coursework, dissertation research, or publication deadlines. The 21st century web searcher also expects results to be returned nearly instantaneously. With over 9 million catalog records, 300 million articles and ebooks, and half a million digital objects available in SearchWorks, maintaining this stability and performance is no small task.
Updates to the SearchWorks index
During the summer of 2018, the SearchWorks engineering team focused on updating the system’s backend, and specifically the search index. The index is the data store that contains all of the libraries’ various resource records, and from which results are returned when you enter search terms. The technology behind the index is called Solr, a popular open-source technology used by commercial and nonprofit websites worldwide to drive search. The team successfully updated the version of Solr used by SearchWorks to the latest version (7.5 at this writing). SearchWorks had previously been running Solr 4 without regular maintenance, which added to performance and security concerns. By being on the latest version of Solr, SearchWorks benefits from the most recent advances in performance and security, as well as new features that improve the quality of search.
In addition to this major upgrade, the team also moved the SearchWorks index to a new platform called SolrCloud that is used to support the indexes of other library services such as Exhibits and EarthWorks. Consolidating our indexes onto a single platform represents an important step for improving the management of these essential indexes. This means that we can more efficiently update or repair all of our search services and ultimately maintain a higher level of service for users.
New indexing pipeline
The search index is created by machine processing all the various sources of catalog records in a way that gives the user the most relevant result. The indexing process retrieves data from various systems, maps metadata fields to the appropriate format to yield precise search results, and sometimes enhances the data (like normalizing dates) to improve search results. The team rewrote this process for all of our data sources, making it easier to update mappings and add new data sources. These changes also increased the speed of indexing our nine million plus records fourfold.
It is also notable that the new indexing pipeline was rewritten using a technology called Traject, which is an open-source technology that has been adopted by many of our peer institutions. By adopting open-source, community-supported technologies, Stanford benefits from improvements made by any other adopting institution, and we can contribute back to the commons as well. One of the most compelling benefits of Traject is that it allows metadata and cataloging experts to directly review and improve metadata mappings in the SearchWorks index so that we can more quickly refine and improve results across sources and resource types.
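As a rough illustration of the mapping and enhancement steps described above, the following minimal sketch maps a source record to index fields and normalizes a free-text date to a four-digit year. It is written in Python purely for illustration and does not use Traject; the field names in `FIELD_MAP` and the functions `normalize_year` and `to_index_document` are hypothetical and are not taken from the SearchWorks pipeline.

```python
import re
from typing import Optional

# Hypothetical mapping from source field names to index field names.
FIELD_MAP = {
    "title": "title_display",
    "creator": "author_display",
    "date": "pub_year",
}


def normalize_year(raw: str) -> Optional[int]:
    """Return the first standalone four-digit number in a free-text date."""
    match = re.search(r"(?<!\d)(\d{4})(?!\d)", raw)
    return int(match.group(1)) if match else None


def to_index_document(record: dict) -> dict:
    """Map a source record to index fields, dropping unmapped fields."""
    doc = {}
    for source_field, index_field in FIELD_MAP.items():
        value = record.get(source_field)
        if value is None:
            continue
        if source_field == "date":
            # Enhancement step: store a clean year instead of the raw string.
            value = normalize_year(value)
            if value is None:
                continue
        doc[index_field] = value
    return doc
```

Keeping the mapping in a single table like `FIELD_MAP` is one way to make mappings easy to review and update, which is the kind of benefit the paragraph above attributes to the new pipeline.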
Improved monitoring and status dashboard
In light of all these efforts to improve stability and performance, we also want to make sure users can easily learn about the status and health of SearchWorks and related systems, and confirm the existence of issues or outages they experience. To support this goal of increased transparency, the team developed a new dashboard that can be found at library-status.stanford.edu.
The status dashboard contains three main categories of information:
- Current status: A prominent indicator showing the current status of SearchWorks, and an accompanying table with indicators reporting the status of various other library web systems.
- Incidents: The incidents section includes two Twitter feeds, used by library staff to keep you updated on the status of incidents. The two separate feeds report on (1) the status of incidents affecting SearchWorks and related web systems and (2) regular maintenance and outages of licensed resources and third-party databases.
- Performance metrics: Here you will find a set of interactive graphs illustrating the recent and historical performance of SearchWorks for both the catalog and Articles+ indexes. These graphs provide additional insight into the performance of these systems and give users an indication of whether page load time or search response time is outside the norm.
See a detailed tour of the new system status dashboard in the video below:
New user-facing features
The team also spent considerable time adding new features to SearchWorks, addressing long-standing issues and fixing thorny bugs. The list is too long to cover in its entirety, but here we describe a few highlights and key features in which we think users will be interested.
Select, cite, and share in Articles+
In 2017 we added to SearchWorks a feature called Articles+, which gave users access to over 300 million articles and ebooks through EBSCO Discovery Service. This new feature was met with great enthusiasm but had one notable shortcoming: users could not select individual articles into a list, and then email or export their citation lists to reference management systems like EndNote, RefWorks, or Mendeley. This functionality, our users told us, was critical for the research process.
Our colleagues at EBSCO worked hard to add this capability to their tools, and we are now thrilled to announce that we have added this critical feature to SearchWorks Articles+. From within Articles+ users can now select individual results, or a full page of results, and save them to a working selection list. At any time the user can generate a citation list in one of five citation formats, print their list, send it via email, transfer it to their RefWorks account, or download it in a format that can be easily imported into a desktop reference management system. See a demo of this feature in action here, and try it out.
Easy Access to IIIF resources
Visibility of IIIF (International Image Interoperability Framework) content in SearchWorks has been greatly improved. From the SearchWorks landing page there is now a featured resource pointing users to all IIIF-compliant resources in SearchWorks, as well as information about IIIF. A IIIF icon placed on appropriate digital content will allow you to drag and drop the image into any IIIF-compatible viewer, such as Mirador.
Improved presentation of digital content
As different types of digital content have been added to SearchWorks, a mix of conditions and layouts has led to inconsistency and confusion. In search results, both the placement of the green “online” badge and the “online” button, which displayed one link but obscured additional sources, caused confusion. Some useful metadata was frequently overlooked. On the record view, metadata sections appeared in different places according to the type of embedded viewer, and some metadata and links were repeated.
Both views have been revised to improve consistency, reduce redundancy, and clearly label sections. In all cases, the link to an online resource is to the right of the green badge. The presence of more than one online source is visible in search results. The team has been thanked for adding metadata that was really there all along but difficult to see.
Improved presentation and navigation of digital serials
Digital serials and multi-part works were previously presented in SearchWorks with an auto-generated “Part #” label that didn’t reflect the actual title or sequence of the parts. Changes to the indexing pipeline now allow existing part labels in MODS records to be included in the SearchWorks index, and SearchWorks was modified to sort and display those labels, making it much easier to see what’s available and find a specific issue without having to open each digital object. The original generic labels will continue to appear in some records while their metadata is updated to take advantage of this feature.
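The article does not say how SearchWorks orders part labels. As one plausible approach, the sketch below sorts labels in "natural" order, so that numbers inside a label compare as numbers rather than as text. The function names `natural_key` and `sort_part_labels` and the sample labels are hypothetical.

```python
import re


def natural_key(label: str) -> list:
    """Split a label into text and integer chunks so that 10 sorts after 2."""
    # re.split with a capture group alternates text and digit chunks,
    # always starting with a (possibly empty) text chunk, so chunks at
    # the same position in two keys always have the same type.
    return [
        int(chunk) if chunk.isdigit() else chunk.lower()
        for chunk in re.split(r"(\d+)", label)
    ]


def sort_part_labels(labels: list) -> list:
    """Return part labels in natural order."""
    return sorted(labels, key=natural_key)
```

With a plain alphabetical sort, "Volume 10, Issue 1" would come before "Volume 2, Issue 3"; with the key above, the Volume 2 issues come first.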
Student work facet
As publishers no longer have all Stanford dissertation metadata, we have added a feature to improve discovery of Stanford student work in SearchWorks. Earlier this year, a working group began a process of iterative indexing and metadata remediation to create new facets, to filter Stanford student work by type of degree, or by school or department. The work cycle completed this effort, and the new facets are now available in SearchWorks. The facets appear when you choose the “Theses & dissertations” resource on the home page.
Changes to how recalls work
Users seeking materials that are currently checked out to another user are presented with the option to request the material in SearchWorks. After the requesting user presses submit in the Request form, the current user of the material receives a recall notice even if an additional copy is available on the shelf at Stanford or through BorrowDirect. This is disruptive to the current user, and doesn’t provide the level of service one would expect from an institution with vast resources and connections. The team has implemented a solution that will identify alternate available copies at Stanford that can be pulled for the user. If a copy isn’t available at Stanford, the service will check availability at our partner libraries using BorrowDirect. Only if no copy is available at Stanford or through BorrowDirect will the current Stanford user receive a recall notice. This change, coming in January 2019, will eliminate unnecessary recalls and considerably improve the Request service in SearchWorks.
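The order of checks described above can be summarized in a short sketch. This is only an illustration of the decision order, not the actual Requests implementation; the `Fulfillment` type, the `choose_fulfillment` function, and its two boolean inputs are hypothetical.

```python
from enum import Enum


class Fulfillment(Enum):
    LOCAL_COPY = "pull an available Stanford copy"
    BORROW_DIRECT = "request through BorrowDirect"
    RECALL = "recall from the current borrower"


def choose_fulfillment(stanford_copy_available: bool,
                       borrow_direct_available: bool) -> Fulfillment:
    """Pick how to fill a request, recalling only as a last resort."""
    if stanford_copy_available:
        return Fulfillment.LOCAL_COPY
    if borrow_direct_available:
        return Fulfillment.BORROW_DIRECT
    # No alternate copy anywhere: only now is the current borrower notified.
    return Fulfillment.RECALL
```

For example, `choose_fulfillment(False, True)` selects BorrowDirect, and the current borrower is left undisturbed.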
Other assorted tasks
In addition to the larger features highlighted above, at least 40 issues, based on user feedback, were resolved during this effort. These issues varied in both scope and area of impact, including improvements and bug fixes throughout SearchWorks and related applications. Some examples include:
- Changing the default sort for databases from title to relevance based on user request.
- Fixing a bug causing the date slider in Articles+ to disappear for particular searches.
- Improving Requests-related emails to clarify information for users.
- Making many metadata-related updates for both MARC and MODS records.