Noteworthy discoveries in the SLAC web archive

In the course of creating a browsable archive of SLAC's earliest websites, we discovered a number of interesting facts and features that might not be readily apparent on casual browsing. While this is surely not an exhaustive catalog, we hope that these observations will help you get into the archive quickly and discover some of what it has to offer.
Establishing web conventions
- Before the "home page": the notion and labeling of the "home page" didn't emerge until some time after the inception of the Web itself. Before settling on this convention, SLAC had three simultaneous pages that we might now call home pages, corresponding to three types of information seekers. The first reference to the "SLAC Home Page" appeared on the August 25, 1994 version of the SPIRES page, almost three years after the creation of the oldest U.S. website.
- Other evolving conventions: other conventions that we take for granted now were not so obvious when the Web consisted of only a handful of sites. SLAC implemented its initial website with the string /FIND/ in the address, but this was just an artifact of CERN's implementation of a similar service.
Evolving web architecture
- Simpler, smaller web pages: the first U.S. web page was only 11 lines long. The most recent SLAC web page in the archive, dating to January 4, 1999, is still only 172 lines long. For stark comparison, the code that generates the banner on each archived web page is 654 lines long.
- Late use of the <html> element: the <html> element doesn't appear in the archive until some of the 1995 pages, e.g., the SPIRES page and the SLAC introduction page.
- First image on a SLAC website: the first image that we found appeared on a version of the SLAC home page dated January 2, 1994. The image likely doesn't appear in your browser, though, since support for the X BitMap format has been dropped from Chrome, Firefox, Internet Explorer, and Safari. The following version of the SLAC home page features the same logo as a GIF file; it is the same graphic as appears at the top of this blog post.
Updates and mistakes
- Infrequent updates: even by 1999, the websites were not updated frequently. For instance, in the last version of the SLAC home page in the archive, there is a gap of at least two months between the last-modified date indicated in the footer and the backup date of the page.
- A couple of the earliest broken links: the April 30, 1992 version of the first SLAC web page has two broken links. They start with http://slacvm.slac.stanfordedu/ instead of http://slacvm.slac.stanford.edu/, i.e., the dot before "edu" is missing. When we were performing the restoration, we had to check with the SLAC "WWW Wizards" to confirm that it wasn't just that DNS worked differently back then. By the following version on August 18, 1992, the broken links had been fixed.
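As a rough illustration of how a malformed hostname like the one above could be flagged automatically, here is a minimal Python sketch. It is not the method we used during the restoration; the `KNOWN_TLDS` allow-list and the `has_plausible_host` helper are hypothetical names introduced only for this example.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of top-level domains, for illustration only.
KNOWN_TLDS = {"edu", "gov", "org", "com", "net", "ch"}


def has_plausible_host(url: str) -> bool:
    """Return True if the hostname has at least two labels and a known TLD."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return len(labels) >= 2 and labels[-1] in KNOWN_TLDS


# The two address prefixes quoted in the bullet above.
links = [
    "http://slacvm.slac.stanfordedu/",
    "http://slacvm.slac.stanford.edu/",
]

# Collect the links whose hostname does not end in a recognized TLD.
suspect = [url for url in links if not has_plausible_host(url)]
```

With the two prefixes above, only the address ending in "stanfordedu" ends up in `suspect`, because its final label is not a recognized top-level domain.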
Web archiving considerations
- Deepening the record of the historical web: OpenWayback, the software that we're using to re-present the websites, features a hard-coded 1996 start date, reflecting the year that the Internet Archive and other institutions started archiving the Web. Our digital archaeology project has resulted in temporally addressable content going back to 1991, which required us to modify the upstream code.
- The Web has always been hard to archive: the web archiving community recognizes that the Web's transition from a collection of static documents accessible to crawlers to an executable environment only intelligible to more sophisticated archiving clients poses a growing challenge for content capture and preservation. The oldest U.S. website, itself just a database front-end, demonstrates that even early web systems are complicated to archive using contemporary tools.
- Ancient web; modern browser: a major part of the experience of SLAC's earliest websites that our re-presentation doesn't effectively simulate is accessing them through a historical interface, such as the MidasWWW browser. CERN provides a line-mode browser emulator for accessing the first web page, and we're interested in exploring a similar feature in the future.
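To make the idea of temporally addressable content under "Deepening the record of the historical web" more concrete, the following Python sketch formats capture dates as the 14-digit YYYYMMDDhhmmss timestamps commonly seen in Wayback-style replay addresses, and checks them against an assumed 1996 cutoff. The constant and function names are hypothetical and are not taken from the OpenWayback source; the midnight time of day is also an assumption, since only the dates are known.

```python
from datetime import datetime

# Assumption: a cutoff year like the hard-coded start date described above.
# How OpenWayback actually stores or checks this value is not shown here.
ASSUMED_START_YEAR = 1996


def to_timestamp(moment: datetime) -> str:
    """Format a capture time as a 14-digit YYYYMMDDhhmmss string."""
    return moment.strftime("%Y%m%d%H%M%S")


def predates_assumed_start(timestamp: str) -> bool:
    """Return True if the timestamp falls before the assumed start year."""
    parsed = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return parsed.year < ASSUMED_START_YEAR


# Two dates mentioned in this post; the time of day is assumed to be midnight.
early_capture = to_timestamp(datetime(1992, 4, 30))
late_capture = to_timestamp(datetime(1999, 1, 4))
```

Under these assumptions, `predates_assumed_start(early_capture)` is true and `predates_assumed_start(late_capture)` is false, which is the kind of boundary that a fixed start date would impose on the 1991-1995 material.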