On Saint Patrick’s Day, SLA chapter members got a behind-the-scenes look at LOCKSS and CLOCKSS, presented by Vickie Reich, one of the first employees of HighWire Press, part of Stanford Libraries. LOCKSS (Lots of Copies Keeps Stuff Safe) came into being in 1998 to address the trend in academic libraries of leasing access to content rather than actively acquiring collections.
Reich focused on the preservation of e-materials, concerned that much digital content exists only in digital form and never appears in print. If libraries lose access because of funding problems, publishers go out of business, or formats change, their research communities lose access to this content. Creating an archive of digital content that separates payment from ongoing access ensures this value is not lost.
The key to digital preservation is maintaining a minimum of 7 copies of digital content in libraries in various geopolitical locations. If one copy is corrupted or destroyed, it can be reconstructed from the remaining copies using a bit-by-bit comparison. (It's a myth that replication of web content is exact; there are random errors in each copy.) Libraries are the logical neutral custodians of this content.
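To make the idea of reconstruction by comparison concrete, here is a minimal sketch in Python. It is an illustration only, not the LOCKSS software: the function name `reconstruct` and the rule of taking the most common byte at each position are assumptions made for this example, and it compares bytes rather than individual bits for brevity.

```python
from collections import Counter


def reconstruct(copies: list[bytes]) -> bytes:
    """Rebuild content by keeping the most common byte at each position.

    Hypothetical helper for illustration; not part of LOCKSS.
    """
    if not copies:
        raise ValueError("need at least one copy")
    # Copies whose length differs from the most common length are set aside.
    length = Counter(len(c) for c in copies).most_common(1)[0][0]
    usable = [c for c in copies if len(c) == length]
    # zip(*usable) yields, for each position, the byte held by every copy.
    return bytes(
        Counter(column).most_common(1)[0][0]
        for column in zip(*usable)
    )
```

With 7 copies that each carry a random error in a different place, every damaged byte is outvoted by the 6 intact ones, so the original can be recovered even though no single copy is perfect.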
Each participating library has a LOCKSS Box with at least 3 terabytes of storage. The software itself is built on open source, peer-to-peer protocols. Each Box has a web crawler that regularly crawls the publishers' websites and downloads the content. The next step is determining the "authoritative" version of the content by making bit-by-bit comparisons. The authoritative version is then distributed to each local Box. LOCKSS preserves the artifact as well as the content, because the look and feel of the content is important. However, not all components can be preserved; only items that remain static on the website for a week or more can be preserved.
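One possible reading of the one-week rule is that an item qualifies only if repeated crawls return identical bytes over a span of at least seven days. The Python sketch below illustrates that reading. The names `is_preservable` and `STABLE_FOR`, and the snapshot format, are assumptions for this example and do not describe how the LOCKSS crawler actually decides.

```python
from datetime import datetime, timedelta

STABLE_FOR = timedelta(days=7)  # "a week or more", as stated in the talk


def is_preservable(snapshots: list[tuple[datetime, bytes]]) -> bool:
    """True if the newest content is unchanged across crawls spanning STABLE_FOR.

    Hypothetical helper: each snapshot is (crawl time, downloaded bytes).
    """
    if not snapshots:
        return False
    ordered = sorted(snapshots, key=lambda snap: snap[0])
    latest_time, latest_body = ordered[-1]
    stable_since = latest_time
    # Walk backwards through earlier crawls until the content differs.
    for crawl_time, body in reversed(ordered[:-1]):
        if body != latest_body:
            break
        stable_since = crawl_time
    return latest_time - stable_since >= STABLE_FOR
```

Under this reading, an article page that stays the same across a week of crawls would qualify, while something like a rotating advertisement that changes on every visit would not.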
Functionally, the LOCKSS content is integrated into the online catalog, which points to the original publisher's URL. LOCKSS preferentially serves the content from the publisher; only if content is not available from the publisher does the catalog go to the contents in the local LOCKSS Box. A Private LOCKSS Network (PLN) is available for Special Collections. An additional benefit is that LOCKSS has created a de facto standard for archiving.
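The publisher-first, local-Box-second behavior can be sketched as a simple fallback. This Python example is an illustration under stated assumptions: the `serve` function, the `Fetcher` type, and the way a failed fetch is signalled (an `OSError` or a `None` result) are all hypothetical, not the LOCKSS interface.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns the content, or None if it has none.
Fetcher = Callable[[str], Optional[bytes]]


def serve(url: str, from_publisher: Fetcher, from_local_box: Fetcher) -> tuple[str, bytes]:
    """Prefer the publisher's copy; fall back to the locally preserved copy."""
    try:
        content = from_publisher(url)
    except OSError:
        # Assumption: network failures surface as OSError subclasses.
        content = None
    if content is not None:
        return "publisher", content
    content = from_local_box(url)
    if content is None:
        raise LookupError(f"{url} is not available from the publisher or the local box")
    return "local box", content
```

Passing the two sources in as functions keeps the sketch free of any real network calls; a reader only sees the preserved copy when the publisher's copy cannot be retrieved.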
The LOCKSS team is 8 strong at HighWire. LOCKSS currently has over 200 participating research libraries and over 420 participating publishers. None of the "big" publishers is currently participating in the LOCKSS program, a contentious point with librarians.
CLOCKSS (Controlled LOCKSS) is a separate entity based on the LOCKSS technology. Conceptually, it is an international, community-governed archive with low fees to encourage participation. It is set up to allow free access to publications when they cease publication, ensuring access to archives that are no longer economically supported by their creators. Once content is no longer available from publishers, it becomes freely available through CLOCKSS under a Creative Commons license.
Recruitment began in October 2009, and there are currently 24 participating publishers, with some of the big names negotiating to include their ceased publications. HighWire is working to create an endowment to further lower fees and encourage participation. CLOCKSS Boxes are currently located at research libraries worldwide, with geographic, geopolitical, and geological dispersion. According to Reich, all major search engines crawl the open access content, which is often also available behind a paywall.
The presentation was fascinating and provided a lot of food for thought regarding the future of access to electronic content, and the role of libraries.
Special thanks to Roche Palo Alto for hosting the SLA chapter program, and Silvia Patrick for sponsorship. Your support is appreciated.
Submitted by Jeanie Fraser and Jean Bedord, Photos by Cliff Mills.