As David Rosenthal pointed out, as long as there is a piece of commercial software or an open source project capable of accessing a format, it cannot be considered truly obsolete. I agree, but I fear that this ‘absolute’ format obsolescence is a poor proxy for the real problem, which is to ensure that our content is not just kept safe, but also remains accessible to our readers both now and in the (near) future. I am perfectly able to compile an open source software application, but I’m not everybody. Indeed, the British Library is committed to providing continuous access for a wide range of people who are almost entirely not me.
Somewhat later than planned, I present here my final installment on the Format Registry Challenge.
During this “experiment” I have spent some time becoming acquainted with the Git version control system, working with both msysgit on Windows and the Eclipse Git plugin EGit. One result is that my prototype registry source code (and the associated WAR archive) is available on GitHub:
I just wanted to point out a very interesting discussion on format obsolescence: The Half-Life of Digital Formats, A Puzzling Post From Rob Sharpe and Rob Sharpe’s Case For Format Migration. I think this is a very important issue, and I think we must address it as it cuts to the core of what the Planets tools are for.
A couple of weeks ago we announced the market research into the potential of digital preservation practices and tools outside their traditional market of libraries and archives.
Last week the report was presented to an audience consisting of the business school examination board and the customer (OPF, Adam Farquhar and myself). Both Adam and I were impressed by the quality and the amount of work the team managed to complete in such a short period.
At the hackathon it was clear that the identification discussion started by Fido represented an archetypal example of why this community wants to work together. No matter what the institution, whatever the context or workflow, we all need reliable tools for identifying files and formats. Of course, reliable identification requires reliable identifiers, and so the discussion about the tools is necessarily intertwined with the idea of a format registry.
It is part of OPF’s mission and plan to sustain tools and practices from Planets and other R&D initiatives relevant to digital preservation and long-term access. Most software products from R&D projects are prototypes that run on platforms suitable for R&D purposes. From a development perspective this is perfectly fine, as long as there is no serious intention to deploy these tools and practices in production.
During the past couple of weeks, there have been some thoughtful and well-informed discussions about Fido, Droid, Pronom, and file format identification in the comment stream of this blog. They make interesting reading.
In a recent comment, Shaun Zevin raises some points about the algorithmic complexity of the Droid and Fido pattern matching.
OPF has established a Technical and Architecture Advisory Board to provide technical leadership to member organisations and to offer technical advice to the OPF Board of Directors.
The Technical Board comprises senior developers and architects from member organisations. They have a keen interest in engaging the developer community and actively participating in the decision-making processes to support the development of production-quality software.
The Danish National Archives, the State and University Library, the Royal Library and the Danish Film Institute are pleased to announce the release of a new website, www.digitalbevaring.dk, which shares knowledge about digital preservation. The site is aimed at archives, libraries and museums responsible for the preservation of digital collections, but we hope that people in general will find useful information on how to preserve their personal digital data.
Before I started on format editing, I realized that it would be very simple to implement cool URLs for file formats. The only difficulty was that Tapestry associates pages with Java classes, and “x-fmt” is not an allowed Java class name. This meant implementing URL re-directs, which is a special topic in Tapestry. Tapestry does not believe in heavily modifying the web.xml; almost all configuration takes place in so-called filter classes. Anyway, once I had sorted out how to do re-writes, I could immediately produce nice output for requests like:
http://host/repository/x-fmt/123
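The mapping idea behind such a re-write can be sketched in plain Java. This is only an illustration of the path translation, not the Tapestry filter code from the prototype: the class name `CoolUrlRewriter`, the internal page name `/formatview`, and the assumption that the path arrives with the `/repository` context already stripped are all hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoolUrlRewriter {

    // PRONOM-style identifiers such as "x-fmt/123" or "fmt/123".
    // "x-fmt" cannot be a Java class name, so it is handled as a path prefix.
    private static final Pattern FORMAT_PATH =
            Pattern.compile("^/(x-fmt|fmt)/(\\d+)$");

    // Hypothetical internal page that is backed by a legal Java class name.
    static final String INTERNAL_PAGE = "/formatview";

    // Returns the internal path for a cool URL, or empty if the path
    // is not a format identifier and should be left untouched.
    static Optional<String> rewrite(String path) {
        Matcher m = FORMAT_PATH.matcher(path);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(INTERNAL_PAGE + "/" + m.group(1) + "/" + m.group(2));
    }

    public static void main(String[] args) {
        System.out.println(rewrite("/x-fmt/123").orElse("(unchanged)"));
        System.out.println(rewrite("/about").orElse("(unchanged)"));
    }
}
```

In a real Tapestry application this translation would sit inside whatever request filter or rewrite hook the framework version provides; the sketch only shows the path matching that lets an identifier like “x-fmt” appear in the URL without being a page class name.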
http://planets-suite.sourceforge.net/