Last week I was at home in bed for four days with a bad back. I could not really type, but I could browse, so I did have an opportunity to catch up on the recent OPF blog entries. I was particularly interested in the discussions regarding a new Format Registry. I am an amateur Java programmer at best, but I enjoy getting my hands dirty now and again, so I decided to take up Adam’s challenge: how far could I get with a format registry application in one week? This blog entry is a report on my first two days of the project.
See more photographs from the OPF Hackathon at: http://www.flickr.com/photos/opf
A big thank you to the IISH for hosting and supporting the OPF Hackathon. We had more people attending than originally planned, and IISH really helped out in every aspect; their flexibility with regard to meeting rooms and catering was absolutely fantastic. The venue and facilities were perfect for our event, as were the hospitality and support of the IISH staff.
One of my favourite parts of the Planets project was the service developers’ workshops. The events brought together the developers from across the project (and from outside too). In each and every one, it was always clear that the people in that room really cared about this stuff, and really wanted to push things forward together…
I’m not at the OPF Hackathon this week in the Netherlands, and I’ll admit to being slightly envious of those who are! The idea behind the Hackathon is to bring practitioners and developers together for some intense exchange of goals and ideas, collect use cases, show each other tools and approaches, and do some quality coding.
OPF foresees a potential market in commercial industry once awareness of long-term access, and subsequently compliance with it, become reality. This will have an immediate impact on policies, practices and workflows in major enterprises. Major changes in compliance and in industry practice are often driven by the major business consultancy firms such as IBM Global Services (which acquired PwC Consulting), KPMG, Deloitte, Accenture, Ernst & Young and many others. Consultancy firms are therefore strategic partners for gaining access to industry.
Content is King. The key to a good file format registry is not software; it’s not user interface; it’s not governance. The key is content, content, content. We will all win if we have a registry whose content is usable, accurate, and comprehensive.
I have a challenge for developers in the digital preservation community: can we build a file format registry without building any new software systems at all?
I’ve just started a small assignment for the OPF to investigate the options for a new file format registry, part of the toolbox needed for long-term preservation of digital material by archives, libraries and other memory institutions. This initiative was kicked off and sponsored by the National Archives of the Netherlands, and is now in progress under the auspices of the OPF.
Fido is a simple format identification tool for digital objects that uses Pronom signatures. It converts signatures into regular expressions and applies them directly. Fido is free, Apache 2.0 licensed, easy to install, and runs on Windows and Linux. Most importantly, Fido is very fast.
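To make the idea of signatures as regular expressions concrete, here is a minimal sketch in Python. It is not Fido's code, and the three patterns are hand-written from well-known magic numbers rather than converted from Pronom; the names `SIGNATURES` and `identify` are invented for the illustration.

```python
import re

# Illustrative signatures only: hand-written from well-known magic numbers,
# not generated from Pronom and not taken from Fido.
SIGNATURES = [
    ("PDF", re.compile(rb"\A%PDF-\d\.\d")),
    ("PNG", re.compile(rb"\A\x89PNG\r\n\x1a\n")),
    ("GIF", re.compile(rb"\AGIF8[79]a")),
]


def identify(data):
    """Return the name of every signature whose regex matches the buffer."""
    return [name for name, pattern in SIGNATURES if pattern.search(data)]


if __name__ == "__main__":
    import sys

    # Read only the start of each file; these patterns are anchored at offset 0.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, identify(handle.read(4096)))
```

Because each signature is an ordinary compiled pattern over bytes, applying it is a single call into the regex engine, which is presumably a large part of why this approach can be fast.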
A couple of people have asked me if my experiments with Pronom and Fido would have been easier if Pronom had been available as RDF or LinkedData. The short answer to this question is ‘no’. Let me explain why.
http://planets-suite.sourceforge.net/