We recently posted an article on the UK Web Archive blog that may be of interest here, User-Driven Digital Preservation, where we summarise our work with the SCAPE Project on a little prototype application that explores how we might integrate user feedback and preservation actions into our usual discovery and access processes.
It is well known that PDF documents can contain features that are preservation risks (e.g. see here and here). Migration of existing PDFs to PDF/A is sometimes advocated as a strategy for mitigating these risks. However, the benefits of this approach are often questionable, and the migration process can itself be quite risky. As I often get questions on this subject, I thought it would be worthwhile to do a short write-up.
The first part of the workshop will be a panel session at which David Giaretta (APARSEN), Ross King (SCAPE), and Ed Fay (OPF) will be discussing digital preservation.
In this post I'll be taking a look at format identification of PDF files and highlighting a difference of opinion between format identification tools. Some of the details are a little dry, but I'll restrict myself to a single issue and be as light on technical details as possible. I hope to show that once the technical details are clear, it really boils down to policy and requirements for PDF processing.
On September 8 the SCAPE/APARSEN workshop Digital Preservation Sustainability on the EU Level will be held at London City University in connection with the DL2014 conference.
The OPF is holding a PDF event in Hamburg on 1st-2nd September 2014 where we'll be taking an in-depth look at the PDF format, its sub-flavours like PDF/A, and open source tools that can help. This is a quick post listing things you can do to prepare for the event if you're attending and want to get the most out of it.