Following the community response to our workshop last year, we want to invite you again to contribute your future preservation challenge!
The second-year review of the FP7 collaborative project SCAPE took place on April 18-19, 2013 at the AIT office in Vienna. As I have recently received the final written report of this review, I thought I would share some of the results with the preservation community.
This event is now full and registration has been closed. If you would like to be added to the waiting list, please email Sharon McMeekin (email@example.com).
A training event organised by APARSEN, presented in collaboration with TIMBUS, SCAPE, EUDAT and the IMPACT Centre, and sponsored by the European Commission.
This week-long event, hosted by the Digital Preservation Coalition at the University of Glasgow, will bring together representatives from projects and organisations at the leading edge of digital preservation research, providing attendees with training at an advanced level. The training will aim to cover issues across the complete digital preservation life-cycle by addressing topics within four main themes: Governance and Management, Digital Object/Data Creation, Preservation Planning, and Infrastructure.
The course is a distinctive addition to digital preservation training activities in Europe and is the first iteration of what is to become a yearly training event, bringing together those at the forefront of digital preservation research and training. It is intended for managers and staff already working in digital preservation. It assumes a working knowledge of existing standards such as the Open Archival Information System (OAIS), as well as an understanding of how preservation issues apply to the participant's own institution. An optional half-day digital preservation ‘boot-camp’ will be held before the main course begins for those wishing a refresher on key concepts.
This training event is co-funded by the European Community’s 7th Framework Programme for Research and Development FP7/2007-2013 – ICT-2009.4.1: Digital Libraries and Digital Preservation (grant agreement No 269977), the APARSEN Project.
What Will We Do?
Using a mix of presentations, practical exercises, case studies, group discussion and tool demonstrations, the training event will examine four main themes encompassing issues across the digital preservation lifecycle.
The final syllabus will be confirmed prior to the event but topics covered will include the following (amongst others):
Who Should Come?
This training event is primarily aimed at:
- Records managers and information officers in organisations that rely on long-lived data
- Collections managers, librarians, curators and archivists in all institutions
- Innovators and researchers in information technology and computing science
This is not an entry-level course. Participants should have previous practical experience with digital preservation tools, technologies or standards. Note also that, in return for subsidised attendance at this course, participants will be asked to evaluate the training materials presented.
Monday (AM): Digital Preservation Boot-Camp (Optional)
Monday (PM) – Tuesday (AM): Governance and Management
Tuesday (PM) – Wednesday (AM): Digital Object/Data Creation
Wednesday (PM) – Thursday (AM): Preservation Planning
Thursday (PM) – Friday (AM): Infrastructure
Friday (Before 1PM): Round-up and Close
For more information, please visit the DPC website: http://www.dpconline.org/events/details/62-APARSEN-Training-APJul13?xref=68
The DROID software tool is developed by The National Archives (UK) to perform automated batch identification of file formats by assigning PRONOM Unique Identifiers (PUIDs) and MIME types to files. The tool relies on so-called signature files, which are derived from the PRONOM technical registry, as its source of format information.
Here I present some considerations for using the tool on the Hadoop platform, together with a performance evaluation of job execution on a Hadoop cluster using the publicly available Govdocs1 corpus data set.
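To illustrate the general idea of signature-based identification mentioned above, here is a minimal Python sketch. It is not DROID's implementation and does not read a PRONOM signature file: the `SIGNATURES` table, the `identify` and `count_formats` functions, and the fallback value are all hypothetical names chosen for this example, and only a few well-known leading byte sequences are included.

```python
from collections import Counter

# Toy signature table: (leading magic bytes, MIME type).
# Illustrative only; a real DROID signature file is far richer
# and maps matches to PUIDs as well as MIME types.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]

# Fallback used in this sketch when no signature matches.
UNKNOWN = "application/octet-stream"


def identify(data: bytes) -> str:
    """Return the MIME type of the first signature that matches the start of data."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return UNKNOWN


def count_formats(blobs) -> Counter:
    """Tally identified MIME types over an iterable of byte strings."""
    return Counter(identify(blob) for blob in blobs)
```

A batch job over a large corpus would, in essence, apply something like `identify` to each file and aggregate the results as `count_formats` does; how that work is distributed across a cluster is a separate concern not shown here.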
Before Easter we planned to do a correctness benchmark for Audio Migration QA, specifically targeting the new tool xcorrSound waveform-compare, see http://openplanetsfoundation.org/blogs/2012-07-09-xcorrsound-waveform-compare-new-audio-quality-assurance-tool.
We have been evaluating the use of the latest Fedora Commons, version 3.6.2, as a test repository. Having followed the straightforward installation process we were left with a repository with one preconfigured user – fedoraAdmin.
There are two APIs: API-A for access and API-M for management. For our test instance, API-A was configured on installation to require a login, but it can also be configured not to require one. It appeared that whilst the REST API for API-A was restricted, the SOAP API for API-A was not; this was corrected by using the example policy below. Investigations into how to configure multiple users are also detailed.
Tika File Mime Type Identification and the Importance of Metadata
An evaluation was recently carried out to determine how well Apache Tika was able to identify the MIME types of a corpus of test files, described in the ‘Data Set’ section. The purpose of the evaluation was to determine:
1. if the performance* of Tika has changed between versions 1.0 and the current version, 1.3 and,