FIDO News

Here's a short news bulletin about FIDO, the open source file format identification tool of the OPF.

The use of FIDO seems to have grown over the last few months. I am getting responses by e-mail and through the GitHub issue tracker from all over the world, ranging from requests for help to suggestions for improvement and even some bug fixes. Thanks, and please keep them coming!

RECENT CHANGES

The most important change at the moment is the versioning scheme of tagged releases.
If you have forked FIDO or are watching the tags for updates, please note that the versioning scheme has changed from [major].[minor].[patch] to [major].[minor].[patch]-[PRONOM version number].
The reason is that from time to time a new PRONOM version becomes available while there are no code changes to commit. As it is bad practice to update a tagged release, this was the only reasonable way to handle it.

For example, release 1.3.1 is distributed with PRONOM version 70 and is tagged '1.3.1-70'.
If a PRONOM update becomes available but there are no code changes, the next tag will be '1.3.1-71'. Please note that this is only reflected in the release tags; FIDO itself will still report only its version number, without the PRONOM version number.
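
If you watch the tags from a script, the new scheme is easy to parse. The following is only a minimal sketch and not part of FIDO: the names ReleaseTag, parse_release_tag and is_newer are made up for this example, and it assumes tags look exactly like '1.3.1' or '1.3.1-70'.

```python
import re
from typing import NamedTuple, Optional


class ReleaseTag(NamedTuple):
    """Hypothetical container for a parsed release tag (not part of FIDO)."""
    major: int
    minor: int
    patch: int
    pronom: Optional[int]  # None for old-style tags without a PRONOM suffix


# [major].[minor].[patch] with an optional -[PRONOM version number] suffix
_TAG_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(\d+))?$")


def parse_release_tag(tag: str) -> ReleaseTag:
    """Split a tag such as '1.3.1-70' into its numeric parts."""
    match = _TAG_RE.match(tag.strip())
    if match is None:
        raise ValueError(f"unrecognised release tag: {tag!r}")
    major, minor, patch, pronom = match.groups()
    return ReleaseTag(
        int(major),
        int(minor),
        int(patch),
        int(pronom) if pronom is not None else None,
    )


def _sort_key(tag: ReleaseTag) -> tuple:
    # Assumption: a tag without a PRONOM suffix sorts before one with a suffix.
    return (tag.major, tag.minor, tag.patch,
            -1 if tag.pronom is None else tag.pronom)


def is_newer(candidate: str, current: str) -> bool:
    """Return True if 'candidate' is a later release than 'current'."""
    return _sort_key(parse_release_tag(candidate)) > _sort_key(parse_release_tag(current))
```

With this, is_newer('1.3.1-71', '1.3.1-70') is true even though the code version did not change, which is exactly the case the new scheme is meant to cover.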

I am also currently working on the FIDO usage guide. It is still a work in progress, but it may already help you get started with FIDO.

FUTURE

I'll be the first to admit that FIDO is still far from being "the perfect file format identification tool". Although it is quite stable and many things have been improved or fixed lately, such as the handling of files passed to STDIN and the option to use only the official PRONOM signatures, it still needs improvement on many levels.

Recently Carl Wilson (OPF technical lead) and I started thinking about what needs to change for FIDO version 2. This second generation of FIDO will not differ much in functionality from the current version 1 generation, but the way we plan to do things will make a big difference. For starters, we will be creating unit tests for every function of FIDO. The second important thing is unit testing of individual PRONOM signatures and PRONOM container signatures. With each update of PRONOM we will run these unit tests against corpus files.
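
To give an idea of what testing a signature against corpus files could look like, here is a rough sketch using Python's unittest module. It is not the planned FIDO 2 test suite: SIGNATURES, CORPUS and identify are hypothetical stand-ins, the labels are not PRONOM identifiers, and the two patterns only check well-known magic bytes, whereas real PRONOM signatures are considerably richer.

```python
import re
import unittest

# Hypothetical stand-in for a signature set: label -> compiled byte pattern.
# The labels are made up for this example and are not PRONOM identifiers.
SIGNATURES = {
    "pdf-1.4": re.compile(rb"\A%PDF-1\.4"),
    "png": re.compile(rb"\A\x89PNG\r\n\x1a\n"),
}

# Hypothetical corpus: (sample bytes, expected label). A real corpus would
# consist of files on disk rather than in-memory byte strings.
CORPUS = [
    (b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n", "pdf-1.4"),
    (b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR", "png"),
]


def identify(data: bytes) -> list:
    """Return the sorted labels of all signatures that match the data."""
    return sorted(
        label for label, pattern in SIGNATURES.items() if pattern.search(data)
    )


class SignatureCorpusTest(unittest.TestCase):
    def test_each_sample_matches_only_its_signature(self):
        for data, expected in CORPUS:
            # subTest reports every failing sample instead of stopping at the first
            with self.subTest(expected=expected):
                self.assertEqual(identify(data), [expected])

    def test_unknown_data_matches_nothing(self):
        self.assertEqual(identify(b"plain text, no magic bytes"), [])
```

Saved as a test module, this can be run with "python -m unittest". Asserting the full list of matches, rather than only that the expected signature is among them, also catches a new signature that starts matching files it should not.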

But the biggest change of all will be the way we build FIDO. It will no longer be just "a script", but rather an API. The "fido.py" script will then merely serve as a prototype showing how to build your "own" FIDO into your workflow systems. It will also no longer write to STDOUT and STDERR but will return results in a more Pythonic way. You will read more about all this in a later post.

In the meantime I will (with a little help from you) continue improving version 1 where possible. If you have any questions or suggestions about any of the above, please let me know.

FIDO @ Open Planets Github
FIDO releases @ Open Planets Github
FIDO usage guide
