Feed aggregator

raw2nexus small dataset evaluation

OPF Wiki Activity Feed - 8 April 2014 - 2:31pm

Page added by Alastair Duncan

View Online Alastair Duncan 2014-04-08T14:31:07Z

raw2nexus small dataset evaluation

SCAPE Wiki Activity Feed - 8 April 2014 - 2:31pm

Page added by Alastair Duncan

View Online Alastair Duncan 2014-04-08T14:31:07Z
Categories: SCAPE

Govdocs1 Corpus

OPF Wiki Activity Feed - 8 April 2014 - 2:21pm

Page edited by William Palmer

View Online William Palmer 2014-04-08T14:21:39Z

Govdocs1 Corpus

SCAPE Wiki Activity Feed - 8 April 2014 - 2:21pm

Page edited by William Palmer

View Online William Palmer 2014-04-08T14:21:39Z
Categories: SCAPE

Validate PDF&EPUBs and check for DRM

OPF Wiki Activity Feed - 8 April 2014 - 2:21pm

Page edited by William Palmer

View Online William Palmer 2014-04-08T14:21:10Z

Validate PDF&EPUBs and check for DRM

SCAPE Wiki Activity Feed - 8 April 2014 - 2:21pm

Page edited by William Palmer

View Online William Palmer 2014-04-08T14:21:10Z
Categories: SCAPE

Eyes of the World: George Jungbluth of the US National Oceanic and Atmospheric Administration

The Signal: Digital Preservation - 8 April 2014 - 1:57pm

Hurricane Katrina moving over Mississippi Delta – Colorized. Photo courtesy of NCDC/NOAA.

This post is part of our ongoing NDSA innovation group’s Insights interview series.

Scientific data is the biggest of the “big data.” In fact, research data and increased complexity and volume of data are two of the challenges addressed by the National Agenda for Digital Stewardship. To find out more about the data preservation and access challenges at the National Oceanic and Atmospheric Administration, I interviewed George Jungbluth, NOAA’s deputy chief of staff and director of communications.

Mike: Broadly speaking, could you tell us a bit about the kinds of data NOAA collects and why it is important?

George: The National Climatic Data Center collects many types of weather and climate data from the National Weather Service, weather stations, satellite processing systems, radars, weather and climate models, in situ data processing systems and paleoclimate studies.  These data are important to inform weather forecasting for the nation, assess drought severity, understand our changing climate, enable fisheries management, promote scientific research and inform decision makers on environmental matters.

Mike: Are you preserving any particularly problematic file types?

George: Our data holdings have many data formats (binary, ASCII, text, BUFR, netCDF, JPG, PDF, etc.). Our preference is for platform-independent, self-describing formats such as Network Common Data Form (netCDF). The most problematic formats are older data without proper documentation.

Mike: What are some of the challenges that NOAA faces with data preservation? Is scale a challenge?

George: The volume of data NCDC expects to preserve, store and provide access to is increasing at a rapid pace, which challenges the rate at which our systems and network bandwidth can scale.

Contiguous U.S. Temperature – February, 1901 – 2000. Graphic courtesy of NCDC/NOAA.

NCDC faces some challenges in acquiring data, including how to securely collect data from hundreds of providers and how best to interface with high-volume, high-rate providers such as satellite systems and modeling systems, but the greater challenge is providing access to the large data volumes.

Mike: What different ways does this data come into NOAA?

George: NCDC acquires data from multiple sources including directly from data producers via documented interfaces (preferred), phone systems, internet transfers and data delivered on physical media.

Mike: There is a significant push to make more government data more broadly accessible. Is this an area that NOAA is doing much work in?

George: Yes. Data accessibility is one of our many-faceted challenges. One challenge is developing and managing the metadata required for search and display (dataset level) as well as the more in-depth (file or granule level) metadata needed for understanding and using the data. Developing a scalable system for hosting and managing the metadata is another. As mentioned earlier, providing quick access to multiple petabytes of data is another issue.

Mike: Longitudinal data, meaning continuous readings going back in time, is of critical value for studying topics like climate change. What are some of the challenges that NOAA faces in ensuring long-term access to its data?

First Satellite Image – 1960 (TIROS). Photo courtesy of NCDC/NOAA.

George: There are many challenges to providing long-term access to data. The main ones are managing the metadata needed for continued understanding of the data, and managing data formats and access mechanisms.

Mike: What lessons has NOAA learned about data preservation that might be useful for other organizations with similar issues?

George: Preservation planning should begin as early as possible, even before the data are obtained or produced. Establish standards (e.g., for metadata and formats), guidelines and processes to support preservation, and provide tools to enable it. Determine what data and information will be preserved for the long term, understand the costs associated with preservation and provide the necessary resources.

Mike: How do you think NOAA’s experience would be useful for other Federal agencies or for other agencies in other countries with similar missions?

George: Sharing our experience with scaling our systems to support data growth and with developing scalable data and metadata management systems would hopefully prove useful to other organizations.

Mike: Are you developing any specific skill sets for long-term data preservation or data analysis?

George: At NCDC, we encourage our staff to explore new technologies, standards and tools.  We are also training our staff on Information Science and Technology principles.

Satellite data, January 1, 2014. Photo courtesy of NCDC/NOAA.

Mike: Who are the primary users for NOAA’s data? What kinds of challenges do you face in meeting the needs of your users?

George: The NCDC users are extremely diverse. Users have widely varying expertise, access needs, response times and types of usage. Data uses include many societal benefit areas (insurance, tourism, energy, etc.). Data are used by researchers for understanding and studying weather and climate, by lawyers in cases involving weather conditions, by regulators interested in evaluating energy rates and water usage, and by government and media for monitoring and reporting on recent weather and climate events. The challenge is to provide a set of data-access methods so that users can easily find what they need and get access to it.

Mike: What do you see as the biggest data challenges facing NOAA in the next five or ten years? In particular, what kinds of issues do you face in terms of increasing scale of data?

George: In the next five to ten years the volume of satellite data will more than double (from 4 to 10 TB/day) and the volume of climate model data will grow exponentially. Our technological infrastructure (storage, networks) will need to scale and our resources for managing data and metadata will need to increase. But the biggest problem by far will be enabling easy search, discovery, interpretation and effective analysis of these data so that they have the most value to the nation.

Categories: Planet DigiPres

World of Warcraft: Warlords of Draenor Alpha has begun - Load The Game

Google News Search: "new file format" - 8 April 2014 - 12:54pm

This new file format makes patching faster, improves real-world game performance and allows Blizzard to hotfix problems that would normally require patching, among other things. The transition to the new format will be made via a patch that will be ...
Categories: Technology Watch

Demonstrations

OPF Wiki Activity Feed - 8 April 2014 - 12:50pm

Page edited by Kristin Dill

View Online Kristin Dill 2014-04-08T12:50:31Z

Demonstrations

SCAPE Wiki Activity Feed - 8 April 2014 - 12:50pm

Page edited by Kristin Dill

View Online Kristin Dill 2014-04-08T12:50:31Z
Categories: SCAPE

SCAPE Azure Platform

OPF Wiki Activity Feed - 8 April 2014 - 12:26pm

Page edited by Ivan Vujic

View Online Ivan Vujic 2014-04-08T12:26:20Z

SCAPE Azure Platform

SCAPE Wiki Activity Feed - 8 April 2014 - 12:26pm

Page edited by Ivan Vujic

View Online Ivan Vujic 2014-04-08T12:26:20Z
Categories: SCAPE

SCAPE Azure Platform > SystemArchitectureOfScapeAzurePlatform.png

SCAPE Wiki Activity Feed - 8 April 2014 - 12:19pm

File attached by Ivan Vujic

PNG File SystemArchitectureOfScapeAzurePlatform.png (170 kB)

View Attachments Ivan Vujic 2014-04-08T12:19:00Z
Categories: SCAPE

SCAPE Azure Platform > ArchitectureComponents.png

OPF Wiki Activity Feed - 8 April 2014 - 12:18pm

File attached by Ivan Vujic

PNG File ArchitectureComponents.png (120 kB)

View Attachments Ivan Vujic 2014-04-08T12:18:03Z

SCAPE Azure Platform > ArchitectureComponents.png

SCAPE Wiki Activity Feed - 8 April 2014 - 12:18pm

File attached by Ivan Vujic

PNG File ArchitectureComponents.png (120 kB)

View Attachments Ivan Vujic 2014-04-08T12:18:03Z
Categories: SCAPE

Agenda of the SCAPE Information Day at the Austrian National Library

OPF Wiki Activity Feed - 8 April 2014 - 12:10pm

Page edited by Kristin Dill

View Online Kristin Dill 2014-04-08T12:10:02Z

Agenda of the SCAPE Information Day at the Austrian National Library

SCAPE Wiki Activity Feed - 8 April 2014 - 12:10pm

Page edited by Kristin Dill

View Online Kristin Dill 2014-04-08T12:10:02Z
Categories: SCAPE