Reprocessing Data Challenges of Producing A Time Series

August 2009 Monthly Chlorophyll-a Composite; data courtesy of the ESA Ocean Colour Climate Change Initiative project

Being able to look back at how our planet has evolved over time is one of the greatest assets of satellite remote sensing. With Landsat, you have a forty-year archive to examine changes in land use and land cover. For in situ (ground-based) monitoring, a record that long is only available for a few locations, and you’ll only have data for the location you’re measuring. Landsat’s continuous archive is an amazing resource, and it is hoped that the European Union’s Copernicus programme will develop another comprehensive archive. So with all of this data, producing a time series analysis is easy, isn’t it?

Well, it’s not quite that simple. There are the basic issues of different missions having different sensors, so you need to know whether you’re comparing like with like. Although data continuity has been a strong element of Landsat, the sensors on Landsat 8 are very different to those on Landsat 1. Couple this with various positional, projection and datum corrections, and you have lots of things to think about to produce an accurate time series. However, once you’ve sorted all of these out and you’ve got your data downloaded, then everything is great, isn’t it?

Well, not necessarily; you’ve still got to consider data archive reprocessing. The space agencies that maintain this data regularly reprocess their satellite datasets. This means that the data you downloaded two years ago isn’t necessarily the same as the data you could download today.

We faced this issue recently, as NASA completed the reprocessing of the MODIS Aqua data, which began in 2014. The data from MODIS on the Aqua satellite has been reprocessed seven times, whilst the data from its twin on Terra has been reprocessed three times.

Reprocessing the data can include changes to some, or all, of the following:

  • Update of the instrument calibration, to take account of current knowledge about sensor degradation and radiometric performance.
  • Applying new knowledge, in terms of atmospheric correction and/or derived product algorithms.
  • Changes to parallel datasets that are used as inputs to the processing; for example, the meteorological conditions are used to aid the atmospheric correction.

Occasionally, the agencies also change the output file format the data is provided in, and this is what caught us out. The MODIS output file format has changed from HDF4 to NetCDF4, the reason being that NetCDF is a more efficient, sustainable, extendable and interoperable data file format. We’ve known about this change for a long time, as it resulted from community input, but until you get the new files you can’t check and update your software.
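One way to cope with an archive that holds both old and new files is to check each file's format before choosing a reader. The sketch below is illustrative only: it assumes the files are sitting on local disk, the function name `sniff_format` is our own invention, and it only looks at the first few bytes of the file. NetCDF4 files are HDF5 files underneath, so they share the HDF5 signature.

```python
# Leading bytes ("magic numbers") that identify each container format.
HDF4_MAGIC = b"\x0e\x03\x13\x01"
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # NetCDF4 files are stored as HDF5
NETCDF_CLASSIC_MAGICS = (b"CDF\x01", b"CDF\x02", b"CDF\x05")


def sniff_format(path):
    """Return a short label for the container format of the file at `path`.

    Hypothetical helper: it only inspects the start of the file, so an HDF5
    file whose signature sits after a user block is reported as "unknown".
    """
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header.startswith(HDF4_MAGIC):
        return "hdf4"
    if header.startswith(HDF5_MAGIC):
        return "netcdf4/hdf5"
    if header.startswith(NETCDF_CLASSIC_MAGICS):
        return "netcdf-classic"
    return "unknown"
```

A processing script could call `sniff_format` on each downloaded file and pick the matching reader, rather than trusting the file extension.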

We tend to use a lot of Open Source software, which enables our clients to carry on working with remote sensing products without having to invest in expensive software. The challenge is that it takes software providers time to catch up with format changes. In the meantime, the software may be unable to load the new files, or it may read the data incorrectly, e.g., an image comes in upside down. Sometimes large changes mean you may have to alter your approach and/or software.
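The upside down problem is usually a question of whether the rows of a grid are stored north-to-south or south-to-north. A simple guard is to check the latitude of the first and last rows after loading, and flip the grid if needed. This is a minimal sketch in plain Python, assuming you already have the grid as a list of rows plus one latitude per row; the function name `north_up` is hypothetical.

```python
def north_up(rows, latitudes):
    """Return (rows, latitudes) ordered so the first row is the northernmost.

    `rows` is a list of grid rows and `latitudes` holds one latitude per row,
    in degrees north. Both names are illustrative, not from any library.
    """
    if len(rows) != len(latitudes):
        raise ValueError("need exactly one latitude per grid row")
    if len(latitudes) > 1 and latitudes[0] < latitudes[-1]:
        # Stored south-to-north: reverse the row order so north comes first.
        return rows[::-1], latitudes[::-1]
    return rows, latitudes
```

Running every file through a check like this means old and new files end up with the same orientation, whichever way round they were written.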

Reprocessing is important, as it improves the overall quality of the data, but you do need to keep on top of what is happening with the data to ensure that you are comparing like with like when you analyse a time series.
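One practical way to keep on top of this is to record which reprocessing each file came from, and refuse to build a time series that mixes them. The sketch below assumes you have already pulled a version label out of each file's metadata into a dictionary; the keys `date` and `processing_version`, and both function names, are our own illustrative choices and not the attribute names of any particular product.

```python
from collections import defaultdict


def group_by_version(records):
    """Map each processing version to the dates of the files that carry it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["processing_version"]].append(record["date"])
    return dict(groups)


def assert_like_with_like(records):
    """Return the single processing version in `records`, or raise if mixed."""
    groups = group_by_version(records)
    if len(groups) > 1:
        summary = ", ".join(
            f"{version}: {len(dates)} file(s)"
            for version, dates in sorted(groups.items())
        )
        raise ValueError(f"time series mixes reprocessing versions ({summary})")
    # An empty list of records has no version at all.
    return next(iter(groups), None)
```

If the check fails, the summary in the error message shows how many files belong to each version, which tells you how much of the archive needs downloading again.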