ICOADS Web information page (Tuesday, 07-Jan-2014 23:58:18 UTC):
Integer Packing Used for PSD netCDF Files and Small Numerical Biases

Please note: The rounding-error biases described in this document have been corrected for all currently available data. The information here describes small rounding biases that existed in previous data releases. It also details minor residual effects (not numerically significant) of finite-precision arithmetic, which the corrected packing scheme still has on the currently available data.

Metadata conventions used for storage of gridded data in netCDF format at
NOAA/ESRL Physical Sciences Division (PSD) are described here, including
data attributes.  The "true" (floating-point) ICOADS data are packed as
16-bit integer "coded" values in the netCDF files at PSD using two of the
data attributes, add_offset and scale_factor, as follows:
     coded = nint((true - add_offset) / scale_factor)
where nint indicates rounding to the nearest integer.

When reading a netCDF file, the reverse formula must be used to recreate
the true value from the coded value:
     true = (coded * scale_factor) + add_offset
(as discussed below, this results in an approximation to the original
floating-point value, due to finite-precision arithmetic).
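
The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the software used at PSD: the helper names nint, pack, and unpack are hypothetical, the attribute values are the ones quoted in Table 1 for the mean of wind speed, and nint is assumed to round halves away from zero (as Fortran's NINT does), since the page does not state which convention was used.

```python
import math

def nint(x: float) -> int:
    """Round to the nearest integer, halves away from zero (assumed convention)."""
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

def pack(true: float, add_offset: float, scale_factor: float) -> int:
    """coded = nint((true - add_offset) / scale_factor), checked against 16 bits."""
    coded = nint((true - add_offset) / scale_factor)
    if not -32768 <= coded <= 32767:
        raise ValueError(f"{true} does not fit in a 16-bit integer")
    return coded

def unpack(coded: int, add_offset: float, scale_factor: float) -> float:
    """true = (coded * scale_factor) + add_offset"""
    return coded * scale_factor + add_offset

# Attribute values quoted in Table 1 for the mean of wind speed.
ADD_OFFSET, SCALE_FACTOR = 327.65, 0.01

coded = pack(12.34, ADD_OFFSET, SCALE_FACTOR)
restored = unpack(coded, ADD_OFFSET, SCALE_FACTOR)
print(coded, restored)  # restored is close to 12.34, not necessarily identical
```

The range check is an addition for illustration; it simply makes explicit that the coded value has to fit in a signed 16-bit integer.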

The attribute values are stored within the header of each netCDF file, and
can be accessed using software (Fortran, C, and other languages) available
from Unidata, or via other applications software (e.g., GrADS, IDL) that
reads netCDF.  In addition, the "ncdump" command (part of the standard suite
from Unidata) can be used to print the structure of a netCDF file, including
attribute values, as ASCII text.

Previously available ICOADS 2°x2° and 1°x1° netCDF files at PSD were subject to
small numerical biases due to a rounding error in the C-language program used
to produce netCDF from the original ICOADS binary formats.  That program
calculated the coded value according to the above formula, but truncated the
result instead of rounding it to the nearest integer, i.e., it used this
modified formula:
     coded = trun((true - add_offset) / scale_factor)
where "trun" indicates truncation of the fractional part when creating the
integer.

This sometimes led to a unit error in the integer value, and thus an error
of about one increment of the precision encoded for a given statistic of a
variable (e.g., for the mean of sea surface temperature, plus or minus 0.01°C).
Based on tests performed on 1950-79 data, the biases affected only the mean,
standard deviation, and sextiles (no impact on other statistics), and generally
no more than one quarter of the gridboxes per year-month.  That program has
since been corrected, and in conjunction with installation of ICOADS Release 1c
summaries (1800-1949) at PSD, these problems were fully resolved in all
the 2°x2° data (during April 2001) and 1°x1° data (on 4 May 2001).
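
The unit error caused by truncation can be sketched as follows. This is an illustration under stated assumptions, not the original C program: it uses Python double-precision floats, the Table 1 attribute values for the mean of wind speed, and hypothetical helper names (quotient, pack_nint, pack_trun). Because the quotient is rarely an exact integer in finite-precision arithmetic, truncation can land one unit away from the rounded value; how often this happens depends on the precision of the arithmetic, so the count printed here need not match the proportions reported above.

```python
import math

# Attribute values quoted in Table 1 for the mean of wind speed.
ADD_OFFSET, SCALE_FACTOR = 327.65, 0.01

def quotient(true: float) -> float:
    """The unrounded quantity shared by both packing formulas."""
    return (true - ADD_OFFSET) / SCALE_FACTOR

def pack_nint(true: float) -> int:
    """Corrected packing: round to nearest (halves away from zero, assumed)."""
    q = quotient(true)
    return math.floor(q + 0.5) if q >= 0 else math.ceil(q - 0.5)

def pack_trun(true: float) -> int:
    """Earlier, biased packing: drop the fractional part."""
    return math.trunc(quotient(true))

# Every representable value from 0.00 to 102.20 m/s in steps of 0.01.
trues = [i / 100 for i in range(round(102.20 / 0.01) + 1)]
off_by_one = [t for t in trues if pack_trun(t) != pack_nint(t)]
print(f"{len(off_by_one)} of {len(trues)} values are coded one unit off by truncation")
```

Since the coded values here are negative, truncation moves toward zero, so the truncated code is either equal to the rounded code or exactly one unit larger.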

Smaller floating-point differences still exist between the data unpacked
from netCDF and the data unpacked from the original ICOADS Monthly Summary Group
(MSG) files (from which the netCDF data were derived).  These differences should
not be considered numerically significant (i.e., they are smaller than the
precision encoded for a given statistic of a variable).  For example, Table 1
shows the magnitude of the differences for the two packing methods, each with
and without rounding, for the mean of wind speed.  As illustrated in detail in
Table 2, the residual differences arise from finite-precision arithmetic and
the different packing methods used in netCDF versus MSG.


Table 1.  Minimum, mean, and maximum differences between packing/unpacking
results using the PSD netCDF method (n), versus the ICOADS method (c), over the
defined range for the mean of wind speed (0.00 to 102.20 m/s).  The 10221 input
true values (0.00-102.20 m/s in increments of 0.01) were packed into coded1
(coded2) using truncation (rounding), and then unpacked into true1 (true2).
The netCDF method is described in the text, where for the mean of wind speed:
     add_offset = 327.650
     scale_factor = 0.01
The ICOADS method is:
     coded = nint((true / units) - base)
     true = (coded + base) * units
where for the mean of wind speed:
     units = 0.01
     base = -1
and "nint" indicates rounding to the nearest integer (truncation, "trun", was
used instead when calculating coded1).
-------------------------------------------------------------------------------
                             true - true1  true - true2 true1 - true2  method
                           (input - trun)(input - nint) (trun - nint)
===============================================================================
min  difference over range   -0.010002136  -0.000017166   0.000000000     n
mean difference over range   -0.001002443  -0.000001241   0.001001201     n
max  difference over range    0.000015259   0.000015259   0.010009766     n
  
min  difference over range    0.000000000   0.000000000  -0.010000229     c
mean difference over range    0.000650405   0.000000000  -0.000650405     c
max  difference over range    0.010000229   0.000000000   0.000000000     c
-------------------------------------------------------------------------------
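
The two round trips compared in Table 1 can be sketched side by side as follows. This is a minimal sketch using Python double-precision floats and hypothetical helper names (netcdf_round_trip, icoads_round_trip, summarize), with nint assumed to round halves away from zero. It only covers the rounded (true2) case, and because it runs in double precision its residuals will be far smaller than those tabulated above; it is meant to show the structure of the comparison, not to reproduce the table.

```python
import math

def nint(x: float) -> int:
    """Round to the nearest integer, halves away from zero (assumed convention)."""
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

# Table 1 parameters for the mean of wind speed.
ADD_OFFSET, SCALE_FACTOR = 327.65, 0.01   # netCDF method (n)
UNITS, BASE = 0.01, -1                    # ICOADS method (c)

def netcdf_round_trip(true: float) -> float:
    coded = nint((true - ADD_OFFSET) / SCALE_FACTOR)
    return coded * SCALE_FACTOR + ADD_OFFSET

def icoads_round_trip(true: float) -> float:
    coded = nint(true / UNITS - BASE)
    return (coded + BASE) * UNITS

def summarize(diffs):
    """Return (min, mean, max) of a list of differences."""
    return min(diffs), sum(diffs) / len(diffs), max(diffs)

# Every representable value from 0.00 to 102.20 m/s in steps of 0.01.
trues = [i / 100 for i in range(round(102.20 / 0.01) + 1)]
for label, round_trip in (("n", netcdf_round_trip), ("c", icoads_round_trip)):
    lo, mean, hi = summarize([t - round_trip(t) for t in trues])
    print(f"{label}: min {lo:.3e}  mean {mean:.3e}  max {hi:.3e}")
```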


Table 2.  Detailed comparison between packing/unpacking results using the PSD
netCDF method (n), versus the ICOADS method (c), over the defined range of the
statistic (configured identically for all variables): mean latitude of the
observations with respect to the lower-left (SW) corner of a 2°x2° box (the
precision for this statistic is 0.2°).  The input true values were packed into
coded, and then unpacked into true2.  Differences are shown between each input
and unpacked value, with their minimum, mean, and maximum values shown below.
Using the netCDF method, for the mean latitude of the observations:
     add_offset = 3276.60
     scale_factor = 0.1
or using the ICOADS method:
     units = 0.2
     base = -1
-------------------------------------------------------------------------------
    true   coded     true2                 true - true2                method
===============================================================================
  0.0000  -32766    0.0000                  0.000000000                   n
  0.0000       1    0.0000                  0.000000000                   c

  0.2000  -32764    0.2000                  0.000048831                   n
  0.2000       2    0.2000                  0.000000000                   c

  0.4000  -32762    0.4001                 -0.000146478                   n
  0.4000       3    0.4000                  0.000000000                   c

  0.6000  -32760    0.6001                 -0.000097632                   n
  0.6000       4    0.6000                  0.000000000                   c

  0.8000  -32758    0.8000                 -0.000048816                   n
  0.8000       5    0.8000                  0.000000000                   c

  1.0000  -32756    1.0000                  0.000000000                   n
  1.0000       6    1.0000                  0.000000000                   c

  1.2000  -32754    1.2000                  0.000048876                   n
  1.2000       7    1.2000                  0.000000000                   c

  1.4000  -32752    1.4001                 -0.000146508                   n
  1.4000       8    1.4000                  0.000000000                   c

  1.6000  -32750    1.6001                 -0.000097632                   n
  1.6000       9    1.6000                  0.000000000                   c

  1.8000  -32748    1.8000                 -0.000048757                   n
  1.8000      10    1.8000                  0.000000000                   c

  2.0000  -32746    2.0000                  0.000000000                   n
  2.0000      11    2.0000                  0.000000000                   c
  
min  difference over range                 -0.000146508                   n
mean difference over range                 -0.000044374                   n
max  difference over range                  0.000048876                   n
  
min  difference over range                  0.000000000                   c
mean difference over range                  0.000000000                   c
max  difference over range                  0.000000000                   c
-------------------------------------------------------------------------------
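
The residuals in Table 2 have the size expected from single-precision arithmetic near 3276.6, where adjacent 32-bit floating-point values are 2^-12 (about 0.000244) apart. The page does not state the precision used, so the sketch below is an assumption: it emulates IEEE single precision in Python by rounding every intermediate result through the hypothetical helper f32. Under that assumption, the netCDF differences for the 0.2° and 0.4° rows come out as tabulated; the other rows were not checked by hand.

```python
import math
import struct

def f32(x: float) -> float:
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def nint(x: float) -> int:
    """Round to the nearest integer, halves away from zero (assumed convention)."""
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

# Table 2 parameters for the mean latitude of the observations.
ADD_OFFSET, SCALE_FACTOR = f32(3276.6), f32(0.1)   # netCDF method (n)
UNITS, BASE = f32(0.2), -1                         # ICOADS method (c)

def netcdf_round_trip(true: float):
    """Pack and unpack with every intermediate result rounded to single precision."""
    coded = nint(f32(f32(true - ADD_OFFSET) / SCALE_FACTOR))
    true2 = f32(f32(coded * SCALE_FACTOR) + ADD_OFFSET)
    return coded, true2

def icoads_round_trip(true: float):
    coded = nint(f32(f32(true / UNITS) - BASE))
    true2 = f32((coded + BASE) * UNITS)
    return coded, true2

for k in range(11):                      # 0.0, 0.2, ..., 2.0 degrees
    true = f32(k / 5)
    for label, round_trip in (("n", netcdf_round_trip), ("c", icoads_round_trip)):
        coded, true2 = round_trip(true)
        print(f"{true:8.4f} {coded:7d} {true2:9.4f} {true - true2:14.9f}   {label}")
```

The large add_offset is what makes the netCDF residuals visible: the unpacked value is the difference of two numbers near 3276, each carrying a rounding error of up to half of 2^-12, whereas the ICOADS method multiplies a small integer by the units directly.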
Document maintained by icoads@noaa.gov
Updated: Jan 7, 2014 23:58:18 UTC
http://icoads.noaa.gov/netcdf.html