ICOADS Web information page (Tuesday, 07-Jan-2014 23:58:18 UTC):

Integer Packing Used for PSD netCDF Files and Small Numerical Biases

Please note: The rounding-error biases described in this document have been corrected for all currently available data. The information here describes small rounding biases that existed in previous data releases. It also details minor residual effects (not numerically significant) of finite-precision arithmetic, which the corrected packing scheme still has on the currently available data.

Metadata conventions used for storage of gridded data in netCDF format at NOAA/ESRL Physical Sciences Division (PSD) are described here, including data attributes. The "true" (floating-point) ICOADS data are packed as 16-bit integer "coded" values in the netCDF files at PSD using two of the data attributes, add_offset and scale_factor, as follows:

    coded = nint((true - add_offset) / scale_factor)

where nint indicates rounding to the nearest integer. When reading a netCDF file, the reverse formula must be used to recreate the true value from the coded value:

    true = (coded * scale_factor) + add_offset

(as discussed below, this results in an approximation to the original floating-point value, due to finite-precision arithmetic).

The attribute values are stored within the header of each netCDF file, and can be accessed using software (Fortran, C, and other languages) available from Unidata, or via other applications software (e.g., GrADS, IDL) that reads netCDF. In addition, the "ncdump" command (part of the standard suite from Unidata) can be used to print the structure of a netCDF file, including attribute values, as ASCII text.

Previously available ICOADS 2°x2° and 1°x1° netCDF files at PSD were subject to small numerical biases due to a rounding error in the C-language program used to produce netCDF from the original ICOADS binary formats.
The program calculated the coded value according to the above formula, but failed to round the result to the nearest integer, i.e., it used this modified formula:

    coded = trun((true - add_offset) / scale_factor)

where "trun" indicates truncation of the decimal portion when creating the integer. This sometimes led to a unit error in the integer value, and thus an error of about one increment of the precision encoded for a given statistic of a variable (e.g., for the mean of sea surface temperature, plus or minus 0.01°C). Based on tests performed on 1950-79 data, the biases impacted only the mean, standard deviation, and sextiles (no impact on other statistics), and generally not more than one quarter of the gridboxes per year-month.

The program is now corrected, and in conjunction with installation of ICOADS Release 1c summaries (1800-1949) at PSD, these problems were fully resolved in all the 2°x2° data (during April 2001) and 1°x1° data (on 4 May 2001).

Smaller floating-point differences still exist between the data unpacked from netCDF and the data unpacked from the original ICOADS Monthly Summary Group (MSG) files (from which the netCDF data were derived). These differences should not be considered numerically significant (i.e., they are smaller than the precision encoded for a given statistic of a variable). For example, Table 1 shows the magnitude of the differences between the two packing methods (with and without rounding) for the mean of wind speed. As illustrated in detail in Table 2, these differences arise from finite-precision arithmetic and the different packing methods used in netCDF versus MSG.

Table 1. Minimum, mean, and maximum differences between packing/unpacking results using the PSD netCDF method (n), versus the ICOADS method (c), over the defined range for the mean of wind speed (0.00 to 102.20 m/s).
The 1021 input true values (0.00-102.20 m/s in increments of 0.01) were packed into coded1 (coded2) using truncation (rounding), and then unpacked into true1 (true2). The netCDF method is described in the text, where for the mean of wind speed:

    add_offset = 327.650
    scale_factor = 0.01

The ICOADS method is:

    coded = nint((true / units) - base)
    true = (coded + base) * units

where for the mean of wind speed:

    units = 0.01
    base = -1

and "nint" indicates rounding to the nearest integer (or truncation, "trun," was used instead for calculation of coded1).

-------------------------------------------------------------------------------
                             true - true1   true - true2  true1 - true2  method
                           (input - trun) (input - nint)  (trun - nint)
===============================================================================
min difference over range    -0.010002136   -0.000017166    0.000000000     n
mean difference over range   -0.001002443   -0.000001241    0.001001201     n
max difference over range     0.000015259    0.000015259    0.010009766     n
min difference over range     0.000000000    0.000000000   -0.010000229     c
mean difference over range    0.000650405    0.000000000   -0.000650405     c
max difference over range     0.010000229    0.000000000    0.000000000     c
-------------------------------------------------------------------------------

Table 2. Detailed comparison between packing/unpacking results using the PSD netCDF method (n), versus the ICOADS method (c), over the defined range of the statistic (configured identically for all variables): mean latitude of the observations with respect to the lower-left (SW) corner of a 2°x2° box (the precision for this statistic is 0.2°). The input true values were packed into coded, and then unpacked into true2. Differences are shown between each input and unpacked value, with their minimum, mean, and maximum values shown below.
Using the netCDF method, for the mean latitude of the observations:

    add_offset = 3276.60
    scale_factor = 0.1

or using the ICOADS method:

    units = 0.2
    base = -1

-------------------------------------------------------------------------------
      true     coded     true2      true - true2  method
===============================================================================
    0.0000    -32766    0.0000       0.000000000     n
    0.0000         1    0.0000       0.000000000     c
    0.2000    -32764    0.2000       0.000048831     n
    0.2000         2    0.2000       0.000000000     c
    0.4000    -32762    0.4001      -0.000146478     n
    0.4000         3    0.4000       0.000000000     c
    0.6000    -32760    0.6001      -0.000097632     n
    0.6000         4    0.6000       0.000000000     c
    0.8000    -32758    0.8000      -0.000048816     n
    0.8000         5    0.8000       0.000000000     c
    1.0000    -32756    1.0000       0.000000000     n
    1.0000         6    1.0000       0.000000000     c
    1.2000    -32754    1.2000       0.000048876     n
    1.2000         7    1.2000       0.000000000     c
    1.4000    -32752    1.4001      -0.000146508     n
    1.4000         8    1.4000       0.000000000     c
    1.6000    -32750    1.6001      -0.000097632     n
    1.6000         9    1.6000       0.000000000     c
    1.8000    -32748    1.8000      -0.000048757     n
    1.8000        10    1.8000       0.000000000     c
    2.0000    -32746    2.0000       0.000000000     n
    2.0000        11    2.0000       0.000000000     c

min difference over range           -0.000146508     n
mean difference over range          -0.000044374     n
max difference over range            0.000048876     n
min difference over range            0.000000000     c
mean difference over range           0.000000000     c
max difference over range            0.000000000     c
-------------------------------------------------------------------------------
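The two packing methods compared in Table 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the C program used at PSD: the helper names (nint, trun, pack_n, unpack_n, pack_c, unpack_c) are hypothetical, the parameter values are the ones quoted for Table 2, and nint is assumed to round halves away from zero. Python floats are double precision, so the residuals printed here are far smaller than those listed in Table 2; the point of the sketch is the coded values and the effect of replacing rounding with truncation.

```python
import math


def nint(x: float) -> int:
    """Round to the nearest integer (halves away from zero; an assumption)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def trun(x: float) -> int:
    """Drop the decimal portion, as the uncorrected program did."""
    return math.trunc(x)


# Parameters quoted for Table 2 (mean latitude of the observations).
ADD_OFFSET = 3276.60   # netCDF method (n)
SCALE_FACTOR = 0.1     # netCDF method (n)
UNITS = 0.2            # ICOADS method (c)
BASE = -1              # ICOADS method (c)


def pack_n(true: float, to_int=nint) -> int:
    """netCDF method: coded = to_int((true - add_offset) / scale_factor)."""
    return to_int((true - ADD_OFFSET) / SCALE_FACTOR)


def unpack_n(coded: int) -> float:
    """netCDF method: true = (coded * scale_factor) + add_offset."""
    return coded * SCALE_FACTOR + ADD_OFFSET


def pack_c(true: float, to_int=nint) -> int:
    """ICOADS method: coded = to_int((true / units) - base)."""
    return to_int(true / UNITS - BASE)


def unpack_c(coded: int) -> float:
    """ICOADS method: true = (coded + base) * units."""
    return (coded + BASE) * UNITS


# Input values 0.0, 0.2, ..., 2.0, as in Table 2.
true_values = [i / 5 for i in range(11)]

print(f"{'true':>8}{'method':>8}{'nint':>9}{'trun':>9}{'true - true2':>16}")
for t in true_values:
    for label, pack, unpack in (("n", pack_n, unpack_n), ("c", pack_c, unpack_c)):
        coded = pack(t)             # corrected scheme (rounding)
        coded_trun = pack(t, trun)  # uncorrected scheme (truncation)
        residual = t - unpack(coded)
        print(f"{t:8.4f}{label:>8}{coded:9d}{coded_trun:9d}{residual:16.9f}")
```

Any row where the "trun" column differs from the "nint" column corresponds to the unit error described in the text: the quotient fell just short of an integer in floating point, and truncation dropped it to the neighboring coded value.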
U.S. National Oceanic and Atmospheric Administration hosts the icoads website
Document maintained by icoads@noaa.gov
Updated: Jan 7, 2014 23:58:18 UTC
http://icoads.noaa.gov/netcdf.html