ICOADS Web information page (Tuesday, 11-Feb-2014 18:53:29 UTC):
Release 2.1 Real-time (ICOADS.RT) Archive Overview (1997-2002)
This page describes the processing of the "real-time" (ICOADS.RT) archive, which is based entirely on data from the World Meteorological Organization (WMO) Global Telecommunications System (GTS). GTS data for March 1997 through 2002 were processed for Release 2.1 into this archive. NOTE: Work is planned to update documentation to describe more recent (2003-08) ICOADS.RT data, including preliminary data (subject to change) now being extended to near-current dates (updated monthly, and lagging real-time by 2-6 weeks).
2. Data sources
GTS data for March 1997-2002 from NOAA's National Centers for Environmental Prediction (NCEP), archived at NCAR (see Appendix A), were used. The data were translated by NCEP from the original GTS message strings (i.e., FM 13 SHIP or FM 18 BUOY code), into the Binary Universal Form for the Representation of Meteorological Data (BUFR) (WMO, 1995).
NCEP included in their versions of the BUFR format one (or more) of the input original GTS message string(s) used to construct each BUFR report; the NCEP format is therefore referred to as "BUFR+string" in this document. Including the strings was extremely beneficial, because it allowed the resolution of some significant data problems in the earlier BUFR data through access to the originally reported data.
NCDC is also translating and archiving GTS data obtained from NOAAPort and other GTS sources. These NCDC data may be used in the ICOADS.RT archive in the future. Currently, however, only the NCEP data have been used.
3. Processing and products
For the final ICOADS.RT archive, translation (ii) was used for the period March 1997 through September 1999, and translation (i) for the period October 1999 through December 2002. October 1999 also marked a transition in the NCEP archive to "dumped" data, in which some partial duplicate reports had been merged and quality control (QC) applied by NCEP (Figure 3).
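The period-based choice of translation can be summarized in a few lines of code. The following is only an illustrative sketch, not the software used in the actual processing: the function name, the constants, and the decision to raise an error outside the archive period are assumptions made for this example. The date boundaries come from the text above, and the descriptions of translations (i) and (ii) from Appendix B.

```python
from datetime import date

# Boundaries taken from the text; all names here are hypothetical.
ARCHIVE_START = date(1997, 3, 1)     # March 1997
SWITCH_TO_BUFR = date(1999, 10, 1)   # first month taken from translation (i)
ARCHIVE_END = date(2002, 12, 31)     # December 2002


def translation_for(report_date: date) -> str:
    """Return which translation supplied ICOADS.RT data for a report date.

    "ii" = translation of the original GTS message string into IMMA
    "i"  = translation of the (dumped) BUFR data into IMMA
    """
    if not (ARCHIVE_START <= report_date <= ARCHIVE_END):
        raise ValueError("date is outside the March 1997-December 2002 archive")
    return "ii" if report_date < SWITCH_TO_BUFR else "i"


print(translation_for(date(1999, 9, 30)))  # last day using translation (ii)
print(translation_for(date(1999, 10, 1)))  # first day using translation (i)
```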
To further improve the quality of October 1999-December 2002 data, corrections of two known BUFR problems, which could be made to the BUFR data directly without accessing the original string, also were applied as part of translation (i). These and other details of the ICOADS.RT processing are provided in Appendix B. As discussed in Appendix B, some residual problems exist in the BUFR data through 2002, which were propagated into ICOADS.RT. Generally, however, these problems appear to be minor and confined to secondary elements such as clouds and waves.
In addition to the new IMMA format, the widely used LMRF observational format was produced, as well as full sets of 2° and 1° monthly summaries in MSG format (see Appendix B). (The monthly summaries will also be offered from NOAA/ESRL/PSD in netCDF format.)
4. Data features and problems
a) Data for March-December 1997 that overlap ICOADS.DM were processed primarily for validation purposes (using the original message strings). However, those overlapping data also provide a source for some data elements that are currently missing or incorrect in ICOADS.DM, because of problems in earlier BUFR data.
For example, the BUFR format at NCEP was expanded to include the wind speed indicator (WI), if available in the original GTS report, for data translated into BUFR starting around 21 October 1997. In contrast, an earlier version of the NCEP data (decks 892-896), used as input for ICOADS.DM, has extant WI erroneously set to a constant value (3 = knots, estimated) for approximately April-October 1997. NCEP continued overlapping production of BUFR and their old Office Note 124 format, which preserved WI, until approximately 19 April 1997, but the mixture of NCEP data used at NCDC to create decks 892-896 is not known.
b) A simplified ("exact" on year, month, day, hour, latitude, longitude, and ID) duplicate elimination (dupelim) procedure was applied to all the data (see Figure 3). This was less powerful than the dupelim applied to ICOADS.DM (which, e.g., operates over 1° boxes). Undetected duplicates are therefore more likely in the ICOADS.RT data. For example, we know that some duplicates still exist during October 1999-May 2000 between some moored buoy reports and reports containing the generic ID "SHIP." For the latter reports, platform type (PT) in the output data mixture was set (apparently erroneously) to PT=5 (ship). At present, we are uncertain of the source and magnitude of this problem.
d) ICOADS.DM utilized global drifting buoy data output from delayed-mode QC processing at MEDS, Canada (deck 714). In contrast, ICOADS.RT utilized the raw GTS stream, which contains many more drifting buoy reports (see Figure 1), often closely spaced in time and space. Consequently, monthly summary boxes containing drifting buoys may be dominated by the frequent buoy reports (e.g., versus ship observations). We do not yet understand the reasons for the large differences between amounts of delayed-mode and GTS data, but a test for August 1997 (Figure 4) showed that not all buoys are impacted.
e) Similarly, ICOADS.DM utilized NOAA moored buoy data output from delayed-mode processing at NDBC (US moorings and C-MAN stations) and PMEL (TAO moorings), which were not utilized for ICOADS.RT. The moored buoy data from GTS for the NDBC and PMEL buoys may have approximately the same temporal frequency as the delayed-mode data; however, the delayed-mode sources frequently contain more complete (see Figure 1) and better quality data than were transmitted over GTS. For ICOADS.DM, "source exclusion" rules were applied during preparation of monthly summaries to reduce the NDBC data (only) to 3-hourly. These rules were not applied to any ICOADS.RT moored buoy data, so some boxes will be dominated by NDBC moored buoys reporting hourly or sub-hourly, in contrast to ICOADS.DM summaries. In the future, we would like to reassess the rationale for the ICOADS.DM source exclusion rules, because they are only applied to NDBC buoys, and not, e.g., to foreign or TAO moorings.
f) In the NCEP BUFR+string data, there were two days (3-4 September 2001) of BUOY (FM 18) code data with corrupted original message strings. The source of this error, and any impact it might have had on the BUFR data, are unknown.
h) The NRT format documentation describes a few additional, generally minor problems that may also apply to ICOADS.RT. For instance, for the period 1545 UTC 30 May to 1530 UTC 2 June 1998, data from drifting and moored buoys, including the TAO and PIRATA arrays, were inadvertently not transmitted over the GTS, and thus not received by NCEP.
i) The dumped data used for ICOADS.RT for October 1999-December 2002 contained "quality marks" that were used at NCEP to correct the BUFR data. However, the quality marks themselves were not retained in the translation into ICOADS.RT (only the GTS strings--no BUFR fields--were output as supplemental data from the translation). In addition, the GTS "bulletin header" information stored in BUFR was not retained in the translation into ICOADS.RT. Some of this information can be important for specialized applications such as tracing the GTS routing of data. In the future we may wish to reassess whether more metadata from BUFR, such as NCEP quality marks and bulletin header information, should be preserved in ICOADS.RT.
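The "exact" dupelim described in item b) can be sketched as follows. This is a minimal illustration and not the production dupelim code: the Report structure, its field names, and the choice to keep the first report encountered are assumptions (the text states only that all but one report in each matching set is rejected). The sample values are invented for illustration.

```python
from typing import Iterable, List, NamedTuple


class Report(NamedTuple):
    """Hypothetical minimal report; only the matching fields matter here."""
    year: int
    month: int
    day: int
    hour: float
    lat: float
    lon: float
    platform_id: str


def exact_dupelim(reports: Iterable[Report]) -> List[Report]:
    """Keep one report per (year, month, day, hour, lat, lon, ID) key.

    Assumption: the first report seen for each key is the one retained.
    """
    seen = set()
    kept = []
    for r in reports:
        key = (r.year, r.month, r.day, r.hour, r.lat, r.lon, r.platform_id)
        if key in seen:
            continue  # exact duplicate on all seven fields: reject
        seen.add(key)
        kept.append(r)
    return kept


# Invented example: two identical buoy reports plus the same observation
# carrying the generic ID "SHIP".
sample = [
    Report(1999, 10, 1, 12.0, 23.4, -162.3, "51001"),
    Report(1999, 10, 1, 12.0, 23.4, -162.3, "51001"),
    Report(1999, 10, 1, 12.0, 23.4, -162.3, "SHIP"),
]
print(len(exact_dupelim(sample)))  # 2
```

Because the ID is part of the key, the report with the generic ID "SHIP" survives alongside the buoy report, which is the kind of undetected duplicate noted in item b).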
Figure 1. Thousands of reports per month (after dupelim) for ICOADS.DM (through 1997), compared to ICOADS.RT (March 1997-December 2002). The data from both archives have been stratified by (approximate) platform type: drifting (DRIFT) and moored (MOOR) buoys, ships (SHIP), and Coastal-Marine Automated Network (C-MAN). ICOADS.DM contains more ship, moored buoy, and C-MAN reports received in delayed-mode, but fewer drifting buoy reports for reasons discussed in sec. 4.
Figure 2. Thousands of reports per month for ICOADS.RT, compared to NRT. The data from both archives have been stratified by (approximate) platform type, as for Figure 1. As discussed in sec. 4, "empty" NRT reports (containing no meteorological elements) are deleted in the preparation of that format at NCEP. This occurs to some extent for all platform types, but is most evident for drifting buoys.
Figure 3. Thousands of reports per month for BUFR, NRT, CMR (after NRT dupelim), and ICOADS.RT (after dupelim). The October 1999 transition in the BUFR curve illustrates the impact of switching to the NCEP dumped data, from the regular operational data. As expected, dupelim had little impact after October 1999 (on the dumped data), as shown by the close correspondence of the BUFR and ICOADS.RT curves. The NRT and CMR curves are lower owing to the exclusion of empty NRT reports, and for CMR owing to the NRT dupelim effect.
Figure 4. (a) Number of reports per individual drifting buoy for August 1997 from ICOADS.DM (deck 714) versus ICOADS.RT data (deck 794). The horizontal axis is unlabeled, but each step on the axis represents a different buoy ID number. The buoy numbers have been ordered by the amount of the difference between the two curves, thus the largest differences (more reports in deck 794 versus deck 714) appear farthest to the left. (b) This focuses in on the left-most part (approximately 1/4) of (a), listing individual buoy ID numbers on the horizontal axis.
Figure 5. (a) Total number of observations for March 1997 through September 1999, shown separately for each IMMA field that had any data (all platform types). Blue (red) bars indicate that the translation of the BUFR data agrees (disagrees) with the translation of the original GTS message, into IMMA format. Disagreement includes the case of missing versus extant data. The amount of extant data resulting from the BUFR (GTS) translation is indicated by an orange "X" ("+"). If a field is completely missing, the "X" or "+" is not shown. (b)-(e) As for (a) except the observations are plotted separately for FM 13 SHIP (ships), FM 13 BUOY (largely moored buoys), FM 18 BUOY (drifting buoys and some moored buoys including TAO), and FM 13 C-MAN. Note: Source ID (SID) and a few other fields including the time and temperature indicators (TI and IT, which were unknown for some of these code types in the BUFR translation) were expected to have disagreement. IMMA documentation explaining the field abbreviations is available in PDF format in this directory.
Figure 7. Overview of the ICOADS.RT processing flow (simplified), and of related NCDC processing (data not yet used for ICOADS.RT). Letters in [brackets] refer to processing notes in Appendix B, sec. B2.
Slutz, R.J., S.J. Lubker, J.D. Hiscox, S.D. Woodruff, R.L. Jenne, D.H. Joseph, P.M. Steurer, and J.D. Elms, 1985: Comprehensive Ocean-Atmosphere Data Set; Release 1. NOAA Environmental Research Laboratories, Climate Research Program, Boulder, CO, 268 pp. (NTIS PB86-105723).
WMO (World Meteorological Organization), 1995: Manual on Codes. WMO-No. 306, Geneva, Switzerland (including Suppl. through No. 3 (VIII.2001)).
Appendix A: NCEP BUFR+string data archived in NCAR ds540.8
Table A1 lists the different "tanks" of NCEP BUFR+string data included in NCAR ds540.8. The "string" refers to the original GTS message in the SHIP (FM 13) or BUOY (FM 18) code.
Changes were tested in the "development" tanks, before finalization in the "operational" tanks, so the former may contain improved BUFR data for the overlap period in 1997 (other development data existed, but were not provided). However, only the operational data, including a more processed form of operational data called "dumped," were used for ICOADS.RT for March 1997-2002--no development data were used.
The dumped data were subjected to dup-merge processing at NCEP in which exact duplicates were removed and partial duplicates blended to create more complete BUFR reports. Also, a variety of real-time interactive QC processing checks were made to add "quality marks" to the data. Parts of the QC compared selected elements (sea level pressure, air and sea surface temperature, and wind speed and direction) to model first guess fields, and allowed for correction of obvious errors in the data (incorrect hemisphere, misplaced decimal, etc.). Quality marks indicating data corrections were used to correct the dumped BUFR data, but quality marks indicating "purge" (recommending that the data not be used) were not used. All the quality marks were retained in the BUFR data (but were not retained in the output ICOADS.RT data, as discussed in Appendix B).
Prior to sometime in March 2002, the dumped string included only the latest message. Starting then, all constituent messages (plus bulletin header information) were included. The messages were generally attached in reverse order of receipt time. However, a complicating factor is that for some buoys the data were reported as two separate messages: subsurface and surface.
Table A2 lists the different NCEP "subtanks" of data within each monthly file, which were assigned different decks within ICOADS. By themselves, these deck numbers do not always provide an accurate mapping to PT.
Table A1. Datasets (provided courtesy of Diane Stokes at NCEP) archived in ds540.8 from NCEP's operational and development tanks. The dumped data were taken from the operational tanks, and subjected to dup-merge and QC editing as discussed in the text. The source ID (SID) column lists the ranges of SIDs assigned to the different operational data (see Appendix B); no development data were used for ICOADS.RT.
1. On a 32-bit platform, the Cray blocking needs to be removed, then the files can be reblocked using a "cwordsh" utility (available from NCEP).
Table A2. Within each monthly file from Table A1, the data were separated into daily files "subtanked" as shown in this table. The deck (DCK) and platform type (PT) columns list the DCK and PT numbers assigned in ICOADS.RT to data from each subtank (see also Appendix B).
1. Note that the names "drifting" and "fixed" used by NCEP are not necessarily accurate, e.g., since some moored buoys report in the BUOY code.
Appendix B: ICOADS.RT field comparison results and processing notes
B1. Field comparison results
To create the ICOADS.RT archive, as discussed in the main text (sec. 3), translation (ii) of the original GTS messages was used for the period March 1997-September 1999, and translation (i) of dumped BUFR data was used for the period October 1999-December 2002. Figure 5 provides field-by-field comparison results between the two translations for the earlier period. Figure 6 provides the comparison results for the later period. These figures illustrate some improvements in BUFR for the later period compared to the earlier period, and helped guide our decision on which translation to use for ICOADS.RT.
For example, Figure 5a shows that many WI values are missing from BUFR in the earlier period, because this indicator was not added by NCEP until late in 1997 (as discussed in sec. 4 of the main text). In contrast, Figure 6a shows good agreement for WI between the two translations in the later period.
Similarly, Figure 5a shows significant amounts of disagreement between wet bulb, dew point, air, and sea surface temperatures (WBT, DPT, AT, and SST) between the two translations in the earlier period, whereas agreement is good (Figure 6a) in the later period. Storage of temperature values in BUFR was extended to hundredths Kelvin at NCEP starting during February 1999. Usage at NCEP of the factor 273.15 for conversion of Celsius temperatures, and rounding to tenths Kelvin precision (until that date the maximum precision available in BUFR), previously led to some temperature errors of 0.1°C.
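The origin of the 0.1°C errors can be illustrated with a short round-trip calculation. A Celsius value reported in tenths, plus 273.15, always ends in 5 in the hundredths place, so storing it in tenths of a Kelvin forces a tie-breaking round. The sketch below is illustrative only: the round-half-up rule and the function name are assumptions, and the rounding actually used at NCEP is not documented here.

```python
from decimal import Decimal, ROUND_HALF_UP

OFFSET = Decimal("273.15")  # Celsius-to-Kelvin factor named in the text


def roundtrip(celsius: str, kelvin_step: str) -> Decimal:
    """Convert a Celsius value to Kelvin at the given storage precision,
    then back to tenths of a degree Celsius.

    Assumption: ties are rounded half up in both directions.
    """
    kelvin = (Decimal(celsius) + OFFSET).quantize(
        Decimal(kelvin_step), rounding=ROUND_HALF_UP
    )
    return (kelvin - OFFSET).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Tenths-Kelvin storage: 20.3 C -> 293.45 K -> 293.5 K -> 20.35 C -> 20.4 C
print(roundtrip("20.3", "0.1"))   # 20.4
# Hundredths-Kelvin storage: 20.3 C -> 293.45 K -> 20.30 C -> 20.3 C
print(roundtrip("20.3", "0.01"))  # 20.3
```

Under these assumptions the value returns 0.1°C too warm when stored in tenths of a Kelvin, but is recovered exactly when hundredths are available.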
Figure 6 indicates that some residual differences exist in the later BUFR data. Generally, however, these problems appear to be minor and confined to secondary elements such as clouds and waves. Further research is planned to document the sources and significance of the individual field differences in Figures 5-6. (Note that field differences shown in Figure 6 in some cases may have arisen from dup-merge processing by NCEP of multiple messages and associated QC corrections, as discussed in sec. B2.)
B2. Processing notes
Figure 7 illustrates the current GTS processing flow for ICOADS.RT. The following are detailed processing notes (labeled in [brackets]) related to Figure 7:
[a] The NCEP BUFR+string data, as described in Appendix A. Regular operational data were used for March 1997-September 1999, and dumped data for October 1999-December 2002. Dup-merge processing was applied by NCEP to the dumped data. Also, "quality marks" from interactive QC were used at NCEP to correct the dumped BUFR data, and the quality marks were supplied with the BUFR data. However, we did not retain the quality marks (and some GTS "bulletin header" information available in BUFR) in ICOADS.RT.
[b] The NCDC GTS data. Presently, the main source for these data is NOAAPort.
[c] Translation (i) of the BUFR part of BUFR+string into IMMA. Until 15Z 23 July 2002, NCEP was inadvertently storing in BUFR the original GTS code values for the wet bulb temperature indicator (WBTI) and sea surface temperature method indicator (SI), rather than the defined BUFR values for these fields. As part of translation (i), these BUFR fields were interpreted as GTS codes through 23 July 2002, and thereafter as BUFR codes (thus a small amount of data after 15Z on 23 July was erroneously translated). Tables B1 and B2 define the mappings between the GTS and BUFR codes, and the resultant fields WBTI and SI in IMMA (WMO, 1995).
Table B1. WMO Code 3855 is the "Indicator for the sign and type of wet-bulb temperature reported" in FM 13. BUFR Code 0 02 039 is the "Method of wet-bulb temperature measurement." The mappings between these codes (by NCEP starting 15Z 23 July 2002), and from the BUFR code into IMMA WBTI (if applicable), are also listed.
Table B2. WMO Code 3850 is the "Indicator for sign and type of measurement of sea-surface temperature" reported in FM 13. BUFR Code 0 02 038 is the "Method of sea-surface temperature measurement." The mappings between these codes (by NCEP starting 15Z 23 July 2002), and from the BUFR code into IMMA SI (if applicable), are also listed.
1. Note that unless SST is missing it is not possible to report a "missing" indicator in the original GTS code (because the indicator must be extant to provide the sign of SST), and there is no value for "other" in BUFR. Therefore, "other" in GTS was mapped to "missing" in BUFR and SI. There is a value for "others" in SI, but this was derived from WMO's delayed-mode IMMT format, in which there are separate values for "others" and "missing" (plus instrument types in addition to the three listed in this table).
[d] Translation (ii) of the string part of BUFR+string into IMMA. Starting sometime in March 2002 multiple reports may exist in the string portion of the dumped BUFR data. In this case, the ICOADS.RT translation (ii) of the original data used only the first (usually most recent) GTS message, since the messages were generally attached in reverse order of receipt time. Appendix A notes the complicating factor that for some buoys the data were reported as two separate (subsurface and surface) messages. In this case, ICOADS.RT processing still utilized only the first message.
[e] Translation (iii) by NCDC of the NCDC GTS data. The NCDC data can be compared with the NCEP data, to check for possible gaps or variations in receipts at the different GTS centers. In addition, the attachment of the original GTS string to IMMA allows the re-translation of the same messages (e.g., using translation software (ii) versus (iii)) and comparison of those results to identify differences in the translation software.
[f] As part of ICOADS.RT processing, a simple form of dupelim was applied to all the data. This "exact" dupelim rejected all but one report among any set of reports with the same year, month, day, hour, latitude, longitude, and ID. As expected, the exact dupelim had much less impact on the dumped data than the regular operational data (Figure 3).
[g] Common software (qctrf) was used on all the data streams to add the regular ICOADS QC and "trimming" flags into the IMMA format.
[h] The final ICOADS.RT archive was assembled using the IMMA output from translation (ii) for March 1997-September 1999, and from translation (i) for October 1999-December 2002.
[i] The older LMRF observational format is widely used, and interfaces with the current statistics program. We plan to continue to offer it as a product until IMMA becomes widely used. (The production LMR format also played a role, not shown in Figure 7, in this processing.)
[j] For 1998-2002 we generated the full suite of ICOADS statistics: standard and enhanced, 2° and 1°, global and equatorial.
[k] NCDC-translated GTS data will be offered from NCDC in the future, but are not yet a regular part of ICOADS.RT. NCDC plans to ingest the data into a database management system (DBMS), for user access, but to archive the IMMA+string data.
[l] The NCDC GTS data are also flowing monthly to NCAR as a backup archive.
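The timing mismatch described in note [c] for the WBTI and SI fields can be sketched as below. This is an illustration under stated assumptions, not the translation software: the function names are hypothetical, "through 23 July 2002" is read as a switch at 00Z on 24 July, and the timestamp is taken to be whichever time governed the change at NCEP (the text does not say whether that is report time or encoding time).

```python
from datetime import datetime, timezone

UTC = timezone.utc
# NCEP began storing defined BUFR values at 15Z 23 July 2002 (from the text).
NCEP_SWITCH = datetime(2002, 7, 23, 15, tzinfo=UTC)
# Assumed reading of "through 23 July 2002" for translation (i).
TRANSLATION_SWITCH = datetime(2002, 7, 24, 0, tzinfo=UTC)


def stored_code_table(t: datetime) -> str:
    """Code table actually held in the BUFR WBTI/SI fields at time t."""
    return "GTS" if t < NCEP_SWITCH else "BUFR"


def assumed_code_table(t: datetime) -> str:
    """Code table that translation (i) assumed when reading those fields."""
    return "GTS" if t < TRANSLATION_SWITCH else "BUFR"


def misinterpreted(t: datetime) -> bool:
    """True where stored and assumed code tables differ."""
    return stored_code_table(t) != assumed_code_table(t)


print(misinterpreted(datetime(2002, 7, 23, 12, tzinfo=UTC)))  # False
print(misinterpreted(datetime(2002, 7, 23, 18, tzinfo=UTC)))  # True
print(misinterpreted(datetime(2002, 7, 24, 1, tzinfo=UTC)))   # False
```

Under this reading, the erroneously translated data are confined to the roughly nine hours between 15Z on 23 July and the end of that day.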