===============================================================================
International Comprehensive Ocean-Atmosphere Data Set (ICOADS): Release 2.2
Data Preconditioning and Duplicate Elimination: 1998-2004
13 October 2005
===============================================================================
Document Revision Information (previous version: none):
-------------------------------------------------------------------------------

{1. Introduction}

This document describes the rules for a sequence of processing steps
performed for Release 2.2 1998-2004 ICOADS.DM data, including duplicate
elimination (dupelim). Preconditioning (sec. 2), the first step in this
sequence, was used to delete reports, or to correct or modify individual
data fields within a given report. The second step involved setting the
IMMA fields for platform type (PT) and ID indicator (II) (sec. 3). The
final step was actual dupelim processing (sec. 4).

{2. Preconditioning}

Sec. 2.1 gives the rules for report deletions; sec. 2.2 gives the rules for
field modifications. As with setting PT and II (sec. 3), deck is the field
that initially determines which rules apply. Decks that are not specified
are not subject to preconditioning. Some rules are labeled with a
lower-case letter, which indicates that more than one rule applies to a
deck. Dates indicated as part of the rules are inclusive, e.g.,
"1998-October 1999" refers to the beginning of 1998 through the end of
October 1999.

{2.1 Report deletions}

All decks
  Rules:
    Delete any report outside 1998-2004.
  Background:
    Some of the input data sources extended outside of the 1998-2004 update
    period.

Deck 144: TAO/TRITON and PIRATA Buoys (from PMEL and JAMSTEC)
  Rules:
    Delete any deck 144 report not reported on the whole hour (i.e., if
    tenths and hundredths of hour are other than zero).
  Background:
    TAO and PIRATA moorings have transitioned to high-resolution 10-minute
    average data.
    In view of data volume considerations, we sub-sample the data to achieve
    better agreement with other buoy datasets (the complete data are,
    however, available as auxiliary data).

Deck 714: Canadian Marine Environmental Data Service (MEDS) Buoys
  Rules:
    a) For 1998-October 1999, delete any report from deck 714 containing an
       ID whose first two characters are "46".
    b) Delete any report from deck 714 containing an ID whose first two
       characters are "91".
    c) Delete any deck 714 report if the third position of the ID falls in
       the range 0-4 (moored buoys).
  Background:
    a) Gaps exist in the MEDS archive because certain GTS bulletin headers
       were reserved for internal US circulation. It is not known exactly
       which bulletin headers were involved, but the set of all buoys with
       ID starting with "46" is our best approximation of the scope of the
       problem at this time. We will use the NCEP data (see deck 794) to
       get a more complete set of reports for these buoys during this
       limited period.
    b) Beginning around November 1989, IDs for some NDBC Western Pacific
       C-MAN (Westpac) stations took a numeric form resembling a 5-digit
       WMO buoy number except beginning with "91" (not legitimate starting
       digits for a buoy number; see for additional background). Reports
       from these stations have been sporadically misassigned into moored
       and drifting buoy datastreams (e.g., decks 714, 793, 794). [NOTE:
       This rule is less restrictive than the one in , which required that
       the "91" be followed by three numeric characters.]
    c) MEDS quality controlled both drifting and moored buoy data, limited
       to data reported in the BUOY (FM 18) code. We use deck 144 for major
       arrays (TAO/TRITON and PIRATA) reporting in the BUOY code, and, for
       simplicity, NCEP data are used for any other/foreign moorings
       reporting in the BUOY code (appearing in deck 794).
       [NOTE: Similarly, we use deck 883 for the US NDBC array reporting in
       the SHIP code, and, for simplicity, NCEP data are used for any
       other/foreign moorings reporting in the SHIP code (appearing in deck
       793).]

Deck 792: US Natl. Cntrs. for Environ. Pred. (NCEP) BUFR GTS: Ship Data
  Rules:
    a) Delete any report from deck 792 with ID set to BOUY.
    b) For October 1999-May 2000, delete any report from deck 792 with ID
       set to SHIP.
  Background:
    a) See .
    b) In the input NCEP data for October 1999-May 2000, we estimated that
       ~10K deck 792 reports per month with ID=SHIP duplicated moored buoy
       reports in deck 793 (from NDBC plus some foreign moorings).
       Otherwise during 1998-2002, there were typically 1-4K (presumably
       legitimate) reports with ID=SHIP per month in deck 792. This rule
       eliminates the problem, but it also removes the 1-4K/month
       presumably legitimate SHIP reports.

Deck 793: NCEP BUFR GTS: Buoy Data (transmitted in FM 13 "SHIP" code)
  Rules:
    a) Delete any report from deck 793 containing an ID whose first five
       characters (ignoring any trailing characters) match one of the 98
       NDBC moored buoy numbers that may have appeared on GTS during
       1998-2004:
         41001, 41002, 41004, 41008, 41009, 41010, 41012, 41013, 41025, 42001,
         42002, 42003, 42007, 42019, 42020, 42035, 42036, 42038, 42039, 42040,
         42041, 42053, 42054, 44004, 44005, 44007, 44008, 44009, 44011, 44013,
         44014, 44017, 44018, 44025, 44027, 45001, 45002, 45003, 45004, 45005,
         45006, 45007, 45008, 45012, 46001, 46002, 46003, 46005, 46006, 46011,
         46012, 46013, 46014, 46015, 46022, 46023, 46025, 46026, 46027, 46028,
         46029, 46030, 46035, 46041, 46042, 46045, 46047, 46050, 46053, 46054,
         46059, 46060, 46061, 46062, 46063, 46066, 46069, 46071, 46072, 46075,
         46078, 46079, 46080, 46081, 46082, 46083, 46084, 46086, 46087, 46088,
         46089, 48011, 51001, 51002, 51003, 51004, 51028, 41611.
    b) This rule is identical to rule b) under deck 714, except that it is
       applied to deck 793.
    c) Delete any deck 793 report if the third position of the ID falls in
       the range 5-9 (drifting buoys).
    d) This rule is identical to rule a) under deck 794, except that it is
       applied to deck 793.
  Background:
    a) NDBC (deck 883) provides a higher quality and more complete set of
       data for these buoys.
    b) (See background under deck 714.)
    c) This is similar to rule c) under deck 794. No drifting buoys are
       expected to be reporting in the SHIP code. However, small amounts of
       data with drifting buoy IDs appear in deck 793, apparently due to
       GTS decoding problems, which this rule will address.
    d) (See background under deck 794.)

Deck 794: NCEP BUFR GTS: Buoy Data (transmitted in FM 18 "BUOY" code)
  Rules:
    a) Delete any report from deck 794 containing an ID whose first five
       characters (ignoring any trailing characters) match one of the 114
       TAO/TRITON and PIRATA buoy numbers that may have appeared on GTS
       during approximately 1990-2005 (rule does not test for date):
         13008, 13009, 13010, 13011, 15001, 15002, 15003, 15004, 15005, 15006,
         31001, 31002, 32011, 32303, 32304, 32305, 32315, 32316, 32317, 32318,
         32319, 32320, 32321, 32322, 32323, 41026, 43001, 43008, 43011, 43301,
         51006, 51007, 51008, 51009, 51010, 51011, 51012, 51013, 51014, 51015,
         51016, 51017, 51018, 51019, 51020, 51021, 51022, 51023, 51301, 51302,
         51303, 51304, 51305, 51306, 51307, 51308, 51309, 51310, 51311, 52001,
         52002, 52003, 52004, 52006, 52007, 52008, 52010, 52011, 52012, 52043,
         52044, 52045, 52046, 52071, 52072, 52073, 52074, 52075, 52076, 52077,
         52078, 52079, 52080, 52081, 52082, 52083, 52084, 52085, 52086, 52087,
         52088, 52301, 52302, 52303, 52304, 52305, 52306, 52307, 52308, 52309,
         52310, 52311, 52312, 52313, 52314, 52315, 52316, 52317, 52318, 52319,
         52320, 52321, 53056, 53057.
       These deleted data are written to a separate file, and blended with
       the sub-sampled deck 144 data. Further processing deletes any report
       whose date and ID match deck 144 (only date, not time, is considered
       for the match).
       Finally, the processed data are reintroduced.
    b) This rule is identical to rule b) under deck 714, except that it is
       applied to deck 794.
    c) During 1998-October 1999, retain any report from deck 794 containing
       an ID whose first two characters are "46". Otherwise, delete any
       deck 794 report (over the full update period 1998-2004) if the third
       position of the ID falls in the range 5-9 (drifting buoys).
  Background:
    a) PMEL and JAMSTEC (deck 144) provide higher quality and more complete
       data for these buoys than GTS, but the delayed-mode data coverage
       provided by deck 144 is discontinuous, and varies from buoy to buoy.
       This rule provides an approximate solution to the problem of
       retaining the GTS data only when the delayed-mode data are
       unavailable. [NOTE: The rule is applied simultaneously to decks
       793-794. I.e., the matching reports from both decks 793 and 794 are
       deleted, combined with deck 144, and then reintroduced after the
       overlaps between decks 793-794 and 144 are eliminated. PMEL and
       JAMSTEC data are transmitted in the GTS BUOY code, and thus not
       expected to appear in deck 793. However, small amounts of data under
       those IDs do appear in deck 793, apparently as a result of GTS
       problems.]
    b) (See background under deck 714.)
    c) In general, MEDS (deck 714) provides a higher quality and more
       complete set of drifting buoy data. However, some gaps exist in the
       MEDS archive associated with IDs starting with "46" (see deck 714).

Deck 795: NCEP BUFR GTS: Coastal Marine Automated Network (C-MAN) Data
  Rules:
    Delete all deck 795 reports.
  Background:
    NDBC (deck 883) provides a higher quality and more complete set of
    C-MAN data.

Deck 883: US National Data Buoy Center (NDBC) Data
  Rules:
    a) Delete any deck 883 report if the third position of the ID falls in
       the range 5-9 (drifting buoys), applicable only to IDs not starting
       with "91".
    b) Delete any deck 883 report with ID=42A03.
  Background:
    a) MEDS (deck 714) provides a higher quality and more complete set of
       drifting buoy data.
    b) As discussed in , NDBC has replaced the digit in the middle position
       of ID with a letter, to indicate use of an alternative payload.
       During the 1998-2004 period the only such ID that appeared in
       archival NDBC data obtained from NCDC in 2005 was "42A03". Based on
       spot checks, these reports (limited to October 2003) duplicated
       reports with ID=42003. Further work would be needed to assess any
       differences between the reports identified by 42A03 versus 42003;
       the 42003 version was selected for this update.

{2.2 Field modifications}

Field modifications take the form of deleting, modifying, or adding a field
(see sec. 3 for the rules used for setting the IMMA fields for platform type
and ID indicator).

All decks
  Rules:
    a) Left-justify ID, with missing right fill.
    b) Compute a missing dew point temperature (DPT) if wet bulb
       temperature (WBT) and air temperature (AT) are extant; if sea level
       pressure (SLP) is missing, 1015.0 is used as SLP. This rule is not
       applied if WBT is greater than AT. Constants ACON and BCON are set
       for computation of DPT relative to water: ACON=7.5 and BCON=237.3.
       The following Fortran code is then used to attempt computation of
       DPT:

             ESW = 6.1078*10.**(WBT*ACON/(WBT+BCON))
             E = ESW-(.00066*SLP)*(((.00115*WBT)+1)*(AT-WBT))
             IF(E.LT.0.) RETURN
             CCON = ALOG10(E/6.1078)
             DPT = BCON*CCON/(ACON-CCON)

       where the RETURN if vapor pressure (E) is less than zero leads to an
       error diagnostic, and otherwise the resulting DPT is rounded to the
       nearest 0.1°C. DPTI is left unchanged (DPTI may be missing or extant
       depending on input source; DPTI information does not exist in FM
       13).
  Background:
    a) See .
    b) This step prepares for statistics by calculating DPT where it would
       otherwise be unnecessarily missing. [NOTE: The rule differs from the
       implementation in that: (i) there is no error attachment to check in
       the IMMA format, and (ii) DPTI is unchanged (whereas T2 in LMR was
       set in the earlier processing).]

{3. Rules for assignment of platform type and ID indicator}

This section describes the rules used to set IMMA fields platform type (PT)
and ID indicator (II). For decks indicated by an asterisk (*), final
assignment of PT and II was made in conversion from the original input
format into IMMA; otherwise, assignment is determined at this stage of
processing. Where assignment is determined here, it is important to note
that initially PT and II are set to missing (even if ID is extant). Also a
final check for generic ID (II = 2) overrides a previously set II. [NOTE:
processing differed in that II was initialized to 0 if ID was extant. As a
result, unrecognizable IDs were properly set to II=0 in the earlier
processing, whereas in that case here II is left missing. Future work should
seek to make this processing consistent.] See for additional background on
the rules and nomenclature, and on the referenced checks on ID. No data were
output for decks indicated by a pound sign (#), and the PT/II information is
included only for reference.

Deck 144: TAO/TRITON and PIRATA Buoys (from PMEL and JAMSTEC)*
    PT = 6, and II = 3

Deck 714: Canadian Marine Environmental Data Service (MEDS) Buoys
    if moored_buoy_ID, PT = 6 and II = 3
    else if drifting_buoy_ID, PT = 7 and II = 3
  Background:
    Rules to handle moored buoy IDs are retained only for reference. No
    moored buoy data should appear in this deck after preconditioning.

Deck 792: US Natl. Cntrs. for Environ. Pred. (NCEP) BUFR GTS: Ship Data*
    PT = 5; if ship_ID, II = 1

Deck 793: NCEP BUFR GTS: Buoy Data (transmitted in FM 13 "SHIP" code)*
    if moored_buoy_ID or "BUOY", PT = 6; if moored_buoy_ID, II = 3
    else if drifting_buoy_ID, PT = 7 and II = 3
  Background:
    Rules to handle drifting buoy IDs are retained only for reference. No
    drifting buoy data should appear in this deck after preconditioning
    (moreover, drifting buoys are not reported in the SHIP code and in
    theory should never appear in this deck).
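
The PT/II rule just given for deck 793 can be illustrated with a minimal
Python sketch. It is an interpretation, not the production code. The
function names are hypothetical, and the moored/drifting test is assumed to
be the third-position rule of sec. 2.1 (0-4 moored, 5-9 drifting) applied
to the first five characters of a numeric ID; the actual moored_buoy_ID and
drifting_buoy_ID checks are defined in the referenced documentation and may
be stricter. IDs starting with "91" get no special handling here, and the
final generic-ID check (II = 2) is not modeled.

```python
from typing import Optional, Tuple


def classify_buoy_id(platform_id: str) -> Optional[str]:
    """Return "moored", "drifting", or None for an unrecognized ID.

    Assumption: a buoy ID is five ASCII digits (trailing characters are
    ignored), and the third digit separates moored (0-4) from drifting (5-9).
    """
    core = platform_id.strip()[:5]
    if len(core) < 5 or not (core.isascii() and core.isdigit()):
        return None
    return "moored" if core[2] in "01234" else "drifting"


def assign_pt_ii_deck_793(platform_id: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (PT, II) for a deck 793 report; None stands for missing."""
    kind = classify_buoy_id(platform_id)
    if kind == "moored":
        return 6, 3          # moored buoy with a recognized ID
    if platform_id.strip() == "BUOY":
        return 6, None       # PT is set, II stays missing at this stage
    if kind == "drifting":
        return 7, 3          # retained only for reference (see Background)
    return None, None        # PT and II remain missing


if __name__ == "__main__":
    for example in ("46001", "BUOY", "13508", "XYZ"):
        print(example, assign_pt_ii_deck_793(example))
```

Returning None for an unrecognized ID mirrors the statement above that PT
and II are initially set to missing and are left missing when no rule
applies.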

Deck 794: NCEP BUFR GTS: Buoy Data (transmitted in FM 18 "BUOY" code)*
    if moored_buoy_ID or "BUOY", PT = 6; if moored_buoy_ID, II = 3
    else if drifting_buoy_ID, PT = 7 and II = 3
  Background:
    Some drifting buoy data (ref. sec. 2.1) remain in this deck after
    preconditioning.

Deck 795: NCEP BUFR GTS: Coastal Marine Automated Network (C-MAN) Data#
    PT = 13; if C-MAN_ID, II = 5
  Background:
    Deck deleted during report preconditioning.

Deck 796: NCEP BUFR GTS: Miscellaneous (OSV, plat, and rig) Data#
    if OSV_ID, PT = 2 and II = 1
    else if "RIGG" or "PLAT", PT = 15
  Background:
    No data have been assigned to this deck. These are rules used to handle
    older ID forms. [NOTE: The only remaining OSV, Station "Mike", is
    operated by Norway (the Polarfront, recently with call sign LDWR).
    Because of uncertainty about whether the call sign has changed, reports
    from call sign LDWR will be found in deck 792 with PT=5. The ID
    conventions used for fixed drilling rigs and platforms (e.g., in the
    North Sea) reporting over GTS have evolved over the years, and may vary
    nationally. So far, we have been unable to locate sufficient
    documentation or metadata to establish time-varying rules to set PT/II
    for these stations. Probably they are identified as ships or buoys in
    recent ICOADS data.]

Deck 883: US National Data Buoy Center (NDBC) Data
    if moored_buoy_ID, PT = 6 and II = 3
    else if drifting_buoy_ID, PT = 7 and II = 3
    else if C-MAN_ID, PT = 13 and II = 5
  Background:
    Rules to handle drifting buoy IDs are retained only for reference. No
    drifting buoy data should appear in this deck after preconditioning.

{4. Duplicate elimination}

The 1998-2004 processing involved a scaled-down dupelim procedure, which was
applied only to the NCEP GTS data. The remaining decks were passed through
without change, as listed in Table 1. The scaled-down dupelim is an "exact"
procedure, which rejected all but one report among any set of reports with
the same year, month, day, hour, latitude, longitude, and ID.
This is identical to the procedure applied to Release 2.1 ICOADS.RT data for
March 1997-2002.

Table 1. Duplicate elimination special deck rules and priority codes (after
Table 1 in ). Deck 795 is eliminated during preconditioning, and deck 796 is
currently nonexistent (listed only for reference). No priority was applied
during this scaled-down processing, thus the column is empty. The two rules
are:
  [a] Absolute pass through (as defined in ).
  [r] The ICOADS.RT "exact" dupelim procedure.
-------------------------------------------------------------------------------
Rule  Priority  Deck Description
===============================================================================
[a]   -         144  TAO/TRITON and PIRATA Buoys (from PMEL and JAMSTEC)
[a]   -         714  Canadian Marine Environmental Data Service (MEDS) Buoys
[r]   -         792  NCEP BUFR GTS: Ship Data
[r]   -         793  NCEP BUFR GTS: Buoy Data (transmitted in FM 13 "SHIP")
[r]   -         794  NCEP BUFR GTS: Buoy Data (transmitted in FM 18 "BUOY")
(deleted)       795  NCEP BUFR GTS: Coastal Marine Automated Network (C-MAN)
(nonexistent)   796  NCEP BUFR GTS: Miscellaneous (OSV, plat, and rig) Data
[a]   -         883  US National Data Buoy Center (NDBC) Data
-------------------------------------------------------------------------------
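
The "exact" procedure of sec. 4, together with the pass-through rule in
Table 1, can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the code used for the release.
The dictionary keys are hypothetical field names, the first report
encountered in each duplicate set is assumed to be the one retained (the
text says only that all but one were rejected), and matching is assumed to
span decks 792-794 together rather than each deck separately.

```python
from typing import Dict, Iterable, Iterator

# Decks handled by rule [r] in Table 1; every other deck is rule [a].
EXACT_DUPELIM_DECKS = {792, 793, 794}

# Hypothetical names for the seven match fields listed in sec. 4.
MATCH_FIELDS = ("year", "month", "day", "hour", "lat", "lon", "id")


def exact_dupelim(reports: Iterable[Dict]) -> Iterator[Dict]:
    """Yield reports in input order, dropping exact duplicates.

    Assumption: the first report seen for a given key is kept and later
    reports with the same key are rejected.
    """
    seen = set()
    for report in reports:
        if report["deck"] not in EXACT_DUPELIM_DECKS:
            yield report  # rule [a]: absolute pass through
            continue
        key = tuple(report[name] for name in MATCH_FIELDS)
        if key in seen:
            continue      # rule [r]: reject all but one report per key
        seen.add(key)
        yield report


if __name__ == "__main__":
    base = {"year": 2001, "month": 5, "day": 3, "hour": 12.0,
            "lat": 10.0, "lon": 200.0, "id": "ABCD"}
    sample = [dict(base, deck=792), dict(base, deck=792),
              dict(base, deck=883), dict(base, deck=883)]
    print(len(list(exact_dupelim(sample))))
```

Because the key holds only the seven match fields, two reports that differ
in any observed variable but agree on time, position, and ID are still
treated as duplicates, which is what makes the procedure "exact" on the key
rather than on the full report.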