=============================================================================== International Comprehensive Ocean-Atmosphere Data Set (ICOADS): Release 2.1 Data Preconditioning and Duplicate Elimination: 1970-79 27 February 2004 ================================================================= Document Revision Information (previous version: 9 September 2002): Updates for Release 2.1 and ICOADS. ------------------------------------------------------------------------------- {1. Introduction} This document describes the rules for a sequence of processing steps performed in the duplicate elimination (dupelim) program for 1970-79 (originally referred to as COADS Release 1b) data. Preconditioning (sec. 2), the first step in this sequence, was used to delete Long Marine Reports (LMR6), or to correct or modify individual data fields within a given report. The second step involved setting the LMR fields for platform type (PT) and ID indicator (II) (sec. 3). The final step was actual QC/dupelim processing (sec. 4). During dupelim, additional reports were eliminated, and a limited number of changes was made to the contents of reports by substitution between duplicates. [NOTE: Because of processing differences, the three original COADS updates that compose ICOADS.DM, and accompanying documentation, are referred to as follows: Release 1a: 1980-97 Release 1b: 1970-79 1946-69 Release 1c: 1784-1949 These four documents describe the "preconditioning" and duplicate elimination processing used to create LMR for the indicated periods. 1946-49 Release 1b data were replaced by Release 1c data.] {2. Preconditioning} Sec. 2.1 gives the rules for report deletions; sec. 2.2 gives the rules for field modifications. Similarly to setting platform type and the ID indicator (sec. 3), deck is the field that initially determines the rules to be used. Decks that are not specified are not subject to preconditioning. 
Some rules are labelled with a lower-case letter, which indicates that more than one rule applies to a deck. Dates indicated as part of the rules are inclusive, e.g., "February-July 1975" refers to the beginning of February 1975 through the end of July 1975.

{2.1 Report deletions}

Deck 119: Japanese Ships No. 2
Rules: Delete any report from deck 119 after 30 June 1961.
Background: Uwai and Komura (1992) indicate that the period of record for data previously digitized from the Kobe Collection (included in ICOADS as decks 118 and 119) should extend only through June 1961. Two reports with incorrect year (1971) are present in the 1970-79 data.

Deck 143: Pacific Marine Environmental Laboratory (PMEL) Buoys
Rules: Delete any report from deck 143, except retain all deck 143 reports from SID=24.
Background: Examination of sample data has indicated that deck 143 is duplicated in the Release 1 input data from two source IDs: SID=18 (70s Decade) and SID=24 (Buoy Data).
[NOTE: Deck 143 was subject to dupelim during Release 1 processing. In contrast, deck 143 is now handled via a "limited pass through" (see sec. 4). This rule ensures that only the SID=24 copy of deck 143 is passed through. Release 1 processing may have resulted in some loss of data if multiple buoys exist in deck 143, e.g., there were apparently two buoys in the Pacific near Seattle starting late March 1976 (IDs: 0690000 and 0690100; locations: 48.3N, 236.4E and 48.1N, 236.6E).]

Deck 714: Canadian Marine Environmental Data Service (MEDS) Buoys
Rules: Delete any report from deck 714 with position flagged doubtful (i.e., lat/lon flag=3).
Background: MEDS gathered from the GTS, and quality controlled, drifting buoy data for 1978-79, around the period of the First GARP Global Experiment (FGGE).
[NOTE: This rule assumes that drifting buoy data are not available or not retained from other sources.
FGGE IIb drifting buoy data (deck 749, source ID 55) are subject to automatic deletion at the conclusion of dupelim. The FGGE IIb set, apparently utilizing data supplied directly by Service Argos versus gathered from GTS, comprises a much larger number of reports (787K versus 235K input MEDS reports). For this update we chose to use the MEDS data because of some evidence of internal duplicates and other questions about the level of quality control of the FGGE IIb data, and considerations of consistency in using a single drifting buoy product from MEDS for both the Release 1a and 1b periods. Because of the large difference in numbers of reports, however, this is an important question for reconsideration for a future update. To resolve this question, we feel it would be most helpful if an organization such as MEDS were someday able to integrate available Argos sources with GTS data and create a uniformly quality controlled archive covering the FGGE period.] Deck 749: First GARP Global Experiment (FGGE) Level IIb Rules: Delete any report from deck 749 with preexisting platform type set to indicate moored buoy (PT=6). Background: A few known location errors for NDBC moored buoys exist in the FGGE IIb archive. Better quality and more complete NDBC moored buoy data should be provided by decks 876-882. We rely on deck 888 to provide foreign moored buoy data during the FGGE period. [NOTE: Drifting buoy and oceanographic data are also removed from deck 749, after dupelim comparisons have been made (sec. 4). We rely on decks 714 (MEDS) and 780 (Levitus WOA) to supply these data.] Deck 780: Levitus World Ocean Atlas (WOA) Rules: Delete any report from deck 780 with sea surface temperature missing. Background: Quality control flags provided with the WOA were employed during conversion to LMR6 to screen out clearly erroneous sea surface temperature approximations. Since SST is the only regular data field available in the WOA, entire WOA reports are deleted if SST is missing. 
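
The report-deletion rules given so far (decks 119, 143, 714, 749, and 780) can be sketched as a single predicate. This is a minimal illustration in Python, not the dupelim code: the Report class, its field names, and the function name are hypothetical, and a missing value is represented as None.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Report:
    """Hypothetical, simplified stand-in for an LMR6 report."""
    deck: int
    sid: int                             # source ID
    obs_date: date                       # date of observation
    pt: Optional[int] = None             # platform type (None = missing)
    sst: Optional[float] = None          # sea surface temperature (None = missing)
    position_flag: Optional[int] = None  # lat/lon flag (3 = doubtful)


def should_delete(r: Report) -> bool:
    """Return True if one of the sec. 2.1 rules for decks 119-780 applies."""
    if r.deck == 119:
        # Kobe Collection data should extend only through June 1961.
        return r.obs_date > date(1961, 6, 30)
    if r.deck == 143:
        # Keep only the SID=24 copy of the PMEL buoy data.
        return r.sid != 24
    if r.deck == 714:
        # Drop MEDS buoy reports whose position is flagged doubtful.
        return r.position_flag == 3
    if r.deck == 749:
        # Drop FGGE IIb reports already marked as moored buoys.
        return r.pt == 6
    if r.deck == 780:
        # SST is the only regular field in WOA reports.
        return r.sst is None
    # Decks not handled here are left alone by this sketch.
    return False
```

A report from deck 119 dated 1971, for example, is deleted because its date falls after 30 June 1961, which is how the two reports with incorrect year are removed.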
[NOTE: Conversion also excluded SST out of the range allowed for LMR6, thus SST should not appear in the error attachment.] Deck 888: US Air Force Global Weather Central (GWC) Rules: Delete all deck 888 reports containing "EB" as the first two positions of ID, and two digits as the third and fourth positions. Background: Some location errors exist in the GWC data for NDBC moored buoys. Better quality and more complete NDBC moored buoy data should be provided by decks 876-882. This rule retains any foreign moored buoy data, which we believe were assigned "BUOY" rather than an EBxx number. [NOTE: This rule may not delete all the NDBC buoys in the event GWC assigned "BUOY" rather than an EBxx number. Since the GWC format did not incorporate a standard 5-digit WMO number for moored buoys, GWC maintained a list relating EBxx numbers to WMO numbers for NDBC buoys, including a predesignated location. If for some reason a buoy was not on the list or the listed location was out-of-date, problems could have arisen.] Deck 891: US National Oceanographic Data Center (NODC) Surface Data Rules: Delete all deck 891 reports, except retain station data (SD) reports as indicated by PT=10 (oceanographic station data; SD/Co22) provided they contain at least one of the following elements: wind speed, wind direction, present weather, past weather, sea level pressure, air temperature, wet bulb temperature, or total cloud amount (delete reports containing only sea surface temperature, or other elements such as dew point temperature that are not specified above). Background: The Levitus Atlas provides more complete oceanographic data, including data from bathythermographs, than are provided by deck 891 (deck 891 currently extends only through 1977). However, the Levitus Atlas does not at this time provide accompanying meteorological data. 
Deletion of reports without any of the specified weather elements is designed to avoid creating "empty" SD reports after deletion of SST under field modifications (see sec. 2.2), since some deck 891 SD reports contain only SST.
[NOTE: Some dew point temperatures were found in sample data, but NCDC (no-date 3) states that only wet bulb temperature was available in the data provided by NODC, thus we assume any DPT was computed. This rule may also result in the deletion of reports containing only supplemental fields, including non-standard wave data that apparently were not transferred into the regular wave fields (ref. NCDC, no-date 3). "Ship type" for NODC data was set during conversion to LMR5 (see Release 1, p. I13), and transformed into PT during conversion from LMR5 to LMR6 as specified by . As indicated in sec. 2.2, NODC (SID=11-12) PT values are not subject to change during "lmrfix" processing.]

Deck 927: International Marine (US- or foreign-keyed ship data)
Rules: Delete all deck 927 reports during February-July 1975 except for those containing SID=26 (1980-84 Annual Receipts; corrections) or received after Release 1 (SID=22 or SID greater than 24).
Background: According to Cram (1986) and Jenne (1986), an erroneous deck 927 test file was included in the February-July 1975 data provided by NCDC for Release 1. The data were located in the wrong position because they were missing the hundreds position of longitude (e.g., a report at 130W would be located in the test file at 30W), which may account for large amounts of data mislocated over land in 1975. The erroneous reports (36,867 total) were duplicates, such that the corresponding correct reports were in proper position. NCDC supplied volume D4ZZ01 (SID=26) containing corrected February-July deck 927 data (483,600 reports, of which 6 were rejected in conversion to LMR).
The 37K erroneous reports formed a test file (not real data) generated to test a particular problem and were merged into the archive by mistake. Comparisons for selected 10-degree boxes (i.e., 201, 206, and 216; Marsden Squares 121, 116, and 142) of the data on D4ZZ01 versus deck 927 data in NCDC's TD-1129 copy of ICOADS, in which NCDC had previously implemented a correction for the problem, revealed exactly the same deck 927 reports for February-July 1975. The records were pulled from both sources (D4ZZ01 and TD-1129) and found to match exactly in number, and bit-for-bit (via Unix command "diff") after each file was sorted, for each of the three test squares. The test results lead us to believe that D4ZZ01 is a complete and corrected set of deck 927 reports for the period February-July 1975, except for any delayed deck 927 reports received subsequent to Release 1 (as determined by SID). Therefore, to correct ICOADS we can delete all 927 records globally for the period February-July 1975, except for the corrected set (D4ZZ01) and any reports received after Release 1. The number of deleted reports should be approximately 484K plus 37K reports, except that more deletions may be due to multiple receipts of the deck 927 data. {2.2 Field modifications} Field modifications take the form of deleting, modifying, or adding a field, including extraction in some cases of information from the supplemental attachment (Attm4) (see sec. 3 for the rules used for setting the LMR fields for platform type and ID indicator). Erroneous data values already stored in the error attachment (Attm5) of LMR may or may not be affected by preconditioning, as specified under the description of each set of preconditioning rules. In addition, deleted data fields, for example, are not written out to the error attachment. These actions are important to note for any user of the error attachment, in case of unexpected effects. 
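
Many of the rules that follow either extend a deletion to the error attachment ("applicable to data in the error attachment") or leave the error attachment alone. A minimal Python sketch of that distinction is given below. The representation is an assumption made only for illustration: the regular section and Attm5 are modelled as two dictionaries keyed by field name, and the class and function names are hypothetical.

```python
from typing import Dict, Optional


class Report:
    """Hypothetical report: regular fields plus an error attachment (Attm5)."""

    def __init__(self, fields: Dict[str, Optional[float]],
                 errors: Optional[Dict[str, str]] = None) -> None:
        self.fields = dict(fields)        # regular section (None = missing)
        self.errors = dict(errors or {})  # erroneous raw values, by field name


def delete_field(report: Report, name: str,
                 include_error_attachment: bool) -> None:
    """Set a field to missing; optionally drop it from the error attachment.

    The deleted value is simply discarded: it is never copied into the
    error attachment.
    """
    report.fields[name] = None
    if include_error_attachment:
        report.errors.pop(name, None)
```

With this model, a rule stated as "applicable to data in the error attachment" corresponds to include_error_attachment=True, whereas a rule that leaves the error attachment unchanged corresponds to False.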
All decks Rules: a) Apply "lmrfix" rules (and some related rules described here) [NOTE: For convenience, all lmrfix rules are described here under "all decks" even though some rules are actually applied only to specific decks]: i) Delete dew point temperature (DPT) from decks 900 (Australian) and 155-156 (HSST), applicable to data in the error attachment. [NOTE: decks 155-156 are pre-1970s data]. ii) Delete wet bulb temperature (WBT) from deck 899 (South African Whaling), applicable to data in the error attachment. [NOTE: deck 899 is pre-1970s data.] iii) Reset platform type (PT) according to source ID (SID) or deck: 1. Release 1 OSV data: If SID=8, 9, or 20, and if PT is not already 3 (ocean station vessel; on station): set PT=2 (ocean station vessel; off station or station proximity unknown). This rule is not applied if PT is in the error attachment. 2. Release 1 data containing original PT information: If SID= 1, 5, 6, 10, 11, or 12, leave PT unchanged (including PT in error attachment). 3. Data processed after Release 1 for which PT was set during conversion to LMR6: For decks 145 (PMEL Daily Equatorial Mooring), 749 (FGGE Level IIb), and 780 (Levitus Atlas), leave PT unchanged (including PT in error attachment). 4. For all remaining data from Release 1 (i.e., SID=2-4, 7, 13-17, 19, 21, 24; and SID=18 and SID=23 reports, including those from decks 143 or 876-882) and data processed after Release 1 (SID=22 and SID greater than 24): set PT to missing (PT in error attachment is left unchanged). b) Left-justify ID, with missing right fill. This rule is only applied if leading missing positions are not associated with data in the error attachment. Any ID characters in the error attachment are similarly shifted to ensure that, e.g., they will assume the proper position in a printout. c) For reports converted from TD-1100 (as determined by source ID; see Release 1, supp. F, Table F1-2), extract ID from Attm4 according to deck as specified in Table 1. 
If a report from a deck listed in Table 1 with a question mark in the "ID length" column is encountered, or a deck not listed in Table 1 that was converted from TD-1100 is encountered, issue an error diagnostic and print the report including the supplemental attachment.

Table 1. Location of ID information for 1970-79 source TD-1100 decks. This table defines the field or concatenation of fields that is to be extracted from Attm4 to form ID (LMR6 fields 57-64) of the length listed, applicable only to reports converted from TD-1100 (i.e., "00" in Fmt column), as determined by SID. When the table contains a question mark for "ID length," no extraction is made but a diagnostic is printed. The "II" column indicates the resultant setting of II (LMR6 field 56), provided extant ID information results from the extraction operation, or refers to sec. 3 if setting is resolved there. Note that in some cases decks may previously have been converted from TD-1100 into TD-1127 or TD-1129, with possible loss of ID information.
-------------------------------------------------------------------------------
Deck     Doc*  Description  Fmt  Supplemental or regular positions**        ID length  II
===============================================================================
128      T tt  US/For.Exch  00   90-3 SON                                   4          9
555      U --  Monterey     00   90-3 U                                     4          (sec.3)
666      U --  Tuna         00   ?                                          ?          -
876-882  W x5  NDBC         00   90-4 (extended "ship no.") U               5          (sec.3)
888      X xb  GWC          00   90-3 U                                     4          (sec.3)
889      Y xb  Autodin      00   90-3 U                                     4          (sec.3)
891      Z x6  NODC         00   90-5 (extended "ship no."; from NODC) U    6          7
-------------------------------------------------------------------------------
* Three non-blank sub-columns appear under this column:

  An upper-case letter in the first sub-column indicates the source of NCDC documentation for the TD-1100 format, if any:
    U = undocumented: not included in NCDC (1968), and no other documentation is known to exist
    T = NCDC (1968)
    W = NCDC (1972b) [NOTE: Positions 90-4 appear in hand-written changes to NCDC (1972b); NCDC (1972a) indicates positions 90-93.]
    X = NCDC (1976a)
    Y = NCDC (1976b)
    Z = NCDC (1983)

  A lower-case letter in the second sub-column indicates whether "ship number" (SN; TD-1100 positions 90-93) was documented to be blank or non-blank:
    - = not applicable (undocumented TD-1100 deck)
    t = SN documented as full configuration of TD-1100 codes (non-blank)
    x = SN documented as non-blank

  A lower-case letter or digit in the third sub-column indicates whether "Ocean Station Vessel or ship indicator" (OSVSI; TD-1100 position 81) was documented to be blank or non-blank:
    - = not applicable (undocumented TD-1100 deck)
    b = OSVSI documented as blank
    t = OSVSI documented as full configuration of TD-1100 codes (blank or non-blank)
    5 = OSVSI documented as 5
    6 = OSVSI documented as 6

  [NOTE: Deck 128 was the deck around which the TD-1100 format was designed, and the only deck that incorporated the full configuration of OSVSI codes. NCDC (1968) defines blank in OSVSI as "Navy and Deck Log Observations," whereas for deck 116 it indicates "all other ships." For all other decks, blank in OSVSI probably represented "unknown" (i.e., missing OSVSI information).
  OSVSI values for ocean station vessel (OSV) "off station" (2), or "on station" (2 with "-" overpunch) signified that TD-1100 positions 78-79 were to be interpreted as "Ocean Weather Station Number" (OWSN); otherwise the OWSN field was expected to contain the country code for ship data from deck 128.]

** Individual fields documented in NCDC (1968) to be constrained to certain characters, as indicated by the following codes:
    N = numeric
    O = sign ("+" or "-") overpunch of a numeric***
    S = sign only; "-" is the only documented possibility
    U = undocumented in TD-1100

*** Early ICOADS decks were stored at NCDC on punched cards 80 columns wide by 12 punch positions vertically (12, 11, 0, 1, ..., 9). Documentation may alternatively refer to the uppermost (12 and 11) positions as the "+" and "-" (NCDC, 1968, p. iv) or "Y" and "X" positions. Typically a 12 or 11 overpunch of one of the numeric values (0, ..., 9) corresponds to an ASCII character as follows (see also: Release 1, supp. I, Table I2-1; and , Table 14):

              Sign overpunch of numeric
    Punch   12 or "+" or "Y"   11 or "-" or "X"   Zero overpunch
    -----   ----------------   ----------------   --------------
      0            {                  }
      1            A                  J                 /
      2            B                  K                 S
      3            C                  L                 T
      4            D                  M                 U
      5            E                  N                 V
      6            F                  O                 W
      7            G                  P                 X
      8            H                  Q                 Y
      9            I                  R                 Z
----------

d) Recover country code (C1) characters 00-40 from Attm5 by accepting an 11 overpunch over a 0-9 in either or both character positions (i.e., "}" and J-R), plus "{" (the 12 overpunch over zero).

e) Compute a missing dew point temperature if WBT and AT are extant; if SLP is missing, 1015.0 is used as SLP. This rule is not applied if any of the data used for computation of DPT (i.e., SLP, AT, WBT, or T2) are in the error attachment, or if WBT is greater than AT. Constants ACON and BCON are set for computation of DPT relative to water: ACON=7.5 and BCON=237.3.
The following Fortran code is then used to attempt computation of DPT [NOTE: this computation actually occurs as a final step in field modifications, since AT and WBT values may be modified by rules given below that are applied to individual decks]:

      ESW = 6.1078*10.**(WBT*ACON/(WBT+BCON))
      E = ESW-(.00066*SLP)*(((.00115*WBT)+1)*(AT-WBT))
      IF(E.LT.0.) RETURN
      CCON = ALOG10(E/6.1078)
      DPT = BCON*CCON/(ACON-CCON)

where the RETURN if vapor pressure (E) is less than zero leads to an error diagnostic, and otherwise the resulting DPT is rounded to the nearest 0.1C. To indicate that this calculation has taken place during preconditioning, T2 is set to 3, 4, 5, or 6, depending respectively on whether the previous value of T2 was missing, 0, 1, or 2.

f) Set the lat/lon indicator (LI) according to Release 1, Table K5-1, if the deck of the report is listed in that table. If, however, the report contains an extant LI setting that differs from the Table K5-1 setting, issue an error diagnostic but leave LI unchanged.

g) Delete an extant temperature indicator (T1) or second temperature indicator (T2) from reports missing all (air, wet bulb, dew point, and sea surface) temperature data. This rule is not applied if any of the temperature data are in the error attachment, but otherwise it is applied even if T1 and T2 are in the error attachment.

Background:
a) The Release 1 data converted from LMR5 to LMR6 (as documented in ) are a mixture of data input to and output from Release 1 dupelim, due to the loss of some of the input data (whenever possible for Release 1b, we use data input to rather than output from Release 1 dupelim, in order to be able to apply improved dupelim methods to the Release 1 data). However, only reports output from Release 1 dupelim incorporate the results of subsequent "lmrfix" processing to implement modifications and correct errors.
Therefore, the equivalent processing must be applied to reports that represent input to Release 1 dupelim, so as to implement the lmrfix changes (because of the nature of these changes, it is harmless to reapply them to dupelim output data) [NOTE: for completeness, some fixes are also described for pre-1970s data]: i) Negative dew point temperatures (DPT) calculated in deck 900 were subject to a round-off error (Release 1, p. K3 documents that DPT calculation occurred during conversion to TD-1129); decks 155-156 were subject to a similar error in conversion to LMR5 (see Release 1, p. J1). Deleting any extant DPT values under this rule, followed by recalculation of DPT under rule e), fixes these problems. [NOTE: During original lmrfix processing, side-effects on QC flags were minimized by not completing re-computation unless the new DPT was exactly 0.1C colder than the old one (Release 1, p. J1). Based on test results it appears that the original deck 900 DPT recomputation never successfully completed, due to formula differences (in that used to calculate DPT at NCDC, versus that used for Release 1) that resulted in DPT differences greater than 0.1C. Since side-effects on QC flags are no longer an issue, this rule differs from lmrfix in not testing for DPT 0.1C colder.] ii) Negative wet bulb temperatures (WBT) calculated in deck 899 were subject to a round-off error (Release 1, p. K7 documents that WBT calculation occurred during conversion to TD-1129M). This was not part of the original lmrfix processing, but is described here because of its similarity to item i). iii) Some changes in the setting of platform type (PT; formerly ship type, ST) were originally made as part of lmrfix to fully implement the setting of existing (Release 1, p. I13) and derived (p. I14) ship type. In two cases for Release 1 data the processing described here differs from original lmrfix processing: --"Eltanin" data (SID=13): ST=6 (research ship) was set during original lmrfix processing. 
However, due to changes in the meaning of code values for PT versus ST, ST=6 for Eltanin data was changed into PT=missing during conversion from LMR5 to LMR6 (see ). --Release 1 buoy data: For SID=18 or SID=23, and deck 143 or 876-882; and for SID=24: ST=5 (buoy) was set during original lmrfix processing. Instead, PT=6 (moored buoy) is set for these decks (regardless of SID) during later PT/II processing (see sec. 3). [NOTE: Release 1, p. I14 contains an incorrect range "876-886" of buoy decks used for COADS Release 1, which should read "876-882".] One additional feature of lmrfix was implemented instead during conversion from LMR5 to LMR6 (see ): iv) Extraction of the sea surface temperature method indicator (SI; formerly bucket indicator, BI) from deck 128 starting 1 January 1968. [NOTE: Another set of Release 1 fixes ("qcfix") involved corrections for problems in the QC logic (see Release 1, p. J1). Corrections for these problems, which are documented as follows, have now been implemented in a revised QC subroutine that is applied to all data (including both data input to and output from Release 1 dupelim): i) pressure tendency flag (M): HSST Exchange data (decks 155-156) were mistakenly processed as if the supplemental attachment was a TD-11 format, resulting in a spurious pressure tendency flag. The correction set the pressure tendency flag to missing. This applies only to pre-1970s data. ii) dry bulb flag (J) and present weather flag (B): A test for "dryblb < -2.2C" is given by Release 1, p. J15. Originally QC had the minus sign missing. iii) dry bulb flag, wet bulb flag, dew point flag (N): A 0.5C tolerance on initial dryblb < wetblb < dewpt tests is given by Release 1, p. J13. Originally QC had zero tolerance on these checks (see Release 1, p. E6 for a discussion of the problem). iv) seawvf (M): A temporary substitution of a wave direction of 36 or 38 (inferred from the wind direction) into a post-1968 missing wave direction is given by Release 1, p. 
J16 (see footnote on that page). Originally QC omitted the bottom two rows of flowchart actions on p. J16 and jumped directly to flow connection 12. It should be noted that the flowchart omits a test for whether wind direction is less than 1 or greater than 362 (this test, which is executed as part of the QC code, should appear in the center of the second row from the bottom on the flowchart).]

b) Although some input formats such as TD-1129 are documented to have ID left-justified, errors may have occurred (note that blanks were translated into missing during conversion into TD-1129). After this step, leading positions set to missing in the ID array should correspond to erroneous (or out-of-range) characters appearing in the error attachment. [NOTE: Rule c) may introduce leading blanks into the ID array. Rules b) and c) are ordered (and executed) as given, so as to preserve any meaningful blanks from source TD-1100 data.]

c) Background is given in Table 1.

d) (See .)

e) This step prepares for statistics by calculating DPT where it would otherwise be unnecessarily missing. The second temperature indicator (T2) shows which of the psychrometric elements, wet bulb temperature (WBT), dew point temperature (DPT), or ice bulb temperature (IBT) (value stored in WBT field), were reported or computed (or were claimed to be reported or computed; see following text):

    0 = WBT reported, DPT computed (either may be missing)
    1 = DPT reported, WBT computed (either may be missing)
    2 = IBT reported, DPT computed (DPT may be missing)
    3 = DPT computed during preconditioning (T2 was missing)
    4 = DPT computed during preconditioning (T2 was 0)
    5 = DPT computed during preconditioning (T2 was 1)
    6 = DPT computed during preconditioning (T2 was 2)

Only reports input to preconditioning that were converted directly from the IMMT format into TD-1129 (SID=46-47) have T2 already set, limited to a value in the range 0-2 if DPT and/or WBT is extant; for all other reports T2 is missing.
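
Rule e) and the T2 codes listed above can be sketched in Python as follows. This is an illustrative port of the Fortran fragment given under rule e), not the preconditioning code itself. The function and variable names are hypothetical, a missing value is represented as None, and the error-attachment test is left out. As a further assumption, a vapor pressure of exactly zero is treated as a failure (the Fortran tests only for E less than zero) so that the logarithm is always defined.

```python
import math
from typing import Optional, Tuple

ACON = 7.5           # constants for DPT relative to water
BCON = 237.3
DEFAULT_SLP = 1015.0  # used when SLP is missing

# New T2 value, keyed by the previous T2 (None = missing).
T2_AFTER_COMPUTATION = {None: 3, 0: 4, 1: 5, 2: 6}


def compute_missing_dpt(wbt: Optional[float], at: Optional[float],
                        slp: Optional[float],
                        t2: Optional[int]) -> Optional[Tuple[float, int]]:
    """Return (DPT, new T2), or None if DPT cannot be computed."""
    # WBT and AT must be extant, and WBT must not exceed AT.
    if wbt is None or at is None or wbt > at:
        return None
    if slp is None:
        slp = DEFAULT_SLP
    # Saturation vapor pressure at the wet bulb temperature.
    esw = 6.1078 * 10.0 ** (wbt * ACON / (wbt + BCON))
    # Vapor pressure from the psychrometric relation.
    e = esw - (0.00066 * slp) * ((0.00115 * wbt + 1.0) * (at - wbt))
    if e <= 0.0:
        return None  # the original issues an error diagnostic here
    ccon = math.log10(e / 6.1078)
    dpt = BCON * ccon / (ACON - ccon)
    # Round to the nearest 0.1 (Python's round; ties may differ from Fortran).
    return round(dpt, 1), T2_AFTER_COMPUTATION[t2]
```

When WBT equals AT the psychrometric correction vanishes, so the computed DPT reduces algebraically to WBT, which gives a simple sanity check on the formula.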
Under the interpretation of the IMMT format used for this conversion, there exist possibilities for a WBT or DPT whose numeric value is missing, but with the indication that it was "reported," hence the need for T2=5. [NOTE: The method for calculation of missing DPT is believed to be the same as that used in Release 1 processing. Release 1a processing (see ) previously differed in that ACON and BCON were set for computation of DPT relative to ice in some circumstances, and in that it failed to check for WBT greater than AT (it is unknown if Release 1 processing included such a check). Differences in Release 1a processing were fully resolved as part of the 1980-97 update and extension.] f) Release 1, Table K5-1 lists the LI settings that were made in reports during Release 1 processing (the lat/lon indicator was then abbreviated as XYI). As discussed under a), the Release 1 data converted from LMR5 to LMR6 are a mixture of data input to and output from Release 1 dupelim, such that LI should be missing in the former and extant in the latter. More recently processed decks, however, which may or may not appear in Table K5-1, should have LI set during conversion to LMR. g) Some of the following deck-specific field modifications may result in deletion of temperatures (e.g., wet bulb or sea surface temperature). This rule, which is executed after all deck-specific modifications, is intended to eliminate any temperature indicators (T1 and T2) that may end up referring to missing data after deletion of a given temperature field. All decks except 876-882, 883 Rules: Delete wave direction (WD) from all reports from all decks except 876-882, 883 (NDBC). This rule is not applied to data in the error attachment, and the following comparison is made with wind direction (D) as a part of this check: WD values 1-36 must equal D/10, or WD=0 must correspond to D=361, or WD=38 must correspond to D=362. 
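
The comparison between wave direction and wind direction described in this rule can be sketched in Python as below. The function name is hypothetical, and "D/10" is read here as exact division, i.e., D must be an exact multiple of 10 equal to ten times WD; that reading is an assumption.

```python
from typing import Optional


def wd_matches_wind(wd: int, d: Optional[int]) -> bool:
    """Return True if wave direction WD is consistent with wind direction D."""
    if d is None:
        return False
    if 1 <= wd <= 36:
        # WD in tens of degrees must equal D/10 exactly.
        return d == wd * 10
    if wd == 0:
        return d == 361
    if wd == 38:
        return d == 362
    # Any other WD value cannot match.
    return False
```

For example, WD=9 matches D=90 but not D=95.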
If WD does not match D exactly as stated, print a diagnostic message followed by a listing of the report, but still delete wave direction. Background: Effective 1 January 1968 wave direction was no longer ordinarily reported by ships. However, NCDC substitutes wind direction into missing wave direction. [NOTE: This rule should affect only data originally converted from TD-11 formats into LMR5 or LMR6.] Deck 555: US Navy Fleet Num. Met. and Oceano. Center (FNMOC; Monterey) Telecom. Rules: Delete wet bulb temperature from any deck 555 report, applicable to data in the error attachment. Background: We believe that wet bulb temperature was computed in deck 555 (the "Surface Ship" format that appears to be the source of the Monterey data contains only dew point temperature, and only DPT could be reported in early telecommunicated data). [NOTE: In addition to the goal of restoring these data to a more pure form, this rule is important for increasing the dupelim selection of logbook (e.g., decks 128 and 926-928) reports over matching GTS reports. Otherwise, GTS data will frequently be selected because of the presence, usually only in GTS data during this period, of barometric tendency (A) and amount of SLP change (PPP). Deletion of WBT should typically compensate in the quality code for the presence of these additional elements, thus leading to selection of logbook data according to priority code given equal quality codes.] Deck 732: Russian Marine Meteorological Data Set (MORMET) (received at NCAR) Rules: a) Delete an extant station/weather indicator (IX) from any deck 732 report. This rule is applied to data in the error attachment. b) Delete an ID consisting of "000000." c) Starting in 1975, delete extant country code (C1), applicable to data in the error attachment. d) Starting in 1975, delete extant observation source (OS) and observation platform (OP), applicable to data in the error attachment. 
e) Delete extant sea surface temperature method indicator (SI), applicable to data in the error attachment. Background: a) (See background under deck 926.) b) Apparently meaningless IDs of "000000" are frequently present (mainly before 1982). [NOTE: This field modification was not applied to Release 1a data (see ).] c)-d) Based on abrupt shifts in behavior through time (after 1975) and comparison with duplicates from other sources, these fields do not appear to provide reliable information (OS=1 and OP=1 frequently, indicating "national logbook" and "selected ship," respectively). [NOTE: The GTS SHIP code does not provide country code information (ship flag nationality, as opposed to recruiting country, may be derivable from ship call signs). Since deck 732 appears to consist of a mixture of GTS and logbook data in recent years, with unreliable OS information, it appears impossible to reliably separate GTS from logbook data and thus to identify logbook reports in which country code is more likely legitimate.] e) Similarly, this field does not appear to provide reliable information (SI=3 frequently, indicating "hull contact sensor"). Deck 749: First GARP Global Experiment (FGGE) Level IIb Rules: For the "surface marine data" (SID=53) only: a) Reset time indicator (TI) to 0 (nearest whole hour) from 2 (hour plus minutes). b) Reset lat/lon indicator (LI) to 0 (degrees and tenths) from 5 (high resolution data). c) Reset wind direction indicator (DI) to 0 (36-point compass) from 5 (360-point compass). Background: a)-c) The original settings of these indicators accurately reflect the precisions used to store data in the original FGGE IIb format, but not the way ship (surface marine) data were originally reported. Similar modifications might be warranted for other portions of the FGGE data. [NOTE: Originally, the surface marine portion of deck 749 contained moored buoy data, for which these resettings may be less appropriate. 
However, the moored buoy data (preexisting PT=6) are subject to report deletions (sec. 2.1).] Deck 882: US National Data Buoy Center (NDBC) Data Rules: Delete any sixth position of ID set to "0" in deck 882. Background: A trailing zero sometimes appears after the ordinary 5-digit WMO buoy number in deck 882, which interferes with proper setting of ID indicator. The source of these trailing zeros is unknown. Deck 888: US Air Force Global Weather Central (GWC) Rules: a) Delete wet bulb temperature from any deck 888 report, applicable to data in the error attachment. b) During 1973-April 1977, add 0.2C to extant air temperature, sea surface temperature, and dew point temperature. If, however, the tenths position of negative DPT is not ".2", or of positive DPT is not ".8", do not implement corrections and issue an error diagnostic. Alternatively, if DPT is missing but either AT or SST is extant, implement corrections but issue a different error diagnostic. c) During May 1977-1979, check that the tenths position of dew point temperature is ".0". If not, issue an error diagnostic. Background: Rules a)-c) all concern correction of a GWC temperature bias during 1973-1200 UTC 5 April 1977 (approximate end date). The problem directly affects sea surface temperature, air temperature, and dew point temperature, and indirectly affects computed wet bulb temperature [NOTE: background under deck 555 about the importance of deleting WBT in order to increase the selection during dupelim of logbook over GTS data, also is applicable to rule a) here]. A 12 December 1985 memo from Joe Elms to Dick Cram gives general background on the discovery in 1985 of a negative 0.2C temperature bias in deck 888; a 23 August 1988 memo from Scott Woodruff to Dick Cram showed evidence of continuing problems. Briefly, GWC changed the Kelvin base, at 1200 UTC on 5 April 1977, to 273.2 from 273, without notification to NCDC. 
When this was later discovered, NCDC changed the conversion program to incorporate the new base value and to correct data after April 1977. Unfortunately, the conversion program was later run on "some pre-April 1977 data," resulting in temperatures too low by 0.2C. According to the memo of 12 December 1985, NCDC made a partial correction for such temperatures prior to April 1977 by checking for DPT ending in ".2" or ".8," since dew point was only available in whole degrees throughout the 1970s, and correcting DPT, AT, and SST, and recomputing WBT. This correction would have left reports with missing DPT unchanged. We have now performed annual spot checks on a very limited amount of Release 1 data for 1973-79, and have found by examination of duplicates between deck 888 and other decks (e.g., logbook data) confirmation of continuing problems as described in the 23 August 1988 memo. Basically it appears that the aforementioned correction was never applied to the GWC data used for ICOADS, such that all GWC temperatures for 1973 through April or May 1977 that were examined were 0.2C lower than they should be. Furthermore, the exact cut-off for the problem does not always occur effective at 1200 UTC on 5 April 1977, since some reports after that time were found with the 0.2C bias (e.g., 23 May). Issues such as delayed receipts at GWC or errors in date could complicate the cut-off issue. It should be emphasized that these spot checks were confined to very limited amounts of data in a few 10-deg boxes near Alaska (122-124 for 1973-77, with only one box examined for each year) and one 10-deg box near the E. Coast of the U.S. (206 for 1978-79). Deck 891: US National Oceanographic Data Center (NODC) Surface Data Rules: Delete sea surface temperature from all deck 891 station data (SD) reports, applicable to data in the error attachment. Background: As discussed in sec. 
2, the Levitus World Ocean Atlas (WOA) provides more complete oceanographic data, including sea surface temperature, than are provided by deck 891. However, WOA does not at this time provide meteorological data that may accompany station data reports from deck 891. Deck 926: International Maritime Meteorological (IMM) Data Rules: Delete an extant station/weather indicator (IX) from any deck 926 report with country code indicating that the country that recruited the ship was France (C1=4). For deck 926 data from other countries, delete IX if IRD is missing, or if IRD indicates that the data were received prior to March 1985 (i.e., tape I3ZG62, which was received in December 1984). This rule is applied to data in the error attachment. Background: France stated that they did not consider a position in the IMMT format to be defined for IX, and they filled it with something else that sometimes resembles IX in form. IX only became part of the IMM format effective March 1985. {3. Rules for assignment of platform type and ID indicator} This section describes the rules used to set LMR fields platform type (PT) and ID indicator (II). PT indicates whether the reporting platform is a ship or some other type of platform; II indicates whether a call sign or some other sort of ID is contained in the ID array. 
As background, PT and II have the following defined values:

Platform type (PT)
   0 = US Navy or "deck" log, or unknown
   1 = merchant ship or foreign military
   2 = ocean station vessel--off station or station proximity unknown
   3 = ocean station vessel--on station
   4 = lightship
   5 = ship
   6 = moored buoy
   7 = drifting buoy
   8 = ice buoy
   9 = ice station (manned, including ships overwintering in ice)
  10 = oceanographic station data (bottle and STD/CTD data)
  11 = mechanical bathythermograph (MBT)
  12 = expendable bathythermograph (XBT)
  13 = Coastal-Marine Automated Network (C-MAN) (NDBC operated)
  14 = other coastal/island station
  15 = fixed ocean platform (plat, rig)

ID indicator (II)
   0 = ID present, but unknown type
   1 = ship, OSV, or ice station call sign
   2 = generic ID (e.g., SHIP, BUOY, RIGG, PLAT)
   3 = WMO 5-digit buoy number (possibly followed by A-L for NCEP data)
   4 = other buoy number (e.g., Argos or national buoy number)
   5 = C-MAN ID
   6 = station name or number
   7 = NODC platform/cruise
   8 = IATTC pseudo ID
   9 = national ship number
  10 = ship name or composite information from early ship data

Part a) gives the rules for setting PT and II. Part b) describes a set of general checks on the form and contents of ID, referenced in part a). In all cases deck is the field that initially determines the group of rules to be used; for a given deck, specific rules may depend further on the contents of the ID/call sign array (ID; fields 57-64, i.e., 8 characters). At this time, neither source ID (SID) nor the IMMT flag for observation platform (OP) is utilized in setting PT and II.

a) Rules for setting PT and II: by deck

For brevity, the following rules are expressed using pseudo-code similar to Fortran. Generally each line of pseudo-code assigns PT and then II; when two statements appear on a single line separated by a semicolon (;) and the first such statement is a conditional statement that is not true, execution proceeds on the next line.
Checks are made in some cases for "generic" IDs; and in other cases for a specific form of ID, i.e., when a construct appears that contains the underline character (e.g., ship_ID). See part b) for details of the generic and general ID checks. Preexisting PT values were either retained or deleted as part of "lmrfix" processing (sec. 2.2). If preexisting PT was retained, PT was generally not subject to change at this stage. Specifically, for decks indicated by symbols: * = Final assignment of PT and II was made in conversion from the original input format into LMR6 (deck 749 is an exception, in that PT was assigned but II left missing, during conversion; and that originally assigned PT may be reset as discussed below). # = Derived "ship type" (Release 1, supp. I, sec. 2.11) was transformed into PT during conversion from LMR5 to LMR6 (see ), possibly subject to "lmrfix" modifications described in sec. 2.2 of this document. & = Existing "ship type" (Release 1, supp. I, sec. 2.10) was transformed into PT during conversion from LMR5 to LMR6 (see ), possibly subject to "lmrfix" modifications described in sec. 2.2 of this document. Assignment of II according to Table 1 (sec. 2.2). (We assume all these data were converted from TD-1100 into LMR5 during Release 1 processing; if not, alternative rules for assignment of PT and II are given below.) Otherwise, assignment of PT and II is determined at this stage of QC/dupelim. Where assignment is determined here, it is important to note that initially PT is set to missing and if ID is missing, II is set to missing; otherwise II is set to 0. Also a final check for generic ID (II = 2) overrides a previously set II, including the initial setting of 0. Deck 128: International Marine (US- or foreign-keyed ship data)& PT = 0, 1, 2, 3, or 4; if ID is not missing II = 9 Background: This rule assumes all data were originally source TD-1100. 
If this is not the case and an ID is present, II is set according to the standard initialization rules (above) and PT is set to 5. Deck 143: Pacific Marine Environmental Laboratory (PMEL) Buoys# PT = 6 Background: The availability and form of any ID information is unknown. If an ID is present, II is set according to the standard initialization rules (above). Deck 145: PMEL (Daily) Equatorial Mooring* PT = 6 or 14, and II = missing Background: During conversion, the PMEL moored buoy data were obtained from separate files; this information was used to set PT (ID information should always be missing). Deck 186: USSR Ice Stations PT = 9 Background: The availability and form of any ID information is unknown. If an ID is present, II is set according to the standard initialization rules (above). Deck 555: US Navy Fleet Num. Met. and Oceano. Center (FNMOC; Monterey) Telecom. if EB_number or "BUOY", PT = 6; if EB_number, II = 4 else if "RIGG" or "PLAT", PT = 15 else if OSV_ID, PT = 2 and II = 1 else PT = 5; if ship_ID, II = 1 Deck 666: Tuna Boats PT = 5 Background: The availability and form of any ID information is unknown. If an ID is present, II is set according to the standard initialization rules (above). Deck 667: Inter-American Tropical Tuna Commission (IATTC) PT = 5; if ID is not missing II = 8 Deck 714: Canadian Marine Environmental Data Service (MEDS) Buoys if moored_buoy_ID, PT = 6 and II = 3 else if drifting_buoy_ID, PT = 7 and II = 3 Deck 732: Russian Marine Meteorological Data Set (MORMET) (received at NCAR) if not OSV_ID, PT = 5; if ship_ID, II = 1 else PT = 2 and II = 1 Background: The default setting of PT = 5 is overridden only if an OSV call sign is found. All OSV data are set to PT = 2 indicating "off station or station proximity unknown" since tests for location relative to the assigned OSV location at any given time were not performed. [NOTE: Few ID fields other than "000000," deleted as part of field modifications (sec. 
2.2), exist in deck 732 prior to 1982.] Deck 733: Russian AARI North Pole Stations (from Polar Science Center) PT = 9; if ID is not missing II = 6 Deck 749: First GARP Global Experiment (FGGE) Level IIb* PT = 2 (set here), 5, 6, 7, 10, or 12 (previously set); II setting depends on the preexisting setting of PT: for preexisting PT = 5 (from SID=53, Surface Marine Data): if not OSV_ID, PT = 5; if ship_ID, II = 1 else PT = 2 and II = 1 for preexisting PT = 6 (from SID=53, Surface Marine Data): PT = 6; if moored_buoy_ID, II = 3 for preexisting PT = 10 or 12 (from SID=54, Oceanographic Data): PT = 10 or 12; if ship_ID, II = 1 for preexisting PT = 7 (from SID=55, Drifting Buoy Data): PT = 7; if drifting_buoy_ID, II = 3 Background: For the PT=5 (ship) portion of the surface marine data (SID=53), it should be noted that the original FGGE IIb format contained a "data source index" (retained in the supplemental attachment) of 33 or 34 to indicate "fixed" or "mobile" ship data, respectively. However, this information was not used to set PT or II. For the oceanographic data (SID=54), the FGGE IIb documentation indicates that a call sign should be present in the ID field, unless "DECK=0004," a condition which we believe never occurred. [NOTE: The rule given here for PT=6 should never be invoked because these data are subject to report deletions (sec. 2.1).] Deck 780: Levitus World Ocean Atlas (WOA)* PT = 10, 11, or 12; if ID is not missing II = 7 Deck 849: First GARP Global Experiment (FGGE) if EB_number or "BUOY", PT = 6; if EB_number, II = 4 else if "RIGG" or "PLAT", PT = 15 else if OSV_ID, PT = 2 and II = 1 else PT = 5; if ship_ID, II = 1 Background: It is believed that some of the FGGE deck 849 data were derived from GWC or other GTS sources (see background under deck 888). 
[NOTE: During a conversion in Boulder of the '70s Decade (SID=18) data from TD-1127 to TD-1129 format (prior to COADS Release 1 processing) buoy data were removed from deck 849 as defined by any report with the first five positions of ID all numeric. This was per instructions from NCDC, due to errors in deck 849 (ref. 31 March 1982 memo from Dick Cram).] Deck 850: German FGGE if OSV_ID, PT = 2 and II = 1 else PT = 5; if ship_ID, II = 1 (see background under deck 732) Background: This deck is believed to be limited to mobile ship reports (delayed mode data). Decks 876-882: US National Data Buoy Center (NDBC) Data# PT = 6; if moored_buoy_ID, II = 3 else if EB_number or early_moored_buoy_ID, II = 4 Background: All NDBC data during the 1970s period are believed to be moored buoy data. Deck 883: US National Data Buoy Center (NDBC) Data PT = 6; if moored_buoy_ID, II = 3 Background: All NDBC data during the 1970s period are believed to be moored buoy data. [NOTE: This rule is modified from Release 1a processing in omitting tests for C-MAN and drifting buoy IDs, and in setting all reports to PT = 6.] Deck 888: US Air Force Global Weather Central (GWC) if EB_number or "BUOY", PT = 6; if EB_number, II = 4 else if "RIGG" or "PLAT", PT = 15 else if OSV_ID, PT = 2 and II = 1 else PT = 5; if ship_ID, II = 1 Background: NDBC buoys were identified in GWC's DATSAV format by an EB number rather than a 5-digit moored buoy number. Due to processing problems, drifting buoy data apparently are not available in the deck 888 data used for ICOADS. [NOTE: Drifting buoy data probably would have been identified by "DRIB" (also a valid ship call sign), since any 5-digit drifting buoy numbers were not extracted out of the DATSAV format for inclusion in TD-1129. EB_number should never be encountered here as these data are subject to report deletions (sec. 2.1).] Deck 889: Autodin (US Dept. 
of Defense Automated Digital Network) PT = 5; if USN_number, II = 9 else if ship_ID, II = 1 Background: Early Autodin IDs are a mixture of ordinary ship call signs (typically starting with "N" for US Navy ships) and special ship numbers. Deck 891: US National Oceanographic Data Center (NODC) Surface Data& PT = 10; if ID is not missing II = 7 Background: This rule assumes all data were originally source TD-1100. If this is not the case and an ID is present, II is set according to the standard initialization rules (above). [NOTE: Report deletions (sec. 2.1) should eliminate all reports with PT=11-12.] Deck 898: Japanese if not OSV_ID, PT = 5; if ship_ID, II = 1 else PT = 2 and II = 1 (see background under deck 732) Deck 900: Australian PT = 5; if AUS_number, II = 9 Background: The Australians provided a 3-digit ship number, which is related to ship names in the original Australian documentation. Deck 926: International Maritime Meteorological (IMM) Data if not OSV_ID, PT = 5; if ship_ID, II = 1 else PT = 2 and II = 1 (see background under deck 732) Deck 927: International Marine (US- or foreign-keyed ship data) if not OSV_ID, PT = 5; if ship_ID, II = 1 else PT = 2 and II = 1 (see background under deck 732) Deck 928: Same as 927 including Ocean Station Vessels (OSV) if not OSV_ID, PT = 5; if ship_ID, II = 1 else PT = 2 and II = 1 (see background under deck 732) b) Checks on ID During earlier field preconditioning (sec. 2.2), the contents of the ID field were left justified with missing fill (note that the blank character was translated into missing during conversion into LMR). Various "generic" IDs may have been introduced by operational (GTS) or archival centers when the actual ID was unavailable. 
For decks not marked by "*" above, a check was made for the following generic IDs (in addition to other checks for these generic IDs specifically referenced above), such that ID was required to consist of precisely the given characters followed only by missing characters (note that ID characters are not considered to be missing if associated with data in the error attachment):

   BUOY
   SHIP
   RIGG
   PLAT
   NNXX

In addition, the following general tests on ID are referred to in part a). For easy recognition, the names of these checks utilize one or more underline characters to separate the parts of the name, e.g., ship_ID. The result of each test was a "true" or "false" value depending on whether the contents of the ID field adhered to the stated rules. Similar to the checks for generic ID, each of these general ID forms was required to consist of the indicated combinations or ranges of characters, followed only by missing characters:

ship_ID
   4-7 alphanumeric characters, not all numeric.

buoy_ID
   "1" followed by a numeric character in the range "1" through "7" or
   "2" followed by a numeric character in the range "1" through "6" or
   "3" followed by a numeric character in the range "1" through "4" or
   "4" followed by a numeric character in the range "1" through "8" or
   "5" followed by a numeric character in the range "1" through "6" or
   "6" followed by a numeric character in the range "1" through "6" or
   "7" followed by a numeric character in the range "1" through "4"
   and 3 numeric characters.
   A trailing alphabetic character in the range "A" through "I" is permissible if the deck is 893 or 894. Characters 3 through 5 of "000" or "500" are disallowed.
   Background: WMO rules apparently disallow 000 or 500 in the rightmost three digits. [NOTE: NMC data are not currently used prior to 1980, but the rule was left unchanged from Release 1a processing.]
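The buoy_ID test just stated can be sketched in Python as follows. This is an illustration only, not the dupelim code: the function name, the lookup table name, and the use of a blank as the "missing" character are assumptions made for the example.

```python
import string

# Largest permitted second digit for each permitted first digit (buoy_ID rule).
MAX_SECOND_DIGIT = {"1": 7, "2": 6, "3": 4, "4": 8, "5": 6, "6": 6, "7": 4}

# Decks for which a trailing letter "A" through "I" is permissible.
TRAILING_LETTER_DECKS = {893, 894}


def is_buoy_id(id_field, deck, missing=" "):
    """Return True if id_field has the buoy_ID form.

    id_field is the left-justified ID array; `missing` is a hypothetical
    stand-in for the LMR missing character.
    """
    core = id_field.rstrip(missing)  # only missing characters may follow
    if len(core) == 6:
        # Optional trailing letter, allowed only for decks 893 and 894.
        if deck not in TRAILING_LETTER_DECKS or core[5] not in "ABCDEFGHI":
            return False
        core = core[:5]
    if len(core) != 5 or not all(c in string.digits for c in core):
        return False
    first, second = core[0], int(core[1])
    if first not in MAX_SECOND_DIGIT:
        return False
    if not 1 <= second <= MAX_SECOND_DIGIT[first]:
        return False
    # Characters 3 through 5 of "000" or "500" are disallowed.
    return core[2:] not in ("000", "500")
```

For example, under these assumptions is_buoy_id("46001   ", 882) is true, whereas "46000" and "49001" are rejected, and "46001A" is accepted only for decks 893 and 894.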
moored_buoy_ID A buoy_ID such that the middle digit is strictly less than 5 (thus the right-hand three digits are permitted to fall in the range 001-499). early_moored_buoy_ID Zero followed by four digits. Background: Early buoy numbering system apparently devised by NCDC. May have been used as a 5-digit replacement for EB numbers in some data so as to distinguish between buoy reports obtained before the introduction of the WMO 5-digit buoy numbering system. drifting_buoy_ID A buoy_ID such that the middle digit is greater than or equal to 5 (thus the right-hand three digits are permitted to fall in the range 501-999). OSV_ID Beginning in July 1975: "C7" followed by an alphabetic character in the range "A" through "Z." Prior to July 1975: "4Y" followed by an alphabetic character in the range "A" through "Z." Background: [NOTE: This rule is modified from that used for Release 1a (see ) in incorporating the alternative rule for data prior to July 1975.] EB_number "EB" followed by a 2-digit number. Background: Early numbering for NDBC buoys. AUS_number A three digit number. USN_number An initial letter, followed by three digits. {4. Duplicate elimination} The 1970-79 version of the COADS Release 1 duplicate elimination (dupelim) program considered reports within the same 1-degree box and within plus or minus one hour ("hour cross") as possible dups, and performed a check for seven weather elements (wind speed, visibility, present weather, past weather, sea level pressure, air temperature, and sea surface temperature) to determine whether reports were actually dups. These checks for weather elements included "allowances" on air and sea surface temperatures and on wind speed, which considered values to match under some circumstances even though they were not equal. 
Furthermore, dup status (DS) was set to indicate dup certainty depending on how many elements matched, whether at least one report was from the Global Telecommunication System (GTS), and whether there was an hour cross (see Release 1, Table K5-3). Similarly, dup check (DC) was set to indicate the presence of matches between GTS and logbook reports. Quality code, as computed by the NCDC-defined quality control (QC) procedure (Release 1, supp. J), was the basis for selecting one duplicate report over another; if quality codes were identical, a priority list by deck was used; and if priorities were also identical, the second report in sort order was selected. See for background on why certain changes were made for Release 1a. Changes in dupelim similar to those used for Release 1a are used for Release 1b, as follows: a) QC processing In addition to some corrections made to the NCDC-defined QC procedure (see discussion of "qcfix" in sec. 2.2), a minor change was made in the way the procedure was applied: Some fields had a narrower range defined in LMR than was supposed to be checked according to the QC flowchart (e.g., sea level pressure has an LMR range of 870:1074.6 hPa, but Release 1, p. J6 stipulates a check for SLP in the range 0:9999.9 hPa). During Release 1 QC processing of such fields, any data from the error attachment (Attm5) were recovered and QC'd. However, this practice led to confusing QC flags since the flagged data were available only in Attm5, and thus has been discontinued. b) Bathythermographs During Release 1 processing, bathythermographs were tested for duplicates only among themselves, but without distinction between XBT or MBT (ref. Release 1, p. K25). Instead, we now rely on the special deck rules and revised priority structure (discussed below) to ensure that the oceanographic data are "passed through" (the WOA data are believed not to contain internal duplicates).
c) Hour cross Retain the check for duplicates across hours, but ensure that hourly data subject to "chain reaction" problems are not lost in the main LMR file by retention of all hour-crosses as uncertain duplicates (discussed below). For automated platform or other (e.g., OSV) data the hour cross has the potential for a "chain reaction" such that a long string of reports each separated from the next by one hour would be eliminated. Hourly data were lost from Release 1 due to this problem (memo from Woodruff to Cram, 23 August 1988). d) Allowances Only allowances #1 and #3 were applicable in the Release 1 1970-79 version of dupelim. These allowances continue to apply in the revised dupelim program, as follows: #1 Temperatures off by less than 1 degree (any match with decks 116, 119, 555, 888, 899; ref. Release 1, p. K30). [NOTE: Deck 116 is pre-1970s data.] #3 Wind ranges used to test for equality (all decks; listed in Release 1, Table K5-4). Since the 1970-79 data fall after July 1963, the ranges are applied unless the two winds being compared both have a wind speed indicator (WI) showing wind was measured; in this case the two winds are given a tolerance of 0.6 m/s (approximately 1 knot) for equality. [NOTE: This conforms with actual Release 1 processing; Release 1, p. K28 states imprecisely that "from July 1963 on this allowance was applied only if one of the two winds being compared had an indicator showing it was estimated." During actual Release 1 processing, ranges were applied if either or both LMR5 WI values were missing or estimated (WI=0 or 3); the 0.6 m/s tolerance was applied only if both WI values were measured (WI=1 or 4). Because of enhancements in the LMR6 format, additional WI values for estimated (WI=0, 3, 5, 6) and measured (1, 4, 7, 8) are now defined (see ).]
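As an illustration of allowances #1 and #3, the following Python sketch shows how the two tests might be organized. It is not the dupelim code. The function names are hypothetical, both temperatures are assumed to be extant, "any match with" is read here as "at least one of the two reports is from a listed deck," the 0.6 m/s tolerance is assumed to be inclusive, and the wind ranges of Release 1, Table K5-4 are not reproduced.

```python
# Decks to which temperature allowance #1 applies.
ALLOWANCE_1_DECKS = {116, 119, 555, 888, 899}

# LMR6 wind speed indicator (WI) values showing that wind was measured.
WI_MEASURED = {1, 4, 7, 8}


def temperatures_match(t1, t2, deck1, deck2):
    """Allowance #1: equal, or off by less than 1 degree for listed decks."""
    if t1 == t2:
        return True
    # Assumption: the allowance applies if either report is from a listed deck.
    if deck1 in ALLOWANCE_1_DECKS or deck2 in ALLOWANCE_1_DECKS:
        return abs(t1 - t2) < 1.0
    return False


def wind_comparison_mode(wi1, wi2):
    """Allowance #3: choose how two wind speeds are compared.

    Returns "tolerance" only if both WI values show measured wind;
    otherwise (either WI missing, given as None, or estimated) the
    Table K5-4 ranges apply.
    """
    if wi1 in WI_MEASURED and wi2 in WI_MEASURED:
        return "tolerance"
    return "ranges"


def measured_winds_match(w1, w2, tolerance=0.6):
    """Compare two measured wind speeds (m/s); inclusive bound is assumed."""
    return abs(w1 - w2) <= tolerance
```

For example, air temperatures of 10.0 and 10.8 would match between decks 888 and 927 but not between decks 926 and 927, and a pair of winds with WI=1 and WI=0 would be compared by ranges rather than by the 0.6 m/s tolerance.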
e) Exact time/space/ID matches For each pair of reports being considered as possible dups, in addition to the weather element check (with allowances), check if the two reports match exactly in time (year, month, day, hour) and space (latitude and longitude). Also, ID is checked for an exact match (both IDs must be extant and non-generic), which has a bearing on the setting of dup status (DS). For the purpose of this ID check, erroneous characters in the error attachment appear as "e" regardless of the actual erroneous value, and are compared between two reports in that way (i.e., "e" matches "e" but nothing else). The weather element (WE) check and the exact time/space/ID check are conducted separately, in that a pair of reports may qualify as dups under either check or both. [NOTE: Future improvements in duplicate elimination should consider the possibility of also comparing the ID indicator (II) and platform type (PT) of each pair of possible duplicate reports (II is used to determine that both IDs are extant and non-generic). However, the settings of II and PT are not considered sufficiently reliable at this time to permit general usage of such checks without probably degrading the performance of dupelim (one check employing PT information appears under the special deck rules below).] f) Special deck rules and revised priority structure Table 2 lists special rules that are applied to some decks, and the proposed new priority numbers. The special rules, acting independently from the priority codes and other selection criteria, may force a given deck to be selected or not selected, and allow some decks to "pass through" dupelim unchanged. Table 2. Duplicate elimination special deck rules and priority codes. Priority codes are used in the event of a match of two reports with equal quality codes, in which case the deck with the lowest priority code is considered preferable and is selected in that match.
(If two reports with equal priority codes match, the second report in sort order is selected.) The special rules may override other duplicate selection criteria, including the quality and priority codes; notes following the table explain each of the special deck rules in detail.

-------------------------------------------------------------------------------
Rule Priority Deck     Description
===============================================================================
     2        128*     International Marine (US- or foreign-keyed ship data)
[b]  2        143      Pacific Marine Environmental Laboratory (PMEL) Buoys
[a]  1        145      PMEL (Daily) Equatorial Mooring
     3        186      USSR Ice Stations
     5        555**    US Navy FNMOC (Monterey) Telecom.
     2        666**    Tuna Boats
[a]  1        667      Inter-American Tropical Tuna Commission (IATTC)
[b]  2        714**    Canadian Marine Environmental Data Service (MEDS) Buoys
[z]  6        732      Russian Marine Meteorological Data Set (MORMET)
     2        733      Russian AARI NP Stations (from Polar Science Center)
[z]  5        749      First GARP Global Experiment (FGGE) Level IIb/SID=53
[k]  5***     749      First GARP Global Experiment (FGGE) Level IIb/SID=54
[k]  5***     749      First GARP Global Experiment (FGGE) Level IIb/SID=55
[a]  1        780      Levitus World Ocean Atlas (WOA)
     4        849**    First GARP Global Experiment (FGGE)
     4        850**    German FGGE
     1        876-882  US National Data Buoy Center (NDBC) Data
[k]  6***     883      US National Data Buoy Center (NDBC) Data
     5        888**    US Air Force Global Weather Central (GWC)
     4        889**    Autodin (US Dept. of Defense Automated Digital Network)
[a]  1        891      US National Ocean. Data Center (NODC) Surface Data
     2        898      Japanese
     2        900      Australian
     3        926*     International Maritime Meteorological (IMM) Data
[k]  3***     926****  International Maritime Meteorological (IMM) Data/SID=58
     3        927*     International Marine (US- or foreign-keyed ship data)
     3        928*     Same as 927 including Ocean Station Vessels (OSV)
-------------------------------------------------------------------------------

* A deck that has been classified as ship logbook data for the purpose of setting DS and DC.
Deck 927 is entirely logbook data, as is probably deck 926 (depending on the setting of LMR field OS). All the other non-GTS decks are classified as "delayed mode" data for setting DC, and thus treated equivalently to logbook data. These additional decks include manned ice floe station data (decks 186 and 733), oceanographic data (decks 780 and 891) or other automated platform data retrieved and quality controlled by a data producer (deck 145 and NDBC decks), and logbook data gathered outside the routine international exchange since 1963 under WMO Resolution 35 (deck 667). Deck 732 is a mixture of GTS and logbook data but is classified as delayed mode data for setting DS and DC. ** A deck that has been classified as from the Global Telecommunication System (GTS) for the purpose of setting DS and DC. Decks 849-850 are considered GTS although they may have been mixed. *** Priority code is needed for [k] decks only in order to resolve "best" match selection for the purpose of constructing "common/different" summary tables output from dupelim. Priority codes are chosen so that the same priority code is applied to both [k] and non-[k] reports within the same deck (not applicable to deck 883). **** Data from the French correction tape whose longitudes have been mislocated due to decoding via the "add 1000" method (source ID=58); see . ---------- The special deck rules listed in Table 2 are described below in order of precedence, i.e., the [a] rules take precedence over the [b] rules in the event of an [a]/[b] match: [a] Absolute pass through. These data should be duplicate free, and not available from any other source. Matches within this deck, or with a report from any other deck (including a different [a] deck), are ignored (all data are passed through). [NOTE: A small amount of matching may be expected, e.g., for ships servicing buoys. 
Matching is expected between decks 780 (WOA) and 891 (NODC), and also with oceanographic reports from deck 749, but that portion of deck 749 is later subject to automatic data rejection. Match results appear in dupelim summary Tables R5-R6, such that % BEST indicates selection under other rules supposing pass through rules were not in effect.] [b] Limited pass through. These data should be duplicate free, but may occasionally be available from other sources (e.g., where preconditioning failed due to earlier deck misassignment, etc.). Matches within this deck are ignored (all data are passed through). When a report from this deck matches a report from a different [b] deck, dup selection is resolved according to the default rules (i.e., quality code, priority code, and sort order). Otherwise, the report from the [b] deck is automatically selected over the report from a non-[b] deck unless the following two conditions are met: i) the non-[b] deck has platform type indicating ship (PT=5), and ii) dup status for the match is less than 9. If both conditions are met, the match is ignored (both reports are passed through). [NOTE: See for background on the development of this rule. Match results appear in dupelim summary Tables R5-R6, such that % BEST has a different meaning depending on whether a given match is resolved by selection of one report (i.e., either according to the default rules, or by automatic selection of a [b] over non-[b] report), or if the matching reports are both passed through (i.e., for matches within a [b] deck, or under the conditions for pass through of matches between [b]/non-[b] reports). In the first case it indicates actual selection, or, in the second case, selection under other rules supposing the pass through rules were not in effect (as for [a]).] [k] Automatic data rejection.
Some data known to be available from other sources, and some incorrectly located reports, were deliberately introduced into the datastream (e.g., to test for the presence of the French longitude problem). In the event of a match of a [k] report with any other report (including another [k] report), the match is ignored. Following such testing for all possible duplicates with [k] reports, all [k] reports are automatically deleted from the dupelim output. [NOTE: Match results appear in dupelim summary Tables R5-R6, such that % BEST indicates selection under other rules supposing the automatic deletion rules were not in effect (similarly to [a]).] [z] Non-selection except for unique reports. Matches within this deck, or between reports from different [z] decks, are resolved by selection according to the default rules. Matches of a report from this deck with a report from any other deck automatically result in the non-[z] report being considered the best duplicate; thus under these circumstances [z] reports are deleted unless they are unique (or uncertain duplicates). Preliminary work has demonstrated that much of the 1980-92 data from deck 732 is available at higher quality from other sources, but this deck is still believed to contain some unique data. [NOTE: Match results appear in dupelim summary Tables R5-R6, such that % BEST indicates actual selection unless pass through rules also apply.] g) Revisions for dup status (DS) (See .) h) Revisions for dup check (DC) (See .) i) Substitutions between duplicates (ID, II, and PT) Rules: For ship, OSV, or North Pole ice island data only, the ID array, ID indicator (II), and platform type (PT) may be substituted from one report into another matching report under specified conditions. Firstly, the report providing the ID/II/PT information must contain a recognizable non-generic ID (II=1 or II greater than 2), and the matching report must be missing ID (II missing) or contain a generic ID (II=2).
Secondly, the two reports must match as "certain" duplicates without an hour cross (DS=8 or greater), such that the report to receive ID/II/PT qualifies for DS=1, which is changed to DS=2 to indicate that substitution has taken place (thus reports with DS already set to a higher value as the result of an earlier match do not receive a substitution). Three cases for substitution are specified in Table 3.

Table 3. Cases for substitution of the ID array and ID indicator (II).
-------------------------------------------------------------------------------
Case  Source decks      w/PT  Destination decks  w/PT       Resultant PT
===============================================================================
(a)   555,888,749,849   5     128,926,927,928    0,1,5      0,1,5*
(b)   555,888,749,849   2     128,926,927,928    0,1,2,3,5  2,3**
(c)   733 (PT=9; II=6)  9     555,888            5          9
-------------------------------------------------------------------------------
*  PT in the destination deck is unchanged.
** Resultant PT=2, unless the destination deck already contains PT=3 which is
   then left unchanged.
----------

Implementation details: If a report that had a previous substitution as indicated by DS=2 is subject to a change to a higher dup status as the result of a subsequent match, the previously substituted ID/II is deleted after possible substitution into the newly chosen best report (generic IDs previously subject to replacement by a substituted ID/II are lost, but the substituted PT value is retained). This means that only reports with DS=2 will contain a substituted ID/II upon completion of dupelim.
[NOTE: Due to frequent disparities between GTS and logbook data, some more generalized form of report compositing may be highly desirable in the longer term, involving fields such as the following:
    Wind indicator (WI)
    Ship course and speed (SC, SS)
    Barometric tendency, amount of pressure change (A, PPP)
In planning any such future substitutions, an issue that needs to be carefully considered is whether the report should be re-QC'd each time a substitution takes place, and the quality code recomputed (ID/II are not considered during QC, thus substitutions have no effect on the quality code). Note that if the quality of a report were to improve during one match, this could influence the results of subsequent matches with other reports.]

{References}

Cram, R.S., 1986: Marine Data Processing Procedures at the National Climatic Data Center. Proceedings of a COADS Workshop, Boulder, Colorado, January 22-24, 1986. S.D. Woodruff, Ed., NOAA Environmental Research Laboratories, Boulder, Colo. 25-35.

Jenne, R.L., 1986: Technical working group report. Proceedings of a COADS Workshop, Boulder, Colorado, January 22-24, 1986. S.D. Woodruff, Ed., NOAA Environmental Research Laboratories, Boulder, Colo. 183-189.

NCDC (National Climatic Data Center), 1968: TDF-11 Reference Manual. NCDC, Asheville, NC.

NCDC (National Climatic Data Center), 1972a: Environmental Data Buoy (TDF-11) Edit/Archive (from R. Quayle; documentation date approximate), 2 pp.

NCDC (National Climatic Data Center), 1972b: Environmental Data Buoy (TDF-11) Archival Format (from R. Quayle; documentation date approximate), 3 pp.

NCDC (National Climatic Data Center), 1976a: TDF-1144 reference pages (18 February 1976) (from R. Quayle), 2 pp.

NCDC (National Climatic Data Center), 1976b: TDF-1145 reference pages (18 February 1976) (from R. Quayle), 2 pp.

NCDC (National Climatic Data Center), 1983: Tape format documentation, tape deck 1148: "Conversion of XBT, MBT, and SD tapes to TDF-11 format" (original documentation date unknown, with February 1983 hand-written additions by P. Steurer), 2 pp.

Uwai, T. and K. Komura, 1992: The Collection of Historical Ships' Data in Kobe Marine Observatory. Proceedings of the International COADS Workshop, Boulder, Colorado, 13-15 January 1992. H.F. Diaz, K. Wolter, and S.D. Woodruff, Eds., NOAA Environmental Research Laboratories, Boulder, Colo., 47-59.