===============================================================================
International Comprehensive Ocean-Atmosphere Data Set (ICOADS):     Release 2.4
Technical Information about Statistics & Related Processing   26 September 2007
===============================================================================
Document Revision Information (previous version: 27 February 2004): Updates, most extensively in sec. 1, Table 4a, and Table 5, for Release 2.4 and for the ICOADS URL.
-------------------------------------------------------------------------------

{1. Introduction}

This document describes the quality control and data selection criteria used:

a) For 1784-1997, to select observational data for inclusion in the Long Marine Reports Fixed-length (LMRF) format and in the International Maritime Meteorological Archive (IMMA) format, from the superset of data available in the Long Marine Reports (LMR) format (which included some landlocked reports and uncertain duplicates). [NOTE: For 1998-forward, in contrast, input data were translated instead directly into IMMA, during which landlocked and other suspicious data may have been removed on a source-by-source basis; LMRF reports were then generated directly from IMMA on a one-for-one basis (throughout 1784-May 2007, IMMA reports are one-for-one with LMRF).]

b) To create year-month summary statistics in Monthly Summary Groups (MSG) format (1800-May 2007), from LMRF.

For reference, this document also includes some notes about past production from LMR of Monthly Summaries Trimmed (MST) and Monthly Summary Trimmed Groups (MSTG); these older statistics formats have been superseded by the MSG format.

Release 2.4 input data were taken from two separate archives: 1784-2004 data were from the delayed-mode (ICOADS.DM) archive, and 2005-May 2007 data were from the real-time (ICOADS.RT) archive.
[NOTE: Because of processing differences, the three original COADS updates that compose the ICOADS.DM archive through 1997, and accompanying documentation, are referred to as follows:

      Release 1a:  1980-97
      Release 1b:  1970-79
                   1946-69
      Release 1c:  1784-1949

These four documents describe the "preconditioning" and duplicate elimination processing used to create LMR for the indicated periods. 1946-49 Release 1b data were replaced by Release 1c data. In addition, describes similar processing used to extend the ICOADS.DM archive through 2004.]

The MSG statistics are available in "standard" (ship only; 3.5 sigma trimming) and "enhanced" (mixed platforms; 4.5 sigma trimming) versions. MSG products are available back to 1800 using 2-degree latitude x 2-degree longitude boxes, and back to 1960 using 1x1-degree boxes. Please refer to for a description of currently available statistics products and time periods of available products.

Subsetting of LMRF from LMR (for 1784-1997) and creation of MSG was governed by a condensed selection of quality control (QC) information contained in the trimming "attachment" to LMR (or in the equivalent trimming "section," fields 74-96, of LMRF). Thus the contents of the trimming attachment are documented first (sec. 2). [NOTE: LMR is used only for production ICOADS processing, and the Fortran access software available for LMRF makes the technical differences between an attachment and section transparent to the user. A non-condensed set of QC information is available in the ICOADS attachment of IMMA.]

Sec. 3 documents the specific quality control and data selection criteria used to create LMRF (for 1784-1997) and MSG. Sec. 4 documents additional technical details, plus revisions in the rules used for some past updates.

{2.
LMR trimming attachment and LMRF trimming section}

The LMR trimming attachment (Attm2) contains the 2-degree box number and flags that show whether the variables they refer to were trimmed (i.e., excluded from the summaries but retained in LMR and LMRF) as apparent statistical outliers or for other reasons. Also included in Attm2 are "composite" QC flags, a night/day report flag, a "landlocked" flag, and a set of "source exclusion" flags. Attm2 is equivalent to the trimming section in LMRF (different field numbering, but identical in content).

A goal in designing Attm2 (and equivalently the LMRF trimming section) was to have it contain the information necessary for a user to reproduce either enhanced or standard statistics, without accessing the LMR QC, supplemental, and error attachments (Attm1, Attm4, and Attm5). Although this approach is more complex than having a single flag that simply indicates whether or not a report or data element was used, it provides additional flexibility for users who might choose to make use of different combinations of flags. However, for users who do not wish to do so, easily used Fortran software {trimqc0.f,trimqc1.f} and a Webpage interface are available that will implement pre-defined data deletions and selections.

Note that four separate groups of flags within Attm2, plus the dup status (DS) and a few other regular LMR fields, must be checked in order to determine whether LMR reports qualify for use in the statistics. However, the situation is simpler in the more widely used LMRF format because reports are deleted based on DS and other regular LMR fields, in creation of LMRF from LMR. Table 1 gives the structure of Attm2; each flag group is described in detail following the table.

Table 1. LMR trimming attachment (Attm2). Except for field numbering, this is also the structure of the LMRF trimming section (i.e., fields 74-96 in LMRF).
In LMR the following fields are preceded by the attachment length and ID, as specified in Table 4 of . Notation is as follows: m:n denotes m through n inclusive. AT, WBT, DPT, and SST are abbreviations, respectively, for air (i.e., dry bulb), wet bulb, dew point, and sea surface temperatures; and SLP is the abbreviation for sea level pressure.
-------------------------------------------------------------------------------
No.  Field  Description               True value  Units  Base  Coded  Bits
===============================================================================
            Miscellaneous fields
            ------------------------------
  1  B2     2-degree box              1:16202     1      0     same     14
  2  ND     night/day report flag     1:2         1      0     same      2
            Trimming flags
            ------------------------------
  3  SF     SST flag                  1:15        1      0     same      4
  4  AF     AT flag                   1:15        1      0     same      4
  5  UF     U-wind flag               1:15        1      0     same      4
  6  VF     V-wind flag               1:15        1      0     same      4
  7  PF     SLP flag                  1:15        1      0     same      4
  8  RF     rel. humidity flag        1:15        1      0     same      4
            Composite QC flags (NCDC/other)
            -------------------------------
  9  ZQ     report-status flag        1:3         1      0     same      2
 10  SQ     SST flag                  1:3         1      0     same      2
 11  AQ     AT flag                   1:3         1      0     same      2
 12  WQ     wind flag                 1:3         1      0     same      2
 13  PQ     SLP flag                  1:3         1      0     same      2
 14  RQ     WBT/DPT flag              1:3         1      0     same      2
            Composite QC flags (NCDC-only)
            -------------------------------
 15  XQ     present weather flag      1:3         1      0     same      2
 16  CQ     cloud flag                1:3         1      0     same      2
 17  EQ     wind wave flag            1:3         1      0     same      2
            Landlocked flag
            --------------------------
 18  LZ     landlocked flag           1:1         1      0     same      1
            Source exclusion flags
            --------------------------
 19  SZ     SST flag                  1:1         1      0     same      1
 20  AZ     AT flag                   1:1         1      0     same      1
 21  WZ     wind flag                 1:1         1      0     same      1
 22  PZ     SLP flag                  1:1         1      0     same      1
 23  RZ     rel. humidity flag        1:1         1      0     same      1
                                                     Attachment total   64
-------------------------------------------------------------------------------

a) Miscellaneous fields

The night/day report flag was set to indicate whether the report fell in local nighttime or daytime (as determined in Release 1, supp.
A):

      1 = report time is local nighttime
      2 = report time is local daytime

b) Trimming flags

The configuration of the trimming flags differs in several ways from what was used for Release 1, including an expansion in the number of bits and a change in the base value. Flag values were added to indicate whether data fall above or below a given sigma threshold, with reference to 4.5 sigma in addition to the 2.8 and 3.5 sigma limits that were previously used. Note that extreme values of the 4.5 sigma limits were adjusted to fall within the lower and upper bounds given by Table C2-3 in Release 1, as was done when the 3.5 sigma limits were originally calculated. Flag values were also added to indicate when data were missing or unusable (Table 2).

Where a1 is the individual observation under scrutiny, g is the smoothed median, and s1 and s5 are the smoothed lower and upper median deviation, the trimming flags have the following defined values (values 8-10 are currently unused):

       1 = within 2.8 sigma limits             (g - 2.8*s1 <= a1 <= g + 2.8*s5)
       2 = less than 2.8 sigma lower limit     (g - 3.5*s1 <= a1 <  g - 2.8*s1)
       3 = greater than 2.8 sigma upper limit  (g + 2.8*s5 <  a1 <= g + 3.5*s5)
       4 = less than 3.5 sigma lower limit     (g - 4.5*s1 <= a1 <  g - 3.5*s1)
       5 = greater than 3.5 sigma upper limit  (g + 3.5*s5 <  a1 <= g + 4.5*s5)
       6 = less than 4.5 sigma lower limit     (a1 < g - 4.5*s1)
       7 = greater than 4.5 sigma upper limit  (a1 > g + 4.5*s5)
      11 = limits missing (ocean/coastal box); MEDS data correct (SF/PF only)
      12 = limits missing (ocean/coastal box)
      13 = landlocked 2-degree box
      14 = data unusable (SF, AF, and PF, only; see Table 2)
      15 = data missing or not computable (see Table 2)

The flags for data missing or unusable (14-15), if applicable, were set instead of the flags for limits missing or landlocked (11-13), if also applicable (i.e., values 11-13 refer only to usable data).
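
As a minimal sketch of how trimming flag values 1-7 above partition an observation relative to the limits, the following Python function applies the listed inequalities directly. The function names `trimming_flag` and `within_limits` are hypothetical (they are not part of the ICOADS Fortran software), and the sketch assumes that limits exist for the box and that the data are usable, so values 11-15 are not handled.

```python
def trimming_flag(a1, g, s1, s5):
    """Return trimming flag value 1-7 for observation a1 (illustrative only).

    g  : smoothed median
    s1 : smoothed lower median deviation
    s5 : smoothed upper median deviation
    Assumes limits are available and the data are usable (flags 11-15
    are set by other conditions and are not handled here).
    """
    if g - 2.8 * s1 <= a1 <= g + 2.8 * s5:
        return 1
    below = a1 < g - 2.8 * s1          # otherwise a1 > g + 2.8*s5
    dev = s1 if below else s5          # deviation on the relevant side
    dist = (g - a1) if below else (a1 - g)
    if dist <= 3.5 * dev:
        return 2 if below else 3       # between 2.8 and 3.5 sigma
    if dist <= 4.5 * dev:
        return 4 if below else 5       # between 3.5 and 4.5 sigma
    return 6 if below else 7           # beyond 4.5 sigma


def within_limits(flag, sigma):
    """True if a flag value lies within the 2.8, 3.5, or 4.5 sigma limits."""
    highest = {2.8: 1, 3.5: 3, 4.5: 5}[sigma]
    return 1 <= flag <= highest


if __name__ == "__main__":
    # Made-up example values: median 20.0, lower deviation 1.0, upper 2.0
    for obs in (20.0, 17.0, 16.0, 15.0, 26.0, 28.0, 30.0):
        print(obs, trimming_flag(obs, 20.0, 1.0, 2.0))
```

Because odd values mark the upper side and even values the lower side, a single comparison against 1, 3, or 5 selects everything within the 2.8, 3.5, or 4.5 sigma limits, which is what `within_limits` relies on.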
The numeric ordering of the flags allows computation of "untrimmed" summaries, for example, by testing for a value <= 12 (excluding only landlocked, unusable, or missing data).

Trimming flag value 11 was set only for MEDS (deck 714) SST or SLP data falling within an ocean/coastal box lacking trimming limits, for which the MEDS QC flag was set to "1" indicating that the data were checked and appear correct. Note that any MEDS air temperature or wind data appearing under these circumstances were flagged 12, because MEDS applied a lower level of quality control to these elements.

Table 2. SST, AT, and SLP data were classified as unusable (flag value 14) under two conditions: a) data were outside the global physical limits given in this table,* and/or b) data were in the LMR error attachment. These variables were instead classified as missing (flag value 15) if the regular field was missing and no corresponding data existed in the error attachment. Flag value 14 is not defined for UF, VF, and RF because these flags refer to computed data not available as fields in LMR; flag value 15 is used to refer to data that were missing or not computable for wind** and relative humidity.***
-------------------------------------------------------------------------------
Variable                           Physical limits
===============================================================================
sea surface temperature (SST)      -5.0:40.0 degrees C
air temperature (AT)               -88.0:58.0 degrees C
sea level pressure (SLP)           870.0:1074.6 hPa
-------------------------------------------------------------------------------
* The SST and AT limits are based on ranges established for Compressed Marine Reports (CMR4/CMR5) for Release 1 (see supps. D-E); temperature values outside these ranges could not be stored in CMR, and thereby were indirectly omitted from statistics. For later Releases, in contrast, we calculate statistics directly from LMR/LMRF.
Thus the global limits are checked when setting SF and AF, since SST or AT values outside these ranges may appear in LMR/LMRF. SLP values are limited to this range in CMR and LMR/LMRF formats, thus any SLP value outside this range was omitted from CMR, or will appear in the LMR error attachment. Therefore, when setting PF, we do not actually check for SLP outside these limits, but only for data in the error attachment.

** Wind was not computable if some wind data existed but U and V did not result after application of Release 1, Table E2-1, including all cases where only wind speed resulted.

*** Relative humidity (RH) was not computable if AT was missing or unusable, if DPT was missing or unusable, or if the calculation of dew point depression (DP) yielded a value outside the range 0 <= DP <= 70 deg C, such that the calculation of DP was changed to yield zero when -0.5 <= AT - DPT < 0 deg C. In addition, for Release 1, RH was set to missing if AT fell outside the 3.5 sigma limits; for later Releases, the variable hierarchy for statistics (Release 1, Figure A4-1) serves to implement this dependency at different trimming levels.
----------

c) Composite QC flags

The "NCDC/other" composite QC flags summarize selected results from one or two independent quality control procedures: the NCDC procedure (applied to all data), and a second procedure depending on deck. Presently, the second procedure is limited to MEDS (deck 714) moored and drifting buoy data. Decks other than 714 have the composite QC flags set based only on the NCDC procedure. Complete results from the NCDC procedure are available in LMR Attm1, and those from the MEDS procedure in Attm4 of LMR (neither attachment is available in LMRF).

As listed in Table 3, the composite QC flags are divided into two groups:

a) "NCDC/other" flags, whose values reflect the outcome of two procedures (as applicable; see Table 3).
All of these flags were used for the MSG statistics, and all but one (report-status) for the MST/MSTG statistics.

b) "NCDC-only" flags, whose values depend only on the outcome of the NCDC procedure (not currently used for statistics).

Both groups of flags were set according to Table 3, but the meaning of the flag values differs between the two groups. The extant values of the NCDC/other flags are defined as follows:

      1 = erroneous (based on NCDC quality control)
      2 = erroneous (based on other quality control)
      3 = erroneous (based on NCDC and other quality control)

The extant values of the NCDC-only flags are defined as follows:

      1 = correctable (based on NCDC quality control)
      2 = suspect (based on NCDC quality control)
      3 = erroneous (based on NCDC quality control)

Table 3. NCDC quality control flags and MEDS flags used for setting composite QC flag values, where "-" indicates an undefined flag value, and "x" indicates a defined flag value that was not utilized in setting the composite flags. The NCDC flags are divided up into those that refer to correctable ("Mod."), suspect, or erroneous data (see Release 1, supp. J). Except in the case of wind data, the composite flags were set simply by utilizing the indicated MEDS and/or NCDC quality control flags; no check was made for the presence or usability of data (see Table 2).
-------------------------------------------------------------------------------
Composite QC flags                   NCDC*                  MEDS*
                           A  B  J  K    L  M     N  Q      Erroneous
===============================================================================
NCDC/other                 Mod.          Suspect  Erroneous
------------------------   ----          -------  ---------
ZQ report-status flag**    -  -  -  -    -  M     -  -      (3)
SQ SST flag                -  -  -  -    x  x     -  Q      3
AQ AT flag                 -  -  x  -    x  x     N  Q      3
WQ wind flag               x  -  x  -    -  ***   -  x      3
                                                            (spd and/or dir)#
PQ SLP flag                -  -  -  -    x  x     -  Q      3
RQ WBT/DPT flag##          -  x  -  -    x  x     N  Q      -

NCDC-only                  Mod.          Suspect  Erroneous
------------------------   ----          -------  ---------
XQ present weather flag    -  B  J  -    L  M     -  -      -
CQ cloud flag              -  B  J  -    -  -     N  -      -
EQ wind wave flag          A  B  J  -    -  M     N  Q      -
-------------------------------------------------------------------------------
* The NCDC flags for visibility, past weather, swell, and pressure tendency, and the MEDS flag for pressure tendency, were not included.

** ZQ was used only for MSG statistics (not for MST/MSTG statistics). The NCDC report-status flag M indicates, among other possibilities, a landlocked report based on a check using a 1-degree land/sea grid. This is more stringent than the 2-degree grid used to set the separate landlocked flag (LF). Thus LF may conflict with the setting of ZQ. Preconditioning currently removes all MEDS reports prior to 1993 flagged with report-status 3 (see ); data with report-status 3 appear in the output LMR starting in 1993 (the setting of this flag is noted in parentheses because it appears for a limited period). [NOTE: Starting in 1993, due apparently to changes in the processing of data at ARGOS, MEDS started flagging large numbers of reports because they were from a different satellite pass, mixed with smaller amounts of data flagged for other reasons (e.g., unrealistic movements in the buoy trajectory). To avoid excessive data losses, all MEDS data flagged with report-status 3 were retained by preconditioning starting in 1993 (i.e., reports with ZQ = 2 or 3). This likely introduced additional outliers into the data mixture, and it also complicated rules for production of statistics (discussed in sec. 4).]

*** WQ was set only when an M flag appeared under the conditions specified in the upper-left corner of Release 1, Table E2-1. This is the only case where an NCDC flag is necessary to implement Table E2-1.
This flag information was retained only for informative purposes, since trimmed or untrimmed summaries can be calculated using only the trimming flags UF and VF (for Release 1, the untrimmed summaries included wind speed even if U and V were not computable).

# WQ was set if either or both of the two separate MEDS flags for speed and direction were set.

## RQ was set if either or both of the corresponding NCDC flags for WBT and DPT were set. RQ was included in the NCDC/other group even though there are no such MEDS flags currently defined, to allow uniform processing of the flag values by the statistics program. Note that for Release 1 processing only the DPT flag was checked, not both flags as described here.
----------

d) Landlocked flag

The single extant value of the landlocked flag (LF) is defined as follows:

      1 = report over land

If LF is missing, this indicates that the report falls over an ocean or coastal region as defined by a "landlocked" file at 2-degree resolution (see Release 1, supp. G). As noted in Table 3, the NCDC report-status flag may conflict with the setting of LF due to the use of a higher-resolution (1-degree) check in the NCDC procedure. [NOTE: As discussed in sec. 3, this yielded differences in MST/MSTG 2-degree statistics (screened according to LF), when compared to MSG 2-degree and 1-degree statistics (both screened using the 1-degree check).]

e) Source exclusion flags

The source exclusion flags are provided to indicate when certain sources or selections of data are automatically omitted from the statistics for subjective reasons. These flags are set independently from the trimming and composite flags, based only on data source, platform type (PT), or report time, and without testing for the presence or usability of data (see Table 2).
The single extant value of the source exclusion flags is defined as follows:

      1 = data automatically disqualified from statistics

Even if a given source exclusion flag is missing, the data still may be excluded from the standard or enhanced statistics for other reasons, as detailed in sec. 3: according to the trimming or composite flags, if the report is an uncertain dup as indicated by dup status (DS), or if the report falls outside time periods specified for different source IDs.

For 1980-May 2007 data, the source exclusion flags were set as follows:

a) SZ through RZ: All C-MAN data (PT=13).

b) WZ: All wind measurements from drifting buoys (PT=7).

c) SZ through RZ: All off-3-hourly NDBC (deck 883) moored buoy data (PT=6), i.e., any such moored buoy report at an hour other than 00, 03, 06, 09, 12, 15, 18, or 21 UTC. For the purpose of this test, hour to hundredths is first rounded to the nearest hour, except that reports at exactly half past the hour are automatically rejected (reports with a missing or erroneous hour are also automatically rejected). [NOTE: For 2005-May 2007 data, rule c) had no effect since the ICOADS.RT archive was limited to GTS decks; it did not include deck 883.]

For 1950-79 data, rule c) was modified and rule d) added:

a) (Rule active, but not applicable to 1950-79 data.)

b) (Rule active, but not applicable to 1950-79 data.)

c) As for rule c) above, except applied instead to decks 876-882.

d) WZ: Any wind measurements from IATTC data (SID=70 or SID=71).

[NOTE: Rules a) and b) were carried over from 1980-97 processing without effect because: a) NDBC's Coastal-Marine Automated Network (C-MAN) program only became operational in March 1983; and b) we believe that drifting buoys only started carrying wind instrumentation well after 1979. Similarly, rule c) specifies a method for handling hour to hundredths in spite of the fact that all NDBC data during the 1950-79 period should have times limited to whole hours.
For 1980-97 statistics, wind measurements from IATTC data were omitted by a special element rejection rule in Table 4a below, rather than rule d). This was because in construction of initially available Release 1a products, the decision not to use IATTC wind data was made after Attm2 was created. Future work should address this inconsistency between the handling of IATTC data in Releases 1a and 1b.]

For 1784-1949 data, none of the rules were applicable (no source exclusion flags were set).

{3. Rules for production of enhanced/standard statistics, and LMRF from LMR}

The summaries are available in two versions: "enhanced" and "standard." These two versions were created by applying different quality control and data selection criteria, as documented in this section. Following is an overview of the rules used for creation of the enhanced and standard statistics:

a) Enhanced statistics: The 1950-79 trimming limits originally defined for Release 1 were used (for data before and after 1979), but expanded to 4.5 sigma for all trimmed variables. In addition, SST and SLP data from MEDS drifting buoys were accepted in regions without trimming limits, provided the applicable MEDS flag indicated that QC had been performed on that element. IATTC (fishing fleet) wind data were excluded from enhanced statistics (because fishing vessels tend to seek out calm wind conditions for fishing activities), but SST and cloudiness data were included.

b) Standard statistics: The 1950-79 trimming limits were also used (for data before and after 1979), but at the original 3.5 sigma level used for Release 1 trimming. These statistics were limited as nearly as practical to ordinary ship data. Fishing fleet data (IATTC; SID = 70 or 71) were excluded from this set, as well as non-ship data that could be identified.

Table 4a provides the detailed tests applied to reject 1980-May 2007 data from the standard and enhanced statistics, and in the creation of LMRF from LMR (applicable only to 1784-1997 data).
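
The enhanced-statistics tests of Table 4a can be sketched in Python as follows. This is only an illustration, not the production Fortran: the `Report` class and both function names are hypothetical, the trimming test "1 > SF-RF > 5" is read here as "flag value below 1 or above 5" (an assumption about the table's shorthand), and the stricter standard-statistics rules (non-ship and SID 70/71 report rejection, 3.5 sigma trimming) are deliberately left out.

```python
from dataclasses import dataclass

MISSING = 0  # Attm2 flags: a missing value of zero is assumed (Table 4a, footnote *)


@dataclass
class Report:
    """Hypothetical container for the report-level fields tested in Table 4a."""
    year: int
    sid: int = MISSING   # source ID
    ds: int = MISSING    # dup status
    lz: int = MISSING    # landlocked flag (2-degree check)
    zq: int = MISSING    # composite report-status flag


def reject_report_enhanced(r: Report, for_msg: bool = True) -> bool:
    """Report rejection rules a)-c), applied in the order listed."""
    if r.ds > 2:                                   # a) uncertain dups
        return True
    if r.lz == 1 or (for_msg and r.zq in (1, 3)):  # b) landlocked (ZQ: MSG only)
        return True
    if r.sid in (25, 30) and r.year > 1984:        # c) time period, i) and ii)
        return True
    if r.sid == 33 and r.year < 1986:              # c) time period, iii)
        return True
    return False


def reject_element_enhanced(var: str, excl: int, comp: int, trim: int,
                            sid: int = MISSING) -> bool:
    """Element rejection rules a)-d) for one variable of an accepted report.

    var  : "S", "A", "U", "V", "P", or "R" (hypothetical variable codes)
    excl : source exclusion flag for the variable (SZ, AZ, WZ, PZ, or RZ)
    comp : composite QC flag for the variable (SQ, AQ, WQ, PQ, or RQ)
    trim : trimming flag for the variable (SF, AF, UF, VF, PF, or RF)
    """
    if excl == 1:                                  # a) source exclusion
        return True
    if comp > MISSING:                             # b) composite QC flags
        return True
    if not (var in ("S", "P") and trim == 11):     # c) SF/PF = 11 not rejected
        if trim < 1 or trim > 5:                   #    assumed reading of the test
            return True
    if var in ("U", "V") and sid in (70, 71):      # d) special: IATTC wind data
        return True
    return False
```

For example, `reject_report_enhanced(Report(year=1995, zq=2))` returns False, reflecting that ZQ = 2 is accepted, while `reject_element_enhanced("A", 0, 0, 11)` returns True because the flag value 11 exception applies only to SF and PF.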
Tables 4b and 4c provide similar rules governing 1950-79 and 1800-1949 processing (and production of LMRF extending back to 1784). Table 5 gives additional details on the platform and variable composition of the resulting standard and enhanced statistics.

[NOTE: For Releases 1a and 1b, Table 4a/4b rules were implemented within the program to calculate MST/MSTG directly from LMR (only MSG was calculated for Release 1c). In contrast, in preparation for calculation of MSG from LMRF, selected rules from the enhanced set of report rejection rules (i.e., to delete uncertain duplicates, landlocked, and other suspect reports) are applied as indicated in Table 4a as part of creation of LMRF. This simplifies the rules that are needed to calculate statistics from LMRF.]

Table 4a. Quality control flags and other LMR/LMRF fields used to prepare currently available enhanced and standard statistics based on 1980-May 2007 data (data meeting the indicated criteria were not used). In addition, enhanced report rejection rules a-c were applied to 1980-97 data to create LMRF from LMR (as discussed in sec. 1, 1998-forward LMRF were processed differently). Some of these rules implement a stage of processing termed "pre-trimming," which was designed to achieve better consistency with Release 1 data (details are covered in sec. 4). Tests were applied in the order listed, e.g., if an entire report was rejected, neither the report nor individual elements within it were further checked. Some field tests have been abbreviated, e.g., "SZ-RZ" indicates "SZ through RZ," and "SF/PF" indicates "SF or PF." The fields are all in Attm2 except for regular LMR/LMRF fields dup status (DS), year, platform type (PT), and source ID (SID). [NOTE: For 1998-forward data, some of the tests had no effect due to simplified processing (e.g., DS was always missing).]
-------------------------------------------------------------------------------
Statistics product    Description of test      Field test (true values*)
===============================================================================
Enhanced statistics   Report rejection:**
                      a) Uncertain dups        DS > 2
                      b) Landlocked#           LZ = 1; plus ZQ = 1 or 3
                                               (MSG only)
                      c) Time period           i) SID 25: year > 1984
                                               ii) SID 30: year > 1984
                                               iii) SID 33: year < 1986
                      Element rejection:
                      a) Source exclusion      SZ-RZ = 1
                      b) Composite QC flags    SQ-RQ > missing
                      c) Trimming flags        1 > SF-RF > 5, except
                                               SF/PF = 11 not rejected
                      d) Special               SID 70/71 wind data

Standard statistics   Report rejection:
                      a) Uncertain dups        (same as enhanced)
                      b) Landlocked            (same as enhanced)
                      c) Time period           (same as enhanced)
                      d) Non-ship data         not PT=2/5, except PT=missing
                                               accepted for deck 888 only##
                      e) Special               SID = 70 or 71
                      Element rejection:
                      a) Source exclusion      (same as enhanced)###
                      b) Composite QC flags    (same as enhanced)###
                      c) Trimming flags        1 > SF-RF > 3
-------------------------------------------------------------------------------
* Fields from Attm2 have coded values identical to true values, thus a missing value of zero is assumed in this table.

** Report rejection rules a)-c) also were applied to eliminate reports from LMRF (except for ZQ = 1, applied only in subsequent calculation of MSG from LMRF).

# Reports with LZ = 1 (landlocked according to a 2-degree check) were deleted from MST/MSTG, and also from LMRF and thereby MSG. Reports with ZQ = 1 or 3, which were landlocked according to a 1-degree NCDC check, or otherwise suspect according to NCDC (and additionally in the case of ZQ = 3 according to MEDS) quality control, were retained in LMRF, but deleted from MSG. Thus a more stringent landlocked check was applied to MSG (at both 2-degree and 1-degree resolution) than to MST/MSTG. See Table 3 and sec. 4 for background on why ZQ = 2 was accepted as part of this check and other technical details.

## See sec.
4 for notes about this rule and the different Table 4b rule.

### Some of the source exclusion and composite QC flags may be unnecessary depending on platform type (e.g., if referring to non-ship data removed by preceding checks).
----------

Table 4b. Quality control flags and other LMR/LMRF fields used to prepare currently available enhanced and standard statistics, and to create LMRF (enhanced report rejection rules a-b), based on 1950-79 data (otherwise as for Table 4a, with the rule differences discussed in footnotes).
-------------------------------------------------------------------------------
Statistics product    Description of test      Field test (true values)
===============================================================================
Enhanced statistics   Report rejection:
                      a) Uncertain dups        (same as Table 4a)
                      b) Landlocked            (same as Table 4a)
                      Element rejection:*
                      a) Source exclusion      (same as Table 4a)
                      b) Composite QC flags    (same as Table 4a)
                      c) Trimming flags        (same as Table 4a)

Standard statistics   Report rejection:
                      a) Uncertain dups        (same as enhanced)
                      b) Landlocked            (same as enhanced)
                      c) Non-ship data**       PT > 5 (accept any PT=missing)
                      d) Special               (same as Table 4a)
                      Element rejection:
                      a) Source exclusion      (same as enhanced)
                      b) Composite QC flags    (same as enhanced)
                      c) Trimming flags        (same as Table 4a)
-------------------------------------------------------------------------------
* Table 4a includes additional rule d) for rejection of SID 70/71 wind data from Release 1a. For Release 1b, IATTC wind measurements instead were deleted via source exclusion flag WZ (see sec. 2).

** See sec. 4 for notes about this rule and the different Table 4a rule.
----------

Table 4c. Quality control flags and other LMR/LMRF fields used to prepare currently available enhanced and standard statistics, and to create LMRF (enhanced report rejection rules a-b), based on 1784-1949 data (two differences in comparison to Table 4b are discussed in the footnotes).
In addition, the flags and fields used to prepare untrimmed statistics are listed (no element rejections are applied).
-------------------------------------------------------------------------------
Statistics product    Description of test      Field test (true values)
===============================================================================
Enhanced statistics   Report rejection:
                      a) Uncertain dups        DS > 2, except accept DS=6*
                      b) Landlocked            LZ = 1**
                      Element rejection:
                      a) Source exclusion      (same as Table 4a)
                      b) Composite QC flags    (same as Table 4a)
                      c) Trimming flags        (same as Table 4a)

Standard statistics   Report rejection:
                      a) Uncertain dups        (same as enhanced)
                      b) Landlocked            (same as enhanced)
                      c) Non-ship data         PT > 5 (accept any PT=missing)
                      d) Special               (same as Table 4a)
                      Element rejection:
                      a) Source exclusion      (same as enhanced)
                      b) Composite QC flags    (same as enhanced)
                      c) Trimming flags        (same as Table 4a)

Untrimmed statistics  Report rejection:***
                      a) Uncertain dups        (same as enhanced)
                      b) Landlocked            (same as enhanced)
-------------------------------------------------------------------------------
* The rule identifying reports with DS=6 (uncertain: time/space match with ID mismatch) required only an exact time/space match between reports. We believe this rule worked well for more modern data, but not for early historical data in the Release 1c period (e.g., early ship data with locations recorded only to whole degrees). Thus DS=6 reports were retained both in LMRF and statistics.

** The rule used in Tables 4a-4b eliminating ZQ = 1 or 3 reports from MSG was found to be inappropriate for Release 1c data, because many (e.g., Maury Collection) reports had missing hour and thus were assigned ZQ = 1 (ZQ = 3 was not applicable to Release 1c data). See sec. 4 for further details.
*** The above definition of untrimmed statistics was used for results here: http://icoads.noaa.gov/r1c.html [NOTE: {trimqc0.f} software now implements a revised definition of untrimmed statistics for the entire period-of-record defined as follows: In addition to the above rules, data with trimming flag values > 12 are not used, source exclusion flags are used (only applicable during Release 1a/1b periods), and special element rejection rule d) (SID 70/71 wind data) is applicable (only during the Release 1a period).]
----------

Table 5. Composition by platform type and variable, of the enhanced (4.5 sigma trimming) and standard (3.5 sigma) products based on Release 2.4 data (Release 1 trimmed statistics for 1854-1979 were derived using 3.5 sigma trimming limits for three different time periods, and combining all available platform types). Only observed variables are listed: S = SST, A = AT, W = wind data, P = SLP, R = humidity data, and C = total cloudiness (cloudiness is not presently subject to trimming). The approximate temporal coverage or starting year of data available in ICOADS is indicated for non-ship platform types. For a given platform type, "x" indicates that no data were used, and "-" indicates data generally not observed. [NOTE: There are exceptions to "data not observed" depending on platform type, and as some platform types have been modernized. For example, some moored buoys and NDBC C-MAN stations may report relative humidity; however, any such RH data corresponding to "-" were not decoded (and converted into dew point temperature) until 2005.]
-------------------------------------------------------------------------------
                                          Enhanced         Standard
Platform type                             S A W P R C      S A W P R C
===============================================================================
Ships (including Ocean Weather
  Stations)                               S A W P R C      S A W P R C
Drifting buoys (1978-)*                   S A x P - -      x x x x - -
Moored buoys (1970-)**                    S A W P - -      x x x x - -
EPOCS (daily) buoys/island
  stations (1979-91)                      S A W P - -      x x x x - -
Fishing fleet data (1971-)                S - x - - C      x - x - - x
Rigs and platforms (~1973-)               S A W P R C      x x x x x x
Coastal-Marine Automated Network
  (C-MAN) (1983-)                         x x x x - -      x x x x - -
Surface-level oceanographic
  temperatures (~1900-)#                  S - - - - -      x - - - - -
North Pole (NP) stations (1937-)##        - A W P R C      - A W P R C
-------------------------------------------------------------------------------
* The enhanced statistics include some drifting buoy data that passed the MEDS quality control as acceptable, but that occurred in 2-degree boxes for which 1950-79 trimming limits could not be computed.

** Includes NDBC buoys, which were reduced from hourly to 3-hourly in the enhanced statistics (a similar reduction applied for 1985-91 in the interim statistics). This category also includes PMEL TOGA/TAO buoys since 1985, and foreign moored buoys.

# For some time periods, the available oceanographic data sources included additional surface meteorological elements, which were included only in the enhanced statistics.

## Data from drifting manned stations ("ice islands") were available to some extent from Global Telecommunication System (GTS) receipts, plus from two delayed sources (decks 186 and 733), with deck 733 containing fewer data elements than indicated above. Due to the difficulty of identifying ice island reports in the GTS data, many of these, particularly in Release 1a data, were classified as ship data and thus included in standard statistics.
----------

{4.
Notes on computational details and update processing changes}

a) Pre-trimming and related issues

For both the standard and enhanced statistics, the currently available data
were subjected (by rules in Tables 4a-4c) to "pre-trimming," which was
intended to replicate the Release 1 untrimmed data processing that took
place during translation from LMR5 into CMR4 (see Release 1, supp. E).
During that Release 1 translation, uncertain duplicate reports were rejected
by testing for dup status (DS), and individual data elements were rejected
by application of global physical limits or NCDC quality control flags. For
Release 2.4, uncertain duplicates were also removed by testing DS, global
physical limits were applied as part of the checks for unusable data (see
Table 2), and the same NCDC flags were applied in the form of composite QC
flags. Pre-trimming differs from Release 1 processing in that the NCDC flags
for DPT and WBT were both used (see Table 3), and in that MEDS flags were
also used to delete data flagged erroneous by MEDS.

In addition, we have now fully discontinued another aspect of pre-trimming,
which had undesirable side effects. During original Release 1a processing
for 1980-92, and its 1992-93 extension, computed U and V wind components
were rounded to the nearest tenth of a meter per second before use in
MST/MSTG. This was considered part of pre-trimming because the CMR format
used for input to Release 1 statistics stored rounded U and V (and scalar
wind had to be reconstructed from U and V). For all MSG processing, for the
MST/MSTG statistics based on Release 1b data, and for updates to Release 1a
starting with the 1990-95 update, we have instead used computed U and V to
full numeric precision. For MST/MSTG statistics, full numeric precision was
used to calculate the mean and standard deviation for variables involving U
and V; but for the sextiles, all variables were rounded because of the
computation method (see Release 1, p. A15).
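The effect of the discontinued rounding can be illustrated with a minimal
sketch. It is not taken from the ICOADS production software: the function
names, the sample report (5.1 m/s from 40 degrees), and the sign convention
(direction the wind blows from, clockwise from north) are assumptions made
only for illustration. It compares scalar wind reconstructed from rounded
U and V with scalar wind from full-precision components.

```python
import math

def wind_components(speed, direction_deg):
    """Return (u, v) in m/s for a wind of `speed` m/s.

    Assumed convention: `direction_deg` is the direction the wind blows
    FROM, in degrees clockwise from north, so both components are negated.
    """
    rad = math.radians(direction_deg)
    return -speed * math.sin(rad), -speed * math.cos(rad)

def scalar_wind(u, v):
    """Reconstruct scalar wind speed from its two components."""
    return math.hypot(u, v)

# Hypothetical single report, not taken from any ICOADS source.
speed, direction = 5.1, 40.0
u, v = wind_components(speed, direction)

# Former pre-trimming behaviour: round each component to 0.1 m/s first.
u_rounded, v_rounded = round(u, 1), round(v, 1)

w_full = scalar_wind(u, v)                      # full numeric precision
w_rounded = scalar_wind(u_rounded, v_rounded)   # reconstructed after rounding

print(f"scalar wind from full-precision U, V: {w_full:.4f} m/s")
print(f"scalar wind from rounded U, V:        {w_rounded:.4f} m/s")
```

Because each component is off by at most 0.05 m/s after rounding, the
reconstructed scalar wind in this sketch can differ from the original by up
to about 0.07 m/s.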
For the MSG statistics, a different method was used to calculate sextiles,
which allows full numeric precision for U, V, and other variables. In
agreement with the now-discontinued pre-trimming procedure that was formerly
used for rounding U and V for all statistics, U and V also were previously
rounded in the dupelim program for comparison with the trimming limits in
order to set the trimming flags. That rounding has also been discontinued in
parallel with the changes made in the pre-trimming of the statistics.

b) Table 4a/4b/4c rules and previous updates

Proprietary WOCE buoy reports (SID=62) were deleted from LMRF for Release 1a
(1980-92) and its 1992-93 extension. As a result, these proprietary data
could be included only in the 2-degree enhanced MST/MSTG, not in the MSG
statistics based on LMRF. However, for the Release 1a 1990-95 update and
extension, and subsequent updates, enough time had elapsed that we were able
to make the WOCE data openly available in LMRF.

An additional time-period deletion rule was previously defined in Table 4a
for original Release 1a 1980-92 processing to remove one erroneously dated
(1993) report from SID=71 (IATTC Fishing Logs); inadvertently this rule was
also used during the 1992-93 extension of Release 1a:

    iv) SID 71: year > 1992

Additional background: New SID=71 data were received as input for the
1992-93 extension, but this rule was carried over from 1980-92 processing,
resulting in accidental omission of all SID=71 data from the MST/MSTG
statistics for 1993. We became aware of this problem prior to creation of
LMRF and of the MSG statistics, thus both LMRF and MSG included the SID=71
data missing in 1993 from the MST/MSTG statistics. Effective with the
Release 1a 1990-95 update and extension, and subsequent updates, this rule
has been discontinued, with the result that the one erroneously dated 1993
report is also included.
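As a rough illustration of why carrying rule iv) into the 1992-93 extension
dropped every 1993 SID=71 report, the rule can be written as a simple
predicate. The `Report` tuple, the function name, and the sample reports
below are hypothetical; only the rule itself (SID 71 with year > 1992) comes
from the text above.

```python
from collections import namedtuple

# Hypothetical minimal report: source ID and year only.
Report = namedtuple("Report", ["sid", "year"])

def rejected_by_rule_iv(report):
    """Time-period deletion rule iv): SID 71 with year > 1992."""
    return report.sid == 71 and report.year > 1992

# Made-up input resembling the 1992-93 extension: new, valid 1993 data
# from SID 71 alongside reports from another source.
reports = [
    Report(sid=71, year=1992),
    Report(sid=71, year=1993),
    Report(sid=71, year=1993),
    Report(sid=62, year=1993),
]

# With the rule carried over, every 1993 SID=71 report is dropped.
kept_with_rule = [r for r in reports if not rejected_by_rule_iv(r)]

# With the rule discontinued, all reports pass this particular test.
kept_without_rule = list(reports)

print(len(kept_with_rule), "kept with rule iv);",
      len(kept_without_rule), "kept without it")
```

The predicate cannot tell the single erroneously dated report from valid
1993 data, which is why discontinuing the rule also lets that one report
through.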
The rules for rejection of non-ship data from the standard statistics are
different in Tables 4a ("not PT=2/5, except PT=missing accepted for deck 888
only") and 4b ("PT > 5, accept any PT=missing"). The Table 4b rule differs
from the Table 4a rule because, prior to 1980, reports with missing PT were
more likely to be ship data, and additional PT values 0-4 for ship data may
be available.

Several complications exist in the way deck 888 (US Air Force Global Weather
Central) was handled. In Release 1b, and in Release 1a products prior to the
1980-97 update, an older version of deck 888 was included (covering
1973-81). Also, at the time these Release 1a products were constructed for
1980-81, we were unable to reliably determine PT in deck 888. However, after
additional investigation into deck 888, rules were devised for Release 1b to
determine PT. The existing Table 4a rule was designed to handle the earlier
1980-81 GWC data. For the 1980-97 update, deck 888 was completely
re-processed and included for the entire 1980-97 period. However, as part of
preconditioning (see ), efforts were made to remove all non-ship data from
deck 888. Therefore, the existing Table 4a rule should have little or no
adverse impact, although changes should be considered for a future update.

Prior to the 1980-97 Release 1a update and extension, ZQ > 1 never appeared.
In the Table 4a documentation used then, however, ZQ was checked for the
more general case of ZQ > missing, rather than ZQ = 1 or 3, with the aim of
covering possible future expansion of ZQ. Unfortunately, problems in
drifting buoy data starting in 1993 (discussed in Table 3) led to the need
to accept ZQ = 2 (i.e., values flagged only by MEDS QC). Note that such
values occur only starting in 1993, thus the difference in rules has no
effect prior to 1993.
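The two platform-type rules quoted at the start of this discussion can be
sketched as predicates. This is an interpretation, not the production code:
"PT=2/5" is read here as "PT equal to 2 or 5", which is an assumption, and
the function names and the use of `None` for a missing value are
hypothetical.

```python
MISSING = None  # hypothetical stand-in for a missing PT value

# ASSUMPTION: "PT=2/5" in the Table 4a rule means PT equal to 2 or 5.
SHIP_PT_4A = (2, 5)

def standard_rejects_4a(pt, deck):
    """Table 4a: reject if not PT=2/5; missing PT accepted for deck 888 only."""
    if pt is MISSING:
        return deck != 888
    return pt not in SHIP_PT_4A

def standard_rejects_4b(pt):
    """Table 4b: reject if PT > 5; any missing PT is accepted."""
    if pt is MISSING:
        return False
    return pt > 5

# A missing PT is treated differently by the two rules unless deck is 888.
# Deck 732 is used here only as an arbitrary non-888 deck number.
for deck in (888, 732):
    print(deck, standard_rejects_4a(MISSING, deck), standard_rejects_4b(MISSING))
```

Under this reading, the practical differences are that Table 4b accepts a
missing PT from any deck and accepts PT values in the 0-4 range that
Table 4a would reject.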
As indicated in a Table 4c footnote, the rule used in Tables 4a-4b
eliminating ZQ = 1 or 3 reports from MSG was found to be inappropriate for
Release 1c data, because many (e.g., Maury Collection) reports had a missing
hour. However, this was discovered only after an initial set of statistics
(1800-1949) was made available in March 2001, in which ZQ = 1 or 3 reports
were rejected, thus significantly reducing the numbers of observations and
spatial coverage in early decades (especially prior to 1860). This also
impacted some users of LMRF data. During August 2001 the problem was
corrected. Additional details are provided here:
    http://icoads.noaa.gov/zq.html
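The ZQ screening history described above can be summarized in a short
sketch. The function names and the `release_1c` argument are hypothetical,
and the sketch assumes that the August 2001 correction amounts to skipping
the ZQ test for Release 1c data; the actual implementation may differ. The
meanings of the individual ZQ values are not defined here beyond what the
text states.

```python
MISSING = None  # hypothetical stand-in for a missing ZQ value

def zq_rejects_old(zq):
    """Older Table 4a wording ("ZQ > missing"): reject any non-missing ZQ."""
    return zq is not MISSING

def zq_rejects_revised(zq):
    """Reject only ZQ = 1 or 3, so that ZQ = 2 (MEDS-only flag) is accepted."""
    return zq in (1, 3)

def zq_rejects(zq, release_1c):
    """ASSUMPTION: the corrected processing skips the test for Release 1c."""
    if release_1c:
        return False
    return zq_rejects_revised(zq)

# The old and revised checks disagree only for ZQ = 2.
for zq in (MISSING, 1, 2, 3):
    print(zq, zq_rejects_old(zq), zq_rejects_revised(zq))
```

Because ZQ = 2 occurs only from 1993 onward, the two checks give identical
results for earlier data, consistent with the note above.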