Release 2.5 Data Characteristics

1. Introduction

We have completed a major ICOADS update, Release 2.5, covering 1662-2007 and including a wide variety of newly available or improved data inputs (Figure 1). The remaining sections of this webpage provide temporal and spatial comparisons between final Release 2.5 and Release 2.4 data for 1662-2007; information about a preliminary set of 1662-1969 data, in which some systematic data continuity problems (mostly during 1941-69) were found that required a complete rerun; and plots illustrating duplicate elimination (dupelim) performance and quality control results.

The quality control ("trimming" flag) comparisons between decks provided in Figures 7a-7f illustrate some systematic QC problems. For example, the trimming limits (see also Figs. 1-4 on this webpage) are missing extensively, across different variables, for near-polar sources (e.g., whaling or expeditionary decks including 186-188, 197, 246, 733-734, 736, 761, 897, 899).

In preparation for the rerun, NODC/OCL World Ocean Database 2005 (WOD05) oceanographic data were re-translated into IMMA format using a new scheme to estimate sea surface temperature (SST) from subsurface ocean profile temperatures. The old (used for Release 2.4 and preliminary Release 2.5 data) and new (used for final Release 2.5 processing) schemes for deriving SST are discussed in this document, with detailed comparison tabulations and charts presented in this spreadsheet.

Figure 1. Major historical data sources added to Release 2.5. Horizontal green lines illustrate the time range of the original data sources. The annual numbers of reports are plotted as curves, blue for the previous Release 2.4, and red for Release 2.5. For clarity the vertical scale is truncated at 9M; years 2005-07 have 13M, 15M, and 16M total reports (not visible) in R2.5, respectively. Data coverage prior to 1800 is very sparse.

2. Release 2.5 4-panel plots

Figures 2a1 through 2i8 provide 4-panel comparison plots between Release 2.5 data and Release 2.4 data (2° enhanced monthly summaries in both cases) for 20-year periods. For details on the 4-panel plot layout see icoads.noaa.gov/panels.html; note that a minimum of 1000 observations (nobs) is required per month, otherwise all curves are omitted, to ensure meaningful data patterns. The humidity plots are omitted for the earliest three periods due to general data sparsity.

2° ENH                   1850-1869   1870-1889   1890-1909   1910-1929   1930-1949   1950-1969   1970-1989   1990-2007
Sea Surface Temperature  Figure 2a1  Figure 2a2  Figure 2a3  Figure 2a4  Figure 2a5  Figure 2a6  Figure 2a7  Figure 2a8
Air Temperature          Figure 2b1  Figure 2b2  Figure 2b3  Figure 2b4  Figure 2b5  Figure 2b6  Figure 2b7  Figure 2b8
Scalar Wind              Figure 2c1  Figure 2c2  Figure 2c3  Figure 2c4  Figure 2c5  Figure 2c6  Figure 2c7  Figure 2c8
U Wind Component         Figure 2d1  Figure 2d2  Figure 2d3  Figure 2d4  Figure 2d5  Figure 2d6  Figure 2d7  Figure 2d8
V Wind Component         Figure 2e1  Figure 2e2  Figure 2e3  Figure 2e4  Figure 2e5  Figure 2e6  Figure 2e7  Figure 2e8
Sea Level Pressure       Figure 2f1  Figure 2f2  Figure 2f3  Figure 2f4  Figure 2f5  Figure 2f6  Figure 2f7  Figure 2f8
Total Cloudiness         Figure 2g1  Figure 2g2  Figure 2g3  Figure 2g4  Figure 2g5  Figure 2g6  Figure 2g7  Figure 2g8
Specific Humidity        -           -           -           Figure 2h4  Figure 2h5  Figure 2h6  Figure 2h7  Figure 2h8
Relative Humidity        -           -           -           Figure 2i4  Figure 2i5  Figure 2i6  Figure 2i7  Figure 2i8
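
The 1000-observation monthly minimum used for these plots can be illustrated with a minimal sketch. The function name, the data layout (one list of monthly values per curve), and the use of None for an omitted point are assumptions made for illustration only; they do not describe the actual ICOADS plotting software, and it is also an assumption that a single monthly count governs all curves.

```python
MIN_NOBS = 1000  # monthly minimum number of observations stated in the text


def mask_sparse_months(nobs, curves):
    """Blank out every curve in months with fewer than MIN_NOBS observations.

    nobs   -- list of monthly observation counts (hypothetical layout)
    curves -- dict mapping a curve name to a list of monthly values
    Returns a new dict; omitted points are represented by None.
    """
    keep = [n >= MIN_NOBS for n in nobs]
    return {
        name: [value if ok else None for value, ok in zip(values, keep)]
        for name, values in curves.items()
    }


# Hypothetical example: the second month falls below the minimum.
masked = mask_sparse_months([1500, 999, 1000], {"r2.5": [1.0, 2.0, 3.0]})
```

In this sketch a month with exactly 1000 observations is kept, following the wording "a minimum of 1000".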

3. Release 2.5 map comparisons

Figures 3a-3d are decadal plots (1800-2007) illustrating data additions between Releases 1, 2.4, and 2.5. The colors show the total number of observations in a 2° box per decade, separately for four primary variables (using the enhanced monthly summaries for both Releases 2.4 and 2.5). The amounts of data added prior to 1800 were very small, so plots for those decades have been omitted. Note that Release 2.4 data began in 1784 and ended in May 2007, but its January-May 2007 data are not plotted; Release 1 data began in 1854.

Sea Surface Temperature (SST)  Scalar Wind (WSPD)  Sea Level Pressure (SLP)  Relative Humidity (RHUM)
Figure 3a                      Figure 3b           Figure 3c                 Figure 3d
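
The quantity plotted in Figures 3a-3d, the number of observations per 2° box per decade, can be sketched as a simple binning exercise. The function name, the (year, latitude, longitude) input layout, and the box indexing (rows counted from 90°S, columns from 0°E) are assumptions made for illustration; they are not the ICOADS box-numbering convention or processing code.

```python
from collections import Counter


def decadal_box_counts(reports, box_deg=2):
    """Count observations per (decade, box row, box column).

    reports -- iterable of (year, lat, lon) tuples, lat in degrees north,
               lon in degrees east (negative west longitudes are accepted)
    box_deg -- box size in degrees (2 for the 2° boxes in the text)
    """
    n_rows = 180 // box_deg
    counts = Counter()
    for year, lat, lon in reports:
        decade = (year // 10) * 10
        # Clamp so that lat == 90 falls in the northernmost row.
        row = min(int((lat + 90) // box_deg), n_rows - 1)
        col = int((lon % 360) // box_deg)
        counts[(decade, row, col)] += 1
    return counts


# Hypothetical reports: two in the 1850s and one in the 1860s, same box.
counts = decadal_box_counts(
    [(1853, 10.5, 200.1), (1857, 11.9, 201.9), (1861, 10.5, 200.1)]
)
```

Coloring each box by its count for a chosen decade would then give a map of the same kind as those in the figures.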

4. Preliminary Release 2.5 and outstanding data continuity questions

As noted above, some systematic data continuity problems, concentrated during 1941-69, were found in the preliminary 1662-1969 data, and these required a complete rerun. This webpage describes the overall characteristics of the preliminary data, with links to subsidiary webpages exploring data homogeneity questions for three major variables. This additional webpage describes some outstanding data continuity questions in the final data.

5. Release 2.5 dupelim results

Figure 4a. Numbers of reports added to, or deleted from, the output Release 2.5 data during dupelim, separately by deck, covering 1792-1877. The deck 156 reports deleted around 1860, without corresponding additions from other decks, were previously undetected duplicates (i.e., they matched other reports already in ICOADS).

Figure 4b. As for Figure 4a, but covering 1878-1949.

Figure 4c. As for Figure 4a, but covering 1950-69.

Figure 4d. As for Figure 4a, but covering 1970-79.

Figure 4e. As for Figure 4a, but covering 1980-2004.

Figure 4f. As for Figure 4a, but covering 2005-2007 (through coads_20090304_seas_jan.gz, coads_20090219_keyed_oct.gz, coads_20090206_bufr_jan.gz, and coadsgcc200804.IMMA.r2.5.Z).

Figure 5a. The larger plot shows total numbers of reports output in the Release 2.5 data, stratified by deck, for 1830-1969 (Tables D6a-D6c). The smaller plot shows patterns for the period 1750-1834, which is almost exclusively composed of data from decks 730 (CLIWOC) and 701 (US Maury).

Figure 5b. Total numbers of reports output in the Release 2.5 data, stratified by deck, for 1970-2007.

6. Release 2.5 quality control results

In Figures 7a through 7f, bars show the percentage frequency (left axis) of each trimming flag value assigned to the observations. The flag values are defined below, where a1 is the individual observation under scrutiny, g is the smoothed median, and s1 and s5 are the smoothed lower and upper median deviations, respectively:
E = data unusable (SF, AF, and PF, only)
C = limits missing (ocean/coastal box)
B = limits missing (ocean/coastal box); MEDS data correct (SF/PF only)
7 = greater than 4.5 sigma upper limit (a1 > g + 4.5*s5)
6 = less than 4.5 sigma lower limit (a1 < g - 4.5*s1)
5 = greater than 3.5 sigma upper limit (g + 3.5*s5 < a1 <= g + 4.5*s5)
4 = less than 3.5 sigma lower limit (g - 4.5*s1 <= a1 < g - 3.5*s1)
3 = greater than 2.8 sigma upper limit (g + 2.8*s5 < a1 <= g + 3.5*s5)
2 = less than 2.8 sigma lower limit (g - 3.5*s1 <= a1 < g - 2.8*s1)
1 = within 2.8 sigma limits (g - 2.8*s1 <= a1 <= g + 2.8*s5)
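
The numeric flag definitions above can be restated as a short sketch. The function name is hypothetical, only flags 1-7 are covered (the E, C, and B cases, which depend on missing limits or unusable data, are not modeled), and s1 and s5 are assumed to be non-negative. This is an illustration of the listed inequalities, not the ICOADS trimming code.

```python
def trimming_flag(a1, g, s1, s5):
    """Return the numeric trimming flag (1-7) for one observation.

    a1 -- the individual observation
    g  -- smoothed median
    s1 -- smoothed lower median deviation (assumed >= 0)
    s5 -- smoothed upper median deviation (assumed >= 0)
    """
    # Test the outermost limits first so each range is matched once.
    if a1 > g + 4.5 * s5:
        return 7
    if a1 < g - 4.5 * s1:
        return 6
    if a1 > g + 3.5 * s5:
        return 5
    if a1 < g - 3.5 * s1:
        return 4
    if a1 > g + 2.8 * s5:
        return 3
    if a1 < g - 2.8 * s1:
        return 2
    return 1


# Hypothetical values: median 20, lower deviation 2, upper deviation 1.
flags = [trimming_flag(a1, 20.0, 2.0, 1.0) for a1 in (20.0, 23.0, 24.0, 25.0)]
```

Because the lower and upper deviations are separate, the limits are asymmetric about the median: with the values above, an observation 4 units above g receives flag 5, while one 4 units below g is still within the 2.8 sigma limits.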
In Figures 7a-7f, the line (right axis) indicates the total number of extant observations per deck during the period (note: log scale). Landlocked 2° boxes (LZ=1) were removed in a final Release 2.5 IMMA processing step (Figure 6), or during processing of earlier ICOADS Releases (for previously available data sources). Decks containing no extant data for a given variable are not shown (e.g., deck 736 "Byrd Antarctic Expedition" includes only SLP).

As an important modification to the trimming procedure for Release 2.5, the 1910-49 trimming limits for RH were used for flagging all RH data within the 1662-1909 period. This approach was used because the original 1854-1909 trimming limits for RH generally were missing (see Figs. 1-4 on this webpage, which also reveal smaller gaps in the earlier trimming limits for other variables including SLP and SST).

sea surf. temp. (SF)  air temp. (AF)  u comp. wind (UF)  v comp. wind (VF)  sea level press. (PF)  rel. hum. (RF)
Figure 7a             Figure 7b       Figure 7c          Figure 7d          Figure 7e              Figure 7f

