A. Valente et al.: A compilation of global bio-optical in situ data
www.earth-syst-sci-data.net/8/235/2016/
Earth Syst. Sci. Data, 8, 235-252, 2016
tually have been modified by the processing routines used by
the repositories or archives. Nevertheless, to minimise these
potential drawbacks, we have, for the most part, incorporated
only datasets that have emerged from the long-term efforts of
the ocean-colour and biological oceanographic communities
to provide scientists with high-quality in situ data and
implemented additional quality checks on the data to enhance
confidence in the quality of the merged product.
In Sect. 2 the methodologies used to harmonise and integrate
all data, as well as a description of individual datasets
acquired, are provided. In Sect. 3 the geographic distribution
and other characteristics of the final merged dataset are
shown. Section 4 provides an overview of the data.
2 Data and methods
2.1 Preprocessing and merging
The compiled global set of bio-optical in situ data described
in this work has an emphasis, though not exclusively, on
open-ocean data from all geographic regions. It comprises
the following variables: remote-sensing reflectance (rrs),
chlorophyll a concentration (chla), algal pigment absorption
coefficient (aph), detrital and coloured dissolved organic
matter absorption coefficient (adg), particle backscattering
coefficient (bbp) and diffuse attenuation coefficient for
downward irradiance (kd). A similar effort of compiling
bio-optical in situ data from different sources has been recently
published by Nechad et al. (2015). Given their focus on selected
coastal regions, most of the data presented here are not
part of their compilation. The variables rrs, aph, adg, bbp and
kd are spectrally dependent, and this dependence is hereafter
implied. The data were compiled from 10 sources of in situ
data (MOBY, BOUSSOLE, AERONET-OC, SeaBASS, NOMAD,
MERMAID, AMT, ICES, HOT, GeP&CO), each described in
Sect. 2.2. The compiled in situ observations have
a global distribution and cover the recent period of satellite
ocean-colour data between 1997 and 2012. The listed variables
were chosen as they are the operational satellite ocean-colour
products of the ESA OC-CCI project, which currently
focuses on the use of three ocean-colour satellite platforms
to create a time series of satellite data: the Medium Resolution
Imaging Spectrometer (MERIS) of ESA, the Moderate
Resolution Imaging Spectroradiometer (MODIS) of NASA,
and the Sea-viewing Wide Field-of-view Sensor (SeaWiFS)
of NASA.
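As an illustration only, the variables, sources and time window listed above
could be represented as follows. This is a minimal sketch, not the layout of
the published database: the Observation record, its field names and the
is_valid check are hypothetical, and treating 1997 and 2012 as inclusive
bounds is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Variables named in the text; True marks those described as spectrally dependent.
VARIABLES = {
    "rrs": True,
    "chla": False,
    "aph": True,
    "adg": True,
    "bbp": True,
    "kd": True,
}

# The 10 in situ data sources named in the text.
SOURCES = (
    "MOBY", "BOUSSOLE", "AERONET-OC", "SeaBASS", "NOMAD",
    "MERMAID", "AMT", "ICES", "HOT", "GeP&CO",
)

# Period covered by the compilation (bounds assumed inclusive).
FIRST_YEAR, LAST_YEAR = 1997, 2012


@dataclass(frozen=True)
class Observation:
    """Hypothetical record layout for one compiled observation."""
    source: str
    variable: str
    year: int
    value: float
    wavelength_nm: Optional[float] = None  # only meaningful for spectral variables


def is_valid(obs: Observation) -> bool:
    """Check an observation against the constraints sketched above."""
    if obs.source not in SOURCES or obs.variable not in VARIABLES:
        return False
    if not FIRST_YEAR <= obs.year <= LAST_YEAR:
        return False
    # Assumption: spectral variables carry a wavelength, non-spectral ones do not.
    return VARIABLES[obs.variable] == (obs.wavelength_nm is not None)
```

For example, is_valid(Observation("MOBY", "rrs", 2005, 0.004, 443.0)) holds,
whereas the same record dated 1990 is rejected.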
Remote-sensing reflectance (rrs) is a primary ocean-colour product
routinely produced by several space agencies. It is defined as
rrs = Lw / Es, where Lw is the upward water-leaving radiance and Es is
the total downward irradiance at sea level. It is related to
irradiance reflectance (Rw) approximately
through rrs = Rw / Q, where Q ranges from 3 to 5
in natural waters and is equal to π for an isotropic (Lambertian)
light field. Another quantity that is often required
is the “normalised” water-leaving radiance (nLw) (Gordon
and Clark, 1981), which is related to remote-sensing reflectance
via rrs = nLw / Fo, where Fo is the top-of-the-atmosphere
solar irradiance. If not directly available, remote-sensing
reflectance was calculated through the equations
described above, depending on the format of the original data.
The original data were acquired in an advanced form (e.g.
time-averaged, extrapolated to surface) from six data sources
particularly designed for ocean-colour validation (MOBY,
BOUSSOLE, AERONET-OC, SeaBASS, NOMAD, MERMAID),
therefore only requiring the conversion to a common
format. In the processing carried out by the space agencies,
the quantity rrs is normalised to a single Sun-viewing geometry
(Sun at zenith and nadir viewing), taking into account the
bidirectional effects as described in Morel and Gentili (1996)
and Morel et al. (2002). Thus, for consistency with the satellite
rrs product, only in situ rrs to which this normalisation
had been applied was included in the compilation.
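The three relations given above for rrs can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not the processing
code used for the compilation: the function names and the dictionary keys are
hypothetical, inputs are assumed to be in mutually consistent units, Q
defaults to π (the isotropic case) when not supplied, and the bidirectional
normalisation is not addressed.

```python
import math


def rrs_from_lw(lw, es):
    """rrs = Lw / Es (water-leaving radiance over downward irradiance)."""
    return lw / es


def rrs_from_rw(rw, q=math.pi):
    """rrs = Rw / Q, approximately; Q defaults to pi (isotropic light field)."""
    return rw / q


def rrs_from_nlw(nlw, f0):
    """rrs = nLw / Fo (normalised water-leaving radiance over solar irradiance)."""
    return nlw / f0


def to_rrs(record):
    """Return rrs from a dict holding one of the supported input forms.

    The keys ("rrs", "lw", "es", "nlw", "f0", "rw", "q") are hypothetical.
    A directly reported rrs takes precedence over any derived value.
    """
    if "rrs" in record:
        return record["rrs"]
    if "lw" in record and "es" in record:
        return rrs_from_lw(record["lw"], record["es"])
    if "nlw" in record and "f0" in record:
        return rrs_from_nlw(record["nlw"], record["f0"])
    if "rw" in record:
        return rrs_from_rw(record["rw"], record.get("q", math.pi))
    raise ValueError("record has no quantity convertible to rrs")
```

For instance, to_rrs({"lw": 0.5, "es": 100.0}) applies rrs = Lw / Es, while a
record that contains only "rw" falls back to rrs = Rw / Q with the default Q.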
Chlorophyll a concentration is the traditional measure for
phytoplankton biomass and one of the most widely used
satellite ocean-colour products (IOCCG, 2008). To validate
satellite-derived chlorophyll a concentration, two different
variables were compiled: one of these represents chlorophyll a
measurements made through fluorometric or spectrophotometric
methods, referred to hereafter as chla_fluor,
and the other is the chlorophyll concentration derived from
HPLC (high-performance liquid chromatography) measurements,
referred to hereafter as chla_hplc. The chlorophyll
data were compiled from eight data sources: BOUSSOLE,
SeaBASS, NOMAD, MERMAID, AMT, ICES, HOT and
GeP&CO. One requirement for chla_fluor measurements
was that they were made using in vitro methods (i.e. based
on extractions of chlorophyll a). Although this severely
decreased the number of observations, since in situ fluorometry
(e.g. fluorometers mounted on CTDs) is widely available
in oceanographic databases, it was decided to exclude such
data because of potential problems with the calibration of
in situ fluorometers. The variable chla_hplc was calculated
by summing all reported chlorophyll a derivatives, including
divinyl chlorophyll a, epimers, allomers and chlorophyllide a.
The two chlorophyll variables are retained separately
in the database to facilitate their use. HPLC measurements
are considered of higher quality, but fluorometric measurements
are more abundant. Thus one option for users is to use
chla_fluor only when there are no chla_hplc measurements
available. To be consistent with satellite-derived chlorophyll
values, which are derived from the light emerging from the
upper layer of the ocean, all chlorophyll observations found
in the top 10 m (replicates at the same depth or measurements
at multiple depths) were averaged if the coefficient of variation
among observations was less than 50 %; otherwise they
were discarded. The averages were then assigned to the surface.
The depth of 10 m was chosen as a compromise between
clear oligotrophic and turbid eutrophic waters. Other
methods, such as chlorophyll depth averages using local
attenuation conditions (Morel and Maritorena, 2001), require