A. Valente et al.: A compilation of global bio-optical in situ data
(Figure 8 image: global map; the legend shows chlorophyll-a concentration intervals, [chla] in mg m−3.)
Figure 8. Global distribution of chlorophyll-a concentration per interval of the observed value. All chlorophyll data were considered, but for
a given station, HPLC data were selected if available.
(Figure 9 image: global map; legend entries include seabass, nomad, amt, ices, mermaid, gepco, palmer, biochem, calcofi, ccelter, bats, tpss, tara, maredat, arcsspp, awi, barentssea, seadatanet, bodc, imos, boussole, hot and estoc.)
Figure 9. Global distribution of chlorophyll-a concentration per data set in the final table. All chlorophyll data were considered, but for a
given station, HPLC data were selected if available. Crosses show sites from where data of chlorophyll are available in a specific geographic
location.
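The selection rule stated in the captions of Figs. 8 and 9 (use HPLC chlorophyll for a station when it exists, otherwise fall back to the other chlorophyll measurement) can be illustrated with a minimal sketch. The name "chla_hplc" follows the variable name used in the text; "chla_fluor", the helper `select_chla`, and the station values are hypothetical and serve only as illustration, not as the processing code used for the compilation.

```python
from typing import Optional

def select_chla(chla_hplc: Optional[float],
                chla_fluor: Optional[float]) -> Optional[float]:
    """Return the HPLC value when present, otherwise the non-HPLC value.

    Both arguments are chlorophyll-a concentrations in mg m-3; None marks
    a missing measurement. "chla_fluor" is an assumed name for the
    non-HPLC measurement.
    """
    if chla_hplc is not None:
        return chla_hplc
    return chla_fluor

# Hypothetical stations: (station id, HPLC chla, non-HPLC chla).
stations = [
    ("st1", 0.12, 0.15),   # both available: HPLC is kept
    ("st2", None, 0.40),   # only the non-HPLC value is available
    ("st3", None, None),   # no chlorophyll data at this station
]

selected = {sid: select_chla(hplc, fluor) for sid, hplc, fluor in stations}
print(selected)
```

A station without any chlorophyll value stays empty (None) rather than being filled, so it would simply not appear on a map such as those in Figs. 8 and 9.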
and Gentili, 1996; Morel et al., 2002). The compiled variables were “chla_hplc” and “rrs”.
3 Results
In this work, several sets of bio-optical in situ data were ac-
quired, homogenized, and merged into a single unified data
set. The data set comprises in situ observations between 1997
and 2021, with a global distribution, and includes the following variables: “rrs”, “chla”, “aph”, “adg”, “bbp”, “kd”, and
“tsm”. All observations were processed in such a way that
they can be compared directly with satellite-derived ocean
colour data. The compiled data set corresponds to a table
with a total of 151673 rows and 3458 columns. Each row
represents a unique station in space and time, separated from
the rest by at least 5 min and 200 m. For each variable at a
given station, three metadata strings are provided: “dataset”,
“subdataset”, and “contributor”. The columns of the table
take the form described in Table 1. The data contributors
are indicated in Table 2. Regarding spectral variables, all
original wavelengths were preserved, which required many
unique wavelengths to be maintained in the database. No
band shifting was performed (though some archived data in
some data sources may have been merged with nearby wave-
lengths) and no minimum number of wavelengths per obser-
vation was imposed. This allowed further manipulation of the
data set for different purposes. In the following paragraphs,
the final group of observations is described in terms of each
variable and the corresponding contributing data sets; how-
ever, it is important to note that the numbers reported here
do not reflect the original numbers in each contributing data
set, since observations close in time and space were averaged and quality controls were applied. Furthermore, duplicates across contributing data sets were removed (e.g. NOMAD and others, such as MOBY, were removed from MERMAID; also, data of individual projects, such as PALMER
https://doi.org/10.5194/essd-14-5737-2022
Earth Syst. Sci. Data, 14, 5737–5770, 2022