15 May 2014
DANGENDORF ET AL.
3591
Fig. 6. Reconstruction of surges based on 20CRv2 winds and
SLP. The 10-yr moving averages of the 99.9th percentiles of observed
surges ± standard deviations (black line with gray shaded
area; the standard deviation has been computed as a measure of
variability over each 10-yr window) and their reconstructions based
on 20CRv2 (light red: individual ensemble members; dark red:
ensemble mean; both normalized to a common period from 1950 to
2011: i.e., the mean has been removed) are presented. Differences
between both are shown in blue. Annual efficiency criteria between
observed and reconstructed daily surges are presented in black.
The red shaded area marks the period for which significant differences
between observations and 20CRv2 are detected. The
different shades demonstrate the gradual increase of inconsistencies
before the 1910s.
for the year 1883), exceeding the mean of the calibration
period by approximately 150%. As shown before
with the stationary correlation of observed surge levels
with the NSCI since 1850 (Fig. 5b), the deviations of
observed surge levels from those predicted through
20CRv2 are unlikely to be caused by the observational
record (the NSCI and the storm surge record are measured
independently).
A similar picture is retrieved by comparing the 10-yr
moving averages of the 99.9th percentile time series of
observed and statistically reconstructed surges (Fig. 6).
Over the past 100 yr, the reconstruction fits the
observations well. The model predicts the known decline in
storminess in the mid-twentieth century, the rapid increase
until the mid-1990s, and the downturn afterward.
Nevertheless, going back in time from the early 1910s, the
prediction starts to decrease in a manner not visible in the
observations (for both the ensemble mean and the ensemble
spread). This decrease finally results in significant positive
long-term trends over the entire reanalysis period
from 1871 to 2010 if 20CRv2 is taken as predictor (note
that a similar behavior was also observed for the 95th
percentile; Fig. S1 of the supplementary material).
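The smoothing behind this comparison can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes that the percentile is taken per calendar year from daily surges, that percentiles are linearly interpolated between order statistics, and that only full 10-yr windows are averaged. The function names and the input layout (a dict mapping each year to its daily surge values) are hypothetical.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty sample is undefined")
    pos = (len(xs) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def annual_percentiles(surges_by_year, q=99.9):
    """Return (years, percentile series) from a {year: [daily surges]} mapping."""
    years = sorted(surges_by_year)
    return years, [percentile(surges_by_year[y], q) for y in years]


def moving_average(series, window=10):
    """Mean over each full window of `window` consecutive values."""
    return [mean(series[i:i + window]) for i in range(len(series) - window + 1)]


def remove_reference_mean(years, series, start=1950, end=2011):
    """Subtract the mean over the common period (assumed inclusive on both ends)."""
    reference = [v for y, v in zip(years, series) if start <= y <= end]
    offset = mean(reference)
    return [v - offset for v in series]
```

Applying the same three steps to the observed record and to each reconstructed series puts them on a common footing, so their difference can be read directly as the kind of residual shown in blue in Fig. 6.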
Related to this, Brönnimann et al. (2012) demonstrated
that the ensemble mean appears to be biased
toward lower wind speeds during earlier decades. They
recommended the use of single ensemble members
rather than the ensemble mean when investigating long-term
changes. To examine whether the results from
Fig. 6 are influenced by such biases, we additionally
evaluated long-term changes in each ensemble member
separately. First, we calculated the differences between
the percentile time series from each 20CRv2 ensemble
member prediction and the observed time series. Then,
in a second step, we computed linear trends for each of
the residual time series. The results show that for each
percentile all ensemble members point to significant
positive long-term changes, which are further significantly
different from the observations (Fig. 7a). Additionally,
we found that the residual trends generally
increase with the order of the percentiles: that is, the highest
deviations are found within the highest percentiles.
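The two-step procedure (member minus observation, then a linear trend of the residual) can be illustrated as follows. This is a sketch under stated assumptions, not the authors' implementation: the trend is an ordinary least-squares fit, the standard error ignores serial correlation, and the names `linear_trend` and `residual_trends` are hypothetical. The significance test actually used in the paper is not specified in this passage.

```python
import math


def linear_trend(t, y):
    """Ordinary least-squares slope of y against t, with its standard error."""
    n = len(t)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired values")
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    sse = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    stderr = math.sqrt(sse / (n - 2) / sxx)  # assumes uncorrelated errors
    return slope, stderr


def residual_trends(years, observed, members):
    """Trend of (member - observed) for each ensemble member's percentile series."""
    results = []
    for member in members:
        residual = [m - o for m, o in zip(member, observed)]
        results.append(linear_trend(years, residual))
    return results
```

With this sign convention, a positive slope means the reconstruction rises relative to the observations over time, which is what a low bias in the early decades would produce.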
To determine the exact timing from which the 20CRv2-generated
surges start to deviate significantly from the
observations, we further computed linear trends for
the residual time series over 30-yr moving windows. The
results are shown in Figs. 7b,c for the 99th and 99.9th
percentiles, respectively. The trend estimates scatter
around zero back to approximately 1910; before that
time, statistically significant differences are found for the
ensemble mean as well as each individual ensemble
member. While we can confirm the bias of the ensemble
mean reported by Brönnimann et al. (2012) (Fig. 7b), our
results also illustrate that using individual members cannot
improve the results significantly (when assessing the
long-term behavior of storm surges in the German Bight).
The reanalysis is significantly biased toward a lower occurrence
of extreme values in the period prior to 1910 in
both the ensemble mean and all members (in this
region).
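The moving-window analysis used to date the onset of the deviation can be sketched as below. This is an illustrative reading of the text rather than the authors' code: it assumes annual residual values, windows of 30 consecutive years shifted by one year, and an ordinary least-squares slope per window. The helper names are hypothetical, and no significance test is included.

```python
def ols_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return sxy / sxx


def moving_window_trends(years, residual, window=30):
    """Slope of the residual series in every full window of `window` years.

    Returns a list of (first_year, last_year, slope) tuples, one per window.
    """
    trends = []
    for start in range(len(years) - window + 1):
        t = years[start:start + window]
        y = residual[start:start + window]
        trends.append((t[0], t[-1], ols_slope(t, y)))
    return trends
```

For a residual series that is flat after about 1910 and drifts before it, the window slopes stay near zero for recent windows and depart from zero once the windows reach into the earlier period, which is the pattern described for Figs. 7b,c.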
The decreasing coherence between reanalysis forcing
and observed surges is generally in line with increasing
uncertainties in the reanalysis because of fewer assimilated
observational data in the earlier periods (Compo
et al. 2011; Krueger et al. 2013b). The results therefore
partly confirm the inconsistencies between storm