## 1. Introduction

Numerous approaches have been developed and used in operational atmospheric data assimilation to estimate and model the background-error covariances. These covariances play an important role in determining the influence of assimilated observations on the three- or four-dimensional multivariate analysis of the atmospheric state. Within variational data assimilation approaches, the background-error covariances must be specified. Several approaches are commonly used to generate an ensemble of background-error realizations for this purpose, including the so-called National Meteorological Center (NMC; now known as National Centers for Environmental Prediction) method (Parrish and Derber 1992) and Monte Carlo approaches applied to an existing data assimilation system (e.g., Pereira and Berre 2006). Using such an ensemble, the covariances can then be modeled either by computing the ensemble variances in spectral space (Courtier et al. 1998; Gauthier et al. 1999) or wavelet space (Fisher and Andersson 2001; Deckmyn and Berre 2005) or by applying spatial covariance localization to the raw sample covariances (Lorenc 2003; Buehner 2005). Similarly, recursive filters have been used to model correlations efficiently, while partially relaxing the assumptions of homogeneity and isotropy (Wu et al. 2002; Purser et al. 2003). A review of some of these approaches is given by Buehner (2010).

In the case of the ensemble Kalman filter (EnKF), ensembles of model forecasts representative of the uncertainty in the background state are produced as part of the EnKF approach. When assimilating observations, the background-error covariances are typically estimated by applying spatial covariance localization to the raw sample covariances (Houtekamer and Mitchell 2001; Hamill et al. 2001). While most applications use simple prescribed functions for spatial localization, more sophisticated approaches of adaptive localization have been proposed (Anderson 2007; Bishop and Hodyss 2009).

Since 2005, the global deterministic analysis of the Meteorological Service of Canada (MSC) has been produced using a four-dimensional variational data assimilation (4D-Var) system (Gauthier et al. 2007). An EnKF approach (Houtekamer and Mitchell 2005) has also been used operationally since 2005 to supply the initial conditions for the MSC global ensemble prediction system (EPS). The operational 4D-Var and EnKF data assimilation systems both use the Global Environmental Multiscale (GEM) model (Côté et al. 1998), but with different configurations. A study was recently conducted to compare the two data assimilation systems in the context of initializing global deterministic forecasts (Buehner et al. 2010a,b). In addition to comparing the 4D-Var and EnKF using configurations similar to the operational systems, several additional configurations of the variational data assimilation approach were considered. These used background-error covariances in the variational system that were estimated from the ensemble of background states produced by the EnKF. To facilitate the comparison with the EnKF results, the approach of estimating the background-error covariances from these ensembles closely followed the approach used in the EnKF itself. That is, spatial covariance localization was applied to the raw sample covariances using the same horizontal and vertical localization functions as in the EnKF.

The goal of the present study is to examine alternative approaches for estimating the flow-dependent background-error covariances from ensembles of EnKF background states. The spatial covariance localization approach as used in Buehner et al. (2010a,b) is compared with the wavelet-diagonal approach (Fisher and Andersson 2001) used operationally at the European Centre for Medium-Range Weather Forecasts (ECMWF). Both approaches significantly reduce the error present in the raw sample covariances obtained when using relatively small ensembles of *O*(100) members. In addition, a new approach that shares aspects of both the spatial localization and wavelet-diagonal approaches is evaluated. The new approach involves the localization of spatial correlations both in the spatial and spectral domains. Spectral localization has been shown to be equivalent to performing a local spatial averaging of the correlation functions (Buehner and Charron 2007).

The next section describes the approaches considered for estimating the background-error covariances from an ensemble of error realizations. Section 3 presents a detailed examination of the different approaches in the context of estimating the horizontal correlations of the 500-hPa temperature field. Section 4 presents results from a series of 1-month three-dimensional variational data assimilation (3D-Var) experiments with real observations, in which the spatial localization and new spatial/spectral localization approaches are compared in terms of their impact on the quality of medium-range forecasts. Finally, the conclusions are given in section 5.

## 2. Description of approaches for estimating background-error covariances

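Consider an ensemble of background-error realizations **e**_{s}, for *s* = 1, … , *N*_{ens}. The sample estimate of the background-error covariance matrix is

$$\mathbf{B} = \frac{1}{N_{ens}-1}\sum_{s=1}^{N_{ens}} \left(\mathbf{e}_s - \overline{\mathbf{e}}\right)\left(\mathbf{e}_s - \overline{\mathbf{e}}\right)^{\mathrm{T}},$$

where *N*_{ens} is the ensemble size and the overbar denotes the ensemble mean. To allow the use of such a covariance matrix in a variational data assimilation system, one common formulation requires the square root of the covariance matrix and its transpose or adjoint (e.g., Gauthier et al. 1999). For the sample estimate, a convenient rectangular square root can be written as

$$\mathbf{B}^{1/2} = \frac{1}{\sqrt{N_{ens}-1}}\left[\mathbf{e}_1 - \overline{\mathbf{e}}, \ldots, \mathbf{e}_{N_{ens}} - \overline{\mathbf{e}}\right].$$

Alternatively, by defining the normalized ensemble perturbations as

$$\mathbf{e}'_s = \frac{\mathbf{e}_s - \overline{\mathbf{e}}}{\sqrt{N_{ens}-1}},$$

the square root matrix can be written more simply as

$$\mathbf{B}^{1/2} = \left[\mathbf{e}'_1, \ldots, \mathbf{e}'_{N_{ens}}\right].$$

It is also convenient to separate the estimation of the variances from the correlations, the latter being the main focus of this study. Consequently, the ensemble perturbations are further normalized by the ensemble standard deviation, as denoted by […], where […] *N*_{ens}, the number of columns in the square root matrix. Analogous to the control variable transformations used in other applications of the variational data assimilation approach (e.g., Gauthier et al. 1999), the correction to the background state is computed at each iteration as

$$\Delta\mathbf{x} = \mathbf{B}^{1/2}\boldsymbol{\xi},$$

where **ξ** is the control vector of dimension *N*_{ens}.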
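The sample covariance, its rectangular square root, and the control-vector transform described above can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the array sizes, the random test ensemble, and the names `sqrt_sample_covariance`, `sqrt_b`, and `xi` are hypothetical and do not come from the operational system.

```python
import numpy as np

def sqrt_sample_covariance(ensemble):
    """Rectangular square root of the sample covariance.

    ensemble: array of shape (n_state, n_ens), one member per column.
    """
    n_ens = ensemble.shape[1]
    mean = ensemble.mean(axis=1, keepdims=True)
    # Normalized perturbations: (e_s - mean) / sqrt(N_ens - 1)
    return (ensemble - mean) / np.sqrt(n_ens - 1)

rng = np.random.default_rng(0)
ens = rng.standard_normal((5, 12))     # hypothetical: 5 grid points, 12 members

sqrt_b = sqrt_sample_covariance(ens)   # shape (5, 12): one column per member
b_sample = sqrt_b @ sqrt_b.T           # sample covariance, shape (5, 5)

# Separate variances from correlations by dividing by the ensemble std dev.
std = np.sqrt(np.diag(b_sample))
sqrt_c = sqrt_b / std[:, None]
c_sample = sqrt_c @ sqrt_c.T           # sample correlation, unit diagonal

# Control-vector transform: one control variable per ensemble member.
xi = rng.standard_normal(ens.shape[1])
dx = sqrt_b @ xi                       # correction to the background state
```

Because the square root has only *N*_{ens} columns, the correction is confined to the subspace spanned by the ensemble perturbations, which is one motivation for the localization approaches discussed next.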

### a. Spatial covariance localization

### b. Wavelet diagonal

[…] (*j*), given by […], where […] *N*_{ens} members and *β* represents a normalization, described below, which is required to obtain the correct spatially averaged gridpoint variance.

[…] *J* horizontal scale ranges as the functions *ψ*_{j} for *j* = 1, … , *J*. These functions depend only on the total wavenumber *n*, and are defined as the square root of a set of overlapping piecewise linear functions. These functions are chosen such that

$$\sum_{j=1}^{J} \psi_j^2(n) = 1$$

for all values of *n* (e.g., see Fisher and Andersson 2001; Pannekoucke 2009). This ensures that by simply applying the bandpass filters a second time and then summing over *j*, the original gridpoint state is recovered. To reduce the dimension of the state vector in wavelet space, a set of reduced-resolution grids are used with horizontal resolution only high enough to represent the scales included in each range of horizontal scale. Therefore the wavelet transform of the gridpoint state **x** can be written as […], where **Ψ**_{j} is a diagonal matrix with the values of *ψ*_{j} along the diagonal. And the left inverse of […] **x**^{wl} in wavelet space is defined as […]. The spectral transform […] *ψ*_{j}.

[…] **r**_{1}, **r**_{2}, *j*_{1}, and *j*_{2}, which is two horizontal locations (**r**_{1} and **r**_{2}) and two bands of horizontal scale (*j*_{1} and *j*_{2}). It is the process of making the correlation matrix diagonal in wavelet space (with respect to both location and scale) that is responsible for modifying the original sample correlations. As shown by Pannekoucke et al. (2007) and in the results below, eliminating the off-diagonal correlations in wavelet space has the effect of filtering the correlations in gridpoint space. The process of diagonalizing the correlation matrix in wavelet space also necessitates a normalization of the wavelet variances. It was found that the required normalization factor *β* is inversely proportional to the sum over all total wavenumbers *n* of [(2*n* + 1)*ψ*_{j}(*n*)^{2}]. Consequently, the normalization increases the variances for narrow wave bands relative to those for broad wave bands.
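A minimal sketch of such bandpass filters is given below, assuming that the "overlapping piecewise linear functions" are triangular hats that peak at one chosen wavenumber and fall to zero at the neighbouring peaks. The function name `bandpass_filters` and this hat construction are assumptions made for illustration; the peak wavenumbers are the ones quoted later in the text for the wavelet-diagonal experiments. The normalization is computed only up to an unspecified constant.

```python
import numpy as np

def bandpass_filters(peaks, n_max):
    """Return psi[j, n] for total wavenumbers n = 0..n_max.

    Each psi_j is the square root of a piecewise linear 'hat' that equals
    one at peaks[j] and falls linearly to zero at the neighbouring peaks.
    """
    n = np.arange(n_max + 1)
    peaks = np.asarray(peaks, dtype=float)
    # Interpolating each unit vector between the peaks gives hats that sum to one.
    hats = np.stack([np.interp(n, peaks, unit) for unit in np.eye(len(peaks))])
    return np.sqrt(hats)

psi = bandpass_filters([0, 1, 2, 4, 8, 16, 32, 64, 99], n_max=99)

# Sum over j of psi_j(n)^2, evaluated for every total wavenumber n.
sum_sq = (psi ** 2).sum(axis=0)

# Relative normalization: inverse of sum_n (2n + 1) psi_j(n)^2 (constant omitted).
n = np.arange(psi.shape[1])
beta_rel = 1.0 / ((2 * n + 1) * psi ** 2).sum(axis=1)
```

With these hats the squared filters sum to one at every wavenumber, and the narrow low-wavenumber bands receive a larger relative normalization than the broad high-wavenumber bands.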

### c. Spatial/spectral covariance localization

[…] *ψ*_{j}. The rank of this matrix is equal to *J*. This is consistent with the fact that the number of columns in the square root correlation matrix is increased by a factor of *J*, as seen by comparing (6) with (20) in the context of spectral localization only, or (9) with (18)–(19) in the context of spectral and spatial localization.

It was shown by Buehner and Charron (2007) that the localization of correlations in spectral space is equivalent to spatial averaging of the correlation functions in gridpoint space and that this can result in improved estimates of spatial correlations. Therefore this aspect of the spatial/spectral localization approach represents a potential improvement over the basic spatial localization approach used in many applications of the EnKF. The expected improvement results from a reduction in sampling error, while still capturing to a large extent the heterogeneous and anisotropic aspects of the original sample correlations. This is unlike the wavelet-diagonal approach, in which the assumption of diagonal correlations in wavelet space (i.e., the elimination of both spatial and between-scale correlations) modifies the original sample correlations to a much greater extent and results in nearly isotropic correlations, as will be demonstrated later. The elimination of all correlations in wavelet space also appears to necessitate choosing very narrow bandpass-filtered functions for small wavenumbers to adequately reproduce the large-scale component of the correlations. As demonstrated by Fisher and Andersson (2001), the wavelet-diagonal approach results in correlations in spectral space that are the result of interpolating between correlations at only a few total wavenumbers as determined by the choice of wavelet functions. Consequently, the use of broad filter functions can result in a very unrealistic correlation spectrum when the real spectrum does not vary linearly. In contrast, with the new spatial/spectral localization approach, the use of broader bandpass-filtered functions does not result in a poor approximation of the spatial correlations. In the limiting case of a single bandpass filter equal to one for all wavenumbers, the spatial/spectral localization approach is equivalent to the standard spatial localization approach.

Figure 1 shows the effective spectral localization matrix when using four spectral bands nearly equally separated between total wavenumbers 0 and 99. Even though the spectral filters are chosen with nearly equal separation, the effective localization with respect to total wavenumber is not homogeneous. This means that for some wavenumbers more localization will be applied than for other wavenumbers. This variation in the extent of spectral localization for different wavenumbers can cause the diagonal elements of the sample correlation matrix in gridpoint space to be different from one. This is because, for nonhomogeneous ensemble correlations, the distribution of variance with respect to horizontal scale varies as a function of location. Since correlation localization in spectral space is equivalent to spatial averaging of the correlation functions in gridpoint space, an unequal amount of spectral localization corresponds to an unequal amount of spatial averaging for different horizontal scales. Consequently, spatial variation in the spectral distribution of variance may lead to local increases or decreases in the total variance at different locations due to spectral localization. The wavelet-diagonal approach also produces spatial correlations for zero separation distance that are different from one. Typically, applications of the wavelet-diagonal approach employ narrower wavenumber bands for low wavenumbers than for high wavenumbers. Figure 2 shows the effective spectral localization matrix when using such wavenumber bands.
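One way to picture an effective spectral localization matrix is sketched below. It assumes that spectral localization amounts to replacing each perturbation by its *J* band-filtered copies, so that the spectral correlation between total wavenumbers *n*_{1} and *n*_{2} is multiplied by the sum over *j* of *ψ*_{j}(*n*_{1})*ψ*_{j}(*n*_{2}). Both this expression and the triangular hat filters in `bandpass_filters` are assumptions used for illustration, not a reproduction of the figures.

```python
import numpy as np

def bandpass_filters(peaks, n_max):
    """psi[j, n]: square roots of piecewise linear hats peaking at `peaks`."""
    n = np.arange(n_max + 1)
    peaks = np.asarray(peaks, dtype=float)
    hats = np.stack([np.interp(n, peaks, unit) for unit in np.eye(len(peaks))])
    return np.sqrt(hats)

# Four bands peaking at total wavenumbers 0, 33, 66, and 99.
psi = bandpass_filters([0, 33, 66, 99], n_max=99)

# Element [n1, n2] is sum_j psi_j(n1) * psi_j(n2).
loc_spec = psi.T @ psi                      # shape (100, 100)

rank = np.linalg.matrix_rank(loc_spec)      # limited by the number of bands
row_sum = loc_spec.sum(axis=1)              # crude measure of localization width
```

In this sketch the diagonal is exactly one, the rank equals the number of bands, and the row sums vary with wavenumber, which is one simple way to see that the amount of localization is not homogeneous even for equally spaced filters.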

[…] *j*, such that […]. It will be shown later that the optimal amount of spatial localization does appear to vary as a function of horizontal scale. Consequently, this added flexibility is another source of potential improvement over the basic spatial localization approach. The ability to apply more severe spatial localization for small scales than for large scales may result in improved analysis quality over the large range of spatial scales present in state-of-the-art global analysis systems with ever-increasing horizontal resolution.

Unlike the wavelet-diagonal approach, the new spatial/spectral localization approach can be applied both in variational and most typical EnKF data assimilation schemes. For application in a variational scheme that uses preconditioning according to the background-error covariance matrix, it has already been shown how the square root of the correlation matrix can be formulated and from which its adjoint can easily be obtained. For application in a typical EnKF data assimilation scheme that already uses spatial localization, it may be straightforward to use ensemble perturbations that have been filtered by each of the specified bandpass filters in place of the original ensemble perturbations. Therefore, the EnKF algorithm would be almost unchanged except that an expanded ensemble of *N*_{ens} × *J* members is used. To date, tests with spatial/spectral localization have used values of *J* less than 10.
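The expanded ensemble of *N*_{ens} × *J* band-filtered perturbations can be illustrated on a one-dimensional periodic domain, with the discrete Fourier transform standing in for the spherical-harmonic transform of a global model. The grid size, ensemble size, filter peaks, and the function names `filter_band` and `expand_ensemble` are hypothetical choices for this sketch.

```python
import numpy as np

def bandpass_filters(peaks, n_max):
    """psi[j, n]: square roots of piecewise linear hats peaking at `peaks`."""
    n = np.arange(n_max + 1)
    peaks = np.asarray(peaks, dtype=float)
    hats = np.stack([np.interp(n, peaks, unit) for unit in np.eye(len(peaks))])
    return np.sqrt(hats)

def filter_band(fields, psi_j):
    """Apply one bandpass filter to each column of `fields` (periodic 1D grid)."""
    n_grid = fields.shape[0]
    spec = np.fft.rfft(fields, axis=0)               # wavenumbers 0..n_grid//2
    return np.fft.irfft(psi_j[:, None] * spec, n=n_grid, axis=0)

def expand_ensemble(perts, psi):
    """Stack the J filtered copies of each member: (n_grid, n_ens * J)."""
    return np.concatenate([filter_band(perts, psi_j) for psi_j in psi], axis=1)

rng = np.random.default_rng(1)
n_grid, n_ens = 64, 8
perts = rng.standard_normal((n_grid, n_ens))         # stand-in perturbations
psi = bandpass_filters([0, 11, 21, 32], n_max=n_grid // 2)

expanded = expand_ensemble(perts, psi)               # 8 * 4 = 32 columns

# Filtering a second time and summing over the bands recovers the members.
recovered = sum(filter_band(filter_band(perts, psi_j), psi_j) for psi_j in psi)
```

In an EnKF that already applies spatial localization, the columns of `expanded` would take the place of the original perturbations. The domain-summed variance is unchanged by the expansion, although the variance at individual grid points can differ from that of the original ensemble.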

## 3. Detailed examination of horizontal correlations

[…] for *s* = 1, … , *N*_{ens}, and where each vector *η*_{s} contains independent random numbers drawn from a Gaussian distribution with a variance of one and mean of zero. The resulting ensemble members were then normalized by the ensemble standard deviation to ensure the sample standard deviation was exactly 1. For most of the results shown in this section, a 48-member ensemble computed in this way is used. A selection of columns from the raw sample correlation matrix and the estimates using the spatial localization, spatial/spectral localization, wavelet-diagonal, and spectral-diagonal approaches were obtained.

### a. Scale dependence of optimal spatial localization

The first aspect examined is the possible relationship between the optimal amount of spatial localization and horizontal scale. A technique similar to that used by Houtekamer and Mitchell (1998) was applied to evaluate the distance over which the horizontal correlations can be accurately estimated from an ensemble. Beyond this distance covariance localization can be used to significantly reduce the amplitude of the correlations. The original 96-member EnKF ensemble was divided into two 48-member subensembles and the sample estimate of the horizontal correlations obtained from each for a large number of locations distributed over the globe. All of the resulting correlations are combined to obtain the average correlation 〈*ρ*〉 (where 〈〉 represents a spatial average for all correlations with a given separation distance), the root-mean-square correlation 〈*ρ*^{2}〉^{1/2}, and the root-mean-product of the correlations from the two subensembles 〈*ρ*_{1}*ρ*_{2}〉^{1/2}.

The result is shown in Fig. 3 when using four equally spaced bandpass filters that peak at total wavenumbers 0, 33, 66, and 99 in Figs. 3a–d, respectively. The result of computing these quantities using the unfiltered ensemble members is shown in Fig. 3e. The distance over which the mean correlations (solid curves) become nearly zero has a clear dependence on horizontal scale. The correlations computed from the lowest set of total wavenumbers are close to zero beyond 3000 km (Fig. 3a) and for the highest set of wavenumbers, the correlations become nearly zero at 500 km (Fig. 3d). For the unfiltered correlations, the mean correlations nearly reach zero at 2000 km. At a particular location, however, the estimated correlations may be larger or smaller than the spatially averaged correlations either due to real nonhomogeneous correlations that differ from the spatial average or due to sampling error. Both types of variations from the average will contribute to the root-mean-square correlations (dotted curves), but if the sampling errors in the two subensembles are independent of each other, the root-mean-product of the correlations (dashed curves) will only include the variations in the real correlations. Therefore, at distances where the root-mean-product is much smaller than the root-mean-square correlation, the variations in the correlations are dominated by sampling error, and spatial localization should be used to significantly reduce the estimated correlations.
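The reasoning behind the split-ensemble diagnostic can be illustrated with synthetic numbers. In the sketch below, `true_rho` plays the role of the real, spatially varying correlations at one separation distance, and independent noise is added for each subensemble. The noise amplitudes, the pooling of both subensembles for the mean and root-mean-square, and the clipping of a negative mean product to zero are all assumptions of this example.

```python
import numpy as np

def split_ensemble_stats(rho1, rho2):
    """Mean, root-mean-square, and root-mean-product of paired correlations."""
    rho1 = np.asarray(rho1, dtype=float)
    rho2 = np.asarray(rho2, dtype=float)
    both = np.concatenate([rho1, rho2])
    mean = both.mean()
    rms = np.sqrt(np.mean(both ** 2))
    # Assumption: clip at zero so that the square root stays real.
    rmp = np.sqrt(max(np.mean(rho1 * rho2), 0.0))
    return mean, rms, rmp

rng = np.random.default_rng(2)
n_pairs = 200_000
true_rho = 0.10 + 0.05 * rng.standard_normal(n_pairs)   # real spatial variations
rho1 = true_rho + 0.15 * rng.standard_normal(n_pairs)   # subensemble 1 estimate
rho2 = true_rho + 0.15 * rng.standard_normal(n_pairs)   # subensemble 2 estimate

mean, rms, rmp = split_ensemble_stats(rho1, rho2)
true_rms = np.sqrt(np.mean(true_rho ** 2))
```

Because the two noise terms are independent, their product averages to nearly zero, so `rmp` stays close to `true_rms` while `rms` is inflated by the sampling error.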

Based on the results in Fig. 3, it appears that more spatial localization should be applied for the small horizontal scales than for the large scales in the spatial/spectral localization approach. Consequently, different distances were chosen for each horizontal scale over which the fifth-order function of Gaspari and Cohn (1999) becomes zero.
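For reference, the fifth-order piecewise rational function of Gaspari and Cohn (1999) can be written as below. The function name `gaspari_cohn` and the parameterization by the distance at which the function reaches zero are choices made for this sketch. The four distances are the ones given in the next subsection for the four spectral bands.

```python
import numpy as np

def gaspari_cohn(dist, zero_dist):
    """Fifth-order Gaspari-Cohn function that reaches zero at `zero_dist`."""
    # z = 1 at half of the distance where the function becomes zero.
    z = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / (0.5 * zero_dist)
    out = np.zeros_like(z)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi, zo = z[inner], z[outer]
    out[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                  - (5.0 / 3.0) * zi**2 + 1.0)
    out[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                  + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return out

dist_km = np.linspace(0.0, 4000.0, 81)
zero_dist_km = [3500.0, 3000.0, 2500.0, 2000.0]   # largest to smallest scales
loc_by_band = np.stack([gaspari_cohn(dist_km, d) for d in zero_dist_km])
```

Each row of `loc_by_band` is the localization applied to one spectral band, so the correlations of the smallest scales are damped over the shortest distance.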

### b. Heterogeneous and anisotropic structure of correlations

The horizontal spatial correlations were computed using each approach over a series of regularly spaced locations surrounding North America. The spatial localization approach is applied with a localization function that forces the correlations to zero at a distance of 3000 km. This is similar to the value of 2800 km used in the 96-member operational Canadian EnKF (Houtekamer et al. 2009). For the spatial/spectral localization approach, the four wavenumber bands with peaks at wavenumbers 0, 33, 66, and 99 are used with spatial localization that forces the correlations within each band to zero at a distance of 3500, 3000, 2500, and 2000 km, respectively. For the wavelet-diagonal approach, a larger number of irregularly spaced bandpass-filtered functions were chosen that have peaks at wavenumbers 0, 1, 2, 4, 8, 16, 32, 64, and 99.

Figure 4 shows contours of the true correlations, the raw sample correlations computed from a 48-member ensemble, and the correlations obtained from applying each of the approaches being examined to the same ensemble for a series of 20 locations surrounding North America. For reasons of clarity, the raw sample correlations are only shown up to a separation distance of 1200 km. By visually examining the true correlation functions (Fig. 4a), it is clear that the correlations are both anisotropic and heterogeneous within this region. Much of this is also captured in the raw sample correlations (Fig. 4b), but with the addition of sampling error (which would be even more obvious if the correlation functions were shown in their entirety and not only to a separation distance of 1200 km). The application of spatial localization (Fig. 4c) has the expected effect of reducing correlations at large separation distances that are the most affected by sampling error. The spatial/spectral localization approach (Fig. 4d) produces correlation functions similar to those from the spatial localization approach, except that the addition of spectral localization has led to somewhat smoother correlation functions. This smoothing is due to the local spatial averaging of the correlation functions that is caused by spectral localization. This smoothing can make the correlations appear more similar to the truth, but not in all cases. The wavelet-diagonal approach (Fig. 4e) produces correlation functions that are nearly isotropic at each location, but with a shape that becomes broader (sharper) in regions where the sample correlations are generally broader (sharper). The spectral-diagonal approach (Fig. 4f) can be considered as applying the maximum possible amount of spectral localization in addition to removing any variation in the spectral variances for different zonal wavenumbers for each value of total wavenumber. The resulting correlation functions are homogeneous and isotropic. Any apparent variation in the shape of the correlation functions in Fig. 4f is solely due to the projection of the global field onto a Cartesian map.

Even though the original ensemble members have been normalized so that the sample variances are equal to one at each grid point, the variances obtained when applying the spatial/spectral localization and wavelet-diagonal approaches may be different from one, as already mentioned. Figure 5 shows the variances obtained from these two approaches for the region surrounding North America (many more locations were used than for the previous figure). The variances differ from 1 by at most around 25% in both cases with the average value close to 1. The variances from the wavelet-diagonal approach vary more smoothly in space than those from the spatial/spectral localization approach.

The correlation length scale in the direction *x* is defined by Daley (1991) as

$$L_x = \left[-\frac{1}{\left.\dfrac{d^2 c(x)}{dx^2}\right|_{x=0}}\right]^{1/2},$$

where *c*(*x*) is the correlation function and the second derivative is computed at zero separation distance. This was computed using a finite-difference approximation that only requires the values of the correlation function at its origin and at one grid point to each side of the origin in either the zonal or meridional direction. Consequently, the subsampling of the original ensemble members to a resolution of 1.8° latitude and longitude may affect this result. The zonal and meridional length scales are shown in Figs. 6 and 7, respectively. The spatial variations in the length scales for the true correlations and the differences between the length scales in the zonal and meridional directions (Figs. 6a and 7a) are consistent with heterogeneous and anisotropic aspects of the correlations seen in Fig. 4. The length scales computed from the sample correlations (Figs. 6b and 7b) are very similar to those from the true correlations, indicating that the correlations very close to the origin are well estimated in the raw sample correlations. Similarly, correlations with spatial localization applied (Figs. 6c and 7c) have very similar length scales as those in the raw sample estimate, but with slightly reduced values. The spatial/spectral localization approach produces length scales (Figs. 6d and 7d) similar to those from the spatial localization approach, but with slightly smoother spatial variations due to the local spatial averaging of the correlation functions caused by spectral localization. The wavelet-diagonal approach produces correlations with length scales (Figs. 6e and 7e) that maintain some of the spatial variations seen in the true and sample correlations, but to a much lesser extent than the other approaches already discussed. Also, the length scales in the zonal and meridional directions now appear to have very similar spatial variations. This is consistent with the nearly isotropic character of the correlation functions from the wavelet-diagonal approach shown in Fig. 4e. The length scales from the wavelet-diagonal approach are also consistently smaller than for all of the other approaches. As expected, the length scales computed from the spectrally diagonal correlations (Figs. 6f and 7f) are spatially constant.
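The three-point finite-difference estimate of the length scale can be sketched as follows, taking the length scale to be the square root of minus the inverse of the second derivative of the correlation function at zero separation. The Gaussian test correlation, the 500-km length scale, and the 200-km grid spacing are hypothetical values chosen only to show how the grid spacing influences the estimate.

```python
import numpy as np

def daley_length_scale(c_minus, c_zero, c_plus, dx):
    """Length scale sqrt(-1 / c''(0)) using a centred second difference.

    c_minus, c_zero, c_plus: correlation one grid point before the origin,
    at the origin, and one grid point after the origin; dx: grid spacing.
    """
    second_deriv = (c_minus - 2.0 * c_zero + c_plus) / dx**2
    return np.sqrt(-1.0 / second_deriv)

l_true = 500.0   # km, length scale of the test correlation function
dx = 200.0       # km, hypothetical grid spacing

def gaussian_corr(x):
    """Gaussian correlation whose exact length scale is l_true."""
    return np.exp(-x**2 / (2.0 * l_true**2))

l_est = daley_length_scale(gaussian_corr(-dx), gaussian_corr(0.0),
                           gaussian_corr(dx), dx)
```

For this test function the coarse spacing leads to an overestimate of roughly 2%, which illustrates why the subsampling of the ensemble members may affect the computed length scales.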

### c. Evaluation of the error in ensemble-based correlation estimates

The procedure of producing the ensemble members from a known true correlation matrix was chosen to allow the error in ensemble-based estimates to be measured quantitatively. Figure 8 shows the zonally averaged rms error for each correlation estimate as a function of latitude. These errors were computed from correlation functions obtained at many locations in the region surrounding North America using all correlations within 3000 km of the origin of each correlation function. The results show that all approaches significantly reduce the sampling error in the raw sample correlations. The spectral-diagonal approach provides the least reduction in error at almost all latitudes. The wavelet-diagonal approach results in a consistent, yet small improvement over the spectral-diagonal approach. The spatial localization approach produces a larger improvement over the spectral-diagonal approach at most latitudes. Finally, the spatial/spectral localization approach gives the best results at almost all latitudes, representing a small, but consistent improvement over the spatial localization approach (statistically significant at the 95% level for almost all latitudes). Because of the two-dimensional nature of the correlations, it may be expected that these rms errors are dominated by the error at large separation distances.

For a more detailed examination of the errors, the average correlations and error standard deviation in the correlations estimated from each approach were computed as a function of separation distance. Unlike the results in the previous figure, this allows each approach to be evaluated for both short separation distances, where sampling error in the raw sample correlations is relatively low, and for longer separation distances, where the sampling error may dominate. The panels on the left of Fig. 9 show the spatially averaged correlations as a function of separation distance for the raw sample correlations (dashed curves) and true correlations (dotted curves) together with the correlations obtained with the following approaches (solid curves): spatial localization (Fig. 9a), spatial/spectral localization with scale-dependent spatial localization (Fig. 9c), wavelet diagonal (Fig. 9e), and spectral diagonal (Fig. 9g). Because of the spatial averaging of the correlation functions over a large number of locations, the raw sample correlations very nearly match the average true correlations. In contrast to this, the spatial localization approach and to a lesser extent the spatial/spectral localization approach both underestimate the mean correlations for intermediate distances. This negative bias is unavoidable for all approaches that employ spatial localization. The wavelet-diagonal approach also slightly underestimates the average correlations for distances up to about 1500 km. With the spectral-diagonal approach, the estimated correlations are very similar to the spatially averaged true and sample correlations. Since the spectral diagonal correlations are equivalent to a global average of the sample correlations, the small difference between the spectral diagonal correlations and the spatially averaged sample correlations is because the averages here are only computed over the region surrounding North America and not globally.

The panels on the right of Fig. 9 show the error standard deviation for the sample correlations (dashed curves) and for the same four experiments (solid curves) as the corresponding panels on the left. Since the spatial localization approach has little effect on the sample correlations for very closely separated locations, it is not surprising that the resulting correlations have an error standard deviation similar to that of the sample correlations for distances up to about 400 km (Fig. 9b). The largest error occurs at a distance of approximately 600 km and beyond this distance the error monotonically decreases. The error is very small at 3000 km, where the localization forces the correlations to zero. The correlations obtained with the spatial/spectral localization approach (Fig. 9d) have significantly higher error than the sample estimate at the shortest distances, mostly because variances different from one are obtained with this approach (as shown in Fig. 5). This modification of the ensemble variances appears to increase the error in the correlation estimate only because the variances have been assumed to be exactly known, which would not be the case in practice. The maximum value of error standard deviation occurs around 600 km, but at a slightly lower value than for the spatial localization approach. The wavelet-diagonal approach (Fig. 9f) produces correlations with higher error standard deviation than the raw sample estimate up to a distance of about 600 km and reaches its maximum near 400 km at a higher value than for the previous two approaches. At the shortest distance, the higher error standard deviation is due to the non-unit variances obtained with this approach, as for the spatial/spectral localization approach. For distances beyond about 2000 km, the error standard deviation is more similar to the other approaches. The spectral-diagonal approach produces correlations (Fig. 9h) with an error standard deviation that is higher than for the sample correlations up to a distance of about 800 km. The maximum value is higher than for any of the other methods. However, beyond about 2000 km, the error is again very similar to all of the other approaches.

Next, the impact of using a scale-dependent spatial localization in the new spatial/spectral localization approach is evaluated. The use of scale-dependent spatial localization with correlations forced to zero at 3500, 3000, 2500, and 2000 km for the four overlapping spectral bands is compared with using spatial localization that forces correlations to zero at 3000 km for all horizontal scales. Figure 10a shows the spatially averaged horizontal correlation function when using scale-dependent (solid curve) and scale-independent (dashed curve) spatial localization and also from the true correlations (dotted curve). As in the previous figure, both results from the spatial/spectral localization approach produce smaller average correlations than the true correlations over almost the entire range of separation distance. When using scale-dependent spatial localization, the magnitude of the bias is slightly smaller for separation distances between about 500 and 1500 km. This is likely due to the use of less spatial localization for the largest horizontal scales (forced to zero at 3500 km) as compared to the case of using a constant localization for all scales (forced to zero at 3000 km). The standard deviation of the error in estimates produced by the two spatial/spectral localization approaches is shown in Fig. 10b. The use of a scale-dependent spatial localization results in a standard deviation very similar to that obtained when using the same spatial localization for all spatial scales. Therefore it appears that the use of scale-dependent spatial localization reduces the bias without affecting the random component of the error. To confirm this interpretation, an additional experiment was performed using the same amount of spatial localization for the scale-independent approach as for the largest scales in the scale-dependent approach (i.e., forced to zero at 3500 km). In this additional experiment (not shown) the bias is very similar to that obtained using scale-dependent localization. However, the use of weaker localization when applying scale-independent spatial localization results in an error standard deviation that is systematically increased as compared with scale-dependent spatial localization.

Until now, only correlations estimated from 48-member ensembles have been considered. However, because of the large computational cost of obtaining a large number of ensemble members in an operational context, the error in the different correlation estimates is now also considered for an ensemble of only 12 members. Figure 11 shows the spatially averaged correlations and error standard deviation as a function of separation distance in the same format as used for Fig. 9. Comparing the left panels of Figs. 9 and 11 shows that the reduction in ensemble size has almost no impact on the spatially averaged correlation function. In contrast, comparing the right panels shows that the use of a smaller ensemble can result in significant increases in the error standard deviation. The largest increase is seen for the spatial localization approach, for which the maximum error standard deviation is nearly double the value obtained when using a 48-member ensemble. The increase in error standard deviation is somewhat less for the spatial/spectral localization. Consequently, the reduction in error standard deviation due to the spectral localization of this approach (seen by comparing the spatial/spectral localization and spatial localization approaches) is more evident when using a smaller ensemble. For the wavelet-diagonal approach, the error standard deviation is only slightly increased, mostly at the shortest separation distances. Finally, the error standard deviation resulting from using the spectral-diagonal approach is nearly unchanged by the reduction in ensemble size. Because of these differences in the impact of reducing the ensemble size on the error standard deviation, it is clear that the approach giving the best overall correlation estimate is highly dependent on ensemble size. For the 12-member ensemble it appears that the wavelet-diagonal approach provides the best estimate, whereas for the 48-member ensemble the spatial/spectral localization approach provides the best estimate.
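
The sensitivity to ensemble size can be illustrated with a minimal Monte Carlo sketch for a raw (unlocalized) sample correlation. This is not the diagnostic used to produce Fig. 11; the function names, the number of trials, and the choice of a true correlation of zero are assumptions made for illustration only. For Gaussian samples with zero true correlation, the standard deviation of the sample correlation is 1/sqrt(N - 1), so going from 48 to 12 members roughly doubles the sampling error before any localization is applied.

```python
import math
import random


def sample_correlation(xs, ys):
    """Pearson correlation computed from paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_error_std(true_corr, n_members, n_trials=2000, seed=0):
    """Standard deviation of the error in the sample correlation.

    Each trial draws n_members Gaussian pairs whose true correlation is
    true_corr, then records (sample correlation - true correlation).
    n_trials and seed are illustrative choices, not values from the study.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - true_corr ** 2)
    errors = []
    for _ in range(n_trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_members)]
        ys = [true_corr * x + noise_scale * rng.gauss(0.0, 1.0) for x in xs]
        errors.append(sample_correlation(xs, ys) - true_corr)
    mean_error = sum(errors) / n_trials
    variance = sum((e - mean_error) ** 2 for e in errors) / (n_trials - 1)
    return math.sqrt(variance)


if __name__ == "__main__":
    for members in (12, 48):
        print(members, correlation_error_std(0.0, members))
```

The different localization approaches filter this sampling noise to different degrees, which is consistent with their error standard deviations responding differently to the reduction in ensemble size.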

## 4. One-month data assimilation experiments

To evaluate the impact of the new spatial/spectral localization approach relative to applying only spatial localization, a series of 3D-Var experiments was performed using real observations over all of February 2007. These experiments use the same configuration as the 3D-Var experiments described by Buehner et al. (2010b). The Global Environmental Multiscale (GEM) model (Côté et al. 1998; Bélair et al. 2009) is then used to produce 56 global medium-range forecasts from the analyses valid at 0000 and 1200 UTC for each day of the experiments.

A set of four 3D-Var experiments was conducted. Some details of these experiments are summarized in Table 1. An ensemble of 96 background states produced by the EnKF is available every 6 h during February 2007. These ensembles were used to estimate the background-error covariance matrix in 3D-Var and 4D-Var experiments by Buehner et al. (2010b) with the spatial localization approach. For the present study, a similar experiment with 3D-Var and using the full 96-member ensembles was also conducted (Ens96). The only difference between this experiment and the experiment from the previous study is that a spectral filter eliminating the energy in the ensemble members above total wavenumber 180 is applied. This was done to make the spatial localization and spatial/spectral localization approaches more consistent, since the highest bandpass filter used for the spatial/spectral localization approach also eliminates energy above total wavenumber 180. A series of 3D-Var experiments was then conducted using only 48 members to estimate the background-error covariances. The first experiment (Ens48_1) again uses the spatial localization approach with the same horizontal localization function as used for the experiment with 96 members (which forces the covariances to zero at a distance of 2800 km). The second experiment with 48 members (Ens48_2) uses the spatial/spectral localization approach with 6 spectral bands of nearly equal spectral width (see Table 1). For all of the spectral bands the same spatial localization function is used as in the previously described experiments. The final experiment (Ens48_3) also uses the spatial/spectral localization approach with 6 spectral bands, but a different horizontal localization function is used for each spectral band, as specified in Table 1.
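
As an illustration of a horizontal localization function that forces covariances to zero at a prescribed distance, the sketch below uses the fifth-order piecewise rational function of Gaspari and Cohn (1999). That this particular function is the one used in the experiments is an assumption, and the function and parameter names (`gaspari_cohn`, `localize`, `zero_distance_km`) are hypothetical. The per-band distances in the usage example are placeholders chosen only to show how scale-dependent localization could be expressed; they are not the values of Table 1.

```python
def gaspari_cohn(distance_km, zero_distance_km=2800.0):
    """Fifth-order piecewise rational function of Gaspari and Cohn (1999).

    Returns 1 at zero separation and exactly 0 at and beyond
    zero_distance_km. The half-width c is half of the zero distance.
    """
    c = zero_distance_km / 2.0
    z = abs(distance_km) / c
    if z >= 2.0:
        return 0.0
    if z <= 1.0:
        return (-0.25 * z ** 5 + 0.5 * z ** 4 + 0.625 * z ** 3
                - (5.0 / 3.0) * z ** 2 + 1.0)
    return (z ** 5 / 12.0 - 0.5 * z ** 4 + 0.625 * z ** 3
            + (5.0 / 3.0) * z ** 2 - 5.0 * z + 4.0 - 2.0 / (3.0 * z))


def localize(raw_covariance, distance_km, zero_distance_km=2800.0):
    """Elementwise (Schur) product of a raw sample covariance between two
    points with the localization weight for their separation distance."""
    return raw_covariance * gaspari_cohn(distance_km, zero_distance_km)


if __name__ == "__main__":
    # Same function for every scale, as in Ens96, Ens48_1 and Ens48_2.
    print(localize(1.0, 1000.0))

    # HYPOTHETICAL per-band zero distances (largest scales first), standing
    # in for the scale-dependent functions of Ens48_3; NOT the Table 1 values.
    hypothetical_zero_distances_km = [6000.0, 4000.0, 3000.0, 2500.0, 2000.0, 1500.0]
    for band, zero_km in enumerate(hypothetical_zero_distances_km, start=1):
        print(band, localize(1.0, 1000.0, zero_km))
```

With a larger zero distance, the same raw covariance at a given separation is damped less, which is the sense in which localization is weaker for the large-scale bands.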

Table 1. Summary of the differences between the 3D-Var experiments.

Figure 12 shows the verification scores of the 48-h forecasts from the Ens48_1 (black curves) and Ens96 (gray curves) experiments relative to radiosonde observations for geopotential height in the extratropics and for temperature in the tropics. Both the bias (dashed curves) and standard deviation (solid curves) are shown. The level of statistical significance of the difference between the two experiments is indicated by numbers (as a percentage) within shaded boxes on the left for the bias and on the right for the standard deviation. Values are only shown when the level of significance is above 90%. Black shaded boxes indicate differences in favor of the Ens48_1 experiment, and gray shaded boxes indicate differences in favor of the Ens96 experiment. In general, the use of a smaller ensemble size results in degraded forecast scores for the Ens48_1 experiment relative to Ens96. The difference is significant for the bias and standard deviation of geopotential height in the northern extratropics and temperature in the tropics (though for only a small number of levels) and for the standard deviation of geopotential height in the southern extratropics.
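
The two verification scores can be sketched as follows for a single pressure level. This is a minimal illustration, not the verification package used for the figures: the function name `verification_stats`, the forecast-minus-observation sign convention, and the normalization of the variance by the number of samples are all assumptions, and the significance test is not reproduced.

```python
import math


def verification_stats(forecasts, observations):
    """Return (bias, standard deviation) of forecast-minus-observation
    differences for collocated forecast and radiosonde values."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [f - o for f, o in zip(forecasts, observations)]
    n = len(diffs)
    bias = sum(diffs) / n
    # Standard deviation of the differences about the bias (divides by n).
    std = math.sqrt(sum((d - bias) ** 2 for d in diffs) / n)
    return bias, std


if __name__ == "__main__":
    # Made-up numbers purely to exercise the function.
    print(verification_stats([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 2.0, 5.0]))
```

Separating the two scores matters here because, in the experiments that follow, the bias and the standard deviation do not always respond in the same way to a change in the localization approach.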

Similarly, Fig. 13 shows the verification scores of the 48-h forecasts from the Ens48_2 (black curves) and Ens96 (gray curves) experiments. The application of the spatial/spectral localization approach has improved the forecast quality. The degraded geopotential height bias seen for Ens48_1 relative to Ens96 in the northern extratropics has been eliminated, whereas the standard deviation has not changed. The increased temperature bias in the tropics has also been removed, and the bias is now slightly better for Ens48_2 than for Ens96. In the southern extratropics the increased standard deviation of geopotential height seen in Ens48_1 has been slightly reduced by the spatial/spectral localization.

Finally, the results of experiment Ens48_3 are shown in Fig. 14 together with those of Ens96. This enables the impact of using scale-dependent spatial localization with the spatial/spectral localization approach to be evaluated. In the northern extratropics, the geopotential height bias is nearly unchanged, whereas the difference in the standard deviation has decreased relative to Ens96 such that it is below 90% significance for nearly all pressure levels. The temperature bias in the tropics is further improved in the layer between 200 and 400 hPa by using scale-dependent spatial localization. Unlike in the northern extratropics and tropics where Ens48_3 is equivalent to or better than Ens96, in the southern extratropics the standard deviation of geopotential height is still slightly degraded in the upper portion of the troposphere, and only slightly improved relative to Ens48_1. Examination of the time series of these verification statistics (not shown) shows that the improvement in the tropics results from a ~1-week period near the end of the month, whereas the degradation in the southern extratropics results from many cases distributed throughout the entire month.

## 5. Conclusions

The goal of this study was to evaluate several existing approaches for estimating background-error covariances from an ensemble of error realizations and a new spatial/spectral covariance localization approach. The new approach shares aspects of both the spatial localization and wavelet-diagonal approaches. An interesting feature of this approach is that it enables the use of different spatial localization functions for covariances associated with each of a set of overlapping horizontal wavenumber bands. It was shown that the horizontal correlations associated with large horizontal scales require less severe localization than those associated with smaller scales. The use of such scale-dependent spatial localization was shown to reduce the error in spatial correlation estimates as compared to using the same localization function for all horizontal scales.

When comparing spatial localization, spatial/spectral localization, wavelet-diagonal and also spectral-diagonal approaches, it was found that the relative difference in estimation error between the approaches depended on the ensemble size. For a relatively large ensemble (48 members), the spatial/spectral localization approach produced the lowest error. When using a much smaller ensemble (12 members), the wavelet-diagonal approach resulted in the lowest error. This reduced error suggests that the positive effect on the correlations from applying spectral localization according to the wavelet functions is greater than the negative impact from the unintended changes to the original ensemble variances. Qualitatively, the horizontal correlation functions resulting from spatial/spectral localization appear smoother and less noisy than those from spatial localization, but preserve more of the heterogeneous and anisotropic nature of the raw sample correlations than the wavelet-diagonal approach.

In a set of 1-month 3D-Var experiments using a full set of real atmospheric observations, the new spatial/spectral localization approach was compared with spatial localization. The results from these experiments show that the spatial/spectral localization approach provides forecast quality that is nearly the same as, and in some cases better than, that obtained with spatial localization, while using an ensemble of half the size (48 vs 96 members). Consequently, the use of spatial/spectral localization in place of spatial localization can be seen as an alternative to increasing the ensemble size for improving the quality of analyses and forecasts. While the computational cost of the analysis step is increased by localizing with respect to multiple spectral bands, this avoids the increase in cost associated with producing analyses and short-term forecasts for a larger ensemble. Additional experiments would be necessary to determine, for a given ensemble size, the optimal bandpass filter functions (both their number and width) and the optimal localization function for each band. In addition, in the present study only the horizontal localization functions were varied as a function of horizontal scale, but it may be beneficial to also vary the vertical localization function.

## Acknowledgments

This work was mostly completed during a visit to Météo-France. The author thanks Gérald Desroziers and Loïk Berre for organizing the visit and for many fruitful discussions. Thanks also to Luc Fillion for helpful comments on an earlier version of the manuscript.

## REFERENCES

Anderson, J. L., 2007: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. *Physica D*, **230**, 99–111.

Bélair, S., M. Roch, A.-M. Leduc, P. A. Vaillancourt, S. Laroche, and J. Mailhot, 2009: Medium-range quantitative precipitation forecasts from Canada’s new 33-km deterministic global operational system. *Wea. Forecasting*, **24**, 690–708.

Bishop, C. H., and D. Hodyss, 2009: Ensemble covariances adaptively localized with ECO-RAP. Part 2: A strategy for the atmosphere. *Tellus*, **61A**, 97–111.

Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background-error covariances: Evaluation in a quasi-operational NWP setting. *Quart. J. Roy. Meteor. Soc.*, **131**, 1013–1043.

Buehner, M., 2010: Error statistics in data assimilation: Estimation and modelling. *Data Assimilation: Making Sense of Observations*, W. Lahoz, B. Khattatov, and R. Menard, Eds., Springer, 93–112.

Buehner, M., and M. Charron, 2007: Spectral and spatial localization of background-error correlations for data assimilation. *Quart. J. Roy. Meteor. Soc.*, **133**, 615–630.

Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010a: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. *Mon. Wea. Rev.*, **138**, 1550–1566.

Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010b: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part II: One-month experiments with real observations. *Mon. Wea. Rev.*, **138**, 1567–1586.

Côté, J., S. Gravel, A. Méthot, A. Patoine, M. Roch, and A. Staniforth, 1998: The operational CMC-MRB Global Environmental Multiscale (GEM) model. Part I: Design considerations and formulation. *Mon. Wea. Rev.*, **126**, 1373–1395.

Courtier, P., and Coauthors, 1998: The ECMWF implementation of three-dimensional variational assimilation (3D-Var). I: Formulation. *Quart. J. Roy. Meteor. Soc.*, **124**, 1783–1807.

Daley, R., 1991: *Atmospheric Data Analysis*. Cambridge University Press, 457 pp.

Deckmyn, A., and L. Berre, 2005: A wavelet approach to representing background error covariances in a limited-area model. *Mon. Wea. Rev.*, **133**, 1279–1294.

Derber, J., and F. Bouttier, 1999: A reformulation of the background error covariance in the ECMWF global data assimilation system. *Tellus*, **51A**, 195–221.

Fisher, M., and E. Andersson, 2001: Developments in 4D-Var and Kalman filtering. ECMWF Tech. Memo. 347, 36 pp. [Available from European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading, Berkshire RG2 9AX, United Kingdom.]

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757.

Gauthier, P., M. Buehner, and L. Fillion, 1999: Background-error statistics modelling in a 3D variational data assimilation scheme: Estimation and impact on the analyses. *Proc. ECMWF Workshop on Diagnosis of Data Assimilation Systems*, Reading, United Kingdom, ECMWF, 131–145.

Gauthier, P., M. Tanguay, S. Laroche, S. Pellerin, and J. Morneau, 2007: Extension of 3DVAR to 4DVAR: Implementation of 4DVAR at the Meteorological Service of Canada. *Mon. Wea. Rev.*, **135**, 2339–2354.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137.

Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. *Quart. J. Roy. Meteor. Soc.*, **131**, 3269–3289.

Houtekamer, P. L., H. L. Mitchell, and X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2126–2143.

Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-Var. *Quart. J. Roy. Meteor. Soc.*, **129**, 3183–3203.

Pannekoucke, O., 2009: Heterogeneous correlation modeling based on the wavelet diagonal assumption and on the diffusion operator. *Mon. Wea. Rev.*, **137**, 2995–3012.

Pannekoucke, O., L. Berre, and G. Desroziers, 2007: Filtering properties of wavelets for local background-error correlations. *Quart. J. Roy. Meteor. Soc.*, **133**, 363–379.

Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center’s spectral statistical interpolation analysis system. *Mon. Wea. Rev.*, **120**, 1747–1763.

Pereira, M. B., and L. Berre, 2006: The use of an ensemble approach to study the background error covariances in a global NWP model. *Mon. Wea. Rev.*, **134**, 2466–2489.

Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances. *Mon. Wea. Rev.*, **131**, 1536–1548.

Whitaker, J. S., T. M. Hamill, X. Wei, Y. Song, and Z. Toth, 2008: Ensemble data assimilation with the NCEP global forecast system. *Mon. Wea. Rev.*, **136**, 463–482.

Wu, W.-S., R. J. Purser, and D. F. Parrish, 2002: Three-dimensional variational analysis with spatially inhomogeneous covariances. *Mon. Wea. Rev.*, **130**, 2905–2916.