2 Post-processing of raw eddy-covariance data

To derive turbulent fluxes and other turbulence characteristics from eddy-covariance measurements, they have to be post-processed first. In the following, we will have a look a the required steps – starting with the theoretical foundation of the method and the necessary physical and technical considerations for the raw data processing. The upcoming analyses (e.g., 02_basic-turbulence-diagnostics.ipynb) build upon this and allow for a more detailed investigation (e.g., 03_quadrant-analysis.ipynb).

2.1 Eddy-covariance method

How do we derive a flux from the sonic measurements? The eddy-covariance method is based on the Reynolds decomposition, which is applied to every measured quantity \(x\) through \[ x = \overline{x} + x' \quad \textrm{with} \quad \overline{x}:= \frac{1}{t_s} \int_0^{t_s} x \: dt. \] Therein, \(\overline{x}\) represents the time average and \(x'\) the deviation from it. In principle, one would have to consider the ensemble average (i.e., all possible paths in the phase space), but this is replaced by the time average assuming ergodicity. Now, we take a look at the flux, which is defined as the average product of two quantities \(\overline{xw}\), and insert the Reynolds decomposition: \[ \overline{xw} = \overline{x}\:\overline{w} + \overline{x}\overline{w'} + \overline{x'}\overline{w} + \overline{x'w'} = \overline{x}\:\overline{w} + \overline{x'w'} \] The second and third term vanish because of the Reynolds postulates (i.e. \(\overline{x'} = 0\)). If additionally \(\overline{w}=0\), the flux can be represented by the correlation, i.e., \(\overline{xw} = \overline{x'w'}\). As you see, this procedure involves a lot of assumptions: For \(\overline{x'} = 0\) to be true, we assume that the flow is steady and for \(\overline{w}=0\) you either have a homogeneous and flat surface or you need to rotate the sonic coordiante system (see the correction methods below).
Usually, \(w\) represents the vertical wind speed and \(x\) can be an arbitrary scalar quantity. For the momentum flux, we use \(x=u\) (streamwise velocity), for the sensible heat flux \(x=T\) (temperature or potential temperature), for the latent heat flux \(x=q\) (specific humidity) and for trace gas fluxes \(x=c\) (the concentration or mixing ratio of the respective trace gas).

2.2 Raw data handling and corrections

For applying the described eddy-covariance method to the measurements, we have to decide on an averaging time \(t_s\), have to perform quality control of the raw data and have to check the fulfillment of the made assumptions.

2.2.1 Choosing a suitable averaging time

As we have seen above, for calculating the fluxes with the eddy-covariance method, a suitable averaging time (\(t_s\)) should be chosen. For this, several things need to be considered: Generally, one can say that the longer the averaging time, the less likely it is that the conditions are steady (which is an assumption in the EC method), but short averaging times lose low frequency contributions. So usually, an averaging time of 30 minutes is chosen. However, this averaging time can still be too long if the turbulence is very intermittent, which is the case for very stable conditions. Consequently, some studies use the 30 minutes average just for unstable stratification and an averaging time of 1 to 10 minutes for stable conditions. It is recommendable to first study the characteristics of the sampled raw data (e.g., based on spectra) and then chose an averaging time.

2.2.2 Overview of correction procedures applied to the raw data

Despiking (despiking): Removing of spikes in the raw data. For this usually three methods are applied: (1) based on pre-defined thresholds (i.e., an expected temperature or wind speed interval), (2) median deviation (MAD) test, i.e., all measurements lie within a pre-defined range around the median, and (3) based on skewness (3. moment) and kurtosis (4. moment), i.e., the skewness and kurtosis of the considered interval do not exceed pre-defined values. Details see e.g. Mauder et al. (2013).
Lag-time correction (shift2maxccf): Lag-time can occur if several loggers are used or the inlet tube of the gas analyzer is very long, such that there is a time offset between the different measured variables. This can be corrected by calculating the maximum cross-correlation and shift the time series according to the lag difference, which is done by the function shift2maxccf.
Linear detrending (pracma::detrending): Linear detrending by substracting the mean value (however, linear detrending is not consistent with Reynolds decomposition, so it should be used/interpreted with caution, see e.g. Baldocchi (2003)).
Rotation (rotate_double or rotate_planar): Rotation of the sonic coordinate system. Since the orientation of the sonic is arbitrary, the measurements have to be interpreted in a suitable coordinate system. For this, the natural coordinate systems is used, which means that it follows the streamlines and the coordinates are then given by the streamwise velocity component \(u\), the crosswise velocity component \(v\) and the vertical velocity component \(w\). To determine the orientation, two main approaches exist: Double rotation (rotate_double) alignes the sonic coordinate system with the streamlines for every averaging interval. Hence, this method can be applied near-real time. Planar fit rotation (rotate_planar, Wilczak, Oncley, and Stage (2001)) aligns the sonic coordinate system with the mean streamlines under the conditions that the mean vertical velocity \(\overline{w}\) vanishes. For this, a longer time series (usually the whole measurement campaign) is required and thus near-real time processing is not possible. Generally, it is recommended to use double rotation in relatively simple topography, but planar fit in complex (micro-)topogaphy. The angles around which the sonic coordinate system was rotated also allow to estimate the quality of the measurements (the smaller the angle of rotation, the better and important is that they do not flip signs, in particular for close-to-zero fluxes under very stable stratification).
Quality flagging (flag_stationarity,flag_w,flag_distortion,flag_most, e.g. Foken and Wichura (1996)): Several quality flags (ranging from 0: the best to 2: the worst) can be applied. Common flags are: a stationarity flag (flag_stationarity), that test the alignment with the stationarity assumption of the eddy-covariance method, a vertical velocity flag (flag_w), which checks that the remaining vertical velocity after the rotation is small, a flow distortion flag (flag_distortion), which removes the pre-defined wind directions that are possibly affected/blocked by the mast, an integral turbulence characteristics flag (flag_itc), which test for the agreement with Monin-Obukhov similarity theory. However, these flags have to be applied purpose-oriented and interpreted with caution. Sometimes particularly “poorly flagged” measurements are interesting for investigating turbulence under challenging (and therefore interesting) conditions, e.g. under very stable stratification.
SND (and cross-wind) correction (SNDcorrection, Schotanus, Nieuwstadt, and DeBruin (1983)): Converts the buoyancy flux to sensible heat flux. Since most sonics measure the sound propagation speed, the measured temperature (the so called sonic temperature \(T_s\)) is similar to the virtual temperature and thus the resulting flux \(\overline{w'T_s'}\) represents the buoyancy flux, which is about 10-20 % larger than the sensible heat flux \(\overline{w'T'}\). Note, the used constants in the method depend on the measurement device (default for CSAT3).
WPL correction (WPLcorrection, Webb, Pearman, and Leuning (1980)): Converts volume- to mass-related quantities for trace gas fluxes. This correction only applies to trace gas concentrations (water vapor, carbon dioxide, methane, …) measured with a gas analyzer.
Unit conversion (density) (ppt2rho): Converts ppt to density. Closed-path gas analyzers measure trace gas concentrations in parts-per-… (ppt: … thousand, ppm: … million), which has to be converted to a density based on the respective molar mass.
Unit conversion (fluxes) (cov2sh, cov2lh,cov2cf): Converts covariances to the respective flux in W/m\(^2\), i.e., cov(w,T) to sensible heat flux, cov(w,q) to latent heat flux and cov(w,c) to CO\(_2\) flux.
Gap-filling (gapfilling): Reproduction of timeseries by filling gaps. Gap-filling is necessary if budgets are calculated or a continuous timeseries is desired. Depending on the type of gaps (random gaps, systematic gaps, duration of gaps and available auxillary measurements) different physical and statistical methods for gap-filling exist (see review Moffat and others (2007)). In meteorology, particulary for studying turbulent fluxes under stable conditions statistically, gap-filling is usually not used to avoid introducing artificial “measurements” created under the assumptions of the used gap-filling method.

There are more (and partly controversial) correction methods, details are discussed in Foken (2017). Further correction methods are required when spectra are considered (spectral corrections) or budgets are calculated (e.g. more sophisticated gap-filling of missing values, to avoid a bias due to systematically discarded data).

2.3 An example post-processing routine

Now, we create an example post-processing routine. For this, we use 3 days of raw eddy-covariance measurements (CSAT3/Campbell Scientific and Li-7200/LI-COR closed-path IR gas analyzer) with a 10 Hz sampling frequency from our station in Finse, Hardangervidda, Norway. The data set is in the folder data/ec-data_10Hz_raw and can be downloaded directly or the whole github repository can be cloned with git clone git@github.com:noctiluc3nt/ec_analyze.git.

#loading Reddy package
install.packages("../src/Reddy_0.0.0.9000.tar.gz",repos=NULL,source=TRUE,quiet=TRUE)
library(Reddy)

#ec data files
dir_in="../data/ec-data_10Hz_raw" 
files=list.files(dir_in,full.names=TRUE)
nf=length(files)

Each given file contains 30 minutes of measurements, such that we just need to average over one file to get the 30 minutes averages and fluxes. In the notebook 04_multiresolution-decomposition.ipynb we will have a look at a method that allows to determine suitable averaging times more accurately. In our example routine, the data is first despiked, than the wind is rotated and the sonic measurements are averaged. For the gas analyzer measurements, the unit is converted to a density. Then the turbulence intensities (standard deviation) and the fluxes (covariances) are calculated. For the sensible heat flux, we need to apply the SND correction additionally, the WPL correction can be applied to the latent heat flux.
Here, the temperature is given directly in \(^\circ\)C, however, in the direct output of the LI-COR systems (.ghg-files) usually the speed of sound (sos in m/s) is stored, which can be easily converted to temperature with the function sos2Ts.

#allocate output
var_out=c("time","u_mean","v_mean","w_mean","ws_mean","wd_mean","T_mean","h2o_mean","co2_mean",
          "u_sd","v_sd","w_sd","T_sd","h2o_sd","co2_sd",
          "cov_uv","cov_uw","cov_vw","cov_wT","cov_h2ow","cov_co2w","cov_wT_snd","cov_rhoH2Ow_wpl",
          "rot_angle1","rot_angle2","flag_stationarity","flag_w","flag_distortion")
nv=length(var_out)
dat=data.frame(array(NA,dim=c(nf,nv)))
colnames(dat)=var_out

#postprocessing per file (30 mins)
for (i in 1:nf) {
    tmp=read.table(files[i],sep=",",header=T)
    dat$time[i]=tmp$X[1]
    #despiking
    tmp$T_degC=despiking(tmp$T_degC,-40,30)
    tmp$u_m.s=despiking(tmp$u_m.s,-25,25)
    tmp$v_m.s=despiking(tmp$v_m.s,-25,25)
    tmp$w_m.s=despiking(tmp$w_m.s,-5,5)
    #wind before rotation
    dat$ws_mean[i]=sqrt(mean(tmp$u_m.s,na.rm=T)^2+mean(tmp$v_m.s,na.rm=T)^2)
    dat$wd_mean[i]=atan2(mean(tmp$v_m.s,na.rm=T),mean(tmp$u_m.s,na.rm=T))
    #rotation
    rot_wind=rotate_double(tmp$u_m.s,tmp$v_m.s,tmp$w_m.s)
    tmp$u_m.s=rot_wind$u
    tmp$v_m.s=rot_wind$v
    tmp$w_m.s=rot_wind$w
    dat$rot_angle1[i]=rot_wind$theta
    dat$rot_angle2[i]=rot_wind$phi
    #averaging
    dat$u_mean[i]=mean(tmp$u_m.s,na.rm=T)
    dat$v_mean[i]=mean(tmp$v_m.s,na.rm=T)
    dat$w_mean[i]=mean(tmp$w_m.s,na.rm=T)
    dat$T_mean[i]=mean(tmp$T_degC,na.rm=T)
    #unit conversion for gases
    tmp$rhoH2O=ppt2rho(tmp$H2O_ppt,dat$T_mean[i]+273.15,87000)
    tmp$rhoCO2=ppt2rho(tmp$CO2_ppm/1000,dat$T_mean[i]+273.15,87000,gas="CO2")
    dat$h2o_mean[i]=mean(tmp$rhoH2O,na.rm=T)
    dat$co2_mean[i]=mean(tmp$rhoCO2,na.rm=T)
    #sds
    dat$u_sd[i]=sd(tmp$u_m.s,na.rm=T)
    dat$v_sd[i]=sd(tmp$v_m.s,na.rm=T)
    dat$w_sd[i]=sd(tmp$w_m.s,na.rm=T)
    dat$T_sd[i]=sd(tmp$T_degC,na.rm=T)
    dat$h2o_sd[i]=sd(tmp$rhoH2O,na.rm=T)
    dat$co2_sd[i]=sd(tmp$rhoCO2,na.rm=T)
    #covs
    dat$cov_uw[i]=cov(tmp$u_m.s,tmp$w_m.s,use="pairwise.complete.obs")
    dat$cov_uv[i]=cov(tmp$u_m.s,tmp$v_m.s,use="pairwise.complete.obs")
    dat$cov_vw[i]=cov(tmp$v_m.s,tmp$w_m.s,use="pairwise.complete.obs")
    dat$cov_wT[i]=cov(tmp$T_degC,tmp$w_m.s,use="pairwise.complete.obs")
    dat$cov_h2ow[i]=cov(tmp$rhoH2O,tmp$w_m.s,use="pairwise.complete.obs")
    dat$cov_co2w[i]=cov(tmp$rhoCO2,tmp$w_m.s,use="pairwise.complete.obs")
    #SND correction
    dat$cov_wT_snd[i]=SNDcorrection(tmp$u_m.s,tmp$v_m.s,tmp$w_m.s,tmp$T_degC+273.15)
    #WPL correction
    #dat$cov_rhoH2Ow_wpl[i]=WPLcorrection(tmp$rhoH2O,w=tmp$w_m.s,Ts=tmp$T_degC+273.15,q=tmp$q)
    #flagging
    dat$flag_stationarity[i]=flag_stationarity(tmp$w_m.s,tmp$T_degC,nsub=3000)
    dat$flag_w[i]=flag_w(dat$w_mean[i])
    dat$flag_distortion[i]=flag_distortion(tmp$u_m.s,tmp$v_m.s,dir_blocked=c(340,360))
}

The calculated covariances have to be converted to the fluxes (unit W/m\(^2\)), for example with the functions cov2shand cov2lh:

#calculate fluxes
dat$sh=cov2sh(dat$cov_wT_snd)
dat$lh=cov2lh(dat$cov_h2ow)

The output now contains the 30 minutes averages, standard deviations and covariances of the measured quantities, which will be used in the following notebooks for some more detailed analysis.

#look at output
head(dat)

A data.frame: 6 × 30
	time	u_mean	v_mean	w_mean	ws_mean	wd_mean	T_mean	h2o_mean	co2_mean	u_sd	⋯	cov_co2w	cov_wT_snd	cov_rhoH2Ow_wpl	rot_angle1	rot_angle2	flag_stationarity	flag_w	flag_distortion	sh	lh
	<chr>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	⋯	<dbl>	<dbl>	<lgl>	<dbl>	<dbl>	<dbl>	<dbl>	<lgl>	<dbl>	<dbl>
1	2018-07-20 08:30:00	2.872084	-3.939352e-16	6.931732e-06	2.873170	-2.289284	15.86938	0.007254132	0.0006009614	1.062143	⋯	-1.567917e-07	0.1052858	NA	228.8337	0.3484284	0	0	NA	123.2512	103.1085
2	2018-07-20 09:00:00	2.864793	7.504684e-17	9.052527e-08	2.864538	-2.326045	16.55190	0.007584267	0.0005978138	1.105122	⋯	-2.294700e-07	0.1600703	NA	226.7274	0.7745417	0	0	NA	187.3839	162.6196
3	2018-07-20 09:30:00	3.996526	9.365292e-16	3.049021e-05	4.002522	-2.035397	17.05704	0.007472065	0.0005954894	1.409495	⋯	-1.862787e-07	0.1533514	NA	243.3803	0.2888957	0	0	NA	179.5185	123.5711
4	2018-07-20 10:00:00	4.998016	-2.812846e-16	-2.391239e-06	4.997530	-1.977737	17.60447	0.006762097	0.0005953881	1.289326	⋯	-1.504328e-07	0.1556649	NA	246.6840	0.3483120	0	0	NA	182.2267	123.4474
5	2018-07-20 10:30:00	4.879095	4.062636e-16	2.187117e-05	4.880696	-2.014489	18.08994	0.005410862	0.0005948927	1.469949	⋯	-1.989823e-07	0.1914488	NA	244.5783	0.6531922	0	0	NA	224.1166	191.1642
6	2018-07-20 11:00:00	5.225037	-8.977916e-17	-6.972624e-06	5.223984	-2.261205	18.24207	0.004422858	0.0005954766	1.319227	⋯	-1.330638e-07	0.1490271	NA	230.4425	0.4471074	0	0	NA	174.4563	150.8430

saveRDS(dat,file="../data/ec-data_30min_processed/processed_data_example.rds")

Additionally, the data can be gap-filled with gapfilling or averaged to other customized averaging intervals with averaging. The function averaging is based on the RcppRoll package, which provides fast and efficient “rolling” functions utilizing a C++ interface.
Generally, it should be noted that different sensors may require different correction methods, see e.g. Foken (2017) for a discussion, so a detailed knowledge of the instrumentation is required.

References

Baldocchi, D. D. 2003. “Assessing the eddy covariance technique for evaluating carbon dioxide exchange rates of ecosystems: past, present and future.” Global Change Biology 9: 479–92.

Foken, T. 2017. “Micrometeorology.” Springer Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25440-6.

Foken, T., and B. Wichura. 1996. “Tools for quality assessment of surface-based flux measurements.” Agric Forest Meteorol 78: 83–105.

Mauder, M., M. Cuntz, C. Drüe, A. Graf, C. Rebmann, H. P. Schmid, M. Schmidt, and R. Steinbrecher. 2013. “A strategy for quality and uncertainty assessment of long-term eddy-covariance measurements.” Agric Forest Meteorol 169: 122–35. https://doi.org/10.1016/j.agrformet.2012.09.006.

Moffat, A. M., and others. 2007. “Comprehensive comparison of gap-filling techniques for eddy covariance net carbon fluxes.” Agricultural and Forest Meteorology 147: 209–32.

Schotanus, P., F. T. M. Nieuwstadt, and H. A. R. DeBruin. 1983. “Temperature measurement with a sonic anemometer and its application to heat and moisture fluctuations.” Boundary-Layer Meteorol 26: 81–93.

Webb, E. K., G. I. Pearman, and R. Leuning. 1980. “Correction of the flux measurements for density effects due to heat and water vapour transfer.” Quart J Roy Meteorol Soc 106: 85–100.

Wilczak, J. M., S. P. Oncley, and S. A. Stage. 2001. “Sonic anemometer tilt correction algorithms.” Boundary-Layer Meteorol 99: 127–50.