## 5.8 Temporal Analysis

Temporal statistical analysis enables you to examine and model the behavior of a variable in a data set over time (for example, to determine whether and how concentrations are changing over time). This behavior can be modeled as a function of previous data points of the same series, with or without extraneous, random influences (such as an earthquake or a new release). Common temporal analyses discussed below include time series plots, one-way ANOVA, the sample autocorrelation function, the rank von Neumann ratio test, seasonality correlations, and the seasonal Mann-Kendall test. Table F-4 includes information about checking assumptions for multi-sample tests.

### 5.8.1 Time series plots

The time series plot provides a graphical view of the raw data. Time is plotted on the x-axis, and the observations from one or more data series are plotted on the y-axis. This method is a typical exploratory data analysis technique to evaluate temporal, directional, or stationarity aspects of data (Unified Guidance). See Section 5.1.1: Time Series Methods of this document for a complete overview of time series plots.

See Figures 9-1, 14-1, and 14-2, Unified Guidance.

### 5.8.2 One-way ANOVA

Analysis of variance (ANOVA) is a general-purpose statistical approach used to compare data from three or more populations (with the data divided into one group/subset per population). Because of its flexibility and generality, ANOVA has utility for spatial analyses (for example, measuring contaminant level differences across multiple wells/sampling points), temporal analyses (for example, evaluating seasonality or temporal correlations across sampling events), as well as diagnostic testing (for example, testing for equal variances or identifying significant spatial variation).

For temporal analysis, the statistical populations to be compared by ANOVA represent distinct time periods, rather than distinct sampling points as in a spatial analysis. For instance, in cases of apparent seasonality at an individual well, each season (for example, spring or fall) is treated as a distinct population. To test for seasonality, each data subset must include representative observations from each distinct season, with a minimum of one sampling event per season collected over a period of at least three years.

When evaluating data sets for temporal patterns due to factors other than seasonality (but which impact a set of wells in common), each sampling event is treated as a separate population. The data are pooled across sampling points and then grouped/divided by sampling event. The one-way ANOVA then compares the average levels per sampling event to look for differences between events that signify temporal patterns common to the set of wells.

In all parametric one-way ANOVA analyses, regardless of how the data are grouped into subsets, the test (parametric F-test) returns an F-ratio statistic and an associated p-value. A large F-ratio (and small p-value) indicates that the observed differences between the subsets of data are larger than expected based on chance alone, whereas an F-ratio close to one (large p-value) suggests that the differences may be due to random variation. If the p-value falls below the significance level of the test, the null hypothesis of equal population means is rejected.
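As a minimal sketch of the F-ratio described above, the following standard-library Python computes the between-group and within-group mean squares for data grouped by season. The function name `anova_f_ratio` and the concentrations are hypothetical, chosen for illustration only. The p-value still has to be obtained from an F-distribution table or a statistics package using the two returned degrees of freedom.

```python
from statistics import mean


def anova_f_ratio(groups):
    """Return (F-ratio, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per population
    (for example, one list per season or per sampling event).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical concentrations (mg/L) at one well, grouped by season
by_season = [
    [4.1, 3.8, 4.4],  # winter
    [6.0, 5.7, 6.3],  # spring
    [5.2, 4.9, 5.5],  # summer
    [4.0, 4.3, 3.9],  # fall
]
f_ratio, df_b, df_w = anova_f_ratio(by_season)
print(f"F = {f_ratio:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

The sketch does not check the normality, independence, or equal-variance assumptions; those checks need to be made separately before the F-ratio is interpreted.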

The Kruskal-Wallis test is a nonparametric counterpart to one-way ANOVA that does not require normality of the ANOVA residuals. In this version, ranks of the data are used instead of the observed measurements, and an H-statistic is produced instead of an F-ratio, but the basic thrust of the test is the same. Average ranks are computed for each group being compared. If the differences in rank averages are larger than expected by random variation, the H-statistic will be large (with correspondingly small p-value), indicating a probable difference in the populations.
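The rank-based calculation can be sketched as follows, assuming the usual textbook form of the H-statistic with mid-ranks for ties and the standard tie correction. The helper names `midranks` and `kruskal_wallis_h` are illustrative, not taken from the Unified Guidance. H is compared with a chi-square distribution with (number of groups - 1) degrees of freedom, looked up in a table or statistics package.

```python
from collections import Counter


def midranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H-statistic."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)

    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: shrinks toward 1 as ties become rarer
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


# Hypothetical measurements grouped by sampling event
events = [[2.1, 2.4, 1.9], [3.0, 3.3, 2.8], [2.2, 2.0, 2.5]]
print(f"H = {kruskal_wallis_h(events):.2f}")
```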

For diagnostic testing, one-way ANOVA can aid decisions about whether to conduct interwell tests (comparisons between spatially separated monitoring wells) or intrawell tests (comparisons of measurements over time at one monitoring well) by identifying the presence of significant spatial variability among a group of sampling points. Spatial variability exists when the distribution or pattern of concentration measurements changes from well location to well location, most typically in the form of differing mean concentrations (Unified Guidance). If the spatial variation is a natural phenomenon, the ANOVA results can help justify use of intrawell groundwater tests. Conversely, the lack of significant spatial variation can point to the use of interwell upgradient-downgradient testing.

Another variation of one-way ANOVA, Levene's test, can also diagnose whether or not multiple populations have similar variances (see Chapter 11.2, Unified Guidance). In Levene’s test, the absolute values of the residuals from a set of wells are treated as the ‘data’ in a standard one-way ANOVA. This tests whether the typical deviations from the mean of each well differ significantly among the wells, thus signifying differing levels of variance.
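The construction described above can be sketched in a few lines: take each well's absolute deviations from its own mean, then run an ordinary one-way ANOVA F-ratio on those deviations. The names `f_ratio` and `levene_f` and the well data are hypothetical. The resulting F is referred to an F table with (wells - 1) and (observations - wells) degrees of freedom.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F-ratio for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def levene_f(groups):
    """Levene's statistic: ANOVA applied to absolute residuals per well."""
    abs_residuals = [[abs(x - mean(g)) for x in g] for g in groups]
    return f_ratio(abs_residuals)


# Hypothetical concentrations at three wells
wells = [
    [5.0, 5.2, 4.9, 5.1],  # tightly clustered
    [4.0, 6.5, 3.2, 6.9],  # widely scattered
    [5.5, 5.1, 5.8, 5.3],
]
print(f"Levene F = {levene_f(wells):.2f}")
```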

- This method can be used to evaluate stationarity (lack of a shift of the means over time). Stationarity exists when the population being sampled has a constant mean and variance across time and space (Unified Guidance).
- Use this method to check for the absence of spatial variability when evaluating temporal variations.
- Study Question 5: Is there a trend in contaminant concentrations?
- Study Question 6: Is there seasonality in the concentrations?

- Residuals of the data must follow a normal distribution.
- Observations are statistically independent over time.
- Data must have constant variance.

- Measurements collected at each well are performed on dates common to all wells.
- Data may need to be transformed (for example, using a Box-Cox power transformation) if the assumptions of normality and equal variances are violated, and subsequently tested to evaluate the validity of the assumptions on the transformed scale.
- A level of confidence, such as 95%, must be selected; this level of confidence may be determined by federal or state regulatory requirements or guidance, or by project-specific needs.
- Small sample sizes make it difficult to test the assumptions and may not allow for sufficient power. To test for seasonality, a minimum of three sampling events per distinct season, with events spanning at least three years, is recommended.
- A minimum of 8 to 10 measurements is recommended when evaluating temporal variation due to factors other than seasonality.
- This test may be sensitive to outliers (values unusually discrepant from the rest of a series of observations). Data should be checked for outliers before applying this test; see Section 5.10.
- If the purpose of the one-way ANOVA is to design an interwell prediction limit which accounts for temporal dependence, spatial variability must not be present.
- See Section 5.7 for information regarding handling of nondetects.
- If the data cannot be normalized, a similar test for a temporal or seasonal effect can be performed using the nonparametric Kruskal-Wallis test.

This method can be applied without specialized statistical software.

Use of one-way ANOVA for spatial variability and an example problem are discussed in Chapter 13.2.2, Unified Guidance. Use of ANOVA to improve parametric intrawell tests is described in Chapter 13.3, Unified Guidance. Chapter 14.2.2, Unified Guidance discusses application of ANOVA for temporal effects and also provides a sample problem. A more generalized discussion of ANOVA is provided in Chapter 17.1, Unified Guidance.

### 5.8.3 Sample Autocorrelation Function

Autocorrelation is a correlation of a variable, such as a contaminant concentration, with itself over a series of time steps. Autocorrelation may be used to evaluate the frequency of sampling (for example, if subsequent sampling events are correlated, a reduction in sampling frequency may be supported). By computing the first few sample autocorrelation coefficients (ACFs), a plot of ACFs versus the time lags can be prepared; this graph is known as a correlogram (Figure 5-2). The shape of the ACF plot provides information regarding the variability of a given value over time.

A stationary but nonrandom series will often exhibit a large first-order autocorrelation coefficient, followed by one or two other significant coefficients, with the remaining coefficients tending towards zero. A seasonal series will exhibit a sinusoidal ACF. If the first-order autocorrelation coefficient is significant and negative, the series tends to alternate between high and low values. If the series contains a trend, the ACF coefficients will not drop to zero with increasing lag.
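A minimal sketch of the sample autocorrelation coefficients follows, assuming the common estimator in which the lag-k cross-products of deviations from the mean are divided by the total sum of squared deviations. The function name `sample_acf` and the quarterly values are hypothetical, and the ±2/√n screening bound printed at the end is a widely used rule of thumb rather than a formal test.

```python
import math


def sample_acf(series, max_lag):
    """Return sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    return [
        sum((series[t] - m) * (series[t + k] - m) for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


# Hypothetical quarterly concentrations with a repeating seasonal pattern
quarterly = [10, 14, 9, 5, 11, 15, 10, 6, 10, 14, 9, 5]
acf = sample_acf(quarterly, max_lag=6)
bound = 2 / math.sqrt(len(quarterly))  # rough screening bound (assumption)
for lag, r in enumerate(acf, start=1):
    flag = "*" if abs(r) > bound else ""
    print(f"lag {lag}: r = {r:+.2f} {flag}")
```

Plotting the returned coefficients against lag gives the correlogram; a wave-like pattern that repeats at the seasonal lag corresponds to the sinusoidal shape described above.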

- A comparison of the ACF coefficients that shows correlated consecutive time steps may support a reduction in sampling frequency.
- Study Question 6: Is there seasonality in the concentrations?
- Study Question 9: Is the sampling frequency appropriate (temporal optimization)?

Data must follow a normal distribution or be reasonably symmetric.

- Check the data for normality.
- Use of at least 8 to 10 measurements is recommended, although a greater number of measurements may be necessary to obtain the desired confidence level or power.
- If outliers are suspected, examine the data with a probability plot, Dixon's test, or Rosner's test. Remove outliers from the data set.
- Select a level of confidence, such as 95%; this level of confidence may be determined by federal or state regulatory requirements or guidance.
- If you suspect autocorrelation of a series, change the sampling frequency. The smallest lag between sampling events with no discernible serial correlation indicates the minimum interval between sampling events needed for statistical independence. If you suspect seasonal autocorrelation (at the appropriate lag), see Section 5.8.5: Seasonality Correlations. An ACF plot exhibiting a sinusoidal shape indicates seasonality.

Requires a higher level of analysis than other methods to interpret results.

Further information on the sample autocorrelation function and an example problem are provided in Chapter 14.2.3, Unified Guidance. Partial autocorrelation coefficients can also be computed (see Chatfield 1994). Significant autocorrelation and partial autocorrelation coefficients can be combined for the construction of a Box-Jenkins autoregressive integrated moving average (ARIMA) time series model (Box and Jenkins 1976). An ARIMA model consists of autoregressive parameters (explaining the time series observation with past values) and moving average parameters (random shocks with an error structure that is usually Gaussian); the integrated portion of the model refers to the order of differencing (subtracting one observation from the previous one) used to simulate stationarity in nonstationary data. Such a model can be used for prediction purposes. For two series, a cross-correlation function (CCF) can be constructed (Box and Jenkins 1976).

See Example 14-3, Unified Guidance (which includes Figure 14-5, a sample autocorrelation function), and case example A.3.

### 5.8.4 Rank von Neumann Ratio Test

The rank von Neumann ratio is used to evaluate seasonality in a data set and is constructed from the sum of differences between the ranks of lag-1 data pairs (for example, data pairs generated by comparison of data collected in a monitoring event to data generated in the previous monitoring event). When these differences are small, the pattern of observations of the data series will be somewhat predictable, and the data series is likely to be autocorrelated. Large differences indicate no autocorrelation. The test is formally conducted by comparing the Rank von Neumann ratio to the tabulated critical points (at a given sample size and desired significance level; see Table 14-1 of Appendix D, Unified Guidance). The Rank von Neumann Ratio test is a nonparametric method.
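As a sketch of the construction described above, the ratio can be computed by ranking the series (mid-ranks for ties), summing the squared differences between ranks of consecutive observations, and dividing by n(n² - 1)/12. The function names and the example series are hypothetical. For independent data the ratio tends to fall near 2; small values point toward positive autocorrelation. The formal decision still requires the tabulated critical points, which are not reproduced here.

```python
def midranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    return ranks


def rank_von_neumann_ratio(series):
    """Return the rank von Neumann ratio for a time-ordered series."""
    n = len(series)
    ranks = midranks(series)
    # Sum of squared differences between ranks of lag-1 data pairs
    numerator = sum((ranks[i] - ranks[i - 1]) ** 2 for i in range(1, n))
    return numerator / (n * (n * n - 1) / 12)


# Hypothetical measurements from a single well, in sampling order
series = [3.1, 3.4, 3.3, 3.9, 4.2, 4.0, 4.6, 4.4, 5.0, 5.3]
print(f"rank von Neumann ratio = {rank_von_neumann_ratio(series):.2f}")
```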

Study Question 6: Is there seasonality in the concentrations?

No distributional assumptions are required; however, frequent nondetects in the data may lead to a poor estimation of the Rank von Neumann ratio and critical points.

- Check the data for nondetects; replace each tied value by its mid-rank.
- Use of a minimum of 10-12 observations from a single well is recommended.
- Check that the data are not autocorrelated.
- Select a level of confidence, such as 95% (or 99%); this level of confidence may be determined by federal or state regulatory requirements or guidance, or by project-specific needs.

- This method is easily applied to nonparametric tests.
- This method identifies simple temporal correlations.
- You must apply this method to a single series of data at a single sampling point, not to multiple series of data.
- Use this method only on data sets with few nondetects.
- The rank von Neumann ratio is more powerful than certain other nonparametric tests of statistical independence, and it correctly detects dependent data for a variety of underlying data distributions.

The Rank von Neumann ratio test is discussed further in Chapter 14.2.4, Unified Guidance, which also gives an example problem. The literature on time series analysis is extensive for other potential tests as well (such as Runs test, Durbin-Watson test, and Kendall’s tau).

See Example 14-4, Unified Guidance.

### 5.8.5 Seasonality Correlations

If the seasonal pattern in a data series is highly regular, then you can model the data with a sinusoidal function. Moving averages and lag-based differencing (for example, lag-4 for quarterly data, or lag-12 for monthly data) can be used to evaluate the data; see Chapter 14.3.3.1, Unified Guidance. When a significant temporal dependence is identified across a group of wells (for instance, by one-way ANOVA), the adjustment process (moving averages) can be conducted simultaneously for several sets of wells as described in Chapter 14.3.3.2, Unified Guidance.
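Lag-based differencing can be illustrated with a short sketch: each value has the value from the same season one cycle earlier subtracted from it, which removes a stable seasonal pattern and leaves the year-over-year change. The function name `lag_difference` and the quarterly values are hypothetical.

```python
def lag_difference(series, lag):
    """Return series[t] - series[t - lag] for every t with a full lag behind it."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical quarterly data: a fixed seasonal pattern plus a slow rise
quarterly = [10, 20, 15, 5, 11, 21, 16, 6, 12, 22, 17, 7]

# Lag-4 differencing compares each quarter with the same quarter a year earlier
deseasonalized_change = lag_difference(quarterly, lag=4)
print(deseasonalized_change)
```

The differenced series is shorter than the original by `lag` observations, which is one reason an adequate record length matters before seasonal adjustment is attempted.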

- Use this method to de-seasonalize data.
- Study Question 6: Is there seasonality in the concentrations?

Seasonal correction is only appropriate for wells where a cyclical pattern is clearly present.

- Use at least a two-year period of data.
- A minimum of three measurements per season is recommended for the application of seasonal corrections.
- If you suspect outliers, examine the data using a probability plot, Dixon’s test, Rosner’s test, or another appropriate method.
- See Section 5.7 for information regarding the handling of nondetects.
- For interwell comparisons, the same seasonality effect must be present in all wells.
- For interwell comparisons (such as simultaneously collected background and downgradient data), a seasonal correction may not be necessary if the background and downgradient values are on the same cycle. To dispense with seasonal correction in this case, average groundwater velocities must be high enough for groundwater to migrate through both background and downgradient wells in the same season.

A description of how to de-seasonalize a data set (or multiple data sets) is given in Chapter 14.3, Unified Guidance. De-seasonalizing data can also be conducted by differencing (see Box and Jenkins 1976). A two-way ANOVA may be conducted to test for both spatial variation and temporal autocorrelation (Davis 1994). Example problems are provided in Chapter 14.3.3, Unified Guidance.

### 5.8.6 Seasonal Mann-Kendall Test

The seasonal Mann-Kendall test is a simple modification to the Mann-Kendall test for trend that accounts for seasonal fluctuations. The data series is divided into subsets, with each subset representing the measurements collected during a common season (for example, all first-quarter sampling events). The standard Mann-Kendall test is performed separately on each subset, with a test statistic computed for each individual subset. The separate, seasonal statistics are subsequently summed to arrive at the overall Mann-Kendall statistic, which is then standardized and compared to the critical points of the standard normal distribution.
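The summation described above can be sketched as follows, under several stated assumptions: the subsets are grouped by season, the seasons are treated as independent so that their variances add, no tie correction is applied to the variance, and the usual continuity correction of one unit is used when standardizing. The function names and the data are hypothetical.

```python
import math


def mann_kendall_s(values):
    """Mann-Kendall S: concordant minus discordant pairs in time order."""
    s = 0
    for j in range(len(values) - 1):
        for k in range(j + 1, len(values)):
            diff = values[k] - values[j]
            s += (diff > 0) - (diff < 0)
    return s


def seasonal_mann_kendall(by_season):
    """Return (S, Z, two-sided p) from per-season time-ordered lists."""
    s_total, var_total = 0, 0.0
    for values in by_season.values():
        n = len(values)
        s_total += mann_kendall_s(values)
        var_total += n * (n - 1) * (2 * n + 5) / 18  # no tie correction

    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    # Normal approximation via the error function
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return s_total, z, p


# Hypothetical data: one list per season, ordered by year
data = {
    "Q1": [5.0, 5.4, 5.9, 6.1],
    "Q2": [8.2, 8.1, 8.8, 9.0],
    "Q3": [6.5, 6.9, 7.0, 7.6],
    "Q4": [4.1, 4.0, 4.6, 4.9],
}
s, z, p = seasonal_mann_kendall(data)
print(f"S = {s}, Z = {z:.2f}, p = {p:.4f}")
```

Because each season is ranked only against itself, a large difference between seasonal levels does not register as a trend; only consistent year-over-year movement within seasons does.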

- Study Question 5: Is there a trend in contaminant concentrations?
- Study Question 6: Is there seasonality in the concentrations?

The long-term mean of the data series should be stationary.

- The sample series should span at least three seasons, with an observable seasonal pattern.
- Each season should include at least three measurements in order to compute the Mann-Kendall statistic.
- If you suspect outliers, examine the data using a probability plot, Dixon’s test, Rosner’s test, or another appropriate method.
- See Section 5.7 for information regarding the handling of nondetect data.
- A normal approximation to the overall Mann-Kendall test statistic must hold.
- Select a level of confidence, such as 95%; this level of confidence may be determined by federal or state regulatory requirements or guidance, or by project-specific needs.
- You can also choose to remove the seasonal autocorrelation first (see Section 5.8.5: Seasonality Correlations) and subsequently conduct a formal trend test on the entire series.

This is a nonparametric test.

See Chapter 14.3.4, Unified Guidance for additional information and a sample problem. See also Gilbert (1987) for a description of the seasonal Mann-Kendall trend test and slope estimator.

### 5.8.7 Temporal Optimization (Cost-Effective Sampling and Iterative Thinning)

Temporal optimization is best represented by the cost-effective sampling method (CES; Ridley et al. 1995; Ridley and McQueen 2005) and later modifications to this approach. In CES, a linear trend is estimated for each chemical-well pair and then classified according to the slope of the apparent trend as well as how much variation exists around the trend. Trends with relatively ‘flat’ slopes (small rates of change) and low variation are recommended for less frequent sampling, while trends with higher slopes or higher degrees of variation are targeted for more frequent sampling. The overriding principle is to (1) sample more frequently at locations where the apparent changes are more dynamic and associated with the greatest statistical uncertainty, and (2) sample less frequently when the trend is changing little and is statistically more certain (that is, less variable).
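The classification idea can be illustrated with a heavily simplified sketch that fits a least-squares line and compares its slope and residual scatter against cutoffs. This is not the published CES decision logic: the function names, the two thresholds `flat_slope` and `low_sd`, and the data are all hypothetical and would need to be set from project-specific criteria.

```python
import math


def linear_trend(times, values):
    """Return (slope, residual standard deviation) of a least-squares line."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / sxx
    intercept = v_mean - slope * t_mean
    ss_resid = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    return slope, math.sqrt(ss_resid / (n - 2))


def suggest_frequency(times, values, flat_slope, low_sd):
    """Flat, low-variation trends -> less frequent; otherwise more frequent.

    flat_slope and low_sd are hypothetical cutoffs, not CES values.
    """
    slope, resid_sd = linear_trend(times, values)
    if abs(slope) <= flat_slope and resid_sd <= low_sd:
        return "less frequent"
    return "more frequent"


# Hypothetical chemical-well pair: years since start, concentration
years = [0, 1, 2, 3, 4, 5]
conc = [5.0, 5.1, 4.9, 5.0, 5.2, 5.0]
print(suggest_frequency(years, conc, flat_slope=0.5, low_sd=0.5))
```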

The second approach is the iterative thinning method (Cameron 2004). Iterative thinning examines whether sampling frequencies can be reduced due to temporal redundancy in the sampling events. This approach identifies redundancy by first estimating a baseline trend using the full data set, after which the trend is repeatedly re-estimated using subsets of the full data to identify the average number of data points needed to accurately reconstruct the baseline. The computations in iterative thinning create a series of ‘what if’ scenarios estimating the nature of the trend that would have been identified if only some of the existing data had been sampled. The overriding principle in iterative thinning is that if a trend can be accurately reconstructed using fewer sampling events, the optimal sampling frequency should be based on this smaller number.
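The 'what if' logic can be sketched in simplified form: estimate a baseline trend from the full record, then repeatedly re-estimate it from random subsets and count how often the subset trend stays close to the baseline. This sketch assumes a linear trend and an absolute slope tolerance, which are illustrative simplifications and not the procedure implemented in GTS; all names, parameters, and data are hypothetical.

```python
import random


def ols_slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    return sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / sxx


def thinning_agreement(times, values, keep, slope_tol, trials=200, seed=0):
    """Fraction of random subsets of size `keep` whose slope is within
    `slope_tol` of the full-data (baseline) slope."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    baseline = ols_slope(times, values)
    hits = 0
    for _ in range(trials):
        idx = sorted(rng.sample(range(len(times)), keep))
        subset_slope = ols_slope([times[i] for i in idx], [values[i] for i in idx])
        if abs(subset_slope - baseline) <= slope_tol:
            hits += 1
    return hits / trials


# Hypothetical quarterly record: slow rise with some scatter
times = list(range(16))
values = [2.0 + 0.3 * t + (0.4 if t % 3 == 0 else -0.2) for t in times]
for keep in (12, 8, 4):
    share = thinning_agreement(times, values, keep, slope_tol=0.05)
    print(f"keep {keep:2d} of {len(times)} events: {share:.0%} of subsets agree")
```

The smallest subset size that still reproduces the baseline in most trials suggests how far sampling might be thinned; what counts as "most" and as "close" is a project decision.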

Study Question 9: Is the sampling frequency appropriate (temporal optimization)?

The basic assumptions underlying temporal optimization methods are similar to those for most trend tests. CES and its modifications assume the trend is linear. Also, if linear regression is used to measure the trend, the trend residuals must be normal and homoscedastic. Iterative thinning can be performed on linear or non-linear trends, but typically requires at least 8 observations from which to form the (non-linear) baseline trend.

- CES and its modifications generally require at least 4 observations per well; iterative thinning usually requires at least 8 observations per well in order to estimate non-linear trends.
- CES is found in specialized optimization software packages such as MAROS and 3TMO (3-Tiered Monitoring and Optimization Tool). Iterative thinning is deployed in the geostatistical temporal-spatial (GTS) software and VSP.

- CES and its modifications can employ either parametric (linear regression) or nonparametric (Mann-Kendall) trend methods.
- Iterative thinning, as deployed in GTS software, can be applied to either linear or non-linear trends.

Publication Date: December 2013