C.1 Study Question 1: What are the background concentrations?
Background is defined as groundwater that is not influenced by releases from a site (Section 4.2.1). The Unified Guidance characterizes background as natural or baseline groundwater quality at a site, which can be described by upgradient, historical, or sometimes cross-gradient water quality. Specifically, a background groundwater data set may represent either a location or a time period that has not been influenced by a release from the site.
While it may be convenient to think of the background concentration as a single value, the concentration of a chemical will naturally vary both spatially and temporally. The analytical process introduces additional variability. Because of both natural and introduced variability, background can best be understood as a distribution, prompting the question: “Is the distribution of concentrations consistent with the distribution of concentrations in background?”
Background data may be collected by either of two methods. Data may be collected from a number of different wells (interwell data collection, in which monitoring wells separated spatially are compared). Data may also be collected from the same well over time (intrawell data collection, in which measurements at one monitoring well are compared over time; see Section 3.6.4). If interwell comparisons are desired, a hydrogeologic assessment must be performed to evaluate whether the upgradient and downgradient wells are appropriately grouped. For example, are the wells screened in the same geologic formation (Section 4.2.2)? If representative upgradient wells are not available for use as background, or if spatial variability exists among background wells, intrawell comparisons may be better for evaluating background conditions. Spatial variability exists when the distribution or pattern of concentration measurements changes from well location to well location, most typically in the form of differing mean concentrations; it may be natural or synthetic, depending on whether it is caused by natural or artificial factors (Unified Guidance). Intrawell evaluations assume that the background time period is uninfluenced by chemicals from the site.
Chemicals present at background concentrations in the groundwater may be either naturally occurring or anthropogenic. Naturally occurring substances present in the environment are those that are not a result of human activity. "Anthropogenic substances are natural and human-made substances present in the environment as a result of human activities not specifically related to the site in question" (USEPA 2002c).
Tests used to determine if a particular data set is consistent with background assume appropriate data collection and focus on data characterization.
This question is usually relevant in the release detection, site characterization, and closure stages of the project life cycle.
Selecting and Characterizing the Data Set
Verify that the collected data set represents background. Using graphical methods (Section 3.3 and Section 5.1) and distributions, establish that a collected data set is consistent with background (natural or anthropogenic).
- Identify outliers (values unusually discrepant from the rest of a series of observations) using box plots, probability plots, Dixon’s test, or Rosner’s test.
- Plot data sets on maps and in three dimensions (vertical, horizontal, and time) and examine data sets for sources of contamination, important areas that have not been sampled, spatial correlations or trends in the data, and the location of suspected outliers. See Section 3.3.3: Exploratory Data Analysis.
- Check that the mean (the arithmetic average of the sample set) and the variance (the square of the standard deviation) are stable over the data set time frame, and that seasonality in the data is accounted for and considered in the analysis. This stability is called stationarity: the population being sampled has a constant mean and variance across time and space (Unified Guidance).
- Analyze the data for significant trends.
- Note and appropriately address nondetects in the data set, that is, laboratory analytical results known only to be below the method detection limit (MDL) or reporting limit (RL) (see Section 5.7: Managing Nondetects in Statistical Analyses).
- See also Section 4.1: Considerations for Statistical Analysis.
Statistical Methods and Tools
- Probability plots show the entire distribution of measured concentrations, from the lowest value to the highest value, against the percentile of that distribution.
- Probability plots are useful for identifying the data distribution, the range (the difference between the largest and smallest values in the data set), bunching, and outliers.
The goal of this test is to verify that the data set appears consistent with background. Examine the data set distribution for its range, its visual skew (or, inversely, its bunching), and for outliers. Do these elements support the conclusion that the data set appears to be drawn from a single population? You can use a scatter plot to examine a data set for the same parameters as outlined for the probability plot (Section 5.1.3).
Begin data analysis with a probability plot to identify potential weaknesses in the collected data set. Potential weaknesses include problems such as the data set not being distributed as expected. The data also may be bunched, or may have extreme outliers.
If the data set exhibits the above characteristics, you may need to investigate and address outliers, collect additional data, or select new sample points from which to collect potential background data.
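The probability plot check described above can be sketched in code. The following is a minimal Python illustration, not a method prescribed by this guidance: it pairs the ordered concentrations with standard normal quantiles and reports the correlation between the two, which approaches 1 when the plotted points fall close to a straight line. The normal reference distribution, the Blom plotting positions, the function name, and the concentrations are all assumptions made for illustration.

```python
from statistics import NormalDist, mean


def normal_probability_plot(values):
    """Return (normal_quantiles, ordered_values, r) for a normal probability plot.

    r is the Pearson correlation between the ordered data and the
    theoretical quantiles; values near 1 suggest a roughly straight plot.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 values")
    std_normal = NormalDist()
    # Blom plotting positions (an assumed convention; others exist).
    quantiles = [std_normal.inv_cdf((i - 0.375) / (n + 0.25))
                 for i in range(1, n + 1)]
    mq, mx = mean(quantiles), mean(ordered)
    sxy = sum((q - mq) * (x - mx) for q, x in zip(quantiles, ordered))
    sqq = sum((q - mq) ** 2 for q in quantiles)
    sxx = sum((x - mx) ** 2 for x in ordered)
    if sxx == 0:
        raise ValueError("all values are identical")
    return quantiles, ordered, sxy / (sqq * sxx) ** 0.5


# Hypothetical background concentrations (mg/L), for illustration only.
background = [3.1, 2.4, 2.9, 3.6, 2.7, 3.3, 3.0, 2.8]
q, x, r = normal_probability_plot(background)
for qi, xi in zip(q, x):
    print(f"{qi:6.2f}  {xi:5.2f}")
print(f"probability plot correlation: {r:.3f}")
```

Plotting the printed pairs gives the probability plot itself. A low correlation, visible bunching, or a point far off the line is a prompt for the follow-up steps above rather than a decision on its own.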
- Time series plots show measured concentrations over time.
- Time series plots are useful for assessing trends, patterns, and inconsistencies in the data set.
The goal of this test is to verify that the concentration of a chemical is in steady-state equilibrium over time, and that broad variations in chemical concentration are the result of identifiable events or seasonality.
Conduct this examination to identify potential trends over time. While background groundwater data may have seasonal variation in the concentration of many chemicals, seasonal variation should occur within a range and should be repetitive within the range over time. Trends in background data may occur due to changing hydrogeologic conditions or influences from upgradient sources. Some statistical tests require that background data remain stable, in which case background data should not demonstrate either an increasing or decreasing trend over time.
If concentrations exhibit either increasing or decreasing trends over time that are not attributable to natural, cyclical events, then new sample points must be selected or monitoring may need to be extended until a stable trend is observed. Historical data that are no longer representative may be removed from the data set. Alternatively, trends in upgradient wells may indicate that intrawell tests are preferable. Additional information is presented in Chapter 5.2.5 and Chapter 5.3.4 of the Unified Guidance.
Note that the time series plot is a specific kind of scatter plot: a graphic of data collected at regular time intervals, with measured values on one axis and time on the other, typically used in exploratory data analysis to evaluate temporal, directional, or stationarity aspects of data (Unified Guidance). If the goal is instead to examine the relationship between two variables, for example, the correlation (the degree to which two sets of variables vary together) between the concentration of chromium and the concentration of iron at a site, then refer to Section 5.1.3: Scatter Plots.
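A visual check of the time series plot can be supplemented with a formal trend test. This section does not name one; the sketch below uses the Mann-Kendall test, a widely used nonparametric trend test, purely as an illustration. It assumes the values are supplied in time order, applies a tie correction, and uses a normal approximation for the p-value, which is rough for short records. The function name and the data are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def mann_kendall(values):
    """Two-sided Mann-Kendall trend test (normal approximation).

    values must be in time order. Returns (S, z, p_value).
    """
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 observations")
    # S counts later-minus-earlier increases minus decreases.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    if var_s == 0:
        return s, 0.0, 1.0
    # Continuity correction: shrink S toward zero by one.
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, z, p_value


# Hypothetical quarterly concentrations (mg/L), for illustration only.
steadily_rising = [2.0, 2.1, 2.3, 2.4, 2.6, 2.7, 2.9, 3.0, 3.2, 3.3, 3.5, 3.6]
seasonal_only = [5, 7, 5, 7, 5, 7, 5, 7, 5, 7, 5, 7]
print(mann_kendall(steadily_rising))
print(mann_kendall(seasonal_only))
```

A small p-value for the first series points to a monotonic trend, while the repeating pattern in the second series does not. Seasonality is not removed by this sketch, so strongly seasonal data would need to be handled before drawing conclusions.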
- Outlier tests examine the data set for extreme concentrations (outliers).
- These tests are useful for ensuring that the data set is representative of background and does not include nonbackground samples.
The goal of this test is to determine if any of the samples in the data set appear unrepresentative of the background data set. A statistical outlier in the background data set may indicate that one of the background samples was collected in a location that is not truly background.
If the concentration of a sample indicates that the sample is outside the background data set, then that data point may distort the statistical analysis. Statistical outliers, however, may represent real variations in the background data and should not be automatically removed unless there is a reason to suspect an error or data quality issue (see Chapter 5.2.3, Unified Guidance). Use professional judgment in evaluating whether or not a statistical outlier should be retained in a background data set (see Section 5.10).
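As a simple illustration of screening for candidate outliers, the sketch below applies the box plot rule, flagging values that lie more than 1.5 times the interquartile range beyond the quartiles. It is a screening aid only, not a formal test such as Dixon’s or Rosner’s, whose critical values are not reproduced here. The 1.5 multiplier, the quartile convention, the function name, and the data are assumptions made for illustration.

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Flag values outside the box plot fences.

    Returns (lower_fence, upper_fence, flagged_values).
    """
    # Quartiles by the "inclusive" convention (an assumed choice).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flagged = [v for v in values if v < lower or v > upper]
    return lower, upper, flagged


# Hypothetical background concentrations (mg/L), for illustration only.
background = [2.4, 2.7, 2.8, 2.9, 3.0, 3.1, 3.3, 9.8]
lower, upper, flagged = boxplot_outliers(background)
print(f"fences: {lower:.2f} to {upper:.2f}; flagged: {flagged}")
```

A flagged value is only a candidate for review. Consistent with the paragraph above, it should be retained unless there is a reason to suspect an error or a data quality issue.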
Interpretation of Results and Associated Uncertainty
Natural variation, anthropogenic variation, or both must be understood before developing background values. The expected distribution, the character of the probability plot, the potential concentration variation across seasons as well as over time, and the occurrence of apparent outliers are all a function of the chemical and its environmental setting. Examine published studies regarding the occurrence of the chemical to determine which analyses should be emphasized or more heavily weighted in decision making.
Based on the qualitative examination of the background data set, you may choose to analyze the data set and present its basic statistical characteristics. See discussions on characterizing the data set presented in Section 3.3.3, Section 5.1, and Section 5.6.
The background value is not determined only once. The individual wells or groups of wells that were used to support background determination or comparisons may develop trends. These trends could result from new contaminant sources influencing previously unimpacted wells. These trends could also result from changes in groundwater flow or chemistry. Trends that are not sustained could also arise by chance.
Regardless of the reason for changes, background data must be updated. How often the data must be reconsidered for update depends on site-specific parameters such as groundwater flow velocity, nearness of other potential sources of contamination, and geochemistry. Frequency of updating the background data is also dependent on having sufficient new data to statistically identify a change; the Unified Guidance suggests four to eight new data points. The new data may be compared to the historical data by either the parametric t-test or the nonparametric Wilcoxon rank sum test, depending on the distribution of the pooled intrawell data. A nonparametric test does not depend on knowledge of the distribution of the sampled population (Unified Guidance). If there is no significant difference between the new data and the historical data, then the new data can be considered background. Additionally, the absence of a trend in the data when historical and new data are combined is indicative of background; see Section 5.5, Section 5.5.1, and Section 5.5.2.
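The comparison of new data with historical data can be sketched for the nonparametric case. The example below computes a two-sided Wilcoxon rank sum p-value by enumerating every way the new-sample ranks could have been drawn from the pooled ranks, which is practical for the four to eight new points mentioned above. The exact enumeration, the use of midranks for ties, the function names, and the data are assumptions made for illustration; the guidance names the test but not this implementation.

```python
from itertools import combinations


def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_exact(historical, new):
    """Two-sided exact Wilcoxon rank sum test.

    Returns (rank_sum_of_new, p_value).
    """
    pooled = list(historical) + list(new)
    ranks = midranks(pooled)
    m = len(new)
    observed = sum(ranks[len(historical):])
    expected = m * (len(pooled) + 1) / 2
    observed_dev = abs(observed - expected)
    total = extreme = 0
    # Enumerate every possible set of m ranks for the new sample.
    for combo in combinations(ranks, m):
        total += 1
        if abs(sum(combo) - expected) >= observed_dev - 1e-9:
            extreme += 1
    return observed, extreme / total


# Hypothetical concentrations (mg/L), for illustration only.
historical = [2.4, 2.7, 2.8, 2.9, 3.0, 3.1, 3.3, 3.6]
new_similar = [2.6, 3.2, 2.95, 3.05]
new_elevated = [5.1, 5.4, 5.0, 5.6]
print(rank_sum_exact(historical, new_similar))
print(rank_sum_exact(historical, new_elevated))
```

A large p-value, as for the first set of new data, is consistent with adding the new points to background, while a small one, as for the second, indicates a shift that should be investigated first. The number of combinations grows quickly with sample size, so a normal approximation or a statistical package would be a more practical choice for larger data sets.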
Related Study Questions
Study Question 2: Are concentrations greater than background concentrations?
Key Words: Background, Compliance Monitoring, Concentration Comparisons, Release Detection, Site Characterization, Closure
Publication Date: December 2013