2. Regulatory Framework and Challenges for Groundwater Statistics
This section identifies issues and concerns to consider as you plan the use of statistics at a specific site. It presents an overview of state and federal regulatory requirements and example applications of statistical methods in the private sector. The section also introduces some general challenges in the use and application of statistics.
2.1 Regulatory Issues and Barriers
Inconsistent federal, state, and local regulations often complicate the evaluation of groundwater data using statistical methods. Existing regulations differ widely and may not reflect the “state of the science” or recognize the advantages and limitations of current statistical practices. The following summary of regulatory challenges is not an authoritative guide to the regulations, but rather an overview of the widely varying requirements.
2.1.1 State Regulatory Perspective
Based on the ITRC groundwater statistics survey (Appendix E), state guidance on the use of statistics ranges from “not well defined” to “overly prescriptive.” The use of statistics also differs among programs within a single state. The survey also revealed that some states are in the process of developing statistical guidance. In a few cases, specific programs reject statistical analysis outright.
While some states have adopted and been authorized to implement statistical practices recommended in the Unified Guidance for sites regulated under the Resource Conservation and Recovery Act (RCRA), these practices are not consistently carried through to other cleanup programs. State environmental cleanup programs may have requirements or guidance for statistical methods that predate the Unified Guidance and do not reflect contemporary practices. Because tools and approaches presented in this guidance document may differ from state and local guidance, regulators and the regulated community should closely coordinate planning and implementation of statistical evaluations.
Program Guidance
Existing program guidance is likely for the following topic areas of environmental project management:
• comparison to criteria
• trend analyses
The following are examples of questions that regulators and the regulated community may discuss:
• What are the state-specific requirements for the test or procedure that is planned to be used?
• Are there multiple programs within the state with differing requirements?
• Are there requirements that have been implemented on a local (county or municipal) basis?
• Are there any common procedures for which the state has no guidance or preference?
In instances where statistical evaluations are discouraged or prohibited, or guidance is not consistent with current best practices, provide results from statistical methods presented in this document to introduce alternatives that may be considered.
A snapshot of the variability in the statistical requirements of state programs is provided in the groundwater statistics survey (Appendix E). Contact the appropriate state regulatory agency to discuss the requirements for your particular project.
2.1.2 Federal Regulatory Perspective
USEPA’s Unified Guidance provides recommendations for statistical methods and strategies that can be used to demonstrate compliance with federal RCRA regulations for regulated units such as landfills and surface impoundments. Statistical evaluations under the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA) can also follow USEPA’s Unified Guidance. The data quality objectives, developed using USEPA’s seven-step data quality objective (DQO) process (USEPA 2006a), and the characteristics of the data sets (sample sizes and frequency of nondetects) usually drive the test selection. However, for the RCRA cleanup process and other remediation programs, statistical approaches from earlier USEPA documents may be inconsistent with either the USEPA Unified Guidance or generally accepted current best practices for statistical evaluations. In addition, USEPA has a publication for monitored natural attenuation (MNA) that includes information about using statistics in groundwater evaluations for MNA sites (USEPA 2011).
USEPA’s optimization strategy (USEPA 2007 and 2012) institutes changes to remedial programs to take advantage of newer tools that promote more effective and efficient cleanup, and to achieve verified protective cleanup faster, cleaner, greener, and cheaper. The strategy encourages the use of optimization techniques throughout the life cycle of site cleanup and acknowledges that these techniques have become numerous and are growing rapidly. In the cleanup process, statistical and geostatistical methods can be used in the early investigation stage to optimize sampling design (as in incremental sampling), at the design stage (using value engineering techniques), and during the remedial action stage.
The Remedial System Evaluation can be used to examine a site holistically to determine whether the remedy is on track for cleanup, to offer alternate approaches if the remedy is stalled, and to develop a completion strategy for the final disposition of the site. Specifically, quantitative analyses using statistical and geostatistical approaches are often used for optimization of groundwater monitoring programs. Recent green remediation advances have also been incorporated into each USEPA optimization technical support event. The benefits of optimization strategies include "more cost effective expenditures, lower energy use, reduced carbon footprint, improved remedy protectiveness, improved project and site decision making, and acceleration of project and site completion" (USEPA 2013f).
2.2 Private Sector Perspective
Groundwater statistics are typically used in the private sector to support environmental project management, risk assessment, and decision making. Typical uses include selecting the groundwater sampling frequency, comparing the results of different methods of chemical analysis, demonstrating that the results of different sampling techniques are comparable, and identifying background concentrations. Once statistical analyses are performed, private sector entities must secure regulatory and stakeholder concurrence on the acceptability of the results.
2.2.1 Examples of Statistics Usage in the Private Sector
The examples below show how statistics have been used for various types or stages of projects in the private sector, along with potential outcomes and concerns. These examples are for illustrative purposes only, to demonstrate concerns that private sector users have addressed or need to address, and are by no means exhaustive with respect to either site type or method used. The methods described in these examples are further described in Section 5.
Mann-Kendall trend analysis is routinely used to determine sampling frequency. For example, to select groundwater sampling frequency for a Superfund site, Mann-Kendall trend analysis was applied to an existing data set, which included seven years of semiannual sampling results. The sampling frequency was back-tested by reducing the data frequency to one-half and then one-fourth of the original frequency. The results showed that the trend results were similar in all three data sets, leading to a significantly reduced sampling frequency for the long-term monitoring program. Fewer than 20% of the wells are sampled annually or biennially; most are sampled only once every five years.
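The back-testing idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used at the Superfund site: the concentration values are hypothetical, the function names `mann_kendall` and `classify` are invented for this sketch, and the test uses the large-sample normal approximation without a tie correction. For the very small data sets produced by thinning, an exact Mann-Kendall table would normally be preferred, so the p-values for the thinned series should be read with caution.

```python
import math

def mann_kendall(values):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test.

    Simplification (assumption of this sketch): normal approximation
    with no tie correction.
    """
    n = len(values)
    s = 0
    # S counts later-minus-earlier pairs: +1 if higher, -1 if lower, 0 if tied.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity correction of one unit toward zero.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return s, z, p

def classify(values, alpha=0.05):
    """Label the trend direction at significance level alpha."""
    s, _, p = mann_kendall(values)
    if p < alpha:
        return "decreasing" if s < 0 else "increasing"
    return "no trend"

# Hypothetical semiannual results (ug/L) for one well over seven years.
semiannual = [120, 112, 115, 101, 98, 95, 97, 84, 80, 82, 71, 69, 66, 60]

# Back-test: keep every event, every second event, and every fourth event.
for step, label in [(1, "semiannual"), (2, "annual"), (4, "biennial")]:
    subset = semiannual[::step]
    s, z, p = mann_kendall(subset)
    print(f"{label:>10}: n={len(subset):2d}  S={s:4d}  p={p:.3f}  -> {classify(subset)}")
```

If the direction and significance of the trend are similar across the thinned series, that supports a reduced sampling frequency; if significance is lost as the series is thinned, the reduced frequency may not have enough power to detect the trend.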
Regression analysis is routinely used to compare the results of different methods of sample collection or chemical analysis. For example, nonlinear regression analysis has been used to demonstrate that the analytical results of different sampling techniques, such as no-purge and low-flow groundwater sampling, are comparable. Results are generally comparable, and the two methods yield the same conclusion regarding the attainment of a certain performance criterion for compliance. A related application of regression analysis is the calibration of one method to another (such as field and laboratory analytical methods) so that the results obtained with one technique can be related to the other using their empirical regression relationship.
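As a rough illustration of calibrating one method to another, the sketch below fits an ordinary least-squares line to paired field and laboratory results. The paired values are hypothetical, the function name `fit_line` is invented here, and a straight-line model is an assumption of this sketch; the example described above used nonlinear regression, and the appropriate model form depends on the data.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared, residual_sd), where
    residual_sd is the residual standard deviation with n - 2
    degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_sd = math.sqrt(sse / (n - 2))
    return intercept, slope, r_squared, residual_sd

# Hypothetical paired split-sample results (ug/L).
field = [12.0, 25.0, 40.0, 55.0, 80.0, 110.0]   # field screening method
lab = [10.5, 27.0, 38.0, 58.5, 77.0, 115.0]     # laboratory method

intercept, slope, r2, resid_sd = fit_line(field, lab)
print(f"lab = {intercept:.2f} + {slope:.3f} * field   (R^2 = {r2:.3f}, residual SD = {resid_sd:.2f})")

# Use the empirical relationship to express a new field reading in lab terms.
new_field_reading = 60.0
print(f"predicted lab result: {intercept + slope * new_field_reading:.1f}")
```

A high R² alone does not establish that the two methods are interchangeable. The residual standard deviation gives a first indication of the error attached to a converted field result, which matters when that result is compared with a decision criterion.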
Crude and refined petroleum products are complex mixtures that frequently contain hundreds of individual compounds. Multivariate statistical analyses may be used at some sites either to identify the general types of source petroleum products present at a site, or to differentiate and quantify the relative contribution of various sources in a commingled plume. This can be a difficult analysis, and not all data sets will reveal clear contributions of the various sources. For light-end mixtures, such as gasoline, it is possible to apply a combination of trend analysis and ratio analysis to determine whether a new release has occurred where a historical spill was previously documented. The ratio analysis compares the relative concentration ratios among a group of chemicals measured in groundwater at the site and how those ratios vary with location (for example, the relative concentration ratios may differ between a new release and a historical spill area).
Nonparametric trend analyses (such as the Mann-Kendall trend test, the seasonal Mann-Kendall test, and the Theil-Sen trend line) and parametric trend analyses (such as linear regression) are routinely used to evaluate trends in groundwater concentrations of contaminants over time for natural attenuation remedies. These trend analysis techniques are used to assign the direction of trends (increasing, decreasing, or no trend) and the statistical significance of the trends. For decreasing contaminant concentration trends, the natural logarithms of the concentrations are plotted versus time and a linear regression analysis is conducted. The result is used to predict the time required for groundwater contaminant concentrations to meet remedial objectives (see also Appendix A, Example A.6). This information can be used to design appropriate MNA remedies, or to request closure based on demonstration of limited risk and an expected short time to meet remedial objectives.
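The log-linear regression step can be sketched as follows. The model ln(C) = a + b·t is fitted by ordinary least squares, and the time at which the fitted line reaches the remedial objective C* is t* = (ln C* − a) / b, which is meaningful only when the slope b is negative. The data, the remedial objective, and the function name `time_to_objective` are hypothetical, and the result is a point estimate only; a real evaluation would also check the regression assumptions and place confidence bounds on the slope.

```python
import math

def time_to_objective(times, concentrations, objective):
    """Fit ln(C) = a + b * t and estimate when the fit reaches `objective`.

    Returns (slope, time_estimate). time_estimate is None when the
    fitted slope is not negative, because no decreasing trend exists
    to extrapolate.
    """
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = stl / stt                  # first-order rate, per unit time
    intercept = mean_l - slope * mean_t
    if slope >= 0:
        return slope, None
    return slope, (math.log(objective) - intercept) / slope

# Hypothetical annual results (ug/L) and a hypothetical objective of 5 ug/L.
years = [0, 1, 2, 3, 4, 5, 6]
conc = [410.0, 300.0, 260.0, 170.0, 150.0, 95.0, 80.0]

slope, t_star = time_to_objective(years, conc, objective=5.0)
print(f"fitted slope: {slope:.3f} per year")
if t_star is None:
    print("no decreasing trend; time to objective cannot be estimated")
else:
    print(f"estimated time to reach objective: {t_star:.1f} years after the first event")
```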
Nine closed RCRA and active National Pollutant Discharge Elimination System (NPDES) Solid Waste permitted units are being monitored on a semiannual basis at an oil refinery site. Various site-related chemicals are being evaluated for statistically significant exceedances using a combination of traditional parametric and nonparametric tests (based on data distribution analyses). Two recently updated Solid Waste permits include a requirement for statistical planning and evaluations. The most recent permit includes intrawell comparison (measurements over time at one monitoring well), rather than interwell comparison (between spatially separated monitoring wells), to address different background and compliance wells.
2.3 Challenges for Project Managers
Statistical analysis of groundwater data, and other forms of environmental monitoring data, can present challenges during different project activities (for example, planning, implementation, data interpretation, decision making, and communication). The following key challenges are addressed in this document and the Unified Guidance.
2.3.1 Planning Challenges
These challenges are related to the conceptual understanding of statistical methods, selecting and applying methods to answer study questions, and satisfying the project’s objectives:
• When is it advantageous to use statistical tests?
• Why do I need statistics for my “small” site?
• Do I have enough data to perform a statistical analysis?
  • How will the results of statistical tests help users make decisions?
• What are the purposes and limitations of statistical tests?
• How can I estimate the value of using statistical tests during the various stages of a project’s life cycle?
• How many samples do I need to collect?
• How do I balance the statistical certainty I need with the cost of data acquisition?
• What are the best ways to explain the statistical approach to everyone involved: regulators, consultants, owner representatives, and community members?
• How should historical data be optimally processed?
• How do I plan sampling to achieve data quality objectives?
  • What is the statistical parameter (for instance, mean or median) of interest?
  • What is the variability of the statistical parameter or trait that will be measured?
  • What are the acceptable false positive and false negative rates?
  • What statistical tests are appropriate?
  • What hypotheses need to be tested?
  • What is my alternative hypothesis?
  • Is it important to identify a directional or nondirectional change (such as plume movement)?
2.3.2 Interpretation and Communication Challenges
These challenges are related to interpreting and communicating the results of statistical tests and evaluations to multiple audiences:
• How do I judge data adequacy and convey confidence in the data?
• How do I use graphics effectively and fairly? How do I spot misleading graphics?
• How can I explain the statistical tests and results to stakeholders?
2.3.3 Common Misapplications
Just as the application of statistics to groundwater and environmental monitoring presents many challenges, it also affords many opportunities for misuse, misapplication, and misinterpretation. The following list summarizes common misapplications. A more thorough discussion is provided in Appendix B, along with suggestions on how to best address each issue.

Misapplication: The site maximum sample is less than the risk-based decision criterion; therefore, I can conservatively assume the site is low risk.
Response: If the number of samples is small, this assumption may be wrong and the population mean may actually be above the decision criterion.

Misapplication: The site maximum sample is greater than a “background” maximum or mean; therefore, the site is contaminated.
Response: The data sets may be very different in size and in what they represent, so comparisons may not be meaningful or appropriate. Statistical approaches such as two-sample tests or a background upper prediction limit (UPL) would be better alternatives.

Misapplication: The concentration this round is less than in the last round and the round before that; therefore, there is a decreasing trend.
Response: Assess trends over multiple rounds and consider the variability of the data. A few monitoring events are typically not adequate to characterize the variability and assess trends.

Misapplication: Three rounds of data below the decision criterion means a site is in compliance.
Response: An arbitrary rule or number of rounds may not adequately address the variability of the data being assessed.

Misapplication: If I have good correlation between my field screening data and laboratory analyzed splits, then I should be able to use the field data to make comparisons to the decision criteria.
Response: You must assess the variability of the responses (field technique to valid laboratory data) and quantify the potential error and confidence in the field results.

Misapplication: If the statistics indicate a result is an outlier, then it is acceptable to disregard that point.
Response: You must always have a well-documented, weight-of-evidence reason to eliminate data. Also evaluate the effect that eliminating data points has on the conclusions drawn from the remaining data. Often, outliers may be representative of unforeseen, yet important, site characteristics.

Misapplication: There is a statistically significant difference in the data; therefore, it must be important.
Response: When data sets are large (such as for wells that have been monitored for long periods of time), statistical tests can be used to identify small changes in data trends. However, you must also note the magnitude of the change relative to the decision criteria.

Misapplication: The statistics “prove” the research hypothesis was correct.
Response: Statistics is the science of probabilities and imprecision; therefore, it is essential to state the degree of confidence the statistics provide for the conclusion.
Please refer to Appendix B for more common misapplications as well as suggestions for addressing these important issues.
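Two of the misapplications above can be illustrated with simple arithmetic. The sketch below is illustrative only and rests on stated assumptions: samples are independent, values are continuous with no ties, and the function names and input values are hypothetical. The first function gives the chance that every one of n samples falls below a decision criterion even though a given fraction of the underlying population exceeds it. The second gives the chance that n results with no underlying trend happen to appear in strictly decreasing order.

```python
import math

def prob_all_below(n_samples, fraction_exceeding):
    """Chance that all n independent samples fall below the criterion
    when `fraction_exceeding` of the population is above it."""
    return (1.0 - fraction_exceeding) ** n_samples

def prob_decreasing_by_chance(n_rounds):
    """Chance that n independent, trend-free results appear in strictly
    decreasing order: one favorable ordering out of n! equally likely ones."""
    return 1.0 / math.factorial(n_rounds)

# Small data set: a quarter of the population exceeds the criterion,
# yet a handful of samples can easily miss every exceedance.
for n in (4, 8, 16):
    print(f"n={n:2d}: P(all samples below criterion) = {prob_all_below(n, 0.25):.3f}")

# Few rounds: an apparent "decreasing trend" arises by chance quite often.
for n in (3, 4, 5, 6):
    print(f"n={n}: P(strictly decreasing by chance) = {prob_decreasing_by_chance(n):.4f}")
```

With three rounds, a strictly decreasing sequence occurs by chance about one time in six, which is why a few monitoring events are not adequate to establish a trend.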
2.3.4 Implementation Challenges
These challenges are related to selecting and using software, enhancing confidence in calculations, and gaining consensus on results:
• How do I select an appropriate statistical package to independently verify results and conclusions?
  • Is the selected software package accessible and user friendly?
  • What is its cost?
  • Are there upload/download security issues?
  • Who maintains the software and provides support?
  • How do I overcome data entry difficulties resulting from software program structure?
  • Does the software produce output with adequate explanations of intermediate results?
  • Does the software produce presentation quality graphics with minimal effort?
• How does the user check the results and what should users check?
• What types of data need to be processed?
  • Transformed data (such as logarithms of the original results)
  • Censored data (such as nondetects)
• How, when, and why do I modify a data set before analysis?
Publication Date: December 2013