D.14 ProUCL 5.0.00
Approximate Cost: Free
Source: USEPA (http://www.epa.gov/land-research/proucl-software)
Current Version: v5.0.00
Operating System Needs: Windows XP, Vista, or 7 (32- or 64-bit)
Software Needs: Microsoft .NET Framework version 4.0
Input Structure: Cross-tabular Excel file (XLS or XLSX format) or comma-separated values (CSV) format
Overview
ProUCL was developed by Lockheed Martin under contract with USEPA to address statistical issues arising in Superfund and Resource Conservation and Recovery Act (RCRA) site projects, and is available from USEPA at no cost. The software is designed specifically for environmental monitoring applications, including background contaminant evaluations, risk analysis for understanding concentrations, and trend identification. It provides both interwell and intrawell procedures for evaluating groundwater contamination, using several parametric and nonparametric (including bootstrap) approaches. Graphical analyses include normal, lognormal, and gamma quantile-quantile (Q-Q) plots, probability plots, histograms, box plots, and line/trend plots. In addition to parametric and nonparametric upper limits, ProUCL 5.0 provides Gehan and Tarone-Ware tests to compare two data sets with multiple detection limits. Results for statistical intervals are offered with several options and relevant cautions. ProUCL 5.0 includes rigorous methods to compute statistical upper limits, including confidence limits, prediction limits, tolerance limits, and simultaneous limits, for data sets with and without nondetect observations covering a wide range of skewness and sample sizes. A partial list of the references used in the decision-making process included in ProUCL is provided.
| Statistical Method | Capability As Is | Capability with Scripts/Add-Ins |
|---|---|---|
| Handling of NDs | | |
| | ● | N/A |
| | ● | N/A |
| | ● | N/A |
| | | N/A |
| Exploratory/Diagnostic Tools | | |
| Summary Statistics | ● | N/A |
| | ● | N/A |
| | ● | N/A |
| Data transformations | | |
| Statistical Design | | |
| Statistical Power | ◒ | N/A |
| | | N/A |
| Contaminant ranking | | N/A |
| | | N/A |
| Statistical Limits | | |
| | ◒ | N/A |
| | ◒ | N/A |
| | ◒ | N/A |
| Testing Compliance Limits | ◒ | N/A |
| Graphics | | |
| Plots/Charts | ◒ | N/A |
| Batch plots | | N/A |
| Tweaking of graphics | | N/A |
| Statistical Comparisons | | |
| | ● | N/A |
| | ◒ | N/A |
| Spatial Analysis | | |
| Geostatistics/Mapping | | N/A |
| | | N/A |
| | | N/A |
| Regression/Time Series | | |
| | | N/A |
| | ● | N/A |
| | ● | N/A |
| | | N/A |
| | ● | N/A |
| | | N/A |
| Multivariate Analysis | | |
| Multiple regression | ◒ | N/A |
| Factor/Discriminant analysis | | N/A |
| | ◒ | N/A |
Capability Ratings:
N/A = Not applicable or not available
● = Full capability
◒ = Some capability
(blank cell) = No capability
Add-Ins Available
None
Ease of Use and Data Import
Use of ProUCL is straightforward. Although ProUCL requires no formal background in statistics, you should understand the assumptions and input requirements for any statistical tests used in making decisions. Input data sets are uncomplicated, requiring a column of measured values for each contaminant and a corresponding column indicating whether each value is a detect or a nondetect reported at the quantitation limit. You can also add variables to provide grouping data, regression variables, or sample dates. ProUCL modules can handle missing data.
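The input layout described above can be illustrated with a minimal Python sketch that writes a small CSV file with one value column and one companion detect-flag column. The sample values, the well and date columns, the `D_` prefix, and the 1 = detect / 0 = nondetect coding are assumptions made for illustration; check the ProUCL User Guide for the exact column conventions your version expects.

```python
import csv
import io

# Hypothetical sample results: (well, date, arsenic result, detected?).
# For a nondetect, the result column holds the reporting limit.
samples = [
    ("MW-1", "2013-01-15", 4.2, True),
    ("MW-1", "2013-04-15", 1.0, False),  # reported as <1.0
    ("MW-2", "2013-01-15", 6.8, True),
    ("MW-2", "2013-04-15", 1.0, False),  # reported as <1.0
]


def to_csv(rows):
    """Write a value column plus a companion flag column (1 = detect, 0 = nondetect)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Column names are assumptions for this sketch, not a confirmed ProUCL requirement.
    writer.writerow(["Well", "Date", "Arsenic", "D_Arsenic"])
    for well, date, value, detected in rows:
        writer.writerow([well, date, value, 1 if detected else 0])
    return buf.getvalue()


csv_text = to_csv(samples)
print(csv_text)
```

The well and date columns play the role of the optional grouping and sample-date variables mentioned above.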
You can select desired statistical tests from drop-down menus, and relevant options from subsequent menus. You can also view results within the program and export them to an Excel spreadsheet.
Data can be evaluated for fit to normal, lognormal, or gamma distributions; statistical interval test results are available for all of these distributions in addition to several nonparametric options.
Types of Distribution
ProUCL provides goodness-of-fit tests for normal, lognormal, and gamma distributions for uncensored data sets (without nondetects) as well as left-censored data sets (with nondetect observations).
ProUCL 5.0 computes parametric and nonparametric statistical upper limits, including confidence limits, prediction limits for k future observations and for the mean of k observations, tolerance limits, and upper simultaneous limits, for censored and uncensored data sets (Singh and Nocerino 1997). The computation methods available in ProUCL 5.0 cover a wide range of skewness and sample sizes. For data sets with nondetects, all of these limits can be computed using gamma regression on order statistics (GROS), lognormal regression on order statistics (LROS), and nonparametric Kaplan-Meier methods. ProUCL also provides bootstrap methods to compute confidence and tolerance limits.
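To make the idea of a bootstrap upper confidence limit concrete, the following minimal Python sketch computes a percentile-bootstrap 95% UCL of the mean for an uncensored data set. It is an illustration only: the function name, the data, the number of resamples, and the choice of the simple percentile method are assumptions, and ProUCL's own bootstrap options may be implemented differently.

```python
import random
import statistics


def bootstrap_ucl_mean(data, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile-bootstrap upper confidence limit for the mean of uncensored data."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Position of the empirical `confidence` quantile among the sorted means.
    idx = min(n_resamples - 1, int(confidence * n_resamples))
    return means[idx]


# Hypothetical right-skewed concentrations (all detects).
conc = [1.2, 0.8, 2.5, 1.1, 0.9, 7.4, 1.6, 3.2, 1.0, 2.1]
ucl95 = bootstrap_ucl_mean(conc)
print(f"sample mean = {statistics.fmean(conc):.2f}, bootstrap 95% UCL = {ucl95:.2f}")
```

The sketch does not handle nondetects; censored data would first need a method such as Kaplan-Meier or regression on order statistics.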
In addition to two-sample t-tests and the Wilcoxon rank sum test, ProUCL 5.0 provides two-sample hypothesis tests (Gehan generalized Wilcoxon test and Tarone-Ware) for left-censored data with nondetect observations.
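The Gehan test extends the Wilcoxon rank sum idea to left-censored data by scoring only those pairs of observations whose order is certain. The Python sketch below shows one common formulation (pairwise scores with a permutation variance and a large-sample normal approximation). The function names and data are hypothetical, and the sketch is not a reproduction of ProUCL's implementation.

```python
import math
from statistics import NormalDist


def compare(a, b):
    """Return +1 if a is certainly greater than b, -1 if certainly less, else 0.

    Each observation is (value, detected). A nondetect is known only to be
    below its detection limit, which is stored in `value`.
    """
    (av, a_det), (bv, b_det) = a, b
    if a_det and b_det:
        return (av > bv) - (av < bv)
    if a_det and not b_det:
        return 1 if av >= bv else 0   # b < DL_b <= a
    if not a_det and b_det:
        return -1 if av <= bv else 0  # a < DL_a <= b
    return 0                          # two nondetects cannot be ordered


def gehan(x, y):
    """Gehan statistic, permutation variance, z-score, and one-sided p-value (x > y)."""
    pooled = list(x) + list(y)
    # Score of each observation: (# certainly smaller) - (# certainly larger).
    scores = [sum(compare(obs, other) for other in pooled) for obs in pooled]
    m, n = len(x), len(y)
    total = m + n
    u = sum(scores[:m])
    var = m * n * sum(s * s for s in scores) / (total * (total - 1))
    if var == 0:
        raise ValueError("no pair of observations can be ordered")
    z = u / math.sqrt(var)
    return u, var, z, 1.0 - NormalDist().cdf(z)


# Hypothetical data: (value or detection limit, detected?).
site = [(5.0, True), (6.0, True), (7.0, True)]
background = [(1.0, False), (2.0, True), (3.0, True)]
u, var, z, p_upper = gehan(site, background)
print(f"U = {u}, z = {z:.3f}, one-sided p = {p_upper:.4f}")
```

With samples this small the normal approximation is rough; the example is sized for readability, not for inference.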
Visualization
Plots available in ProUCL include box plots, histograms, quantile-quantile (Q-Q) plots for normal, lognormal, and gamma distributions, and normal probability plots. Multiple normal Q-Q plots by groups provide a point-by-point comparison of data from multiple groups (monitoring wells). These graphs can also be used on data sets with nondetect observations. Although the program offers some options for editing the output plots, the process is limited.
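A normal Q-Q plot pairs each ordered observation with the standard normal quantile at its plotting position. The Python sketch below computes those coordinates; the function name and data are hypothetical, and the Blom plotting position is an assumption, since ProUCL may use a different formula.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Return (theoretical normal quantile, ordered observation) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, obs in enumerate(ordered, start=1):
        p = (i - 0.375) / (n + 0.25)  # Blom plotting position (an assumption)
        points.append((std_normal.inv_cdf(p), obs))
    return points


# Hypothetical concentrations from one monitoring well.
for quantile, obs in normal_qq_points([3.1, 1.2, 2.4, 5.0, 4.4]):
    print(f"{quantile:+.3f}  {obs}")
```

Points that fall close to a straight line suggest approximate normality. Passing log-transformed values to the same function gives the coordinates for a lognormal Q-Q plot.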
Primary Uses for Groundwater Data Analyses
As the name implies, ProUCL was initially developed to provide a package for computing statistical intervals. Subsequent versions have added statistical tools and improved the original ones. Version 5.0 provides upper confidence limits, upper prediction limits, upper tolerance limits, and upper simultaneous limits for data sets with and without nondetect observations covering a wide range of data skewness and sample sizes. Depending upon the sample size, data distribution, and number of detects present, ProUCL 5.0 provides suggestions and cautions on the output results. Other available tests that may apply to groundwater monitoring include analysis of variance (ANOVA), trend evaluation, outlier tests, and goodness-of-fit tests. The sample size module of ProUCL computes data quality objectives (DQO)-based sample sizes needed to address statistical requirements of environmental projects. The sample size module can also be used to perform retrospective power evaluations.
Although ProUCL software does not perform more in-depth statistical evaluations or address issues of definition and investigation, it is a good fit for many sites.
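Trend evaluation, listed above among the other available tests, is often done with the nonparametric Mann-Kendall test. The Python sketch below computes the Mann-Kendall S statistic, its tie-corrected variance, and a normal-approximation p-value for a time-ordered series. The function name and data are hypothetical, and the sketch shows the general technique, not ProUCL's own code.

```python
import math
from collections import Counter
from statistics import NormalDist


def mann_kendall(series):
    """Return (S, Var(S), z, two-sided p) for a time-ordered series."""
    n = len(series)
    s = 0
    # S counts later-minus-earlier increases minus decreases over all pairs.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced for each group of tied values.
    var = n * (n - 1) * (2 * n + 5) / 18.0
    for t in Counter(series).values():
        var -= t * (t - 1) * (2 * t + 5) / 18.0
    # Continuity-corrected z-score.
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return s, var, z, p_two_sided


# Hypothetical quarterly concentrations at one well, oldest first.
quarterly = [2.1, 2.4, 2.3, 2.9, 3.4, 3.1, 3.8, 4.0]
s, var, z, p = mann_kendall(quarterly)
print(f"S = {s}, z = {z:.2f}, two-sided p = {p:.4f}")
```

A positive S indicates an increasing trend and a negative S a decreasing one. The normal approximation is generally less reliable for short series.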
Benefits
- free
- relative simplicity of use
- developed specifically for environmental applications
- results output with recommendations, cautions, and cited references
- documentation well written and generally easy to understand
Limitations
- primarily limited to calculation of upper statistical limits, background versus site comparisons, interwell and intrawell comparisons, outlier identification, sample size determination and power evaluations, trend evaluations in groundwater data, and hypothesis testing
- limited opportunity for user intervention or modification of procedures
References
Singh, Anita and Nocerino, John. 1997. "Robust Intervals for Some Environmental Applications." The Journal of Chemometrics and Intelligent Laboratory Systems, Vol. 37: 55-69.
Singh, A.K., Singh, A., and Engelhardt, M. 1997. The Lognormal Distribution in Environmental Applications. Technology Support Center Issue Paper, 182CMB97. EPA/600/R-97/006, December.
Singh, A., Maichle, R., and Lee, S. 2006. On the Computation of a 95% Upper Confidence Limit of the Unknown Population Mean Based Upon Data Sets with Below Detection Limit Observations. EPA/600/R-06/022, March. http://www.epa.gov/land-research/proucl-software.
Singh, A., Singh, A.K., and Iaci, R.J. 2002. Estimation of the Exposure Point Concentration Term Using a Gamma Distribution. EPA/600/R-02/084, October. http://www.epa.gov/osp/hstl/tsc/software.htm.
Publication Date: December 2013