Water Quality Portal – Tools for Automated Data Analysis (TADA)
What are the capabilities of TADA?
The U.S. Environmental Protection Agency (EPA) TADA (Tools for Automated Data Analysis) encompasses an R package and a series of R Shiny applications that are currently under development, with new features added every month. These tools are designed to help Tribes, Tribal Nations, Pueblos, States and other stakeholders more efficiently compile and evaluate Water Quality Portal (WQP) data collected from surface water monitoring sites.
As of Spring 2023, TADAShiny (Module 1: Data Discovery and Cleaning) retrieves data from the WQP and runs it through a series of quality control screens and data wrangling steps. Features include flagging invalid results and metadata using validation reference tables, harmonization of synonyms, result and depth unit conversions, censored (detection limit) data substitutions, dataset filtering, and data visualizations. TADA leverages the EPA Water Quality eXchange (WQX) QAQCCharacteristicValidation domain value service to flag invalid results and metadata. Users can review and download summary information about their dataset, along with a data file that is ready for additional manual review and use in subsequent analyses. Within the application, users decide whether to flag data for removal or keep it, depending on its quality and relevance for their analysis. Data in the WQP are not altered by TADA. If underlying data quality issues are found using TADA, users can contact the WQX helpdesk ([email protected]) for assistance fixing their organization's data in the WQP. Only data-submitting organizations are allowed to make changes to their data. If WQP data users find data quality issues in data they do not own, they may also reach out to the WQX helpdesk, which can notify the data owner about the issue.
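To make two of the screening steps above more concrete (unit conversion and censored-data substitution), here is a minimal Python sketch. It is not TADA's implementation, which is written in R. The record keys, the conversion table, the flag names, and the 0.5 detection-limit multiplier are all assumptions chosen for illustration.

```python
# Illustrative sketch only: simplified, hypothetical record keys (not the WQP schema).
# Conversion factors to a common target unit (mg/L).
TO_MG_PER_L = {"mg/l": 1.0, "ug/l": 0.001, "g/l": 1000.0}


def to_mg_per_l(value, unit):
    """Convert a concentration to mg/L; return None when the unit is not recognized."""
    factor = TO_MG_PER_L.get(unit.strip().lower())
    return None if factor is None else value * factor


def clean_result(record, nondetect_multiplier=0.5):
    """Return a copy of the record with a harmonized value and a list of QC flags.

    The original record is left untouched; rows are flagged, not deleted,
    so the reviewer decides what to keep.
    """
    cleaned = dict(record)
    flags = []
    value = record.get("value")
    unit = record.get("unit")

    # Censored result: no measured value, but a detection limit was reported.
    # Assumption: substitute the detection limit times a multiplier.
    if value is None and record.get("detection_limit") is not None:
        value = record["detection_limit"] * nondetect_multiplier
        unit = record.get("detection_limit_unit")
        flags.append("censored_substituted")

    converted = None
    if value is None:
        flags.append("missing_value")
    elif unit is None or to_mg_per_l(value, unit) is None:
        flags.append("unrecognized_unit")
    else:
        converted = to_mg_per_l(value, unit)

    cleaned["value_mg_per_l"] = converted
    cleaned["flags"] = flags
    return cleaned


samples = [
    {"value": 250.0, "unit": "ug/L"},
    {"value": None, "unit": None, "detection_limit": 0.02, "detection_limit_unit": "mg/L"},
    {"value": 3.1, "unit": "furlongs"},
]
cleaned = [clean_result(s) for s in samples]
for row in cleaned:
    print(row["value_mg_per_l"], row["flags"])
```

The sketch flags questionable rows instead of dropping them, mirroring the description above in which users decide what to remove and the source data are never altered.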
Once finished, TADA aims to meet the following user requirements: 1) data discovery and cleaning, 2) assessment unit and use integration, 3) criteria and methodologies integration, and 4) assessment unit-use-parameter level analyses in a format compatible with the EPA Assessment, Total Maximum Daily Load (TMDL) Tracking and Implementation System (ATTAINS). The TADA Team is using an agile development approach. User requirements are still being adjusted as needed during development using frequent feedback solicited from the TADA user community.
Tip: In 2022, new WQP data retrieval and visualization capabilities were added to the EPA How's My Waterway (HMW) application, which is designed for the public. Stakeholders who are interested in discovering WQP data can start by using the HMW mapping, Monitoring Tab and Monitoring Location page features to find and review WQP data availability in areas of interest. Users who wish to analyze the WQP data further can then move to TADA.
What data are compatible with TADA?
In 2012, the U.S. Geological Survey (USGS), EPA and the National Water Quality Monitoring Council deployed the WQP to combine and serve water-quality data in a standardized format from numerous sources, including EPA WQX, USGS National Water Information System (NWIS) and United States Department of Agriculture (USDA) Agricultural Research Service (ARS) Sustaining the Earth's Watershed Agricultural Research Data System (STEWARDS). The WQP holds over 420 million water quality sample results from over 1,000 federal, state, tribal and other partners, and is the nation's largest single point of access for water-quality data. Participating organizations submit their data to the WQP using the EPA's WQX, a framework designed to map their data holdings to a common data structure.
TADA is designed for use with WQP data, or any data formatted in the WQX/WQP schema. TADA leverages the WQP web services and the USGS dataRetrieval R package, so users can access any publicly available WQP data directly from the TADA R Package and/or R Shiny application(s).
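As a rough illustration of what a WQP web service request looks like, the Python sketch below only builds a query URL and does not contact the service. The endpoint and parameter names (statecode, characteristicName, startDateLo, startDateHi, mimeType) reflect the WQP web services as understood at the time of writing and should be checked against the current WQP documentation. The helper name build_wqp_url and the example values are hypothetical, and this is not how TADA itself issues requests.

```python
from urllib.parse import urlencode

# Assumed WQP result-search endpoint; verify against the WQP web services guide.
WQP_RESULT_SEARCH = "https://www.waterqualitydata.us/data/Result/search"


def build_wqp_url(statecode, characteristic, start_date, end_date):
    """Build a WQP result-search URL (hypothetical helper; makes no network call).

    Dates are passed as MM-DD-YYYY strings.
    """
    params = {
        "statecode": statecode,
        "characteristicName": characteristic,
        "startDateLo": start_date,
        "startDateHi": end_date,
        "mimeType": "csv",
    }
    # urlencode percent-encodes reserved characters such as the colon in "US:49".
    return f"{WQP_RESULT_SEARCH}?{urlencode(params)}"


url = build_wqp_url("US:49", "pH", "01-01-2020", "12-31-2020")
print(url)
```

Because every source behind the WQP is served in the same schema, the same query shape applies whether the data came from WQX, NWIS or STEWARDS.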
Background and Impetus for TADA
In 2015, the EPA began developing two tools to help organizations more easily discover, clean, visualize, and analyze data from the WQP. Both were developed using R and have an R Shiny interface intended to be user friendly. These tools, the Data Discovery Tool and Data Analysis Tool, are TADA’s predecessors. The Data Discovery Tool, published online in 2017, performs data retrieval, cleaning, and visualization. The Data Analysis Tool was designed to analyze water quality data against associated numerical criteria for various uses with common assessment and listing methodologies, but it was never released.
The TADA project began with the intention to utilize and build on the existing functionality of these tools, and to review open-source tools developed by stakeholders for potential incorporation. In addition, the TADA team began conducting outreach to better understand tool requirements and potential barriers to tool use and adoption by stakeholders. For this purpose, the team initiated the TADA Working Group in 2020, which meets regularly and has continuously grown in membership. During the initial requirements gathering process in 2020-2021, the TADA Working Group developed and reviewed issue papers, answered assessment questions, found commonalities in assessment processes, and created a space for open-source development. The working group produced a Master List of Requirements for TADA and is maintaining an inventory of open-source water quality data tools. The team learned a tremendous amount through that process and looks forward to continued discourse throughout development of all TADA tools.
How can I access TADA and get started using it?
Code repositories for the TADA and TADAShiny R packages are available on GitHub. The TADA Shiny application provides an easy-to-use R Shiny user interface on top of the TADA R Package.
EPA also provides a vignette with examples of how to use TADA in R, along with documentation for all TADA R Package functions. Stakeholders are encouraged to test the functionality and provide feedback on TADA. Moreover, open-source software provides an avenue for water quality data originators and users to develop and share code, and the TADA Team welcomes your contributions! We encourage you to find more information on how to contribute and to reach out as needed. A collaborative community in which contributors can discover, share, and build package functionality over time is integral to the success of TADA and its users.
Join the TADA Working Group
Working Group Mission: To share and develop R code for evaluating and visualizing WQP data more efficiently through collaboration and open-source programming. This includes working together to find commonalities in assessment processes across the nation, creating flexible tools that can be easily customized to work within existing workflows, supporting each other in learning R, and ensuring products will be accessible to organizations most in need. New members are welcome! Contact the TADA Team ([email protected]) for more information.