Project goal
This final assignment called for creating a package in R and publishing it on GitHub (link here).
The assignment also includes a peer review component, which asks for:
- Overall impression
- How many unique functions does the package have?
- Does the package have a DESCRIPTION file?
- Does the package have an example dataset?
Previous assignments
- BA student Ryan Scharf (printtools3d)
- PhD candidate Karena Nguyen (Schistosomiasis)
- MA student Ashley Ashabranner (LibStats)
- PhD candidate Erin Feichtinger (plantgrowth)
Objectives
Create a package that gives environmental scientists a toolset for cleaning and exploring data from the US Environmental Protection Agency (EPA), the US Geological Survey (USGS), and the National Water Quality Monitoring Council (NWQMC) Water Quality Monitoring Portal (WQP). The WQP integrates publicly available water quality data from the USGS National Water Information System (NWIS), the EPA STOrage and RETrieval (STORET) Data Warehouse, and the USDA ARS Sustaining The Earth’s Watersheds - Agricultural Research Database System (STEWARDS), found here.
The package contains four unique functions:
- load_wqp() - loads WQP data with "start_date" and "end_date" parameters via URL
- preview_na() - shows columns with NA values and the total percentage of NA values
- preview_uniques() - shows all unique values in each column
- preview_locations() - shows geographic distribution of each sample location
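To make the two summary helpers more concrete, here is a minimal base R sketch of what functions like preview_na() and preview_uniques() might do. It is not the package's source code: the `_sketch` function names, the return shapes, and the toy data frame with its column names are all assumptions made for illustration.

```r
# Hypothetical re-implementations for illustration; not the package's source.

# Percentage of NA values per column (only columns that contain NAs),
# plus the overall percentage of NA cells in the data frame.
preview_na_sketch <- function(df) {
  na_pct <- colMeans(is.na(df)) * 100
  list(
    by_column = round(na_pct[na_pct > 0], 1),
    total_pct = round(mean(is.na(df)) * 100, 1)
  )
}

# Unique values found in each column, returned as a named list.
preview_uniques_sketch <- function(df) {
  lapply(df, unique)
}

# Toy data standing in for a WQP download (column names are made up).
samples <- data.frame(
  site   = c("A", "A", "B", "C"),
  result = c(1.2, NA, 3.4, NA),
  unit   = c("mg/l", "mg/l", NA, "mg/l"),
  stringsAsFactors = FALSE
)

na_summary <- preview_na_sketch(samples)
unique_values <- preview_uniques_sketch(samples)

print(na_summary)
print(unique_values)
```

Returning plain lists keeps the sketch free of dependencies; the actual functions may print or format their results differently.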
The package is easy to install and has only two dependencies (ggmap and ggplot2). Once loaded, it lets the user quickly retrieve the watershed sampling records for Hillsborough County, FL. Although it could be extended to other sampling locations in the future, the package currently serves as a time-saver for loading, cleaning, and summarizing that one set of water quality data.
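As a rough illustration of loading data "via URL" for a date range, the sketch below builds a download URL from a start and end date. The helper name `build_wqp_url`, the endpoint, the query parameter names (`countycode`, `startDateLo`, `startDateHi`, `mimeType`), the MM-DD-YYYY date format, and the county code used for Hillsborough County are all assumptions here, not taken from the package; they should be checked against the WQP web services documentation before use.

```r
# Hypothetical helper: builds a WQP result-download URL for a date window.
# Endpoint, parameter names, date format, and county code are assumptions.
build_wqp_url <- function(start_date, end_date, county_code = "US:12:057") {
  # Convert ISO dates (YYYY-MM-DD) to the assumed MM-DD-YYYY format.
  fmt <- function(d) format(as.Date(d), "%m-%d-%Y")

  query <- c(
    countycode  = county_code,
    startDateLo = fmt(start_date),
    startDateHi = fmt(end_date),
    mimeType    = "csv"
  )

  # Percent-encode reserved characters such as ":" in each value.
  encoded <- vapply(query, utils::URLencode, character(1), reserved = TRUE)

  paste0(
    "https://www.waterqualitydata.us/data/Result/search?",
    paste(names(encoded), encoded, sep = "=", collapse = "&")
  )
}

wqp_url <- build_wqp_url("2020-01-01", "2020-12-31")
print(wqp_url)

# The download itself needs network access, so it is left commented out:
# wqp_data <- read.csv(wqp_url, stringsAsFactors = FALSE)
```

Keeping URL construction separate from the download makes the date handling easy to check without touching the network.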
Documentation
Function previews:
load_wqp()
preview_locations()
preview_locations() plot
preview_na()
preview_uniques()