LIS 4370 Blog

Kevin Hitt
April 24, 2020
Entry 013

Project goal

This final assignment called for creating a package in R and publishing it via GitHub (link here).

There is also a peer review component to this assignment:

Overall impression
How many unique functions does this package have?
Does the package have a DESCRIPTION file?
Does the package have an example dataset?

Objectives

Create a package that gives environmental scientists a toolset for cleaning and exploring data from the US Environmental Protection Agency (EPA), the US Geological Survey (USGS), and the National Water Quality Monitoring Council (NWQMC) Water Quality Monitoring Portal (WQP). The WQP integrates publicly available water quality data from the USGS National Water Information System (NWIS), the EPA STOrage and RETrieval (STORET) Data Warehouse, and the USDA ARS Sustaining The Earth’s Watersheds - Agricultural Research Database System (STEWARDS), found here.

The package contains 4 unique functions:

load_wqp() - loads WQP data from a URL, limited to the range set by the "start_date" and "end_date" parameters
preview_na() - shows columns with NA values and the total percentage of NA values
preview_uniques() - shows all unique values in each column
preview_locations() - shows geographic distribution of each sample location
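
As a rough illustration of what the two summary helpers might compute, here is a minimal base-R sketch. The names sketch_na() and sketch_uniques() and the toy data frame are hypothetical stand-ins invented for this example, not the package's actual code.

```r
# Hypothetical stand-ins for preview_na() and preview_uniques();
# the package's real implementations may differ.

# Report which columns contain NA values, plus the overall NA percentage.
sketch_na <- function(df) {
  na_counts <- colSums(is.na(df))
  list(
    columns_with_na = names(na_counts)[na_counts > 0],
    percent_na = 100 * sum(na_counts) / (nrow(df) * ncol(df))
  )
}

# Return the unique values found in each column, as a named list.
sketch_uniques <- function(df) {
  lapply(df, unique)
}

# Toy data, invented for illustration only.
samples <- data.frame(
  site = c("A", "A", "B", "C"),
  ph   = c(7.1, NA, 6.8, NA),
  stringsAsFactors = FALSE
)

na_summary <- sketch_na(samples)
unique_values <- sketch_uniques(samples)
```

Returning plain lists keeps the results easy to inspect at the console or pass along to other cleaning steps.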

This package is easy to install and has only two dependencies (ggmap and ggplot2). Once loaded, it lets the user quickly retrieve the watershed sampling records for Hillsborough County, FL. Although it could be expanded in the future to cover other sampling locations, the package currently serves as a time saver for loading, cleaning, and summarizing this particular set of water quality data.
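
To make the date-range loading idea concrete, here is a hedged sketch of how a load_wqp()-style helper could assemble its request URL. The helper name build_wqp_url(), the endpoint, the query parameter names, the MM-DD-YYYY date format, and the county code are all assumptions made for this example; the package itself may build its URL differently.

```r
# Hypothetical sketch of how a load_wqp()-style helper might build its
# request URL. Endpoint, parameter names, and defaults are assumptions.
build_wqp_url <- function(start_date, end_date,
                          county_code = "US:12:057") {  # assumed code for Hillsborough County, FL
  # Assumed convention: the portal expects dates as MM-DD-YYYY.
  fmt <- function(d) format(as.Date(d), "%m-%d-%Y")

  params <- c(
    countycode  = county_code,
    startDateLo = fmt(start_date),
    startDateHi = fmt(end_date),
    mimeType    = "csv"
  )

  # Percent-encode each value (":" becomes "%3A") and join as key=value pairs.
  encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  query <- paste0(names(params), "=", encoded, collapse = "&")

  paste0("https://www.waterqualityportal.org/data/Result/search?", query)
}

url <- build_wqp_url("2019-01-01", "2019-12-31")
# wqp_data <- read.csv(url)  # network call, left commented out in this sketch
```

Keeping the county code as a default argument is one way the same helper could later be pointed at other sampling locations.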