A new machine learning tool has been developed to classify water stations with similar water quality trends. The tool is based on Weighted Regressions in Time, Discharge, and Season (WRTDS), a statistical method developed by the United States Geological Survey (USGS) to estimate daily concentrations of water constituents in rivers and streams from continuous daily discharge data and discrete water quality samples collected at the same or nearby locations. WRTDS is based on parametric survival regressions and uses a jackknife cross-validation procedure that generates unbiased estimates of the prediction errors. One disadvantage of WRTDS is that it needs a large number of samples (n > 200) collected over at least two decades. In this article, the tool is used to evaluate Boosted Regression Trees (BRT) as an alternative to the parametric survival regressions for water quality stations with a small number of samples. We describe the development of the machine learning tool and compare the two methods, WRTDS and BRT. The purpose of the tool is to evaluate how much the variability of the estimates is reduced by clustering data from nearby stations with similar concentration and discharge characteristics. The results indicate that, with clustering, the concentrations predicted by BRT are generally higher than the observed concentrations. In addition, BRT appears to generate a higher sum of squared residuals than the parametric survival regressions.
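The abstract does not spell out how WRTDS weights its samples. In the published USGS description of the method, each local regression models ln(c) = b0 + b1*t + b2*ln(Q) + b3*sin(2*pi*t) + b4*cos(2*pi*t) + error, and every sample is weighted by the product of three tricube weights measuring its distance from the estimation point in time, log discharge, and season. The Python sketch below illustrates only that weighting step. It is not the tool described in the article, the function names are hypothetical, and the half-window widths (7 years, 2 natural-log units, 0.5 year) are assumed defaults rather than values taken from this record.

```python
import math


def tricube(distance, half_width):
    """Tricube weight: 1 at zero distance, falling to 0 at the half-width."""
    d = abs(distance)
    if d >= half_width:
        return 0.0
    return (1.0 - (d / half_width) ** 3) ** 3


def seasonal_distance(t_a, t_b):
    """Distance in fractional years, ignoring whole years (at most 0.5)."""
    frac = abs(t_a - t_b) % 1.0
    return min(frac, 1.0 - frac)


def wrtds_weight(sample, target, h_time=7.0, h_logq=2.0, h_season=0.5):
    """Product of the time, discharge, and season tricube weights.

    sample and target are (decimal_year, discharge) pairs. The half-window
    defaults are assumptions made for this illustration.
    """
    t_s, q_s = sample
    t_0, q_0 = target
    w_time = tricube(t_s - t_0, h_time)
    w_logq = tricube(math.log(q_s) - math.log(q_0), h_logq)
    w_season = tricube(seasonal_distance(t_s, t_0), h_season)
    return w_time * w_logq * w_season
```

A sample taken at the estimation point gets weight 1, and any sample more than `h_time` years away gets weight 0. That second property is one way to see why the method needs long records with many samples: few observations fall inside the window for any single estimation point.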
Published (Version of record), CC BY 4.0, open access
Details
Title
A Machine Learning Tool for Weighted Regressions in Time, Discharge, and Season
Publication Details
International journal of advanced computer science & applications, Vol.5(3), pp.99-106
Resource Type
Journal article
Publisher
Science & Information Sai Organization Ltd
Grant note
EPS-1010607 / NSF EPSCoR; National Science Foundation (NSF); NSF - Office of the Director (OD)
2013AL156B / USGS 104(G) Competitive Grants Program/Alabama Water Resources Research Institute
Copyright
This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
Identifiers
WOS:000219160600014; 99380460967706600
Academic Unit
Center for Teaching, Learning, and Technology; College of Arts, Social Sciences, and Humanities; Center for Cybersecurity
Language
English