Master’s Thesis Presentation • Data Systems: Machine Learning for Streamflow Prediction

Tuesday, April 7, 2020, 10:30 am EDT (GMT -04:00)

Please note: This master’s thesis presentation will be given online.

Martin Gauch, Master’s candidate
David R. Cheriton School of Computer Science

Accurate prediction of streamflow, the amount of water flowing past a stream section at a given time, is a long-standing challenge in hydrology. Researchers strive to understand the natural processes at play, and the predictions are also vital for flood management, irrigation control, and hydroelectric power generation. Traditional, physically-based models explicitly simulate the processes that drive streamflow, but their predictions are often inaccurate, especially when a single model is applied to multiple watersheds.

In this thesis, we study applications of machine learning to streamflow prediction. We present two case studies where data-driven models outperform physically-based models. Although more accurate, these data-driven techniques are less interpretable than physically-based models. Hence, we take first steps towards combining physically-based and data-driven approaches into a single model that preserves each component’s advantages. Lastly, we quantify the effects of limited training data on the quality of data-driven predictions. We show that models benefit from additional data not only in the form of longer time periods, but also in the form of additional basins. This is a promising result for transferring trained models to regions with limited or no training data.

Because all of the above research directions hinge on access to geospatial datasets, we precede their examination with the development of the Cuizinart, a cloud-based platform for disseminating and subsetting large environmental datasets.