Cheriton School of Computer Science
University of Waterloo
The City of Abbotsford, BC fitted all of the approximately 24,000 homes and businesses with smart water meters that record usage at the hourly level. Given this unique data resource, there are many interesting research problems to work on—including leak or loss detection, predicting or forecasting usage, customer modeling to improve hydrographic modeling, and disaggregating outdoor usage from total water consumption—and many of these appear to be amenable to artificial intelligence tools and techniques.
The results reported in the thesis also appear in revised form in:
Valerie Platsko and Peter van Beek. Identification, Prediction, and Explanation of Outdoor Residential Water Consumption Using Smart Meter Data. Proceedings of the 1st International Water Distribution Systems Analysis / Computing and Control for the Water Industry (WDSA/CCWI) Joint Conference, Kingston, Ontario, July, 2018.
Valerie Platsko and Peter van Beek. Forecasting Outdoor Residential Water Consumption using Ensembles of Regression Trees. Poster at the 51st Canadian Meteorological and Oceanographic Society Congress, Toronto, June, 2017.
Smart meter technology allows frequent measurements of water consumption at the household level. This greater availability of data allows improved analysis of patterns of residential water consumption, which is important for demand management and targeting conservation efforts. The dataset in this thesis includes 8,000 households in Abbotsford, British Columbia from 2012-2013, and contains hourly measurements of water consumption recorded by smart meters installed in 2010. This work focuses on identifying outdoor consumption due to its contribution to peak demand during the summer, which is important because of concerns about strain on infrastructure in Abbotsford. This research shows that outdoor water consumption can be robustly identified from hourly measurements of total water consumption by determining an upper threshold on plausible indoor usage, and that this estimated outdoor water consumption is consistent with seasonal patterns of water consumption identified in previous work, with the timing of restrictions on outdoor watering, and with household size. The research also includes regression tree-based models for predicting next-hour water consumption; however, the predictability of this consumption is limited. In contrast to previous work, there is little correlation between outdoor consumption and demographic factors such as income. Outdoor consumption shows a large amount of individual variability, with 8.6% of households accounting for 50% of the total outdoor usage. This limits the predictability of outdoor consumption, but also highlights the importance of identifying this consumption for each household to allow for targeted conservation efforts.
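The threshold idea and the concentration statistic above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the fixed per-household cap `indoor_cap`, and the choice to count only the excess above the cap as outdoor usage are all assumptions made for illustration.

```python
from typing import Dict, List


def estimate_outdoor_usage(hourly_usage: List[float], indoor_cap: float) -> List[float]:
    """Attribute the part of each hourly reading above indoor_cap to outdoor use.

    indoor_cap is a hypothetical upper bound on plausible indoor usage per hour;
    how the thesis actually derives its threshold is not shown here.
    """
    return [max(0.0, reading - indoor_cap) for reading in hourly_usage]


def share_of_households_for_half(outdoor_totals: Dict[str, float]) -> float:
    """Smallest fraction of households whose outdoor usage sums to at least 50% of the total."""
    total = sum(outdoor_totals.values())
    if total <= 0:
        return 0.0
    ranked = sorted(outdoor_totals.values(), reverse=True)
    running = 0.0
    for count, value in enumerate(ranked, start=1):
        running += value
        if running >= 0.5 * total:
            return count / len(ranked)
    return 1.0


# Made-up readings for one household, with an assumed cap of 150 units per hour.
outdoor = estimate_outdoor_usage([50.0, 120.0, 400.0, 90.0], indoor_cap=150.0)

# Made-up seasonal outdoor totals for four households.
share = share_of_households_for_half({"a": 250.0, "b": 0.0, "c": 30.0, "d": 20.0})
```

Applied to the real dataset, the second function is the kind of calculation that would yield a figure such as the 8.6% reported above.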
Figure 1. Summer water consumption by dissemination area for the City of Abbotsford, British Columbia.
Figure 2. Summer outdoor water consumption by households for the City of Abbotsford, British Columbia.
Smart water meters have been installed across Abbotsford, British Columbia, Canada, to measure the water consumption of households in the area. Using this water consumption data, we develop machine learning and deep learning models to predict daily water consumption for existing multi-family residences. We also present a new methodology for predicting the water consumption of new housing developments. This thesis contains three main contributions. First, we build machine learning models, which include feature engineering and feature selection steps, to predict daily water consumption for existing multi-family residences in the City of Abbotsford. This is motivated by the recent trend towards denser living spaces in urban areas. We present the steps of the model building process and obtain models which achieve accurate performance. Second, we present a new methodology for building machine learning models to predict daily water consumption for new multi-family housing developments at the dissemination area level. Currently, the models used in the industry are simple baseline models, which can lead to an overestimation of predicted water consumption for new developments and, in turn, to costly and unnecessary investments in infrastructure. Using this new methodology, we obtain a machine learning model which achieves a 32.35% improvement over our best baseline model, which we consider a significant improvement. Third, we investigate the use of deep learning models, such as recurrent neural networks and convolutional neural networks, to predict daily water consumption for multi-family residences. In our case, the main advantage of deep learning models over traditional machine learning techniques is their capability to learn data representations, allowing us to omit the feature engineering and feature selection steps and thereby allowing water utilities to save valuable time and resources. The deep learning models we build achieve comparable performance to traditional machine learning techniques.
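A percentage improvement over a baseline, such as the 32.35% quoted above, is commonly computed as a relative reduction in an error metric. The sketch below assumes mean absolute error as that metric; the abstract does not say which metric the thesis uses, and all function names and numbers here are illustrative only.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted daily consumption."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_error: float, model_error: float) -> float:
    """Percentage reduction in error of the model relative to the baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - model_error) / baseline_error


# Made-up daily consumption values and predictions for three days.
actual = [100.0, 200.0, 300.0]
baseline_pred = [140.0, 240.0, 340.0]   # a baseline that overestimates
model_pred = [130.0, 230.0, 330.0]      # a hypothetical learned model

improvement = relative_improvement(
    mean_absolute_error(actual, baseline_pred),
    mean_absolute_error(actual, model_pred),
)
```

Under this definition, a positive value means the model's error is lower than the baseline's, and a negative value means it is higher.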
Figure 3. Spatial distribution of average daily water usage across Abbotsford.
Figure 4. Comparing test performance of deep learning models and LinearSVR.
Smart water meter devices are now widely installed in single-family residences, allowing water consumption data to be collected at a high resolution from both the temporal and spatial perspectives. Such data allows improved prediction of future water consumption, an important task for water utilities as they manage the water supply. The dataset in this thesis consists of hourly water consumption data from the 9,045 single-family residences in Abbotsford, British Columbia from September 2012 to August 2013. This research focuses on predicting hourly water consumption by using improved artificial neural network (ANN) models and makes five main contributions. The first contribution is accurately predicting hourly water consumption at a finer spatial and temporal scale than previous work. The second contribution is gathering and studying a wide variety of datasets and related features for predicting future water consumption. In addition to water consumption data, the raw dataset includes daily weather information, demographic information, property information and date information for the same period. The third contribution is to systematically perform feature selection, an important step in building machine learning models but one that is absent from previous work on predicting water consumption. For each experimental criterion, a customized feature set helps the corresponding model to accurately predict hourly usage. The fourth contribution is to improve prediction accuracy by building separate models for weekday and weekend prediction. Residents consume water in different patterns on weekdays and weekends. By tackling the predictions separately, better performance can be achieved with less complicated models. Lastly, this research investigates the performance of multi-hidden-layer ANN models versus single-hidden-layer models. Although single-hidden-layer models are sufficient in theory, we show that multi-hidden-layer ANNs can lead to improved performance.
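The idea of building separate weekday and weekend models can be sketched as follows. To stay self-contained, the sketch swaps the thesis's ANN models for a much simpler stand-in, a mean-usage-by-hour profile fitted per day type. The function names and sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

Key = Tuple[str, int]  # (day type, hour of day)


def day_type(ts: datetime) -> str:
    """Classify a timestamp as 'weekday' (Mon-Fri) or 'weekend' (Sat-Sun)."""
    return "weekend" if ts.weekday() >= 5 else "weekday"


def fit_hourly_profiles(records: Iterable[Tuple[datetime, float]]) -> Dict[Key, float]:
    """Fit one mean-by-hour profile per day type (a stand-in for one ANN per day type)."""
    sums: Dict[Key, float] = defaultdict(float)
    counts: Dict[Key, int] = defaultdict(int)
    for ts, usage in records:
        key = (day_type(ts), ts.hour)
        sums[key] += usage
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(profiles: Dict[Key, float], ts: datetime) -> float:
    """Predict usage from the profile matching the timestamp's day type and hour.

    Raises KeyError if that (day type, hour) pair was never seen during fitting.
    """
    return profiles[(day_type(ts), ts.hour)]


# Made-up 7:00 readings: Mon/Tue are weekdays, Sat/Sun are weekend days.
records = [
    (datetime(2012, 9, 3, 7), 30.0),   # Monday
    (datetime(2012, 9, 4, 7), 50.0),   # Tuesday
    (datetime(2012, 9, 1, 7), 80.0),   # Saturday
    (datetime(2012, 9, 2, 7), 100.0),  # Sunday
]
profiles = fit_hourly_profiles(records)
weekday_forecast = predict(profiles, datetime(2012, 9, 5, 7))  # Wednesday
weekend_forecast = predict(profiles, datetime(2012, 9, 8, 7))  # Saturday
```

Routing each timestamp to a model trained only on its own day type lets each model stay simple, which is the intuition behind the fourth contribution.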