Please note: This PhD seminar will take place online.
Zhenyu (Alister) Liao, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Peter van Beek
A Bayesian network (BN) is a probabilistic graphical model with applications in knowledge discovery and prediction. A widely used data analysis methodology based on Bayesian networks is to: (i) learn a set of plausible networks that fit the data using optimization and a scoring function, (ii) perform model averaging to obtain a confidence measure for each edge, and (iii) select a threshold and report all edges with confidence higher than the threshold. In this manner, a representative network can be constructed from the edges deemed significant, and this network can then be examined for probabilistic dependencies and possible cause-effect relations.
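Steps (ii) and (iii) can be illustrated with a minimal Python sketch. It assumes step (i) has already produced a collection of networks, each represented as a set of directed (parent, child) edges, and it weights every network equally. The function names, the toy networks, and the 0.6 threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def edge_confidences(networks, weights=None):
    """Average over networks: the confidence of an edge is the
    (weighted) fraction of networks that contain it."""
    if weights is None:
        weights = [1.0] * len(networks)  # assumption: uniform weights
    total = sum(weights)
    mass = Counter()
    for edges, w in zip(networks, weights):
        for edge in edges:
            mass[edge] += w
    return {edge: m / total for edge, m in mass.items()}


def significant_edges(confidences, threshold):
    """Report all edges whose confidence is higher than the threshold."""
    return {edge for edge, c in confidences.items() if c > threshold}


# Toy input: four hypothetical networks over the variables A, B, C.
networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("A", "C")},
    {("A", "B"), ("B", "C"), ("A", "C")},
    {("B", "C")},
]

conf = edge_confidences(networks)
selected = significant_edges(conf, threshold=0.6)
```

In this toy input, A -> B and B -> C each appear in three of the four networks (confidence 0.75), while A -> C appears in two (confidence 0.5), so only the first two edges pass a threshold of 0.6.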
This seminar focuses on several improvements to this data analysis methodology. We propose a novel approach to model averaging inspired by performance guarantees in approximation algorithms. Our approach considers only credible models, that is, models that are optimal or near-optimal in score. It is more efficient than existing exact approaches and scales to significantly larger Bayesian networks, and it is more accurate than existing heuristic approaches. We then present several improvements to our new approach as well as to a popular existing heuristic approach, including selecting the best scoring function, selecting an appropriate threshold, and using meta-ensembles to boost performance.
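One possible reading of "near-optimal in score", by analogy with approximation ratios, is to keep every network whose score is within a fixed factor of the best score found. The sketch below illustrates only that reading. The abstract does not state the actual criterion, so the factor rho, the convention that lower positive scores are better, and all names and values here are assumptions.

```python
def credible_models(scored_networks, rho):
    """Keep networks whose score is within a factor rho of the best.

    Assumptions: scores are positive costs to be minimized, and
    rho >= 1 plays the role of an approximation ratio.
    """
    best = min(score for score, _ in scored_networks)
    return [(score, edges) for score, edges in scored_networks
            if score <= rho * best]


# Hypothetical (score, edge set) pairs for three candidate networks.
scored = [
    (100.0, {("A", "B"), ("B", "C")}),
    (101.0, {("A", "B"), ("A", "C")}),
    (120.0, {("B", "C")}),
]

kept = credible_models(scored, rho=1.05)
```

With rho = 1.05, the networks scoring 100.0 and 101.0 are retained and the one scoring 120.0 is discarded. Model averaging would then run over the retained networks only.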