Zhucheng Tu, Master’s candidate
David R. Cheriton School of Computer Science
Modelling the similarity of two sentences is an important problem in natural language processing and information retrieval, with applications in tasks such as paraphrase identification and answer selection in question answering. The Multi-Perspective Convolutional Neural Network (MP-CNN) is a model that improved on the previous state of the art in 2015 and has remained popular for sentence similarity tasks. However, until now, there has been no rigorous study of how the model achieves its competitive accuracy.
In this thesis, we report on a series of detailed experiments that break down the contribution of each component of MP-CNN to its accuracy and examine how each component affects model robustness. We find that two key components of MP-CNN are not needed to achieve competitive accuracy and in fact make the model less robust to changes in hyperparameters. Furthermore, we suggest simple changes to the architecture and show experimentally that removing these two components and incorporating the changes improves the accuracy of MP-CNN, pushing its scores closer to those of more recent work on competitive semantic textual similarity and answer selection datasets, while using seven times fewer parameters.