Master’s Thesis Presentation • Data Systems • Dowsing For Math Answers: Exploring MathCQA with a Math-aware Search Engine

Friday, October 15, 2021 11:30 am - 11:30 am EDT (GMT -04:00)

Please note: This master’s thesis presentation will be given online.

Yin Ki Ng, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Distinguished Professor Emeritus Frank Tompa

Solving math problems can be challenging, enough so that one might turn to the Internet and search for answers on Community Question Answering sites such as Math StackExchange. However, searching for relevant answers to a math problem is itself not trivial.

Given a math question — expressed in mathematical natural language — how can a math-aware search engine be designed to retrieve and rank answers from a Community Question Answering site effectively?

This thesis details how the math-aware search engine Tangent-L tackles this challenge. Tangent-L adopts a traditional text retrieval model (Bag-of-Words scored by BM25+), using formulas’ symbol pairs and other features as “words”. Several adaptations of Tangent-L to this challenge are explored, including query conversion, weighting schemes for math features, and result re-ranking.
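
As a rough illustration of the retrieval model named above, the following minimal Python sketch scores bag-of-words documents with BM25+. It is not Tangent-L's implementation: the function name, the parameter defaults (k1, b, delta), and the "pair:..." token strings standing in for formula symbol pairs are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bm25_plus_scores(query_tokens, docs, k1=1.2, b=0.75, delta=1.0):
    """Score each document (a list of tokens) against the query with BM25+.

    Parameter defaults are illustrative assumptions, not Tangent-L's settings.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for d in docs for tok in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation shared by every term in this document.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for tok in query_tokens:
            if tf[tok] == 0:
                continue  # only terms present in the document contribute
            idf = math.log((n_docs + 1) / df[tok])
            # BM25+ adds the lower bound `delta` to the saturated tf component.
            score += idf * ((k1 + 1) * tf[tok] / (norm + tf[tok]) + delta)
        scores.append(score)
    return scores


# Hypothetical tokens: plain words mixed with made-up "symbol pair" strings.
docs = [
    ["integral", "of", "x", "squared", "pair:x^2", "pair:int_x"],
    ["derivative", "of", "sine", "pair:d_sin"],
    ["limit", "of", "a", "sequence"],
]
query = ["integral", "pair:x^2"]
scores = bm25_plus_scores(query, docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Because math features and text terms share one vocabulary in this sketch, a single scoring function ranks both; a weighting scheme for math features could be approximated by scaling the contribution of the "pair:" tokens, though how Tangent-L does this is not specified here.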

In the recent workshop series Answer Retrieval for Questions on Math (ARQMath), which uses math problems from Math StackExchange, the answer-retrieval runs submitted with Tangent-L were the best participant runs for two consecutive years, performing better than many systems designed with machine learning and deep learning models. A data exploration tool has also been built so that interested audiences can view the math questions in the ARQMath challenge and check the answer rankings produced by Tangent-L.


To join this master’s thesis presentation on Zoom, please go to https://us02web.zoom.us/j/7721365315?pwd=eGhiQzB3R1V0NVRWQTlRVlZjaVVJZz09.