Please note: This master’s thesis presentation will take place in DC 2585 and online.
Gustavo Sutter Pessurno de Carvalho, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Pascal Poupart
Grammatical Error Correction (GEC) and Grammatical Error Detection (GED) are two important tasks in the study of writing assistant technologies. Given an input sentence, GEC aims to output a corrected version of the sentence, while GED aims to indicate which words in the sentence contain errors. Both tasks are relevant to real-world applications that help native speakers and language learners write better. Naturally, both have attracted the attention of the research community and have been studied in the context of modern neural networks. This work focuses on multilingual GED models and how they can be used to improve GEC performed by large language models (LLMs).
We study the difference in performance between GED models trained on a single language and models that undergo multilingual training. We expand the list of datasets used for multilingual GED to further experiment with cross-dataset and cross-lingual generalization of detection models. Our results go against previous findings and indicate that multilingual GED models are as good as monolingual ones when evaluated on in-domain languages. Furthermore, multilingual models generalize better to novel languages seen only at test time.
Using the GED models we study, we propose two methods to improve prompt-based GEC with LLMs. The first method aims to mitigate overcorrection by using a detection model to determine whether a sentence has any mistakes before feeding it to the LLM. The second method uses the sequence of GED tags to select the in-context examples provided in the prompt. We perform experiments in English, Czech, German and Russian, using Llama 2 and GPT-3.5. The results show that both methods increase the performance of prompt-based GEC and point to a promising direction of using GED models as part of the correction pipeline performed by LLMs.
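As a rough illustration of the two methods, the sketch below shows detection-gated correction and tag-based selection of in-context examples. It is not the thesis implementation: the function names, the "c"/"i" tag scheme, the toy detector and corrector, and the use of difflib's sequence similarity to compare tag sequences are all assumptions made for the example. The abstract does not specify how tag sequences are compared.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Sequence, Tuple

# Assumed tag scheme: one tag per token, "c" = correct, "i" = incorrect.
Tags = List[str]
# An in-context example: (GED tags of the source, source sentence, corrected sentence).
Example = Tuple[Tags, str, str]


def gated_correct(
    sentence: str,
    detect: Callable[[str], Tags],
    correct: Callable[[str], str],
) -> str:
    """Method 1 (sketch): call the corrector only if the detector flags an error."""
    if "i" not in detect(sentence):
        return sentence  # judged error-free, so the LLM is skipped entirely
    return correct(sentence)


def tag_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Similarity of two tag sequences (an assumed measure, not the thesis's)."""
    return SequenceMatcher(None, list(a), list(b)).ratio()


def select_examples(query_tags: Tags, pool: List[Example], k: int = 2) -> List[Example]:
    """Method 2 (sketch): keep the k pool examples with the most similar tag sequences."""
    ranked = sorted(pool, key=lambda ex: tag_similarity(query_tags, ex[0]), reverse=True)
    return ranked[:k]


# Hypothetical stand-ins for a trained GED model and a prompted LLM.
def toy_detect(sentence: str) -> Tags:
    return ["i" if token == "goed" else "c" for token in sentence.split()]


def toy_correct(sentence: str) -> str:
    return sentence.replace("goed", "went")


pool: List[Example] = [
    (["c", "c", "c"], "I like tea", "I like tea"),
    (["c", "i", "c"], "He goed home", "He went home"),
    (["i", "c", "c", "c"], "Me like tea now", "I like tea now"),
]

corrected = gated_correct("She goed home", toy_detect, toy_correct)
examples = select_examples(toy_detect("She goed home"), pool, k=1)
```

In a real pipeline, `detect` would wrap a trained GED model and `correct` would send a prompt containing the selected examples to the LLM. The gating step trades some recall for fewer unnecessary edits, since sentences judged error-free never reach the LLM.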
To attend this master’s thesis presentation in person, please go to DC 2585. You can also attend virtually using Zoom at https://vectorinstitute.zoom.us/j/86538188181.