Please note: This seminar will take place in DC 1304.
Akira Yoshiyama, Undergraduate computer science student
David R. Cheriton School of Computer Science
Mentor: Lena Podina
Brain-computer interfaces (BCIs) have the potential to restore communication to people with paralysis by decoding neural activity evoked by attempted speech into text in real time. Electrocorticographic (ECoG) signals may produce more accurate results than electroencephalography due to their higher spatial and temporal resolution and their relative imperviousness to muscle-movement artifacts. Recent ECoG-based approaches to neural signal-to-text decoding have used deep neural networks, often including a language model. Most recently, Willett et al. (2023) demonstrated the first successful large-vocabulary decoding, yielding a 23.8% word error rate on a 125,000-word vocabulary. However, lower word error rates are needed for practical, everyday use.
We propose the use of transformers for end-to-end decoding of ECoG data to text. We aim to fine-tune a Large Language Model (LLM) on encodings of ECoG signals passed through a transformer encoder. The LLM would then output predicted word tokens. To our knowledge, transformers have not yet been applied to decode ECoG data for speech prediction in the literature.
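To make the proposed architecture concrete, here is a minimal PyTorch sketch of one plausible realization, not the speakers' implementation: ECoG feature windows pass through a transformer encoder, and the encoder outputs are projected into an LLM's embedding space so they can be consumed as prefix ("soft prompt") inputs during fine-tuning. All names and dimensions (128 ECoG channels, a 256-wide encoder, a 768-dimensional LLM embedding space) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ECoGToTextModel(nn.Module):
    """Hypothetical sketch: ECoG features -> transformer encoder ->
    projection into an LLM's token-embedding space."""

    def __init__(self, n_channels=128, d_model=256, llm_dim=768, n_layers=4):
        super().__init__()
        # Project per-timestep ECoG channel features into the encoder width.
        self.input_proj = nn.Linear(n_channels, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Map encoder outputs into the LLM's embedding dimension so they can
        # be prepended to text-token embeddings during fine-tuning.
        self.to_llm = nn.Linear(d_model, llm_dim)

    def forward(self, ecog):            # ecog: (batch, time, channels)
        h = self.input_proj(ecog)
        h = self.encoder(h)
        return self.to_llm(h)           # (batch, time, llm_dim)

# Example: two 4-second windows of 128-channel ECoG features at an assumed
# 50 frames per second.
model = ECoGToTextModel()
ecog = torch.randn(2, 200, 128)
prefix = model(ecog)
print(prefix.shape)  # torch.Size([2, 200, 768])
# In an end-to-end setup, this prefix could be concatenated with text-token
# embeddings and passed to the LLM (e.g. via the `inputs_embeds` argument of
# Hugging Face models), with the LLM fine-tuned to emit the predicted words.
```

One design note: feeding continuous encoder outputs to the LLM as embeddings, rather than quantizing them to discrete tokens, keeps the whole pipeline differentiable, which is what makes end-to-end fine-tuning possible.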
Another computer science DRP seminar, presented by Jaimee Yeung and Sanika Poojary, follows this one.