CFI support expands access to Canadiana, nation’s digital historical archive

Thursday, March 26, 2026

Dan Brown, co-investigator on project, will explore language models trained on Canadian-source historical data

A new national initiative to transform access to Canada’s historical records is bringing together researchers across disciplines and institutions, including Professor Dan Brown of Waterloo’s Cheriton School of Computer Science.

Known as Open Science Infrastructure for Canad(ian)a: Digital Collections of the Future, the project is led by Professor Constance Crompton, Canada Research Chair in Digital Humanities at the University of Ottawa. Recently awarded $4.02 million from the Canada Foundation for Innovation (CFI), the initiative will develop a comprehensive, ethical and accessible digital archive of Canada’s historical and cultural materials.

Professor Dan Brown on stairs in Waterloo's Davis Centre

Dan Brown is a professor at the Cheriton School of Computer Science whose research spans computational creativity, music information retrieval and bioinformatics. He holds a PhD and MSc from Cornell University and a BSc from MIT.

Professor Brown is among the project’s co-investigators, who include experts from the humanities, information science and computer science across Canada. He contributes technical expertise and represents computational researchers interested in using the collection for language modelling.

The initiative will explore Canadiana, a vast collection maintained by the Canadian Research Knowledge Network. Spanning materials from the 16th century to the present, the collection includes some 69 million pages of digitized heritage content, offering diverse perspectives on Canada’s social, economic, political and cultural development.

Despite its importance, much of the Canadiana collection is difficult to access and even harder to analyze. Many documents are unannotated, come in a range of formats, or have not been processed using optical character recognition.

The project will address these challenges by developing AI tools and infrastructure to organize, annotate and connect the data at scale. This includes enabling analysis of handwritten documents, standardizing formats for computational use, and making the collection more accessible to AI systems.

Professor Brown’s research will focus on how computational researchers can use the vast archive to better understand and model Canadian historical perspectives.

“I’m interested in how a language model trained on Canadian historical materials differs in its outputs (how might it generate a hypothetical speech by a historical Indigenous leader, for example) compared with a general-purpose system like Google Gemini or Claude,” Professor Brown says. “This project opens new ways of incorporating historical context into computational tools, and vice versa. It’s especially interesting to see how stereotype and bias found in the source materials will wind up in the outputs of such language models.”

The project is coordinated through the Canadian Research Knowledge Network, a network of 88 Canadian universities, libraries and research institutions dedicated to making trustworthy knowledge accessible. A significant portion of the CFI funding will support new infrastructure and expanded programming capacity to improve Canadiana’s query speed and performance.

The initiative also involves re-architecting the platform’s underlying technology and information management systems, adding new primary-source content, and integrating generative AI tools to support research, says Beth Sandore Namachchivaya, University Librarian at Waterloo.

“This work has the potential to advance the research of scholars across the disciplines who focus on Canadian history, literature, languages, cultures, politics and geography, in addition to numerous interdisciplinary applications.”