Research interests

Professor Brown's primary research area is the understanding of sequential data, combining ideas from evolutionary theory with probabilistic modeling and discrete mathematics.

His established expertise is in biological sequence analysis, where he has worked on a range of problems, from homology search and algorithms for hidden Markov models to haplotype inference and kinship discovery.

A recent notable effort, with PhD student Jakub Truszkowski, has been the design of algorithms for large-scale evolutionary tree reconstruction. Their work resulted in the algorithms QTree and LSHTree, which are among the fastest known in this domain that still have good accuracy guarantees. QTree uses quartet queries and a clever data structure to predict phylogenies in O(n log n) time for n species, while LSHTree is the first phylogeny algorithm to combine subquadratic runtime with accuracy guarantees for sequences from a Markov model.

Brown is also interested in why algorithms in bioinformatics tend to run faster in practice than theory might predict, and has answered this question for problems in motif finding, homology search, haplotype inference and kinship detection.

Brown's other major research area is music information retrieval. He has adapted algorithms from bioinformatics to lyric and audio analysis, with notable successes in rhyme detection in hip hop and musicological applications of this technique, misheard lyric disambiguation, and cover song detection. He is also working with a current student on ways to add lyric-related features to a variety of problems in music information retrieval.

Degrees and major awards

SB (Massachusetts Institute of Technology), MS, PhD (Cornell)

Early Researcher Award (2007-2012)

Industrial and sabbatical experience

Professor Brown's primary external research work since coming to Waterloo has been the year he spent at the Whitehead Institute / MIT Center for Genome Research (now the Broad Institute), where he analyzed data from the human and mouse genome sequence projects.

In 2006, he spent six months on sabbatical at the University of California, Davis, working on computational modeling of evolution; in 2009-2010, Brown worked on computational population genetics and hidden Markov model decoding.

Representative publications

A. Singhi and D.G. Brown. On cultural, textual and experiential aspects of music mood. To appear in Proceedings of the 2014 International Society for Music Information Retrieval Conference.

D.G. Brown, J. Truszkowski. Fast phylogenetic tree reconstruction using locality-sensitive hashing. Proceedings of the 2012 Workshop on Algorithms in Bioinformatics, pp 14-29.

H. Hirjee, D.G. Brown. Solving misheard lyric search queries using a probabilistic model of speech sounds. Proceedings of the 2010 International Society for Music Information Retrieval Conference, pp 147-152. Best Student Paper Award.

B. Brejova, D.G. Brown, and T. Vinar. Vector seeds: an extension to spaced seeds. Journal of Computer and System Sciences 70:364-380, 2005.

International Human Genome Sequencing Consortium (including D.G. Brown). Initial sequencing and analysis of the human genome. Nature 409:860-921, 2001.

University of Waterloo