Ph.D. (Computer Science), Cornell, 2000
S.B. (Mathematics with Computer Science), MIT, 1995
(Me, with chickenpox spots, February 2007. Not pleasant.)
(Many more cute photos of Rover)
(A formal photo of me from early 2005)
(An old photo of me from the late '90s, when I lived in Ithaca. I don't look like this anymore...)
I am a member of our Bioinformatics group. I also do research on music information retrieval.
You may find my CV in .pdf or HTML versions here. My Google Scholar page also lists my publications.
In 2010 and 2011, Jakub Truszkowski and I developed what we believe is the first error-tolerant O (n log n)-runtime algorithm for phylogenetic reconstruction. QTree uses a complex search structure and a random walk approach to incrementally add new taxa to a growing tree, with each addition happening in logarithmic runtime. Performance is comparable to FastTree and other existing fast phylogeny packages.
Conference papers about QTree are in CPM 2011 and WABI 2011; a journal version of the CPM paper will appear in TCS, and the journal version of the WABI paper (which includes presentation of the actual QTree program), was in Algorithms in Molecular Biology.
As part of proving one of the theorems in the first QTree paper, I needed a tail inequality for negative bionomial random variables. I couldn't find any good links for "negative binomial Chernoff", so I spent forever proving something that should've been obvious. Now you don't need to. (And yes, this is sort-of in the Dubhashi/Panconesi book, but not with the emphasis it might have had.)
Jakub and I have also developed an algorithm which runs in o (n2) runtime for sequences of length O (log n) that come from Markov models of evolution. The algorithm runs in time competitive with that of FastTree and QTree, gives higher accuracy (except for trees with long edges), and uses locality-sensitve hashing and ancestral sequence reconstruction to try to agglomerate a single tree from a group of n sequences. A paper that presents the algorithm was in WABI 2012, and LSHPlace, the adaptation to phylogenetic placement, was in PSB 2013. We will make more practical versions of this software available soon.
Margareta Ackerman and David Loker and I adapted some of their theoretical work on clustering to look at what kinds of properties various phylogenetic algorithms satisfy. We independently proved a result similar to one of Frederick Matsen about how much a bad taxon can damage the tree constructed by Neighbour joining. Our paper appeared at ICCABS 2012, and the journal paper is in BMC Genomics.
My former MMath student Hussein Hirjee and I developed an algorithm for detecting rhymes in rap music lyrics. This work was published in ISMIR 2009 and in Empirical Musicology Review; the software itself is available on SourceForge, and was a demo at ISMIR 2010.
We also developed an algorithm for correcting misheard lyric queries; this work was publshed at ISMIR 2010.
Jakub Truszkowski and I developed an algorithm for "robust" decoding hidden Markov models that allows for some inexactness in boundaries between features; this work was published in APBC 2010, and an adaptation of this method to HIV subtyping was published in BMC Bioinformatics in 2011.
Daniil Golod and I played around with looking at the top k paths in an HMM and summarizing them; this work was published in APBC 2010 as well.
I was interested in work by Tanya Berger-Wolf and her colleagues in identifying sibling groups in wild populations. A WABI 2010 paper with Tanya proposes a trivially simple heuristic for this problem (the previous method used integer programming) that works beautifully and has theoretical support.
More recently, my MMath student Dan Dexter and I are adapting some ideas from that work to half-sibling reconstruction; that work was in WABI 2012, and its journal paper will be in the special issue, in Algorithms in Molecular Biology.
Largely because of my concerns about casino gambling, I've recently affiliated with the Gambling Research Lab at Waterloo. I'm largely being their math consultant, and we're starting to write our first papers about the influx of casino-style gambling into all aspects of life.
A theme of my work is a mathematical understanding of why NP-hard bioinformatics problems often aren't hard to solve in practice. I have studied this problem for haplotype inference, for motif finding, for kinship discovery, and my work on HMM decoding seeks to extend the idea to that domain as well.
These are the courses I have taught since my arrival at UW:
Currently, I am Director of Undergraduate Studies for the School. From July, 2011 to June, 2013, I was Associate Director of the School, which means that I acted as Director when David Taylor was away, and also meant that I chaired a variety of School committees.
I also am Waterloo's Academic Colleague to the Council of Ontario Universities, and I serve on the University Senate and its Executive Committee. I am a member of the Faculty Association's Status of Women and Equity Committee.
And finally, I am not the bestselling author Dan Brown, nor are we related. I don't like his writing. We have corresponded in the past, but that was long ago. Please don't send me fan mail or threats.
David R. Cheriton School of Computer Science
last updated 6 jul 2013