Heewook Lee, Lane Fellow, Computational Biology Department
School of Computer Science, Carnegie Mellon University
Genetic diversity is necessary for the survival and adaptability of all forms of life, and its importance is observed universally, from humans to bacteria. A central challenge, therefore, is to improve our ability to identify and characterize the extent of genetic variation in order to understand the mutational landscape of life.
In this talk, I will focus on two important instances of genetic diversity, found in (1) human genomes (particularly the human leukocyte antigens, or HLA) and (2) bacterial genomes (rearrangement of insertion sequence [IS] elements). I will first show that specific graph data structures can naturally encode high levels of genetic variation, and I will describe our novel, efficient graph-based computational approaches for identifying genetic variants in both HLA and bacterial rearrangements. Each of these methods is specifically tailored to its own problem, making it possible to achieve state-of-the-art performance. For example, our method is the first able to reconstruct full-length HLA sequences from short-read sequence data, making it possible to discover novel alleles in individuals. For IS element rearrangement, I used our new approach to provide the first estimate of the genome-wide rate of IS-induced rearrangements, including recombination. I will also show the spatial patterns and biases that we find by analyzing E. coli mutation accumulation data spanning over 2.2 million generations. The graph-centric ideas in our computational approaches provide a foundation for analyzing genetically heterogeneous populations of genes and genomes, and suggest ways to investigate other instances of genetic diversity found in life.
Bio: Heewook Lee is currently a Lane Fellow in the Computational Biology Department at the School of Computer Science at Carnegie Mellon University, where he works on developing novel assembly algorithms for reconstructing highly diverse immune-related genes, including human leukocyte antigens.
He received a B.S. in computer science from Columbia University and an M.S. and a Ph.D. in computer science from Indiana University. Prior to his graduate studies, he worked as a bioinformatics scientist at a sequencing center and genomics company, where he was in charge of the computational unit responsible for carrying out various microbial genome projects and the Korean Human Genome Project.