Intelligent Autonomic Computing for Computational Biology

_Principal Investigator:_
Igor Jurisica, Associate Professor
Department of Computer Science, University of Toronto, and Ontario Cancer Institute

Background: Computational biology is concerned with developing and using techniques from computer science, informatics, mathematics, and statistics to solve biological problems. Analyzing biomedical data requires robust approaches that deal with (ultra) high dimensionality, multimodal and rapidly evolving representations, missing information, ambiguity and uncertainty, noise, and incompleteness of domain theories. Merely coping with the deluge of data is no longer an option; systematic analysis of the data is a necessity in biomedical research. Despite the introduction of many powerful chemotherapeutic agents over the past two decades, most cancers retain devastating mortality rates. To significantly impact cancer research, we must discover novel therapeutic approaches for targeting metastatic disease, as well as diagnostic markers that reflect changes associated with disease onset and can detect early-stage disease. Better drugs must be rationally designed, and current drugs made more efficacious either by re-engineering or by information-based combination therapy. To tackle these complex biological problems and their impact, high-throughput biology requires integrative computational biology, i.e., considering multiple data types and developing and applying diverse algorithms for heterogeneous data analysis and visualization.
Our analyses have diverse requirements and demand several computational components; these needs cannot be met effectively by a single generic high-performance computing (HPC) platform. Some analyses are CPU-intensive while others also have large storage needs; some require long batch processes while others are computationally demanding but need to work interactively; and many require the highest levels of information security.

Objectives: The goal is to implement autonomic and on-demand computing for heterogeneous systems and complex applications, providing scalable performance, maximized system utilization, minimized human intervention for system maintenance and optimization, system self-monitoring with preventive maintenance, and reduced cost. In this project, we will focus on three main areas:

  1. to provide existing and new computational biology applications for testing our and other systems;
  2. to develop a pattern discovery approach to application performance improvement on the consolidated application server; and
  3. to implement scheduling optimization in the heterogeneous grid.

The first area will focus on heterogeneous data integration, protein crystallography, protein-protein interactions, and integrative cancer profile analysis. The second area will monitor and capture individual transactions, resources used, and system faults. It will then apply data mining algorithms to discover association rules among applications from historical application response times, resources, and faults, and it will combine case-based reasoning, the discovered rules, and system monitoring information to improve performance. Our goals are:

  a. to improve system response time by anticipating application needs for resources and preparing them in advance;
  b. to optimize system resources for a given set of applications over time, i.e., automated system tuning based on discovered relationships between applications, resources, and system performance; and
  c. to improve root cause determination of a given fault by using indirect relationships discovered among applications and resources.

The third area will apply case-based reasoning, optimized job run time estimation, and dynamic scheduling heuristic selection to minimize total run time for computational biology applications in the heterogeneous grid.
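
As a rough illustration of how case-based run time estimation and scheduling heuristic selection might fit together, consider the minimal sketch below. It is not the project's implementation. The job features, the node speeds, the three ordering heuristics, and every name in the code are assumptions introduced here for illustration only. Run times are estimated by retrieving the most similar past jobs, and each candidate heuristic is then simulated on those estimates so that the one with the smallest total run time (makespan) can be selected.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Case:
    """A past job (hypothetical): numeric features plus its observed run time."""
    features: tuple[float, ...]  # assumed example: (input size in GB,)
    runtime: float               # seconds on a reference node with speed 1.0


def estimate_runtime(case_base: list[Case], features: tuple[float, ...], k: int = 3) -> float:
    """Retrieve the k most similar past cases and reuse their mean run time."""
    def distance(case: Case) -> float:
        return sqrt(sum((a - b) ** 2 for a, b in zip(case.features, features)))

    nearest = sorted(case_base, key=distance)[:k]
    return sum(c.runtime for c in nearest) / len(nearest)


def greedy_makespan(estimates: list[float], speeds: list[float]) -> float:
    """Place jobs, in the given order, on the node that would finish each one earliest."""
    busy_until = [0.0] * len(speeds)
    for work in estimates:
        finish = [busy_until[i] + work / speeds[i] for i in range(len(speeds))]
        node = min(range(len(speeds)), key=finish.__getitem__)
        busy_until[node] = finish[node]
    return max(busy_until)


# Assumed candidate heuristics: they differ only in the order jobs are considered.
HEURISTICS = {
    "submission_order": lambda est: list(est),
    "shortest_first": lambda est: sorted(est),
    "longest_first": lambda est: sorted(est, reverse=True),
}


def select_heuristic(estimates: list[float], speeds: list[float]) -> tuple[str, float]:
    """Simulate every heuristic on the estimates; return the best (name, makespan)."""
    results = {name: greedy_makespan(order(estimates), speeds)
               for name, order in HEURISTICS.items()}
    best = min(results, key=results.get)
    return best, results[best]


if __name__ == "__main__":
    past = [Case((1.0,), 100.0), Case((2.0,), 200.0), Case((4.0,), 400.0)]
    jobs = [(1.5,), (3.0,), (2.0,)]
    estimates = [estimate_runtime(past, f, k=1) for f in jobs]
    print(select_heuristic(estimates, speeds=[1.0, 2.0]))
```

Selecting the heuristic by simulation keeps the choice dynamic: as new cases are added to the case base, the estimates change, and a different ordering may win for the next batch of jobs. A real grid scheduler would also have to account for data transfer, job dependencies, and node failures, all of which this sketch ignores.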

Potential benefit to Ontario: Improved system utilization, increased system uptime, and reduced need for human intervention in system maintenance and optimization will reduce the cost of system operation. Improved computational analysis will lead to relevant discoveries faster, thereby reducing the cancer burden, improving clinical care and quality of life for patients, and reducing healthcare costs.

_Other Projects_

  • Automated Management of Virtual Database Appliances
  • Fine-grained Resource Management and Problem Detection in Dynamic Content Servers
  • Semantically Configurable Modelling Notations and Tools
  • Model Management for Continuously Evolving Systems
  • Modeling, Evolution, and Automated Configuration of Software Services
  • Elaborating and Evaluating UML's 3-Layer Semantics Architecture
  • Performance Management of IT Infrastructure
  • Performance-Model-Assisted Creation and Management of Service Systems

Topic revision: r2 - 2007-05-14 - CherylMorris