# CS886 Winter09 - Syllabus: Bayesian Data Analysis

## Objectives

Information technologies have given rise to an abundance of data that needs to be analyzed.  While there are many ways to analyze data, Bayesian methods distinguish themselves by their explicit use of probabilities to quantify uncertainty.  Taking uncertainty into account is critical when making predictions, recognizing patterns and drawing conclusions.  The use of probability theory to represent uncertainty leads to conceptually simple methods that are principled, facilitate generalization, tend to resist overfitting, can be easily composed for information fusion and facilitate model selection.  However, these benefits come at a price: inference is often computationally expensive.

This course focuses on the theory of Bayesian learning, models for Bayesian analysis and algorithms for Bayesian inference.  The theory, models and algorithms covered will be of general interest and therefore applicable to a wide range of domains beyond machine learning and computational statistics.  This course should be of interest to researchers in a variety of fields where there is a need to analyze data, including natural language processing, information retrieval, data mining, bioinformatics, computer vision, computational finance, health informatics and robotics.

## References

We will make use of two textbooks with additional readings from selected research papers.  The first textbook is on reserve at the library and the second textbook is available online.
• Gelman, Carlin, Stern and Rubin (2004), Bayesian Data Analysis, 2nd edition
• Rasmussen and Williams (2006), Gaussian Processes for Machine Learning

## Outline

Topics:

1. Basics of probability theory, machine learning and statistics
• Inference
• Generalization
2. Models
• Univariate and multivariate models
• Bayesian networks
• Second-order models (i.e., distributions over distribution parameters)
• Infinite models
• Gaussian process
• Dirichlet process
• Hierarchical models
• Non-parametric models
3. Prior construction
• Conjugate priors
• Informative and non-informative priors
• Hierarchical priors
4. Inference (for learning and predictions)
• Bayes' theorem
• Exact inference with conjugate distributions
• Approximate inference with Markov chain simulation
5. Model checking

Application domains:

1. Natural language processing and information retrieval
• Latent semantic analysis
• Topic modeling
2. Machine learning and bioinformatics
• Model selection
• Kernel learning
3. Computer vision
• Scene analysis
4. Robotics
• Inverse kinematics
5. Control
• Bayesian reinforcement learning
6. Health informatics
• Experimental design
7. Computational finance
• Time-series analysis
8. Data mining