A Graphical Modeling Framework to Study Complex Dependence Patterns in High-Dimensional Biological Data

Project Type:

Seed

The next step of work in this project is to create a unique and inventive interface between biology and statistics. Genomic and statistical research must go hand in hand, each informing the other on the basis of sound principles, with the common goal of obtaining answers closer to the objective truth than either could have achieved alone.

Project Leader(s):

Dr. Laurent Briollais, University of Toronto

Over the last twenty years, graphical models have been among the most efficient statistical tools for the analysis of complex, structured, high-dimensional data. They provide a probabilistic framework for making inferences and for representing the knowledge that we have about such data. In biological research, and particularly in the emerging -omics disciplines such as genomics, proteomics, metabolomics, and transcriptomics, data are often generated by complex high-throughput experiments and complex experimental designs. Graphical models can represent these complex biological problems and lead to relatively simple and tractable computational algorithms for obtaining the quantities of interest. The graph underlying a graphical model consists of a set of vertices and a set of edges linking some of the vertices. In biological studies, the vertices represent variables such as mRNA expression levels, protein levels, environmental conditions, genotypes, and phenotypes. Edges describe the relationships between these variables and can be interpreted with typical biological semantics (the so-called “annotation” in bioinformatics).

Despite the explosion of complex statistical models appearing in genetics and bioinformatics, an important gap remains between theory and its applications. Exchanges between the applied and methodological communities remain surprisingly limited. This is a common problem in -omics research, where highly structured biological systems are very difficult to understand from a single angle. Our group of researchers has developed unique expertise in both the statistical and biological sciences.
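To make the description of vertices and edges above concrete, the following is a minimal Python sketch of an undirected graph over the kinds of variables mentioned in the text. The vertex names, the edges, and the helper functions `build_adjacency` and `is_separated` are all hypothetical and chosen only for illustration; they do not come from the project. The separation check reflects a standard property of undirected graphical models: if a set of vertices blocks every path between two variables, those variables are conditionally independent given that set (under the global Markov property).

```python
from collections import deque

# Hypothetical vertices, named after the variable types listed in the text.
VERTICES = ["genotype", "environment", "mrna_expression", "protein_level", "phenotype"]

# Hypothetical undirected edges, chosen only for illustration.
EDGES = [
    ("genotype", "mrna_expression"),
    ("environment", "mrna_expression"),
    ("mrna_expression", "protein_level"),
    ("protein_level", "phenotype"),
]


def build_adjacency(vertices, edges):
    """Return a dict mapping each vertex to the set of its neighbours."""
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def is_separated(adjacency, source, target, blocked):
    """Return True if every path from source to target passes through `blocked`."""
    blocked = set(blocked)
    if source in blocked or target in blocked:
        raise ValueError("source and target must not be in the blocking set")
    seen = {source}
    queue = deque([source])
    # Breadth-first search that never steps onto a blocked vertex.
    while queue:
        current = queue.popleft()
        if current == target:
            return False
        for neighbour in adjacency[current]:
            if neighbour not in seen and neighbour not in blocked:
                seen.add(neighbour)
                queue.append(neighbour)
    return True


ADJACENCY = build_adjacency(VERTICES, EDGES)
```

In this toy graph, `is_separated(ADJACENCY, "genotype", "phenotype", {"protein_level"})` asks whether protein level blocks every path from genotype to phenotype, which is the graph-level question behind a conditional independence statement.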

The next step of our work is to create a unique and inventive interface between biology and statistics. Genomic and statistical research must go hand in hand, each informing the other on the basis of sound principles, with the common goal of obtaining answers closer to the objective truth than either could have achieved alone. The methods developed must target the inferences that biologists are interested in, and they must be mathematically and statistically sound.

We will demonstrate the relevance of our approach through several applications that could have a major impact on biology and public health:

Gene Discovery in Complex Human Diseases using Genome-wide Association Studies (GWAS)

Inference about Complex Biological Systems using -omics data

Predictive Models for Complex Human Diseases

Our ultimate goal is to develop predictive tools for personalized medicine in a variety of clinical contexts, using high-throughput -omics experiments.

Project Website:

http://research.lunenfeld.ca/mprime_Briollais/DEFAULT.ASP

Project team:

Dr. Gary Bader, University of Toronto

Dr. Adrian Dobra, University of Washington

Dr. Hélène Massam, York University

Dr. Hilmi Ozcelik, Samuel Lunenfeld Research Institute

Non-academic participants:

[url=http://www.genizon.com/]Genizon Biosciences Inc.[/url]

[url=http://www.wolfram.com/]Wolfram Research[/url]

[url=http://www.ibm.com/ca/en/]IBM[/url]

[url=http://www.tgen.org/]Translational Genomics Research Institute[/url]