Introduction to graphical models with applications to quantitative genetics and genomics

(The Final Programme is available as attachment)

From 03.06.2019 to 07.06.2019

Period

    • June 3 – 7, 2019

Instructors

 

Registration

Registrations are closed

REGISTRATION FEES

250 € for students, PhD students, post-docs, etc., and 350 € for researchers and professors (VAT excluded). Details about the terms of payment and whether VAT applies will be communicated once registration is complete.

 

Organization / contacts

Please send requests for information to Alessio Cecchinato (alessio.cecchinato@unipd.it)

 

Course Description

Graphical models comprise a set of data analysis tools for investigating and representing interconnected components in complex systems. In genetics and genomics, for example, graphical models can be used to study recursive and simultaneous relationships among phenotypes, or to investigate gene-phenotype networks. Graphical models exploit conditional independencies between variables to detect those that are directly linked to each other, and to infer the directional flow of information (e.g. causal effects) between them. As such, they can produce an interpretation of the relationships among variables that differs from the one obtained with traditional multivariate models, in which all relationships are represented by symmetric linear associations among random variables, such as covariances and correlations.

This course will provide an introduction to graphical models, including techniques such as path analysis, Bayesian networks (BNs), and structural equation models. Some theoretical background will be presented, and key concepts will be introduced, such as d-separation, causal sufficiency, instrumental variables, and the Markov blanket. All the material will be illustrated with applications in quantitative genetics and genomics, with examples including the prediction of phenotypes using earlier expressed traits, genome-enabled prediction, genome-wide association analysis (GWAS) and quantitative trait loci (QTL) mapping for multiple traits, and the analysis of multiple layers of omics information.
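As a small illustration of the difference between a symmetric association and a direct link, the following sketch simulates a hypothetical causal chain x -> y -> z and compares the marginal correlation between x and z with their partial correlation given y. It is not part of the course material: the variable names, the effect size of 0.8, and the sample size are assumptions chosen only for the example, and Python is used here although the course itself works with R packages.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, z, given):
    """First-order partial correlation of x and z, adjusting for `given`."""
    r_xz = pearson(x, z)
    r_xg = pearson(x, given)
    r_zg = pearson(z, given)
    return (r_xz - r_xg * r_zg) / math.sqrt((1 - r_xg ** 2) * (1 - r_zg ** 2))


def simulate_chain(n=5000, seed=1):
    """Simulate a hypothetical chain x -> y -> z with Gaussian noise.

    The effect size 0.8 is an arbitrary assumption for illustration.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.8 * a + rng.gauss(0, 1) for a in x]
    z = [0.8 * b + rng.gauss(0, 1) for b in y]
    return x, y, z


x, y, z = simulate_chain()
marginal = pearson(x, z)        # x and z are associated through y
partial = partial_corr(x, z, y)  # association left after adjusting for y

print(f"marginal corr(x, z)    = {marginal:.3f}")
print(f"partial corr(x, z | y) = {partial:.3f}")
```

In this simulated chain the marginal correlation between x and z is clearly non-zero, while the partial correlation given y is close to zero. That pattern is what one would expect when x and z are conditionally independent given y, that is, when there is no direct edge between them.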

Target audience and prerequisites

The course is aimed at graduate students and researchers interested in the analysis of genetic and genomic data, including complex traits, molecular markers and gene expression. Some basic knowledge of quantitative and molecular genetics, linear models, and elementary probability and statistics is expected. However, a brief overview of matrix algebra, probability distributions, and statistical inference will be provided at the beginning of the course. In addition, a working knowledge of R is desirable, but an introduction will be offered before the specific R packages are used.

COURSE MATERIAL

Click >>>HERE<<< to access the Dropbox folder with all the files.

COURSE OUTLINE

Correlation and Causation

    • Sewall Wright and path analysis
    • Observational and experimental data
    • Confounding and selection bias
    • Randomization

Basics of Matrix Algebra

    • Definitions and matrix operations
    • Systems of equations
    • Linear regression and least squares

Aspects of Multivariate Distributions

    • Density function or mass function
    • Marginal and conditional distributions
    • Expectation and variance
    • Covariance and independence
    • The multivariate normal distribution

Inference with Multivariate Models

    • Likelihood principle
    • Parameter estimation and hypothesis testing
    • Independence tests (discrete, continuous, and mixed cases)

Introduction to Graphical Models

    • Basic concepts; network topology features
    • Correlation networks
    • Marginal and partial correlations
    • Conditional independence and the concept of d-separation

Structural Equation Models in Quantitative Genetics

    • Traditional multi-trait mixed effects model (MTM)
    • Genetic and phenotypic correlation
    • Basics of structural equation models (SEM)
    • SEM with latent variables
    • SEM embedded in MTM; direct and indirect genetic effects

Bayesian Networks

    • Introduction
    • Structure learning (constraint- and score-based algorithms)
    • Parameter learning
    • The concept of Markov blanket
    • Causal inference

Applications in Genetics and Genomics

    • Building parsimonious models
    • Genome-enabled prediction
    • Instrumental variable and Mendelian randomization
    • Multiple-trait QTL mapping
    • Combining multiple layers of omics information

R packages

    • Rgraphviz, pcalg, bnlearn, qtlnet, sem, lavaan, among others