
Development of models for the statistical analysis of the association between genetic polymorphisms and psychiatric diseases

Juan José Abellán Andrés, CSIC
Funded by
Generalitat Valenciana
The statistical tools most commonly used to study the association between SNPs and disease phenotypes are based mainly on hypothesis testing and logistic regression (see e.g. [1]), although new alternatives such as logic regression [2,3] have been proposed more recently. Hypothesis tests are applied to each SNP individually and, since SNPs are often not independent, the analysis is subject to the multiple comparisons problem. To control this serious statistical limitation, the global p-value is usually corrected with the Bonferroni method, which lowers the nominal significance level of each test well below the usual 5%. In logistic regression models the response variable is the phenotype and the explanatory variables are the SNPs. When the number of variables, i.e. of SNPs, is very large, parameter estimation becomes numerically unstable. Logic regression is a more interesting alternative: it seeks associations between the phenotype and logical combinations of SNPs, including interactions between SNPs, and so takes the multivariate nature of the problem into account to some extent. Despite the usefulness of these techniques for exploratory purposes, they all share the limitation of being unable to incorporate possible structures underlying the data-generating mechanism that may link the SNPs or the phenotypes. SNPs can be grouped, for example, by haplotype blocks, by spatial correlation along the chromosome, or by their function [4]. Likewise, the disease phenotypes under study may be interrelated through some kind of structure that could be taken into account in the analysis [5].
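
The per-SNP testing and Bonferroni correction described above can be illustrated with a minimal sketch. It is not the project's code: the function names, the choice of a Pearson chi-square test on a 2x2 table of allele counts, and the counts themselves are assumptions made only for illustration.

```python
import math


def allelic_chi2_pvalue(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Hypothetical helper: rows are cases/controls, columns are alternative and
    reference allele counts for a single SNP.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    denom = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
             * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    if denom == 0:
        return 1.0  # degenerate table: no evidence of association
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_significant(p_values, alpha=0.05):
    """Return the Bonferroni per-test threshold and the indices that pass it."""
    threshold = alpha / len(p_values)
    return threshold, [i for i, p in enumerate(p_values) if p < threshold]


# Made-up allele counts for three SNPs (illustrative only).
tables = [(60, 40, 40, 60), (50, 50, 50, 50), (55, 45, 45, 55)]
p_values = [allelic_chi2_pvalue(*t) for t in tables]
threshold, hits = bonferroni_significant(p_values)
```

With three tests the per-test threshold is 0.05/3; with thousands of SNPs it shrinks accordingly, which is why the nominal level ends up far below 5%.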

The use of alternative methodologies such as multilevel or hierarchical models has not been explored so far. These models are highly flexible: they can incorporate different sources of variation, structured or unstructured, and can therefore account for more realistic dependence relationships underlying the data. The goal of this project is to explore the usefulness of these models in the context of the genomics and health studies described above. Bayesian statistics provides a suitable framework for inference in such highly structured models, and has already been applied in a wide variety of contexts [6]. The Bayesian paradigm is not new to genetics and genomics, where it has been used successfully (see e.g. [7-9] for illustrations of recent applications). For the particular problem of association between SNPs and phenotype, some authors have proposed, within the Bayesian framework, variable selection techniques in probit regression models (similar to logistic regression, but with a different link function between the mean and the linear predictor) to identify associated SNPs [10]. In this project we intend to investigate the use of Bayesian hierarchical models in studies of association between SNPs and mental illness, although their application to other types of disease should be straightforward.
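
To give a sense of how a hierarchical model can borrow strength across structured SNPs, the sketch below computes posterior means for SNP effects that share a common block-level distribution. It is only an illustration under strong assumptions, not the model proposed in this project: the normal-normal structure, the function name, and the treatment of the block mean and between-SNP standard deviation `tau` as known are all hypothetical simplifications. In a full Bayesian analysis these hyperparameters would receive priors and be estimated from the data.

```python
def shrink_toward_block_mean(estimates, std_errors, block_mean, tau):
    """Posterior means of SNP effects in a normal-normal hierarchical model.

    Assumed model (illustrative only):
        beta_j               ~ Normal(block_mean, tau^2)   # SNPs in one block
        estimate_j | beta_j  ~ Normal(beta_j, std_error_j^2)
    with block_mean and tau treated as known.
    """
    posterior_means = []
    for beta_hat, se in zip(estimates, std_errors):
        data_precision = 1.0 / se ** 2
        prior_precision = 1.0 / tau ** 2
        weight = data_precision / (data_precision + prior_precision)
        # Precision-weighted average of the raw estimate and the block mean.
        posterior_means.append(weight * beta_hat + (1.0 - weight) * block_mean)
    return posterior_means


# Made-up effect estimates for two SNPs in the same block.
shrunk = shrink_toward_block_mean(
    estimates=[0.8, 0.1], std_errors=[0.4, 0.1], block_mean=0.2, tau=0.2
)
```

The imprecisely estimated first SNP is pulled strongly toward the block mean, while the precisely estimated second SNP moves little. This partial pooling is what allows structure such as haplotype blocks to inform the individual effects.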

Our Team

Principal Investigator (PI)

  • Juan Ramón González Ruiz