An integrative approach to predicting the functional effects of non-coding and coding sequence variation

Bioinformatics. 2015 May 15;31(10):1536-43. doi: 10.1093/bioinformatics/btv009. Epub 2015 Jan 11.

Abstract

Motivation: Technological advances have enabled the identification of an increasingly large spectrum of single-nucleotide variants within the human genome, many of which may be associated with monogenic disease or complex traits. Here, we propose an integrative approach, named FATHMM-MKL, to predict the functional consequences of both coding and non-coding sequence variants. Our method draws on a range of recently available genomic annotations and learns to weight the significance of each component annotation source.

Results: We show that our method outperforms current state-of-the-art algorithms, CADD and GWAVA, when predicting the functional consequences of non-coding variants. In addition, FATHMM-MKL is comparable to the best of these algorithms when predicting the impact of coding variants. The method includes a confidence measure to rank-order predictions.
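
The abstract does not spell out how the weighting of annotation sources is learned. As a rough illustration only, the sketch below assumes each annotation source yields its own kernel (similarity) matrix over variants and weights each kernel by its kernel-target alignment with the labels, a simple heuristic that is not claimed to be the procedure used by FATHMM-MKL. The function names, the two synthetic "annotation sources", and the labels are all hypothetical.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of one feature block, scaled so the mean diagonal entry is 1."""
    K = X @ X.T
    return K / (np.trace(K) / len(K))

def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

def combine_kernels(blocks, y):
    """Weight each source's kernel by its alignment and return the weighted sum.

    Assumption: alignment-based weights stand in for a learned weighting;
    a full multiple kernel learning solver would optimise them jointly
    with the classifier.
    """
    kernels = [linear_kernel(X) for X in blocks]
    scores = np.array([max(alignment(K, y), 0.0) for K in kernels])
    weights = scores / scores.sum()
    combined = sum(w * K for w, K in zip(weights, kernels))
    return weights, combined

# Synthetic data: 40 "variants" labelled +1 (functional) or -1 (neutral).
rng = np.random.default_rng(0)
y = np.repeat([1.0, -1.0], 20)
informative = y[:, None] + rng.normal(scale=0.5, size=(40, 3))  # tracks the labels
noise = rng.normal(size=(40, 3))                                # unrelated to labels

weights, K = combine_kernels([informative, noise], y)
print(weights)  # the informative source should receive the larger weight
```

The combined matrix `K` could then be passed to any kernel classifier that accepts a precomputed kernel. The point of the sketch is only that uninformative sources are down-weighted rather than discarded by hand.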

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Genetic Variation / genetics*
  • Genome, Human*
  • Genome-Wide Association Study
  • Genomics / methods
  • Humans
  • Molecular Sequence Annotation*
  • Open Reading Frames / genetics*
  • Phenotype
  • Untranslated Regions / genetics*

Substances

  • Untranslated Regions