2015.12.14 Identification of disease-causing single nucleotide variants in exome sequencing studies

2019-07-07 00:28:34

Center for Quantitative Biology, Peking University

Academic Seminar

  Title: Identification of disease-causing single nucleotide variants in exome sequencing studies

  Speaker: Rui Jiang (江瑞), Associate Professor

                           Department of Automation, Tsinghua University

  Time: 2015-12-14 (Monday), 13:00-14:00

  Venue: Lecture Hall 101, East Annex of the Old Chemistry Building, Peking University

  Host: Minghua Deng (邓明华), Professor

  Abstract:

Exome sequencing has been widely used to detect pathogenic nonsynonymous single nucleotide variants (SNVs) in human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data because of the large number of sequenced variants, the presence of a non-negligible fraction of pathogenic rare variants or de novo mutations, and the limited size of affected and normal populations. Here, we propose three bioinformatics approaches, SPRING, snvForest and GLINTS, for identifying pathogenic nonsynonymous SNVs for a given query disease. SPRING integrates six functional effect scores calculated by existing methods and five association scores derived from a variety of genomic data sources to calculate the statistical significance of an SNV being causative for a query disease. snvForest adopts an ensemble learning method to assign prediction scores to candidate SNVs. These two methods are designed to be used with a set of seed genes known to be associated with the disease of interest, and are thus suitable for studies of diseases for which some prior knowledge is available. GLINTS further incorporates three sets of disease phenotype similarity data to facilitate the detection of causative SNVs without any knowledge of seed genes for a query disease. It is therefore suitable for research on diseases whose genetic bases are completely unknown. With a series of comprehensive validation experiments, we demonstrate the effectiveness of these methods, not only in simulation studies but also in detecting causative de novo mutations for autism, epileptic encephalopathies and intellectual disability.

 

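Speaker bio: Rui Jiang (江瑞), male, Ph.D., is an associate professor in the Department of Automation, Tsinghua University. He received his Ph.D. from the Department of Automation, Tsinghua University, in 2002. From 2004 to 2007 he carried out postdoctoral research in computational molecular biology at the University of Southern California, USA. Since July 2007 he has been an associate professor and doctoral supervisor in the Department of Automation, Tsinghua University. From 2014 to 2015 he was a visiting scholar and visiting professor in the Department of Statistics, Stanford University, USA. His main research interests include bioinformatics methods for identifying disease-causing genetic factors, the modeling, analysis and application of biological networks, and deep learning for biomedical big data. In recent years he has published more than 40 papers as corresponding author in leading international journals such as PLoS Genetics and Nucleic Acids Res, and he has received a Second Prize in Natural Science from the Association of Automation (自动化学会).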