TITLE: Discrete Multiple Testing In Detecting Differential Methylation Using Sequencing Data
INSTRUCTORS: Professor Nan Lin, Washington University in St. Louis
MODERATOR: Xiaoming Li
Abstract: DNA methylation, one of the most important epigenetic mechanisms, is critical in determining cell fate and is therefore closely tied to understanding disease processes such as cancer. Epigenetic tests are expected to become widely used for selecting personalized treatments in cancer and other diseases. We will discuss the multiple testing issues that arise in detecting differential methylation in next-generation sequencing studies. Detection requires comparing DNA methylation levels at millions of genomic loci across different genomic samples, and statistically it can be viewed as a large-scale multiple testing problem. Because read counts at individual CpG sites are often low, asymptotic tests are often inadequate: the discreteness of the test statistics is nonignorable. This raises many intriguing statistical issues concerning proper control of the false discovery rate (FDR). Popular FDR control procedures often assume that the test statistics are continuously distributed. Consequently, applying such methods directly to methylation sequencing data often yields underpowered analyses because of the discreteness. As discrete multiple testing is a generic statistical problem, the methods discussed in this tutorial are also widely applicable in scenarios beyond methylation sequencing data analysis. The first part of the tutorial will review background issues in multiple testing and next-generation sequencing data. The second part will discuss various recently developed FDR control methods for discrete multiple testing and provide R demonstrations on real methylation sequencing data.
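To make the discreteness issue concrete, here is a minimal sketch in Python. It is not taken from the tutorial, whose demonstrations are in R. It assumes, purely for illustration, that a single CpG site is tested with a two-sided Fisher's exact test on a 2x2 table of methylated and unmethylated reads in two samples. The function name and the read counts are hypothetical. The sketch lists every p-value the test can attain once the table margins are fixed.

```python
from math import comb

def fisher_two_sided_pvalues(n1, n2, m):
    """Return all attainable two-sided Fisher exact p-values for a 2x2 table.

    n1, n2: read counts (coverage) at the site in samples 1 and 2 (assumed values).
    m:      total number of methylated reads across both samples.
    """
    total = comb(n1 + n2, m)
    lo, hi = max(0, m - n2), min(n1, m)
    # Hypergeometric probability of k methylated reads in sample 1.
    probs = {k: comb(n1, k) * comb(n2, m - k) / total for k in range(lo, hi + 1)}
    pvals = set()
    for pk in probs.values():
        # Two-sided p-value: total probability of tables no more likely
        # than the observed one (small tolerance for floating-point ties).
        p = sum(q for q in probs.values() if q <= pk * (1 + 1e-9))
        pvals.add(round(min(p, 1.0), 12))
    return sorted(pvals)

# Hypothetical low-coverage sites: 3 reads per sample, then 5 reads per sample.
low = fisher_two_sided_pvalues(n1=3, n2=3, m=3)
mid = fisher_two_sided_pvalues(n1=5, n2=5, m=5)
print("coverage 3:", low)
print("coverage 5:", mid)
```

Under these assumed counts, the site with 3 reads per sample can only produce p-values of 0.1 or 1, so it can never be rejected at the 0.05 level, and the site with 5 reads per sample has only three attainable p-values, the smallest being 2/252 (about 0.0079). A procedure that treats such p-values as continuous and uniformly distributed under the null hypothesis ignores this restricted support, which is one way to see why it can lose power.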
Nan Lin is an Associate Professor in the Department of Mathematics at Washington University in St. Louis and has a joint appointment in the Division of Biostatistics at the Washington University in St. Louis School of Medicine. His methodological research is in the areas of big data, quantile regression, bioinformatics, Bayesian statistics, and longitudinal and functional data analysis. His applied research involves statistical analysis of data from anesthesiology and genomics. He teaches a wide range of statistics courses, including mathematical statistics, Bayesian statistics, linear models, experimental design, statistical computation, and nonparametric statistics.
He earned a B.S. (1999) from the University of Science and Technology of China, and an M.S. (2000) and Ph.D. (2003) in Statistics and a second M.S. (2003) in Finance from the University of Illinois at Urbana-Champaign.
Before joining Washington University, he was a postdoctoral associate (2003-2004) at the Center for Statistical Genomics and Proteomics, Yale University.