11.26.14


Methods for time series analysis of RNA-seq data with application to human Th17 cell differentiation

  1. Tarmo Äijö1,*,
  2. Vincent Butty2,
  3. Zhi Chen3,
  4. Verna Salo3,
  5. Subhash Tripathi3,
  6. Christopher B. Burge2,
  7. Riitta Lahesmaa3 and
  8. Harri Lähdesmäki1,3,*

Author Affiliations


  1. Department of Information and Computer Science, Aalto University, FI-00076 Aalto, Finland
  2. Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  3. Turku Centre for Biotechnology, University of Turku and Åbo Akademi University, FI-20520 Turku, Finland

* To whom correspondence should be addressed

Abstract

Motivation: Gene expression profiling using RNA-seq is a powerful technique for screening RNA species’ landscapes and their dynamics in an unbiased way. While several advanced methods exist for differential expression analysis of RNA-seq data, proper tools to analyze RNA-seq time-course data have not been proposed.

Results: In this study, we use RNA-seq to measure gene expression during the early human T helper 17 (Th17) cell differentiation and T cell activation (Th0). To quantify Th17-specific gene expression dynamics, we present a novel statistical methodology, DyNB, for analyzing time-course RNA-seq data. We use non-parametric Gaussian processes to model temporal correlation in gene expression and combine that with negative binomial likelihood for the count data. To account for experiment-specific biases in gene expression dynamics, such as differences in cell differentiation efficiencies, we propose a method to rescale the dynamics between replicated measurements. We develop an MCMC sampling method to make inference of differential expression dynamics between conditions. DyNB identifies several known and novel genes involved in Th17 differentiation. Analysis of differentiation efficiencies revealed consistent patterns in gene expression dynamics between different cultures. We use qRT-PCR to validate differential expression and differentiation efficiencies for selected genes. Comparison of the results with those obtained via traditional time point-wise analysis shows that time-course analysis together with time rescaling between cultures identifies differentially expressed genes which would not otherwise be detected.
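As a rough illustration of the two modeling ingredients named above, the following Python sketch builds a Gaussian process covariance over sampling times and evaluates a negative binomial log-likelihood for read counts. It is not the DyNB implementation: the squared-exponential kernel, the mean/dispersion parameterization (variance = mean + dispersion × mean²), the function names, and all parameter values are assumptions chosen for illustration, and the MCMC inference and time rescaling steps are omitted.

```python
import math


def sq_exp_kernel(t1, t2, signal_var=1.0, length_scale=12.0):
    """Squared-exponential covariance between two time points.

    The kernel choice and its default hyperparameters are assumptions;
    the abstract only states that Gaussian processes are used.
    """
    return signal_var * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def gp_covariance(times, signal_var=1.0, length_scale=12.0):
    """Covariance matrix (list of lists) of a GP prior over the given times."""
    return [
        [sq_exp_kernel(a, b, signal_var, length_scale) for b in times]
        for a in times
    ]


def nb_log_pmf(count, mean, dispersion):
    """Log-probability of a count under a negative binomial distribution.

    Assumed parameterization: variance = mean + dispersion * mean**2.
    """
    r = 1.0 / dispersion  # "size" parameter of the negative binomial
    return (
        math.lgamma(count + r)
        - math.lgamma(r)
        - math.lgamma(count + 1)
        + r * math.log(r / (r + mean))
        + count * math.log(mean / (r + mean))
    )


def log_likelihood(counts, means, dispersion):
    """Sum of negative binomial log-probabilities over time points."""
    return sum(nb_log_pmf(y, m, dispersion) for y, m in zip(counts, means))


# Hypothetical example: read counts for one gene at five time points (hours).
times = [0.0, 12.0, 24.0, 48.0, 72.0]
counts = [10, 25, 60, 80, 75]
candidate_means = [12.0, 30.0, 55.0, 78.0, 80.0]  # e.g. one latent GP sample

K = gp_covariance(times)
ll = log_likelihood(counts, candidate_means, dispersion=0.1)
```

In a sampler of the kind the abstract describes, a candidate mean trajectory would be scored by a prior term derived from `K` plus the count likelihood `ll`; how DyNB actually combines them is not specified here.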

Availability: An implementation of the proposed computational methods will be available at http://research.ics.aalto.fi/csb/software/

Contact: tarmo.aijo@aalto.fi or harri.lahdesmaki@aalto.fi

Supplementary information: Supplementary data are available at Bioinformatics online.