Monthly Archives: August 2015


Deep Learning

In the last two months, two groups have published papers applying deep learning to problems related to gene regulation: protein-nucleic acid binding specificity [1] and chromatin state [2]. We will be talking about these soon.

Before discussing these papers, we think it will be useful to give people some time to get familiar with the fundamentals of artificial neural networks and deep learning. So, this coming *Monday* at our new time of 12 noon, we’ll have a meeting to talk about deep learning and work through each other’s questions. Beforehand, please check out some of the following resources and bring questions (or expertise you’d like to share!).

At the meeting, we’ll walk through the topics in this Nature review:

More resources:

Lecture slides from Mark’s machine learning class:, ANNs-2.pdf

Intro to neural networks from a programming perspective (just skimmed this one; looks like an interesting presentation):

[1] DeepBind (Alipanahi et al., Nature Biotechnology 2015)
[2] DeepSEA (Zhou & Troyanskaya, Nature Methods 2015)


Wanderlust with special guest Monocle

“Single-Cell Trajectory Detection Uncovers Progression and Regulatory Coordination in Human B Cell Development”

Bendall et al, Cell 2014


Tissue regeneration is an orchestrated progression of cells from an immature state to a mature one, conventionally represented as distinctive cell subsets. A continuum of transitional cell states exists between these discrete stages. We combine the depth of single-cell mass cytometry and an algorithm developed to leverage this continuum by aligning single cells of a given lineage onto a unified trajectory that accurately predicts the developmental path de novo. Applied to human B cell lymphopoiesis, the algorithm (termed Wanderlust) constructed trajectories spanning from hematopoietic stem cells through to naive B cells. This trajectory revealed nascent fractions of B cell progenitors and aligned them with developmentally cued regulatory signaling including IL-7/STAT5 and cellular events such as immunoglobulin rearrangement, highlighting checkpoints across which regulatory signals are rewired paralleling changes in cellular state. This study provides a comprehensive analysis of human B lymphopoiesis, laying a foundation to apply this approach to other tissues and “corrupted” developmental processes including cancer.
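To make the idea of "aligning single cells onto a trajectory" concrete before the meeting, here is a minimal sketch of graph-based ordering: build a k-nearest-neighbour graph over cells, then use shortest-path distance from a chosen starting cell as a stand-in for developmental progress. This is a simplification for discussion, not the authors' implementation; the published Wanderlust method does considerably more (for example, it combines results over many graphs), and none of that is shown. The function names, the toy data, and the choice of `k` are all our own assumptions.

```python
import heapq
import math


def knn_graph(cells, k):
    """Connect each cell to its k nearest neighbours (Euclidean distance)."""
    graph = {i: {} for i in range(len(cells))}
    for i, a in enumerate(cells):
        dists = sorted(
            (math.dist(a, b), j) for j, b in enumerate(cells) if j != i
        )
        for d, j in dists[:k]:
            # Store the edge in both directions so the graph is undirected.
            graph[i][j] = d
            graph[j][i] = d
    return graph


def graph_pseudotime(cells, start, k=2):
    """Shortest-path distance (Dijkstra) from the start cell to every cell."""
    graph = knn_graph(cells, k)
    dist = {i: math.inf for i in graph}
    dist[start] = 0.0
    queue = [(0.0, start)]
    while queue:
        d, i = heapq.heappop(queue)
        if d > dist[i]:
            continue  # stale queue entry, a shorter route was already found
        for j, w in graph[i].items():
            if d + w < dist[j]:
                dist[j] = d + w
                heapq.heappush(queue, (d + w, j))
    return dist


# Hypothetical toy data: five "cells" measured on two markers, listed out of order.
toy_cells = [(0.0, 0.0), (3.0, 0.5), (1.0, 0.2), (4.0, 0.4), (2.0, 0.1)]
pseudotime = graph_pseudotime(toy_cells, start=0, k=2)
ordering = sorted(pseudotime, key=pseudotime.get)  # cell indices, earliest first
```

Measuring distance along the graph rather than in a straight line is what lets this kind of approach follow a curved developmental path. Note that the starting cell has to be supplied by the user, here cell 0.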


Monocle method

(Trapnell et al., Nature Biotechnology 2014)


Defining the transcriptional dynamics of a temporal process such as cell differentiation is challenging owing to the high variability in gene expression between individual cells. Time-series gene expression analyses of bulk cells have difficulty distinguishing early and late phases of a transcriptional cascade or identifying rare subpopulations of cells, and single-cell proteomic methods rely on a priori knowledge of key distinguishing markers. Here we describe Monocle, an unsupervised algorithm that increases the temporal resolution of transcriptome dynamics using single-cell RNA-Seq data collected at multiple time points. Applied to the differentiation of primary human myoblasts, Monocle revealed switch-like changes in expression of key regulatory factors, sequential waves of gene regulation, and expression of regulators that were not known to act in differentiation. We validated some of these predicted regulators in a loss-of-function screen. Monocle can in principle be used to recover single-cell gene expression kinetics from a wide array of cellular processes, including differentiation, proliferation and oncogenic transformation.
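As a talking point, the ordering step can be sketched with a minimum spanning tree: the Monocle paper orders cells along a spanning tree built in a reduced-dimensional expression space, and takes the longest path through that tree as the main trajectory. The sketch below assumes the dimensionality reduction has already been done and ignores branches entirely, so it illustrates the idea rather than reproducing the Monocle software. All function names and the toy coordinates are hypothetical.

```python
import math


def minimum_spanning_tree(cells):
    """Prim's algorithm on the complete Euclidean graph; returns adjacency lists."""
    n = len(cells)
    tree = {i: [] for i in range(n)}
    # For each cell not yet in the tree: (distance to tree, closest tree cell).
    best = {i: (math.dist(cells[0], cells[i]), 0) for i in range(1, n)}
    while best:
        j = min(best, key=lambda i: best[i][0])
        d, parent = best.pop(j)
        tree[parent].append((j, d))
        tree[j].append((parent, d))
        for i in best:
            d_new = math.dist(cells[j], cells[i])
            if d_new < best[i][0]:
                best[i] = (d_new, j)
    return tree


def farthest_path(tree, source):
    """Return the path from source to the node farthest from it in the tree."""
    paths = {source: (0.0, [source])}
    stack = [source]
    while stack:
        i = stack.pop()
        d, path = paths[i]
        for j, w in tree[i]:
            if j not in paths:
                paths[j] = (d + w, path + [j])
                stack.append(j)
    return max(paths.values(), key=lambda p: p[0])[1]


def backbone_ordering(cells):
    """Longest path through the MST, found with two farthest-node sweeps."""
    tree = minimum_spanning_tree(cells)
    one_end = farthest_path(tree, 0)[-1]
    return farthest_path(tree, one_end)


# Hypothetical toy data: five cells in a 2-D reduced space, listed out of order.
toy_cells = [(2.0, 0.1), (0.0, 0.0), (4.0, 0.4), (1.0, 0.2), (3.0, 0.5)]
backbone = backbone_ordering(toy_cells)  # cell indices along the main path
```

The tree alone cannot say which end of the path is "early"; the direction has to come from outside information, such as which cells were collected at the first time point.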


Computational and analytical challenges in single-cell transcriptomics

Oliver Stegle (1), Sarah A. Teichmann (1,2) and John C. Marioni (1,2)

(1) European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK. (2) Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. Correspondence to J.C.M. doi:10.1038/nrg3833. Published online 28 January 2015.



The development of high-throughput RNA sequencing (RNA-seq) at the single-cell level has already led to profound new discoveries in biology, ranging from the identification of novel cell types to the study of global patterns of stochastic gene expression. Alongside the technological breakthroughs that have facilitated the large-scale generation of single-cell transcriptomic data, it is important to consider the specific computational and analytical challenges that still have to be overcome. Although some tools for analysing RNA-seq data from bulk cell populations can be readily applied to single-cell RNA-seq data, many new computational strategies are required to fully exploit this data type and to enable a comprehensive yet detailed study of gene expression at the single-cell level.