Virtual ChIP-seq: predicting transcription factor binding by learning from the transcriptome

Our next meeting will be at 2pm on Mar 12th, in room 4160 of the Discovery building. Our selected paper is Virtual ChIP-seq: predicting transcription factor binding by learning from the transcriptome.
The abstract is as follows.

Motivation: Identifying transcription factor binding sites is the first step in pinpointing non-coding mutations that disrupt the regulatory function of transcription factors and promote disease. ChIP-seq is the most common method for identifying binding sites, but performing it on patient samples is hampered by the amount of available biological material and the cost of the experiment. Existing methods for computational prediction of regulatory elements primarily predict binding in genomic regions with sequence similarity to known transcription factor sequence preferences. This has limited efficacy since most binding sites do not resemble known transcription factor sequence motifs, and many transcription factors are not even sequence-specific.

Results: We developed Virtual ChIP-seq, which predicts binding of individual transcription factors in new cell types using an artificial neural network that integrates ChIP-seq results from other cell types and chromatin accessibility data in the new cell type. Virtual ChIP-seq also uses learned associations between gene expression and transcription factor binding at specific genomic regions. This approach outperforms methods that use transcription factor sequence preferences in the form of position weight matrices, predicting binding for transcription factors (accuracy > 0.99; Matthews correlation coefficient > 0.3). In at least one validation cell type, performance of Virtual ChIP-seq is higher than all participants of the DREAM Challenge for in vivo transcription factor binding site prediction in 4 of 9 transcription factors that we could compare to.
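As background for the discussion, it may help to see why the abstract reports the Matthews correlation coefficient (MCC) alongside accuracy. Bound sites are rare relative to the whole genome, so accuracy alone can look excellent even for a weak predictor. The sketch below computes both metrics from confusion-matrix counts. The counts are hypothetical and chosen only to illustrate class imbalance; they are not taken from the paper, and the function names are ours.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all bins that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention),
    e.g. when the predictor never calls any bin as bound.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical example: 1,000,000 genomic bins, of which 2,000 are truly bound.
# The predictor recovers 700 bound bins and makes 1,300 false calls.
tp, fn = 700, 1300
fp, tn = 1300, 996700

print("accuracy:", accuracy(tp, fp, tn, fn))
print("MCC:", matthews_corrcoef(tp, fp, tn, fn))

# A trivial predictor that calls every bin unbound still scores high accuracy.
print("all-negative accuracy:", accuracy(0, 0, 998000, 2000))
print("all-negative MCC:", matthews_corrcoef(0, 0, 998000, 2000))
```

With these made-up counts, accuracy is above 0.99 while MCC is only about 0.35, and the predictor that never calls a binding site reaches an accuracy of 0.998 with an MCC of 0. This is one reason to read the MCC figure in the abstract as the more informative of the two.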

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.