DNA methylation

Epigenomic Co-localization and Co-evolution Reveal a Key Role for 5hmC as a Communication Hub in the Chromatin Network of ESCs

Our selected paper for this week is titled Epigenomic Co-localization and Co-evolution Reveal a Key Role for 5hmC as a Communication Hub in the Chromatin Network of ESCs, from Cell. The abstract is as follows:

Epigenetic communication through histone and cytosine modifications is essential for gene regulation and cell identity. Here, we propose a framework that is based on a chromatin communication model to get insight on the function of epigenetic modifications in ESCs. The epigenetic communication network was inferred from genome-wide location data plus extensive manual annotation. Notably, we found that 5-hydroxymethylcytosine (5hmC) is the most-influential hub of this network, connecting DNA demethylation to nucleosome remodeling complexes and to key transcription factors of pluripotency. Moreover, an evolutionary analysis revealed a central role of 5hmC in the co-evolution of chromatin-related proteins. Further analysis of regions where 5hmC co-localizes with specific interactors shows that each interaction points to chromatin remodeling, stemness, differentiation, or metabolism. Our results highlight the importance of cytosine modifications in the epigenetic communication of ESCs.

Feel free to begin our discussion in the comments section below. Our meeting will be at 12:30 PM in room 3160 of the Discovery building on July 18th.


Regression Analysis of Combined Gene Expression Regulation in Acute Myeloid Leukemia

Yue Li, Minggao Liang, Zhaolei Zhang


Gene expression is a combinatorial function of genetic/epigenetic factors such as copy number variation (CNV), DNA methylation (DM), transcription factors (TF) occupancy, and microRNA (miRNA) post-transcriptional regulation. At the maturity of microarray/sequencing technologies, large amounts of data measuring the genome-wide signals of those factors became available from Encyclopedia of DNA Elements (ENCODE) and The Cancer Genome Atlas (TCGA). However, there is a lack of an integrative model to take full advantage of these rich yet heterogeneous data. To this end, we developed RACER (Regression Analysis of Combined Expression Regulation), which fits the mRNA expression as response using as explanatory variables, the TF data from ENCODE, and CNV, DM, miRNA expression signals from TCGA. Briefly, RACER first infers the sample-specific regulatory activities by TFs and miRNAs, which are then used as inputs to infer specific TF/miRNA-gene interactions. Such a two-stage regression framework circumvents a common difficulty in integrating ENCODE data measured in generic cell-line with the sample-specific TCGA measurements. As a case study, we integrated Acute Myeloid Leukemia (AML) data from TCGA and the related TF binding data measured in K562 from ENCODE. As a proof-of-concept, we first verified our model formalism by 10-fold cross-validation on predicting gene expression. We next evaluated RACER on recovering known regulatory interactions, and demonstrated its superior statistical power over existing methods in detecting known miRNA/TF targets. Additionally, we developed a feature selection procedure, which identified 18 regulators, whose activities clustered consistently with cytogenetic risk groups. One of the selected regulators is miR-548p, whose inferred targets were significantly enriched for leukemia-related pathway, implicating its novel role in AML pathogenesis. Moreover, survival analysis using the inferred activities identified C-Fos as a potential AML prognostic marker. Together, we provided a novel framework that successfully integrated the TCGA and ENCODE data in revealing AML-specific regulatory program at global level.
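
To make the two-stage idea in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes ordinary least squares in both stages, leaves out the miRNA terms, and runs on synthetic data. The function names (fit_sample_activities, fit_gene_interactions) and the array layout are hypothetical, chosen only for illustration.

```python
import numpy as np


def fit_sample_activities(expr, cnv, dm, tf_binding):
    """Stage 1 (assumed form): one regression per sample, across genes.

    expr, cnv, dm: arrays of shape (n_genes, n_samples).
    tf_binding: array of shape (n_genes, n_tfs), shared by all samples
    (e.g. binding measured in a generic cell line).
    Returns sample-specific TF activities of shape (n_tfs, n_samples).
    """
    n_genes, n_samples = expr.shape
    activities = np.zeros((tf_binding.shape[1], n_samples))
    for s in range(n_samples):
        # Design matrix: intercept, CNV, methylation, then one column per TF.
        X = np.column_stack([np.ones(n_genes), cnv[:, s], dm[:, s], tf_binding])
        coef, *_ = np.linalg.lstsq(X, expr[:, s], rcond=None)
        activities[:, s] = coef[3:]
    return activities


def fit_gene_interactions(expr, cnv, dm, tf_binding, activities):
    """Stage 2 (assumed form): one regression per gene, across samples.

    The stage-1 activities of the TFs bound at a gene serve as predictors.
    Returns gene-specific interaction weights of shape (n_genes, n_tfs).
    """
    n_genes, n_samples = expr.shape
    weights = np.zeros((n_genes, tf_binding.shape[1]))
    for g in range(n_genes):
        bound = np.flatnonzero(tf_binding[g] > 0)  # TFs with binding evidence
        X = np.column_stack(
            [np.ones(n_samples), cnv[g], dm[g], activities[bound].T]
        )
        coef, *_ = np.linalg.lstsq(X, expr[g], rcond=None)
        weights[g, bound] = coef[3:]
    return weights


# Synthetic demonstration (all numbers are made up).
rng = np.random.default_rng(0)
n_genes, n_samples, n_tfs = 200, 30, 5
cnv = rng.normal(size=(n_genes, n_samples))
dm = rng.normal(size=(n_genes, n_samples))
tf_binding = (rng.random((n_genes, n_tfs)) < 0.5).astype(float)
true_activity = rng.normal(size=(n_tfs, n_samples))
expr = 0.8 * cnv - 0.5 * dm + tf_binding @ true_activity
expr += 0.1 * rng.normal(size=expr.shape)  # measurement noise

activities = fit_sample_activities(expr, cnv, dm, tf_binding)
weights = fit_gene_interactions(expr, cnv, dm, tf_binding, activities)
print(activities.shape, weights.shape)
```

The point of the split is that stage 1 only needs the generic binding matrix to estimate per-sample activities, and stage 2 then reuses those activities as sample-specific predictors for each gene.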


Network-guided regression for detecting associations between DNA methylation and gene expression


Zi Wang (1), Edward Curry (2) and Giovanni Montana (1,3,*)

(1) Department of Mathematics, Imperial College London, London SW7 2AZ

(2) Division of Cancer, Imperial College London, Hammersmith Hospital, London W12 0NN

(3) Department of Biomedical Engineering, King’s College London, St Thomas’ Hospital, London SE1 7EH

(*) To whom correspondence should be addressed. Giovanni Montana, E-mail: giovanni.montana@kcl.ac.uk



Motivation: High-throughput profiling in biological research has resulted in the availability of a wealth of data cataloguing the genetic, epigenetic and transcriptional states of cells. This data could yield discoveries that lead to breakthroughs in the diagnosis and treatment of human disease, but requires statistical methods designed to find the most relevant patterns from millions of potential interactions. Aberrant DNA methylation is often a feature of cancer, and has been proposed as a therapeutic target. However, the relationship between DNA methylation and gene expression remains poorly understood.

Results: We propose Network-sparse Reduced-Rank Regression (NsRRR), a multivariate regression framework capable of using prior biological knowledge expressed as gene interaction networks to guide the search for associations between gene expression and DNA methylation signatures. We use simulations to show the advantage of our proposed model in terms of variable selection accuracy over alternative models that do not use prior network information. We discuss an application of NsRRR to TCGA datasets on primary ovarian tumours.

Availability: R code implementing the NsRRR model is available at http://www2.imperial.ac.uk/~gmontana/
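
For readers new to reduced-rank regression, the sketch below shows the classical, unpenalized version in Python. It is not NsRRR: the sparsity and network-guided penalties described in the abstract are left out, and the authors' own code is in R. The function name reduced_rank_regression is hypothetical, and treating methylation as the predictors X and expression as the responses Y is an assumption made only for illustration.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression with identity weighting.

    X: predictors, shape (n_samples, p); Y: responses, shape (n_samples, q).
    Returns a coefficient matrix of shape (p, q) with rank at most `rank`.
    """
    # Full-rank least-squares fit.
    C_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Leading right singular vectors of the fitted values give the
    # response directions that the predictors explain best.
    _, _, Vt = np.linalg.svd(X @ C_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the least-squares coefficients onto those directions.
    return C_ols @ V_r @ V_r.T


# Synthetic demonstration (all numbers are made up).
rng = np.random.default_rng(0)
n_samples, p, q, true_rank = 100, 6, 4, 2
X = rng.normal(size=(n_samples, p))  # stand-in for methylation features
C_true = rng.normal(size=(p, true_rank)) @ rng.normal(size=(true_rank, q))
Y = X @ C_true + 0.1 * rng.normal(size=(n_samples, q))  # stand-in for expression

C_hat = reduced_rank_regression(X, Y, rank=true_rank)
print(C_hat.shape, np.linalg.matrix_rank(C_hat, tol=1e-8))
```

The rank constraint forces all responses to depend on the predictors through a small number of shared latent factors. NsRRR, as described above, additionally uses gene interaction networks to guide which variables are selected.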