szhang256


Learning causal networks with latent variables from multivariate information in genomic data

Our next meeting will be at 11:00 on Dec 5th, in room 4160 of the Discovery building. Our selected paper is Learning causal networks with latent variables from multivariate information in genomic data.
The abstract is as follows.

Learning causal networks from large-scale genomic data remains challenging in absence of time series or controlled perturbation experiments. We report an information-theoretic method which learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables, commonly found in many genomic datasets. Starting from a complete graph, the method iteratively removes dispensable edges, by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of available data. The remaining edges are then oriented based on the signature of causality in observational data. The approach and associated algorithm, miic, outperform earlier methods on a broad range of benchmark networks. Causal network reconstructions are presented at different biological size and time scales, from gene regulation in single cells to whole genome duplication in tumor development as well as long term evolution of vertebrates. Miic is publicly available at https://github.com/miicTeam/MIIC.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Reconstruction of developmental landscapes by optimal-transport analysis of single-cell gene expression sheds light on cellular reprogramming

Our next meeting will be at 11:00 on Nov 7th, in room 4160 of the Discovery building. Our selected paper is Reconstruction of developmental landscapes by optimal-transport analysis of single-cell gene expression sheds light on cellular reprogramming.
The abstract is as follows.

Understanding the molecular programs that guide cellular differentiation during development is a major goal of modern biology. Here, we introduce an approach, WADDINGTON-OT, based on the mathematics of optimal transport, for inferring developmental landscapes, probabilistic cellular fates and dynamic trajectories from large-scale single-cell RNA-seq (scRNA-seq) data collected along a time course. We demonstrate the power of WADDINGTON-OT by applying the approach to study 65,781 scRNA-seq profiles collected at 10 time points over 16 days during reprogramming of fibroblasts to iPSCs. We construct a high-resolution map of reprogramming that rediscovers known features; uncovers new alternative cell fates including neural- and placental-like cells; predicts the origin and fate of any cell class; highlights senescent-like cells that may support reprogramming through paracrine signaling; and implicates regulatory models in particular trajectories. Of these findings, we highlight Obox6, which we experimentally show enhances reprogramming efficiency. Our approach provides a general framework for investigating cellular differentiation.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Vicus: Exploiting local structures to improve network-based analysis of biological data

Our next meeting will be at 11:00 on Oct 24th, in room 4160 of the Discovery building. Our selected paper is Vicus: Exploiting local structures to improve network-based analysis of biological data.
The abstract is as follows.

Biological networks entail important topological features and patterns critical to understanding interactions within complicated biological systems. Despite a great progress in understanding their structure, much more can be done to improve our inference and network analysis. Spectral methods play a key role in many network-based applications. Fundamental to spectral methods is the Laplacian, a matrix that captures the global structure of the network. Unfortunately, the Laplacian does not take into account intricacies of the network’s local structure and is sensitive to noise in the network. These two properties are fundamental to biological networks and cannot be ignored. We propose an alternative matrix Vicus. The Vicus matrix captures the local neighborhood structure of the network and thus is more effective at modeling biological interactions. We demonstrate the advantages of Vicus in the context of spectral methods by extensive empirical benchmarking on tasks such as single cell dimensionality reduction, protein module discovery and ranking genes for cancer subtyping. Our experiments show that using Vicus, spectral methods result in more accurate and robust performance in all of these tasks.
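
For readers new to spectral methods, here is a minimal sketch of the standard unnormalized graph Laplacian that the abstract refers to, L = D - A, where A is the adjacency matrix and D the diagonal degree matrix. This is not the Vicus matrix itself; the function name `graph_laplacian` and the toy three-node network are assumptions made for illustration only.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return the unnormalized Laplacian L = D - A of an undirected graph.

    `adjacency` is a symmetric matrix of edge weights (hypothetical input,
    not taken from the paper).
    """
    A = np.asarray(adjacency, dtype=float)
    D = np.diag(A.sum(axis=1))  # degree of each node on the diagonal
    return D - A


# Toy path network on three nodes: 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

L = graph_laplacian(A)

# Spectral methods work with the eigenvalues and eigenvectors of L.
# eigvalsh is appropriate because L is symmetric; it returns the
# eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(L)
```

Every row of L sums to zero, so the smallest eigenvalue is always 0; the remaining eigenvalues and their eigenvectors are what spectral clustering and embedding methods typically use.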

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Knowledge-guided gene prioritization reveals new insights into the mechanisms of chemoresistance

Our next meeting will be at 11:00 on Oct 10th, in room 4160 of the Discovery building. Our selected paper is Knowledge-guided gene prioritization reveals new insights into the mechanisms of chemoresistance.
The abstract is as follows.

Background: Identification of genes whose basal mRNA expression predicts the sensitivity of tumor cells to cytotoxic treatments can play an important role in individualized cancer medicine. It enables detailed characterization of the mechanism of action of drugs. Furthermore, screening the expression of these genes in the tumor tissue may suggest the best course of chemotherapy or a combination of drugs to overcome drug resistance.

Results: We developed a computational method called ProGENI to identify genes most associated with the variation of drug response across different individuals, based on gene expression data. In contrast to existing methods, ProGENI also utilizes prior knowledge of protein–protein and genetic interactions, using random walk techniques. Analysis of two relatively new and large datasets including gene expression data on hundreds of cell lines and their cytotoxic responses to a large compendium of drugs reveals a significant improvement in prediction of drug sensitivity using genes identified by ProGENI compared to other methods. Our siRNA knockdown experiments on ProGENI-identified genes confirmed the role of many new genes in sensitivity to three chemotherapy drugs: cisplatin, docetaxel, and doxorubicin. Based on such experiments and extensive literature survey, we demonstrate that about 73% of our top predicted genes modulate drug response in selected cancer cell lines. In addition, global analysis of genes associated with groups of drugs uncovered pathways of cytotoxic response shared by each group.

Conclusions: Our results suggest that knowledge-guided prioritization of genes using ProGENI gives new insight into mechanisms of drug resistance and identifies genes that may be targeted to overcome this phenomenon.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Reversed graph embedding resolves complex single-cell trajectories

Our next meeting will be at 11:00 on September 26th, in room 4160 of the Discovery building. Our selected paper is Reversed graph embedding resolves complex single-cell trajectories.
The abstract is as follows.

Single-cell trajectories can unveil how gene regulation governs cell fate decisions. However, learning the structure of complex trajectories with multiple branches remains a challenging computational problem. We present Monocle 2, an algorithm that uses reversed graph embedding to describe multiple fate decisions in a fully unsupervised manner. We applied Monocle 2 to two studies of blood development and found that mutations in the genes encoding key lineage transcription factors divert cells to alternative fates.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Context Specificity in Causal Signaling Networks Revealed by Phosphoprotein Profiling

Our next meeting will be at 2:30 on August 4th, in room 4160 of the Discovery building. Our selected paper is Context Specificity in Causal Signaling Networks Revealed by Phosphoprotein Profiling.
The abstract is as follows.

Signaling networks downstream of receptor tyrosine kinases are among the most extensively studied biological networks, but new approaches are needed to elucidate causal relationships between network components and understand how such relationships are influenced by biological context and disease. Here, we investigate the context specificity of signaling networks within a causal conceptual framework using reverse-phase protein array time-course assays and network analysis approaches. We focus on a well-defined set of signaling proteins profiled under inhibition with five kinase inhibitors in 32 contexts: four breast cancer cell lines (MCF7, UACC812, BT20, and BT549) under eight stimulus conditions. The data, spanning multiple pathways and comprising ~70,000 phosphoprotein and ~260,000 protein measurements, provide a wealth of testable, context-specific hypotheses, several of which we experimentally validate. Furthermore, the data provide a unique resource for computational methods development, permitting empirical assessment of causal network learning in a complex, mammalian setting.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning

Our next meeting will be at 3:00 on June 23rd, in room 4160 of the Discovery building. Our selected paper is Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning.
The abstract is as follows.

We present single-cell interpretation via multikernel learning (SIMLR), an analytic framework and software which learns a similarity measure from single-cell RNA-seq data in order to perform dimension reduction, clustering and visualization. On seven published data sets, we benchmark SIMLR against state-of-the-art methods. We show that SIMLR is scalable and greatly enhances clustering performance while improving the visualization and interpretability of single-cell sequencing data.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Predicting Causal Relationships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells

Our next meeting will be at 3:00 on June 9th, in room 4160 of the Discovery building. Our selected paper is Predicting Causal Relationships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells.
The abstract is as follows.

Learning the causal relationships that define a molecular system allows us to predict how the system will respond to different interventions. Distinguishing causality from mere association typically requires randomized experiments. Methods for automated causal discovery from limited experiments exist, but have so far rarely been tested in systems biology applications. In this work, we apply state-of-the-art causal discovery methods on a large collection of public mass cytometry data sets, measuring intra-cellular signaling proteins of the human immune system and their response to several perturbations. We show how different experimental conditions can be used to facilitate causal discovery, and apply two fundamental methods that produce context-specific causal predictions. Causal predictions were reproducible across independent data sets from two different studies, but often disagree with the KEGG pathway databases. Within this context, we discuss the caveats we need to overcome for automated causal discovery to become a part of the routine data analysis in systems biology.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Selecting the most appropriate time points to profile in high-throughput studies

Our next meeting will be at 3:00 on May 26th, in room 4160 of the Discovery building. Our selected paper is Selecting the most appropriate time points to profile in high-throughput studies.
The abstract is as follows.

Biological systems are increasingly being studied by high throughput profiling of molecular data over time. Determining the set of time points to sample in studies that profile several different types of molecular data is still challenging. Here we present the Time Point Selection (TPS) method that solves this combinatorial problem in a principled and practical way. TPS utilizes expression data from a small set of genes sampled at a high rate. As we show by applying TPS to study mouse lung development, the points selected by TPS can be used to reconstruct an accurate representation for the expression values of the non selected points. Further, even though the selection is only based on gene expression, these points are also appropriate for representing a much larger set of protein, miRNA and DNA methylation changes over time. TPS can thus serve as a key design strategy for high throughput time series experiments.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.


Discovering sparse transcription factor codes for cell states and state transitions during development

Our next meeting will be at 3:00 on April 28th, in room 4160 of the Discovery building. Our selected paper is Discovering sparse transcription factor codes for cell states and state transitions during development.
The abstract is as follows.

Computational analysis of gene expression to determine both the sequence of lineage choices made by multipotent cells and to identify the genes influencing these decisions is challenging. Here we discover a pattern in the expression levels of a sparse subset of genes among cell types in B- and T-cell developmental lineages that correlates with developmental topologies. We develop a statistical framework using this pattern to simultaneously infer lineage transitions and the genes that determine these relationships. We use this technique to reconstruct the early hematopoietic and intestinal developmental trees. We extend this framework to analyze single-cell RNA-seq data from early human cortical development, inferring a neocortical-hindbrain split in early progenitor cells and the key genes that could control this lineage decision. Our work allows us to simultaneously infer both the identity and lineage of cell types as well as a small set of key genes whose expression patterns reflect these relationships.

We welcome all who can join us for this discussion. Feel free to begin that discussion in the comments section below.