MELD - Graph signal processing identifies cell populations affected by an experimental treatment

View our preprint on BioRxiv

Single-cell RNA-sequencing (scRNA-seq) is a powerful tool to quantify transcriptional states in thousands to millions of cells. It is increasingly common for scRNA-seq data to be collected in multiple experimental conditions, yet quantifying differences between scRNA-seq datasets remains an analytical challenge. Previous efforts at quantifying such differences focus on discrete regions of the transcriptional state space such as clusters of cells. Here, we describe a continuous measure of the effect of an experiment across the transcriptomic space. First, we use the manifold assumption to model the cellular state space as a graph with cells as nodes and edges connecting cells with similar transcriptomic profiles. Next, we create an Enhanced Experimental Signal (EES) that estimates the likelihood of observing cells from each condition at every point in the manifold. We show that the EES has useful properties and information. First, it allows us to identify how gene expression is affected by a given perturbation, including identifying non-monotonic changes from only two conditions. Second, we show that we can use both the magnitude and frequency of the EES, using an algorithm we call vertex frequency clustering, to derive subsets of cells at appropriate levels of granularity that are enriched in the experimental or control conditions or that are unaffected between conditions. We demonstrate both algorithms using a combination of biological and synthetic datasets.