CellCharter¶
Verdict: The most complete composition-based niche method; multi-resolution, cross-platform, and principled clustering, though bounded by the quality of input cell-type annotations.
Citation: Varrone M, et al. "CellCharter reveals spatial cell niches associated with tissue remodeling and cell plasticity." Nature Genetics 56:1718-1727, 2024. DOI
Problem Setup¶
How can we identify spatial cell niches that are robust across different spatial resolutions, platforms (imaging vs sequencing), and tissues?
Niche Definition¶
Composition-based: Niche = cell-type proportions in a graph neighborhood, learned through GNN message passing at multiple spatial scales, then clustered via Gaussian mixture models.
Architecture¶
- Build a spatial graph from cell coordinates (k-NN or radius-based).
- Use graph neural network (GNN) message passing to aggregate cell-type composition at multiple neighborhood scales.
- Encode aggregated compositions into a latent space.
- Cluster the latent embeddings using a Gaussian mixture model with automatic model selection (BIC/AIC for choosing k).
Evaluation¶
- Tested on CODEX (protein imaging), MERFISH (RNA imaging), Visium (sequencing), and Slide-seq (sequencing) datasets.
- Benchmarked against Schurch-style k-NN + k-means and other niche methods.
- Demonstrated cross-platform consistency — similar niches detected from different technologies on comparable tissues.
Honest Assessment¶
Strengths:
- Multi-resolution: captures niches at different spatial scales rather than committing to a single neighborhood size.
- Cross-platform: validated on both imaging and sequencing data.
- Principled clustering: GMM with model selection avoids the arbitrary k-choice problem of Schurch et al.
- Well-engineered software with good documentation.
Limitations:
- Requires cell-type annotations as input — niche quality is bounded by annotation quality. If cell types are wrong or too coarse, niches will be wrong.
- The GNN architecture adds complexity over simpler approaches (k-NN + k-means) but the marginal improvement varies by dataset.
- Does not capture expression states or signaling — two neighborhoods with the same cell-type proportions but different activation states are treated identically.