NicheCompass
Verdict: The most complete niche method currently available — integrates cell identity, spatial context, and ligand-receptor signaling into a single framework with cross-sample atlas building.
Citation: Lim J, et al. "NicheCompass: a graph neural network approach to niche identification using cell-cell communication." Nature Genetics 57:697-708, 2025. DOI
Problem Setup
Can we build a unified niche atlas across multiple samples and conditions by incorporating intercellular communication into niche embeddings?
Niche Definition
Communication-based (hybrid): Niche = GNN-learned embedding that explicitly encodes ligand-receptor interaction features on spatial graph edges, combined with cell expression features on nodes.
This is a hybrid definition — it uses composition (who is nearby via graph structure), expression (node features), and communication (L-R edge features) simultaneously.
Architecture
- Build a spatial graph from cell coordinates.
- Annotate graph edges with ligand-receptor interaction scores from curated databases.
- Train a graph neural network that propagates both node (expression) and edge (L-R) features.
- Learn a latent niche embedding per cell that captures cell identity, spatial context, and active signaling.
- Enable cross-sample integration by aligning latent spaces across datasets.
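The first two steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not NicheCompass's own code or API: the function names `knn_edges` and `lr_edge_scores`, the k-nearest-neighbor graph, the ligand-times-receptor product score, and the placeholder gene names `LigA` and `RecA` are all hypothetical choices made for the example.

```python
import math

def knn_edges(coords, k):
    """Return directed edges (i, j) linking each cell to its k nearest neighbors."""
    edges = []
    for i, ci in enumerate(coords):
        # Brute-force distances to every other cell; fine for a toy example.
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges

def lr_edge_scores(edges, expr, lr_pairs):
    """Score each edge (sender i -> receiver j) as ligand_i * receptor_j per L-R pair.

    The product is an assumed co-expression score, chosen only for illustration.
    """
    scores = {}
    for i, j in edges:
        scores[(i, j)] = {
            (lig, rec): expr[i].get(lig, 0.0) * expr[j].get(rec, 0.0)
            for lig, rec in lr_pairs
        }
    return scores

# Toy data: four cells, one hypothetical ligand-receptor pair.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
expr = [
    {"LigA": 2.0, "RecA": 0.0},  # cell 0 expresses the ligand
    {"LigA": 0.0, "RecA": 3.0},  # cell 1 expresses the receptor
    {"LigA": 1.0, "RecA": 1.0},
    {"LigA": 0.0, "RecA": 0.0},
]
lr_pairs = [("LigA", "RecA")]  # stand-in for a curated database

edges = knn_edges(coords, k=2)
edge_scores = lr_edge_scores(edges, expr, lr_pairs)
print(edge_scores[(0, 1)])
```

Note that the scores are directional: the edge from cell 0 to cell 1 scores 6.0, while the reverse edge scores 0.0 because cell 1 expresses no ligand. A graph neural network would then consume these edge features alongside the per-cell expression vectors.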
Evaluation
- Tested on multiple tissues including liver, intestine, and tumor samples.
- Cross-sample niche atlas construction demonstrated on datasets from different labs and platforms.
- Compared against composition-only and expression-only niche methods.
Honest Assessment
Strengths:
- The most information-rich niche definition: it captures which cells are nearby, what they express, and how they interact.
- Cross-sample atlas building is a practical capability missing from most niche methods.
- L-R edge features add biological interpretability — you can inspect which signaling pathways define each niche.
- Well-engineered software from the Lotfollahi lab (same group as scArches, CellFlow).
Limitations:
- Niche definitions are constrained by the L-R database used. Novel or poorly annotated signaling pathways are invisible.
- L-R "activity" is inferred from co-expression, not measured directly. High expression of a ligand in one cell and its receptor in a neighboring cell does not guarantee active signaling.
- The GNN + L-R combination adds computational complexity compared to simpler niche methods.
- Cross-sample integration assumes that niches are comparable across samples — a strong assumption that may not hold for highly variable tissues.
Design Decision: NicheCompass bets that intercellular communication is the most biologically meaningful niche feature. This is a defensible bet — signaling defines function, and function defines the niche. But it makes the method dependent on L-R databases, which are incomplete for many tissues and species.
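One practical way to gauge this database dependence before running any communication-based method is to count how many database L-R pairs have both genes measured in your dataset. The sketch below is a hypothetical helper, not part of NicheCompass: `lr_coverage`, the toy database, and the gene names are all assumptions made for illustration.

```python
def lr_coverage(lr_pairs, measured_genes):
    """Split database pairs into usable (both genes measured) and missing."""
    measured = set(measured_genes)
    usable = [p for p in lr_pairs if p[0] in measured and p[1] in measured]
    missing = [p for p in lr_pairs if p not in usable]
    fraction = len(usable) / len(lr_pairs) if lr_pairs else 0.0
    return usable, missing, fraction

# Toy database of four hypothetical L-R pairs and a small gene panel.
lr_db = [("LigA", "RecA"), ("LigB", "RecB"), ("LigC", "RecC"), ("LigD", "RecD")]
panel = ["LigA", "RecA", "LigB", "RecC", "GeneX"]

usable, missing, fraction = lr_coverage(lr_db, panel)
print(f"{len(usable)}/{len(lr_db)} pairs usable ({fraction:.0%})")
```

A low usable fraction suggests the communication features will describe only a small slice of the signaling in the tissue, which matters most for targeted panels and for species with sparse annotations.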