Multi-Modal Integration
Pipeline question: How do we combine spatial transcriptomics with other data modalities — chromatin accessibility, protein expression, histology, or dissociated scRNA-seq — to build a more complete picture of tissue biology?
Overview
Spatial omics technologies increasingly generate or pair with multiple data types: RNA expression combined with protein (CITE-seq, CosMx), chromatin accessibility (spatial ATAC), histology images, or matched scRNA-seq. Multi-modal integration methods align these complementary views to reveal relationships invisible in any single modality — for example, connecting spatial gene expression patterns to chromatin states or linking protein markers to transcriptomic subtypes.
Key Methods
SpatialGLUE
- Paper: Nature Methods, 2024
- Code: github.com/JinmiaoChenLab/SpatialGLUE
- Key innovation: Graph dual-attention framework that integrates two spatial omics modalities (e.g., RNA + protein, RNA + ATAC) by learning modality-specific and shared spatial representations.
- Strengths:
    - Designed specifically for multi-modal spatial data
    - Dual-attention mechanism handles modality-specific noise
    - Demonstrated on spatial RNA + protein and spatial RNA + ATAC
- Limitations:
    - Limited to two modalities at a time
    - Requires spatially-matched multi-modal data
    - GPU required
- Technology compatibility: Spatial CITE-seq, SPOTS, spatial ATAC, any co-measured spatial modalities
HEST
- Paper: NeurIPS (Datasets and Benchmarks Track), 2024
- Code: github.com/mahmoodlab/HEST
- Key innovation: Large-scale collection and framework for integrating H&E histology images with spatial transcriptomics, enabling histology-guided spatial analysis and gene expression prediction from images.
- Strengths:
    - Standardized dataset of 1,000+ paired H&E + spatial transcriptomics samples
    - Enables training of histology-to-expression prediction models
    - Connects pathology and molecular spatial data
- Limitations:
    - Histology-to-expression prediction has inherent accuracy ceilings
    - Dataset focuses mainly on the Visium platform
- Technology compatibility: Visium, any platform with paired histology
moscot
- Paper: Nature, 2025
- Code: github.com/theislab/moscot
- Key innovation: Multi-omics single-cell optimal transport framework for aligning cells across modalities, time points, and spatial coordinates using unbalanced optimal transport.
- Strengths:
    - Unified optimal transport framework for many alignment tasks
    - Handles spatial + temporal + multi-modal integration
    - Scalable to large datasets through entropic regularization
    - Part of the scverse ecosystem
- Limitations:
    - Optimal transport parameters require tuning
    - Complex framework with a steep learning curve
- Technology compatibility: Visium, Slide-seq, MERFISH, any spatial platform
MISO
- Paper: Nature Biotechnology, 2024
- Code: github.com/KlugerLab/MISO
- Key innovation: Multi-resolution framework that integrates spot-level and single-cell spatial data, bridging the resolution gap between Visium and imaging-based platforms.
- Strengths:
    - Addresses the resolution gap between technologies
    - Can enhance Visium resolution using imaging-based data as reference
- Limitations:
    - Requires data from multiple platforms on similar tissue
    - Alignment across platforms introduces new assumptions
- Technology compatibility: Visium + MERFISH, Visium + Xenium (cross-platform)
SpaGE
- Paper: Nucleic Acids Research, 2020
- Code: github.com/tabdelaal/SpaGE
- Key innovation: Integrates scRNA-seq with spatial data to predict unmeasured genes in the spatial dataset using domain adaptation.
- Strengths:
    - Imputes unmeasured genes in spatial data from scRNA-seq
    - Simple and fast computation
- Limitations:
    - Imputation accuracy decreases for genes with low spatial autocorrelation
    - Assumes scRNA-seq and spatial data share the same biological space
- Technology compatibility: Visium, MERFISH, seqFISH
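The imputation idea can be sketched as a nearest-neighbour lookup: for each spatial cell, find the most similar scRNA-seq cells on the shared genes and average their expression of the unmeasured gene. The sketch below is a deliberately simplified illustration, not the SpaGE implementation: it skips the domain adaptation step entirely and assumes both datasets are already in a comparable space. The helper names, `k`, and the toy values are hypothetical.

```python
import math

def cosine_distance(x, y):
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    return 1.0 - dot / norm if norm > 0 else 1.0

def impute_gene(spatial_shared, sc_shared, sc_target, k=2):
    """Predict an unmeasured gene per spatial cell from its k nearest scRNA-seq cells."""
    predictions = []
    for cell in spatial_shared:
        dists = [(cosine_distance(cell, ref), idx) for idx, ref in enumerate(sc_shared)]
        nearest = sorted(dists)[:k]
        # Closer neighbours get larger weights; the constant avoids division by zero.
        weights = [1.0 / (d + 1e-6) for d, _ in nearest]
        total = sum(weights)
        predictions.append(
            sum(w * sc_target[idx] for w, (_, idx) in zip(weights, nearest)) / total
        )
    return predictions

# Toy data (hypothetical): two shared genes, one gene measured only in scRNA-seq.
sc_shared = [[5, 0], [4, 1], [0, 5], [1, 4]]   # scRNA-seq cells x shared genes
sc_target = [10.0, 8.0, 0.0, 1.0]              # unmeasured gene in scRNA-seq cells
spatial_shared = [[6, 0.5], [0.5, 6]]          # spatial cells x shared genes

preds = impute_gene(spatial_shared, sc_shared, sc_target)
```

Because every prediction is a weighted average of reference cells, an imputed value can never leave the range seen in the scRNA-seq data, which is one way the shared-biological-space assumption shows up in practice.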
PRECAST
- Paper: Nature Communications, 2023
- Code: github.com/feiyoung/PRECAST
- Key innovation: Probabilistic framework for joint spatial domain detection and batch integration across multiple spatial transcriptomics samples.
- Strengths:
    - Jointly integrates multiple samples with batch correction
    - Probabilistic framework provides uncertainty
    - Scalable to many samples
- Limitations:
    - Focused on within-modality (RNA) batch integration rather than cross-modality
    - R only
- Technology compatibility: Visium, Slide-seq
Starfysh
- Paper: Nature Biotechnology, 2024
- Code: github.com/azizilab/starfysh
- Key innovation: Semi-supervised deconvolution framework that integrates spatial transcriptomics with archetype analysis and histology for tissue-level multi-modal integration.
- Strengths:
    - Combines expression, spatial coordinates, and histology
    - Discovers tissue archetypes beyond predefined cell types
    - Semi-supervised mode handles incomplete annotations
- Limitations:
    - Archetype interpretation requires domain knowledge
    - Complex model with many components
- Technology compatibility: Visium
MOFA-FLEX
- Paper: Genome Biology, 2024
- Code: github.com/bioFAM/MOFA2
- Key innovation: Extension of MOFA+ for flexible multi-modal factor analysis that handles partially overlapping features and samples across modalities.
- Strengths:
    - Handles missing data and partial overlap between modalities
    - Interpretable factor model
    - Well-established framework
- Limitations:
    - Linear model may miss nonlinear cross-modal relationships
    - Not natively spatially aware
- Technology compatibility: Any multi-modal data; spatial integration requires pairing with spatial methods
LLOKI
- Paper: bioRxiv, 2024
- Code: github.com/jfma-USTC/LLOKI
- Key innovation: Language-model-based framework for integrating spatial omics with knowledge graphs, connecting molecular observations to curated biological knowledge.
- Strengths:
    - Connects spatial data to external knowledge bases
    - Language model enables natural-language queries of spatial results
- Limitations:
    - Emerging approach with limited validation
    - Language model interpretations require expert verification
- Technology compatibility: Any spatial platform
pyWNN
- Paper: Cell, 2021
- Code: github.com/scverse/muon
- Key innovation: Weighted nearest neighbor (WNN) integration from Seurat v4, reimplemented in Python within muon — weights each modality's contribution to cell similarity adaptively per cell.
- Strengths:
    - Per-cell modality weighting adapts to local data quality
    - Simple conceptual framework
    - Available in Python (muon) and R (Seurat)
- Limitations:
    - Not spatially aware: treats cells as an unordered set and ignores their coordinates
    - Works best for co-measured modalities (CITE-seq, multiome)
- Technology compatibility: Any multi-modal spatial data with co-measured modalities
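The per-cell weighting can be illustrated with a stripped-down sketch: for each cell, compare how well its profile in one modality is predicted by its neighbours from the same modality versus neighbours found in the other modality, then turn the two resulting scores into weights with a softmax. This is a simplification of the WNN idea, not the Seurat or muon implementation; the exponential affinity, `k`, `eps`, and the toy data are assumptions for illustration.

```python
import math

def euclid(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def knn(data, i, k):
    """Indices of the k nearest neighbours of cell i (excluding itself)."""
    order = sorted((euclid(data[i], data[j]), j) for j in range(len(data)) if j != i)
    return [j for _, j in order[:k]]

def mean_profile(data, idx):
    return [sum(data[j][d] for j in idx) / len(idx) for d in range(len(data[0]))]

def modality_weights(mod_a, mod_b, k=2, eps=1e-4):
    """Per-cell (weight_a, weight_b) pairs, each pair summing to 1."""
    weights = []
    for i in range(len(mod_a)):
        nn_a, nn_b = knn(mod_a, i, k), knn(mod_b, i, k)
        scores = []
        for data, own, other in ((mod_a, nn_a, nn_b), (mod_b, nn_b, nn_a)):
            # Affinity between the cell and its profile predicted from
            # same-modality neighbours vs. other-modality neighbours.
            within = math.exp(-euclid(data[i], mean_profile(data, own)))
            cross = math.exp(-euclid(data[i], mean_profile(data, other)))
            scores.append(within / (cross + eps))
        # Numerically stable softmax over the two modality scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        weights.append(tuple(e / sum(exps) for e in exps))
    return weights

# Toy data (hypothetical): RNA separates two groups cleanly, protein is noisy.
rna = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
protein = [[0.0], [3.0], [1.0], [2.5], [0.5], [2.0]]

weights = modality_weights(rna, protein)
rna_weights = [w[0] for w in weights]
```

In this toy setting every cell leans on RNA because its RNA neighbours predict it far better than its protein neighbours do. Note that neither modality's neighbour search uses spatial coordinates, which is the limitation listed above.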
Benchmark Summary
Multi-modal spatial integration is evolving rapidly and systematic benchmarks remain limited, so treat the following as provisional guidance. For spatial RNA + protein integration, SpatialGLUE is the leading purpose-built method. For Visium + scRNA-seq integration, MOFA+ and Tangram (see Deconvolution) are the most widely used. For co-measured protein + RNA data, the totalVI model from scvi-tools remains a strong choice. Integration of more than two modalities is still immature: most methods handle only two at a time.
Start with the simplest integration
Before applying complex multi-modal methods, ask whether simple concatenation or WNN integration provides adequate results. Complex methods shine when modalities have different noise structures, but add overhead for well-behaved data.
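As a point of reference, a concatenation baseline might look like the sketch below: standardise each feature within its modality, scale each modality so it contributes equally regardless of how many features it has, and join the blocks per cell. The function names and the `1/sqrt(n_features)` weighting are assumptions chosen for this illustration, not a prescribed recipe, and the toy matrices are hypothetical.

```python
import math

def zscore_columns(matrix):
    """Standardise each feature (column) to zero mean and unit variance."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        # Constant features carry no information and are set to zero.
        scaled_cols.append([(x - mean) / std if std > 0 else 0.0 for x in col])
    return [list(row) for row in zip(*scaled_cols)]

def concatenate_modalities(*modalities):
    """Z-score each modality, weight it by 1/sqrt(n_features), then join per cell."""
    blocks = []
    for mod in modalities:
        scale = 1.0 / math.sqrt(len(mod[0]))
        blocks.append([[x * scale for x in row] for row in zscore_columns(mod)])
    return [sum((block[i] for block in blocks), []) for i in range(len(blocks[0]))]

# Toy data (hypothetical): 3 cells with 4 RNA features and 2 protein features.
rna = [[10, 0, 3, 1], [8, 1, 0, 2], [0, 9, 4, 7]]
protein = [[100, 5], [90, 7], [10, 60]]

joint = concatenate_modalities(rna, protein)  # 3 cells x 6 joint features
```

The joint matrix can then go into whatever dimensionality reduction and clustering the rest of the pipeline already uses. Without the per-modality scaling, the modality with more features would dominate distances between cells.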
Alignment assumptions
Cross-modal integration assumes that the modalities share a common biological space. When integrating spatial data with scRNA-seq from different donors, conditions, or preparation protocols, batch effects between modalities can dominate over biological signal.
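One quick, rough sanity check before integrating is to compare per-gene mean expression between the two datasets on their shared genes. The sketch below is a hypothetical helper, not part of any of the tools listed here, and the gene values are made up for illustration. A low correlation suggests the datasets may not share a common space; a high one does not rule out batch effects.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def compare_mean_expression(sc_means, spatial_means):
    """Return shared genes and the correlation of their log mean expression."""
    shared = sorted(set(sc_means) & set(spatial_means))
    sc = [math.log1p(sc_means[g]) for g in shared]
    sp = [math.log1p(spatial_means[g]) for g in shared]
    return shared, pearson(sc, sp)

# Toy per-gene means (hypothetical values) from each dataset.
sc_means = {"EPCAM": 5.0, "PTPRC": 2.0, "COL1A1": 8.0, "MKI67": 0.5}
spatial_means = {"EPCAM": 4.0, "PTPRC": 1.5, "COL1A1": 9.0, "ACTA2": 3.0}

shared_genes, r = compare_mean_expression(sc_means, spatial_means)
```

With real data the same comparison is more informative when run per cell type or per donor, since a global average can hide condition-specific shifts.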
When to Use What
| Your data | Your goal | Recommended | Why |
|---|---|---|---|
| Co-measured RNA + protein (spatial) | Joint spatial domain detection | SpatialGLUE | Purpose-built for multi-modal spatial data |
| Visium + H&E | Histology-guided spatial analysis | HEST or Starfysh | Connects pathology and molecular data |
| Multi-sample spatial | Batch integration across samples | PRECAST or moscot | Handle batch effects in spatial context |
| Spatial + scRNA-seq | Map scRNA-seq onto spatial data | Tangram (see Deconvolution) | Project single-cell modalities to space |
| Cross-platform spatial | Bridge resolution gaps | MISO | Integrates Visium with imaging-based data |
| Co-measured multi-modal | Simple weighted integration | pyWNN (muon) | Per-cell adaptive modality weighting |
Technology Compatibility
| Method | Visium | Visium HD | Xenium | MERFISH | CosMx | CODEX | Stereo-seq |
|---|---|---|---|---|---|---|---|
| SpatialGLUE | Yes | - | - | - | - | Yes | - |
| HEST | Yes | - | - | - | - | - | - |
| moscot | Yes | - | - | Yes | - | - | - |
| MISO | Yes | - | Yes | Yes | - | - | - |
| SpaGE | Yes | - | - | Yes | - | - | - |
| PRECAST | Yes | - | - | - | - | - | - |
| Starfysh | Yes | - | - | - | - | - | - |
| MOFA-FLEX | Yes | - | - | - | - | - | - |
| pyWNN | Yes | - | Yes | Yes | Yes | Yes | - |