Spatially Variable Genes
Pipeline question: Which genes exhibit expression patterns that are spatially organized rather than randomly distributed across the tissue?
Overview
Spatially variable gene (SVG) detection identifies genes whose expression varies in a structured way across tissue space. These genes mark spatial domains, gradients, and local microenvironments that would be invisible in dissociated scRNA-seq data. SVG detection typically serves as an exploratory step — surfacing candidate genes for spatial domain identification, ligand-receptor analysis, or biological interpretation.
Key Methods
SpatialDE / SpatialDE2
- Paper: Nature Methods, 2018 / bioRxiv, 2021
- Code: github.com/PMBio/SpatialDE
- Key innovation: Gaussian process regression that decomposes each gene's expression variance into a spatial component and a non-spatial (noise) component, then tests whether the spatial component is significant.
- Strengths:
  - Principled statistical framework with well-calibrated p-values
  - Includes automatic expression histology (AEH), which groups genes by shared spatial pattern
- Limitations:
  - Computationally expensive: O(n^3) scaling in the number of spots limits applicability to large datasets
  - SpatialDE2 improves speed but remains slower than non-GP methods
- Technology compatibility: Visium, Slide-seq, ST, any spot-level data
SPARK / SPARK-X
- Paper: Nature Methods, 2020 / Genome Biology, 2021
- Code: github.com/xzhoulab/SPARK
- Key innovation: SPARK uses a spatial generalized linear mixed model; SPARK-X uses non-parametric covariance testing for scalability to millions of cells.
- Strengths:
  - SPARK-X is among the fastest SVG methods available
  - Non-parametric approach avoids distributional assumptions
  - Scales to Stereo-seq and other high-resolution platforms
- Limitations:
  - SPARK-X trades some statistical power for speed
  - Less sensitive to subtle spatial patterns than GP-based methods
- Technology compatibility: Visium, Slide-seq, Stereo-seq, Xenium, MERFISH
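The covariance-testing idea behind SPARK-X can be illustrated with a much-simplified sketch. The statistic below measures how much of a gene's centered expression is captured by projecting it onto the centered spot coordinates. Two things here are assumptions rather than SPARK-X itself: the p-value comes from a plain permutation test instead of an analytic null distribution, and only raw (x, y) coordinates are used, so the sketch responds to linear trends only. The function names and the toy grid are hypothetical.

```python
import random


def centered(values):
    """Subtract the mean from every entry."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def covariance_statistic(expr, xs, ys):
    """Share of expression variance captured by projecting onto the (x, y) coordinates."""
    e, x, y = centered(expr), centered(xs), centered(ys)
    total = sum(v * v for v in e)
    if total == 0.0:
        return 0.0  # constant gene: no variance to explain
    # Entries of the 2x2 Gram matrix of the centered coordinates.
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    # Cross-products between expression and each coordinate.
    ex = sum(a * b for a, b in zip(e, x))
    ey = sum(a * b for a, b in zip(e, y))
    det = sxx * syy - sxy * sxy
    # Quadratic form (ex, ey) G^{-1} (ex, ey)^T with the 2x2 inverse written out.
    quad = (syy * ex * ex - 2.0 * sxy * ex * ey + sxx * ey * ey) / det
    return quad / total


def permutation_pvalue(expr, xs, ys, n_perm=199, seed=0):
    """Permutation p-value (an assumption of this sketch, not the published null)."""
    rng = random.Random(seed)
    observed = covariance_statistic(expr, xs, ys)
    shuffled = list(expr)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if covariance_statistic(shuffled, xs, ys) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy 10x10 grid: one gene follows a left-to-right gradient, one is pure noise.
rng = random.Random(1)
xs = [float(i // 10) for i in range(100)]
ys = [float(i % 10) for i in range(100)]
gradient_gene = [x + rng.gauss(0.0, 1.0) for x in xs]
noise_gene = [rng.gauss(0.0, 1.0) for _ in xs]

stat_gradient = covariance_statistic(gradient_gene, xs, ys)
stat_noise = covariance_statistic(noise_gene, xs, ys)
p_gradient = permutation_pvalue(gradient_gene, xs, ys)
```

Because the statistic needs only a handful of sums per gene, its cost grows linearly with the number of spots, which is the property that makes this family of tests attractive for very large datasets.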
nnSVG
- Paper: Nature Communications, 2023
- Code: github.com/lmweber/nnSVG
- Key innovation: Nearest-neighbor Gaussian processes that approximate full GP inference in O(n) time, enabling scalable and accurate SVG detection.
- Strengths:
  - Best accuracy in independent benchmarks
  - Scalable through nearest-neighbor approximation
  - Provides effect size estimates (proportion of spatial variance)
- Limitations:
  - Slower than SPARK-X despite approximations
  - R/Bioconductor only
- R/Bioconductor only
- Technology compatibility: Visium, Visium HD, Slide-seq, Xenium, MERFISH, Stereo-seq
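A rough sketch of why a nearest-neighbor Gaussian process scales linearly: each location conditions on at most m nearby locations that come earlier in a fixed ordering, so the dependency structure holds at most n × m entries instead of n². The ordering by coordinate sum, the brute-force neighbor search, and the function names are illustrative assumptions, not the nnSVG implementation. The second function shows the effect size mentioned above, assuming it is defined as the spatial variance divided by the sum of spatial and non-spatial variance.

```python
import math


def nngp_conditioning_sets(coords, m=3):
    """Return (ordering, sets), where sets[i] holds up to m nearest earlier locations."""
    # Assumed ordering: by coordinate sum, ties broken by index.
    order = sorted(range(len(coords)), key=lambda i: (sum(coords[i]), i))
    sets = {}
    for rank, i in enumerate(order):
        earlier = order[:rank]
        # Brute-force search for clarity; real implementations use faster lookups.
        nearest = sorted(
            earlier, key=lambda j: (math.dist(coords[i], coords[j]), j)
        )[:m]
        sets[i] = nearest
    return order, sets


def proportion_spatial_variance(spatial_var, nonspatial_var):
    """Effect size: spatial variance as a fraction of total variance."""
    return spatial_var / (spatial_var + nonspatial_var)


# Toy 5x5 grid of spot coordinates.
coords = [(float(x), float(y)) for x in range(5) for y in range(5)]
order, sets = nngp_conditioning_sets(coords, m=3)
n_dependencies = sum(len(s) for s in sets.values())
```

With m fixed, `n_dependencies` grows in proportion to the number of locations, whereas a full GP covariance matrix grows with its square.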
SOMDE
- Paper: Bioinformatics, 2021
- Code: github.com/WhittakerLab/SOMDE
- Key innovation: Self-organizing map dimensionality reduction of spatial coordinates before Gaussian process testing, dramatically reducing computation time.
- Strengths:
  - 5-10x faster than SpatialDE with comparable accuracy
  - Simple to use with minimal parameters
- Limitations:
  - SOM-based approximation may miss fine-grained spatial patterns
  - Less well-maintained than alternatives
- Technology compatibility: Visium, Slide-seq, ST
Hotspot
- Paper: Cell Systems, 2021
- Code: github.com/yoseflab/hotspot
- Key innovation: Identifies informative genes and gene modules using local autocorrelation statistics on any cell-cell similarity graph, not just spatial coordinates.
- Strengths:
  - Flexible: works with spatial, kNN, or other similarity graphs
  - Identifies gene modules (co-varying gene sets), not just individual SVGs
  - Fast computation
- Limitations:
  - Can conflate spatial and transcriptomic neighborhoods if the similarity graph is not chosen carefully
  - Less interpretable spatial pattern decomposition than SpatialDE
- Technology compatibility: Any spatial platform, also applicable to scRNA-seq
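To make the idea of graph-based local correlation between genes concrete, here is a minimal sketch. It standardizes two genes and averages the products of one gene's value at a cell with the other gene's value at neighboring cells. This is a simplified stand-in: Hotspot's own statistic, normalization, and null model differ, and the function names and toy chain graph are hypothetical.

```python
import math


def standardize(values):
    """Z-score a list of values (population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pairwise_local_correlation(gene_a, gene_b, neighbors):
    """Average symmetric cross-product of two standardized genes over graph edges."""
    za, zb = standardize(gene_a), standardize(gene_b)
    total, n_edges = 0.0, 0
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            total += za[i] * zb[j] + zb[i] * za[j]
            n_edges += 1
    return total / (2 * n_edges)


# Toy graph: 20 cells in a chain, each linked to its immediate neighbors.
n = 20
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

domain_a = [0.0] * 10 + [1.0] * 10           # on in the second half of the chain
domain_b = [5.0 * v + 1.0 for v in domain_a]  # same domain, different scale
checker = [float(i % 2) for i in range(n)]    # alternating pattern, no shared domain

score_same = pairwise_local_correlation(domain_a, domain_b, neighbors)
score_diff = pairwise_local_correlation(domain_a, checker, neighbors)
```

Genes that share a domain score high with each other and low with unrelated patterns, so clustering a matrix of such pairwise scores is one plausible route to gene modules. The `neighbors` list could equally come from a transcriptomic kNN graph, which is where the conflation risk noted above arises.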
SpatialCorr
- Paper: bioRxiv, 2023
- Code: github.com/mbernste/SpatialCorr
- Key innovation: Tests for spatially varying gene-gene correlations, detecting genes whose co-expression structure changes across tissue space.
- Strengths:
  - Unique focus on spatial co-expression, not just spatial expression
  - Identifies genes involved in spatially-localized regulatory programs
- Limitations:
  - Computationally intensive for large gene panels
  - Conceptually more complex and harder to interpret than standard SVGs
- Technology compatibility: Visium, Slide-seq, any spot-level data
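A spatially varying correlation can be pictured as a correlation estimated separately around each location, with nearby spots weighted more heavily. The sketch below uses a Gaussian kernel over one-dimensional positions to compute a weighted Pearson correlation at a chosen center. The kernel choice, the 1D layout, the bandwidth, and the function name are assumptions for illustration and are not taken from the SpatialCorr code.

```python
import math
import random


def local_correlation(gene_a, gene_b, positions, center, bandwidth):
    """Kernel-weighted Pearson correlation of two genes around `center`."""
    weights = [
        math.exp(-((p - center) ** 2) / (2.0 * bandwidth ** 2)) for p in positions
    ]
    total = sum(weights)
    mean_a = sum(w * a for w, a in zip(weights, gene_a)) / total
    mean_b = sum(w * b for w, b in zip(weights, gene_b)) / total
    cov = sum(
        w * (a - mean_a) * (b - mean_b) for w, a, b in zip(weights, gene_a, gene_b)
    ) / total
    var_a = sum(w * (a - mean_a) ** 2 for w, a in zip(weights, gene_a)) / total
    var_b = sum(w * (b - mean_b) ** 2 for w, b in zip(weights, gene_b)) / total
    return cov / math.sqrt(var_a * var_b)


# Toy tissue: 40 spots on a line. Gene B tracks gene A in the left half
# and mirrors it in the right half, so the correlation flips sign.
rng = random.Random(1)
positions = [float(i) for i in range(40)]
gene_a = [rng.gauss(0.0, 1.0) for _ in positions]
gene_b = [a if p < 20 else -a for a, p in zip(gene_a, positions)]

corr_left = local_correlation(gene_a, gene_b, positions, center=5.0, bandwidth=3.0)
corr_right = local_correlation(gene_a, gene_b, positions, center=35.0, bandwidth=3.0)
```

In this toy the two genes are uncorrelated when pooled over the whole line, yet strongly correlated within each half, which is the kind of structure a test for spatially varying correlation aims to detect.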
trendsceek
- Paper: Nature Methods, 2018
- Code: github.com/edsgard/trendsceek
- Key innovation: Uses marked point process statistics to detect spatial expression trends without assuming specific pattern types.
- Strengths:
  - Non-parametric approach detects diverse spatial patterns
  - Among the earliest SVG methods with solid statistical foundations
- Limitations:
  - Very slow: impractical for genome-wide testing on large datasets
  - Largely superseded by faster methods
- Technology compatibility: Visium, ST, Slide-seq
PROST
- Paper: Nature Communications, 2024
- Code: github.com/WenMi1/PROST
- Key innovation: Quantifies spatial expression patterns using a Positive-and-Negative-index (PI) score that captures both the strength and type of spatial autocorrelation.
- Strengths:
  - Fast and scalable
  - PI score provides interpretable pattern characterization
  - Also performs spatial domain detection
- Limitations:
  - Dual-purpose design (SVG + domains) may not optimize either fully
  - Less independent validation than nnSVG or SPARK-X
- Technology compatibility: Visium, Slide-seq, Stereo-seq
Benchmark Summary
Independent benchmarks consistently rank nnSVG as the most accurate SVG detection method, with well-calibrated p-values and the best true-positive recovery rate. SPARK-X offers the best speed-accuracy tradeoff, making it the practical choice for large datasets (>50,000 spots/cells) where nnSVG becomes slow. Notably, the classical Moran's I statistic remains a competitive baseline — often matching or exceeding more complex methods in power, especially for strong spatial patterns.
Start simple
Moran's I (available in squidpy and scanpy) is a reasonable first pass for SVG detection. Move to nnSVG for rigorous analysis or SPARK-X when dataset size demands speed.
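As a concrete picture of what this first pass computes, here is a minimal pure-Python sketch of Moran's I on a k-nearest-neighbor spot graph. The helper names, the binary kNN weighting, and the toy grid are assumptions for illustration. Library implementations differ in details such as weight normalization and how significance is assessed.

```python
import math
import random


def knn_weights(coords, k=4):
    """Binary weights: w[i][j] = 1.0 if spot j is among the k nearest spots to i."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        by_distance = sorted(
            (math.dist(coords[i], coords[j]), j) for j in range(n) if j != i
        )
        for _, j in by_distance[:k]:
            weights[i][j] = 1.0
    return weights


def morans_i(values, weights):
    """Moran's I = (n / sum(w)) * sum_ij w_ij d_i d_j / sum_i d_i^2, with d = values - mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * cross / sum(d * d for d in dev)


# Toy 10x10 grid: a smooth left-to-right gradient versus the same values shuffled.
coords = [(float(x), float(y)) for x in range(10) for y in range(10)]
weights = knn_weights(coords, k=4)

gradient = [x for x, _ in coords]
shuffled = list(gradient)
random.Random(0).shuffle(shuffled)

i_gradient = morans_i(gradient, weights)  # strongly positive: neighbors are similar
i_shuffled = morans_i(shuffled, weights)  # near zero: no spatial structure
```

Values near +1 indicate that neighboring spots have similar expression, and values near 0 indicate no spatial structure. In squidpy the equivalent step is `sq.gr.spatial_neighbors(adata)` followed by `sq.gr.spatial_autocorr(adata, mode="moran")`.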
When to Use What
| Your data | Your goal | Recommended | Why |
|---|---|---|---|
| Visium (<10K spots) | Rigorous SVG detection | nnSVG | Best accuracy, manageable runtime at this scale |
| Large dataset (>50K cells) | Fast SVG screen | SPARK-X | Scalable non-parametric testing |
| Quick exploration | Initial SVG candidates | Moran's I (squidpy) | Fast, no installation beyond squidpy, competitive accuracy |
| Any data | Gene module discovery | Hotspot | Identifies co-varying gene groups, not just individual SVGs |
| Any data | Spatially varying co-expression | SpatialCorr | Unique capability to detect changing gene-gene correlations |
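The table above can be encoded as a small lookup helper for use in a pipeline script. This is only a sketch: the function name and the goal labels are hypothetical, and the handling of datasets between 10K and 50K spots (treated here like the small case) is an assumption, since the table does not cover that range.

```python
def recommend_svg_method(n_spots, goal):
    """Map dataset size and analysis goal to a method, following the table above.

    goal is one of: "rigorous", "screen", "quick", "modules", "coexpression".
    """
    if goal == "modules":
        return "Hotspot"
    if goal == "coexpression":
        return "SpatialCorr"
    if goal == "quick":
        return "Moran's I (squidpy)"
    if goal in ("rigorous", "screen"):
        # Assumption: 50,000 spots/cells is the switch-over point, taken from the
        # benchmark summary; the 10K-50K range is not specified in the table.
        return "SPARK-X" if n_spots > 50_000 else "nnSVG"
    raise ValueError(f"unknown goal: {goal!r}")


choice_visium = recommend_svg_method(4_000, "rigorous")
choice_large = recommend_svg_method(400_000, "screen")
```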
Technology Compatibility
| Method | Visium | Visium HD | Xenium | MERFISH | CosMx | CODEX | Stereo-seq |
|---|---|---|---|---|---|---|---|
| SpatialDE/2 | Yes | - | - | - | - | - | - |
| SPARK-X | Yes | Yes | Yes | Yes | - | - | Yes |
| nnSVG | Yes | Yes | Yes | Yes | - | - | Yes |
| SOMDE | Yes | - | - | - | - | - | - |
| Hotspot | Yes | Yes | Yes | Yes | Yes | - | Yes |
| SpatialCorr | Yes | - | - | - | - | - | - |
| trendsceek | Yes | - | - | - | - | - | - |
| PROST | Yes | - | - | - | - | - | Yes |