Spatial Transcriptomics Models

This document covers models for analyzing spatially resolved transcriptomics data in scvi-tools.

DestVI (Deconvolution of Spatial Transcriptomics Profiles using Variational Inference)

Purpose: Multi-resolution deconvolution of spatial transcriptomics using single-cell reference data.

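Key Features:
- Estimates cell type proportions at each spatial location
- Uses a single-cell RNA-seq reference for deconvolution
- Multi-resolution: discrete cell type proportions plus continuous variation within each cell type
- Treats each spot independently; spatial patterns are analyzed downstream on the estimated proportions
- Built on a probabilistic generative model of the counts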

When to Use:
- Deconvolving Visium or similar spatial transcriptomics
- Have scRNA-seq reference data with cell type labels
- Want to map cell types to spatial locations
- Interested in spatial organization of cell types
- Need probabilistic estimates of cell type abundance

Data Requirements:
- Spatial data: Visium or similar spot-based measurements (target data)
- Single-cell reference: scRNA-seq with cell type annotations
- Both datasets must share genes (subset both to the common genes before training)
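
The shared-gene requirement can be checked before any model is trained. The following is a minimal sketch in plain Python; the helper name shared_genes and the gene lists are hypothetical stand-ins for sc_adata.var_names and spatial_adata.var_names.

```python
def shared_genes(reference_genes, spatial_genes):
    """Return genes present in both lists, keeping the reference order."""
    spatial_set = set(spatial_genes)
    seen = set()
    shared = []
    for gene in reference_genes:
        # Skip genes missing from the spatial data and duplicated names
        if gene in spatial_set and gene not in seen:
            shared.append(gene)
            seen.add(gene)
    return shared


# Hypothetical gene lists (note the duplicated "CD3D" in the reference)
reference_genes = ["CD3D", "CD8A", "MS4A1", "LYZ", "CD3D"]
spatial_genes = ["LYZ", "CD3D", "EPCAM", "MS4A1"]

common = shared_genes(reference_genes, spatial_genes)
print(common)  # ['CD3D', 'MS4A1', 'LYZ']
```

With AnnData objects, the resulting list can be used to subset both datasets, for example sc_adata[:, common].copy() and spatial_adata[:, common].copy(), so that both share the same genes in the same order.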

Basic Usage:

import scvi

# Step 1: Train CondSCVI on the single-cell reference
# DestVI is built from a CondSCVI model (conditioned on cell type), not a plain SCVI model
scvi.model.CondSCVI.setup_anndata(sc_adata, layer="counts", labels_key="cell_type")
sc_model = scvi.model.CondSCVI(sc_adata, weight_obs=False)
sc_model.train()

# Step 2: Set up spatial data (same genes, in the same order, as the reference)
scvi.model.DestVI.setup_anndata(
    spatial_adata,
    layer="counts"
)

# Step 3: Train DestVI using the reference model
# Cell type labels come from the CondSCVI setup above
model = scvi.model.DestVI.from_rna_model(
    spatial_adata,
    sc_model
)
model.train(max_epochs=2500)

# Step 4: Get cell type proportions (DataFrame: spots × cell types)
proportions = model.get_proportions()
spatial_adata.obsm["proportions"] = proportions

# Step 5: Get cell type-specific expression
# Estimated expression of each gene in the given cell type at each spot
ct_expression = model.get_scale_for_ct("T cells")

Key Parameters:
- amortization: Amortization strategy ("both", "latent", "proportion", or "none"), passed to from_rna_model
- n_latent: Latent dimensionality (inherited from the CondSCVI reference model)

Outputs:
- get_proportions(): Cell type proportions at each spot
- get_scale_for_ct(cell_type): Cell type-specific expression at each spot
- get_gamma(): Cell type-specific latent variables (gamma) at each spot
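
A common first summary of the proportions is the dominant cell type per spot. The sketch below uses a plain dictionary with made-up values in place of the table returned by get_proportions(); the helper name dominant_cell_type is hypothetical.

```python
def dominant_cell_type(proportions):
    """Return {spot: (cell_type, proportion)} for the largest proportion per spot."""
    dominant = {}
    for spot, by_type in proportions.items():
        cell_type = max(by_type, key=by_type.get)
        dominant[spot] = (cell_type, by_type[cell_type])
    return dominant


# Hypothetical proportions for three spots (each row sums to 1)
proportions = {
    "spot_1": {"T cells": 0.6, "B cells": 0.3, "Macrophages": 0.1},
    "spot_2": {"T cells": 0.2, "B cells": 0.7, "Macrophages": 0.1},
    "spot_3": {"T cells": 0.1, "B cells": 0.1, "Macrophages": 0.8},
}

labels = dominant_cell_type(proportions)
print(labels["spot_2"])  # ('B cells', 0.7)
```

When the proportions are held in a pandas DataFrame (spots × cell types), proportions.idxmax(axis=1) gives the same labels in one call.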

Visualization:

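import scanpy as sc

# Copy proportions into .obs so they can be plotted by name
# (assumes .obsm["proportions"] holds the DataFrame from get_proportions())
proportions = spatial_adata.obsm["proportions"]
for ct in proportions.columns:
    spatial_adata.obs[ct] = proportions[ct].values

# Visualize a specific cell type's proportions spatially
sc.pl.spatial(
    spatial_adata,
    color="T cells",
    spot_size=150
)

# Or plot every cell type
for ct in proportions.columns:
    sc.pl.spatial(
        spatial_adata,
        color=ct,
        title=f"{ct} proportions",
        spot_size=150
    )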

Stereoscope

Purpose: Cell type deconvolution for spatial transcriptomics using probabilistic modeling.

Key Features:
- Reference-based deconvolution
- Probabilistic framework for cell type proportions
- Works with various spatial technologies
- Models raw counts with a negative binomial likelihood, so no prior normalization is needed

When to Use:
- Want a simpler approach than DestVI (proportions only, no within-cell-type variation)
- Deconvolving spatial data with a reference
- Faster alternative for basic deconvolution
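
The idea behind reference-based deconvolution can be illustrated with a toy mixture: a spot's expected expression is the proportion-weighted sum of per-cell-type profiles learned from the reference. This is a simplified sketch of that general idea, not Stereoscope's actual likelihood; the function names and numbers are hypothetical.

```python
def normalize_abundances(abundances):
    """Turn non-negative cell type abundances into proportions that sum to 1."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must have a positive sum")
    return [a / total for a in abundances]


def expected_spot_expression(proportions, profiles):
    """Mix per-cell-type expression profiles according to the proportions."""
    n_genes = len(profiles[0])
    return [
        sum(p * profile[g] for p, profile in zip(proportions, profiles))
        for g in range(n_genes)
    ]


# Hypothetical reference profiles: 2 cell types × 3 genes
profiles = [
    [8.0, 0.0, 2.0],  # cell type A
    [0.0, 4.0, 2.0],  # cell type B
]

proportions = normalize_abundances([3.0, 1.0])
print(proportions)  # [0.75, 0.25]

expected = expected_spot_expression(proportions, profiles)
print(expected)  # [6.0, 1.0, 2.0]
```

Deconvolution runs this in reverse: given the observed spot counts and the reference profiles, the model searches for the proportions that best explain the counts.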

Basic Usage:

from scvi.external import RNAStereoscope, SpatialStereoscope

# Train on the single-cell reference
RNAStereoscope.setup_anndata(
    sc_adata,
    labels_key="cell_type",
    layer="counts"
)
ref_model = RNAStereoscope(sc_adata)
ref_model.train()

# Set up spatial data
SpatialStereoscope.setup_anndata(spatial_adata, layer="counts")

# Transfer to spatial
spatial_model = SpatialStereoscope.from_rna_model(
    spatial_adata,
    ref_model
)
spatial_model.train()

# Get proportions
proportions = spatial_model.get_proportions()

Tangram

Purpose: Spatial mapping and integration of single-cell data to spatial locations.

Key Features:
- Maps single cells to spatial coordinates
- Learns a probabilistic cell-to-spot mapping by matching mapped single-cell expression to the spatial data
- Gene imputation at spatial locations
- Cell type mapping

When to Use:
- Mapping cells from scRNA-seq to spatial locations
- Imputing unmeasured genes in spatial data
- Understanding spatial organization at single-cell resolution
- Integrating scRNA-seq and spatial transcriptomics

Data Requirements:
- Single-cell RNA-seq data with annotations
- Spatial transcriptomics data
- Shared genes between modalities

Basic Usage:

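import tangram as tg

# Restrict both datasets to shared training genes (required before mapping)
tg.pp_adatas(sc_adata, spatial_adata, genes=None)

# Map cells to spatial locations
ad_map = tg.map_cells_to_space(
    adata_sc=sc_adata,
    adata_sp=spatial_adata,
    mode="cells",  # or "clusters" (then also pass cluster_label="cell_type")
    density_prior="rna_count_based"
)

# Get mapping matrix (cells × spots)
mapping = ad_map.X

# Project cell annotations to space
# Results are stored in spatial_adata.obsm["tangram_ct_pred"]
tg.project_cell_annotations(
    ad_map,
    spatial_adata,
    annotation="cell_type"
)

# Impute genes in spatial data
# project_genes takes the single-cell data and returns a new AnnData (spots × genes)
ad_ge = tg.project_genes(adata_map=ad_map, adata_sc=sc_adata)
genes_to_impute = ["CD3D", "CD8A", "CD4"]
# Tangram lowercases gene names by default, so look them up in lowercase
imputed = ad_ge[:, [g.lower() for g in genes_to_impute]]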
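
Both projections above are weighted sums over the mapping matrix. The sketch below reproduces that arithmetic on a tiny hand-made example in plain Python; it leaves out the scaling and normalization that Tangram's own functions apply, and the function names and numbers are hypothetical.

```python
def project_annotations(mapping, cell_labels):
    """Sum mapping probabilities per cell type for every spot.

    mapping: n_cells rows × n_spots columns.
    Returns {cell_type: [score for each spot]}.
    """
    n_spots = len(mapping[0])
    scores = {}
    for row, label in zip(mapping, cell_labels):
        spot_scores = scores.setdefault(label, [0.0] * n_spots)
        for spot, prob in enumerate(row):
            spot_scores[spot] += prob
    return scores


def project_expression(mapping, expression):
    """Spot-level expression as the mapping-weighted sum of cell expression."""
    n_spots = len(mapping[0])
    n_genes = len(expression[0])
    projected = [[0.0] * n_genes for _ in range(n_spots)]
    for row, cell_expr in zip(mapping, expression):
        for spot, prob in enumerate(row):
            for g, value in enumerate(cell_expr):
                projected[spot][g] += prob * value
    return projected


# Hypothetical mapping: 3 cells × 2 spots
mapping = [
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
]
cell_labels = ["T", "T", "B"]
expression = [  # 3 cells × 2 genes
    [4.0, 0.0],
    [2.0, 2.0],
    [0.0, 6.0],
]

scores = project_annotations(mapping, cell_labels)
print(scores)  # {'T': [1.5, 0.5], 'B': [0.0, 1.0]}

projected = project_expression(mapping, expression)
print(projected)  # [[5.0, 1.0], [1.0, 7.0]]
```

In matrix terms, the expression projection is the transposed mapping matrix multiplied by the single-cell expression matrix, which is why genes absent from the spatial panel can still be imputed.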

Visualization:

# Visualize cell type mapping
sc.pl.spatial(
    spatial_adata,
    color="cell_type_projected",
    spot_size=100
)

gimVI

Purpose: Cross-modality imputation between spatial and single-cell data.

Key Features:
- Joint model of spatial and single-cell data
- Imputes missing genes in spatial data
- Enables cross-dataset queries
- Learns shared representations

When to Use:
- Imputing genes not measured in spatial data
- Joint analysis of spatial and single-cell datasets
- Mapping between modalities

Basic Usage:

# Combine datasets
combined_adata = sc.concat([sc_adata, spatial_adata])

scvi.model.GIMVI.setup_anndata(
    combined_adata,
    layer="counts"
)

model = scvi.model.GIMVI(combined_adata)
model.train()

# Impute genes in spatial data
imputed = model.get_imputed_values(spatial_indices)

scVIVA

Purpose: Analyzing cell-environment relationships in spatial data.

Key Features:
- Models cellular neighborhoods and environments
- Identifies environment-associated gene expression
- Accounts for spatial correlation structure
- Cell-cell interaction analysis

When to Use:
- Understanding how spatial context affects cells
- Identifying niche-specific gene programs
- Cell-cell interaction studies
- Microenvironment analysis

Data Requirements:
- Spatial transcriptomics with coordinates
- Cell type annotations (used to describe the composition of each cell's neighborhood)
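
To make the notion of a cellular neighborhood concrete, the sketch below computes, for each cell, the cell type composition of its k nearest neighbors from coordinates and labels. It illustrates the concept only and is not scVIVA's implementation; the function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def niche_composition(coords, labels, k):
    """Fraction of each cell type among the k nearest neighbors of every cell."""
    compositions = []
    for i, point in enumerate(coords):
        # Distances to all other cells, nearest first (the cell itself is excluded)
        distances = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(coords)
            if j != i
        )
        neighbor_labels = [labels[j] for _, j in distances[:k]]
        counts = Counter(neighbor_labels)
        compositions.append(
            {ct: n / len(neighbor_labels) for ct, n in counts.items()}
        )
    return compositions


# Hypothetical tissue: a mixed T/B region and a pure B region
coords = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
labels = ["T", "T", "B", "B", "B", "B"]

niches = niche_composition(coords, labels, k=2)
print(niches[0])  # {'T': 0.5, 'B': 0.5}
print(niches[3])  # {'B': 1.0}
```

Cells of the same type can therefore sit in different niches (here, the B cell at index 2 is surrounded by T cells while the others are surrounded by B cells), which is the kind of contrast an environment-aware model is meant to capture.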

Basic Usage:

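from scvi.external import SCVIVA

# Assumption: argument names follow the scVIVA tutorial and may differ between
# scvi-tools versions. Requires an expression embedding (e.g. from scVI) in .obsm.
keys = dict(
    sample_key="sample",
    labels_key="cell_type",
    cell_coordinates_key="spatial",  # Coordinates in .obsm
    expression_embedding_key="X_scVI",
)

# Compute each cell's neighborhood (niche) composition and embedding
SCVIVA.preprocessing_anndata(spatial_adata, k_nn=20, **keys)
SCVIVA.setup_anndata(spatial_adata, layer="counts", **keys)

model = SCVIVA(spatial_adata)
model.train()

# Get the environment-aware latent representation
env_latent = model.get_latent_representation()

# Identify environment-associated genes through the model's differential
# expression method; see its API documentation for the niche-aware arguments
env_genes = model.differential_expression(groupby="cell_type")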

ResolVI

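Purpose: Correcting noise and bias in cellular-resolution spatial transcriptomics, such as transcripts wrongly assigned to neighboring cells by segmentation errors and unspecific background signal.

Key Features:
- Models each cell's counts as a mixture of its own expression, spillover from neighboring cells, and background
- Denoises spatial data
- Improves downstream analysis quality

When to Use:
- Noisy imaging-based spatial data with imperfect cell segmentation
- Need denoising before analysis
- Improving data quality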

Basic Usage:

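import numpy as np
from scvi.external import RESOLVI

# Assumption: argument names follow the resolVI tutorial and may differ between
# scvi-tools versions. Coordinates are read from .obsm (default key "X_spatial").
RESOLVI.setup_anndata(
    spatial_adata,
    layer="counts",
    prepare_data_kwargs={"spatial_rep": "spatial"}
)

model = RESOLVI(spatial_adata)
model.train()

# Get denoised expression: posterior samples of the corrected expression rate
denoised = model.sample_posterior(
    model=model.module.model_corrected,
    return_sites=["px_rate"],
    summary_fun={"post_sample_q50": np.median},
    num_samples=3
)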

Model Selection for Spatial Transcriptomics

DestVI

Choose when:
- Need detailed deconvolution with reference
- Have high-quality scRNA-seq reference
- Want multi-resolution analysis (proportions plus within-cell-type variation)
- Want a probabilistic model of the counts

Best for: Visium, spot-based technologies

Stereoscope

Choose when:
- Need simpler, faster deconvolution
- Basic cell type proportion estimates
- Limited computational resources

Best for: Quick deconvolution tasks

Tangram

Choose when:
- Want single-cell resolution mapping
- Need to impute many genes
- Interested in cell positioning
- Prefer a direct cell-to-spot mapping over proportion estimates

Best for: Detailed spatial mapping

gimVI

Choose when:
- Need bidirectional imputation
- Joint modeling of spatial and single-cell
- Cross-dataset queries

Best for: Integration and imputation

scVIVA

Choose when:
- Interested in cellular environments
- Cell-cell interaction analysis
- Neighborhood effects

Best for: Microenvironment studies

ResolVI

Choose when:
- Data quality is a concern
- Need denoising
- Cell segmentation is imperfect (transcripts spill over between neighboring cells)

Best for: Noisy data preprocessing

Complete Workflow: Spatial Deconvolution with DestVI

import scvi
import scanpy as sc
import squidpy as sq

# ===== Part 1: Prepare single-cell reference =====
# Load scRNA-seq reference
# (expects raw counts in .layers["counts"] and labels in .obs["cell_type"])
sc_adata = sc.read_h5ad("reference_scrna.h5ad")

# QC and filtering
sc.pp.filter_genes(sc_adata, min_cells=10)
# seurat_v3 selects variable genes from raw counts; subset=True keeps only those genes
sc.pp.highly_variable_genes(
    sc_adata,
    n_top_genes=4000,
    flavor="seurat_v3",
    layer="counts",
    subset=True
)

# ===== Part 2: Load spatial data =====
spatial_adata = sc.read_visium("path/to/visium")
spatial_adata.var_names_make_unique()
# read_visium puts raw counts in .X; keep a copy in the layer the models read
spatial_adata.layers["counts"] = spatial_adata.X.copy()

# QC spatial data
sc.pp.filter_genes(spatial_adata, min_cells=10)

# Restrict both datasets to shared genes, in the same order
shared = sc_adata.var_names.intersection(spatial_adata.var_names)
sc_adata = sc_adata[:, shared].copy()
spatial_adata = spatial_adata[:, shared].copy()

# ===== Part 3: Train reference model and run DestVI =====
# DestVI requires a CondSCVI reference model (conditioned on cell type)
scvi.model.CondSCVI.setup_anndata(
    sc_adata,
    layer="counts",
    labels_key="cell_type"
)

sc_model = scvi.model.CondSCVI(sc_adata, weight_obs=False)
sc_model.train(max_epochs=400)

scvi.model.DestVI.setup_anndata(
    spatial_adata,
    layer="counts"
)

destvi_model = scvi.model.DestVI.from_rna_model(
    spatial_adata,
    sc_model
)

destvi_model.train(max_epochs=2500)

# ===== Part 4: Extract results =====
# Get proportions (DataFrame: spots × cell types)
proportions = destvi_model.get_proportions()
spatial_adata.obsm["proportions"] = proportions

# Add proportions to .obs for easy plotting
for ct in proportions.columns:
    spatial_adata.obs[f"prop_{ct}"] = proportions[ct].values

# ===== Part 5: Visualization =====
# Plot specific cell types (names must match the reference labels)
cell_types = ["T cells", "B cells", "Macrophages"]

for ct in cell_types:
    sc.pl.spatial(
        spatial_adata,
        color=f"prop_{ct}",
        title=f"{ct} proportions",
        spot_size=150,
        cmap="viridis"
    )

# ===== Part 6: Spatial analysis =====
# Compute spatial neighbors
sq.gr.spatial_neighbors(spatial_adata)

# Spatial autocorrelation (Moran's I) of cell type proportions
# One call for all cell types; results are stored in spatial_adata.uns["moranI"]
sq.gr.spatial_autocorr(
    spatial_adata,
    attr="obs",
    mode="moran",
    genes=[f"prop_{ct}" for ct in cell_types]
)

# ===== Part 7: Save results =====
destvi_model.save("destvi_model")
spatial_adata.write("spatial_deconvolved.h5ad")
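
The spatial autocorrelation step in Part 6 reports Moran's I, which is positive when neighboring spots have similar values and negative when they alternate. As a rough illustration, the textbook statistic can be computed by hand; this sketch omits the permutation-based significance testing that a library implementation adds, and the function name and toy data are hypothetical.

```python
def morans_i(values, weights):
    """Textbook Moran's I for values on spots with a spot × spot weight matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance_sum = sum(d * d for d in deviations)
    if variance_sum == 0:
        raise ValueError("values are constant; Moran's I is undefined")
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of spots
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical layout: four spots in a row, each linked to its direct neighbors
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1.0, 1.0, 0.0, 0.0], weights)
alternating = morans_i([1.0, 0.0, 1.0, 0.0], weights)
print(round(clustered, 3))    # 0.333
print(round(alternating, 3))  # -1.0
```

A cell type whose proportions form contiguous domains in the tissue yields a clearly positive value, while a value near zero suggests no spatial structure.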

Best Practices for Spatial Analysis

Reference quality: Use high-quality, well-annotated scRNA-seq reference
Gene overlap: Ensure sufficient shared genes between reference and spatial
Spatial coordinates: Properly register spatial coordinates in .obsm["spatial"]
Validation: Use known marker genes to validate deconvolution
Visualization: Always visualize results spatially to check biological plausibility
Cell type granularity: Consider appropriate cell type resolution
Computational resources: Spatial models can be memory-intensive
Quality control: Filter low-quality spots before analysis
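
One simple way to apply the marker-gene validation above is to correlate a known marker's expression across spots with the estimated proportion of the matching cell type. The sketch below uses plain Python with made-up values; the function name and the numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical values across four spots
marker_expression = [0.0, 2.0, 4.0, 6.0]   # e.g. normalized CD3D expression
t_cell_proportion = [0.1, 0.2, 0.3, 0.4]   # estimated T cell proportion
b_cell_proportion = [0.4, 0.1, 0.4, 0.1]   # estimated B cell proportion

r_matching = pearson(marker_expression, t_cell_proportion)
r_other = pearson(marker_expression, b_cell_proportion)
print(round(r_matching, 3))  # 1.0
print(r_other < 0)           # True
```

A marker that correlates well with its own cell type and poorly with others supports the deconvolution; a weak or inverted correlation is a prompt to revisit the reference annotations or the shared gene set.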
