SpaTopic Basics

More detailed documentation is available on the SpaTopic Home Page. Please see the GitHub page for further information about the package.

Introduction

Recent advancements in multiplexed tissue imaging allow for examination of tissue microenvironments in great detail. These cutting-edge technologies offer invaluable insights into cellular heterogeneity and spatial architectures, playing a crucial role in decoding mechanisms of treatment response and disease progression.

However, gaining a deep understanding of complex spatial patterns remains challenging. SpaTopic implements a novel spatial topic model that integrates cell type and spatial information to identify complex spatial tissue structures without human intervention. A collapsed Gibbs sampling algorithm is used for model inference. In contrast to computationally intensive K-nearest-neighbor-based cell neighborhood analysis approaches, SpaTopic scales better to large image datasets because it does not need to extract neighborhood information for every single cell.

SpaTopic can be applied either on a single image or across multiple images.

Simple Usage

The required input of SpaTopic is a data frame containing the cells of a single image, or a list of data frames for multiple images. Each data frame consists of four columns:

  • image: Image ID
  • X, Y: X and Y coordinates of the cell
  • type: cell type information

library(SpaTopic)
library(sf)
## The input can be a data frame or a list of data frames
data("lung5")
head(lung5)
#>      image        X        Y           type
#> 1_1 image1 4215.889 158847.7      Dendritic
#> 2_1 image1 6092.889 158834.7     Macrophage
#> 3_1 image1 7214.889 158843.7 Neuroendocrine
#> 4_1 image1 7418.889 158813.7     Macrophage
#> 5_1 image1 7446.889 158845.7     Macrophage
#> 6_1 image1 3254.889 158838.7          CD4 T
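
The input format described above can be sketched with base R alone. The helper `make_image()`, the coordinate ranges, and the three cell type labels below are hypothetical and exist only for illustration; whether SpaTopic expects `type` as character or factor, or the list to be named, is not stated here, so treat those choices as assumptions.

```r
# Hypothetical toy data: coordinates and cell types are made up for illustration.
make_image <- function(image_id, n, seed) {
  set.seed(seed)
  data.frame(
    image = image_id,                      # image ID, repeated for every cell
    X = runif(n, min = 0, max = 1000),     # X coordinate of each cell
    Y = runif(n, min = 0, max = 1000),     # Y coordinate of each cell
    type = sample(c("Macrophage", "CD4 T", "Dendritic"), n, replace = TRUE),
    stringsAsFactors = FALSE
  )
}

# Single-image input: one data frame
single_image <- make_image("image1", n = 50, seed = 1)

# Multi-image input: a list with one data frame per image
multi_image <- list(single_image, make_image("image2", n = 60, seed = 2))

# Check that every data frame carries the four required columns
required_cols <- c("image", "X", "Y", "type")
has_required <- function(df) all(required_cols %in% colnames(df))
stopifnot(has_required(single_image))
stopifnot(all(vapply(multi_image, has_required, logical(1))))
```

A check like `has_required()` can be run on real data before inference to catch a missing or misnamed column early.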

Run Gibbs Sampling

## Gibbs sampling
gibbs.res <- SpaTopic_inference(lung5, ntopics = 7, sigma = 50, region_radius = 400)

Check the output of SpaTopic

str(gibbs.res)
#> List of 8
#>  $ Perplexity   : num 11.3
#>  $ Deviance     : num 485960
#>  $ loglikelihood: num -242980
#>  $ Beta         :'data.frame':   38 obs. of  7 variables:
#>   ..$ topic1: num [1:38] 0.03587 0.02539 0.00755 0.01858 0.02585 ...
#>   ..$ topic2: num [1:38] 6.51e-03 3.55e-02 2.62e-06 5.80e-04 7.75e-01 ...
#>   ..$ topic3: num [1:38] 4.54e-06 4.54e-06 9.13e-04 3.45e-01 1.73e-03 ...
#>   ..$ topic4: num [1:38] 0.02664 0.01743 0.00186 0.0152 0.08919 ...
#>   ..$ topic5: num [1:38] 2.99e-06 2.99e-06 5.32e-03 1.91e-02 4.90e-03 ...
#>   ..$ topic6: num [1:38] 6.35e-06 6.35e-06 2.04e-02 3.43e-03 6.35e-06 ...
#>   ..$ topic7: num [1:38] 0.00534 0.00699 0.00604 0.01843 0.00655 ...
#>  $ Theta        : num [1:971, 1:7] 0.855601 0.000232 0.999269 0.99889 0.998725 ...
#>  $ Ndk          : int [1:971, 1:7] 107 0 82 54 47 72 100 0 0 0 ...
#>  $ Nwk          : int [1:38, 1:7] 390 276 82 202 281 505 697 522 29 58 ...
#>  $ Z.trace      :'data.frame':   100149 obs. of  7 variables:
#>   ..$ topic1: num [1:100149] 0.065 0.865 0.135 0.82 0.785 0.02 0.1 0.105 0.075 0.095 ...
#>   ..$ topic2: num [1:100149] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ topic3: num [1:100149] 0.275 0.005 0.21 0.005 0.005 0.77 0.02 0.015 0.085 0.075 ...
#>   ..$ topic4: num [1:100149] 0.415 0 0 0.01 0.005 0.1 0.665 0.62 0.015 0.025 ...
#>   ..$ topic5: num [1:100149] 0.005 0.01 0 0 0 0 0.005 0.005 0.005 0 ...
#>   ..$ topic6: num [1:100149] 0 0 0.655 0.165 0.205 0.005 0 0 0 0 ...
#>   ..$ topic7: num [1:100149] 0.24 0.12 0 0 0 0.105 0.21 0.255 0.82 0.805 ...
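
As a rough guide to reading this output, the sketch below works only from the numbers printed above and does not call SpaTopic. Two things in it are inferred from the printed values rather than taken from the package documentation: that Deviance equals -2 times the log-likelihood and Perplexity equals exp(-loglikelihood / number of cells), and that each row of `Z.trace` holds one cell's topic proportions (the rows shown sum to 1). The object `z_trace` is a hypothetical stand-in containing the first three rows of `Z.trace`.

```r
# Scalar summaries copied from the str() output above
loglikelihood <- -242980
n_cells <- 100149                       # number of rows in Z.trace

# Inferred relationships (assumption: consistent with the printed values)
deviance <- -2 * loglikelihood
perplexity <- exp(-loglikelihood / n_cells)

# First three rows of Z.trace, copied from the output above
z_trace <- data.frame(
  topic1 = c(0.065, 0.865, 0.135),
  topic2 = c(0, 0, 0),
  topic3 = c(0.275, 0.005, 0.21),
  topic4 = c(0.415, 0, 0),
  topic5 = c(0.005, 0.01, 0),
  topic6 = c(0, 0, 0.655),
  topic7 = c(0.24, 0.12, 0)
)

# Each row sums to 1, so it can be read as per-cell topic proportions
row_totals <- unname(rowSums(as.matrix(z_trace)))

# Assign every cell to its highest-weight topic
top_topic <- colnames(z_trace)[apply(as.matrix(z_trace), 1, which.max)]

print(deviance)
print(round(perplexity, 1))
print(top_topic)
```

Applied to the full `gibbs.res$Z.trace`, the same `which.max` step would give one topic label per cell, which could then be stored alongside the coordinates for plotting.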

For more detailed usage of SpaTopic and guidance on interpreting its output, please see the complete tutorial on the SpaTopic Home Page. We also provide a function to prepare input from a Seurat v5 object.