Perform clustering in BANKSY's neighborhood-augmented feature space.

clusterBanksy(
  se,
  use_agf = FALSE,
  lambda = 0.2,
  use_pcs = TRUE,
  npcs = 20L,
  dimred = NULL,
  ndims = NULL,
  assay_name = NULL,
  group = NULL,
  algo = c("leiden", "louvain", "kmeans", "mclust"),
  k_neighbors = 50,
  resolution = 1,
  leiden.iter = -1,
  kmeans.centers = 5,
  mclust.G = 5,
  M = NULL,
  seed = NULL,
  ...
)

Arguments

se: A SpatialExperiment, SingleCellExperiment or SummarizedExperiment object on which computeBanksy has been run.
use_agf: A logical vector specifying whether to use the azimuthal Gabor filter (AGF) for clustering.
lambda: A numeric vector with values in \([0,1]\) specifying the spatial weighting parameter. Larger values (e.g. 0.8) give more weight to the spatial neighborhood and find spatial domains, while smaller values (e.g. 0.2) perform spatial cell-typing.
use_pcs: A logical scalar specifying whether to cluster on PCs. If FALSE, runs on the BANKSY matrix.
npcs: An integer scalar specifying the number of principal components to use if use_pcs is TRUE.
dimred: A string scalar specifying the name of an existing dimensionality reduction result to use. Overrides use_pcs if supplied.
ndims: An integer scalar specifying the number of dimensions to use if dimred is supplied.
assay_name: A string scalar specifying the name of the assay used in computeBanksy.
group: A string scalar specifying a grouping variable for samples in se. This is used to scale the samples in each group separately.
algo: A string scalar specifying the clustering algorithm to use; one of leiden, louvain, mclust, kmeans.
k_neighbors: An integer vector specifying the number of neighbors used to construct the shared nearest neighbor (sNN) graph (louvain / leiden).
resolution: A numeric vector specifying resolution used for clustering (louvain / leiden).
leiden.iter: An integer scalar specifying the number of leiden iterations. Set to -1 to run until convergence (leiden).
kmeans.centers: An integer vector specifying the number of kmeans clusters (kmeans).
mclust.G: An integer vector specifying the number of mixture components (Mclust).
M: Advanced usage. An integer vector specifying the highest azimuthal Fourier harmonic to cluster with. If specified, overrides the use_agf argument.
seed: Random seed for clustering. If not specified, no seed is set.
...: Further arguments to pass to methods.

Value

A SpatialExperiment / SingleCellExperiment / SummarizedExperiment object with cluster labels in colData(se).

Details

This function performs clustering on the principal components computed on the BANKSY matrix, i.e., the BANKSY embedding. The PCA corresponding to the parameters use_agf and lambda must have been computed with runBanksyPCA. Clustering may also be performed directly on the BANKSY matrix with use_pcs set to FALSE (this is not recommended).

Four clustering algorithms are implemented.

leiden: Leiden graph-based clustering. The arguments k_neighbors and resolution should be specified.
louvain: Louvain graph-based clustering. The arguments k_neighbors and resolution should be specified.
kmeans: k-means clustering. The argument kmeans.centers should be specified.
mclust: Gaussian mixture model-based clustering. The argument mclust.G should be specified.
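
As an illustration of how the algorithm-specific arguments pair with algo, the following is a minimal sketch that selects k-means instead of the default leiden. It reuses the rings data and the preprocessing calls from the Examples section; the variable names before and new_cols are introduced here for illustration only, and the new label column is located by comparing colData before and after clustering rather than by assuming a particular column name.

```r
library(Banksy)
library(SummarizedExperiment)  # provides colData()

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = 0.2, npcs = 20)

# Record the existing column names so the new label column can be identified.
before <- colnames(colData(spe))

# k-means ignores k_neighbors and resolution; kmeans.centers sets the cluster count.
spe <- clusterBanksy(spe, M = 1, lambda = 0.2, algo = "kmeans", kmeans.centers = 5)

# Columns added by clusterBanksy (illustrative helper variable).
new_cols <- setdiff(colnames(colData(spe)), before)
table(colData(spe)[[new_cols[1]]])
```

The same pattern should apply to mclust by passing algo = "mclust" together with mclust.G.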

By default, no seed is set for clustering. If a seed is specified, the same seed is used for clustering across the input parameters.
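
The effect of seed can be checked with a short sketch such as the one below, which clusters the same object twice with the same seed and compares the resulting labels. k-means is used here only because its dependence on R's random number generator is straightforward; the seed value 1000 and the names run_a, run_b and label_col are arbitrary choices made for this illustration.

```r
library(Banksy)
library(SummarizedExperiment)  # provides colData()

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = 0.2, npcs = 20)
before <- colnames(colData(spe))

# Two independent runs from the same input, with the same seed.
run_a <- clusterBanksy(spe, M = 1, lambda = 0.2, algo = "kmeans",
                       kmeans.centers = 5, seed = 1000)
run_b <- clusterBanksy(spe, M = 1, lambda = 0.2, algo = "kmeans",
                       kmeans.centers = 5, seed = 1000)

# Locate the label column added by clustering and compare the two runs.
label_col <- setdiff(colnames(colData(run_a)), before)[1]
identical(as.character(colData(run_a)[[label_col]]),
          as.character(colData(run_b)[[label_col]]))
```

If the seed is omitted, the two runs are not guaranteed to agree.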

Examples

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
#> Computing neighbors...
#> Spatial mode is kNN_median
#> Parameters: k_geom=15
#> Done
#> Computing neighbors...
#> Spatial mode is kNN_median
#> Parameters: k_geom=30
#> Done
#> Done
#> Centering
#> Done
spe <- runBanksyPCA(spe, M = 1, lambda = c(0, 0.2), npcs = 20)
spe <- clusterBanksy(spe, M = 1, lambda = c(0, 0.2), resolution = 1)
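
A possible follow-up to the example above is to inspect the labels that were written to colData. The sketch below repeats the pipeline so that it runs on its own, then cross-tabulates the two clusterings. It assumes that one label column is added per lambda value; new_cols is a name introduced for this illustration, and the columns are found by comparing colData before and after clustering rather than by relying on a naming scheme.

```r
library(Banksy)
library(SummarizedExperiment)  # provides colData()

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = c(0, 0.2), npcs = 20)

before <- colnames(colData(spe))
spe <- clusterBanksy(spe, M = 1, lambda = c(0, 0.2), resolution = 1)

# Label columns added by clusterBanksy (assumed: one per lambda value).
new_cols <- setdiff(colnames(colData(spe)), before)
new_cols

# Cross-tabulate the two clusterings to see how the spatial weighting changes the labels.
table(colData(spe)[[new_cols[1]]], colData(spe)[[new_cols[2]]])
```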