R/cluster.R
clusterBanksy.Rd
Perform clustering in BANKSY's neighborhood-augmented feature space.
clusterBanksy(
  se,
  use_agf = FALSE,
  lambda = 0.2,
  use_pcs = TRUE,
  npcs = 20L,
  dimred = NULL,
  ndims = NULL,
  assay_name = NULL,
  group = NULL,
  algo = c("leiden", "louvain", "kmeans", "mclust"),
  k_neighbors = 50,
  resolution = 1,
  leiden.iter = -1,
  kmeans.centers = 5,
  mclust.G = 5,
  M = NULL,
  seed = NULL,
  ...
)
se: A SpatialExperiment, SingleCellExperiment or SummarizedExperiment object on which computeBanksy has been run.
use_agf: A logical vector specifying whether to use the AGF for clustering.
lambda: A numeric vector with values in \([0,1]\) specifying the spatial weighting parameter. Larger values (e.g. 0.8) incorporate more of the spatial neighborhood and find spatial domains, while smaller values (e.g. 0.2) perform spatial cell-typing.
use_pcs: A logical scalar specifying whether to cluster on PCs. If FALSE, clustering runs on the BANKSY matrix.
npcs: An integer scalar specifying the number of principal components to use if use_pcs is TRUE.
dimred: A string scalar specifying the name of an existing dimensionality reduction result to use. Overrides use_pcs if supplied.
ndims: An integer scalar specifying the number of dimensions to use if dimred is supplied.
assay_name: A string scalar specifying the name of the assay used in computeBanksy.
group: A string scalar specifying a grouping variable for samples in se. This is used to scale the samples in each group separately.
algo: A string scalar specifying the clustering algorithm to use; one of leiden, louvain, kmeans, mclust.
k_neighbors: An integer vector specifying the number of neighbors for constructing the shared nearest-neighbor (sNN) graph (louvain / leiden).
resolution: A numeric vector specifying the resolution used for clustering (louvain / leiden).
leiden.iter: An integer scalar specifying the number of leiden iterations. To run until convergence, set to -1 (leiden).
kmeans.centers: An integer vector specifying the number of kmeans clusters (kmeans).
mclust.G: An integer vector specifying the number of mixture components (mclust).
M: Advanced usage. An integer vector specifying the highest azimuthal Fourier harmonic to cluster with. If specified, overrides the use_agf argument.
seed: Random seed for clustering. If not specified, no seed is set.
...: Additional arguments to pass to methods.
Returns a SpatialExperiment / SingleCellExperiment / SummarizedExperiment object with cluster labels stored in colData(se).
This function performs clustering on the principal components computed on the BANKSY matrix, i.e., the BANKSY embedding. The PCA corresponding to the parameters use_agf and lambda must have been computed with runBanksyPCA. Clustering may also be performed directly on the BANKSY matrix with use_pcs set to FALSE (this is not recommended).
Four clustering algorithms are implemented.
leiden: Leiden graph-based clustering. The arguments k_neighbors and resolution should be specified.
louvain: Louvain graph-based clustering. The arguments k_neighbors and resolution should be specified.
kmeans: kmeans clustering. The argument kmeans.centers should be specified.
mclust: Gaussian mixture model-based clustering. The argument mclust.G should be specified.
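As a minimal sketch of selecting a non-default algorithm, the call below switches to kmeans. It assumes the Banksy and SummarizedExperiment packages are installed and reuses the bundled rings dataset and preprocessing from the package example. The seed value 1000 and the helper variable names (cols_before, km_cols) are arbitrary choices for illustration.

```r
library(Banksy)
library(SummarizedExperiment)  # provides colData()

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = 0.2, npcs = 20)

# Remember which colData columns exist before clustering
cols_before <- colnames(colData(spe))

# kmeans on the BANKSY PCs; kmeans.centers sets the number of clusters
spe <- clusterBanksy(spe, M = 1, lambda = 0.2,
                     algo = "kmeans", kmeans.centers = 5, seed = 1000)

# Columns added by clusterBanksy hold the cluster labels
km_cols <- setdiff(colnames(colData(spe)), cols_before)

# Cluster sizes for the first new label column
table(colData(spe)[[km_cols[1]]])
```

Locating the labels with setdiff avoids relying on any particular column naming scheme, which this page does not specify.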
By default, no seed is set for clustering. If a seed is specified, the same seed is used for clustering across the input parameters.
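One way to check the effect of seed is to cluster twice with the same value and compare the labels. The sketch below assumes the Banksy and SummarizedExperiment packages are installed; run_once is a hypothetical helper written for this illustration, and the seed value 42 is arbitrary.

```r
library(Banksy)
library(SummarizedExperiment)  # provides colData()

data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = 0.2, npcs = 20)

# Hypothetical helper: cluster with a given seed and return the new labels
run_once <- function(se, seed) {
  out <- clusterBanksy(se, M = 1, lambda = 0.2, resolution = 1, seed = seed)
  new_cols <- setdiff(colnames(colData(out)), colnames(colData(se)))
  as.character(colData(out)[[new_cols[1]]])
}

labels_a <- run_once(spe, seed = 42)
labels_b <- run_once(spe, seed = 42)

# With a fixed seed the two runs are expected to agree
identical(labels_a, labels_b)
```

Leaving seed = NULL in the same helper would let the two runs differ, since no seed is set in that case.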
data(rings)
spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
#> Computing neighbors...
#> Spatial mode is kNN_median
#> Parameters: k_geom=15
#> Done
#> Computing neighbors...
#> Spatial mode is kNN_median
#> Parameters: k_geom=30
#> Done
#> Computing harmonic m = 0
#> Using 15 neighbors
#> Done
#> Computing harmonic m = 1
#> Using 30 neighbors
#> Centering
#> Done
spe <- runBanksyPCA(spe, M = 1, lambda = c(0, 0.2), npcs = 20)
spe <- clusterBanksy(spe, M = 1, lambda = c(0, 0.2), resolution = 1)
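To see what the example produces, one can list the colData columns added by the pipeline and tabulate each of them. This is a sketch that assumes the Banksy and SummarizedExperiment packages are installed; cols_before and new_cols are illustrative names, and the exact number and naming of the added columns is not stated on this page, so the code discovers them rather than hard-coding them.

```r
library(Banksy)
library(SummarizedExperiment)  # provides colData()

data(rings)
cols_before <- colnames(colData(rings))

spe <- computeBanksy(rings, assay_name = "counts", M = 1, k_geom = c(15, 30))
spe <- runBanksyPCA(spe, M = 1, lambda = c(0, 0.2), npcs = 20)
spe <- clusterBanksy(spe, M = 1, lambda = c(0, 0.2), resolution = 1)

# Columns that were not present in the input object
new_cols <- setdiff(colnames(colData(spe)), cols_before)
new_cols

# Cluster sizes for each added column
lapply(new_cols, function(nm) table(colData(spe)[[nm]]))
```

Because lambda is passed as a vector here, comparing the tables side by side is a quick way to see how the spatial weighting changes the partition.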