The scAnnotatR.models package contains a set of pre-trained models, used by the scAnnotatR package, to classify various (immune) cell types in human data.
scAnnotatR is an R package for cell type prediction on single cell RNA-sequencing data. Currently, this package supports data in the form of a Seurat object or a SingleCellExperiment object.
If you are interested in directly applying these models to your data, please refer to the vignettes of the scAnnotatR package.
The scAnnotatR.models package is an AnnotationHub package. Normally, it is automatically loaded by the scAnnotatR package.
To load the package manually into your R session, please use the Bioconductor AnnotationHub package:
# use the AnnotationHub to load the scAnnotatR.models package
eh <- AnnotationHub::AnnotationHub()
# load the stored models
query <- AnnotationHub::query(eh, "scAnnotatR.models")
models <- query[["AH95906"]]
#> loading from cache
#> Loading required namespace: scAnnotatR
#> Warning: replacing previous import 'ape::where' by 'dplyr::where' when loading
#> 'scAnnotatR'
#> Warning: replacing previous import 'e1071::element' by 'ggplot2::element' when
#> loading 'scAnnotatR'
#> Registered S3 method overwritten by 'spatstat.explore':
#> method from
#> plot.roc pROC
The models object is a named list containing the cell type’s name as key and the respective classifier as value:
# print the available cell types
names(models)
#> [1] "B cells" "Plasma cells" "NK"
#> [4] "CD16 NK" "CD56 NK" "T cells"
#> [7] "CD4 T cells" "CD8 T cells" "Treg"
#> [10] "NKT" "ILC" "Monocytes"
#> [13] "CD14 Mono" "CD16 Mono" "DC"
#> [16] "pDC" "Endothelial cells" "LEC"
#> [19] "VEC" "Platelets" "RBC"
#> [22] "Melanocyte" "Schwann cells" "Pericytes"
#> [25] "Mast cells" "Keratinocytes" "alpha"
#> [28] "beta" "delta" "gamma"
#> [31] "acinar" "ductal" "Fibroblasts"
Each classifier is an instance of the scAnnotatR S4 class. For example:
models[['B cells']]
#> An object of class scAnnotatR for B cells
#> * 31 marker genes applied: CD38, CD79B, CD74, CD84, RASGRP2, TCF3, SP140, MEF2C, DERL3, CD37, CD79A, POU2AF1, MVK, CD83, BACH2, LY86, CD86, SDC1, CR2, LRMP, VPREB3, IL2RA, BLK, IRF8, FLI1, MS4A1, CD14, MZB1, PTEN, CD19, MME
#> * Predicting probability threshold: 0.5
#> * No parent model
The scAnnotatR package comes with several pre-trained models to classify cell types.
# Load the scAnnotatR package to view the models
library(scAnnotatR)
#> Loading required package: Seurat
#> Loading required package: SeuratObject
#> Loading required package: sp
#>
#> Attaching package: 'SeuratObject'
#> The following objects are masked from 'package:base':
#>
#> intersect, t
#> Loading required package: SingleCellExperiment
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#>
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#>
#> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#> colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#> colWeightedMeans, colWeightedMedians, colWeightedSds,
#> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#> rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#> rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: generics
#>
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#>
#> as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#> setequal, union
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
#> mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#> rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
#> unsplit, which.max, which.min
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#>
#> findMatches
#> The following objects are masked from 'package:base':
#>
#> I, expand.grid, unname
#> Loading required package: IRanges
#>
#> Attaching package: 'IRanges'
#> The following object is masked from 'package:sp':
#>
#> %over%
#> Loading required package: Seqinfo
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#>
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#>
#> rowMedians
#> The following objects are masked from 'package:matrixStats':
#>
#> anyMissing, rowMedians
#>
#> Attaching package: 'SummarizedExperiment'
#> The following object is masked from 'package:Seurat':
#>
#> Assays
#> The following object is masked from 'package:SeuratObject':
#>
#> Assays
The models are stored in the default_models object:
default_models <- load_models("default")
#> loading from cache
names(default_models)
#> [1] "B cells" "Plasma cells" "NK"
#> [4] "CD16 NK" "CD56 NK" "T cells"
#> [7] "CD4 T cells" "CD8 T cells" "Treg"
#> [10] "NKT" "ILC" "Monocytes"
#> [13] "CD14 Mono" "CD16 Mono" "DC"
#> [16] "pDC" "Endothelial cells" "LEC"
#> [19] "VEC" "Platelets" "RBC"
#> [22] "Melanocyte" "Schwann cells" "Pericytes"
#> [25] "Mast cells" "Keratinocytes" "alpha"
#> [28] "beta" "delta" "gamma"
#> [31] "acinar" "ductal" "Fibroblasts"
The default_models object is a named list of classifiers. Each classifier is an instance of the scAnnotatR S4 class. For example:
default_models[['B cells']]
#> An object of class scAnnotatR for B cells
#> * 31 marker genes applied: CD38, CD79B, CD74, CD84, RASGRP2, TCF3, SP140, MEF2C, DERL3, CD37, CD79A, POU2AF1, MVK, CD83, BACH2, LY86, CD86, SDC1, CR2, LRMP, VPREB3, IL2RA, BLK, IRF8, FLI1, MS4A1, CD14, MZB1, PTEN, CD19, MME
#> * Predicting probability threshold: 0.5
#> * No parent model
Please refer to the scAnnotatR package documentation for detailed information about how to use these classifiers.
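Since the classifiers are held in an ordinary named list, standard R list indexing can be used to pick out the cell types of interest. The following is a minimal sketch of that idea. It assumes a stand-in list called models that holds placeholder strings rather than real scAnnotatR objects, and a hypothetical selection called wanted, so that it runs without downloading anything. The same indexing should apply to the list returned by load_models("default").

```r
# Stand-in for the real named list: placeholder strings instead of
# scAnnotatR classifier objects (assumption made for illustration only).
models <- list(
  "B cells"     = "classifier for B cells",
  "T cells"     = "classifier for T cells",
  "CD4 T cells" = "classifier for CD4 T cells",
  "NK"          = "classifier for NK"
)

# Hypothetical selection; "Neutrophils" is deliberately not in the list.
wanted <- c("B cells", "T cells", "Neutrophils")

# Split the request into names that exist and names that do not.
found   <- intersect(wanted, names(models))
missing <- setdiff(wanted, names(models))

# Single-bracket indexing returns a named list (a subset of classifiers);
# double-bracket indexing returns one element.
selected <- models[found]
b_cells  <- models[["B cells"]]

print(names(selected))
print(missing)
```

Checking the requested names against names(models) first is a deliberate choice in this sketch: on a plain list, double-bracket indexing with an unknown name returns NULL instead of raising an error, so a misspelled cell type could otherwise go unnoticed.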
sessionInfo()
#> R Under development (unstable) (2025-10-20 r88955)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] scAnnotatR_1.17.0 SingleCellExperiment_1.33.0
#> [3] SummarizedExperiment_1.41.0 Biobase_2.71.0
#> [5] GenomicRanges_1.63.1 Seqinfo_1.1.0
#> [7] IRanges_2.45.0 S4Vectors_0.49.0
#> [9] BiocGenerics_0.57.0 generics_0.1.4
#> [11] MatrixGenerics_1.23.0 matrixStats_1.5.0
#> [13] Seurat_5.3.1 SeuratObject_5.2.0
#> [15] sp_2.2-0
#>
#> loaded via a namespace (and not attached):
#> [1] RcppAnnoy_0.0.22 splines_4.6.0 later_1.4.4
#> [4] filelock_1.0.3 tibble_3.3.0 polyclip_1.10-7
#> [7] hardhat_1.4.2 pROC_1.19.0.1 rpart_4.1.24
#> [10] fastDummies_1.7.5 lifecycle_1.0.4 httr2_1.2.2
#> [13] globals_0.18.0 lattice_0.22-7 MASS_7.3-65
#> [16] magrittr_2.0.4 plotly_4.11.0 sass_0.4.10
#> [19] rmarkdown_2.30 jquerylib_0.1.4 yaml_2.3.11
#> [22] httpuv_1.6.16 otel_0.2.0 sctransform_0.4.2
#> [25] spam_2.11-1 spatstat.sparse_3.1-0 reticulate_1.44.1
#> [28] cowplot_1.2.0 pbapply_1.7-4 DBI_1.2.3
#> [31] RColorBrewer_1.1-3 lubridate_1.9.4 abind_1.4-8
#> [34] Rtsne_0.17 purrr_1.2.0 nnet_7.3-20
#> [37] rappdirs_0.3.3 ipred_0.9-15 lava_1.8.2
#> [40] data.tree_1.2.0 ggrepel_0.9.6 irlba_2.3.5.1
#> [43] spatstat.utils_3.2-0 listenv_0.10.0 goftest_1.2-3
#> [46] RSpectra_0.16-2 spatstat.random_3.4-3 fitdistrplus_1.2-4
#> [49] parallelly_1.45.1 codetools_0.2-20 DelayedArray_0.37.0
#> [52] tidyselect_1.2.1 farver_2.1.2 spatstat.explore_3.6-0
#> [55] BiocFileCache_3.1.0 jsonlite_2.0.0 caret_7.0-1
#> [58] e1071_1.7-16 progressr_0.18.0 ggridges_0.5.7
#> [61] survival_3.8-3 iterators_1.0.14 foreach_1.5.2
#> [64] tools_4.6.0 ica_1.0-3 Rcpp_1.1.0.8.1
#> [67] glue_1.8.0 gridExtra_2.3 prodlim_2025.04.28
#> [70] SparseArray_1.11.8 xfun_0.54 dplyr_1.1.4
#> [73] withr_3.0.2 BiocManager_1.30.27 fastmap_1.2.0
#> [76] digest_0.6.39 timechange_0.3.0 R6_2.6.1
#> [79] mime_0.13 scattermore_1.2 tensor_1.5.1
#> [82] spatstat.data_3.1-9 dichromat_2.0-0.1 RSQLite_2.4.5
#> [85] tidyr_1.3.1 data.table_1.17.8 recipes_1.3.1
#> [88] class_7.3-23 httr_1.4.7 htmlwidgets_1.6.4
#> [91] S4Arrays_1.11.1 uwot_0.2.4 ModelMetrics_1.2.2.2
#> [94] pkgconfig_2.0.3 gtable_0.3.6 timeDate_4051.111
#> [97] blob_1.2.4 lmtest_0.9-40 S7_0.2.1
#> [100] XVector_0.51.0 htmltools_0.5.9 dotCall64_1.2
#> [103] scales_1.4.0 png_0.1-8 spatstat.univar_3.1-5
#> [106] gower_1.0.2 knitr_1.50 reshape2_1.4.5
#> [109] nlme_3.1-168 curl_7.0.0 proxy_0.4-27
#> [112] cachem_1.1.0 zoo_1.8-14 stringr_1.6.0
#> [115] BiocVersion_3.23.1 KernSmooth_2.23-26 parallel_4.6.0
#> [118] miniUI_0.1.2 AnnotationDbi_1.73.0 pillar_1.11.1
#> [121] grid_4.6.0 vctrs_0.6.5 RANN_2.6.2
#> [124] promises_1.5.0 dbplyr_2.5.1 xtable_1.8-4
#> [127] cluster_2.1.8.1 evaluate_1.0.5 cli_3.6.5
#> [130] compiler_4.6.0 rlang_1.1.6 crayon_1.5.3
#> [133] future.apply_1.20.1 plyr_1.8.9 stringi_1.8.7
#> [136] deldir_2.0-4 viridisLite_0.4.2 Biostrings_2.79.2
#> [139] lazyeval_0.2.2 spatstat.geom_3.6-1 Matrix_1.7-4
#> [142] RcppHNSW_0.6.0 patchwork_1.3.2 bit64_4.6.0-1
#> [145] future_1.68.0 ggplot2_4.0.1 KEGGREST_1.51.1
#> [148] shiny_1.12.1 AnnotationHub_4.1.0 kernlab_0.9-33
#> [151] ROCR_1.0-11 igraph_2.2.1 memoise_2.0.1
#> [154] bslib_0.9.0 bit_4.6.0 ape_5.8-1