In this article we overview each scPipeline module, and provide details on how to specify parameters for analysis. For each module, we provide a description, example reports and documentation for implementation (i.e., parameter specification).

Prior to running any scPipeline module, please visit our getting started with scPipeline article to setup up scPipeline.

scPipeline Parameter specification

In general, our convention is to store all analysis parameters in a list named parameter.list at the start of the script.

# example
parameter.list <- list(input.file = "../data/demo.rds", ..., print.inline = F, save.flag = T)

In all scPipeline modules, parameter.list is already specified, so users need only to configure the analysis parameters (e.g., specify input data, set cluster resolution, etc.) prior to running the module. In general, the default settings will yield reasonable results, however a little time spent optimizing parameters will often yield better outcomes.


Module 1: Quality Control (QC) & Preprocessing

Description

Initial appraisal of scRNA-seq count matrix data and overview of quality control metrics, data filtering, and normalization. Generates initial UMAP clusters, and PCA analyses. A preprocessed seurat object is saved as output.

Quality control: Cells are filtered based on genes/cell recovered and fraction of counts belonging to mitochondrial genes. Injured/broken cells are characterized by low gene recovery and high mitochondrial content whereas doublets are characterized by extremely high gene recovery. Based on these behaviors, high quality non-injured single cells can be retained for downstream analysis using the gene.upperlimit, gene.lowerlimit, and mt.upperlimit analysis parameters.

Count Normalization: Count matrices are normalized using SCTransform. In brief, the count data are modeled using regularized negative binomial regression to remove the influence of technical covariates, including sequencing depth and mitochondrial content. The resulting model residuals are treated as normalized count data for downstream analysis.

Dimensional Reduction: Following normalization, select features (i.e., genes) are used to reduce the dimensionality of the data set. This effectively eases downstream computational burden, increases the signal-to-noise and allows for data visualization. We perform principal component analysis and use the top N principal components that explain 90% (user-defined; pca.var.threshold) variance for Louvain clustering and UMAP embedding. The UMAP algorithm embeds reduced data (e.g., principal components) into a lower dimensional representation while preserving the global structure of the transcriptomic data to facilitate visualization.

Example Reports

For all the example reports shown here, datasets were subsampled to 10,000 cells and minimal cell filtering was performed due to original count matrices having been cleaned by authors.

Inputs and Outputs

show contents

Inputs

One of the following data formats must be provided as input:

  • Seurat object: must contain RNA assay with gene x cell count matrix
  • Cell Ranger output: folder must contain barcodes.tsv, genes.tsv, matrix.mtx
  • gene x cell count matrix

Outputs

  • Seurat object (.Rdata file)
  • Analysis report (.HTML file)

Analysis Parameters

show table
Parameter Description Argument
input.object path to seurat object character
input.cell_ranger path to folder containing cell ranger output character
input.matrix path gene x cell count matrix (.tsv, .csv format) character
save.flag save Seurat object from current analysis logical
output.object.directory path to folder in which to save Seurat object character
output.object.file name of file to which Seurat object is saved. Must have “.Rdata” suffix (e.g., ‘object.Rdata’) character
subsample.n value specifying how many cells to sub sample. Set to NA if none. numeric
cluster.resolution value of the resolution parameter passed to Seurat::FindClusters(...). Use a value above (below) 1.0 if you want to obtain a larger (smaller) number of communities. Default is 1. numeric
gene.upperlimit Upper limit of genes per cell. Value between [0, Inf], but more than gene.lowerlimit. Any cells exceeding this threshold are filtered out. Default is 9000. numeric
gene.lowerlimit Lower limit of genes per cell. Value between [0, Inf], but less than gene.upperlimit. Any cells below this threshold are filtered out. Default is 200. numeric
mt.upperlimit Upper limit of mitochondrial content per cell. Value between [0, 100]. Cells exceeding this threshold are filtered out. Default is 10. numeric
unmatched.rate.filter.flag Logical specifying whether to filtered cells using unmatched rates. Default is TRUE.
vars2regress Variables to regress out during scaling step. Default is ‘percent.mt’. vector of character(s)
correct.artifact identify artifact cells and omit from dimensional reduction. Default is TRUE. logical
feature.select.method Feature selection method for dimensional reduction. - hvg (default): highly-variable genes
- deviance
scale.method Expression scaling method. - sct (default): SCTransform residuals
- null_residuals
pca.method principal component analysis method. - pca (default)
- glmpca
pca.component.select.method method for selecting number of principal components for clustering and UMAP embedding. - cum_var (default)
- elbow
pca.var.threshold value between [0, 1] specifying cumulative variance explained by pca. Ignored if pca.component.select.method is not ‘cum_var’. numeric
conserve.memory If TRUE, the residual matrix for all genes computed using SCTransform is never created in full; recommended for large data sets, but will take longer to run. Default is FALSE logical
species.filter.flag Logical specifying whether to filter species. Default is FALSE. logical
species.include Specify which species to retain in analysis. Ignored if species.filter.flag is FALSE - Hs: human
- Mm: murine
nworkers Number of workers used for parallel implementation of SCTransform. Default is 1. numeric
save.pdf Logical specifying whether to save figures and tables in separate directory. logical
print.inline print figures after each code chunk. Recommended if running script chunk-by-chunk. Default is FALSE. logical
developer Developer mode. Default is FALSE. logical

Module 2: Data Integration

Description

Batch correction and integration of independent experiments. Generates a new Seurat integrated assay using one of four approaches:

  1. rPCA: reciprocal principal component analysis (Seurat)
  2. CCA: canonical correlation analysis + mutual nearest neighbors-based integration (Seurat)
  3. BBKNN: batch-balanced K-nearest neighbors-based integration
  4. Scanorama: SVD + Mutual nearest neighbors-based integration

See Stuart et al. (2019). Cell for a description of rPCA and CCA, Polański et al. (2020). Bioinformatics for BBKNN, and Hie et al. (2019). Nature Biotechnology for Scanorama.

Our selection of integration methods is based on the benchmarking efforts published by Luecken et al. (2021). Nature Methods

The primary aim of data integration is to identify common cell populations between datasets. However, because many integration methods involve non-linear warping, many statistical assumptions are broken in the process and the count matrix in the integrated assay is not appropriate for downstream differential gene expression. If downstream differential expression (DEG) analyses are intended, use counts from the data slot in the SCT assay; Data are processed using PrepSCTFindMarkers() thereby ensuring that counts have been corrected for varying sequencing-depth across data sets.

Analysis Parameters

In addition to specifying the parameter.list (analysis parameters), the Integration Module additionally requires a separate list, named input.data, to specify the data sets used for integration. See below for details and examples.

show parameter.list
Parameter Description Argument
save.file name of output file character
integration.feature.select.method method used to select integration features - hvg (default): highly-variable genes
- deviance
integration.k.filter number of neighbors (k) to use when filtering anchors. Default is 200. Passed to Seurat::FindIntegrationAnchors(...). Ignored if integration.method = “BBKNN” or “Scanorama”. numeric
integration.k.anchor number of neighbors (k) to use when picking anchors. Dictates strength of alignment. 5 is recommended, 20 suggested for stronger alignments. Passed to Seurat::FindIntegrationAnchors(...). Ignored if integration.method = “BBKNN” or “Scanorama”. numeric
integration.k.weight number of neighbors to consider when weighting anchors. Passed to Seurat::IntegrateData(...). Default is 100. Ignored if integration.method = “BBKNN” or “Scanorama”. numeric
integration.n.genes number of genes used for integration. 1000-3000 recommended. numeric
integration.method integration method. - rPCA
- CCA
- BBKNN
- Scanorama
integration.limit.memory Specify whether to limit max memory allocated during integration. Default is T. Ignored if integration.method = “BBKNN” or “Scanorama”. logical
integration.max.memory max memory allocated during integration, in terms of Gb. Ignored if integration.limit.memory = F. Default is 150. Ignored if integration.method = “BBKNN” or “Scanorama”. numeric
integration.cluster.resolution value of the resolution parameter passed to Seurat::FindClusters(...), which is run after data integration. Use a value above (below) 1.0 if you want to obtain a larger (smaller) number of communities. Default is 1. numeric
integration.compute.diversity Compute change in population diversity before and after integration. Default is T logical
pca.var.threshold proportion of variance used for clustering and UMAP embedding. 0.9 is recommended. numeric [0,1]
correct.artifact omit artifact genes. Default is F logical
vars2regress Variables to regress out during scaling step. Default is ‘percent.mt’. vector of character(s)
correct.artifact identify artifact genes (i.e., genes detected in only one sample) and omit from integration step. Recommended only if integrated datasets have similar cellular compositions. Default is F logical
n.workers number of workers to use for parallel implementation. Provided as list, with each named entry corresponding to different task (e.g., n.workers = list(import = 4, integration = 1)) list
print.inline print figures after each code chunk. Recommended if running script chunk-by-chunk. Default is FALSE. logical
save.pdf Logical specifying whether to save figures and tables in separate directory. logical
update.log update analysis log. Ignored if developer = F. logical
rprofile.dir look for input file in path specified in .Rprofile. Default is FALSE. logical
developer Developer mode. Default is FALSE. logical
show input.data

The input.data list is a named list, with each entry corresponding to a separate input dataset. For a basic use, given 2 or more seurat objects for integration, we can specify the path to the input data using the file argument as follows:

# basic input specification
input.data <- list(
  dataset1 = list(file = "dataset1.Rdata"),    # path to .Rdata file containing seurat object named 'so'
  dataset2 = list(file = "dataset2.Rdata"),
  dataset3 = list(file = "dataset3.Rdata")
)

Alternatively, if a given data set contains multiple batches of data, we can partition the seurat object into subsets which will then be handled as independent datasets in the downstream integration step. This can be accomplished using the integrate.by argument, which specifies which feature to use to split the seurat object. The feature specified by integrate.by must be present in the metadata of the corresponding seurat object. In the example below, we split dataset 1 using the Barcode feature prior to integration:

# spliting seurat object into subset prior to integration
input.data <- list(
  dataset1 = list(
    file = "dataset1.Rdata",        # path to .Rdata file containing seurat object named 'so'
    integrate.by = "Barcode"        # meta data features specifying sample labels
  ),       
  dataset2 = list(
    file = "dataset2.Rdata"
  ),
  dataset3 = list(
    file = "dataset3.Rdata"
  )
)

Users can also filter their inputs prior to integration using the meta.feature.name and meta.feature.filter arguments. For example, if dataset 1 consists of multiple cell types, and we are interested in only integrating the immune compartment, we can set meta.feature.filter to “lymphoid|myeloid” and meta.feature.name to “cell.type”, with the latter corresponding to the meta data column present in the seurat object:

# filtering seurat object prior to integration
input.data <- list(
  dataset1 = list(
    file = "dataset1.Rdata",                   # path to .Rdata file containing seurat object named 'so'
    meta.feature.name = "cell.type",           # meta data column used to filter data
    meta.feature.filter = "lymphoid|myeloid"   # features to include for downstream integration
  ),       
  dataset2 = list(
    file = "dataset2.Rdata"
  ),
  dataset3 = list(
    file = "dataset3.Rdata"
  )
)

Note that the filter uses non-exact matching, so please be aware that substrings may unintentionally filter out data of interest if incorrectly specified. For example, “cell_type_1” will filter “cell_type_1”, “cell_type_10”, “cell_type_11”, etc. A few additional examples for filtering:

  • meta.feature.filter = c(“Cell_A”, “Cell_B”) will match “Cell_A1”, “Cell_A2” and “Cell_B1”, but not “Cell_C1”.
  • meta.feature.filter = c(“Cell_A|Cell_B”) will NOT match “Cell_A1”, “Cell_A2” or “Cell_B1”
  • meta.feature.filter = c(“Cell”) will match “Cell_A1”, “Cell_A2”, “Cell_B1”, and “Cell_C1”.

Module 3: Cluster Optimization

Description

Clustering is performed at several candidate resolutions, and the optimal resolution is informed using:

  • Marker specificity criterion (CDI-based)
    • Measure of how well differentially-expressed genes can discriminate clusters at a given resolution.
    • Values range from 0 (no specific markers exist for any cluster) to 1 (specific markers exist for each cluster)
  • Nearest-neighbor purity criterion
    • Seeks to optimize the cluster consistency, or purity, within each cell cluster by maximizing the similarity of cells within clusters compared to other clusters (analogous to silhouette width)
    • For each cluster resolution, cell-level purity is defined as the proportion of cells within cell i’s K-nearest neighborhood that belong to the most represented cluster within that neighborhood. Then, cell-level purity scores are aggregated as means to yield the global purity score.
    • Values range from 0 (impure) to 1 (pure).
    • Neighborhoods in which cells belong to many different clusters are considered impure (low purity score) whereas neighborhoods in which cells belong to a single cluster are pure (high purity score). Clustering configurations with purity scores exceeding 0.8-0.9 are recommended.
  • Silhouette width criterion
    • Measure of how similar cells are within their cluster compared to other clusters.
    • The silhouette width is quantified in UMAP space, so clusters that are well-separated on a UMAP will yield higher silhouette widths than those that are poorly separated.
    • Values range from 0 (inconsistent) to 1 (consistent).
  • Differential expression (DE) analyses (Wilcoxon, CDI)
    • Differentially-expressed genes (DEGs) are computed for each cluster at each resolution using the Wilcoxon and Co-dependency Index (CDI) methods.
    • Wilcoxon: Presto, an ultrafast DE algorithm, is used to compute auROC and Wilcoxon p-value based on Gaussian approximation. Identifies graded and binary DEGs.
    • Co-dependency Index (CDI): Cluster-Gene co-dependency profiles are computed using the co-dependence index. Specifically identifies binary DEGs.

Clustering is performed using the Seurat implementation of FindNeighbors() and FindClusters(). In brief, a cell X cell shared nearest neighbor (SNN) graph is first constructed by determining the neighborhood overlap (Jaccard Index) between every cell and its kth nearest neighbor. Then, the SNN graph is clustered using the Louvain algorithm. The main parameter affecting the number of clusters generated is the resolution. Higher resolutions yield more clusters, whereas lower resolution yield fewer clusters.

Inputs and Outputs

show contents

Inputs

  • Seurat object (.Rds or .RData file)
    • If an .RData file is provided, the seurat object must be named so.

Outputs

  • Analysis report (.HTML file)

Analysis Parameters

show table
Parameter Description Argument
input.file path to seurat object character
cluster.resolution range of resolution values used for evaluating candidate cluster configurations. Recommended = c(0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 3) vector of numeric(s)
lfc.threshold log-fold change threshold used in differential-expression analysis. 0.5 is recommended. numeric
fdr.threshold fdr-value significance threshold. Value between [0,1]. Default is 0.05. numeric
group.singletons Group singletons into nearest cluster. If FALSE, assign all singletons to a “singleton” group. For certain difficult-to-cluster data sets, group.singletons = FALSE may yield faster results. Default is FALSE logical
print.inline print figures after each code chunk. Recommended if running script chunk-by-chunk. Default is FALSE. logical
subsample_factor Value between [0,1] specifying fraction of cells to subsample. Use 1 for no subsampling. numeric
subsample_n value specifying how many cells to sub sample. Set to NA if none. numeric
n.workers number of workers used for parallel implementation. parallel::detectCores() is recommended. numeric
only.pos only included genes with logFC > 0. Default is TRUE. logical
save.pdf save figures and tables logical
update.log update analysis log. Ignored if developer = F. logical
rprofile.dir look for input file in path specified in .Rprofile. Default is FALSE. logical
developer Developer mode. Default is FALSE. logical

Module 4: Cell-Type Annotation

Description

Annotation of cell populations identified by unsupervised clustering. Cell-type markers are scored using the Miko scoring pipeline.

Prior to running this module, we recommend running the cluster optimization module to identify the appropriate resolution for clustering.

Inputs and Outputs

show contents

Inputs

  • Seurat object (.Rds or .RData file)
    • If an .RData file is provided, the seurat object must be named so.

Outputs

  • Analysis report (.HTML file)

Analysis Parameters

show table
Parameter Description Argument
input.file path to seurat object character
external.markers cell-type markers used for annotation named list; names are cell-types, entries are genes.
E.g., list(cell_A = c(“gene1”, “gene2”, “gene3”), cell_B = c(“gene4”, “gene5”, “gene6”))
internal.markers name of scMiko internal gene set(s). Run names(geneSets) for list of available gene sets. character
marker.filter filter marker sets using grepl pattern matching. E.g., Manno2021 for all La Manno 2021 atlas-derived markers in Cell_Catalogy stored in scMiko geneSets. character
coherence.threshold value of minimum gene set coherence. Gene sets with coherence scores below this threshold are disqualified from consideration. Value must be between [0,1]. Default is 0.8. numeric
p.threshold p-value significance threshold. Value between [0,1]. Default is 0.05. numeric
do.center center scores by null model predictions. FALSE is recommended logical
frequent.filter.threshold path to seurat object character
filter.frequent.fliers path to seurat object character
min.representation minimum fraction of genes required for gene set consideration. If fraction of genes present in seurat object is below this threshold, gene set is disqualified. Value between [0,1]. Default is 0.5. numeric
cluster.resolution value of the resolution parameter passed to Seurat::FindClusters(...). Use a value above (below) 1.0 if you want to obtain a larger (smaller) number of communities. Default is 1. numeric
fdr.correction apply Benjamini & Hochberg correction logical
clean.clusters filter cells that exceed 3 median absolute deviations from cluster center in UMAP coordinate space. Default is FALSE. logical
max.cells max number of cells used in Miko scoring pipeline. Default is 20000. numeric
subsample_factor value specifying fraction of cells to sub sample. Value between [0,1] numeric
min.geneset.size maximal gene set size. Default is 200. numeric
max.geneset.size minimal gene set size. Default is 3. numeric
pathway.db database used for functional annotation of gene sets - GO
- Bader
n.workers number of workers used for parallel implementation. Default is parallel::detectCores(). numeric
save.pdf save figures and tables. logical
show.top.n.gene.panels max number of gene sets per cluster to show in report. Default is 5 numeric
max.gene.panels.allowed max number of overall gene sets to show in report. Default is 75 numeric
print.inline print figures after each code chunk. Recommended if running script chunk-by-chunk. Default is FALSE. logical
update.log update analysis log. Ignored if developer = F. logical
rprofile.dir look for input file in path specified in .Rprofile. Default is FALSE. logical
developer Developer mode. Default is FALSE. logical

Module 5: Gene Program Discovery

Description

Gene program/module identification. Three independent methods are implemented to identify gene programs based on gene co-expression profiles:

  • Scale-free shared nearest neighbor network analysis (SSN)
  • Independent component analysis (ICA)
  • Non-negative matrix factorization (NMF)

Gene set enrichment is performed on each gene program to facilitate functional annotation.

Inputs and Outputs

show contents

Inputs

  • Seurat object (.Rds or .RData file)
    • If an .RData file is provided, the seurat object must be named so.

Outputs

  • Analysis report (.HTML file)

Analysis Parameters

show table
Parameter Description Argument
input.file path to seurat object character
cluster.resolution Value of the resolution parameter, use a value above 1.0 if you want to obtain a larger number of communities. numeric
subsample_factor Value between [0,1] specifying fraction of cells to subsample. Use 1 for no subsampling. numeric
subsample_n value specifying how many cells to sub sample. Set to NA if none. numeric
subset.data Subset Seurat object according to specified meta data field. Only specified meta data entries are retained, while remaining of data is omitted. Set as “no.subset” for not subsetting. data frame specifying which field (df$field) to subset on, and which field entries to retain (df$subgroups)
batch.feature meta data field to use for batch correction (set to NA if unspecified) character
feature.selection.method method used to select features for module detection analysis. - expr: expression-based criterion
- hvg: highly-variable genes
- deviance: deviance-based criterion
max.features max number of features to include in module detection analysis (NA if unspecified)
min.pct minimal feature expression (applicable only if select.method = “expr”). Value between [0,1], use 0.1 for large network and 0.5 for small network. numeric
data.type Data representation used for module detection - pearson: based on SCTransform (recommended)
- deviance: based on multinomial null model
purity.optimization.step.size value of step size used for optimizing module purity. 0.5 recommended. numeric
filter.parameters specify which clusters to include or omit named list with 2 sets of entries; ‘include’ entries specify which clusters to retain; ‘omit’ entries specify which clusters to omit.
E.g., list(include = c(0,1,2), omit = c(3,4,5))
general.weight.by.var Weight feature embeddings by the variance of each PC. Default is FALSE. logical
general.pca.cum.sum cumulative variance explained by PCs used for SSN analysis. Value between (0,1], 0.9 is recommended. numeric
general.umap.knn number of neighboring points used in local approximations of manifold structure. Larger values will result in more global structure being preserved at the loss of detailed local structure. In general this parameter should often be in the range 5 to 50. numeric
general.cluster.purity target purity of network modules. Value between [0,1], 0.8 is recommended. logical
general.scale.free.topology apply scale-free transform to shared nearest neighbor graph. Default is TRUE. logical
general.ica perform independent component analysis (ICA). logical
general.nmf perform non-negative matrix factorization (NMF). logical
general.nmf.k number of latent factors to compute with NMF. Can be a range of values (e.g., c(5, 10, 15)) vector of numeric(s)
general.robust.pca use robust PCA for dimensional reduction (warning: computationally intensive) logical
n.workers number of workers used for parallel implementations. numeric
print.inline print figures after each code chunk. Recommended if running script chunk-by-chunk. Default is FALSE. logical
save.pdf save figures and tables logical
update.log update analysis log. Ignored if developer = F. logical
rprofile.dir look for input file in path specified in .Rprofile. Default is FALSE. logical
developer Developer mode. Default is FALSE. logical



Module 6: Gene Expression and Associations

Description

Given a query gene or gene set, scRNAseq expression is evaluated and gene similarity metrics are computed to identify associated genes.

Inputs and Outputs

show contents

Inputs

  • Seurat object (.Rds or .RData file)
    • If an .RData file is provided, the seurat object must be named so.

Outputs

  • Analysis report (.HTML file)

Analysis Parameters

show table
Parameter Description Argument
input.file path to seurat object character
features features used for analysis.
If feature.format = 1, features must be vector of features present in seurat object (e.g., c(“PRRX1”, “TCF4”)).
If feature.format = 2, features must be name of column spreadsheet containing features (e.g., “HALLMARK_genes”).
If feature.format = 3, features argument is ignored.
feature.format format of features provided by user. - 0: user-provided genes
- 1: features are stored in spreadsheet.
- 2: discovery mode.
feature.csv.path Path to spreadsheet containing features. Required if using feature.format = 1. For intended use, features in spreadsheet must be organized in columns that can be referenced by the column header using the features argument. character
discovery.method Differential expression algorithm used in discovery mode (feature.format = 2) - Wilcoxon: Identifies DEGs using Wilcoxon Rank Sum test genes
- CDI: Identifies DEGs using co-dependency test
- Gini: Identifies group-specific features using gini inequality index
discovery.pct.min Features that are expressed below this value are omitted from differential expression analysis. Range between [0,1], 0.1 is recommended. Required only if feature.format = 2. numeric
discovery.group.by Name of object meta data field to group cells by when performing differential expression. E.g., “seurat_clusters”. Required only if feature.format = 2. character
discovery.top.n path to seurat object character
discovery.max.n minimum fraction of genes required for gene set consideration. If fraction of genes present in seurat object is below this threshold, gene set is disqualified. Value between [0,1]. Default is 0.5. numeric
as.module aggregate features into single meta-program. Default is FALSE. logical
cor.method feature association method - pearson: Pearson correlation
- spearman: Spearman correlation
- rho_p: Proportionality constant
- cdi: co-dependency index
cor.top.n Number of top associated features to show. 70 is recommended. numeric
cor.which.data aslot to use for feature association analysis. “data” is recommended. - data
- scale
cor.min.expr Features that are expressed below this value are omitted from feature association analysis. Range between [0,1], 0.025-0.1 is recommended. numeric
general.cluster.resolution Value of the resolution parameter, use a value above 1.0 if you want to obtain a larger number of communities. numeric
general.subsample.factor Value between [0,1] specifying fraction of cells to subsample. Use 1 for no subsampling. numeric
general.subset Subset Seurat object according to specified meta data field. Only specified meta data entries are retained, while remaining of data is omitted. Set as “no.subset” for not subsetting. data frame specifying which field (df$field) to subset on, and which field entries to retain (df$subgroups)
general.clean.clusters omit cells exceeding 3 median absolute deviations from cluster center in UMAP coordinate space. logical
general.save.pdf save figures and tables logical
general.print.inline print figures after each code chunk. Recommended if running script chunk-by-chunk. Default is FALSE. logical
general.rprofile.dir look for input file in path specified in .Rprofile. Default is FALSE. logical
general.barcode.recode Named list where names are new labels, and entries are old labels. If is provided for old labels, treated as wild card. e.g. list(in.vitro = c(“invitro), in.vivo = NA). See scMiko::recordBarcode(...) for details. Named list where names are new labels, and entries are old labels.
expression.do.hex show UMAP expression using schex. Default is FALSE logical
n.workers number of workers used for parallel implementation. Set to 1 for non-parallel run. numeric
developer Developer mode. Default is FALSE. logical

Session Info
## R version 4.2.0 (2022-04-22 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19044)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Canada.utf8  LC_CTYPE=English_Canada.utf8   
## [3] LC_MONETARY=English_Canada.utf8 LC_NUMERIC=C                   
## [5] LC_TIME=English_Canada.utf8    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] rstudioapi_0.13   knitr_1.39        magrittr_2.0.3    R6_2.5.1         
##  [5] ragg_1.2.2        rlang_1.0.2       fastmap_1.1.0     stringr_1.4.0    
##  [9] tools_4.2.0       xfun_0.31         cli_3.3.0         jquerylib_0.1.4  
## [13] htmltools_0.5.2   systemfonts_1.0.4 yaml_2.3.5        digest_0.6.29    
## [17] rprojroot_2.0.3   pkgdown_2.0.3     textshaping_0.3.6 purrr_0.3.4      
## [21] formatR_1.12      sass_0.4.1        fs_1.5.2          memoise_2.0.1    
## [25] cachem_1.0.6      evaluate_0.15     rmarkdown_2.14    stringi_1.7.6    
## [29] compiler_4.2.0    bslib_0.3.1       desc_1.4.1        jsonlite_1.8.0