This function creates a pre-defined sparse expression set object from a data matrix of one of the following classes: "dgCMatrix", "dgTMatrix", "dgeMatrix", "matrix", "data.frame". Users can supply their own meta data for both cells (parameter cellData) and genes (parameter featureData). If addMetaData = TRUE, the function also generates meta data for both automatically. The automatically generated meta data includes:

  • "nUMI": number of total UMIs in each cell, only valid when the values in data matrix are raw UMI counts;

  • "nFeature": number of expressed features/genes in each cell;

  • "pctMito": percentage of UMIs of mitochondrial genes (defined by "^mt-|^MT-") in each cell;

  • "pctSpikeIn": percentage of UMIs of spike-in RNAs (defined by "^ERCC-|^Ercc-") in each cell;

  • "nCell": number of cells that each feature/gene was identified in.
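The statistics listed above can be approximated by hand, which helps when checking the generated values. The following is a minimal sketch on a made-up toy matrix, not the package's internal code: the gene names and counts are hypothetical, and whether the package reports pctMito and pctSpikeIn as fractions or as values scaled by 100 is not stated here, so the sketch assumes fractions.

```r
library(Matrix)

# Toy raw-count matrix (hypothetical values): genes in rows, cells in columns.
counts <- Matrix(
  c(5, 0, 2,
    0, 3, 0,
    1, 1, 0,
    4, 0, 0),
  nrow = 4, byrow = TRUE, sparse = TRUE,
  dimnames = list(c("MT-CO1", "ERCC-00002", "CD3E", "mt-Nd1"),
                  c("cell1", "cell2", "cell3"))
)

# Flag mitochondrial and spike-in features with the patterns given above.
is_mito  <- grepl("^mt-|^MT-", rownames(counts))
is_spike <- grepl("^ERCC-|^Ercc-", rownames(counts))

# Per-cell statistics.
nUMI <- colSums(counts)
cell_stats <- data.frame(
  nUMI       = nUMI,
  nFeature   = colSums(counts > 0),
  # Assumption: expressed as a fraction of nUMI, not multiplied by 100.
  pctMito    = colSums(counts[is_mito, , drop = FALSE]) / nUMI,
  pctSpikeIn = colSums(counts[is_spike, , drop = FALSE]) / nUMI
)

# Per-feature statistic: number of cells with a non-zero count.
feature_stats <- data.frame(nCell = rowSums(counts > 0),
                            row.names = rownames(counts))
```

Because every statistic except nFeature and nCell is derived from column sums of counts, these values are only meaningful when the matrix holds raw UMI counts, as noted for nUMI.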

Usage

createSparseEset(
  input_matrix,
  do.sparseConversion = TRUE,
  cellData = NULL,
  featureData = NULL,
  annotation = "",
  projectID = NULL,
  addMetaData = TRUE
)

Arguments

input_matrix

A data matrix with Features/Genes as the rows and Cells as the columns. It should be one of: 'dgCMatrix', 'dgTMatrix', 'dgeMatrix', 'matrix', 'data.frame'.

do.sparseConversion

Logical, whether to convert the input_matrix to a sparse matrix if it is not already one. Default: TRUE.

cellData

A data frame containing meta data of cells or NULL. Its row.names should be consistent with the colnames of input_matrix. Default: NULL.

featureData

A data frame containing meta data of features or NULL. Its row.names should be consistent with the row.names of input_matrix. Default: NULL.

annotation

Character, a string describing the project properties. It's highly recommended to use the path to project space. Default: "".

projectID

Character or NULL, the project name of the sparse eset object. Default: NULL.

addMetaData

Logical, whether to calculate and add extra statistics (a.k.a. meta data) to cells and features. Default: TRUE.
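Before calling the function, it can help to prepare the inputs so that they satisfy the argument descriptions above. The sketch below uses hypothetical toy objects (expr_df, cell_meta) and reads "consistent" row names as "same names, in the same order", which is an assumption rather than a documented requirement. The conversion shown with Matrix::Matrix() is one way to obtain a sparse matrix up front and is not necessarily what do.sparseConversion does internally.

```r
library(Matrix)

# Hypothetical toy inputs: a gene-by-cell data frame and a cell annotation table
# whose rows are in a different order from the matrix columns.
expr_df <- data.frame(
  c1 = c(0, 2, 0),
  c2 = c(1, 0, 0),
  row.names = c("geneA", "geneB", "geneC")
)
cell_meta <- data.frame(label = c("T", "B"), row.names = c("c2", "c1"))

# Convert to a sparse matrix; as.matrix() keeps the row and column names.
sparse_counts <- Matrix(as.matrix(expr_df), sparse = TRUE)

# Every cell in the matrix should have a row in the annotation table.
missing_cells <- setdiff(colnames(sparse_counts), rownames(cell_meta))
stopifnot(length(missing_cells) == 0)

# Reorder the annotation so that its rows follow the matrix columns.
cell_meta <- cell_meta[colnames(sparse_counts), , drop = FALSE]
```

Objects prepared this way could then be supplied as input_matrix and cellData; the same check applies to featureData against the row names of the matrix.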

Value

A sparse eset object with three slots: 1) gene by cell matrix; 2) data frame of cell information; 3) data frame of feature/gene information.

Examples

data("pbmc14k_rawCount")
## 1. Create SparseEset object solely from raw count matrix
pbmc14k_raw.eset <- createSparseEset(input_matrix = pbmc14k_rawCount,
                                     projectID = "PBMC14k",
                                     addMetaData = TRUE)
#> Creating sparse eset from the input_matrix ...
#> 	Adding meta data based on input_matrix ...
#> Done! The sparse eset has been generated: 17986 genes, 14000 cells.

## 2. Create SparseEset with customized meta data
true_label <- read.table(system.file("extdata/demo_pbmc14k/PBMC14k_trueLabel.txt.gz", package = "scMINER"),
                         header = TRUE, row.names = 1, sep = "\t", quote = "", stringsAsFactors = FALSE)
pbmc14k_raw.eset <- createSparseEset(input_matrix = pbmc14k_rawCount,
                                     cellData = true_label,
                                     featureData = NULL,
                                     projectID = "PBMC14k",
                                     addMetaData = TRUE)
#> Creating sparse eset from the input_matrix ...
#> 	Adding meta data based on input_matrix ...
#> Done! The sparse eset has been generated: 17986 genes, 14000 cells.