Reads gene expression data from an HDF5 file generated by the 10x Genomics Cell Ranger pipeline. The function automatically distinguishes between data modalities (e.g. gene expression data, ATAC data) and retains only the gene expression data. The **hdf5r** package is required to use this function.
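Because the function depends on **hdf5r**, it can help to confirm that the package is available before calling it. The check below is a minimal sketch using base R only; the variable name `has_hdf5r` is illustrative and not part of scMINER.

```r
# Check whether hdf5r is installed without attaching it
has_hdf5r <- requireNamespace("hdf5r", quietly = TRUE)

if (!has_hdf5r) {
  # Only a hint is printed here; installation is left to the user
  message("hdf5r is not installed; run install.packages(\"hdf5r\") first.")
}
```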

Usage

readInput_10x.h5(
  h5_file,
  featureType = "gene_symbol",
  removeSuffix = TRUE,
  addPrefix = NULL
)

Arguments

h5_file

Character, path to the HDF5 file generated by the 10x Genomics Cell Ranger pipeline.

featureType

Character, the feature type to use for gene names (row names) in the expression matrix: "gene_symbol" (the default) or "gene_id".

removeSuffix

Logical, whether to remove the "-1" suffix when it is present in all cell barcodes. Default: TRUE.

addPrefix

Character or NULL, a prefix to add to the cell barcodes, such as a sample ID. A prefix that contains only letters and/or numbers and does not start with a number is highly recommended. Default: NULL.
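The effect of `removeSuffix` and `addPrefix` on cell barcodes can be sketched in base R. This is an illustration of the documented behaviour, not scMINER's actual implementation: the helper `tidy_barcodes()`, the example barcodes, and the `"_"` separator between prefix and barcode are all assumptions.

```r
# Hypothetical barcodes in the style of 10x output
barcodes <- c("AAACCTGAGAAACCAT-1", "AAACCTGAGAAACGAG-1", "AAACCTGAGAAACGCC-1")

# Hypothetical helper, NOT part of scMINER: mimics the documented arguments
tidy_barcodes <- function(barcodes, removeSuffix = TRUE, addPrefix = NULL) {
  # Strip "-1" only when every barcode ends with it
  if (removeSuffix && all(grepl("-1$", barcodes))) {
    barcodes <- sub("-1$", "", barcodes)
  }
  # Prepend the prefix; the "_" separator is an assumption
  if (!is.null(addPrefix)) {
    barcodes <- paste(addPrefix, barcodes, sep = "_")
  }
  barcodes
}

cleaned <- tidy_barcodes(barcodes, addPrefix = "demoSample")
print(cleaned)

# Mixed suffixes are left untouched because "-1" is not present in all barcodes
mixed <- tidy_barcodes(c("AAAC-1", "AAAG-2"))
print(mixed)
```

A prefix made of letters and numbers that starts with a letter, as recommended above, keeps the resulting barcodes usable as syntactically valid column names in downstream R objects.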

Value

A sparse gene expression matrix of raw UMI counts, with genes as rows and cells as columns.
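To show what a genes-by-cells sparse matrix looks like, the sketch below builds a toy one with the **Matrix** package. The gene names, cell names, and counts are made up, and the exact matrix class returned by `readInput_10x.h5()` is not stated here, so treat the `dgCMatrix` produced by this example as an assumption.

```r
library(Matrix)

# Toy counts: 3 genes x 2 cells; only the non-zero entries are stored
counts <- sparseMatrix(
  i = c(1, 3, 2),          # row (gene) indices of non-zero entries
  j = c(1, 1, 2),          # column (cell) indices of non-zero entries
  x = c(5, 1, 2),          # UMI counts at those positions
  dims = c(3, 2),
  dimnames = list(c("GeneA", "GeneB", "GeneC"), c("cell1", "cell2"))
)

dim(counts)                # genes, then cells
umi_per_cell <- colSums(counts)  # total UMI count per cell
print(umi_per_cell)
```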

Examples

h5_file <- system.file("extdata/demo_inputs/hdf5_10x/demoData3.h5", package = "scMINER") # path to hdf5 file
sparseMatrix <- readInput_10x.h5(h5_file,
                                 featureType = "gene_symbol",
                                 removeSuffix = TRUE,
                                 addPrefix = "demoSample")
#> Reading 10x Genomics data from: /private/var/folders/v0/njhqcmrs32xgrjgx2wz8d50r0000gp/T/Rtmpf8JULY/temp_libpath11ae7194479/scMINER/extdata/demo_inputs/hdf5_10x/demoData3.h5 ...
#> 	Checking HDF5 file format ...
#> 	Format check passed!
#> Done! The sparse gene expression matrix has been generated: 36601 genes, 182 cells.