This function is used to generate the standard input files for SJARACNe, a scalable software tool for gene network reverse engineering from big data.

Usage

generateSJARACNeInput(
  input_eset,
  group_name = "clusterID",
  group_name.refine = FALSE,
  sjaracne_dir,
  species_type = "hg",
  driver_type = "TF_SIG",
  customDriver_TF = NULL,
  customDriver_SIG = NULL,
  downSample_N = 1000,
  seed = 123,
  superCell_N = NULL,
  superCell_count = 100,
  superCell_gamma = 10,
  superCell_knn = 5,
  superCell_nHVG = 1000,
  superCell_nPC = 10,
  superCell_save = TRUE,
  print_command = FALSE,
  save_command = TRUE
)

Arguments

input_eset

An expression set object that has been filtered, normalized, and log-transformed.

group_name

Character, name of the column for grouping, usually the column of cell types or clusters. Default: "clusterID".

group_name.refine

Logical, whether to replace the non-word characters in group names with underscores ("_"). Characters that are invalid in file names can cause problems, because scMINER creates a folder for each group using the group name. Setting this argument to TRUE helps avoid this issue. Default: FALSE.
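As a rough illustration of what this kind of refinement does, the base-R sketch below replaces every non-word character with an underscore. The group names are made up, and the exact pattern scMINER applies internally is an assumption here.

```r
# Hypothetical group names containing characters that are awkward in folder names
group_names <- c("CD4+ T cell", "NK/T", "B-cell (naive)")

# Replace each character that is not a letter, digit, or underscore with "_".
# Assumption: this mirrors the behaviour described above; scMINER's own
# pattern may differ in detail.
refined <- gsub("[^A-Za-z0-9_]", "_", group_names)

print(refined)
```

Note that distinct group names can collapse to the same refined name (for example, "NK/T" and "NK T"), so it is worth checking the refined names for duplicates.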

sjaracne_dir

The path to the folder for SJARACNe runs. Both the inputs and outputs will be saved here.

species_type

Character, species of the pre-defined driver list to use: "hg" for human or "mm" for mouse. Default: "hg".

driver_type

Character, type of the pre-defined driver list to use: "TF" for transcription factors only, "SIG" for signaling genes only, or "TF_SIG" for both. Default: "TF_SIG".

customDriver_TF

A character vector or NULL, genes used to replace the pre-defined transcription factor driver list. This allows the user to customize the TF driver list. Default: NULL.

customDriver_SIG

A character vector or NULL, genes used to replace the pre-defined signaling gene driver list. This allows the user to customize the SIG driver list. Default: NULL.

downSample_N

Integer or NULL. If an integer is given, groups with more cells than this integer are down-sampled to this number of cells. A value between 500 and 3000 gives a good balance between robustness and computational efficiency. If NULL, down-sampling is skipped. Default: 1000.

seed

Non-negative integer, seed of random sampling. Default: 123.
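The interplay of downSample_N and seed can be sketched in base R as follows. This is a minimal illustration of per-group down-sampling with a fixed seed, not scMINER's actual implementation; the group labels and sizes are made up.

```r
# Hypothetical per-cell group labels: one large group and one small group
groups <- c(rep("A", 2500), rep("B", 300))
downSample_N <- 1000
seed <- 123

set.seed(seed)  # fixing the seed makes the random draw reproducible

# Split cell indices by group, then cap each group at downSample_N cells
keep <- unlist(lapply(split(seq_along(groups), groups), function(idx) {
  # Only groups larger than the cap are sampled; smaller groups are kept whole
  if (length(idx) > downSample_N) sample(idx, downSample_N) else idx
}), use.names = FALSE)

# Cell counts per group after down-sampling
print(table(groups[keep]))
```

In this sketch group "A" is reduced to 1000 cells while group "B" keeps all 300, and rerunning with the same seed selects the same cells.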

superCell_N

Integer or NULL. If an integer is given, the metacell method from the SuperCell package is applied to groups with more cells than this integer. If NULL, the metacell method is not used. Default: NULL.

superCell_count

Integer, number of metacells to generate by SuperCell. Default: 100. Ignored if superCell_N = NULL.

superCell_gamma

Integer, graining level of the data used by SuperCell (the ratio of the number of single cells in the initial dataset to the number of metacells in the final dataset). Default: 10. Ignored if superCell_N = NULL.

superCell_knn

Integer, the value of k used by SuperCell to compute the single-cell kNN network. Default: 5. Ignored if superCell_N = NULL.

superCell_nHVG

Integer, number of genes with the largest variation to use by SuperCell. Default: 1000. Ignored if superCell_N = NULL.

superCell_nPC

Integer, number of principal components used by SuperCell to construct the single-cell kNN network. Default: 10. Ignored if superCell_N = NULL.

superCell_save

Logical, whether to save the results generated by SuperCell, including membership and other components. Default: TRUE. Ignored if superCell_N = NULL.

print_command

Logical, whether to print the command to run SJARACNe to screen. Default: FALSE.

save_command

Logical, whether to save the command to run SJARACNe. Default: TRUE.

Value

This function will generate several folders and files in the directory specified by "sjaracne_dir":

  1. a folder for each group in the column specified by "group_name";

  2. In each folder:

    • a ".exp.txt" file: expression matrix, features by cells.

    • a "TF" folder containing a ".tf.txt" file: this file contains the TF driver list.

    • a "SIG" folder containing a ".sig.txt" file: this file contains the SIG driver list.

    • a bash script (runSJARACNe.sh) to run SJARACNe. Further modification is needed to run it.

    • a json file (config_cwlexec.json) containing parameters to run SJARACNe.
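One way to check the generated layout is to list the contents of each group folder. The sketch below uses only base R and assumes generateSJARACNeInput() has already been run with sjaracne_dir = "./SJARACNe"; if the directory does not exist yet, it simply prints nothing.

```r
# Assumption: this is the same path that was passed as sjaracne_dir
sjaracne_dir <- "./SJARACNe"

# One sub-folder is expected per group
group_dirs <- list.dirs(sjaracne_dir, recursive = FALSE)

for (d in group_dirs) {
  cat("Group folder:", basename(d), "\n")
  # List all files under the group folder, including those in TF/ and SIG/
  print(list.files(d, recursive = TRUE))
}
```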

Examples

if (FALSE) { # \dontrun{
data(pbmc14k_expression.eset)
## 1. The most commonly used command: pre-defined driver lists, automatic down-sampling, no metacell method
generateSJARACNeInput(input_eset = pbmc14k_expression.eset,
                      group_name = "cell_type",
                      sjaracne_dir = "./SJARACNe",
                      species_type = "hg",
                      driver_type = "TF_SIG")

## 2. to disable the downsampling
generateSJARACNeInput(input_eset = pbmc14k_expression.eset,
                      group_name = "cell_type",
                      sjaracne_dir = "./SJARACNe",
                      species_type = "hg",
                      driver_type = "TF_SIG",
                      downSample_N = NULL)

## 3. Use a customized driver list (here TUBB4A is the gene of interest, but it is not currently in the pre-defined driver list)

# when the driver-to-add is known as a transcription factor
generateSJARACNeInput(input_eset = pbmc14k_expression.eset, group_name = "trueLabel", sjaracne_dir = "./SJARACNe", species_type = "hg", driver_type = "TF_SIG",
                      customDriver_TF = c(getDriverList(species_type = "hg", driver_type = "TF"), "TUBB4A"))

# when the driver-to-add is known as a non-transcription factor
generateSJARACNeInput(input_eset = pbmc14k_expression.eset, group_name = "trueLabel", sjaracne_dir = "./SJARACNe", species_type = "hg", driver_type = "TF_SIG",
                      customDriver_SIG = c(getDriverList(species_type = "hg", driver_type = "SIG"), "TUBB4A"))

# when it is unclear whether the driver-to-add is a transcription factor
generateSJARACNeInput(input_eset = pbmc14k_expression.eset, group_name = "trueLabel", sjaracne_dir = "./SJARACNe", species_type = "hg", driver_type = "TF_SIG",
                      customDriver_TF = c(getDriverList(species_type = "hg", driver_type = "TF"), "TUBB4A"),
                      customDriver_SIG = c(getDriverList(species_type = "hg", driver_type = "SIG"), "TUBB4A"))

## 4. Use the metacell method
generateSJARACNeInput(input_eset = pbmc14k_expression.eset, group_name = "trueLabel", sjaracne_dir = "./SJARACNe", species_type = "hg", driver_type = "TF_SIG",
                      superCell_N = 1000, superCell_count = 100, seed = 123)
} # }