Deconvolution and Decomposition

class deconvolution.Evaluation(proportion_truth, proportion_estimated_list, methods, out_dir='', cluster=None, type_list=None, colors=None, coordinates=None, min_spot_distance=112)[source]

Bases: object

static JSD(proportion_truth: ndarray, proportion_estimated: ndarray, metric_type='Spot')[source]

Jensen–Shannon divergence.

Parameters:

proportion_truth – Ground truth of the cell proportion.
proportion_estimated – Estimated proportion.
metric_type – How the metric is calculated.
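For reference, the Jensen–Shannon divergence between a true and an estimated proportion vector can be sketched in plain Python as follows. The helper names (kl_divergence, jsd, jsd_per_spot), the base-2 logarithm, and the one-row-per-spot layout are assumptions made for illustration; the library's own JSD may normalize or handle zeros differently.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def jsd_per_spot(truth, estimated):
    """One value per spot (row); each row is assumed to sum to 1."""
    return [jsd(t, e) for t, e in zip(truth, estimated)]


truth = [[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]]
estimated = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]
scores = jsd_per_spot(truth, estimated)
```

With a base-2 logarithm the value lies in [0, 1]: it is 0 for identical rows and 1 for rows with no overlapping cell types.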

static absolute_error(proportion_truth: ndarray, proportion_estimated: ndarray, metric_type='Spot')[source]

static correct_fraction(proportion_truth: ndarray, proportion_estimated: ndarray, metric_type='Spot')[source]

static correlation(proportion_truth: ndarray, proportion_estimated: ndarray, metric_type='Spot')[source]

static cosine(proportion_truth: ndarray, proportion_estimated: ndarray, metric_type='Spot')[source]

evaluate_metric(metric='Cosine similarity', metric_type='Spot', region=None)[source]

Evaluate the proportions based on the metric.

Parameters:

metric – Name of the metric.
metric_type – How the metric is calculated. ‘Spot’: the metric is calculated for each spot; ‘Cell type’: the metric is calculated for each cell type; ‘Individual’: the metric is calculated for each individual proportion estimate.
region – The region that is being evaluated.
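The three metric_type options differ only in how the element-wise scores are grouped. The sketch below illustrates this with the absolute error; the function name absolute_error_sketch, the n_spot x n_type layout, and the use of the mean as the aggregate are assumptions, not the library's confirmed behavior.

```python
def absolute_error_sketch(truth, estimated, metric_type="Spot"):
    """Absolute error grouped per spot, per cell type, or not grouped at all.

    truth and estimated are n_spot x n_type nested lists (assumed layout).
    """
    errors = [[abs(t - e) for t, e in zip(t_row, e_row)]
              for t_row, e_row in zip(truth, estimated)]
    if metric_type == "Spot":
        # One value per row: average over the cell types of each spot.
        return [sum(row) / len(row) for row in errors]
    if metric_type == "Cell type":
        # One value per column: average over the spots for each cell type.
        return [sum(col) / len(col) for col in zip(*errors)]
    if metric_type == "Individual":
        # No aggregation: one value per (spot, cell type) pair.
        return [value for row in errors for value in row]
    raise ValueError(f"Unknown metric_type: {metric_type!r}")


truth = [[0.6, 0.4], [0.2, 0.8], [1.0, 0.0]]
estimated = [[0.5, 0.5], [0.2, 0.8], [0.7, 0.3]]
per_spot = absolute_error_sketch(truth, estimated, "Spot")
per_type = absolute_error_sketch(truth, estimated, "Cell type")
per_value = absolute_error_sketch(truth, estimated, "Individual")
```

Here per_spot has one entry per spot (3), per_type one entry per cell type (2), and per_value one entry per spot and cell type pair (6).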

plot_metric(save=False, region=None, metric='Cosine similarity', metric_type='Spot', cell_types=None, suffix='', show=True)[source]

Plot the box plot of each method based on the metric.

The number of boxes equals the number of methods.

Parameters:

save – If True, save the figure.
region – Regions of the tissue.
metric – Name of the metric.
metric_type – How the metric is calculated. ‘Spot’: the metric is calculated for each spot; ‘Cell type’: the metric is calculated for each cell type; ‘Individual’: the metric is calculated for each individual proportion estimate.
cell_types – If metric_type is ‘Cell type’ and cell_types is not None, only plot the results for the given cell_types.
suffix – Suffix of the saved file name.
show – Whether to show the figure.

plot_metric_all(save=False, metric='Absolute error', region=None)[source]

plot_metric_spot_type(save=False, metric='Absolute error')[source]

Similar to plot_metric_spot, but the figures are separated for each cell type.

static square_error(proportion_truth: ndarray, proportion_estimated: ndarray, metric_type='Spot')[source]

deconvolution.archive_assign_type_out_gp(nucleus_df, cell_proportion, spot_centers, type_list, max_distance=100, return_gp=False)[source]

Assign the cell type to the cells outside the spot.

Parameters:

nucleus_df – Dataframe of the nucleus. Part of spotiphy.segmentation.Segmentation.
spot_centers – Centers of the spots.
cell_proportion – Proportion of each cell type in each spot.
type_list – List of the cell types.
max_distance – If the distance between a nucleus and the closest spot is larger than max_distance, the cell type will not be assigned to this nucleus.
return_gp – Whether to return the fitted GP models.

Returns:

nucleus_df with assigned spot

deconvolution.assign_type_out(nucleus_df, cell_proportion, spot_centers, type_list, max_distance=100, band_width=100)[source]

Assign the cell type to the cells outside the spot.

Parameters:

nucleus_df – Dataframe of the nucleus. Part of spotiphy.segmentation.Segmentation.
spot_centers – Centers of the spots.
cell_proportion – Proportion of each cell type in each spot.
type_list – List of the cell types.
max_distance – If the distance between a nucleus and the closest spot is larger than max_distance, the cell type will not be assigned to this nucleus.
band_width – Band width of the kernel.

Returns:

nucleus_df with assigned spot
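The roles of max_distance and band_width can be illustrated with a kernel-weighted average of the proportions of nearby spots. This is only a sketch: the Gaussian kernel, the helper name kernel_proportion, and the choice to return None for nuclei that are too far away are assumptions, and the library may weight or select spots differently.

```python
import math


def kernel_proportion(nucleus_xy, spot_centers, cell_proportion,
                      max_distance=100, band_width=100):
    """Kernel-weighted average of spot proportions at one nucleus location.

    Returns None when the closest spot is farther than max_distance.
    The Gaussian kernel is an assumption made for this sketch.
    """
    distances = [math.dist(nucleus_xy, center) for center in spot_centers]
    if min(distances) > max_distance:
        return None  # too far from every spot: no cell type is assigned
    # Closer spots get larger weights; band_width sets how fast weights decay.
    weights = [math.exp(-0.5 * (d / band_width) ** 2) for d in distances]
    total = sum(weights)
    n_type = len(cell_proportion[0])
    return [sum(w * row[k] for w, row in zip(weights, cell_proportion)) / total
            for k in range(n_type)]


spot_centers = [(0.0, 0.0), (200.0, 0.0)]
cell_proportion = [[1.0, 0.0], [0.0, 1.0]]
midpoint = kernel_proportion((100.0, 0.0), spot_centers, cell_proportion)
far_away = kernel_proportion((100.0, 500.0), spot_centers, cell_proportion)
```

A nucleus halfway between two spots receives the average of their proportions, while a nucleus more than max_distance from every spot is left unassigned.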

deconvolution.assign_type_spot(nucleus_df, n_cell_df, cell_number, type_list)[source]

Assign the cell type to the cells inside the spot.

Parameters:

nucleus_df – Dataframe of the nucleus. Part of spotiphy.segmentation.Segmentation.
n_cell_df – Dataframe of the number of cells in each spot. Part of spotiphy.segmentation.Segmentation.
cell_number – Number of each cell type in each spot.
type_list – List of the cell types.

Returns:

nucleus_df with assigned spot

deconvolution.decomposition(adata_st: AnnData, adata_sc: AnnData, key_type: str, cell_proportion: ndarray, save=True, out_dir='', threshold=0.1, n_cell=None, spot_location: Optional[ndarray] = None, filtering_gene=False, filename='ST_decomposition.h5ad', verbose=0, use_original_proportion=False)[source]

Decompose ST.

Parameters:

adata_st – Original spatial transcriptomics data.
adata_sc – Original single-cell data.
key_type – The key that is used to extract cell type information from adata_sc.obs.
cell_proportion – Proportion of each cell type obtained by the deconvolution.
save – If True, save the generated adata_st as a file.
out_dir – Output directory.
threshold – If n_cell is None, discard cell types with a proportion less than threshold.
n_cell – Number of cells in each spot.
spot_location – Coordinates of the spots.
filtering_gene – Whether to filter the genes in sc_reference.initialization.
filename – Name of the saved file.
verbose – Whether to print the time spent.
use_original_proportion – Whether the original proportion is used to estimate the iscRNA. Note that even when the original proportion is used, some cells in iscRNA are still filtered.

Returns:

AnnData similar to scRNA, but obtained by decomposing ST.

Return type:

adata_st_decomposed
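The effect of threshold on a single spot can be sketched as below. The helper name filter_proportion and the renormalization of the remaining proportions are assumptions made for illustration; the source only states that cell types below threshold are discarded when n_cell is None.

```python
def filter_proportion(proportion, threshold=0.1):
    """Drop cell types below threshold in one spot, then renormalize.

    Renormalizing so the kept proportions sum to 1 is an assumption.
    """
    kept = [p if p >= threshold else 0.0 for p in proportion]
    total = sum(kept)
    if total == 0:
        return kept  # nothing survives the threshold
    return [p / total for p in kept]


filtered = filter_proportion([0.55, 0.40, 0.05])
```

In this example the third cell type (5%) is discarded and the remaining two are rescaled to sum to 1.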

deconvolution.deconvolute(X, sc_ref, device='cuda', n_epoch=8000, adam_params=None, batch_prior=2, plot=False, fig_size=(4.8, 3.6), dpi=200)[source]

Deconvolution of the proportion of genes contributed by each cell type.

Parameters:

X – Spatial transcriptomics data. n_spot*n_gene.
sc_ref – Single cell reference. n_type*n_gene.
device – The device used for the deconvolution.
plot – Whether to plot the ELBO loss.
n_epoch – Number of training epochs.
adam_params – Parameters for the adam optimizer.
batch_prior – Parameter of the prior distribution of the batch effect: 2^(Uniform(0, batch_prior))
fig_size – Size of the figure.
dpi – Dots per inch (DPI) of the figure.

Returns:

Parameters in the generative model.
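The notation 2^(Uniform(0, batch_prior)) used for batch_prior can be read as: draw u uniformly from [0, batch_prior] and use 2^u as a multiplicative factor. With the default batch_prior=2, each factor therefore lies between 1 and 4. The sketch below samples such factors; the helper name sample_batch_effect and drawing one factor per gene are assumptions, not a description of the model's internals.

```python
import random


def sample_batch_effect(n_gene, batch_prior=2, seed=0):
    """Draw one multiplicative factor per gene from 2 ** Uniform(0, batch_prior)."""
    rng = random.Random(seed)
    return [2 ** rng.uniform(0, batch_prior) for _ in range(n_gene)]


factors = sample_batch_effect(n_gene=1000)
lower, upper = min(factors), max(factors)
```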

deconvolution.estimation_proportion(X, adata_sc, sc_ref, type_list, key_type, device='cuda', n_epoch=8000, adam_params=None, batch_prior=2, plot=False, fig_size=(4.8, 3.6), dpi=200)[source]

Estimate the proportion of each cell type in each spot.

Parameters:

X – Spatial transcriptomics data. n_spot*n_gene.
adata_sc (anndata.AnnData) – scRNA data.
sc_ref – Single cell reference. n_type*n_gene.
type_list – List of the cell types.
key_type – Column name of the cell types in adata_sc.
device – The device used for the deconvolution.
plot – Whether to plot the ELBO loss.
n_epoch – Number of training epochs.
adam_params – Parameters for the adam optimizer.
batch_prior – Parameter of the prior distribution of the batch effect: 2^(Uniform(0, batch_prior))
fig_size – Size of the figure.
dpi – Dots per inch (DPI) of the figure.

Returns:

Parameters in the generative model.

deconvolution.plot_proportion(img, proportion, spot_location, radius, cmap_name='viridis', alpha=0.4, save_path='proportion.png', vmax=0.98, spot_scale=1.3, show_figure=False, int_ticks=False, bar_location=(5800, 8100))[source]

Plot the proportion of a cell type.

Parameters:

img – 3 channel img with integer values in [0, 255]
proportion – Proportion of a cell type.
spot_location – Location of the spots.
radius – Radius of the spot.
cmap_name – Name of the colormap (cmap).
alpha – Level of transparency of the background img.
save_path – If not None, save the img to the path.
vmax – Quantile of the maximum value in the color bar.
spot_scale – Scale of the spot in the figure.
show_figure – Whether to show the figure.
int_ticks – Whether the ticks must be integers.

deconvolution.proportion_to_count(p, n, multiple_spots=False)[source]

Convert the cell proportion to the absolute cell number.

Parameters:

p – Cell proportion.
n – Number of cells.
multiple_spots – Whether the data covers multiple spots.

Returns:

Cell count of each cell type.
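Converting proportions to whole-cell counts requires a rounding rule so that the counts still add up to n. The sketch below uses the largest-remainder method for a single spot; this rounding scheme and the helper name proportion_to_count_sketch are assumptions, and the library's proportion_to_count may round differently.

```python
import math


def proportion_to_count_sketch(p, n):
    """Turn proportions p (summing to 1) into integer counts that sum to n."""
    raw = [pi * n for pi in p]
    counts = [math.floor(r) for r in raw]
    leftover = n - sum(counts)
    # Give the remaining cells to the types with the largest fractional parts.
    order = sorted(range(len(p)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


counts = proportion_to_count_sketch([0.5, 0.3, 0.2], 7)  # [4, 2, 1]
```

Plain rounding of 3.5, 2.1, and 1.4 could give a total that differs from 7, which is why the leftover cell is assigned explicitly.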

deconvolution.simulation(adata_st: AnnData, adata_sc: AnnData, key_type: str, cell_proportion: ndarray, n_cell=10, batch_effect_sigma=0.1, zero_proportion=0.3, additive_noise_sigma=0.05, save=True, out_dir='', filename='ST_Simulated.h5ad', verbose=0)[source]

Simulation of the spatial transcriptomics data based on a real spatial sample and deconvolution results of Spotiphy.

Parameters:

adata_st – Original spatial transcriptomics data.
adata_sc – Original single-cell data.
key_type – The key that is used to extract cell type information from adata_sc.obs.
cell_proportion – Proportion of each cell type obtained by the deconvolution.
n_cell – Number of cells in each spot, either a key of adata_st.obs or a positive integer.
batch_effect_sigma – Sigma of the log-normal distribution used to generate the batch effect.
zero_proportion – Proportion of gene expression values set to 0. Note that since some gene expression values in the original X are already 0, the final proportion of zero gene reads is larger than zero_proportion.
additive_noise_sigma – Sigma of the log-normal distribution used to generate the additive noise.
save – If True, save the generated adata_st as a file.
out_dir – Output directory.
filename – Name of the saved file.
verbose – Whether to print the time spent.

Returns:

Simulated ST AnnData.
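The note on zero_proportion can be checked with a small experiment. If a fraction z of the values is already 0 and each value is then zeroed independently with probability zero_proportion, the expected final fraction of zeros is z + (1 - z) * zero_proportion. The helper name apply_dropout and the independent per-value masking are assumptions made for this sketch.

```python
import random


def apply_dropout(x, zero_proportion=0.3, seed=0):
    """Set each value to 0 independently with probability zero_proportion."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < zero_proportion else value for value in x]


# 20% of the values are already zero before dropout is applied.
original = [0.0] * 2000 + [1.0] * 8000
simulated = apply_dropout(original)
zero_fraction = sum(value == 0.0 for value in simulated) / len(simulated)
expected = 0.2 + (1 - 0.2) * 0.3  # about 0.44, larger than zero_proportion
```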