Preprocessing
- preprocessing.dendrogram(adata, cluster_header, *, tl_kwargs={}, pl_kwargs={}, save=False, figsize=None, output_folder='', outputfilename_suffix='')
Generating a dendrogram from the AnnData object.
Parameters
- adata: AnnData
Annotated data matrix.
- cluster_header: str
Column in adata.obs storing cell annotation. Passed into scanpy’s dendrogram as groupby.
- tl_kwargs: dict
Additional parameters to pass to sc.tl.dendrogram.
- pl_kwargs: dict
Additional parameters to pass to sc.pl.dendrogram.
- save: bool | str (default: False)
Whether to save the plot in output_folder. If a string, it sets the file type to save as (‘png’ (default), ‘svg’, ‘pdf’).
- figsize: tuple (default: (12, 2))
figure.figsize for plt.rc_context.
- output_folder: str (default: “”)
Output folder. Created if it doesn’t exist.
- outputfilename_suffix: str (default: “”)
Suffix for all output files.
Returns
Does not return anything. Adds adata.uns[“dendrogram_{cluster_header}”] to the passed-in adata.
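The documented behaviour of save and output_folder can be illustrated with a minimal sketch. The helper names resolve_save_format and ensure_output_folder are hypothetical and not part of the package. The sketch only mirrors the description above (False saves nothing, True falls back to ‘png’, a string selects the file type, and the folder is created when missing).

```python
import os
import tempfile

ALLOWED_FORMATS = ("png", "svg", "pdf")

def resolve_save_format(save):
    """Hypothetical helper: map a `save` argument to a file extension or None."""
    if save is False:
        return None   # nothing is written
    if save is True:
        return "png"  # documented default file type
    if save in ALLOWED_FORMATS:
        return save
    raise ValueError(f"save must be a bool or one of {ALLOWED_FORMATS}, got {save!r}")

def ensure_output_folder(output_folder):
    """Hypothetical helper: create the folder if it does not exist yet."""
    if output_folder:
        os.makedirs(output_folder, exist_ok=True)
    return output_folder

# Demonstrate both helpers inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    folder = ensure_output_folder(os.path.join(tmp, "figures"))
    folder_exists = os.path.isdir(folder)

formats = [resolve_save_format(s) for s in (False, True, "svg")]
```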
- preprocessing.prep_medians(adata, cluster_header, use_mean=False, positive_genes_only=True, plot=False)
Calculating the median expression matrix. Subsetting adata if positive_genes_only = True.
Parameters
- adata: AnnData
Annotated data matrix.
- cluster_header: str
Column in adata.obs storing cell annotation.
- use_mean: bool (default: False)
Whether to use the mean (vs median) for minimum gene expression threshold.
- positive_genes_only: bool (default: True)
Whether to subset AnnData to only have genes with median/mean expression greater than 0.
Returns
- adata: AnnData
AnnData with median expression values stored in adata.varm[“medians_{cluster_header}”].
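As a rough illustration of the positive_genes_only subsetting, the sketch below filters a small gene-by-cluster table with pandas. It assumes that "greater than 0" means positive in at least one cluster, which the description above does not spell out. The function name keep_positive_genes_sketch is hypothetical.

```python
import pandas as pd

def keep_positive_genes_sketch(medians):
    """Keep genes (rows) with a positive median/mean in at least one cluster.

    Assumption: a gene counts as positive if any cluster value is > 0.
    """
    return medians[(medians > 0).any(axis=1)]

# Toy gene-by-cluster table: geneB is zero in every cluster.
medians = pd.DataFrame(
    {"c1": [2.0, 0.0], "c2": [1.0, 0.0]},
    index=["geneA", "geneB"],
)
positive = keep_positive_genes_sketch(medians)
```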
- preprocessing.get_medians(adata, cluster_header, use_mean=False)
Calculating the median (mean) expression per gene for each cluster_header.
Parameters
- adata: AnnData
Annotated data matrix.
- cluster_header: str
Column in adata.obs storing cell annotation.
- use_mean: bool (default: False)
Whether to use the mean (vs median) for minimum gene expression threshold.
Returns
- cluster_medians: pd.DataFrame
Gene-by-cluster median (mean) expression dataframe.
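The gene-by-cluster table described here can be approximated with a pandas group-by. This is a sketch on a plain DataFrame rather than the package's implementation. The function name and the cells-by-genes input layout are assumptions made for illustration.

```python
import pandas as pd

def cluster_medians_sketch(expr, clusters, use_mean=False):
    """Return a gene-by-cluster table of medians (or means).

    expr:     cells-by-genes DataFrame of expression values.
    clusters: Series of cluster labels aligned to expr.index.
    """
    grouped = expr.groupby(clusters)
    stats = grouped.mean() if use_mean else grouped.median()
    return stats.T  # transpose so genes are rows and clusters are columns

expr = pd.DataFrame(
    {
        "geneA": [0.0, 2.0, 4.0, 1.0, 1.0, 7.0],
        "geneB": [0.0, 0.0, 0.0, 0.0, 0.0, 3.0],
    },
    index=[f"cell{i}" for i in range(6)],
)
clusters = pd.Series(
    ["c1", "c1", "c1", "c2", "c2", "c2"], index=expr.index, name="cluster"
)

medians = cluster_medians_sketch(expr, clusters)
means = cluster_medians_sketch(expr, clusters, use_mean=True)
```

On this toy input the median and mean differ for geneA in cluster c2, because the single high value (7.0) pulls the mean up while the median stays at 1.0.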
- preprocessing.prep_binary_scores(adata, cluster_header, medians_header='medians_')
Calculating the binary scores of each gene per cluster_header.
Parameters
- adata: AnnData
Annotated data matrix.
- cluster_header: str
Column in adata.obs storing cell annotation.
- medians_header: str (default: “medians_{cluster_header}”)
Key in adata.varm storing median expression matrix.
Returns
- adata: AnnData
AnnData with binary scores stored in adata.varm[“binary_scores_{cluster_header}”].
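This reference does not state how the binary score is computed. The sketch below assumes the definition given in the NS-Forest publications: for a gene and a target cluster, sum max(0, 1 - median_other / median_target) over all clusters and divide by the number of clusters minus one. Assigning a score of 0 when the target median is 0 is a choice made for this sketch, and the function name is hypothetical.

```python
import numpy as np
import pandas as pd

def binary_scores_sketch(medians):
    """Binary score per gene (row) and target cluster (column).

    Assumed formula: sum_i max(0, 1 - m_i / m_target) / (n_clusters - 1).
    Genes with a zero median in the target cluster get a score of 0 here.
    """
    n_clusters = medians.shape[1]
    values = medians.to_numpy(dtype=float)
    scores = np.zeros_like(values)
    for j in range(n_clusters):
        target = values[:, j]
        expressed = target > 0
        # Ratio of every cluster's median to the target cluster's median.
        ratio = values[expressed] / target[expressed, None]
        scores[expressed, j] = (
            np.maximum(0.0, 1.0 - ratio).sum(axis=1) / (n_clusters - 1)
        )
    return pd.DataFrame(scores, index=medians.index, columns=medians.columns)

medians = pd.DataFrame(
    {"c1": [4.0, 1.0], "c2": [0.0, 1.0], "c3": [2.0, 0.0]},
    index=["geneA", "geneB"],
)
scores = binary_scores_sketch(medians)
```

Under this definition a score near 1 means the gene is expressed in the target cluster and close to silent elsewhere, while a score near 0 means other clusters express it at similar or higher levels.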
- preprocessing.plot_varm(adata, varm_key, nonzero=False, scale=None, figsize=(6, 4), show=True, save=False, output_folder='')
Plotting a histogram of the values stored in adata.varm[varm_key] (median expression or binary scores per gene per cluster).
Parameters
- adata: AnnData
Annotated data matrix.
- varm_key: str
Key in adata.varm storing calculated medians or binary scores.
- nonzero: bool
Whether to remove zeros from histogram.
- scale: str
How to scale the y-axis.
- figsize: tuple
Width and height of plot.
- show: bool
Whether to show the plot.
- save: bool | str (default: False)
Whether to save plot. If string, choose the type of file to save as (“png”, “svg”, “pdf”).
- output_folder: str (default: “”)
Output folder for output files.
Returns
- fig: matplotlib.pyplot.figure
Histogram of adata.varm[varm_key]
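The effect of nonzero can be sketched without any plotting: the matrix is flattened and zeros are optionally dropped before binning. The function name is hypothetical, and numpy.histogram stands in for the actual plotting call. The number of bins is an arbitrary choice for this example.

```python
import numpy as np

def histogram_values_sketch(matrix, nonzero=False):
    """Flatten a gene-by-cluster matrix and optionally drop zeros."""
    values = np.asarray(matrix, dtype=float).ravel()
    if nonzero:
        values = values[values != 0]
    return values

matrix = [[0.0, 2.0], [1.0, 0.0]]
all_values = histogram_values_sketch(matrix)
nonzero_values = histogram_values_sketch(matrix, nonzero=True)

# Bin the remaining values; 4 bins is arbitrary.
counts, edges = np.histogram(nonzero_values, bins=4)
```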
- preprocessing.spaceTx_genefilter(adata, lower_percentile=0.1, upper_percentile=0.99, min_txLength=700, species='human', species_dict=None, gencode_folder='gencode_annotation')
Filtering genes for spatial gene probe panel design.
Parameters
- adata: AnnData
Annotated data matrix.
- lower_percentile: float (default: 0.1)
Lower percentile cutoff, given as a fraction, for filtering nonzero median gene expression.
- upper_percentile: float (default: 0.99)
Upper percentile cutoff, given as a fraction, for filtering nonzero median gene expression.
- min_txLength: int (default: 700)
Minimum transcript length.
- species: [“human”, “mouse”, “other”] (default: “human”)
Species relating to gencode_annotation.
Returns
- adata: AnnData
AnnData subset according to lower_percentile, upper_percentile, and min_txLength.
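A minimal sketch of this kind of filter is shown below, under several assumptions not confirmed by this reference: each gene is summarised by one expression value, the percentile cutoffs are computed on the nonzero values only, and transcript lengths come from a separate table. The function name and both inputs are hypothetical.

```python
import pandas as pd

def percentile_length_filter_sketch(
    gene_expr, tx_length, lower=0.1, upper=0.99, min_length=700
):
    """Return genes within the percentile band that are also long enough.

    gene_expr: Series of one summary expression value per gene (assumed).
    tx_length: Series of transcript lengths per gene (assumed input).
    """
    nonzero = gene_expr[gene_expr > 0]
    # Cutoffs are quantiles of the nonzero values only.
    lo, hi = nonzero.quantile([lower, upper])
    in_band = (gene_expr >= lo) & (gene_expr <= hi)
    long_enough = tx_length.reindex(gene_expr.index) >= min_length
    return gene_expr.index[in_band & long_enough].tolist()

gene_expr = pd.Series({"g1": 0.0, "g2": 1.0, "g3": 2.0, "g4": 3.0, "g5": 100.0})
tx_length = pd.Series({"g1": 2000, "g2": 2000, "g3": 1000, "g4": 500, "g5": 3000})

kept = percentile_length_filter_sketch(gene_expr, tx_length)
```

In this toy input g1 is dropped for zero expression, g2 and g5 fall outside the percentile band, and g4 is shorter than 700, so only g3 remains.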