Evaluating
- evaluating.DecisionTree(adata, cluster_header, markers_dict, *, medians_header='medians_', beta=0.5, combinations=False, use_mean=False, save=False, save_supplementary=False, output_folder='', outputfilename_prefix='')
Calculating sklearn.metrics’s fbeta_score, precision_score, recall_score, and confusion_matrix for genes_eval.
Parameters
- adata: AnnData
Annotated data matrix.
- cluster_header: str
Column in adata.obs storing cell annotation.
- markers_dict: dict
Dictionary mapping each cluster in cluster_header to its marker genes (clusterName: list of markers).
- medians_header: str (default: “medians_{cluster_header}”)
Key in adata.varm storing median expression matrix.
- beta: float (default: 0.5)
beta parameter in sklearn.metrics’s fbeta_score.
- combinations: bool (default: False)
Whether to find the combination of genes_eval with the highest fbeta_score.
- use_mean: bool (default: False)
Whether to use the mean (vs median) for minimum gene expression threshold.
- save: bool (default: False)
Whether to save csv and pkl of df_results in output_folder.
- save_supplementary: bool (default: False)
Whether to save additional supplementary csvs.
- output_folder: str (default: “”)
Output folder. Created if it doesn’t exist.
- outputfilename_prefix: str (default: “”)
Prefix for all output files.
Returns
- df_results: pd.DataFrame
NS-Forest results. Includes classification metrics (f_score, precision, recall, onTarget).
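The effect of the beta parameter can be illustrated with a small, self-contained sketch. This is not the package's implementation: it recomputes, in plain Python, the standard binary-label definitions of the confusion matrix, precision, recall, and F-beta that sklearn.metrics also follows. The function name classification_metrics and the toy labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, beta=0.5):
    """Confusion counts, precision, recall and F-beta for binary labels.

    Illustrative only: y_true marks cells that belong to the target
    cluster, y_pred marks cells called positive by the marker genes.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F-beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    denom = beta ** 2 * precision + recall
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0

    return {"tn": tn, "fp": fp, "fn": fn, "tp": tp,
            "precision": precision, "recall": recall, "f_beta": f_beta}


# Toy data (hypothetical): 4 cells in the target cluster, 6 outside it.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

half = classification_metrics(y_true, y_pred, beta=0.5)
one = classification_metrics(y_true, y_pred, beta=1.0)
print(half["precision"], half["recall"], half["f_beta"], one["f_beta"])
```

With beta below 1, precision is weighted more heavily than recall, so a marker set with few false positives scores higher under the default beta=0.5 than under beta=1.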
- evaluating.add_fraction(adata, df_results, cluster_header, medians_header='medians_', use_mean=False, save_supplementary=False, output_folder='', outputfilename_prefix='')
Calculating sklearn.metrics’s fbeta_score, precision_score, and confusion_matrix for each genes_eval combination. Returning the set of genes and scores with the highest score sum.
Parameters
- adata: AnnData
Annotated data matrix.
- df_results: pd.DataFrame
NS-Forest results. Contains classification metrics (f_score, precision, recall, onTarget).
- cluster_header: str
Column in adata’s .obs representing cell annotation.
- medians_header: str
Key in adata’s .varm storing median expression matrix.
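- use_mean: bool (default: False)
Whether to use the mean (vs median) for minimum gene expression threshold.
- save_supplementary: bool (default: False)
Whether to save additional supplementary csvs.
- output_folder: str (default: “”)
Output folder.
- outputfilename_prefix: str (default: “”)
Prefix for all output files.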
Returns
- df_results: pd.DataFrame
NS-Forest results. Contains classification metrics (f_score, precision, recall, onTarget).
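The difference between the two use_mean settings can be sketched on toy numbers. How the package applies the threshold internally is not specified here, so the helper names (expression_threshold, call_positive), the strict "greater than" comparison, and the expression values below are all assumptions made for illustration.

```python
from statistics import mean, median


def expression_threshold(values, use_mean=False):
    """Summary statistic used as a minimum expression threshold (assumed)."""
    return mean(values) if use_mean else median(values)


def call_positive(expression, threshold):
    # Assumption: a cell counts as expressing the gene when its value
    # is strictly above the threshold.
    return [x > threshold for x in expression]


# Hypothetical expression of one marker gene in the target cluster.
target_cluster_expr = [0.0, 2.0, 3.0, 10.0]

median_thr = expression_threshold(target_cluster_expr)               # 2.5
mean_thr = expression_threshold(target_cluster_expr, use_mean=True)  # 3.75

print(call_positive(target_cluster_expr, median_thr))
print(call_positive(target_cluster_expr, mean_thr))
```

On skewed data such as this, a single highly expressing cell pulls the mean above the median, so fewer cells pass the mean-based threshold.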