Plot

doubletdetection.plot.convergence(clf, show=False, save=None, p_thresh=1e-07, voter_thresh=0.9)

Produce a plot showing the number of cells called doublets per iteration.

Parameters:
  • clf (BoostClassifier object) – Fitted classifier
  • show (bool, optional) – If True, runs plt.show()
  • save (str, optional) – filename for saved figure, figure not saved by default
  • p_thresh (float, optional) – hypergeometric test p-value threshold that determines per iteration doublet calls
  • voter_thresh (float, optional) – minimum fraction of iterations in which a cell must be called a doublet
Returns:

matplotlib figure
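
How p_thresh and voter_thresh interact can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the p_values array is made-up data, the helper doublets_per_iteration is hypothetical, and both the inclusive comparisons and the use of a running fraction over iterations are assumptions about how such a curve could be accumulated.

```python
import numpy as np


def doublets_per_iteration(p_values, p_thresh=1e-7, voter_thresh=0.9):
    """Count cells called doublets after each iteration (illustrative only).

    p_values: hypothetical array of shape (n_cells, n_iters) holding one
    p-value per cell and iteration.
    """
    p_values = np.asarray(p_values, dtype=float)
    # Per-iteration call: a cell is flagged when its p-value is at or below p_thresh.
    calls = p_values <= p_thresh
    # Running fraction of iterations in which each cell has been flagged so far.
    n_iters = p_values.shape[1]
    running_fraction = np.cumsum(calls, axis=1) / np.arange(1, n_iters + 1)
    # A cell counts as a doublet once that fraction reaches voter_thresh.
    return (running_fraction >= voter_thresh).sum(axis=0)


p_values = np.array([
    [1e-9, 1e-8, 1e-10],  # flagged in every iteration
    [1e-9, 0.5, 1e-9],    # flagged in 2 of 3 iterations
    [0.3, 0.4, 0.2],      # never flagged
])
counts = doublets_per_iteration(p_values)
print(counts)  # [2 1 1]
```

A curve like this that flattens out suggests that additional iterations no longer change which cells are called.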

doubletdetection.plot.normalize_counts(raw_counts, pseudocount=0.1)

Normalize count array. Default normalizer used by BoostClassifier.

Parameters:
  • raw_counts (ndarray) – count data
  • pseudocount (float, optional) – Count to add prior to log transform.
Returns:

Normalized data.

Return type:

ndarray
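
The role of pseudocount can be sketched as follows. The only step stated above is that the pseudocount is added before a log transform; the scaling of each cell to the median library size, the natural log, and the name normalize_counts_sketch are assumptions made for illustration and may differ from what the package does.

```python
import numpy as np


def normalize_counts_sketch(raw_counts, pseudocount=0.1):
    """Illustrative normalizer: scale cells to the median total, then log.

    raw_counts is assumed to be oriented cells by genes, with no all-zero cells
    (an all-zero cell would cause a division by zero below).
    """
    raw_counts = np.asarray(raw_counts, dtype=float)
    # Total counts per cell (row sums).
    cell_sums = raw_counts.sum(axis=1)
    # Rescale every cell so its total equals the median total (assumed step).
    median = np.median(cell_sums)
    scaled = raw_counts * median / cell_sums[:, np.newaxis]
    # The pseudocount keeps zero entries finite under the log transform.
    return np.log(scaled + pseudocount)


raw = np.array([[1, 3], [2, 6], [5, 5]])
normed = normalize_counts_sketch(raw)
```

In this sketch the first two cells have the same proportions at different depths, so they end up with identical normalized values.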

doubletdetection.plot.threshold(clf, show=False, save=None, log10=True, log_p_grid=None, voter_grid=None, v_step=2, p_step=5)

Produce a plot showing the number of cells called doublets across various thresholds.

Parameters:
  • clf (BoostClassifier object) – Fitted classifier
  • show (bool, optional) – If True, runs plt.show()
  • save (str, optional) – If provided, the figure is saved to this filepath.
  • log10 (bool, optional) – Use log 10 if true, natural log if false.
  • log_p_grid (ndarray, optional) – Log p-value thresholds to use. Defaults to np.arange(-100, -1). The log base is set by log10.
  • voter_grid (ndarray, optional) – Voting thresholds to use. Defaults to np.arange(0.3, 1.0, 0.05).
  • p_step (int, optional) – number of xlabels to skip in plot
  • v_step (int, optional) – number of ylabels to skip in plot
Returns:

matplotlib figure
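
The grid that such a plot summarizes can be sketched in plain NumPy. This is an assumption-laden illustration, not the package's code: doublet_count_grid and the log_p_values array are hypothetical, the comparisons are assumed to be inclusive, and the log p-values are assumed to already be in the same base as log_p_grid.

```python
import numpy as np


def doublet_count_grid(log_p_values, log_p_grid=None, voter_grid=None):
    """Count doublet calls for every (voter threshold, log p threshold) pair.

    log_p_values: hypothetical array of shape (n_cells, n_iters).
    Returns an integer array of shape (len(voter_grid), len(log_p_grid)).
    """
    log_p_values = np.asarray(log_p_values, dtype=float)
    if log_p_grid is None:
        log_p_grid = np.arange(-100, -1)  # default grid named in the docs
    if voter_grid is None:
        voter_grid = np.arange(0.3, 1.0, 0.05)  # default grid named in the docs
    log_p_grid = np.asarray(log_p_grid, dtype=float)
    voter_grid = np.asarray(voter_grid, dtype=float)

    # calls[k, i, j]: cell i is flagged in iteration j at p threshold k.
    calls = log_p_values[np.newaxis, :, :] <= log_p_grid[:, np.newaxis, np.newaxis]
    # Fraction of iterations in which each cell is flagged, per p threshold.
    voting_average = calls.mean(axis=2)
    # passed[v, k, i]: cell i reaches voter threshold v at p threshold k.
    passed = voting_average[np.newaxis, :, :] >= voter_grid[:, np.newaxis, np.newaxis]
    return passed.sum(axis=2)


log_p_values = np.array([[-10.0, -10.0], [-10.0, -3.0]])
grid = doublet_count_grid(log_p_values, log_p_grid=[-7, -2], voter_grid=[0.5, 1.0])
```

Each row of the result corresponds to one voting threshold and each column to one log p-value threshold, which matches the two axes that v_step and p_step thin out.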

doubletdetection.plot.umap_plot(raw_counts, labels, n_components=30, show=False, save=None, normalizer=<function normalize_counts>, random_state=None)

Produce a umap plot of the data with doublets in black.

Count matrix is normalized and dimension reduced before plotting.

Parameters:
  • raw_counts (array-like) – Count matrix, oriented cells by genes.
  • labels (ndarray) – predicted doublets from predict method
  • n_components (int, optional) – number of PCs to use prior to UMAP
  • show (bool, optional) – If True, runs plt.show()
  • save (str, optional) – filename for saved figure, figure not saved by default
  • normalizer ((ndarray) -> ndarray, optional) – Method to normalize raw_counts. Defaults to normalize_counts, included in this package. Note: To use normalize_counts with its pseudocount parameter changed from the default value of 0.1 to some positive float new_var, use: normalizer=lambda counts: doubletdetection.plot.normalize_counts(counts, pseudocount=new_var)
  • random_state (int, optional) – If provided, passed to PCA and UMAP
Returns:

matplotlib figure, and an ndarray containing the UMAP reduction
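
The normalizer note above amounts to fixing one keyword argument of a function that otherwise takes only the count matrix. A minimal sketch of that pattern, using a hypothetical stand-in log_normalizer rather than the package's own function, and showing functools.partial as an equivalent to the lambda:

```python
from functools import partial

import numpy as np


def log_normalizer(counts, pseudocount=0.1):
    """Hypothetical stand-in normalizer: add a pseudocount, then take the log."""
    return np.log(np.asarray(counts, dtype=float) + pseudocount)


new_var = 1.0  # any positive float

# Lambda form, mirroring the note in the parameter description.
via_lambda = lambda counts: log_normalizer(counts, pseudocount=new_var)
# Equivalent form that binds the keyword argument up front.
via_partial = partial(log_normalizer, pseudocount=new_var)

counts = np.array([[0, 9], [4, 0]])
same = np.allclose(via_lambda(counts), via_partial(counts))
```

Either callable has the (ndarray) -> ndarray shape that the normalizer parameter expects.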