SPADE

class densitree.spade.SPADE(n_clusters: int = 50, downsample_target: float = 0.1, knn: int = 5, n_micro: int | None = None, n_consensus: int = 10, transform: str | None = 'arcsinh', cofactor: float = 150.0, backend: str = 'matplotlib', density_estimator: BaseStep | None = None, random_state: int | None = None)

Bases: object

SPADE clustering with scikit-learn-compatible API.

An improved SPADE that combines density-dependent downsampling (to preserve rare populations and to build the tree) with consensus overclustering (to assign cells accurately).
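To illustrate why density-dependent downsampling helps preserve rare populations, here is a minimal pure-Python sketch. It is not densitree's implementation: the density estimate (inverse distance to the k-th nearest neighbour), the keep-probability rule (proportional to 1/density), and the helper names `knn_density` and `density_dependent_downsample` are all assumptions made for illustration.

```python
import math
import random


def knn_density(points, k=5):
    """Assumed estimator: density = 1 / distance to the k-th nearest neighbour."""
    densities = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth = dists[k - 1]
        densities.append(1.0 / (kth + 1e-12))  # epsilon avoids division by zero
    return densities


def density_dependent_downsample(points, target=0.1, k=5, seed=0):
    """Keep each point with probability proportional to 1/density.

    Probabilities are scaled so the expected retained fraction is `target`;
    clipping at 1.0 can make the realised fraction slightly lower.
    """
    rng = random.Random(seed)
    weights = [1.0 / d for d in knn_density(points, k)]
    scale = target * len(points) / sum(weights)
    keep_prob = [min(1.0, w * scale) for w in weights]
    kept = [i for i, pr in enumerate(keep_prob) if rng.random() < pr]
    return kept, keep_prob


# Hypothetical toy data: a dense 10x10 grid and a sparse 3x3 grid far away.
dense = [(0.1 * i, 0.1 * j) for i in range(10) for j in range(10)]
rare = [(10.0 + i, 10.0 + j) for i in range(3) for j in range(3)]
points = dense + rare

kept, keep_prob = density_dependent_downsample(points, target=0.1, k=5, seed=0)
dense_prob = keep_prob[:len(dense)]
rare_prob = keep_prob[len(dense):]
print("max keep probability in dense group:", max(dense_prob))
print("min keep probability in sparse group:", min(rare_prob))
```

On this toy data every point in the sparse group receives a higher keep probability than any point in the dense group, so the small population is far less likely to vanish from the downsampled set than it would be under uniform sampling.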

The algorithm:

  1. Density estimation (k-NN) on all cells.

  2. Consensus clustering over multiple runs: overcluster into n_micro microclusters, merge them into n_clusters metaclusters, align labels across runs with the Hungarian algorithm, discard low-agreement runs, and take a majority vote.

  3. Density-dependent downsampling for tree construction.

  4. MST construction on metacluster centroids.
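Step 4 can be sketched independently of the library. The example below builds a minimum spanning tree over a handful of centroids with Prim's algorithm in pure Python. The function name `mst_edges` and the centroid coordinates are hypothetical, and Euclidean distance is an assumption; the source does not state which metric or MST routine densitree uses.

```python
import math


def mst_edges(centroids):
    """Prim's algorithm on the complete Euclidean graph over `centroids`.

    Returns a list of (parent_index, child_index, distance) edges.
    """
    n = len(centroids)
    if n == 0:
        return []
    # best[j] = (shortest known distance from j to the tree, tree node giving it)
    best = {j: (math.dist(centroids[0], centroids[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        # Attach the node that is currently closest to the tree.
        j = min(best, key=lambda idx: best[idx][0])
        d, i = best.pop(j)
        edges.append((i, j, d))
        # The newly attached node may offer shorter links to remaining nodes.
        for m in best:
            dm = math.dist(centroids[j], centroids[m])
            if dm < best[m][0]:
                best[m] = (dm, j)
    return edges


# Hypothetical metacluster centroids in two dimensions.
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 3.0)]
tree = mst_edges(centroids)
for parent, child, dist in tree:
    print(parent, child, dist)
```

A tree over n centroids always has n - 1 edges, which is why the result is convenient to draw as a single connected layout of metaclusters.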

Parameters:
  • n_clusters (int) – Number of final clusters (metaclusters) (default 50).

  • downsample_target (float) – Fraction of cells to retain for tree construction (default 0.1).

  • knn (int) – k for k-NN density estimation (default 5).

  • n_micro (int | None) – Number of microclusters. None uses min(10 * n_clusters, n_cells // 10).

  • n_consensus (int) – Number of MiniBatchKMeans runs per linkage type for consensus. Total runs = 2 * n_consensus (ward + average). Default 10.

  • transform (str | None) – Data transform: 'arcsinh', 'log', or None for no transform (default 'arcsinh').

  • cofactor (float) – Arcsinh cofactor (default 150.0).

  • backend (str) – Default plotting backend (default 'matplotlib').

  • density_estimator (BaseStep | None) – Custom density estimator step.

  • random_state (int | None) – Seed for reproducibility.
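The derived quantities mentioned in the parameter list can be checked with a few lines of plain Python. The helper names below are hypothetical and not part of densitree. The arcsinh form `asinh(x / cofactor)` is the common cytometry convention and is assumed here; the source does not spell out the formula.

```python
import math


def default_n_micro(n_clusters, n_cells):
    # Documented fallback used when n_micro is None.
    return min(10 * n_clusters, n_cells // 10)


def total_consensus_runs(n_consensus):
    # n_consensus runs for each of the two linkage types (ward + average).
    return 2 * n_consensus


def arcsinh_transform(values, cofactor=150.0):
    # Assumed convention: asinh(x / cofactor).
    return [math.asinh(v / cofactor) for v in values]


print(default_n_micro(50, 100_000))  # 500: capped by 10 * n_clusters
print(default_n_micro(50, 2_000))    # 200: capped by n_cells // 10
print(total_consensus_runs(10))      # 20
print(arcsinh_transform([0.0, 150.0, 1500.0]))
```

With the default n_clusters of 50, the n_cells // 10 cap only takes effect for datasets smaller than 5,000 cells.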

fit(X: ndarray | DataFrame) → SPADE

Fit SPADE to data.

fit_predict(X: ndarray | DataFrame) → ndarray

Fit and return cluster labels for all cells.