Benchmark Results¶
All results averaged over 3 runs with different random seeds. Metrics computed against manually gated ground truth labels.
Levine_32dim (real CyTOF data)¶
104,184 gated cells, 32 markers, 14 populations (7 rare).
| Method | ARI | NMI | Rare F1 | Runtime (s) |
|---|---|---|---|---|
| densitree | 0.942 +/- 0.004 | 0.930 +/- 0.002 | 0.481 +/- 0.047 | 4.0 |
| FlowSOM-style | 0.934 +/- 0.011 | 0.920 +/- 0.008 | 0.536 +/- 0.009 | 0.1 |
| FlowSOM (official) | 0.914 +/- 0.011 | 0.914 +/- 0.003 | 0.563 +/- 0.031 | 3.6 |
| PhenoGraph-style | 0.908 +/- 0.000 | 0.906 +/- 0.001 | 0.221 +/- 0.000 | 88.0 |
| KMeans | 0.569 +/- 0.025 | 0.802 +/- 0.009 | 0.379 +/- 0.043 | 1.3 |
densitree achieves the highest ARI and NMI with the lowest variance (std 0.004) of any method tested. Its consensus overclustering approach combines the accuracy of exhaustive clustering with the stability of ensemble methods.
Synthetic dataset¶
50,000 cells, 15 features, 12 populations (3 rare, hierarchically structured).
| Method | ARI | NMI | Rare F1 | Runtime (s) |
|---|---|---|---|---|
| PhenoGraph-style | 0.743 +/- 0.002 | 0.843 +/- 0.001 | 0.588 +/- 0.125 | 12.7 |
| Agglomerative | 0.694 +/- 0.004 | 0.772 +/- 0.000 | 0.500 +/- 0.000 | 5.5 |
| KMeans | 0.678 +/- 0.005 | 0.764 +/- 0.003 | 0.500 +/- 0.000 | 0.9 |
| densitree | 0.674 +/- 0.016 | 0.765 +/- 0.005 | 0.500 +/- 0.000 | 6.1 |
| FlowSOM (official) | 0.584 +/- 0.013 | 0.745 +/- 0.006 | 0.500 +/- 0.000 | 1.7 |
| FlowSOM-style | 0.584 +/- 0.033 | 0.732 +/- 0.012 | 0.500 +/- 0.000 | 0.0 |
Key takeaways¶
-
densitree achieves state-of-the-art accuracy on the Levine_32dim benchmark (ARI 0.942), surpassing FlowSOM (0.934) and PhenoGraph (0.908). Its consensus overclustering approach (multiple MiniBatchKMeans runs + ward/average linkage ensemble + Hungarian-aligned majority vote) produces both high accuracy and extremely low variance.
-
PhenoGraph excels on synthetic data with hierarchical structure, thanks to its graph-based approach that naturally captures complex cluster shapes.
-
FlowSOM is the fastest method tested (0.1s), making it ideal for interactive exploration. densitree trades speed for accuracy and stability.
-
densitree uniquely combines clustering accuracy with tree structure -- the density-dependent downsampling and MST construction are preserved for rare population visualization, while the consensus clustering produces accurate flat labels.
-
Runtime: FlowSOM-style < KMeans << densitree ~ FlowSOM official << PhenoGraph.