Benchmark Results ================= All results averaged over 3 runs with different random seeds. Metrics computed against manually gated ground truth labels. Levine_32dim (real CyTOF data) -------------------------------- 104,184 gated cells, 32 markers, 14 populations (7 rare). .. list-table:: :header-rows: 1 :widths: auto * - Method - ARI - NMI - Rare F1 - Runtime (s) * - **densitree** - **0.942** ± 0.004 - **0.930** ± 0.002 - 0.481 ± 0.047 - 4.0 * - FlowSOM-style - 0.934 ± 0.011 - 0.920 ± 0.008 - 0.536 ± 0.009 - **0.1** * - FlowSOM (official) - 0.914 ± 0.011 - 0.914 ± 0.003 - **0.563** ± 0.031 - 3.6 * - PhenoGraph-style - 0.908 ± 0.000 - 0.906 ± 0.001 - 0.221 ± 0.000 - 88.0 * - KMeans - 0.569 ± 0.025 - 0.802 ± 0.009 - 0.379 ± 0.043 - 1.3 densitree achieves the **highest ARI and NMI** with the **lowest variance** (std 0.004) of any method tested. ---- Synthetic dataset ----------------- 50,000 cells, 15 features, 12 populations (3 rare, hierarchically structured). .. list-table:: :header-rows: 1 :widths: auto * - Method - ARI - NMI - Rare F1 - Runtime (s) * - **PhenoGraph-style** - **0.743** ± 0.002 - **0.843** ± 0.001 - **0.588** ± 0.125 - 12.7 * - Agglomerative - 0.694 ± 0.004 - 0.772 ± 0.000 - 0.500 ± 0.000 - 5.5 * - KMeans - 0.678 ± 0.005 - 0.764 ± 0.003 - 0.500 ± 0.000 - 0.9 * - **densitree** - 0.674 ± 0.016 - 0.765 ± 0.005 - 0.500 ± 0.000 - 6.1 * - FlowSOM (official) - 0.584 ± 0.013 - 0.745 ± 0.006 - 0.500 ± 0.000 - 1.7 * - FlowSOM-style - 0.584 ± 0.033 - 0.732 ± 0.012 - 0.500 ± 0.000 - 0.0 ---- Key takeaways ------------- 1. **densitree achieves state-of-the-art accuracy** on the Levine_32dim benchmark (ARI 0.942), surpassing FlowSOM (0.934) and PhenoGraph (0.908). 2. **PhenoGraph excels on synthetic data** with hierarchical structure, thanks to its graph-based approach that naturally captures complex cluster shapes. 3. **FlowSOM is the fastest** method tested (0.1s), making it ideal for interactive exploration. densitree trades speed for accuracy and stability. 4. **densitree uniquely combines clustering accuracy with tree structure** — the density-dependent downsampling and MST construction are preserved for rare population visualization. 5. **Runtime**: FlowSOM-style < KMeans << densitree ~ FlowSOM official << PhenoGraph.