This crate provides an efficient implementation of the newly invented DartUniFrac algorithm for large-scale UniFrac computation (weighted and unweighted). We named this new algorithm DartUniFrac because the key step is to use DartMinHash or Efficient Rejection Sampling (or ERS) on branches and the DartMinHash/ERS is about "Among the first r darts thrown, return those hitting
On Linux or MacOS
conda install -c bioconda -c conda-forge dartunifrac
Run on example data
wget https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.3/GWMC_16S_otutab.biom
wget https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.3/GWMC_rep_seqs_all.tre
dartunifrac -t ./GWMC_rep_seqs_all.tre -b ./GWMC_16S_otutab.biom -m dmh -s 3072 -o unifrac_unweighted.tsv
dartunifrac -t ./GWMC_rep_seqs_all.tre -b ./GWMC_16S_otutab.biom --weighted -m dmh -s 3072 -o unifrac_weighted.tsv
### obtain the truth via striped unifrac algorithm (SIMD supported), extremely slow at the million-sample scale
striped_unifrac -t ./GWMC_rep_seqs_all.tre -m ./GWMC_16S_otutab.biom --weighted -o unifrac_weighted_striped.tsv
UniFrac (unweighted) can be simply described as unique branches that differ two samples over shared branches (see original UniFrac paper here). Here, each sample has some taxa (or features) that are in the phylogenetic tree.
If we reformulate UniFrac in math, it can be descriped below:
Therefore:
since
here,
In summary, unweighted UniFrac distance can be considered as weighted Jaccard distance on branches. A fast PCoA on the resulting DartUniFrac distance can also be computed. We rely on fixed rank subspace iteration style randomized SVD, see below.
Weighted UniFrac measuers the mass accumulate (taxa weight) over branches(see here). Here, Weighted UniFrac is the normalized weighted UniFrac. A similar idea can be used to derive that Weighted UniFrac is a simple transformation from weighted Jaccard
where
We first created a few libraries for the best performance of DartUniFrac implementation.
1.Optimal representation of balanced parenthesis for phylogenetic trees via succparen
2.Implementation of DartMinHash and Efficient Rejection Sampling algorithms can be found here.
3.SIMD-aware Hamming similarity for computing hash collision probability of sketches, anndists
4.A fast, in-memory Principle Coordinate Analysis (PCoA) based on randomized SVD (subspace iteration type SVD), fpcoa
wget https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.3/dartunifrac_Linux_x86-64_v0.2.3.zip
unzip dartunifrac_Linux_x86-64_v0.2.3.zip
chmod a+x ./dartunifrac
./dartunifrac -h
## install homebrew first: https://brew.sh
brew tap jianshu93/DartUniFrac
brew install DartUniFrac
dartunifrac -h
CMake needs to be installed first, see guidance here
git clone https://github.com/jianshu93/DartUniFrac.git
cd DartUniFrac
#### You must have CMake installed to compile HDF5 from source. This is for BIOM format input. You also need zstd installed and library files in system path for compression.
### Linux
cargo build --release --features intel-mkl-static,stdsimd
### macos
cargo build --release --features macos-accelerate,stdsimd
DartUniFrac will use all availble CPU cores/threads via Rayon by default.
$ ./target/release/dartunifrac -h
************** initializing logger *****************
DartUniFrac: Approximate unweighted UniFrac via Weighted MinHash 🎯🎯🎯
Usage: dartunifrac [OPTIONS] --tree <tree> <--input <input>|--biom <biom>>
Options:
-t, --tree <tree> Input tree in Newick format
-i, --input <input> OTU/Feature table in TSV format
-b, --biom <biom> OTU/Feature table in BIOM (HDF5) format
-o, --output <output> Output distance matrix in TSV format [default: unifrac.tsv]
--weighted Weighted UniFrac (normalized)
-s, --sketch <sketch-size> Sketch size for Weighted MinHash (DartMinHash or ERS) [default: 2048]
-m, --method <method> Sketching method: dmh (DartMinHash) or ers (Efficient Rejection Sampling) [default: dmh] [possible values: dmh, ers]
-l, --length <seq-length> Per-hash independent random sequence length for ERS, must be >= 1024 [default: 4096]
-T, --threads <threads> Number of threads, default all logical cores
--seed <seed> Random seed for reproducibility [default: 1337]
--compress Compress output with zstd, .zst suffix will be added to the output file name
--pcoa Fast Principle Coordinate Analysis based on Randomized SVD (subspace iteration), output saved to pcoa.txt/ordination.txt
--streaming Streaming the distance matrix while computing (zstd-compressed)
--block <block> Number of rows per chunk, streaming mode only
-h, --help Print help
-V, --version Print version
### DartMinHash, biom format input, unweighted
dartunifrac -t ./ASVs_aligned.tre -b ./ASVs_counts.biom -m dmh -s 2048 -o unifrac_dmh_dist.csv
### DartMinHash, biom format input, weighted
dartunifrac -t ./ASVs_aligned.tre -b ./ASVs_counts.biom --weighted -m dmh -s 2048 -o unifrac_dmh_dist.csv
### tsv/txt tabular input
dartunifrac -t ./data/ASVs_aligned.tre -i ./data/ASVs_counts.txt -m dmh -s 2048 -o unifrac_dmh_dist.csv
### Efficient Rejection Sampling
dartunifrac -t ./data/ASVs_aligned.tre -b ./data/ASVs_counts.biom -m ers -s 2048 -l 4096 -o unifrac_ers_dist.csv
### dartMinHash with pcoa and compressed output distance matrix
dartunifrac -t ./data/ASVs_aligned.tre -b ./data/ASVs_counts.biom -m dmh -s 2048 -o unifrac_dmh_dist.tsv --pcoa --compress
### Streaming mode (reduce memory requirement) for large number of samples. Block size is normally 1/50 to 1/5 of sample size
dartunifrac -t ./data/ASVs_aligned.tre -b ./data/ASVs_counts.biom -m dmh -s 2048 -o unifrac_dmh_dist.tsv --streaming --block 8192
We provide Nvidia GPU support via CUDA (CUDA v0.12.0 or later must be installed and in system library path) on Linux (x86-64 tested). It will fall back to CPU if no GPU device is detected. Only the Hamming distance computation step benefits from GPU. It is optimized for Nvidia A100 but also works for RTX series.
### get the binary
wegt https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.3/dartunifrac-cuda_Linux_x86-64_v0.2.3.zip
unzip dartunifrac-cuda_Linux_x86-64_v0.2.3.zip
chmod a+x ./dartunifrac-cuda
RUST_LOG=info ./dartunifrac-cuda -t ./GWMC_rep_seqs_all.tre -b ./GWMC_16S_otutab.biom -m dmh -s 3072 --weighted -o unifrac_weighted.cuda.tsv
A better way is to compile from source.
git clone --branch DartUniFrac-GPU https://github.com/jianshu93/DartUniFrac.git
cd DartUniFrac
cargo build --release --features intel-mkl-static,stdsimd,cuda
./target/release/dartunifrac -h
Speed benchmark for 50k samples, 4 Nvidia RTX6000 was available
$ RUST_LOG=info dartunifrac-cuda -t ./ag_emp.tre -b ./ag_emp_even500.biom --weighted -m dmh -s 3072 -o ag_emp_dmh_RTX.tsv
************** initializing logger *****************
[2025-09-17T20:01:21Z INFO dartunifrac] Logger initialized from default environment
[2025-09-17T20:01:22Z INFO dartunifrac] 24 threads will be used
[2025-09-17T20:01:22Z INFO dartunifrac] method=dmh k=3072 seed=1337
[2025-09-17T20:01:22Z INFO dartunifrac] Weighted mode
[2025-09-17T20:01:27Z INFO dartunifrac] nodes = 614110 leaves = 307055
[2025-09-17T20:01:28Z INFO dartunifrac] transposing BIOM CSR→CSC …
[2025-09-17T20:01:28Z INFO dartunifrac] building per-sample weighted sets from BIOM (CSC) …
[2025-09-17T20:01:29Z INFO dartunifrac] built weighted sets in 919 ms
[2025-09-17T20:01:30Z INFO dartunifrac] active edges = 545049 (from 614110 total, 307055 leaves)
[2025-09-17T20:01:30Z INFO dartunifrac] sketching starting...
[2025-09-17T20:01:37Z INFO dartunifrac] sketching done.
[2025-09-17T20:01:38Z INFO dartunifrac] CUDA detected (4 devices). Computing pairwise distances on GPUs …
[2025-09-17T20:02:01Z INFO dartunifrac] pairwise distances (GPU) in 23305 ms
[2025-09-17T20:02:01Z INFO dartunifrac] Writing uncompressed output → ag_emp_dmh_RTX.tsv
[2025-09-17T20:02:56Z INFO dartunifrac] Done → ag_emp_dmh_RTX.tsv
2 Nividia A100
$ RUST_LOG=info dartunifrac-cuda -t ./ag_emp.tre -b ./ag_emp_even500.biom --weighted -m dmh -s 3072 -o ag_emp.weighted.dmh.cuda.tsv --compress --pcoa
************** initializing logger *****************
[2025-09-07T05:33:22Z INFO dartunifrac] Logger initialized from default environment
[2025-09-07T05:33:22Z INFO dartunifrac] 8 threads will be used
[2025-09-07T05:33:22Z INFO dartunifrac] method=dmh k=3072 seed=1337
[2025-09-07T05:33:22Z INFO dartunifrac] Weighted mode
[2025-09-07T05:33:26Z INFO dartunifrac] nodes = 614110 leaves = 307055
[2025-09-07T05:33:27Z INFO dartunifrac] transposing BIOM CSR→CSC …
[2025-09-07T05:33:27Z INFO dartunifrac] building per-sample weighted sets from BIOM (CSC) …
[2025-09-07T05:33:28Z INFO dartunifrac] built weighted sets in 936 ms
[2025-09-07T05:33:28Z INFO dartunifrac] active edges = 545049 (from 614110 total, 307055 leaves)
[2025-09-07T05:33:28Z INFO dartunifrac] sketching starting...
[2025-09-07T05:33:43Z INFO dartunifrac] sketching done.
[2025-09-07T05:33:43Z INFO dartunifrac] CUDA detected (2 devices). Computing pairwise distances on GPUs …
[2025-09-07T05:34:03Z INFO dartunifrac] pairwise distances (GPU) in 19668 ms
[2025-09-07T05:34:03Z INFO dartunifrac] Writing compressed (zstd) output → ag_emp.weighted.dmh.cuda.tsv.zst
1.A distance matrix (see also benchmark section) and pcoa stype output. Distance matrix:
Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | 0.0 | 0.40307617187 | 0.36596679687 | 0.36010742187 | 0.35815429687 | 0.530273437 |
Orwoll_BI0056_BI | 0.40307617187 | 0.0 | 0.3901367187 | 0.40747070312 | 0.3398437 | 0.5209960937 |
Orwoll_BI0131_BI | 0.36596679687 | 0.3901367187 | 0.0 | 0.41577148437 | 0.40014648437 | 0.56860351562 |
Orwoll_BI0153_BI | 0.36010742187 | 0.40747070312 | 0.41577148437 | 0.0 | 0.260742187 | 0.422851562 |
Orwoll_BI0215_BI | 0.35815429687 | 0.3398437 | 0.40014648437 | 0.260742187 | 0.0 | 0.45825195312 |
Orwoll_BI0353_BI | 0.530273437 | 0.5209960937 | 0.56860351562 | 0.422851562 | 0.45825195312 | 0.0 |
2.pcoa.txt (simple):
PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | -0.12912805176115577 | -0.13478145930844512 | 0.042380530554922635 | 0.17338355657990664 | -0.003431287906286917 | 0.0 |
Orwoll_BI0056_BI | -0.09201950670064461 | 0.23750165032060416 | -0.030767869352507412 | 0.05365540164234924 | -0.04135676583746806 | 0.0 |
Orwoll_BI0131_BI | -0.18672940644255911 | -0.01880872256563759 | 0.15163198033883335 | -0.1408718693633402 | 0.013611393370684486 | 0.0 |
Orwoll_BI0153_BI | 0.06843838458354137 | -0.09791766405635079 | -0.12381078744019998 | -0.07508375468747691 | -0.10416130668711489 | 0.0 |
Orwoll_BI0215_BI | -0.003937706919220717 | -0.00814970660379848 | -0.15251179756497754 | -0.028692966865033972 | 0.11819355132475588 | 0.0 |
Orwoll_BI0353_BI | 0.3433762872400387 | 0.022155902213627746 | 0.11307794346392891 | 0.017609632693595324 | 0.017144415735429568 | 0.0 |
PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | |
---|---|---|---|---|---|---|
proportion_explained | 0.4233339721296298 | 0.19721082825095754 | 0.17875499078496368 | 0.13806192702856432 | 0.0626382818058847 | 0.0 |
3.ordination.txt (scikit-bio/qiime format):
Eigvals 6
0.18261610725439215 0.0850720143777559 0.07711060939718065 0.059556599124712034 0.027020650223156684 0.0
Proportion explained 6
0.42333397212962964 0.19721082825095754 0.17875499078496382 0.13806192702856435 0.06263828180588471 0.0
Species 0 0
Site 6 6
Orwoll_BI0023_BI -0.12912805176115583 -0.1347814593084449 0.04238053055492221 0.1733835565799069 -0.0034312879062869286 0.0
Orwoll_BI0056_BI -0.09201950670064456 0.23750165032060455 -0.030767869352506142 0.05365540164234891 -0.0413567658374681 0.0
Orwoll_BI0131_BI -0.18672940644255914 -0.018808722565638855 0.15163198033883318 -0.14087186936334042 0.013611393370684503 0.0
Orwoll_BI0153_BI 0.06843838458354136 -0.0979176640563503 -0.1238107874402007 -0.07508375468747673 -0.104161306687115 0.0
Orwoll_BI0215_BI -0.003937706919220702 -0.008149706603797708 -0.15251179756497782 -0.028692966865033843 0.11819355132475592 0.0
Orwoll_BI0353_BI 0.3433762872400386 0.02215590221362715 0.11307794346392917 0.017609632693595185 0.017144415735429578 0.0
Biplot 0 0
Site constraints 0 0
the ordination.txt file can be imported into qiime2 like this:
qiime tools import --input-path ordination.txt --type 'PCoAResults' --output-path pcoa.qza
You can then visulize it via qiime emperor plot.
We use Striped UniFrac algorithm as the ground truth, which is an exact and efficient algorithm for large number of samples. A pure Rust implementaion, as a supporting crate for this one, can be found here, also included as a binary in this crate.
For the testing data (ASVs_count.tsv and ASV_aligned.tre), the truth from Striped UniFrac (unweighted) is:
Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | 0 | 0.403847873210907 | 0.3646169304847717 | 0.366204708814621 | 0.3484474122524261 | 0.5317433476448059 |
Orwoll_BI0056_BI | 0.403847873210907 | 0 | 0.3883068859577179 | 0.4069649279117584 | 0.3338068425655365 | 0.5172212719917297 |
Orwoll_BI0131_BI | 0.3646169304847717 | 0.3883068859577179 | 0 | 0.4222687482833862 | 0.4006152451038361 | 0.5693299174308777 |
Orwoll_BI0153_BI | 0.366204708814621 | 0.4069649279117584 | 0.4222687482833862 | 0 | 0.2608364820480347 | 0.4297698140144348 |
Orwoll_BI0215_BI | 0.3484474122524261 | 0.3338068425655365 | 0.4006152451038361 | 0.2608364820480347 | 0 | 0.4572696685791016 |
Orwoll_BI0353_BI | 0.5317433476448059 | 0.5172212719917297 | 0.5693299174308777 | 0.4297698140144348 | 0.4572696685791016 | 0 |
DartUniFrac estimation (DartMinHash, unweighted) is:
Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | 0.0 | 0.40307617187 | 0.36596679687 | 0.36010742187 | 0.35815429687 | 0.530273437 |
Orwoll_BI0056_BI | 0.40307617187 | 0.0 | 0.3901367187 | 0.40747070312 | 0.3398437 | 0.5209960937 |
Orwoll_BI0131_BI | 0.36596679687 | 0.3901367187 | 0.0 | 0.41577148437 | 0.40014648437 | 0.56860351562 |
Orwoll_BI0153_BI | 0.36010742187 | 0.40747070312 | 0.41577148437 | 0.0 | 0.260742187 | 0.422851562 |
Orwoll_BI0215_BI | 0.35815429687 | 0.3398437 | 0.40014648437 | 0.260742187 | 0.0 | 0.45825195312 |
Orwoll_BI0353_BI | 0.530273437 | 0.5209960937 | 0.56860351562 | 0.422851562 | 0.45825195312 | 0.0 |
DartUniFrac estimation (Efficient Rejection Sampling, uniweghted) is:
Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | 0.0 | 0.4101562 | 0.3657226562 | 0.3618164062 | 0.3569335937 | 0.52539062 |
Orwoll_BI0056_BI | 0.4101562 | 0.0 | 0.40039062 | 0.4086914062 | 0.340820312 | 0.5190429687 |
Orwoll_BI0131_BI | 0.3657226562 | 0.40039062 | 0.0 | 0.4360351562 | 0.4145507812 | 0.570312 |
Orwoll_BI0153_BI | 0.3618164062 | 0.4086914062 | 0.4360351562 | 0.0 | 0.264648437 | 0.4311523437 |
Orwoll_BI0215_BI | 0.3569335937 | 0.340820312 | 0.4145507812 | 0.264648437 | 0.0 | 0.460937 |
Orwoll_BI0353_BI | 0.52539062 | 0.5190429687 | 0.570312 | 0.4311523437 | 0.460937 | 0.0 |
For the testing data (ASVs_count.tsv and ASV_aligned.tre), the truth from Striped UniFrac (weighted_normalized) is:
Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | 0 | 0.3701851367950439 | 0.3719282746315002 | 0.2419339865446091 | 0.235030472278595 | 0.2684231698513031 |
Orwoll_BI0056_BI | 0.3701851367950439 | 0 | 0.4768573343753815 | 0.3429801762104034 | 0.2965413630008698 | 0.3132076263427734 |
Orwoll_BI0131_BI | 0.3719282746315002 | 0.4768573343753815 | 0 | 0.3818131387233734 | 0.3768087327480316 | 0.4305019080638885 |
Orwoll_BI0153_BI | 0.2419339865446091 | 0.3429801762104034 | 0.3818131387233734 | 0 | 0.1378696262836456 | 0.1329282969236374 |
Orwoll_BI0215_BI | 0.235030472278595 | 0.2965413630008698 | 0.3768087327480316 | 0.1378696262836456 | 0 | 0.139195516705513 |
Orwoll_BI0353_BI | 0.2684231698513031 | 0.3132076263427734 | 0.4305019080638885 | 0.1329282969236374 | 0.139195516705513 | 0 |
DartUniFrac estimation (DartMinHash, weighted) is:
Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | 0 | 0.3726541554959786 | 0.3562913907284768 | 0.2412121212121212 | 0.24878048780487805 | 0.26889714993804215 |
Orwoll_BI0056_BI | 0.3726541554959786 | 0 | 0.4566145092460882 | 0.33159947984395316 | 0.2832080200501253 | 0.30612244897959184 |
Orwoll_BI0131_BI | 0.3562913907284768 | 0.4566145092460882 | 0 | 0.38565629228687415 | 0.3781965006729475 | 0.4301675977653631 |
Orwoll_BI0153_BI | 0.2412121212121212 | 0.33159947984395316 | 0.38565629228687415 | 0 | 0.1265126512651265 | 0.1277533039647577 |
Orwoll_BI0215_BI | 0.24878048780487805 | 0.2832080200501253 | 0.3781965006729475 | 0.1265126512651265 | 0 | 0.13399778516057587 |
Orwoll_BI0353_BI | 0.26889714993804215 | 0.30612244897959184 | 0.4301675977653631 | 0.1277533039647577 | 0.13399778516057587 | 0 |
DartUniFrac estimation (ERS, weighted) is:
Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
---|---|---|---|---|---|---|
Orwoll_BI0023_BI | 0.0 | 0.37403555853740356 | 0.3540495867768595 | 0.24612108305445696 | 0.2322503008423586 | 0.26929036256585065 |
Orwoll_BI0056_BI | 0.37403555853740356 | 0.0 | 0.46810035842293907 | 0.3671562082777036 | 0.31324142353318374 | 0.3333333333333333 |
Orwoll_BI0131_BI | 0.3540495867768595 | 0.46810035842293907 | 0.0 | 0.38331644714623436 | 0.3735747820254863 | 0.4301675977653631 |
Orwoll_BI0153_BI | 0.24612108305445696 | 0.3671562082777036 | 0.38331644714623436 | 0.0 | 0.14381457693381736 | 0.1265126512651265 |
Orwoll_BI0215_BI | 0.2322503008423586 | 0.31324142353318374 | 0.3735747820254863 | 0.14381457693381736 | 0.0 | 0.13936022253129346 |
Orwoll_BI0353_BI | 0.26929036256585065 | 0.3333333333333333 | 0.4301675977653631 | 0.1265126512651265 | 0.13936022253129346 | 0.0 |
The best L for achiving a given accuracy is related to the sparsity of the data (see ERS paper here). The author recommended an equation for L:
RUST_LOG=info ./target/release/striped_unifrac -t ./data/ASVs_aligned.tre -i ./data/AS
Vs_counts.txt -o ASVs_striped_unifac_dist.tsv
You will see some log like this:
************** initializing logger *****************
[2025-08-21T20:12:38Z INFO striped_unifrac] logger initialized from default environment
[2025-08-21T20:12:38Z INFO striped_unifrac] Total branches with positive length: 923
[2025-08-21T20:12:38Z INFO striped_unifrac] Start parsing input.
[2025-08-21T20:12:38Z INFO striped_unifrac] phase-1 masks built 0 ms
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 0: relevant branches = 689 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 1: relevant branches = 594 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 2: relevant branches = 646 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 3: relevant branches = 584 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 4: relevant branches = 647 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 5: relevant branches = 468 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] phase-2 sparse lists built (1 strips)
[2025-08-21T20:12:38Z INFO striped_unifrac] phase-3 block pass 0 ms
[2025-08-21T20:12:38Z INFO striped_unifrac] Start writing output.
This is a dense example where
We want to thank Otmar Ertl and Xiaoyun Li for their helpful comments on DartMinHash and Efficient Rejection Sampling, respectively. We want to thank Yuhan(Sherlyn) Weng for helping with DartUniFrac logo design.
Paper to come