Data Bubbles: Quality Preserving Performance Boosting for Hierarchical Clustering
Summary: Introduces Data Bubbles, a compression-based pipeline for scaling OPTICS to large data sets: compress the data to a small set of representatives, cluster the representatives, then infer the clustering of the full data set. The paper addresses three failure modes of naively combining sampling or BIRCH with OPTICS, using post-processing and the Data Bubble concept, so that clustering quality is largely preserved even at high compression. (summarized by gpt-5-nano on Feb 09 2026)
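The three-step pipeline in the summary (compress, cluster the compressed data, infer the full clustering) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names are hypothetical, representatives are chosen by plain random sampling, and a simple distance-threshold grouping stands in for OPTICS. It also omits the Data Bubble statistics and the post-processing that the summary credits with preserving quality, so it corresponds to the naive baseline rather than the improved method.

```python
import math
import random


def compress(points, k, seed=0):
    """Sample k representatives and assign every point to its nearest one."""
    rng = random.Random(seed)
    reps = rng.sample(points, k)
    members = [[] for _ in reps]
    for idx, p in enumerate(points):
        nearest = min(range(k), key=lambda j: math.dist(p, reps[j]))
        members[nearest].append(idx)
    return reps, members


def cluster_representatives(reps, eps):
    """Stand-in for OPTICS (assumption): connected components of
    representatives that lie within eps of each other."""
    labels = [-1] * len(reps)
    current = 0
    for start in range(len(reps)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(reps)):
                if labels[j] == -1 and math.dist(reps[i], reps[j]) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def expand(n_points, members, rep_labels):
    """Give every original point the cluster label of its representative."""
    labels = [-1] * n_points
    for rep_idx, idxs in enumerate(members):
        for i in idxs:
            labels[i] = rep_labels[rep_idx]
    return labels


def compressed_clustering(points, k, eps, seed=0):
    """Compress, cluster the representatives, then label the full data set."""
    reps, members = compress(points, k, seed)
    rep_labels = cluster_representatives(reps, eps)
    return expand(len(points), members, rep_labels)
```

Only the `k` representatives are passed to the clustering step, which is where the speed-up comes from; every original point then inherits the label of its nearest representative.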
Authors
1. Markus M. Breunig
2. Hans-Peter Kriegel
3. Peer Kröger
4. Jörg Sander
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,799 | Incremental and Effective Data Summarization for Dynamic Hierarchical Clustering | 2004 | SIGMOD | 4.9232394e-05 |
| 6,883 | C2P: Clustering based on Closest Pairs | 2001 | VLDB | 4.8960306e-05 |
| 12,623 | Data Bubbles for Non-Vector Data: Speeding-up Hierarchical Clustering in Arbitrary Metric Spaces | 2003 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 33 | BIRCH: An Efficient Data Clustering Method for Very Large Databases | 1996 | SIGMOD | 0.00077324389 |
| 270 | OPTICS: Ordering Points To Identify the Clustering Structure | 1999 | SIGMOD | 0.00029505642 |