Database Paper Browser

BIRCH: An Efficient Data Clustering Method for Very Large Databases

Summary: BIRCH is an incremental, memory-efficient clustering method for very large databases, and the first clustering method from the database area to handle noise effectively. It is efficient in both time and space, produces good clusterings with a single scan of the data, and outperforms CLARANS on large datasets. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 2875
Venue: SIGMOD
Year: 1996
Pagerank: 0.00077324389
Overall Rank: 33 | 99.78%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 50 of 84 citing papers.

Rank Citing Paper Year Venue Pagerank
161 LOF: Identifying Density-Based Local Outliers 2000 SIGMOD 0.00039846974
207 Storing Semistructured Data with STORED 1999 SIGMOD 0.00034611968
270 OPTICS: Ordering Points To Identify the Clustering Structure 1999 SIGMOD 0.00029505642
277 Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications 1998 SIGMOD 0.00029311426
341 CURE: An Efficient Clustering Algorithm for Large Databases 1998 SIGMOD 0.00026810548
662 A Framework for Clustering Evolving Data Streams 2003 VLDB 0.00018475968
693 Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences 1997 SIGMOD 0.00018077335
701 Efficient Algorithms for Mining Outliers from Large Data Sets 2000 SIGMOD 0.00017938417
774 Algorithms for Mining Distance-Based Outliers in Large Datasets 1998 VLDB 0.00016865771
1,097 STING: A Statistical Information Grid Approach to Spatial Data Mining 1997 VLDB 0.00014119975
1,126 Trajectory Clustering: A Partition-and-Group Framework 2007 SIGMOD 0.00013821443
1,241 Multi-dimensional Selectivity Estimation Using Compressed Histogram Information 1999 SIGMOD 0.00013097578
1,336 Clustering Categorical Data: An Approach Based on Dynamical Systems 1998 VLDB 0.00012498064
1,346 Streaming Pattern Discovery in Multiple Time-Series 2005 VLDB 0.00012466288
1,372 SQLEM: Fast Clustering in SQL using the EM Algorithm 2000 SIGMOD 0.00012318334
1,477 Fine-grained Partitioning for Aggressive Data Skipping 2014 SIGMOD 0.00011770865
1,538 WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases 1998 VLDB 0.00011464884
1,595 Fast Algorithms for Projected Clustering 1999 SIGMOD 0.00011222442
1,598 Semantic Compression and Pattern Extraction with Fascicles 1999 VLDB 0.00011202905
1,806 Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces 2000 VLDB 0.00010490769
1,816 Incremental Clustering for Mining in a Data Warehousing Environment 1998 VLDB 0.0001045313
1,860 Approximation Algorithms for Clustering Uncertain Data 2008 PODS 0.0001028857
1,908 Information-Theoretic Tools for Mining Database Structure from Large Data Sets 2004 SIGMOD 0.00010126101
1,962 Plan Selection based on Query Clustering 2002 VLDB 9.950467e-05
2,093 Scalable K-Means++ 2012 VLDB 9.5588104e-05
2,096 Automatic Categorization of Query Results 2004 SIGMOD 9.5498009e-05
2,160 DEVise: Integrated Querying and Visual Exploration of Large Datasets 1997 SIGMOD 9.4065027e-05
2,377 CS2: A New Database Synopsis for Query Estimation 2013 SIGMOD 8.9402115e-05
2,404 Maintaining Variance and k–Medians over Data Stream Windows 2003 PODS 8.8837279e-05
2,661 WALRUS: A Similarity Retrieval Algorithm for Image Databases 1999 SIGMOD 8.3575285e-05
2,784 Approximate XML Joins 2002 SIGMOD 8.128931e-05
3,300 Indexing the Distance: An Efficient Method to KNN Processing 2001 VLDB 7.2516103e-05
3,376 A Monte Carlo Algorithm for Fast Projective Clustering 2002 SIGMOD 7.1630476e-05
3,419 Approximate XML Query Answers 2004 SIGMOD 7.1173416e-05
3,475 Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering 1999 VLDB 7.0614822e-05
3,593 Graph-Based Synopses for Relational Selectivity Estimation 2006 SIGMOD 6.9385476e-05
3,654 Using Trees to Depict a Forest 2009 VLDB 6.873144e-05
3,822 Association Rules over Interval Data 1997 SIGMOD 6.7263391e-05
3,916 Compressing Large Boolean Matrices Using Reordering Techniques 2004 VLDB 6.6328898e-05
4,065 AutoPlait: Automatic Mining of Co-evolving Time Sequences 2014 SIGMOD 6.4819215e-05
4,177 Density Biased Sampling: An Improved Method for Data Mining and Clustering 2000 SIGMOD 6.3835403e-05
4,269 VSS: A Storage System for Video Analytics 2021 SIGMOD 6.306798e-05
4,342 LinkClus: Efficient Clustering via Heterogeneous Semantic Links 2006 VLDB 6.2758722e-05
4,552 Outlier Detection for High Dimensional Data 2001 SIGMOD 6.0922282e-05
4,817 Clustering by Pattern Similarity in Large Data Sets 2002 SIGMOD 5.8987807e-05
5,276 The 3W Model and Algebra for Unified Data Mining 2000 VLDB 5.5905507e-05
5,324 Clustering Stream Data by Exploring the Evolution of Density Mountain 2018 VLDB 5.5691645e-05
5,760 Outlier-robust Clustering using Independent Components 2008 SIGMOD 5.3382727e-05
5,996 A New Sparse Data Clustering Method Based On Frequent Items 2023 SIGMOD 5.2415551e-05
6,093 Density-based Place Clustering in Geo-Social Networks 2014 SIGMOD 5.2131159e-05

Outgoing Citations (Sorted by Pagerank)

Showing 1 of 1 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
27 Efficient and Effective Clustering Methods for Spatial Data Mining 1994 VLDB 0.00080736878

Semantically Similar Papers