Clustering by Pattern Similarity in Large Data Sets
Summary: The paper introduces pCluster, a pattern-based similarity model that clusters objects by coherent patterns on a subset of dimensions rather than by value proximity. The model applies to gene expression analysis and collaborative filtering, and the paper presents an efficient algorithm demonstrated on real and synthetic data. (summarized by gpt-5-nano on Feb 09 2026)
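The distinction between pattern similarity and value proximity can be illustrated with a minimal sketch. It assumes the pScore measure as it is commonly stated for this model: for two objects x, y and two dimensions a, b, pScore = |(x_a - x_b) - (y_a - y_b)|, and a set of objects and dimensions is treated as a cluster when every such 2x2 submatrix has pScore at most a threshold delta. The function names, the brute-force check, and the toy matrix are hypothetical and are not the paper's efficient algorithm.

```python
from itertools import combinations


def pscore(x_a, x_b, y_a, y_b):
    """pScore of a 2x2 submatrix: how much the change of object x across
    two dimensions differs from the change of object y (assumed definition)."""
    return abs((x_a - x_b) - (y_a - y_b))


def is_delta_pcluster(data, rows, cols, delta):
    """Brute-force check that every pair of rows and pair of columns
    in the chosen submatrix has pScore <= delta."""
    for x, y in combinations(rows, 2):
        for a, b in combinations(cols, 2):
            if pscore(data[x][a], data[x][b], data[y][a], data[y][b]) > delta:
                return False
    return True


# Hypothetical toy data: rows 0-2 follow the same pattern on columns 0-2,
# shifted by a constant offset, so they are far apart in value but coherent.
data = [
    [1.0, 5.0, 3.0, 9.0],
    [11.0, 15.0, 13.0, 2.0],
    [21.0, 25.0, 23.0, 7.0],
    [4.0, 4.0, 9.0, 1.0],
]

shifted = is_delta_pcluster(data, rows=[0, 1, 2], cols=[0, 1, 2], delta=0.0)
with_noise = is_delta_pcluster(data, rows=[0, 1, 2, 3], cols=[0, 1, 2], delta=0.0)
```

Here `shifted` holds because the first three rows differ only by a constant offset on the first three columns, while `with_noise` fails once the unrelated fourth row is included. A distance-based method would not group rows 0 to 2, since their values are far apart.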
Authors
1. Haixun Wang
2. Wei Wang
3. Jiong Yang
4. Philip S. Yu
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,162 | Computing Clusters of Correlation Connected Objects | 2004 | SIGMOD | 6.3937203e-05 |
| 4,362 | triCluster: An Effective Algorithm for Mining Coherent Clusters in 3D Microarray Data | 2005 | SIGMOD | 6.2556473e-05 |
| 7,400 | Missing Value Imputation for Multi-attribute Sensor Data Streams via Message Propagation | 2024 | VLDB | 4.7397846e-05 |
| 12,408 | Detecting Clusters in Moderate-to-High Dimensional Data: Subspace Clustering, Pattern-based Clustering, and Correlation Clustering | 2008 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 27 | Efficient and Effective Clustering Methods for Spatial Data Mining | 1994 | VLDB | 0.00080736878 |
| 33 | BIRCH: An Efficient Data Clustering Method for Very Large Databases | 1996 | SIGMOD | 0.00077324389 |
| 277 | Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications | 1998 | SIGMOD | 0.00029311426 |
| 1,595 | Fast Algorithms for Projected Clustering | 1999 | SIGMOD | 0.00011222442 |
| 1,598 | Semantic Compression and Pattern Extraction with Fascicles | 1999 | VLDB | 0.00011202905 |
| 2,019 | Finding Generalized Projected Clusters in High Dimensional Spaces | 2000 | SIGMOD | 9.7707059e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,090 | Finding Near Neighbors Through Cluster Pruning | 2007 | PODS | 6.4577834e-05 |
| 7,522 | Efficient and Tunable Similar Set Retrieval | 2001 | SIGMOD | 4.7180617e-05 |
| 1,336 | Clustering Categorical Data: An Approach Based on Dynamical Systems | 1998 | VLDB | 0.00012498064 |
| 12,408 | Detecting Clusters in Moderate-to-High Dimensional Data: Subspace Clustering, Pattern-based Clustering, and Correlation Clustering | 2008 | VLDB | 4.1945683e-05 |
| 13,926 | Clustering Methods for Large Databases: From the Past to the Future | 1999 | SIGMOD | - |
| 4,162 | Computing Clusters of Correlation Connected Objects | 2004 | SIGMOD | 6.3937203e-05 |
| 2,019 | Finding Generalized Projected Clusters in High Dimensional Spaces | 2000 | SIGMOD | 9.7707059e-05 |
| 1,595 | Fast Algorithms for Projected Clustering | 1999 | SIGMOD | 0.00011222442 |
| 313 | Graph Clustering Based on Structural/Attribute Similarities | 2009 | VLDB | 0.00028097557 |
| 8,168 | Evaluating Clustering in Subspace Projections of High Dimensional Data | 2009 | VLDB | 4.5701004e-05 |