A Framework for Measuring Changes in Data Characteristics
Summary: The framework defines a model-based "deviation" measure between two datasets, computed as a distance between the models induced from them (e.g., frequent itemsets, decision trees, clusters). This unifies existing measures such as misclassification rate and the chi-squared statistic. The paper also adds statistical tests for judging whether a deviation is significant and outlines applications in change detection and model-driven data comparison. (summarized by gpt-5-mini on Feb 09 2026)
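As a rough illustration of the idea in the summary, the sketch below compares two small transaction datasets through their frequent itemsets. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names (`frequent_itemsets`, `support`, `deviation`), the brute-force itemset enumeration, the absolute-difference comparison, and the summation are all choices made here for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items
    whose support is at least min_support (brute-force enumeration)."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def deviation(d1, d2, min_support=0.5):
    """Assumed deviation: sum of absolute support differences over the
    union of itemsets that are frequent in either dataset."""
    regions = set(frequent_itemsets(d1, min_support))
    regions |= set(frequent_itemsets(d2, min_support))
    return sum(abs(support(d1, s) - support(d2, s)) for s in regions)


# Hypothetical toy data, invented for this example.
d1 = [["a", "b"], ["a", "b"], ["a"], ["b"]]
d2 = [["a"], ["a"], ["a", "c"], ["c"]]
```

Comparing a dataset with itself gives a deviation of zero, and the value grows as the two induced models diverge. Measuring both datasets over the union of their frequent itemsets means an itemset that is frequent in only one dataset still contributes to the result.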
Authors
1. Venkatesh Ganti
2. Johannes Gehrke
3. Raghu Ramakrishnan
4. Wei-Yin Loh
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,706 | A Framework for Diagnosing Changes in Evolving Data Streams | 2003 | SIGMOD | 4.9543915e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 15 of 15 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,572 | Mining Statistically Significant Connected Subgraphs in Vertex Labeled Graphs | 2014 | SIGMOD | 5.005963e-05 |
| 2,807 | A Model-based Approach to Attributed Graph Clustering | 2012 | SIGMOD | 8.0905959e-05 |
| 5,755 | A Framework for Clustering Uncertain Data | 2015 | VLDB | 5.3402052e-05 |
| 12,408 | Detecting Clusters in Moderate-to-High Dimensional Data: Subspace Clustering, Pattern-based Clustering, and Correlation Clustering | 2008 | VLDB | 4.1945683e-05 |
| 744 | Beyond Market Baskets: Generalizing Association Rules to Correlations | 1997 | SIGMOD | 0.00017333019 |
| 13,513 | Database Systems Research on Data Mining | 2010 | SIGMOD | - |
| 3,162 | Looking for Trouble: Analyzing Classifier Behavior via Pattern Divergence | 2021 | SIGMOD | 7.4589576e-05 |
| 40 | Privacy-Preserving Data Mining | 2000 | SIGMOD | 0.00074232718 |
| 8,218 | Mining Deviants in a Time Series Database | 1999 | VLDB | 4.5566051e-05 |
| 14,020 | Data Mining Techniques | 1996 | SIGMOD | - |