Information-Theoretic Tools for Mining Database Structure from Large Data Sets
Summary: Information-theoretic summaries can infer database structure when the model is unknown or incomplete, and they remain robust to noise, missing values, and duplicates. Scalable algorithms extract these summaries from large categorical data sets. Ranking functional dependencies by redundancy guides vertical decompositions that increase information content, and the approach is validated on real data. (summarized by gpt-5-nano on Feb 09 2026)
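As a rough illustration of the information-theoretic view of functional dependencies mentioned in the summary, the sketch below uses empirical conditional entropy: a dependency X -> Y holds exactly on a table when H(Y | X) = 0. This is a standard textbook measure, not the paper's own algorithm or its redundancy ranking; the function names and the toy rows are assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(rows, attrs):
    """Shannon entropy (bits) of the empirical distribution over `attrs`."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def conditional_entropy(rows, lhs, rhs):
    """H(rhs | lhs) = H(lhs, rhs) - H(lhs); zero when lhs -> rhs holds exactly."""
    return entropy(rows, lhs + rhs) - entropy(rows, lhs)

# Hypothetical toy relation: zip determines city, but not name.
rows = [
    {"zip": "10001", "city": "New York", "name": "Ann"},
    {"zip": "10001", "city": "New York", "name": "Bob"},
    {"zip": "94105", "city": "San Francisco", "name": "Cy"},
    {"zip": "94105", "city": "San Francisco", "name": "Dee"},
]

h_city = conditional_entropy(rows, ["zip"], ["city"])  # dependency holds
h_name = conditional_entropy(rows, ["zip"], ["name"])  # dependency fails
print(f"H(city | zip) = {h_city:.3f} bits")
print(f"H(name | zip) = {h_name:.3f} bits")
```

A small positive value of H(Y | X) suggests a dependency that holds only approximately, which is one way such a measure can tolerate noisy or duplicated rows.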
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 492 | Query by Output | 2009 | SIGMOD | 2.1974699e-04 |
| 1,153 | SQAK: Doing More with Keywords | 2008 | SIGMOD | 1.3642866e-04 |
| 1,762 | Tuning Schema Matching Software using Synthetic Scenarios | 2005 | VLDB | 1.0646894e-04 |
| 2,066 | DBLife: A Community Information Management Platform for the Database Research Community | 2007 | CIDR | 9.6399561e-05 |
| 3,426 | Discovering Topical Structures of Databases | 2008 | SIGMOD | 7.1063105e-05 |
| 3,467 | Data Profiling – A Tutorial | 2017 | SIGMOD | 7.069081e-05 |
| 7,571 | Reducing Ambiguity in Json Schema Discovery | 2021 | SIGMOD | 4.7075853e-05 |
| 8,044 | Information Theory for Data Management | 2010 | SIGMOD | 4.5993522e-05 |
| 10,791 | FDepHunter: Harnessing Negative Examples to Expose Fakes and Reveal Ghosts | 2025 | VLDB | 4.1945683e-05 |
| 11,366 | Statistical Schema Learning using Occam's Razor | 2022 | SIGMOD | 4.1945683e-05 |
| 12,355 | Information Theory For Data Management | 2009 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 12 of 12 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,823 | Automatic Discovery of Attributes in Relational Databases | 2011 | SIGMOD | 6.7261168e-05 |
| 894 | A Hybrid Approach to Functional Dependency Discovery | 2016 | SIGMOD | 1.5556428e-04 |
| 3,047 | Comprehensive Approach to the Design of Relational Database Schemes | 1984 | VLDB | 7.6561027e-05 |
| 25 | Dependency Inference (Extended Abstract) | 1987 | VLDB | 8.3101742e-04 |
| 14,303 | Information Theoretic Aspects Of Data Bases | 1983 | PODS | - |
| 6,180 | The Design of non-1NF Relational Databases into Nested Normal Form | 1987 | SIGMOD | 5.1686632e-05 |
| 1,047 | Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms | 2015 | VLDB | 1.4459715e-04 |
| 7,366 | Discovery Algorithms for Embedded Functional Dependencies | 2020 | SIGMOD | 4.7515248e-05 |
| 12,486 | On Redundancy vs Dependency Preservation in Normalization: An Information-Theoretic Study of 3NF | 2006 | PODS | 4.1945683e-05 |
| 3,818 | Embedded Functional Dependencies and Data-completeness Tailored Database Design | 2019 | VLDB | 6.7300958e-05 |