Discovering Topical Structures of Databases
Summary: iDisc uses a multi-strategy learning approach over both schemas and data to cluster a database's tables by topic, combining database representations through meta-clustering. The approach is extensible: it adds aggregation, clusterer boosting, a table-importance measure, and representations for semantic browsing, and it reports strong clustering accuracy on the databases evaluated. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Wensheng Wu
2. Berthold Reinwald
3. Yannis Sismanis
4. Rajesh Manjrekar
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 492 | Query by Output | 2009 | SIGMOD | 0.00021974699 |
| 1,796 | Summary Graphs for Relational Database Schemas | 2011 | VLDB | 0.00010524897 |
| 10,305 | Schuyler: Self-Supervised Clustering of Tables in Relational Databases | 2026 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 208 | Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach | 2001 | SIGMOD | 0.0003460594 |
| 475 | Mining Database Structure; Or, How to Build a Data Quality Browser | 2002 | SIGMOD | 0.00022303253 |
| 1,271 | Schema Summarization | 2006 | VLDB | 0.00012923966 |
| 1,908 | Information-Theoretic Tools for Mining Database Structure from Large Data Sets | 2004 | SIGMOD | 0.00010126101 |
| 2,549 | GORDIAN: Efficient and Scalable Discovery of Composite Keys | 2006 | VLDB | 8.5641554e-05 |
| 2,788 | Incremental Schema Matching | 2006 | VLDB | 8.1251255e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,758 | Optimization for Active Learning-based Interactive Database Exploration | 2019 | VLDB | 5.9422515e-05 |
| 13,602 | Information Discovery in Loosely Integrated Data | 2007 | SIGMOD | - |
| 2,576 | S4: Top-k Spreadsheet-Style Search for Query Discovery | 2015 | SIGMOD | 8.5112408e-05 |
| 7,274 | DiscoPG: Property Graph Schema Discovery and Exploration | 2022 | VLDB | 4.7807315e-05 |
| 9,723 | Discovering and Ranking Semantic Associations over a Large RDF Metabase | 2004 | VLDB | 4.2958329e-05 |
| 146 | Knowledge Discovery in Databases: An Attribute-Oriented Approach | 1992 | VLDB | 0.00041315295 |
| 54 | DISCOVER: Keyword Search in Relational Databases | 2002 | VLDB | 0.00066047203 |
| 7,643 | Cross Modal Data Discovery over Structured and Unstructured Data Lakes | 2023 | VLDB | 4.6901105e-05 |
| 5,529 | Data-Driven Domain Discovery for Structured Datasets | 2020 | VLDB | 5.4566641e-05 |
| 1,510 | Summarizing Relational Databases | 2009 | VLDB | 0.00011606901 |