Discovering Similarity Inclusion Dependencies
Summary: Introduces similarity inclusion dependencies, which relax inclusion dependencies (INDs) so that values need only be similar rather than identical, making foreign-key candidate discovery workable on dirty data. Sawfish is presented as the first algorithm to discover all similarity INDs. It combines traditional IND discovery with string similarity joins, a sliding window, and lazy validation, with a reported speedup of up to 6.5×. (summarized by gpt-5-nano on Feb 09 2026)
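To make the definition in the summary concrete, here is a minimal brute-force sketch of checking a single similarity IND candidate. It is not the Sawfish algorithm: it compares every pair of values, with none of the similarity-join, sliding-window, or lazy-validation techniques the summary mentions. Edit distance with a fixed threshold is assumed as the similarity measure, and the function names, column names, and sample values are all hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def holds_similarity_ind(dependent, referenced, max_distance=1):
    """True if every dependent value has some referenced value
    within max_distance edits (assumed similarity measure)."""
    referenced_values = set(referenced)
    return all(
        any(edit_distance(v, r) <= max_distance for r in referenced_values)
        for v in set(dependent)
    )


# Hypothetical columns: the last dependent value contains a typo.
orders_city = ["Berlin", "Potsdam", "Berlln"]
cities_name = ["Berlin", "Potsdam", "Hamburg"]

print(holds_similarity_ind(orders_city, cities_name, max_distance=0))  # False
print(holds_similarity_ind(orders_city, cities_name, max_distance=1))  # True
```

With a threshold of 0 the check reduces to an exact IND and fails on the typo. Allowing one edit lets the dependency hold, which is the kind of foreign-key candidate that strict IND discovery would miss on dirty data.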
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,749 | Efficient Differential Dependency Discovery | 2024 | VLDB | 4.2897489e-05 |
| 10,540 | Discovering Approximate Inclusion Dependencies | 2025 | VLDB | 4.1945683e-05 |
| 10,676 | Meaningful Data Erasure in the Presence of Dependencies | 2025 | VLDB | 4.1945683e-05 |
| 10,951 | Determining the Largest Overlap between Tables | 2024 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 560 | Dependencies Revisited for Improving Data Quality | 2008 | PODS | 0.00020141923 |
| 1,401 | Extending Dependencies with Conditions | 2007 | VLDB | 0.00012187775 |
| 1,625 | Data Profiling with Metanome | 2015 | VLDB | 0.00011094926 |
| 4,784 | Divide & Conquer-based Inclusion Dependency Discovery | 2015 | VLDB | 5.9240851e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,574 | Discovery of Genuine Functional Dependencies from Relational Data with Missing Values | 2018 | VLDB | 8.5173637e-05 |
| 9,176 | RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets | 2016 | SIGMOD | 4.383548e-05 |
| 2,077 | Efficient Discovery of Approximate Dependencies | 2018 | VLDB | 9.6001836e-05 |
| 11,546 | Making DBMSes Dependency-Aware | 2020 | CIDR | 4.1945683e-05 |
| 702 | Reasoning about Record Matching Rules | 2009 | VLDB | 0.00017918203 |
| 894 | A Hybrid Approach to Functional Dependency Discovery | 2016 | SIGMOD | 0.00015556428 |
| 4,784 | Divide & Conquer-based Inclusion Dependency Discovery | 2015 | VLDB | 5.9240851e-05 |
| 1,047 | Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms | 2015 | VLDB | 0.00014459715 |
| 10,587 | Efficient Discovery of Relaxed Functional Dependencies | 2025 | VLDB | 4.1945683e-05 |
| 10,540 | Discovering Approximate Inclusion Dependencies | 2025 | VLDB | 4.1945683e-05 |