Database Paper Browser

Discovering Similarity Inclusion Dependencies

Summary: Introduces similarity inclusion dependencies (similarity INDs), which relax inclusion dependencies (INDs) so that values only need to be similar rather than identical, making foreign-key candidate discovery workable on dirty data. Sawfish is presented as the first algorithm to discover all similarity INDs; it combines traditional IND discovery with string similarity joins, a sliding window, and lazy validation, with a reported speedup of up to 6.5×. (summarized by gpt-5-nano on Feb 09 2026)
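
To make the relaxed definition concrete, the following is a minimal sketch of a naive similarity IND check. It is not Sawfish and uses none of its optimizations (no similarity join, sliding window, or lazy validation). It assumes edit distance as the similarity measure and a threshold of 1; the function names, column names, and sample values are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def holds_similarity_ind(dependent, referenced, max_distance=1):
    """Naive check: every dependent value must have a referenced value
    within max_distance edits. Quadratic in the number of distinct values."""
    referenced_values = set(referenced)
    for value in set(dependent):
        if value in referenced_values:
            continue  # exact match, as in a traditional IND
        if not any(edit_distance(value, r) <= max_distance
                   for r in referenced_values):
            return False
    return True


# Hypothetical dirty data: "Berlln" is a typo of "Berlin".
orders_city = ["Berlin", "Potsdam", "Hamburg", "Berlln"]
cities_name = ["Berlin", "Hamburg", "Potsdam", "Munich"]

strict = set(orders_city) <= set(cities_name)  # traditional IND check
relaxed = holds_similarity_ind(orders_city, cities_name, max_distance=1)
print(strict, relaxed)
```

In this sketch the traditional check rejects the column pair because of a single typo, while the relaxed check accepts it. The pairwise comparison is what makes the naive approach expensive on large columns.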

Paper ID: 6578
Venue: SIGMOD
Year: 2023
Pagerank: 4.4234478e-05
Overall Rank: 8,949 | 37.75%
DOI: 10.1145/3588929

Authors

Incoming Citations (Sorted by Pagerank)

Showing 4 of 4 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
9,749 | Efficient Differential Dependency Discovery | 2024 | VLDB | 4.2897489e-05
10,540 | Discovering Approximate Inclusion Dependencies | 2025 | VLDB | 4.1945683e-05
10,676 | Meaningful Data Erasure in the Presence of Dependencies | 2025 | VLDB | 4.1945683e-05
10,951 | Determining the Largest Overlap between Tables | 2024 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 4 of 4 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
560 | Dependencies Revisited for Improving Data Quality | 2008 | PODS | 0.00020141923
1,401 | Extending Dependencies with Conditions | 2007 | VLDB | 0.00012187775
1,625 | Data Profiling with Metanome | 2015 | VLDB | 0.00011094926
4,784 | Divide & Conquer-based Inclusion Dependency Discovery | 2015 | VLDB | 5.9240851e-05

Semantically Similar Papers