Database Paper Browser

Discovering Top-k Relevant and Diversified Rules

Summary: The paper studies the discovery of top-k relevant and diversified Entity Enhancing Rules (REEs) for data quality. It combines a trained relevance model with four diversity measures to reduce noise. The problem is NP-hard, so the authors give a practical parallel algorithm with approximation guarantees that achieves speedups on real data. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID
6961
Venue
SIGMOD
Year
2024
Pagerank
4.2721228e-05
Overall Rank
9,847 | 31.50%
DOI
10.1145/3677131

Authors

Incoming Citations (Sorted by Pagerank)

Showing 1 of 1 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
10,308 | Efficient Partition-based Approaches for Diversified Top-k Subgraph Matching | 2026 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 29 of 29 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
49 | Consistent Query Answers in Inconsistent Databases | 1999 | PODS | 0.00067660624
192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858
214 | Scorpion: Explaining Away Outliers in Aggregate Queries | 2013 | VLDB | 0.0003363692
221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824
555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908
894 | A Hybrid Approach to Functional Dependency Discovery | 2016 | SIGMOD | 0.00015556428
942 | A Formal Approach to Finding Explanations for Database Queries | 2014 | SIGMOD | 0.00015155714
1,047 | Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms | 2015 | VLDB | 0.00014459715
1,188 | On Generating Near-Optimal Tableaux for Conditional Functional Dependencies | 2008 | VLDB | 0.00013441729
1,445 | Diversifying Top-K Results | 2012 | VLDB | 0.00011945231
1,831 | Synthesizing Entity Matching Rules by Examples | 2018 | VLDB | 0.00010384082
1,894 | Baran: Effective Error Correction via a Unified Context Representation and Transfer Learning | 2020 | VLDB | 0.0001018378
2,077 | Efficient Discovery of Approximate Dependencies | 2018 | VLDB | 9.6001836e-05
2,253 | Efficient Denial Constraint Discovery with Hydra | 2018 | VLDB | 9.1937209e-05
2,450 | Functional Dependencies for Graphs | 2016 | SIGMOD | 8.7882979e-05
2,480 | Top-k Bounded Diversification | 2012 | SIGMOD | 8.6899714e-05
2,483 | Discovery of Approximate (and Exact) Denial Constraints | 2020 | VLDB | 8.6864916e-05
3,440 | Approximate Denial Constraints | 2020 | VLDB | 7.0918817e-05
4,056 | On the Complexity of Query Result Diversification | 2013 | VLDB | 6.4883623e-05
4,127 | A Statistical Perspective on Discovering Functional Dependencies in Noisy Data | 2020 | SIGMOD | 6.4310458e-05
4,205 | Association Rules with Graph Patterns | 2015 | VLDB | 6.3597474e-05
4,807 | Diversified Top-k Graph Pattern Matching | 2013 | VLDB | 5.9092289e-05
5,854 | Diversified Top-k Subgraph Querying in a Large Graph | 2016 | SIGMOD | 5.3006473e-05
5,941 | Big Graphs: Challenges and Opportunities | 2022 | VLDB | 5.2635446e-05
6,042 | MDedup: Duplicate Detection with Matching Dependencies | 2020 | VLDB | 5.2405269e-05
6,690 | Parallel Discrepancy Detection and Incremental Detection | 2021 | VLDB | 4.9621556e-05
9,355 | Discovering Top-k Rules using Subjective and Objective Criteria | 2023 | SIGMOD | 4.3514328e-05
9,434 | Rock: Cleaning Data by Embedding ML in Logic Rules | 2024 | SIGMOD | 4.3430376e-05
9,963 | Parallel Rule Discovery from Large Datasets by Sampling | 2022 | SIGMOD | 4.2294678e-05

Semantically Similar Papers

Overall Rank | Paper | Year | Venue | Pagerank
1,808 | Top-k Query Evaluation with Probabilistic Guarantees | 2004 | VLDB | 0.00010486213
1,208 | Efficient Diversity-Aware Search | 2011 | SIGMOD | 0.00013275712
227 | Discovery of Multiple-Level Association Rules from Large Databases | 1995 | VLDB | 0.00032284058
4,904 | Temporal Rules Discovery for Web Data Cleaning | 2016 | VLDB | 5.8399195e-05
4,056 | On the Complexity of Query Result Diversification | 2013 | VLDB | 6.4883623e-05
7,287 | Discovering Association Rules from Big Graphs | 2022 | VLDB | 4.7762276e-05
732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093
10,489 | Incremental Rule Discovery in Response to Parameter Updates | 2025 | SIGMOD | 4.1945683e-05
9,963 | Parallel Rule Discovery from Large Datasets by Sampling | 2022 | SIGMOD | 4.2294678e-05
9,355 | Discovering Top-k Rules using Subjective and Objective Criteria | 2023 | SIGMOD | 4.3514328e-05