Cleaning Inconsistencies in Information Extraction via Prioritized Repairs
Summary: Declarative framework for cleaning inconsistent IE outputs by integrating prioritized repairs into document spanners, enabling user-declared conflict-resolution policies that capture industrial cleaning operations and POSIX regex semantics. Analyzes unambiguity and expressive power of such policies, with both positive and negative (decidability/complexity) results. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Ronald Fagin
2. Benny Kimelfeld
3. Frederick Reiss
4. Stijn Vansummeren
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,483 | Discovery of Approximate (and Exact) Denial Constraints | 2020 | VLDB | 8.6864916e-05 |
| 3,042 | Dichotomies in the Complexity of Preferred Repairs | 2015 | PODS | 7.669374e-05 |
| 7,702 | Counting and Enumerating (Preferred) Database Repairs | 2017 | PODS | 4.6736471e-05 |
| 8,722 | Preference-aware Integration of Temporal Data | 2015 | VLDB | 4.4606662e-05 |
| 9,423 | Database Principles in Information Extraction | 2014 | PODS | 4.3441378e-05 |
| 11,240 | Autonomously Computable Information Extraction | 2023 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 49 | Consistent Query Answers in Inconsistent Databases | 1999 | PODS | 0.00067660624 |
| 287 | Declarative Information Extraction Using Datalog with Embedded Extraction Predicates | 2007 | VLDB | 0.00028971272 |
| 560 | Dependencies Revisited for Improving Data Quality | 2008 | PODS | 0.00020141923 |
| 1,014 | Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS | 2011 | VLDB | 0.00014640258 |
| 2,823 | Interaction between Record Matching and Data Repairing | 2011 | SIGMOD | 8.0593894e-05 |
| 4,521 | A Temporal-Probabilistic Database Model for Information Extraction | 2013 | VLDB | 6.1168322e-05 |
| 6,490 | Spanners: A Formal Framework for Information Extraction | 2013 | PODS | 5.0431719e-05 |
| 6,534 | Automatic Rule Refinement for Information Extraction | 2010 | VLDB | 5.0244622e-05 |
| 13,485 | Rewrite Rules for Search Database Systems | 2011 | PODS | - |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 287 | Declarative Information Extraction Using Datalog with Embedded Extraction Predicates | 2007 | VLDB | 0.00028971272 |
| 11,240 | Autonomously Computable Information Extraction | 2023 | VLDB | 4.1945683e-05 |
| 9,423 | Database Principles in Information Extraction | 2014 | PODS | 4.3441378e-05 |
| 7,702 | Counting and Enumerating (Preferred) Database Repairs | 2017 | PODS | 4.6736471e-05 |
| 3,192 | Towards Dependable Data Repairing with Fixing Rules | 2014 | SIGMOD | 7.4095761e-05 |
| 1,624 | Sampling the Repairs of Functional Dependency Violations under Hard Constraints | 2010 | VLDB | 0.00011099222 |
| 6,546 | Properties of Inconsistency Measures for Databases | 2021 | SIGMOD | 5.0185588e-05 |
| 2,823 | Interaction between Record Matching and Data Repairing | 2011 | SIGMOD | 8.0593894e-05 |
| 8,007 | A Grammar-based Entity Representation Framework for Data Cleaning | 2009 | SIGMOD | 4.6068018e-05 |
| 199 | Declarative Data Cleaning: Language, Model, and Algorithms | 2001 | VLDB | 0.00035041015 |