When Speed Has a Price: Fast Information Extraction Using Approximate Algorithms
Summary: Proposes an optimizer for information extraction that embraces approximate execution plans to accelerate IE tasks. It models time, recall, and precision across exact and approximate plans, and validates scalability with large-scale real-world datasets. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Goncalo Simoes
2. Helena Galhardas
3. Luis Gravano
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 11,930 | ConfSeer: Leveraging Customer Support Knowledge Bases for Automated Misconfiguration Detection | 2015 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 287 | Declarative Information Extraction Using Datalog with Embedded Extraction Predicates | 2007 | VLDB | 0.00028971272 |
| 759 | To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks | 2006 | SIGMOD | 0.00017064615 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,938 | Split-Correctness in Information Extraction | 2019 | PODS | 0.00010028895 |
| 287 | Declarative Information Extraction Using Datalog with Embedded Extraction Predicates | 2007 | VLDB | 0.00028971272 |
| 3,477 | Toward Best-Effort Information Extraction | 2008 | SIGMOD | 7.0583481e-05 |
| 3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05 |
| 12,052 | Provenance-based Dictionary Refinement in Information Extraction | 2013 | SIGMOD | 4.1945683e-05 |
| 759 | To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks | 2006 | SIGMOD | 0.00017064615 |
| 13,720 | QXtract: A Building Block for Efficient Information Extraction from Text Databases | 2003 | SIGMOD | - |
| 2,319 | Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language | 2010 | SIGMOD | 9.0387108e-05 |
| 11,975 | Which Concepts Are Worth Extracting? | 2014 | SIGMOD | 4.1945683e-05 |
| 11,240 | Autonomously Computable Information Extraction | 2023 | VLDB | 4.1945683e-05 |