Document Spanners — A Brief Overview of Concepts, Results, and Recent Developments
Summary: Survey of the document spanners framework (a formalisation of AQL, the query language of SystemT) and its theoretical foundations: expressiveness (regex formulas, core spanners), complexity, and algebraic properties. Highlights recent advances in enumeration, containment, extensions, and connections to automata and logic. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,339 | A Lower Bound on Unambiguous Context Free Grammars via Communication Complexity | 2025 | PODS | 4.1945683e-05 |
| 13,147 | Revisiting Weighted Information Extraction: A Simpler and Faster Algorithm for Ranked Enumeration | 2024 | PODS | - |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,938 | Split-Correctness in Information Extraction | 2019 | PODS | 0.00010028895 |
| 2,929 | Complexity Bounds for Relational Algebra over Document Spanners | 2019 | PODS | 7.8800307e-05 |
| 3,563 | Spanner Evaluation over SLP-Compressed Documents | 2021 | PODS | 6.9690833e-05 |
| 8,752 | Query Evaluation Over SLP-Represented Document Databases With Complex Document Editing | 2022 | PODS | 4.456315e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,771 | A Relational Approach to Incrementally Extracting and Querying Structure in Unstructured Data | 2007 | VLDB | 8.1421432e-05 |
| 13,162 | SpannerLib: Embedding Declarative Information Extraction in an Imperative Workflow | 2024 | VLDB | - |
| 6,534 | Automatic Rule Refinement for Information Extraction | 2010 | VLDB | 5.0244622e-05 |
| 9,423 | Database Principles in Information Extraction | 2014 | PODS | 4.3441378e-05 |
| 3,820 | Enterprise Information Extraction: Recent Developments and Open Challenges | 2010 | SIGMOD | 6.7299199e-05 |
| 11,240 | Autonomously Computable Information Extraction | 2023 | VLDB | 4.1945683e-05 |
| 12,115 | Just-in-Time Information Extraction using Extraction Views | 2012 | SIGMOD | 4.1945683e-05 |
| 1,938 | Split-Correctness in Information Extraction | 2019 | PODS | 0.00010028895 |
| 393 | From Structured Documents to Novel Query Facilities | 1994 | SIGMOD | 0.00024524092 |
| 6,490 | Spanners: A Formal Framework for Information Extraction | 2013 | PODS | 5.0431719e-05 |