TEGRA: Table Extraction by Global Record Alignment
Summary: TEGRA extracts multi-column relational tables from list-style web content, not only from HTML tables. It measures column coherence against a corpus of 100M web tables and casts extraction as assigning the tokens of each line to a fixed number of columns so that intra-column coherence is maximized, solved with a 2-approximation (A*-like) algorithm. It achieves 90%+ F-measure on web and enterprise data. (summarized by gpt-5-nano on Feb 09 2026)
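The formulation in the summary can be illustrated with a toy sketch. This is not the paper's algorithm: it swaps the 2-approximation for exhaustive search over all joint segmentations, and it replaces the 100M-table corpus statistics with a small hypothetical lookup (`TOY_CORPUS_COLUMNS`). All function names and the sample data are assumptions made for illustration only.

```python
from itertools import combinations, product

# Hypothetical stand-in for corpus statistics: sets of values assumed to
# co-occur in the same column of some known table.
TOY_CORPUS_COLUMNS = [
    {"Boston", "New York City", "Los Angeles", "Chicago"},
    {"Massachusetts", "New York", "California", "Illinois"},
]

def coherence(a, b):
    """Toy pairwise coherence: 1.0 if two cells plausibly share a column."""
    if a.isdigit() and b.isdigit():
        return 1.0
    return 1.0 if any(a in col and b in col for col in TOY_CORPUS_COLUMNS) else 0.0

def segmentations(tokens, k):
    """Yield every split of tokens into k contiguous, non-empty cells."""
    for cuts in combinations(range(1, len(tokens)), k - 1):
        bounds = (0, *cuts, len(tokens))
        yield tuple(" ".join(tokens[i:j]) for i, j in zip(bounds, bounds[1:]))

def table_score(rows):
    """Sum pairwise coherence within each column of a candidate table."""
    return sum(
        coherence(a, b)
        for column in zip(*rows)
        for a, b in combinations(column, 2)
    )

def extract_table(lines, k):
    """Brute force: pick the joint segmentation with the highest score.

    Exponential in the number of lines, so only suitable for tiny inputs.
    Every line must contain at least k tokens.
    """
    candidates = [list(segmentations(line.split(), k)) for line in lines]
    return max(product(*candidates), key=table_score)

lines = [
    "Boston Massachusetts 645966",
    "New York City New York 8405837",
    "Los Angeles California 3884307",
]
table = extract_table(lines, 3)
for row in table:
    print(" | ".join(row))
```

The ambiguous line "New York City New York 8405837" is resolved only because the other rows and the lookup make "New York City" coherent with the first column and "New York" with the second, which is the intuition behind aligning records globally instead of segmenting each line in isolation.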
Authors
1. Xu Chu
2. Yeye He
3. Kaushik Chakrabarti
4. Kris Ganjam
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 62 | Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge | 2008 | SIGMOD | 0.0006429466 |
| 107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 0.00048377684 |
| 420 | InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables | 2012 | SIGMOD | 0.00023719065 |
| 518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934 |
| 637 | Automatic segmentation of text into structured records | 2001 | SIGMOD | 0.00018824614 |
| 1,317 | Harvesting Relational Tables from Lists on the Web | 2009 | VLDB | 0.00012625853 |
| 1,585 | Answering Table Augmentation Queries from Unstructured Lists on the Web | 2009 | VLDB | 0.00011255098 |
| 3,229 | InfoGather+: Semantic Matching and Annotation of Numeric and Time-Varying Attributes in Web Tables | 2013 | SIGMOD | 7.3393682e-05 |
| 5,399 | Joint Unsupervised Structure Discovery and Information Extraction | 2011 | SIGMOD | 5.5291067e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,797 | Stitching Web Tables for Improving Matching Quality | 2017 | VLDB | 6.7597149e-05 |
| 364 | Annotating and Searching Web Tables Using Entities, Types and Relationships | 2010 | VLDB | 0.00025637562 |
| 3,285 | Using the Structure of Web Sites for Automatic Segmentation of Tables | 2004 | SIGMOD | 7.2759001e-05 |
| 7,424 | Table Extraction and Understanding for Scientific and Enterprise Applications | 2020 | VLDB | 4.7339251e-05 |
| 8,579 | RECA: Related Tables Enhanced Column Semantic Type Annotation Framework | 2023 | VLDB | 4.4922446e-05 |
| 107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 0.00048377684 |
| 1,001 | Recovering Semantics of Tables on the Web | 2011 | VLDB | 0.00014706505 |
| 2,633 | Schema Extraction for Tabular Data on the Web | 2013 | VLDB | 8.4063569e-05 |
| 1,367 | Answering Table Queries on the Web using Column Keywords | 2012 | VLDB | 0.00012349783 |
| 1,317 | Harvesting Relational Tables from Lists on the Web | 2009 | VLDB | 0.00012625853 |