The Lixto Data Extraction Project - Back and Forth between Theory and Practice
Summary: Lixto ties database theory to Web scraping practice through a logic-based wrapper language (Elog/monadic datalog over trees) with formal expressiveness and complexity characterizations. It combines this language with a visual specification UI and a streaming Transformation Server for scalable Web-data integration. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Georg Gottlob
2. Christoph Koch
3. Robert Baumgartner
4. Marcus Herzog
5. Sergio Flesca
Incoming Citations (Sorted by Pagerank)
Showing 17 of 17 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 713 | Efficient Algorithms for Processing XPath Queries | 2002 | VLDB | 0.00017731096 |
| 1,324 | Numerical Document Queries | 2003 | PODS | 0.00012581674 |
| 1,370 | Monadic Datalog and the Expressive Power of Languages for Web Information Extraction | 2002 | PODS | 0.00012338027 |
| 1,663 | Conjunctive Queries over Trees | 2004 | PODS | 0.00010977096 |
| 2,698 | Visual Web Information Extraction with Lixto* | 2001 | VLDB | 8.2753317e-05 |
| 2,855 | Efficient Processing of Expressive Node-Selecting Queries on XML Data in Secondary Storage: A Tree Automata-based Approach | 2003 | VLDB | 8.0059865e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,746 | Scalable Web Data Extraction for Online Market Intelligence | 2009 | VLDB | 4.6617126e-05 |
| 287 | Declarative Information Extraction Using Datalog with Embedded Extraction Predicates | 2007 | VLDB | 0.00028971272 |
| 8,322 | An XML-based Wrapper Generator for Web Information Extraction | 1999 | SIGMOD | 4.5435639e-05 |
| 2,584 | Expressive and efficient pattern languages for tree-structured data (extended abstract) | 2000 | PODS | 8.4948053e-05 |
| 8,943 | Towards Theory for Real-World Data | 2022 | PODS | 4.4258797e-05 |
| 2,319 | Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language | 2010 | SIGMOD | 9.0387108e-05 |
| 2,952 | On Wrapping Query Languages and Efficient XML Integration | 2000 | SIGMOD | 7.8300484e-05 |
| 9,717 | Supervised Wrapper Generation with Lixto | 2001 | VLDB | 4.299267e-05 |
| 2,698 | Visual Web Information Extraction with Lixto* | 2001 | VLDB | 8.2753317e-05 |
| 1,370 | Monadic Datalog and the Expressive Power of Languages for Web Information Extraction | 2002 | PODS | 0.00012338027 |