Database Paper Browser


The Lixto Data Extraction Project - Back and Forth between Theory and Practice

Summary: Lixto ties database theory to web scraping practice through a logic-based wrapper language (Elog/monadic datalog over trees) with formal expressiveness and complexity characterizations. It combines this language with a visual specification UI and a streaming Transformation Server for scalable Web-data integration. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 1303
Venue: PODS
Year: 2004
Pagerank: 0.00014126427
Overall Rank: 1,095 | 92.39%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 17 of 17 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
287 | Declarative Information Extraction Using Datalog with Embedded Extraction Predicates | 2007 | VLDB | 0.00028971272
667 | Incremental Knowledge Base Construction Using DeepDive | 2015 | VLDB | 0.00018440557
2,224 | The SphereSearch Engine for Unified Ranked Retrieval of Heterogeneous XML and Web Documents | 2005 | VLDB | 9.251962e-05
2,984 | Efficiently Incorporating User Feedback into Information Extraction and Integration Programs | 2009 | SIGMOD | 7.7796344e-05
3,477 | Toward Best-Effort Information Extraction | 2008 | SIGMOD | 7.0583481e-05
4,106 | Extracting Databases from Dark Data with DeepDive | 2016 | SIGMOD | 6.4456184e-05
5,609 | Documentum ECI Self-Repairing Wrappers: Performance Analysis | 2006 | SIGMOD | 5.4129892e-05
5,620 | Datalog and Emerging Applications: An Interactive Tutorial | 2011 | SIGMOD | 5.407079e-05
5,652 | From Information to Knowledge: Harvesting Entities and Relationships from Web Sources | 2010 | PODS | 5.3903671e-05
5,705 | Datalog Unchained | 2021 | PODS | 5.3621239e-05
7,405 | The INFOMIX System for Advanced Integration of Incomplete and Inconsistent Data | 2005 | SIGMOD | 4.7378885e-05
7,681 | SXPath - Extending XPath towards Spatial Querying on Web Documents | 2011 | VLDB | 4.6804276e-05
7,826 | The Smallest Extraction Problem | 2021 | VLDB | 4.6416742e-05
9,459 | Relational Data Mapping in MIQIS | 2005 | SIGMOD | 4.3373848e-05
11,899 | Defining Relations on Graphs: How Hard is it in the Presence of Node Partitions? | 2015 | PODS | 4.1945683e-05
12,044 | Knowledge Harvesting in the Big-Data Era | 2013 | SIGMOD | 4.1945683e-05
12,258 | ObjectRunner: Lightweight, Targeted Extraction and Querying of Structured Web Data | 2010 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 6 of 6 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers