NoDoSE - A Tool for Semi-Automatically Extracting Structured and Semistructured Data from Text Documents.
Summary: NoDoSE is an interactive tool for semi-automatically discovering the structure of semi-structured text and extracting its data into a DBMS. Through a GUI, the user hierarchically partitions documents; a mining component infers a grammar from that user input; a Java prototype serves as a structure-mining test bed. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036 |
| 533 | RoadRunner: Towards Automatic Data Extraction from Large Web Sites | 2001 | VLDB | 0.00020757722 |
| 637 | Automatic segmentation of text into structured records | 2001 | SIGMOD | 0.00018824614 |
| 2,005 | Record-Boundary Discovery in Web Documents | 1999 | SIGMOD | 9.8112591e-05 |
| 2,698 | Visual Web Information Extraction with Lixto | 2001 | VLDB | 8.2753317e-05 |
| 6,958 | Computational Aspects of Resilient Data Extraction from Semistructured Sources | 2000 | PODS | 4.8857878e-05 |
| 7,826 | The Smallest Extraction Problem | 2021 | VLDB | 4.6416742e-05 |
| 12,525 | Automatic Extraction of Dynamic Record Sections From Search Engine Result Pages | 2006 | VLDB | 4.1945683e-05 |
| 12,663 | Querying Websites Using Compact Skeletons | 2001 | PODS | 4.1945683e-05 |
| 13,935 | NoDoSE Version 2.0 | 1999 | SIGMOD | - |
Outgoing Citations (Sorted by Pagerank)
Showing 0 of 0 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 393 | From Structured Documents to Novel Query Facilities | 1994 | SIGMOD | 0.00024524092 |
| 10,438 | Doctopus: A System for Budget-aware Structural Data Extraction from Unstructured Documents | 2025 | SIGMOD | 4.1945683e-05 |
| 207 | Storing Semistructured Data with STORED | 1999 | SIGMOD | 0.00034611968 |
| 3,285 | Using the Structure of Web Sites for Automatic Segmentation of Tables | 2004 | SIGMOD | 7.2759001e-05 |
| 5,399 | Joint Unsupervised Structure Discovery and Information Extraction | 2011 | SIGMOD | 5.5291067e-05 |
| 2,319 | Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language | 2010 | SIGMOD | 9.0387108e-05 |
| 13,935 | NoDoSE Version 2.0 | 1999 | SIGMOD | - |
| 13,134 | DocDB: A Database for Unstructured Document Analysis | 2025 | VLDB | - |
| 637 | Automatic segmentation of text into structured records | 2001 | SIGMOD | 0.00018824614 |
| 1,395 | Structured Querying of Web Text: A Technical Challenge | 2007 | CIDR | 0.00012207039 |