XTRACT: A System for Extracting Document Type Descriptors from XML Documents
Summary: XTRACT infers concise, meaningful DTDs from XML documents by collapsing repeated sequences into regular expressions and then factoring the resulting candidates. It uses the Minimum Description Length (MDL) principle to pick the best DTD, and its evaluation on real and synthetic XML data indicates scalable, accurate schema extraction. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Minos Garofalakis
2. Aristides Gionis
3. Rajeev Rastogi
4. S. Seshadri
5. Kyuseok Shim
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 61 | DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases | 1997 | VLDB | 0.00064329285 |
| 153 | Relational Databases for Querying XML Documents: Limitations and Opportunities | 1999 | VLDB | 0.00040784455 |
| 207 | Storing Semistructured Data with STORED | 1999 | SIGMOD | 0.00034611968 |
| 1,163 | Extracting Schema from Semistructured Data | 1998 | SIGMOD | 0.00013577466 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,113 | Structure and Value Synopses for XML Data Graphs | 2002 | VLDB | 7.5469926e-05 |
| 7,148 | EXTRUCT: Using Deep Structural Information in XML Keyword Search | 2010 | VLDB | 4.8174598e-05 |
| 6,509 | Representing and Querying XML with Incomplete Information | 2001 | PODS | 5.0331402e-05 |
| 12,372 | SchemaScope: a System for Inferring and Cleaning XML Schemas | 2008 | SIGMOD | 4.1945683e-05 |
| 13,723 | TREX: DTD-Conforming XML to XML Transformations | 2003 | SIGMOD | - |
| 4,215 | Generating XML Structure Using Examples and Constraints | 2008 | VLDB | 6.3527334e-05 |
| 153 | Relational Databases for Querying XML Documents: Limitations and Opportunities | 1999 | VLDB | 0.00040784455 |
| 882 | DTD Inference for Views of XML Data | 2000 | PODS | 0.00015657456 |
| 3,138 | Inference of Concise DTDs from XML Data | 2006 | VLDB | 7.4876241e-05 |
| 2,864 | Inferring XML Schema Definitions from XML Data | 2007 | VLDB | 7.9863574e-05 |