Database Paper Browser

Extracting Schema from Semistructured Data

Summary: The paper models semistructured data as labeled directed graphs and types them using the greatest fixpoint semantics of monadic Datalog programs. It presents an approximate typing algorithm, shows that optimal typing is NP-hard, and reports that clustering-based heuristics yield near-optimal results in preliminary experiments. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 3022
Venue: SIGMOD
Year: 1998
Pagerank: 0.00013577466
Overall Rank: 1,163 | 91.92%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 11 of 11 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801
207 | Storing Semistructured Data with STORED | 1999 | SIGMOD | 0.00034611968
882 | DTD Inference for Views of XML Data | 2000 | PODS | 0.00015657456
992 | XTRACT: A System for Extracting Document Type Descriptors from XML Documents | 2000 | SIGMOD | 0.00014799689
2,864 | Inferring XML Schema Definitions from XML Data | 2007 | VLDB | 0.000079863574
3,138 | Inference of Concise DTDs from XML Data | 2006 | VLDB | 0.000074876241
3,349 | Schema Management for Document Stores | 2015 | VLDB | 0.000071903648
3,681 | Queries with Incomplete Answers over Semistructured Data | 1999 | PODS | 0.000068492288
7,571 | Reducing Ambiguity in Json Schema Discovery | 2021 | SIGMOD | 0.000047075853
8,632 | Measuring the Structural Similarity of Semistructured Documents Using Entropy | 2007 | VLDB | 0.000044803734
12,663 | Querying Websites Using Compact Skeletons | 2001 | PODS | 0.000041945683

Outgoing Citations (Sorted by Pagerank)

Showing 4 of 4 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
