Reducing Ambiguity in Json Schema Discovery
Summary: Jxplain, a heuristic-driven algorithm that constrains the schema space, reduces ambiguity in Json schema discovery for ad-hoc data and APIs. It is slightly slower than competing approaches but yields far more precise schemas, reducing validation false positives. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. William Spoth
2. Oliver Kennedy
3. Ying Lu
4. Beda Hammerschmidt
5. Zhen Hua Liu
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,943 | Towards Theory for Real-World Data | 2022 | PODS | 4.4258797e-05 |
| 9,750 | ReCG: Bottom-Up JSON Schema Discovery Using a Repetitive Cluster-and-Generalize Framework | 2024 | VLDB | 4.2897489e-05 |
| 10,860 | Exploring Exploratory Querying | 2025 | VLDB | 4.1945683e-05 |
| 11,067 | Partition, Don’t Sort! Compression Boosters for Cloud Data Ingestion Pipelines | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 61 | DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases | 1997 | VLDB | 0.00064329285 |
| 1,163 | Extracting Schema from Semistructured Data | 1998 | SIGMOD | 0.00013577466 |
| 1,677 | Graceful Database Schema Evolution: the PRISM Workbench | 2008 | VLDB | 0.00010939366 |
| 1,908 | Information-Theoretic Tools for Mining Database Structure from Large Data Sets | 2004 | SIGMOD | 0.00010126101 |
| 2,864 | Inferring XML Schema Definitions from XML Data | 2007 | VLDB | 7.9863574e-05 |
| 3,138 | Inference of Concise DTDs from XML Data | 2006 | VLDB | 7.4876241e-05 |
| 3,349 | Schema Management for Document Stores | 2015 | VLDB | 7.1903648e-05 |
| 4,489 | Automatic Generation of Normalized Relational Schemas from Nested Key-Value Data | 2016 | SIGMOD | 6.1434237e-05 |
| 4,533 | LegoDB: Customizing Relational Storage for XML Documents | 2002 | VLDB | 6.1063172e-05 |
| 7,007 | Closing the Functional and Performance Gap between SQL and NoSQL | 2016 | SIGMOD | 4.8653116e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,489 | Automatic Generation of Normalized Relational Schemas from Nested Key-Value Data | 2016 | SIGMOD | 6.1434237e-05 |
| 3,349 | Schema Management for Document Stores | 2015 | VLDB | 7.1903648e-05 |
| 4,704 | JSON Tiles: Fast Analytics on Semi-Structured Data | 2021 | SIGMOD | 5.9853687e-05 |
| 13,089 | Blaze: Compiling JSON Schema for 10× Faster Validation | 2026 | VLDB | - |
| 5,595 | Schemas and Types for JSON Data: from Theory to Practice | 2019 | SIGMOD | 5.4191724e-05 |
| 9,939 | Witness Generation for JSON Schema | 2022 | VLDB | 4.2462227e-05 |
| 2,781 | JSON: Data model, Query languages and Schema specification | 2017 | PODS | 8.1305074e-05 |
| 10,294 | Streaming Validation of JSON Documents Against Schemas | 2026 | VLDB | 4.1945683e-05 |
| 9,750 | ReCG: Bottom-Up JSON Schema Discovery Using a Repetitive Cluster-and-Generalize Framework | 2024 | VLDB | 4.2897489e-05 |
| 11,575 | JSON Schema Matching: Empirical Observations | 2020 | SIGMOD | 4.1945683e-05 |