Mison: A Fast JSON Parser for Data Analytics
Summary: Mison is a JSON parser for data analytics that pushes projection and filtering down into the parsing step. Instead of a finite state machine (FSM), it combines speculation about where the queried fields sit in a record with structural indices that map each field's logical position to its physical position, turning control flow into data flow and speeding up parsing.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 11398
- Venue: VLDB
- Year: 2017
- Pagerank: 8.0651326e-05
- Overall Rank: 2,819 | 80.40%
- DOI:
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 21 of 21 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 2,700 | Filter Before You Parse: Faster Analytics on Raw Data with Sparser | 2018 | VLDB | 8.2728509e-05 |
| 3,437 | Speculative Distributed CSV Data Parsing for Big Data Analytics | 2019 | SIGMOD | 7.0942161e-05 |
| 4,602 | Accelerating Raw Data Analysis with the ACCORDA Software and Hardware Architecture | 2019 | VLDB | 6.0567387e-05 |
| 4,704 | JSON Tiles: Fast Analytics on Semi-Structured Data | 2021 | SIGMOD | 5.9853687e-05 |
| 5,595 | Schemas and Types for JSON Data: from Theory to Practice | 2019 | SIGMOD | 5.4191724e-05 |
| 6,977 | FishStore: Fast Ingestion and Indexing of Raw Data | 2019 | VLDB | 4.8761802e-05 |
| 7,360 | ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data | 2020 | VLDB | 4.7525925e-05 |
| 7,427 | Selection Pushdown in Column Stores using Bit Manipulation Instructions | 2023 | SIGMOD | 4.7327406e-05 |
| 7,830 | Scalable Structural Index Construction for JSON Analytics | 2021 | VLDB | 4.6388763e-05 |
| 8,645 | Predicate Pushdown for Data Science Pipelines | 2023 | SIGMOD | 4.4772518e-05 |
| 8,719 | Native JSON Datatype Support: Maturing SQL and NoSQL convergence in Oracle Database | 2020 | VLDB | 4.4612589e-05 |
| 8,788 | FishStore: Faster Ingestion with Subset Hashing | 2019 | SIGMOD | 4.451039e-05 |
| 9,124 | Dynamic Speculative Optimizations for SQL Compilation in Apache Spark | 2020 | VLDB | 4.391961e-05 |
| 9,268 | Language-Agnostic Integrated Queries in a Managed Polyglot Runtime | 2021 | VLDB | 4.3657168e-05 |
| 9,379 | GIO: Generating Efficient Matrix and Frame Readers for Custom Data Formats by Example | 2023 | SIGMOD | 4.3462787e-05 |
| 9,719 | Tuplex: Robust, Efficient Analytics When Python Rules | 2019 | VLDB | 4.2980763e-05 |
| 9,750 | ReCG: Bottom-Up JSON Schema Discovery Using a Repetitive Cluster-and-Generalize Framework | 2024 | VLDB | 4.2897489e-05 |
| 9,837 | GpJSON: High-performance JSON Data Processing on GPUs | 2025 | VLDB | 4.2740344e-05 |
| 10,482 | Fast and Scalable Data Transfer Across Data Systems | 2025 | SIGMOD | 4.1945683e-05 |
| 11,150 | Zed: Leveraging Data Types to Process Eclectic Data | 2023 | CIDR | 4.1945683e-05 |
| 11,189 | dsJSON: A Distributed SQL JSON Processor | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 288 | Storm @Twitter | 2014 | SIGMOD | 0.00028939871 |
| 1,098 | Trill: A High-Performance Incremental Query Processor for Diverse Analytics | 2015 | VLDB | 0.00014114442 |
| 1,265 | Jaql: A Scripting Language for Large Scale Semistructured Data Analysis | 2011 | VLDB | 0.00012947629 |
| 1,270 | BitWeaving: Fast Scans for Main Memory Data Processing | 2013 | SIGMOD | 0.00012926086 |
| 1,343 | NoDB: Efficient Query Execution on Raw Data Files | 2012 | SIGMOD | 0.00012482538 |
| 1,618 | Row-wise Parallel Predicate Evaluation | 2008 | VLDB | 0.00011114015 |
| 2,001 | Sinew: A SQL System for Multi-Structured Data | 2014 | SIGMOD | 9.8186417e-05 |
| 2,322 | Instant Loading for Main Memory Databases | 2013 | VLDB | 9.034874e-05 |
| 2,367 | Here are my Data Files. Here are my Queries. Where are my Results? | 2011 | CIDR | 8.9511058e-05 |
| 2,773 | JSON Data Management – Supporting Schema-less Development in RDBMS | 2014 | SIGMOD | 8.1386587e-05 |
| 2,973 | Parallel In-Situ Data Processing with Speculative Loading | 2014 | SIGMOD | 7.7902322e-05 |
| 4,489 | Automatic Generation of Normalized Relational Schemas from Nested Key-Value Data | 2016 | SIGMOD | 6.1434237e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 11,248 | Scalable Reasoning on Document Stores via Instance-Aware Query Rewriting | 2023 | VLDB | 4.1945683e-05 |
| 2,781 | JSON: Data model, Query languages and Schema specification | 2017 | PODS | 8.1305074e-05 |
| 3,562 | MISO: Souping Up Big Data Query Processing with a Multistore System | 2014 | SIGMOD | 6.9694564e-05 |
| 10,294 | Streaming Validation of JSON Documents Against Schemas | 2026 | VLDB | 4.1945683e-05 |
| 5,343 | FAD.js: Fast JSON Data Access Using JIT-based Speculative Optimizations | 2017 | VLDB | 5.5581129e-05 |
| 4,704 | JSON Tiles: Fast Analytics on Semi-Structured Data | 2021 | SIGMOD | 5.9853687e-05 |
| 2,700 | Filter Before You Parse: Faster Analytics on Raw Data with Sparser | 2018 | VLDB | 8.2728509e-05 |
| 9,837 | GpJSON: High-performance JSON Data Processing on GPUs | 2025 | VLDB | 4.2740344e-05 |
| 11,189 | dsJSON: A Distributed SQL JSON Processor | 2023 | SIGMOD | 4.1945683e-05 |
| 7,830 | Scalable Structural Index Construction for JSON Analytics | 2021 | VLDB | 4.6388763e-05 |