Database Paper Browser


Fast Queries Over Heterogeneous Data Through Engine Customization

Summary: Proteus natively supports CSV, JSON, and relational binary data via a multi-format algebra. It uses per-query code generation to tailor storage and execution to the dataset and workload, delivering fast cross-format analytics under one interface. (summarized by gpt-5-nano on Feb 09 2026)
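The per-query code generation mentioned in the summary can be illustrated with a toy sketch. This is not Proteus's implementation (the page does not describe it); it is a minimal Python illustration, under stated assumptions, of the general idea: for one fixed query, emit source code specialized to the input format, compile it, and expose it behind a single interface. The names `generate_scan`, `compile_query`, the `price` field, and the sample data are all hypothetical.

```python
import csv
import io
import json


def generate_scan(fmt, field, threshold, column_index=None):
    """Emit Python source for SUM(field) WHERE field > threshold,
    specialized to one input format (hypothetical helper)."""
    if fmt == "csv":
        # The column position is fixed at generation time, so the
        # generated loop performs no per-row name lookup.
        reader = "csv.reader(io.StringIO(text))"
        access = f"float(row[{column_index}])"
    elif fmt == "json":
        # One JSON object per line; the field name is baked into the code.
        reader = "(json.loads(line) for line in text.splitlines() if line)"
        access = f"float(row[{field!r}])"
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return (
        "def query(text):\n"
        "    total = 0.0\n"
        f"    for row in {reader}:\n"
        f"        value = {access}\n"
        f"        if value > {threshold!r}:\n"
        "            total += value\n"
        "    return total\n"
    )


def compile_query(source):
    """Compile generated source and return the resulting function."""
    namespace = {"csv": csv, "io": io, "json": json}
    exec(source, namespace)
    return namespace["query"]


# Made-up sample data: the same three records in two formats.
csv_text = "1,10\n2,25\n3,40\n"
json_text = (
    '{"id": 1, "price": 10}\n'
    '{"id": 2, "price": 25}\n'
    '{"id": 3, "price": 40}\n'
)

csv_query = compile_query(generate_scan("csv", "price", 20, column_index=1))
json_query = compile_query(generate_scan("json", "price", 20))

csv_result = csv_query(csv_text)
json_result = json_query(json_text)
print(csv_result, json_result)
```

Both generated functions answer the same logical query, but each contains only the access path its format needs. A real engine would generate lower-level code and handle typing, nesting, and errors, which this sketch omits.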

Paper ID: 11384
Venue: VLDB
Year: 2016
Pagerank: 6.288323e-05
Overall Rank: 4,326 | 69.91%
DOI: -

[Chart: Incoming Non-self Citations Over Time]

Authors

Incoming Citations (Sorted by Pagerank)

Showing 25 of 25 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
2,122 | SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle | 2020 | CIDR | 9.4989076e-05
2,651 | HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines | 2019 | VLDB | 8.3694317e-05
2,700 | Filter Before You Parse: Faster Analytics on Raw Data with Sparser | 2018 | VLDB | 8.2728509e-05
2,838 | How to Architect a Query Compiler, Revisited | 2018 | SIGMOD | 8.0408472e-05
3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 6.659442e-05
3,918 | On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML | 2018 | VLDB | 6.6315176e-05
4,363 | Hardware-conscious Query Processing in GPU-accelerated Analytical Engines | 2019 | CIDR | 6.2552614e-05
4,505 | SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale Machine Learning | 2017 | CIDR | 6.1327108e-05
4,602 | Accelerating Raw Data Analysis with the ACCORDA Software and Hardware Architecture | 2019 | VLDB | 6.0567387e-05
4,704 | JSON Tiles: Fast Analytics on Semi-Structured Data | 2021 | SIGMOD | 5.9853687e-05
4,770 | The Case For Heterogeneous HTAP | 2017 | CIDR | 5.9338845e-05
5,005 | Adaptive HTAP through Elastic Resource Scheduling | 2020 | SIGMOD | 5.7641797e-05
5,301 | ReCache: Reactive Caching for Fast Analytics over Heterogeneous Data | 2018 | VLDB | 5.5790928e-05
5,731 | Babelfish: Efficient Execution of Polyglot Queries | 2022 | VLDB | 5.3502065e-05
7,237 | CleanM: An Optimizable Query Language for Unified Scale-Out Data Cleaning | 2017 | VLDB | 4.7928651e-05
7,360 | ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data | 2020 | VLDB | 4.7525925e-05
7,704 | ExDRa: Exploratory Data Science on Federated Raw Data | 2021 | SIGMOD | 4.6733838e-05
8,096 | Micro-architectural Analysis of OLAP: Limitations and Opportunities | 2020 | VLDB | 4.5860565e-05
8,393 | LAQy: Efficient and Reusable Query Approximations via Lazy Sampling | 2023 | SIGMOD | 4.5280102e-05
8,692 | Boosting Efficiency of External Pipelines by Blurring Application Boundaries | 2022 | CIDR | 4.4661967e-05
9,052 | RawVis: A System for Efficient In-situ Visual Analytics | 2021 | SIGMOD | 4.4039656e-05
9,289 | In-Browser Interactive SQL Analytics with Afterburner | 2017 | SIGMOD | 4.362197e-05
9,379 | GIO: Generating Efficient Matrix and Frame Readers for Custom Data Formats by Example | 2023 | SIGMOD | 4.3462787e-05
9,918 | Shared Load(ing): Efficient Bulk Loading into Optimized Storage | 2020 | CIDR | 4.2561557e-05
11,784 | Alpine: Efficient In situ Data Exploration in the Presence of Updates | 2017 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 34 of 34 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614
60 | Efficiently Compiling Efficient Query Plans for Modern Hardware | 2011 | VLDB | 0.00064439773
66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801
70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166
88 | Common Expression Analysis in Database Applications | 1982 | SIGMOD | 0.00052316625
109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983
153 | Relational Databases for Querying XML Documents: Limitations and Opportunities | 1999 | VLDB | 0.00040784455
157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359
158 | Automated Selection of Materialized Views and Indexes for SQL Databases | 2000 | VLDB | 0.00040071492
179 | Efficient and Extensible Algorithms for Multi Query Optimization | 2000 | SIGMOD | 0.00037672155
346 | Don't Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources | 1997 | VLDB | 0.00026656272
404 | Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited | 2014 | VLDB | 0.00024143076
596 | HYRISE—A Main Memory Hybrid Storage Engine | 2011 | VLDB | 0.00019481482
704 | Building Efficient Query Engines in a High-Level Language | 2014 | VLDB | 0.00017900583
981 | DynaMat: A Dynamic View Management System for Data Warehouses | 1999 | SIGMOD | 0.00014879532
1,265 | Jaql: A Scripting Language for Large Scale Semistructured Data Analysis | 2011 | VLDB | 0.00012947629
1,343 | NoDB: Efficient Query Execution on Raw Data Files | 2012 | SIGMOD | 0.00012482538
1,438 | AsterixDB: A Scalable, Open Source BDMS | 2014 | VLDB | 0.00011973592
1,795 | MonetDB/XQuery: A Fast XQuery Processor Powered by a Relational Engine | 2006 | SIGMOD | 0.00010526672
1,807 | H2O: A Hands-free Adaptive Store | 2014 | SIGMOD | 0.00010487796
1,977 | Split Query Processing in Polybase | 2013 | SIGMOD | 9.8824589e-05
2,001 | Sinew: A SQL System for Multi-Structured Data | 2014 | SIGMOD | 9.8186417e-05
2,069 | System RX: One Part Relational, One Part XML | 2005 | SIGMOD | 9.6329563e-05
2,367 | Here are my Data Files. Here are my Queries. Where are my Results? | 2011 | CIDR | 8.9511058e-05
2,383 | How to Architect a Query Compiler | 2016 | SIGMOD | 8.9294108e-05
2,693 | An Architecture for Recycling Intermediates in a Column-store | 2009 | SIGMOD | 8.2883398e-05
2,757 | Parallel Data Analysis Directly on Scientific File Formats | 2014 | SIGMOD | 8.1679384e-05
2,773 | JSON Data Management – Supporting Schema-less Development in RDBMS | 2014 | SIGMOD | 8.1386587e-05
2,973 | Parallel In-Situ Data Processing with Speculative Loading | 2014 | SIGMOD | 7.7902322e-05
3,548 | Adaptive Query Processing on RAW Data | 2014 | VLDB | 6.9859242e-05
4,202 | Cost Models DO Matter: Providing Cost Information for Diverse Data Sources in a Federated System | 1999 | VLDB | 6.36184e-05
6,407 | Just-In-Time Data Virtualization: Lightweight Data Management with ViDa | 2015 | CIDR | 5.076547e-05
6,591 | Towards An Enterprise XML Architecture | 2005 | SIGMOD | 5.0008467e-05
7,557 | Invisible Glue: Scalable Self-Tuning Multi-Stores | 2015 | CIDR | 4.7112819e-05

Semantically Similar Papers