Database Paper Browser

Dremel: A Decade of Interactive SQL Analysis at Web Scale

Summary: Dremel pioneers interactive SQL on web-scale data by combining disaggregated storage and compute, in situ analysis, and columnar storage for semi-structured data. The paper traces how these principles evolved over a decade to become the foundation of Google BigQuery. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12227
Venue: VLDB
Year: 2020
Pagerank: 9.6481955e-05
Overall Rank: 2,062 | 85.66%
DOI: 10.14778/3415478.3415568

Incoming Citations (Sorted by Pagerank)

Showing 36 of 36 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
3,644 | BtrBlocks: Efficient Columnar Compression for Data Lakes | 2023 | SIGMOD | 6.8854928e-05
3,763 | Flexible Rule-Based Decomposition and Metadata Independence in Modin: A Parallel Dataframe System | 2022 | VLDB | 6.7801795e-05
4,036 | Adore: Differentially Oblivious Relational Database Operators | 2023 | VLDB | 6.5089579e-05
4,495 | ClickHouse - Lightning Fast Analytics for Everyone | 2024 | VLDB | 6.1410277e-05
4,514 | An Empirical Evaluation of Columnar Storage Formats | 2024 | VLDB | 6.1204636e-05
4,530 | Big Metadata: When Metadata is Big Data | 2021 | VLDB | 6.1075429e-05
4,870 | Exploiting Cloud Object Storage for High-Performance Analytics | 2023 | VLDB | 5.8613885e-05
5,531 | Presto: A Decade of SQL Analytics at Meta | 2023 | SIGMOD | 5.4549499e-05
5,864 | Redy: Remote Dynamic Memory Cache | 2022 | VLDB | 5.2975079e-05
6,402 | BigLake: BigQuery’s Evolution toward a Multi-Cloud Lakehouse | 2024 | SIGMOD | 5.079818e-05
6,541 | ConnectorX: Accelerating Data Loading From Databases to Dataframes | 2022 | VLDB | 5.0216945e-05
7,205 | Unified Query Optimization in the Fabric Data Warehouse | 2024 | SIGMOD | 4.8014977e-05
7,296 | Multi-Tenant Cloud Data Services: State-of-the-Art, Challenges and Opportunities | 2022 | SIGMOD | 4.7723197e-05
7,492 | Krypton: Real-time Serving and Analytical SQL Engine at ByteDance | 2023 | VLDB | 4.7180617e-05
7,543 | Cloud Databases: New Techniques, Challenges, and Opportunities | 2022 | VLDB | 4.715241e-05
7,628 | Understanding the Performance Implications of the Design Principles in Storage-Disaggregated Databases | 2024 | SIGMOD | 4.692459e-05
7,907 | Petabyte-Scale Row-Level Operations in Data Lakehouses | 2024 | VLDB | 4.6205839e-05
7,916 | Terabyte-Scale Analytics in the Blink of an Eye | 2026 | VLDB | 4.6173899e-05
8,513 | CXL Memory Performance for In-Memory Data Processing | 2025 | VLDB | 4.4947795e-05
8,746 | Texera: A System for Collaborative and Interactive Data Analytics Using Workflows | 2024 | VLDB | 4.456315e-05
9,125 | On-Demand State Separation for Cloud Data Warehousing | 2022 | VLDB | 4.3917246e-05
9,201 | F3: The Open-Source Data File Format for the Future | 2026 | SIGMOD | 4.3743539e-05
9,401 | Vortex: A Stream-oriented Storage Engine For Big Data Analytics | 2024 | SIGMOD | 4.3441378e-05
9,743 | Databases in the Era of Memory-Centric Computing | 2025 | CIDR | 4.2897489e-05
10,202 | Reducing Tail Latency in Storage-Disaggregated Database Systems | 2026 | SIGMOD | 4.1945683e-05
10,220 | FlatStor: An Efficient Embedded-Index Based Columnar Data Layout for Multimodal Data Workloads | 2026 | VLDB | 4.1945683e-05
10,372 | Data Chunk Compaction in Vectorized Execution | 2025 | SIGMOD | 4.1945683e-05
10,403 | CockroachDB Serverless: Sub-second Scaling from Zero with Multi-region Cluster Virtualization | 2025 | SIGMOD | 4.1945683e-05
10,494 | Nested Parquet Is Flat, Why Not Use It? How To Scan Nested Data With On-the-Fly Key Generation and Joins | 2025 | SIGMOD | 4.1945683e-05
10,767 | The HANA Native Query Engine for Lakehouse Systems | 2025 | VLDB | 4.1945683e-05
10,777 | Magnus: A Holistic Approach to Data Management for Large-Scale Machine Learning Workloads | 2025 | VLDB | 4.1945683e-05
10,841 | Filtered Vector Search: State-of-the-art and Research Opportunities | 2025 | VLDB | 4.1945683e-05
10,852 | CloudGlide: Deconstructing the Landscape of Cloud-Based Analytics | 2025 | VLDB | 4.1945683e-05
10,854 | LiquidCache: Efficient Pushdown Caching for Cloud-Native Data Analytics | 2025 | VLDB | 4.1945683e-05
11,067 | Partition, Don’t Sort! Compression Boosters for Cloud Data Ingestion Pipelines | 2024 | VLDB | 4.1945683e-05
11,485 | Real-time Data Infrastructure at Uber | 2021 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 19 of 19 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
99 | On the Propagation of Errors in the Size of Join Results | 1991 | SIGMOD | 0.00050022914
109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983
131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331
156 | Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases | 2017 | SIGMOD | 0.00040504295
167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521
521 | Hyder - A Transactional Record Manager for Shared Flash | 2011 | CIDR | 0.00021139547
906 | F1: A Distributed SQL Database That Scales | 2013 | VLDB | 0.00015448884
913 | Tenzing A SQL Implementation On The MapReduce Framework | 2011 | VLDB | 0.00015408131
918 | Socrates: The New SQL Server in the Cloud | 2019 | SIGMOD | 0.00015350181
1,015 | Spanner: Becoming a SQL System | 2017 | SIGMOD | 0.00014638696
1,343 | NoDB: Efficient Query Execution on Raw Data Files | 2012 | SIGMOD | 0.00012482538
1,470 | Processing a Trillion Cells per Mouse Click | 2012 | VLDB | 0.00011833779
1,814 | Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing | 2014 | VLDB | 0.00010458107
1,943 | Procella: Unifying serving and analytical data at YouTube | 2019 | VLDB | 0.00010012569
3,038 | Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics | 2017 | SIGMOD | 7.6717218e-05
3,355 | F1 Query: Declarative Querying at Scale | 2018 | VLDB | 7.1829142e-05
3,768 | F1 Lightning: HTAP as a Service | 2020 | VLDB | 6.7782774e-05
4,248 | Hyper Dimension Shuffle: Efficient Data Repartition at Petabyte Scale in SCOPE | 2019 | VLDB | 6.3247927e-05
7,554 | Storing and Querying Tree-Structured Records in Dremel | 2014 | VLDB | 4.712434e-05

Semantically Similar Papers