Blink and It's Done: Interactive Queries on Very Large Data
Summary: BlinkDB is a massively parallel, sampling-based approximate query processing framework for interactive SPJA (select-project-join-aggregate) queries on petabyte-scale data. It runs atop Hive/HDFS and returns real-time results with statistical error guarantees. The paper demonstrates speedups of up to 150x over Hive on MapReduce and 10-150x over Shark on tens of terabytes of data across roughly 100 machines, with 2-10% error, in a fault-tolerant and scalable deployment. (summarized by gpt-5-nano on Feb 09 2026)
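As a rough illustration of what "sampling-based approximate query processing with statistical error guarantees" means, the sketch below estimates an average from a uniform random sample and reports a confidence interval based on the central limit theorem. It is not BlinkDB's implementation: the function name `approximate_mean`, the uniform sampling strategy, the fixed 95% level, and the synthetic data are all assumptions made for this example.

```python
import math
import random
import statistics


def approximate_mean(values, sample_fraction, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin), where estimate +/- margin is an approximate
    95% confidence interval under the central limit theorem (z = 1.96).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Sample at least 2 rows (needed for a standard deviation), at most all rows.
    n = min(len(values), max(2, int(len(values) * sample_fraction)))
    sample = rng.sample(values, n)
    estimate = statistics.fmean(sample)
    # Standard error of the sample mean; ignores the finite-population correction.
    margin = z * statistics.stdev(sample) / math.sqrt(n)
    return estimate, margin


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in for one numeric column of a large table.
    column = [rng.gauss(100.0, 15.0) for _ in range(200_000)]
    exact = statistics.fmean(column)
    estimate, margin = approximate_mean(column, sample_fraction=0.01)
    print(f"exact={exact:.2f} approx={estimate:.2f} +/- {margin:.2f}")
```

Reading 1% of the rows instead of all of them is where the speedup comes from in this toy setting, and the margin shrinks in proportion to the inverse square root of the sample size, which is the basic trade-off between latency and error that an approximate query engine exposes.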
Authors
1. Sameer Agarwal
2. Aurojit Panda
3. Barzan Mozafari
4. Anand P. Iyer
5. Samuel Madden
6. Ion Stoica
Incoming Citations (Sorted by Pagerank)
Showing 28 of 28 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 449 | Approximate Query Processing: Taming the TeraBytes! A Tutorial | 2001 | VLDB | 0.00022846068 |
| 1,909 | SciBORQ: Scientific data management with Bounds On Runtime and Quality | 2011 | CIDR | 0.00010121304 |
| 2,017 | The Researcher’s Guide to the Data Deluge: Querying a Scientific Database in Just a Few Seconds | 2011 | VLDB | 9.7810458e-05 |
| 2,488 | Shark: Fast Data Analysis Using Coarse-grained Distributed Memory | 2012 | SIGMOD | 8.6683713e-05 |
| 2,817 | Recurring Job Optimization in Scope | 2012 | SIGMOD | 8.0677653e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,488 | Shark: Fast Data Analysis Using Coarse-grained Distributed Memory | 2012 | SIGMOD | 8.6683713e-05 |
| 979 | Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads | 2012 | VLDB | 0.0001488055 |
| 9,504 | Supporting Scalable Analytics with Latency Constraints | 2015 | VLDB | 4.3341665e-05 |
| 3,066 | HAWQ: A Massively Parallel Processing SQL Engine in Hadoop | 2014 | SIGMOD | 7.6221974e-05 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |
| 9,347 | Rank Join Queries in NoSQL Databases | 2014 | VLDB | 4.3526718e-05 |
| 11,229 | Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases | 2023 | VLDB | 4.1945683e-05 |
| 7,916 | Terabyte-Scale Analytics in the Blink of an Eye | 2026 | VLDB | 4.6173899e-05 |
| 542 | Shark: SQL and Rich Analytics at Scale | 2013 | SIGMOD | 0.00020595648 |