Database Paper Browser

Quickly Generating Billion-Record Synthetic Databases

Summary: The paper shows how to generate billion-record synthetic SQL databases on shared-nothing clusters for benchmarking. Key ideas: congruential generators that produce dense, unique, uniformly distributed values; concurrent index generation via discrete logarithms; and support for exponential, normal, and self-similar distributions. (summarized by gpt-5-nano on Feb 09 2026)
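
As a rough illustration of the first key idea, the sketch below uses a multiplicative congruential generator to emit every integer in 1..n exactly once, in a scrambled order. It assumes the usual construction: pick a prime p greater than n, pick a generator g of the multiplicative group modulo p, repeatedly multiply by g modulo p, and discard values larger than n. The function names (`next_prime`, `find_generator`, `dense_unique`) and the trial-division helpers are hypothetical choices for this sketch and are not taken from the paper.

```python
def is_prime(m):
    """Trial-division primality test; adequate for a small illustration."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Return the smallest prime strictly greater than n."""
    p = n + 1
    while not is_prime(p):
        p += 1
    return p


def prime_factors(m):
    """Return the set of distinct prime factors of m."""
    factors = set()
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def find_generator(p):
    """Return the smallest generator of the multiplicative group mod prime p."""
    factors = prime_factors(p - 1)
    for g in range(2, p):
        # g generates the group iff g^((p-1)/q) != 1 for every prime q | p-1.
        if all(pow(g, (p - 1) // q, p) != 1 for q in factors):
            return g
    raise ValueError("no generator found; this sketch expects n >= 2")


def dense_unique(n, seed=1):
    """Yield each integer in 1..n exactly once (assumed construction)."""
    p = next_prime(n)
    g = find_generator(p)
    x = seed  # any value in 1..p-1 works as a starting point
    for _ in range(p - 1):
        x = (x * g) % p  # visits every nonzero residue mod p once
        if x <= n:       # drop the residues that fall outside 1..n
            yield x
```

For n = 10 the sketch picks p = 11 and g = 2, so `list(dense_unique(10))` yields 2, 4, 8, 5, 10, 9, 7, 3, 6, 1: dense and unique, with no sorting or duplicate checks needed.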

Paper ID: 2730
Venue: SIGMOD
Year: 1994
Pagerank: 0.0004138408
Overall Rank: 145 | 99.00%
DOI: -

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 78 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
281 | LinkBench: a Database Benchmark Based on the Facebook Social Graph | 2013 | SIGMOD | 0.0002906793
351 | Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs | 2009 | VLDB | 0.0002636504
635 | Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores | 2015 | VLDB | 0.00018879031
714 | Adaptive Aggregation on Chip Multiprocessors | 2007 | VLDB | 0.00017730584
798 | Broadcast Disks: Data Management for Asymmetric Communication Environments | 1995 | SIGMOD | 0.00016579273
888 | QAGen: Generating Query-Aware Test Databases | 2007 | SIGMOD | 0.00015578618
934 | Flexible Database Generators | 2005 | VLDB | 0.00015227409
979 | Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads | 2012 | VLDB | 0.0001488055
1,124 | Improving the Performance of List Intersection | 2009 | VLDB | 0.00013847565
1,217 | Rethinking serializable multiversion concurrency control | 2015 | VLDB | 0.0001323177
1,439 | Consistency Rationing in the Cloud: Pay only when it matters | 2009 | VLDB | 0.00011964135
1,483 | Simple and Realistic Data Generation | 2006 | VLDB | 0.00011720317
1,521 | High Performance Transactions via Early Write Visibility | 2017 | VLDB | 0.00011532045
1,661 | Managing Non-Volatile Memory in Database Systems | 2018 | SIGMOD | 0.00010978755
1,804 | An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory | 2016 | SIGMOD | 0.00010501185
1,845 | Improving Optimistic Concurrency Control Through Transaction Batching and Operation Reordering | 2019 | VLDB | 0.00010338323
1,961 | TicToc: Time Traveling Optimistic Concurrency Control | 2016 | SIGMOD | 9.9514005e-05
2,006 | PALM: Parallel Architecture-Friendly Latch-Free Modifications to B+ Trees on Many-Core Processors | 2011 | VLDB | 9.8101551e-05
2,277 | Generating Targeted Queries for Database Testing | 2008 | SIGMOD | 9.1241198e-05
2,291 | Data Generation using Declarative Constraints | 2011 | SIGMOD | 9.0926719e-05
2,369 | Aria: A Fast and Practical Deterministic OLTP Database | 2020 | VLDB | 8.9490403e-05
2,650 | Detecting Logic Bugs of Join Optimizations in DBMS | 2023 | SIGMOD | 8.3708191e-05
2,751 | Mega-KV: A Case for GPUs to Maximize the Throughput of In-Memory Key-Value Stores | 2015 | VLDB | 8.1760621e-05
2,933 | Answering Top-k Queries Using Views | 2006 | VLDB | 7.8679669e-05
3,151 | A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs | 2017 | SIGMOD | 7.4720668e-05
3,218 | Reverse Data Management | 2011 | VLDB | 7.3592173e-05
3,304 | Plausible Deniability for Privacy-Preserving Data Synthesis | 2017 | VLDB | 7.2467347e-05
3,470 | Evaluating Persistent Memory Range Indexes | 2020 | VLDB | 7.0655357e-05
3,498 | Cubetree: Organization of and Bulk Incremental Updates on the Data Cube | 1997 | SIGMOD | 7.0389539e-05
3,547 | Parallel Analytics as a Service | 2013 | SIGMOD | 6.9862051e-05
3,564 | Accordion: Better Memory Organization for LSM Key-Value Stores | 2018 | VLDB | 6.9669032e-05
3,655 | CloudRAMSort: Fast and Efficient Large-Scale Distributed RAM Sort on Shared-Nothing Cluster | 2012 | SIGMOD | 6.8718304e-05
3,826 | To Lock, Swap, or Elide: On the Interplay of Hardware Transactional Memory and Lock-Free Indexing | 2015 | VLDB | 6.7250243e-05
4,042 | PARADIS: An Efficient Parallel Algorithm for In-place Radix Sort | 2015 | VLDB | 6.5026989e-05
4,235 | Spitfire: A Three-Tier Buffer Manager for Volatile and Non-Volatile Memory | 2021 | SIGMOD | 6.3342932e-05
4,517 | Generating Databases for Query Workloads | 2010 | VLDB | 6.1178732e-05
4,667 | FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS | 2021 | VLDB | 6.0116919e-05
5,019 | Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS | 2022 | VLDB | 5.7559197e-05
5,037 | Keep It Simple: Testing Databases via Differential Query Plans | 2024 | SIGMOD | 5.7434825e-05
5,234 | Consistent Synchronization Schemes for Workload Replay | 2011 | VLDB | 5.6123331e-05
5,322 | Generalized Hash Teams for Join and Group-by | 1999 | VLDB | 5.5701077e-05
5,327 | An Evaluation of Checkpoint Recovery for Massively Multiplayer Online Games | 2009 | VLDB | 5.5671416e-05
5,721 | FPGA-based Multithreading for In-Memory Hash Joins | 2015 | CIDR | 5.3525009e-05
5,768 | Epoch-based Commit and Replication in Distributed OLTP Databases | 2021 | VLDB | 5.3333911e-05
6,137 | Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database Construction | 2024 | VLDB | 5.1916986e-05
6,234 | Just can't get enough - Synthesizing Big Data | 2015 | SIGMOD | 5.1451686e-05
6,246 | Taking Omid to the Clouds: Fast, Scalable Transactions for Real-Time Cloud Analytics | 2018 | VLDB | 5.1389356e-05
6,634 | Fine-Grained Re-Execution for Efficient Batched Commit of Distributed Transactions | 2023 | VLDB | 4.982784e-05
6,697 | The TEXTURE Benchmark: Measuring Performance of Text Queries on a Relational DBMS | 2005 | VLDB | 4.9577992e-05
6,887 | Synthesizing Linked Data Under Cardinality and Integrity Constraints | 2021 | SIGMOD | 4.8937852e-05

Outgoing Citations (Sorted by Pagerank)

Showing 2 of 2 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
10 | Benchmarking Database Systems: A Systematic Approach | 1983 | VLDB | 0.0012103754
20 | GAMMA - A High Performance Dataflow Database Machine | 1986 | VLDB | 0.00086459551

Semantically Similar Papers