DeepDB: Learn from Data, not from Queries!

Summary: Data-driven learned DBMS components bypass workload-driven training, so they need no retraining when the workload or data changes. Empirically, they achieve higher accuracy and generalize better to unseen queries than state-of-the-art learned components. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12289
Venue: VLDB
Year: 2020
Pagerank: 0.00019235898
Overall Rank: 608 | 95.78%
DOI: 10.14778/3384345.3384349

Incoming Citations (Sorted by Pagerank)

Showing 50 of 121 citing papers.

Rank	Citing Paper	Year	Venue	Pagerank
7,011	Simple Adaptive Query Processing vs. Learned Query Optimizers: Observations and Analysis	2023	VLDB	4.8629458e-05
7,034	A Neural Database for Differentially Private Spatial Range Queries	2022	VLDB	4.8550912e-05
7,123	ASM: Harmonizing Autoregressive Model, Sampling, and Multi-dimensional Statistics Merging for Cardinality Estimation	2024	SIGMOD	4.8251036e-05
7,126	Debunking the Myth of Join Ordering: Toward Robust SQL Analytics	2025	SIGMOD	4.8232367e-05
7,186	LPLM: A Neural Language Model for Cardinality Estimation of LIKE-Queries	2024	SIGMOD	4.8063731e-05
7,221	Speeding Up End-to-end Query Execution via Learning-based Progressive Cardinality Estimation	2023	SIGMOD	4.797194e-05
7,336	Refactoring Index Tuning Process with Benefit Estimation	2024	VLDB	4.7599411e-05
7,457	Selectivity Functions of Range Queries are Learnable	2022	SIGMOD	4.7247191e-05
7,467	Yannakakis+: Practical Acyclic Query Evaluation with Theoretical Guarantees	2025	SIGMOD	4.7218691e-05
7,474	Cardinality Estimation of Approximate Substring Queries using Deep Learning	2022	VLDB	4.7194345e-05
7,610	Learning to be a Statistician: Learned Estimator for Number of Distinct Values	2022	VLDB	4.6965039e-05
7,634	ReStore - Neural Data Completion for Relational Databases	2021	SIGMOD	4.6911382e-05
7,753	Rethinking Learned Cost Models: Why Start from Scratch?	2023	SIGMOD	4.660151e-05
7,828	Modeling Shifting Workloads for Learned Database Systems	2024	SIGMOD	4.6407986e-05
7,854	dbET: Execution Time Distribution-based Plan Selection	2023	SIGMOD	4.6350172e-05
8,009	CAMAL: Optimizing LSM-trees via Active Learning	2024	SIGMOD	4.6066863e-05
8,020	The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions	2024	VLDB	4.6040862e-05
8,080	Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines	2024	VLDB	4.5911668e-05
8,220	PerfGuard: Deploying ML-for-Systems without Performance Regressions, Almost!	2021	VLDB	4.5557328e-05
8,393	LAQy: Efficient and Reusable Query Approximations via Lazy Sampling	2023	SIGMOD	4.5280102e-05
8,650	HAP: An Efficient Hamming Space Index Based on Augmented Pigeonhole Principle	2022	SIGMOD	4.4761716e-05
8,680	A Practical Approach to Groupjoin and Nested Aggregates	2021	VLDB	4.4694927e-05
8,697	Convolution and Cross-Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries	2024	SIGMOD	4.4657888e-05
8,834	ByteCard: Enhancing ByteDance’s Data Warehouse with Learned Cardinality Estimation	2024	SIGMOD	4.4394021e-05
8,847	Towards Foundation Database Models	2025	CIDR	4.4371897e-05
8,854	Optimizing the cloud? Don't train models. Build oracles!	2024	CIDR	4.4349047e-05
8,948	One Seed, Two Birds: A Unified Learned Structure for Exact and Approximate Counting	2024	SIGMOD	4.423786e-05
8,956	T3: Accurate and Fast Performance Prediction for Relational Database Systems With Compiled Decision Trees	2025	SIGMOD	4.4214154e-05
9,006	Hit the Gym: Accelerating Query Execution to Efficiently Bootstrap Behavior Models for Self-Driving Database Management Systems	2024	VLDB	4.4101482e-05
9,107	NeuroSketch: Fast and Approximate Evaluation of Range Aggregate Queries with Neural Networks	2023	SIGMOD	4.3950706e-05
9,194	Phoebe: A Learning-based Checkpoint Optimizer	2021	VLDB	4.3761777e-05
9,213	PACE: Poisoning Attacks on Learned Cardinality Estimation	2024	SIGMOD	4.3721075e-05
9,317	Are Joins over LSM-trees Ready? Take RocksDB as an Example	2025	VLDB	4.3556432e-05
9,431	PairwiseHist: Fast, Accurate and Space-Efficient Approximate Query Processing with Data Compression	2024	VLDB	4.3434046e-05
9,621	ShadowAQP: Efficient Approximate Group-by and Join Query via Attribute-oriented Sample Size Allocation and Data Generation	2023	VLDB	4.3167167e-05
9,628	Approximate Sketches	2024	SIGMOD	4.3143499e-05
9,662	Efficient Query Re-optimization with Judicious Subquery Selections	2023	SIGMOD	4.3097631e-05
9,691	Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes	2023	SIGMOD	4.3035354e-05
9,693	ROME: Robust Query Optimization via Parallel Multi-Plan Execution	2024	SIGMOD	4.3027391e-05
9,726	Cardinality Estimation of LIKE Predicate Queries using Deep Learning	2025	SIGMOD	4.2943379e-05
9,747	Still Asking: How Good Are Query Optimizers, Really?	2025	VLDB	4.2897489e-05
9,757	Efficient Insights Discovery through Conditional Generative Model based Query Approximation	2022	SIGMOD	4.2893233e-05
9,812	A Practical Theory of Generalization in Selectivity Learning	2025	VLDB	4.2783272e-05
9,825	Athena: An Effective Learning-based Framework for Query Optimizer Performance Improvement	2025	SIGMOD	4.2751057e-05
9,852	Machine Unlearning in Learned Databases: An Experimental Analysis	2024	SIGMOD	4.2714575e-05
9,869	Turbo-Charging SPJ Query Plans with Learned Physical Join Operator Selections	2022	VLDB	4.2675361e-05
9,878	PRICE: A Pretrained Model for Cross-Database Cardinality Estimation	2025	VLDB	4.2656547e-05
9,917	Check Out the Big Brain on BRAD: Simplifying Cloud Data Processing with Learned Automated Data Meshes	2023	VLDB	4.2561557e-05
9,945	SSCard: Substring Cardinality Estimation using Suffix Tree-Guided Learned FM-Index	2026	SIGMOD	4.2432653e-05
9,960	An Elephant Under The Microscope: Analyzing The Interaction Of Optimizer Components In PostgreSQL	2025	SIGMOD	4.2294678e-05

Outgoing Citations (Sorted by Pagerank)

Showing 25 of 25 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank	Cited Paper	Year	Venue	Pagerank
71	How Good Are Query Optimizers, Really?	2016	VLDB	0.00059038975
102	The Case for Learned Index Structures	2018	SIGMOD	0.00049545203
204	Learned Cardinalities: Estimating Correlated Joins with Deep Learning	2019	CIDR	0.00034784455
333	Neo: A Learned Query Optimizer	2019	VLDB	0.00027206884
372	Selectivity Estimation using Probabilistic Models	2001	SIGMOD	0.00025354779
405	Approximate Query Processing Using Wavelets	2000	VLDB	0.00024057494
758	Deep Unsupervised Cardinality Estimation	2020	VLDB	0.0001706608
806	An End-to-End Learning-based Cost Estimator	2020	VLDB	0.00016434274
943	Wander Join: Online Aggregation via Random Walks	2016	SIGMOD	0.00015145883
980	BayesStore: Managing Large, Uncertain Data Repositories with Probabilistic Graphical Models	2008	VLDB	0.00014879747
1,105	Cardinality Estimation Done Right: Index-Based Join Sampling	2017	CIDR	0.00013990395
1,204	VerdictDB: Universalizing Approximate Query Processing	2018	SIGMOD	0.00013319541
1,254	Selectivity Estimation for Range Predicates using Lightweight Models	2019	VLDB	0.00013027411
1,569	Querying Continuous Functions in a Database System	2008	SIGMOD	0.0001132337
2,083	Towards a Learning Optimizer for Shared Clouds	2019	VLDB	9.5834572e-05
2,129	IDEBench: A Benchmark for Interactive Data Exploration	2020	SIGMOD	9.480002e-05
2,501	DBEst: Revisiting Approximate Query Processing Engines with Machine Learning Models	2019	SIGMOD	8.6453446e-05
2,588	Database Learning: Toward a Database that Becomes Smarter Every Time	2017	SIGMOD	8.4909562e-05
2,669	A Black-Box Approach to Query Cardinality Estimation	2007	CIDR	8.3389856e-05
2,841	Selectivity Estimation in Extensible Databases - A Neural Network Approach	1998	VLDB	8.0287389e-05
2,865	Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations	2019	SIGMOD	7.9862595e-05
4,164	SlimShot: In-Database Probabilistic Inference for Knowledge Bases	2016	VLDB	6.3923099e-05
5,266	Probabilistic Databases with MarkoViews	2012	VLDB	5.5972559e-05
7,434	Local Structure and Determinism in Probabilistic Databases	2012	SIGMOD	4.7314358e-05
8,067	Learning Statistical Models from Relational Data	2011	SIGMOD	4.5937196e-05

Semantically Similar Papers

Overall Rank	Paper	Year	Venue	Pagerank
5,637	Database Workload Characterization with Query Plan Encoders	2022	VLDB	5.3979505e-05
6,230	Learned Approximate Query Processing: Make it Light, Accurate and Fast	2021	CIDR	5.145989e-05
5,473	Facilitating SQL Query Composition and Analysis	2020	SIGMOD	5.4885366e-05
13,229	Using Deep Learning Models to Replace Large Materialized Views in Relational Database	2021	CIDR	-
2,364	Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries	2020	SIGMOD	8.9554751e-05
9,120	Deep Query Optimization	2019	SIGMOD	4.392741e-05
9,852	Machine Unlearning in Learned Databases: An Experimental Analysis	2024	SIGMOD	4.2714575e-05
884	Plan-Structured Deep Neural Network Models for Query Performance Prediction	2019	VLDB	0.00015654004
3,658	Towards a Hands-Free Query Optimizer through Deep Learning	2019	CIDR	6.8704209e-05
9,892	DBMS Fitting: Why should we learn what we already know?	2020	CIDR	4.261445e-05