BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees
Summary: BlinkML enables fast, approximate ML training with probabilistic guarantees that the approximate model's predictions match those of the full model. It supports any MLE-based model (e.g., GLMs, PPCA) and uses error-bounded sampling to deliver 6x–629x speedups while preserving the full model's decisions.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 5608
- Venue: SIGMOD
- Year: 2019
- Pagerank: 5.3200643e-05
- Overall Rank: 5,806 | 59.61%
- DOI: 10.1145/3299869.3300077
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 27 of 27 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614 |
| 140 | The MADlib Analytics Library or MAD Skills, the SQL | 2012 | VLDB | 0.00042270404 |
| 211 | Join Synopses for Approximate Query Answering | 1999 | SIGMOD | 0.00033981214 |
| 316 | NoScope: Optimizing Neural Network Queries over Video at Scale | 2017 | VLDB | 0.00027988668 |
| 429 | The Aqua Approximate Query Answering System | 1999 | SIGMOD | 0.00023476494 |
| 557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988 |
| 667 | Incremental Knowledge Base Construction Using DeepDive | 2015 | VLDB | 0.00018440557 |
| 761 | Materialization Optimizations for Feature Selection Workloads | 2014 | SIGMOD | 0.00017053783 |
| 834 | Learning Linear Regression Models over Factorized Joins | 2016 | SIGMOD | 0.00016135159 |
| 903 | To Join or Not to Join? Thinking Twice about Joins before Feature Selection | 2016 | SIGMOD | 0.0001547016 |
| 943 | Wander Join: Online Aggregation via Random Walks | 2016 | SIGMOD | 0.00015145883 |
| 1,044 | DimmWitted: A Study of Main-Memory Statistical Analytics | 2014 | VLDB | 0.00014475229 |
| 1,167 | Learning Generalized Linear Models Over Normalized Data | 2015 | SIGMOD | 0.00013547713 |
| 1,204 | VerdictDB: Universalizing Approximate Query Processing | 2018 | SIGMOD | 0.00013319541 |
| 1,260 | Dynamic Sample Selection for Approximate Query Processing | 2003 | SIGMOD | 0.00012993347 |
| 1,323 | Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters | 2016 | SIGMOD | 0.00012601997 |
| 1,874 | Knowing When You’re Wrong: Building Fast and Reliable Approximate Query Processing Systems | 2014 | SIGMOD | 0.00010244443 |
| 1,942 | Heterogeneity-aware Distributed Parameter Servers | 2017 | SIGMOD | 0.00010012691 |
| 1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05 |
| 2,365 | The Analytical Bootstrap: a New Method for Fast Error Estimation in Approximate Query Processing | 2014 | SIGMOD | 8.9551432e-05 |
| 2,588 | Database Learning: Toward a Database that Becomes Smarter Every Time | 2017 | SIGMOD | 8.4909562e-05 |
| 2,915 | Brainwash: A Data System for Feature Engineering | 2013 | CIDR | 7.9078385e-05 |
| 3,842 | Turbo-Charging Estimate Convergence in DBO | 2009 | VLDB | 6.7102374e-05 |
| 5,224 | Neighbor-Sensitive Hashing | 2016 | VLDB | 5.6197981e-05 |
| 6,411 | Approximate Query Engines: Commercial Challenges and Research Opportunities | 2017 | SIGMOD | 5.0752468e-05 |
| 9,469 | DimBoost: Boosting Gradient Boosting Decision Tree to Higher Dimensions | 2018 | SIGMOD | 4.3342363e-05 |
| 11,711 | Demonstration of VerdictDB, the Platform-Independent AQP System | 2018 | SIGMOD | 4.1945683e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,607 | Ease.ml/snoopy in Action: Towards Automatic Feasibility Analysis for Machine Learning Application Development | 2020 | VLDB | 4.1945683e-05 |
| 3,897 | SLiMFast: Guaranteed Results for Data Fusion and Source Reliability | 2017 | SIGMOD | 6.6554845e-05 |
| 6,733 | Hindsight Logging for Model Training | 2021 | VLDB | 4.9467666e-05 |
| 11,539 | FlashP: An Analytical Pipeline for Real-time Forecasting of Time-Series Relational Data | 2021 | VLDB | 4.1945683e-05 |
| 1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05 |
| 3,808 | SketchML: Accelerating Distributed Machine Learning with Data Sketches | 2018 | SIGMOD | 6.7455428e-05 |
| 5,123 | Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-Precision Learning | 2019 | VLDB | 5.6796998e-05 |
| 1,469 | BlinkFill: Semi-supervised Programming By Example for Syntactic String Transformations | 2016 | VLDB | 0.00011836053 |
| 6,330 | Efficient Construction of Approximate Ad-Hoc ML models Through Materialization and Reuse | 2018 | VLDB | 5.1077416e-05 |
| 8,653 | ApproxML: Efficient Approximate Ad-Hoc ML Models Through Materialization and Reuse | 2019 | VLDB | 4.475291e-05 |