Database Paper Browser

Brainwash: A Data System for Feature Engineering

Summary: Brainwash is a vision for a data system that eases feature engineering for large ML-driven systems by shortening the Explore–Extract–Evaluate loop and revealing how feature code interacts with massive datasets. It focuses on faster iterative feedback and reuse of earlier runs. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 212
Venue: CIDR
Year: 2013
Pagerank: 7.9078385e-05
Overall Rank: 2,915 | 79.73%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 23 of 23 citing papers.

Rank Citing Paper Year Venue Pagerank
609 Monkey: Optimal Navigable Key-Value Store 2017 SIGMOD 0.0001923446
903 To Join or Not to Join? Thinking Twice about Joins before Feature Selection 2016 SIGMOD 0.0001547016
1,167 Learning Generalized Linear Models Over Normalized Data 2015 SIGMOD 0.00013547713
1,532 Data Management in Machine Learning: Challenges, Techniques, and Systems 2017 SIGMOD 0.00011472681
1,666 HELIX: Holistic Optimization for Accelerating Iterative Machine Learning 2019 VLDB 0.0001096361
2,157 The Data Calculator: Data Structure Design and Cost Synthesis from First Principles and Learned Cost Models 2018 SIGMOD 9.416022e-05
4,106 Extracting Databases from Dark Data with DeepDive 2016 SIGMOD 6.4456184e-05
4,129 Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers? 2018 VLDB 6.428887e-05
4,785 Demonstration of Santoku: Optimizing Machine Learning over Normalized Data 2015 VLDB 5.9236989e-05
5,308 Key-Value Storage Engines 2020 SIGMOD 5.576303e-05
5,806 BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees 2019 SIGMOD 5.3200643e-05
6,115 An Integrated Development Environment for Faster Feature Engineering 2014 VLDB 5.2042468e-05
6,347 A Relational Framework for Classifier Engineering 2017 PODS 5.1019568e-05
6,456 From Auto-tuning One Size Fits All to Self-designed and Learned Data-intensive Systems 2019 SIGMOD 5.0564619e-05
7,664 Schema Independent Relational Learning 2017 SIGMOD 4.6857329e-05
8,864 Cerebro: A Layered Data Platform for Scalable Deep Learning 2021 CIDR 4.4326439e-05
9,382 Hephaestus: Data Reuse for Accelerating Scientific Discovery 2015 CIDR 4.3457368e-05
10,177 InferF: Declarative Factorization of AI/ML Inferences over Joins 2026 SIGMOD 4.1945683e-05
11,476 Enforcing Constraints for Machine Learning Systems via Declarative Feature Selection: An Experimental Study 2021 SIGMOD 4.1945683e-05
11,975 Which Concepts Are Worth Extracting? 2014 SIGMOD 4.1945683e-05
12,020 The Case for Personal Data-Driven Decision Making 2014 VLDB 4.1945683e-05
13,360 Faster Evaluation of Labor-Intensive Features 2015 CIDR -
13,448 Ringtail: A Generalized Nowcasting System 2013 VLDB -

Outgoing Citations (Sorted by Pagerank)

Showing 0 of 0 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers