CAFE: Constraint-Aware Feature Extraction from Large Databases
Summary: CAFE extracts features from large databases while enforcing high-level constraints (consistency, interpretability, fairness). It maps these constraints to low-level pruning strategies and uses an inverted index to find candidate columns. An optimizer-like planner uses sample-based estimates, models dependencies between strategies, and orders the pruning steps to maximize downstream ML accuracy while bounding runtime and feature quality. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,769 | Automated Feature Engineering for Algorithmic Fairness | 2021 | VLDB | 5.934329e-05 |
| 6,270 | MATE: Multi-Attribute Table Extraction | 2022 | VLDB | 5.1337451e-05 |
| 10,836 | Data Discovery in Data Lakes: Operations, Indexes, Systems | 2025 | VLDB | 4.1945683e-05 |
| 11,476 | Enforcing Constraints for Machine Learning Systems via Declarative Feature Selection: An Experimental Study | 2021 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 420 | InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables | 2012 | SIGMOD | 0.00023719065 |
| 1,277 | The Data Civilizer System | 2017 | CIDR | 0.00012879695 |