Database Paper Browser

ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning

Summary: ActiveClean is a progressive data-cleaning framework that interleaves cleaning with ML training, updating the model iteratively as the analyst cleans small batches of data. Key ideas include importance weighting, dirty-data detection, and a visual interface. The framework targets high-dimensional models and is demonstrated on video classification and topic modeling. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5181
Venue: SIGMOD
Year: 2016
Pagerank: 5.2682177e-05
Overall Rank: 5,929 | 58.76%
DOI: 10.1145/2882903.2899409

Authors

Incoming Citations (Sorted by Pagerank)

Showing 6 of 6 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 6 of 6 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers