Database Paper Browser

Data Cleaning in the Era of Data Science: Challenges and Opportunities

Summary: Traditional one-shot, monolithic cleaning tools fail for iterative, multi-pipeline data-science workflows with many operators. The paper pinpoints pipeline diversity, monolithic tooling, and subjective or ad-hoc errors as the core challenges, and calls for composable, pipeline-aware cleaning primitives. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 396
Venue: CIDR
Year: 2021
Pagerank: -
Overall Rank: 13,232 | 7.95%
DOI: -

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Authors

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.


Outgoing Citations (Sorted by Pagerank)

Showing 2 of 2 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
5,684 | Dagger: A Data (not code) Debugger | 2020 | CIDR | 5.3720749e-05
9,306 | Debugging Large-Scale Data Science Pipelines using Dagger | 2020 | VLDB | 4.3572942e-05

Semantically Similar Papers