| 791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 0.00016629664 |
| 1,350 | Northstar: An Interactive Data Science System | 2018 | VLDB | 0.00012431059 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 1,874 | Knowing When You’re Wrong: Building Fast and Reliable Approximate Query Processing Systems | 2014 | SIGMOD | 0.00010244443 |
| 1,882 | Tuplex: Data Science in Python at Native Code Speed | 2021 | SIGMOD | 0.0001021625 |
| 1,894 | Baran: Effective Error Correction via a Unified Context Representation and Transfer Learning | 2020 | VLDB | 0.0001018378 |
| 2,132 | Towards Sustainable Insights or why polygamy is bad for you | 2017 | CIDR | 9.4770432e-05 |
| 2,302 | Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions | 2021 | VLDB | 9.0668832e-05 |
| 2,797 | Query-Oriented Data Cleaning with Oracles | 2015 | SIGMOD | 8.1108589e-05 |
| 2,946 | BigDansing: A System for Big Data Cleansing | 2015 | SIGMOD | 7.8372441e-05 |
| 3,263 | QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications | 2015 | SIGMOD | 7.3097573e-05 |
| 3,773 | Cleaning Crowdsourced Labels Using Oracles for Statistical Classification | 2019 | VLDB | 6.7758649e-05 |
| 3,944 | AQP++: Connecting Approximate Query Processing With Aggregate Precomputation for Interactive Analytics | 2018 | SIGMOD | 6.6078243e-05 |
| 4,273 | Cleaning Denial Constraint Violations through Relaxation | 2020 | SIGMOD | 6.3003864e-05 |
| 4,375 | Sample Debiasing in the Themis Open World Database System | 2020 | SIGMOD | 6.2427076e-05 |
| 4,451 | CLAMShell: Speeding up Crowds for Low-latency Data Labeling | 2016 | VLDB | 6.1738675e-05 |
| 4,668 | PrivateClean: Data Cleaning and Differential Privacy | 2016 | SIGMOD | 6.0115918e-05 |
| 5,153 | Horizon: Scalable Dependency-driven Data Cleaning | 2021 | VLDB | 5.6607963e-05 |
| 5,586 | QuERy: A Framework for Integrating Entity Resolution with Query Processing | 2016 | VLDB | 5.4219548e-05 |
| 5,929 | ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning | 2016 | SIGMOD | 5.2682177e-05 |
| 6,689 | Efficient Knowledge Graph Accuracy Evaluation | 2019 | VLDB | 4.9623586e-05 |
| 6,740 | Combining Aggregation and Sampling (Nearly) Optimally for Approximate Query Processing | 2021 | SIGMOD | 4.944395e-05 |
| 7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05 |
| 7,117 | Crowdsourced Data Management: Overview and Challenges | 2017 | SIGMOD | 4.826509e-05 |
| 7,237 | CleanM: An Optimizable Query Language for Unified Scale-Out Data Cleaning | 2017 | VLDB | 4.7928651e-05 |
| 7,251 | Learning to Sample: Counting with Complex Queries | 2020 | VLDB | 4.7890519e-05 |
| 7,634 | ReStore - Neural Data Completion for Relational Databases | 2021 | SIGMOD | 4.6911382e-05 |
| 7,766 | ICARUS: Minimizing Human Effort in Iterative Data Completion | 2018 | VLDB | 4.6564959e-05 |
| 8,593 | Wisteria: Nurturing Scalable Data Cleaning Infrastructure | 2015 | VLDB | 4.4891474e-05 |
| 8,728 | Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views | 2015 | VLDB | 4.4589711e-05 |
| 9,043 | Query-Guided Resolution in Uncertain Databases | 2023 | SIGMOD | 4.4039656e-05 |
| 9,054 | Selecting Data to Clean for Fact Checking: Minimizing Uncertainty vs. Maximizing Surprise | 2019 | VLDB | 4.4039656e-05 |
| 9,056 | A Data Quality Metric (DQM): How to Estimate the Number of Undetected Errors in Data Sets | 2017 | VLDB | 4.4039656e-05 |
| 9,196 | QOCO: A Query Oriented Data Cleaning System with Oracles | 2015 | VLDB | 4.3749064e-05 |
| 9,348 | GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models | 2024 | SIGMOD | 4.3526427e-05 |
| 10,617 | Deduplicated Sampling On-Demand | 2025 | VLDB | 4.1945683e-05 |
| 11,029 | Efficient and Reliable Estimation of Knowledge Graph Accuracy | 2024 | VLDB | 4.1945683e-05 |