Database Paper Browser

Potter's Wheel: An Interactive Data Cleaning System

Summary: Potter's Wheel is an interactive data-cleaning system that tightly couples transformation with discrepancy detection, enabling progressive fixes in a spreadsheet-like interface. Transforms are defined graphically or by example, with immediate visual feedback; the system infers value structures and checks constraints in the background. (summarized by gpt-5-nano on Feb 09 2026)
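
As a rough illustration of what "infers value structures and checks constraints" could mean in practice, the sketch below collapses each value in a column into a coarse pattern, treats the most common pattern as the column's structure, and flags values that deviate from it. This is a simplified stand-in and not the paper's actual inference algorithm; the function names `value_pattern` and `find_discrepancies` and the sample data are hypothetical.

```python
import re
from collections import Counter


def value_pattern(value: str) -> str:
    """Collapse a value into a coarse structure string.

    Runs of digits become '9' and runs of letters become 'A';
    punctuation and whitespace are kept as-is.
    """
    pattern = re.sub(r"[0-9]+", "9", value)
    pattern = re.sub(r"[A-Za-z]+", "A", pattern)
    return pattern


def find_discrepancies(column):
    """Return (dominant_pattern, [(row_index, value), ...]) for one column.

    The dominant pattern is the most frequent one; every value with a
    different pattern is reported as a potential discrepancy.
    """
    if not column:
        return None, []
    patterns = [value_pattern(v) for v in column]
    dominant, _ = Counter(patterns).most_common(1)[0]
    outliers = [
        (i, v) for i, (v, p) in enumerate(zip(column, patterns)) if p != dominant
    ]
    return dominant, outliers


# Hypothetical column with one value in a different date format.
dates = ["2001/09/11", "2001/09/12", "Sept 13, 2001", "2001/09/14"]
structure, flagged = find_discrepancies(dates)
print(structure)  # 9/9/9
print(flagged)    # [(2, 'Sept 13, 2001')]
```

In an interactive setting, a check like this would be re-run after each transform so that newly exposed discrepancies surface as the user works.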

Paper ID: 8762
Venue: VLDB
Year: 2001
Pagerank: 0.00047045036
Overall Rank: 112 | 99.23%
DOI: -

[Chart: Incoming Non-self Citations Over Time]

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 66 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
149 | Trio: A System for Integrated Management of Data, Accuracy, and Lineage | 2005 | CIDR | 0.00041101118
265 | A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification | 2005 | SIGMOD | 0.00029763412
280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044
355 | Hippocratic Databases | 2002 | VLDB | 0.00026087195
475 | Mining Database Structure; Or, How to Build a Data Quality Browser | 2002 | SIGMOD | 0.00022303253
489 | Data Curation at Scale: The Data Tamer System | 2013 | CIDR | 0.00022030728
518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934
623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374
672 | An Interactive Clustering-based Approach to Integrating Source Query Interfaces on the Deep Web | 2004 | SIGMOD | 0.00018355746
732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093
833 | Guided Data Repair | 2011 | VLDB | 0.00016138432
1,012 | NADEEF: A Commodity Data Cleaning System | 2013 | SIGMOD | 0.0001464733
1,267 | Foofah: Transforming Data By Example | 2017 | SIGMOD | 0.00012936483
1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 0.0001204248
1,469 | BlinkFill: Semi-supervised Programming By Example for Syntactic String Transformations | 2016 | VLDB | 0.00011836053
1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851
1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905
1,908 | Information-Theoretic Tools for Mining Database Structure from Large Data Sets | 2004 | SIGMOD | 0.00010126101
2,097 | Predictive Interaction for Data Transformation | 2015 | CIDR | 9.5489822e-05
2,122 | SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle | 2020 | CIDR | 9.4989076e-05
2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05
2,506 | Auto-Detect: Data-Driven Error Detection in Tables | 2018 | SIGMOD | 8.6335464e-05
2,589 | DogmatiX Tracks down Duplicates in XML | 2005 | SIGMOD | 8.4847146e-05
2,797 | Query-Oriented Data Cleaning with Oracles | 2015 | SIGMOD | 8.1108589e-05
2,888 | Sato: Contextual Semantic Type Detection in Tables | 2020 | VLDB | 7.9594996e-05
3,051 | Partial Results in Database Systems | 2014 | SIGMOD | 7.6512591e-05
3,105 | Data X-Ray: A Diagnostic Tool for Data Errors | 2015 | SIGMOD | 7.5568954e-05
3,192 | Towards Dependable Data Repairing with Fixing Rules | 2014 | SIGMOD | 7.4095761e-05
3,467 | Data Profiling – A Tutorial | 2017 | SIGMOD | 7.069081e-05
3,690 | Navigating the Data Lake with DATAMARAN: Automatically Extracting Structure from Log Datasets | 2018 | SIGMOD | 6.8384476e-05
3,712 | MOMA - A Mapping-based Object Matching System | 2007 | CIDR | 6.823134e-05
3,713 | GDR: A System for Guided Data Repair | 2010 | SIGMOD | 6.8224341e-05
4,635 | Mining Precision Interfaces From Query Logs | 2019 | SIGMOD | 6.033398e-05
4,929 | Data Auditor: Exploring Data Quality and Semantics using Pattern Tableaux | 2010 | VLDB | 5.8217296e-05
5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05
5,099 | ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models | 2024 | VLDB | 5.6997784e-05
5,445 | QFix: Diagnosing Errors through Query Histories | 2017 | SIGMOD | 5.5020909e-05
5,525 | QueryBooster: Improving SQL Performance Using Middleware Services for Human-Centered Query Rewriting | 2023 | VLDB | 5.4600815e-05
5,803 | Semandaq: A Data Quality System Based on Conditional Functional Dependencies | 2008 | VLDB | 5.3205861e-05
5,867 | Combining Design and Performance in a Data Visualization Management System | 2017 | CIDR | 5.296418e-05
5,981 | DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python | 2021 | SIGMOD | 5.2448986e-05
5,987 | Sampling Cube: A Framework for Statistical OLAP Over Sampling Data | 2008 | SIGMOD | 5.2432535e-05
6,196 | Just-in-time Data Integration in Action | 2010 | VLDB | 5.1614813e-05
6,350 | NADEEF: A Generalized Data Cleaning System | 2013 | VLDB | 5.101815e-05
6,384 | A Demonstration of DBWipes: Clean as You Query | 2012 | VLDB | 5.0880333e-05
6,416 | Synthesizing Type-Detection Logic for Rich Semantic Data Types using Open-source Code | 2018 | SIGMOD | 5.072267e-05
6,817 | Error Diagnosis and Data Profiling with Data X-Ray | 2015 | VLDB | 4.9171711e-05
7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05
7,237 | CleanM: An Optimizable Query Language for Unified Scale-Out Data Cleaning | 2017 | VLDB | 4.7928651e-05
7,564 | PIClean: A Probabilistic and Interactive Data Cleaning System | 2019 | SIGMOD | 4.7093702e-05

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers