Generalizable Data Cleaning of Tabular Data in Latent Space
Summary: The paper cleans tabular data in latent space. It shapes the latent manifold to define a "clean" region and trains Lopster operators that shift the embeddings of rows affected by noise, outliers, or missing values back into that region. The approach unifies error detection and repair, generalizes to unseen error types, and is reported to outperform state-of-the-art methods. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Eduardo Reis
2. Mohamed Abdelaal
3. Carsten Binnig
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,018 | Statistical Distortion: Consequences of Data Cleaning | 2012 | VLDB | 9.7764643e-05 |
| 13,232 | Data Cleaning in the Era of Data Science: Challenges and Opportunities | 2021 | CIDR | - |
| 881 | Don’t be SCAREd: Use SCalable Automatic REpairing with Maximal Likelihood and Bounded Changes | 2013 | SIGMOD | 0.00015661103 |
| 6,187 | Semi-Supervised Data Cleaning with Raha and Baran | 2021 | CIDR | 5.1656857e-05 |
| 6,280 | Self-supervised and Interpretable Data Cleaning with Sequence Generative Adversarial Networks | 2023 | VLDB | 5.1290457e-05 |
| 7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05 |
| 10,512 | Auto-Test: Learning Semantic-Domain Constraints for Unsupervised Error Detection in Tables | 2025 | SIGMOD | 4.1945683e-05 |
| 7,867 | Learning Over Dirty Data Without Cleaning | 2020 | SIGMOD | 4.6320452e-05 |
| 3,396 | Automatic Data Repair: Are We Ready to Deploy? | 2024 | VLDB | 7.1455126e-05 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |