Rock: Cleaning Data with both ML and Logic Rules
Summary: Rock embeds ML models as predicates in logic rules, enabling joint rule-based deduction and learned inference for entity resolution (ER), conflict resolution, timeliness deduction, and missing-value imputation across relational tables. It provides scalable parallel batch and incremental algorithms for rule discovery, error detection, and error correction, along with a user-friendly interface, and has been deployed in production (banks, HR). (summarized by gpt-5-mini on Feb 09 2026)
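To make the idea in the summary concrete, here is a minimal Python sketch of a logic rule whose precondition combines an ordinary equality predicate with an ML predicate. It is not Rock's rule syntax or API. The `Rule` class, the `similar_name` stand-in (a standard-library string-similarity ratio used in place of a trained model), the 0.8 threshold, and the sample tuples are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Predicate = Callable[[Row, Row], bool]


def same_zip(t: Row, s: Row) -> bool:
    """Logic predicate: plain equality on one attribute."""
    return t["zip"] == s["zip"]


def similar_name(t: Row, s: Row, threshold: float = 0.8) -> bool:
    """Stand-in for an ML predicate (assumption): a real system would call a trained matcher."""
    score = SequenceMatcher(None, t["name"].lower(), s["name"].lower()).ratio()
    return score >= threshold


@dataclass
class Rule:
    """Hypothetical rule: if every precondition holds for (t, s), conclude that t and s match."""
    preconditions: List[Predicate]

    def fires(self, t: Row, s: Row) -> bool:
        return all(p(t, s) for p in self.preconditions)


def find_matches(rows: List[Row], rule: Rule) -> List[Tuple[str, str]]:
    """Return the id pairs of tuples that the rule identifies as the same entity."""
    return [(t["id"], s["id"]) for t, s in combinations(rows, 2) if rule.fires(t, s)]


# Illustrative data only; not taken from the paper.
rows = [
    {"id": "r1", "name": "Jonathan Smith", "zip": "10001"},
    {"id": "r2", "name": "Jonathon Smith", "zip": "10001"},
    {"id": "r3", "name": "Maria Garcia", "zip": "10001"},
]
er_rule = Rule(preconditions=[same_zip, similar_name])
matches = find_matches(rows, er_rule)
print(matches)  # [('r1', 'r2')]
```

In this sketch the rule treats the learned predicate and the logical predicate uniformly, so a conclusion is drawn only when both agree. A trained model could be swapped in for `similar_name` without changing the rule itself.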
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Zian Bao
2. Binbin Bie
3. Wenfei Fan
4. Daji Li
5. Mengyun Li
6. Kaiwen Lin
7. Wei Lin
8. Peijie Liu
9. Peng Liu
10. Zhicong Lv
11. Mingliang Ouyang
12. Chenyang Sun
13. Shuai Tang
14. Yaoshu Wang
15. Qiyuan Wei
16. Xiangqian Wu
17. Min Xie
18. Jing Zhang
19. Runxiao Zhao
20. Jie Zhu
21. Yilin Zhu
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 49 | Consistent Query Answers in Inconsistent Databases | 1999 | PODS | 0.00067660624 |
| 192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858 |
| 1,894 | Baran: Effective Error Correction via a Unified Context Representation and Transfer Learning | 2020 | VLDB | 0.0001018378 |
| 2,968 | Raha: A Configuration-Free Error Detection System | 2019 | SIGMOD | 7.7985097e-05 |
| 6,690 | Parallel Discrepancy Detection and Incremental Detection | 2021 | VLDB | 4.9621556e-05 |
| 9,355 | Discovering Top-k Rules using Subjective and Objective Criteria | 2023 | SIGMOD | 4.3514328e-05 |
| 9,434 | Rock: Cleaning Data by Embedding ML in Logic Rules | 2024 | SIGMOD | 4.3430376e-05 |
| 9,963 | Parallel Rule Discovery from Large Datasets by Sampling | 2022 | SIGMOD | 4.2294678e-05 |
| 11,223 | Splitting Tuples of Mismatched Entities | 2023 | SIGMOD | 4.1945683e-05 |
| 11,234 | Learning and Deducing Temporal Orders | 2023 | VLDB | 4.1945683e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,446 | MiniClean: A Single-Machine System for Cleaning Big Graphs | 2025 | SIGMOD | 4.1945683e-05 |
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 10,723 | UniClean: A Scalable Data Cleaning Solution for Mixed Errors based on Unified Cleaners and Optimized Cleaning Workflow | 2025 | VLDB | 4.1945683e-05 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 10,821 | Demonstrating Matelda for Multi-Table Error Detection | 2025 | VLDB | 4.1945683e-05 |
| 5,660 | Descriptive and Prescriptive Data Cleaning | 2014 | SIGMOD | 5.3847321e-05 |
| 7,867 | Learning Over Dirty Data Without Cleaning | 2020 | SIGMOD | 4.6320452e-05 |
| 9,278 | Interactive and Deterministic Data Cleaning: A Tossed Stone Raises a Thousand Ripples | 2016 | SIGMOD | 4.3639892e-05 |
| 9,487 | Making It Tractable to Catch Duplicates and Conflicts in Graphs | 2023 | SIGMOD | 4.3341665e-05 |
| 9,434 | Rock: Cleaning Data by Embedding ML in Logic Rules | 2024 | SIGMOD | 4.3430376e-05 |