Repairing Data through Regular Expressions
Summary: Repairs data by editing sequences so that they match a given regular expression with as few edits as possible. The RSR algorithm is a dynamic program over NFAs that measures prefix-regex edit distance (O(n m^2) time, O(mn) space); it is extended to repair token values through a unified edit distance with rule-based selection. (summarized by gpt-5-nano on Feb 09 2026)
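To make the idea of a dynamic program over an NFA concrete, here is a minimal sketch of string-to-regex edit distance. It is not the paper's RSR algorithm: the function name `regex_edit_distance`, the transition-list encoding, and the example automaton `TRANSITIONS` (for the regex `ab*c`) are all assumptions made for illustration, and the NFA is assumed to have no epsilon transitions.

```python
from math import inf

# Hypothetical epsilon-free NFA for the regex ab*c, as (source, symbol, target).
TRANSITIONS = [(0, "a", 1), (1, "b", 1), (1, "c", 2)]
START, ACCEPTING, NUM_STATES = 0, {2}, 3


def regex_edit_distance(s, transitions, start, accepting, num_states):
    """Fewest single-symbol insertions, deletions and substitutions needed
    to turn `s` into some string accepted by the NFA."""

    def close_insertions(row):
        # Inserting a symbol advances the NFA without consuming input,
        # so relax every transition at cost 1 until nothing improves.
        changed = True
        while changed:
            changed = False
            for src, _sym, dst in transitions:
                if row[src] + 1 < row[dst]:
                    row[dst] = row[src] + 1
                    changed = True
        return row

    # row[q] = cheapest way to edit the prefix read so far so that it
    # drives the NFA from the start state to state q.
    row = [inf] * num_states
    row[start] = 0
    row = close_insertions(row)

    for ch in s:
        nxt = [cost + 1 for cost in row]  # delete ch, stay in the same state
        for src, sym, dst in transitions:
            step = row[src] + (0 if sym == ch else 1)  # match or substitute
            if step < nxt[dst]:
                nxt[dst] = step
        row = close_insertions(nxt)

    return min(row[q] for q in accepting)
```

This sketch keeps only one row of the table, so it returns the distance but not the repaired string; recovering the actual edits would require storing every row and tracing back through them.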
Authors
1. Zeyu Li
2. Hongzhi Wang
3. Wei Shao
4. Jianzhong Li
5. Hong Gao
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,826 | Exploiting Structure in Regular Expression Queries | 2023 | SIGMOD | 4.2751057e-05 |
| 10,216 | The Case For Language Model Approximated LIKE Predicate | 2026 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13 | Mining Association Rules between Sets of Items in Large Databases | 1993 | SIGMOD | 0.0010864752 |
| 112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036 |
| 656 | ERACER: A Database Approach for Statistical Inference and Data Cleaning | 2010 | SIGMOD | 0.00018588729 |
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 855 | Integrating Conflicting Data: The Role of Source Dependence | 2009 | VLDB | 0.00015906735 |
| 1,197 | The LLUNATIC Data-Cleaning Framework | 2013 | VLDB | 0.00013390321 |
| 1,246 | Truth Discovery and Copying Detection in a Dynamic World | 2009 | VLDB | 0.0001307161 |