Data Cleaning in Microsoft SQL Server 2005
Summary: Introduces Fuzzy Lookup and Fuzzy Grouping, two data-cleaning operators in SQL Server 2005 Integration Services. Both are scalable and domain-independent: Fuzzy Lookup repairs foreign-key mismatches against a reference table, and Fuzzy Grouping identifies near-duplicate records, yielding higher-quality data for data warehouses and data mining. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Surajit Chaudhuri
2. Kris Ganjam
3. Venky Ganti
4. Rahul Kapoor
5. Vivek Narasayya
6. Theo Vassilakis
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,419 | A Deferred Cleansing Method for RFID Data Analytics | 2006 | VLDB | 5.0690363e-05 |
| 7,669 | Incorporating String Transformations in Record Matching | 2008 | SIGMOD | 4.6833751e-05 |
| 12,425 | XClean in Action: A Demonstration of Declarative XML Data Cleaning | 2007 | CIDR | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 67 | The Merge/Purge Problem for Large Databases | 1995 | SIGMOD | 0.00061348205 |
| 155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 8,137 | Customizable and Scalable Fuzzy Join for Big Data | 2019 | VLDB | 4.5774794e-05 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 3,360 | Modeling and Querying Possible Repairs in Duplicate Detection | 2009 | VLDB | 7.1742067e-05 |
| 623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374 |
| 7,013 | Qualitative Data Cleaning | 2016 | VLDB | 4.8619024e-05 |
| 5,660 | Descriptive and Prescriptive Data Cleaning | 2014 | SIGMOD | 5.3847321e-05 |
| 507 | Data Quality and Data Cleaning: An Overview | 2003 | SIGMOD | 0.00021473263 |
| 155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896 |
| 280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044 |