Using SPIDER: An Experience Report
Summary: Experience report on SPIDER, a SQL-driven framework for flexible, fuzzy string matching over large databases. It uses tf-idf weighted q-grams and cosine similarity with declarative SQL preprocessing to leverage DBMS optimizers for scalable data cleaning and integration. (summarized by gpt-5-nano on Feb 09 2026)
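The matching idea named in the summary (tf-idf weighted q-grams compared by cosine similarity) can be illustrated with a small in-memory sketch. This is not SPIDER's implementation, which the summary describes as expressed in SQL and run inside the DBMS. The function names, the padding characters, the choice of q = 3, the smoothed idf formula, and the sample strings are all assumptions made for illustration.

```python
import math
from collections import Counter

def qgrams(s, q=3):
    """Return the overlapping q-grams of s, lowercased and padded at both ends."""
    padded = "#" * (q - 1) + s.lower() + "$" * (q - 1)
    return [padded[i:i + q] for i in range(len(padded) - q + 1)]

def build_idf(strings, q=3):
    """Compute a smoothed idf weight for every q-gram seen in the corpus."""
    n = len(strings)
    df = Counter()
    for s in strings:
        df.update(set(qgrams(s, q)))  # count each q-gram once per string
    return {g: math.log(1 + n / d) for g, d in df.items()}

def tfidf_vector(s, idf, q=3):
    """Build a unit-length tf-idf vector; q-grams unseen in the corpus get weight 0."""
    tf = Counter(qgrams(s, q))
    vec = {g: c * idf.get(g, 0.0) for g, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}

def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(w * v.get(g, 0.0) for g, w in u.items())

# Hypothetical sample data, not taken from the paper.
corpus = [
    "International Business Machines",
    "Intl Business Machines Corp",
    "Microsoft Corporation",
]
idf = build_idf(corpus)
query = tfidf_vector("Internatonal Busines Machines", idf)
ranked = sorted(corpus, key=lambda s: cosine(query, tfidf_vector(s, idf)), reverse=True)
```

In this sketch, `ranked` orders the corpus strings by similarity to the misspelled query. A SQL formulation would typically store the q-grams and their weights in relations and compute the dot product with a join and an aggregate, which is what lets the DBMS optimizer plan the work.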
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Nick Koudas
2. Amit Marathe
3. Divesh Srivastava
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,026 | Flexible String Matching Against Large Databases in Practice | 2004 | VLDB | 6.5169976e-05 |
| 9,430 | Approximate Joins: Concepts and Techniques | 2005 | VLDB | 4.3441378e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,438 | Selectivity Estimation for Fuzzy String Predicates in Large Data Sets | 2005 | VLDB | 6.1898903e-05 |
| 3,992 | Discovering Linkage Points over Web Data | 2013 | VLDB | 6.5544834e-05 |
| 3,823 | Automatic Discovery of Attributes in Relational Databases | 2011 | SIGMOD | 6.7261168e-05 |
| 7,669 | Incorporating String Transformations in Record Matching | 2008 | SIGMOD | 4.6833751e-05 |
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 1,533 | Example-driven Design of Efficient Record Matching Queries | 2007 | VLDB | 0.00011471971 |
| 4,435 | Sampling Dirty Data for Matching Attributes | 2010 | SIGMOD | 6.1918164e-05 |
| 4,340 | SPIDER: a Schema mapPIng DEbuggeR | 2006 | VLDB | 6.2769021e-05 |
| 4,026 | Flexible String Matching Against Large Databases in Practice | 2004 | VLDB | 6.5169976e-05 |
| 12,544 | SPIDER: Flexible Matching in Databases | 2005 | SIGMOD | 4.1945683e-05 |