Database Paper Browser

Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction

Summary: Faerie provides a unified framework for approximate dictionary-based entity extraction that supports a range of similarity and dissimilarity measures. It uses overlap-aware filtering and pruning to share computation across overlapping substrings of the document, and the authors report that it outperforms prior state-of-the-art methods. (summarized by gpt-5-nano on Feb 09 2026)
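
To make the problem setting concrete, here is a minimal sketch of approximate dictionary-based entity extraction under an edit-distance threshold. It is a naive baseline written for illustration, not Faerie's algorithm: it enumerates every candidate substring and verifies each one independently, which is exactly the repeated work that filtering and sharing across overlapping substrings aim to avoid. The names `edit_distance`, `extract`, and `tau` are assumptions introduced here, not identifiers from the paper.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def extract(document: str, dictionary: List[str], tau: int) -> List[Tuple[int, str, str]]:
    """Return (start, substring, entity) for every substring of `document`
    whose edit distance to some dictionary entity is at most `tau`.

    Naive baseline (hypothetical helper): no filtering, no shared work.
    """
    matches = []
    for entity in dictionary:
        # A substring within edit distance tau of the entity can differ
        # in length from it by at most tau characters.
        lo = max(1, len(entity) - tau)
        hi = len(entity) + tau
        for start in range(len(document)):
            for length in range(lo, hi + 1):
                end = start + length
                if end > len(document):
                    break
                sub = document[start:end]
                if edit_distance(sub, entity) <= tau:
                    matches.append((start, sub, entity))
    return matches
```

For example, `extract("an efficient filter algoritm", ["algorithm"], 1)` finds the misspelled mention "algoritm" as an approximate match for the dictionary entity "algorithm". The cost of this baseline grows with the number of substrings times the dictionary size, which is why practical systems prune candidates before verification.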

Paper ID: 4410
Venue: SIGMOD
Year: 2011
Pagerank: 5.7177424e-05
Overall Rank: 5,073 | 64.71%
DOI: -

Authors

Incoming Citations (Sorted by Pagerank)

Showing 11 of 11 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 16 of 16 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972
155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896
250 | Efficient set joins on similarity predicates | 2004 | SIGMOD | 0.00030661988
266 | Efficient Exact Set-Similarity Joins | 2006 | VLDB | 0.00029718727
1,202 | VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams | 2007 | VLDB | 0.00013326298
1,234 | Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints | 2008 | VLDB | 0.00013122499
1,830 | Relaxing Join and Selection Queries | 2006 | VLDB | 0.000103862
2,213 | n-Gram/2L: A Space and Time Efficient Two-Level n-Gram Inverted Index Structure | 2005 | VLDB | 9.2765152e-05
2,779 | Hashed Samples: Selectivity Estimators For Set Similarity Selection Queries | 2008 | VLDB | 8.1320575e-05
3,226 | Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance | 2007 | VLDB | 7.3433307e-05
3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05
3,868 | An Efficient Filter for Approximate Membership Checking | 2008 | SIGMOD | 6.6822543e-05
4,873 | Power-Law Based Estimation of Set Similarity Join Size | 2009 | VLDB | 5.8602304e-05
4,951 | Mining Document Collections to Facilitate Accurate Approximate Entity Matching | 2009 | VLDB | 5.8100413e-05
4,988 | Incremental Maintenance of Length Normalized Indexes for Approximate String Matching | 2009 | SIGMOD | 5.783959e-05
5,379 | Scalable Ad-hoc Entity Extraction from Text Collections | 2008 | VLDB | 5.5405989e-05

Semantically Similar Papers