An Efficient Filter for Approximate Membership Checking
Summary: Proposes a filter-verification framework for approximate substring membership against a large dictionary, enabling efficient named-entity and biomedical concept extraction. Introduces a novel in-memory filter that prunes non-matches early, guarantees no false negatives, and outperforms prior methods in filtering power and runtime.
(summarized by gpt-5-nano on Feb 09 2026)
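The filter-verification idea in the summary can be illustrated with a small sketch. This is not the paper's filter: it assumes token-set Jaccard similarity as the match criterion and uses two simple safe filters (a shared-token lookup and a set-size ratio check) chosen only for illustration. The names `jaccard`, `build_index`, `approximate_members`, `threshold`, and `max_len` are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Exact Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def build_index(dictionary):
    """Map each token to the (entry, token set) pairs that contain it."""
    index = defaultdict(set)
    for entry in dictionary:
        tokens = frozenset(entry.lower().split())
        for tok in tokens:
            index[tok].add((entry, tokens))
    return index


def approximate_members(text, dictionary, threshold=0.6, max_len=4):
    """Return (substring, entry) pairs whose token-set Jaccard >= threshold.

    Assumption: substrings are windows of at most max_len consecutive words.
    Requires threshold > 0, so that every match shares at least one token.
    """
    index = build_index(dictionary)
    words = text.lower().split()
    matches = set()
    for start in range(len(words)):
        for end in range(start + 1, min(start + max_len, len(words)) + 1):
            window = frozenset(words[start:end])
            # Filter 1: only entries sharing a token with the window can match.
            candidates = set()
            for tok in window:
                candidates |= index.get(tok, set())
            for entry, tokens in candidates:
                # Filter 2: Jaccard never exceeds min(size) / max(size),
                # so pruning on that ratio cannot drop a true match.
                small, large = sorted((len(window), len(tokens)))
                if small / large < threshold:
                    continue
                # Verification: compute the exact similarity for survivors.
                if jaccard(window, tokens) >= threshold:
                    matches.add((" ".join(words[start:end]), entry))
    return matches
```

Both filters only discard pairs that cannot reach the threshold, so the sketch returns the same matches as checking every window against every entry, while running the exact comparison on fewer pairs. For instance, calling `approximate_members("i bought a canon eos 5d camera yesterday", ["canon eos 5d digital camera"])` reports the window `canon eos 5d camera` as an approximate match for the dictionary entry.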
- Paper ID: 4034
- Venue: SIGMOD
- Year: 2008
- Pagerank: 6.6822543e-05
- Overall Rank: 3,868 | 73.10%
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,592 | Pass-Join: A Partition-based Method for Similarity Joins | 2012 | VLDB | 8.4795761e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05 |
| 4,250 | Local Similarity Search for Unstructured Text | 2016 | SIGMOD | 6.3241139e-05 |
| 4,951 | Mining Document Collections to Facilitate Accurate Approximate Entity Matching | 2009 | VLDB | 5.8100413e-05 |
| 5,073 | Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction | 2011 | SIGMOD | 5.7177424e-05 |
| 5,379 | Scalable Ad-hoc Entity Extraction from Text Collections | 2008 | VLDB | 5.5405989e-05 |
| 5,887 | Efficient Approximate Search on String Collections (Tutorial) | 2009 | VLDB | 5.2879769e-05 |
| 6,351 | SigMatch: Fast and Scalable Multi-Pattern Matching | 2010 | VLDB | 5.1005697e-05 |
| 6,580 | Query Portals: Dynamically Generating Portals for Entity-Oriented Web Queries | 2010 | SIGMOD | 5.0034092e-05 |
| 8,007 | A Grammar-based Entity Representation Framework for Data Cleaning | 2009 | SIGMOD | 4.6068018e-05 |
| 9,932 | Local Filtering: Improving the Performance of Approximate Queries on String Collections | 2015 | SIGMOD | 4.2500258e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,305 | TokenJoin: Efficient Filtering for Set Similarity Join with Maximum Weighted Bipartite Matching | 2023 | VLDB | 4.1945683e-05 |
| 4,684 | Approximate String Joins with Abbreviations | 2018 | VLDB | 6.0006406e-05 |
| 9,931 | ChainedFilter: Combining Membership Filters by Chain Rule | 2023 | SIGMOD | 4.250188e-05 |
| 5,073 | Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction | 2011 | SIGMOD | 5.7177424e-05 |
| 9,933 | Efficient and Effective KNN Sequence Search with Approximate n-grams | 2014 | VLDB | 4.2500258e-05 |
| 6,726 | A Pivotal Prefix Based Filtering Algorithm for String Similarity Search | 2014 | SIGMOD | 4.9484027e-05 |
| 8,143 | Approximate Substring Matching over Uncertain Strings | 2011 | VLDB | 4.5768015e-05 |
| 3,578 | Efficient Approximate Entity Extraction with Edit Distance Constraints | 2009 | SIGMOD | 6.9503858e-05 |
| 9,932 | Local Filtering: Improving the Performance of Approximate Queries on String Collections | 2015 | SIGMOD | 4.2500258e-05 |
| 7,708 | Efficient Top-k Algorithms for Approximate Substring Matching | 2013 | SIGMOD | 4.6721808e-05 |