On-the-Fly Token Similarity Joins in Relational Databases
Summary: Introduces tokenize, a relational operator that generates tokens and embeds them in the plan, enabling optimization without precomputed tokens. Key ideas: algebraic rules, cardinality estimates, and replication-free handling of nested tokenize in PostgreSQL. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Nikolaus Augsten
2. Armando Miraglia
3. Thomas Neumann
4. Alfons Kemper
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,402 | Smurf: Self-Service String Matching Using Random Forests | 2019 | VLDB | 6.2195162e-05 |
| 4,775 | Set Similarity Joins on MapReduce: An Experimental Survey | 2018 | VLDB | 5.9315784e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 11 of 11 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 674 | Supporting Top-k Join Queries in Relational Databases | 2003 | VLDB | 0.00018327585 |
| 5,421 | Constructing Queries from Tokens | 1986 | SIGMOD | 5.514998e-05 |
| 9,563 | Towards a Unified Framework for String Similarity Joins | 2019 | VLDB | 4.3254416e-05 |
| 3,490 | Leveraging Set Relations in Exact Set Similarity Join | 2017 | VLDB | 7.0465856e-05 |
| 250 | Efficient set joins on similarity predicates | 2004 | SIGMOD | 0.00030661988 |
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 3,833 | Output-optimal Parallel Algorithms for Similarity Joins | 2017 | PODS | 6.7173578e-05 |
| 8,899 | Fast Approximate Similarity Join in Vector Databases | 2025 | SIGMOD | 4.427232e-05 |
| 13,473 | Exploiting Database Similarity Joins for Metric Spaces | 2012 | VLDB | - |
| 11,305 | TokenJoin: Efficient Filtering for Set Similarity Join with Maximum Weighted Bipartite Matching | 2023 | VLDB | 4.1945683e-05 |