Fast Dataset Search with Earth Mover’s Distance
Summary: Proposes Dual-Bound Filtering (DBF) for fast spatial dataset search under Earth Mover's Distance (EMD). Datasets are encoded as Z-order histograms organized in a tree, and candidates are pruned at two levels: pooling-based bounds and a TICT EMD bound. The approach substantially reduces the candidate set on four real repositories.
(summarized by gpt-5-nano on Feb 09 2026)
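To make the "Z-order histogram" idea in the summary concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names (`morton_code`, `zorder_histogram`, `pool`), the unit-square domain, and the grid resolution are all hypothetical, and the summary does not say how the paper builds its tree or derives its bounds. The sketch only shows how 2D points can be bucketed into cells numbered along a Z-order curve, and how four sibling cells can be merged into a coarser parent cell.

```python
from collections import Counter


def morton_code(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x and y (x occupies the even bit positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def zorder_histogram(points, bits):
    """Normalized histogram over Z-order cell ids.

    Assumption: points lie in the unit square [0, 1) x [0, 1),
    and the grid has 2**bits cells per side.
    """
    side = 1 << bits
    counts = Counter()
    for px, py in points:
        cx = min(int(px * side), side - 1)  # clamp so 1.0 falls in the last cell
        cy = min(int(py * side), side - 1)
        counts[morton_code(cx, cy, bits)] += 1
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def pool(hist):
    """Merge each group of four sibling cells into its parent (one level coarser).

    Dropping the two lowest bits of a Z-order id yields the parent cell id.
    """
    coarse = Counter()
    for cell, mass in hist.items():
        coarse[cell >> 2] += mass
    return dict(coarse)


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.1, 0.1), (0.9, 0.9), (0.6, 0.1)]
    fine = zorder_histogram(pts, bits=2)
    print(fine)
    print(pool(fine))
```

A convenient property of this numbering is that cells sharing a parent in the quadtree share an id prefix, so coarsening a histogram is a bit shift. Whether the paper's pooling-based bounds use exactly this kind of coarsening is not stated in the summary.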
- Paper ID: 12742
- Venue: VLDB
- Year: 2022
- Pagerank: 4.1945683e-05
- Overall Rank: 11,379 | 20.84%
- DOI: 10.14778/3551793.3551811
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 2 | R-Trees: A Dynamic Index Structure For Spatial Searching | 1984 | SIGMOD | 0.0032169493 |
| 610 | Goods: Organizing Google's Datasets | 2016 | SIGMOD | 0.00019232674 |
| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 1,644 | Finding Related Tables in Data Lakes for Interactive Data Science | 2020 | SIGMOD | 0.00011041787 |
| 1,751 | Auctus: A Dataset Search Engine for Data Discovery and Augmentation | 2021 | VLDB | 0.00010683295 |
| 1,958 | Exemplar Queries: Give me an Example of What You Need | 2014 | VLDB | 9.9572632e-05 |
| 3,358 | Organizing Data Lakes for Navigation | 2020 | SIGMOD | 7.1784949e-05 |
| 3,425 | Efficient EMD-based Similarity Search in Multimedia Databases via Flexible Dimensionality Reduction | 2008 | SIGMOD | 7.1077107e-05 |
| 4,373 | Efficient and Effective Similarity Search over Probabilistic Data based on Earth Mover's Distance | 2010 | VLDB | 6.2443809e-05 |
| 5,326 | Earth Mover's Distance based Similarity Search at Scale | 2014 | VLDB | 5.5680074e-05 |
| 6,320 | Indexing the Earth Mover's Distance Using Normal Distributions | 2012 | VLDB | 5.1129965e-05 |
| 6,438 | RONIN: Data Lake Exploration | 2021 | VLDB | 5.0620163e-05 |
| 6,770 | An Incremental Hausdorff Distance Calculation Algorithm | 2011 | VLDB | 4.9317829e-05 |
| 7,034 | A Neural Database for Differentially Private Spatial Range Queries | 2022 | VLDB | 4.8550912e-05 |
| 7,210 | Set-based Similarity Search for Time Series | 2016 | SIGMOD | 4.799457e-05 |
| 7,303 | DICE: Data Discovery by Example | 2021 | VLDB | 4.7684686e-05 |
Semantically Similar Papers