Sampling Methods for Inner Product Sketching
Summary: Introduces two linear-time, sampling-based inner-product sketches (threshold sampling and priority sampling) that are far cheaper to compute than Bessa et al.'s coordinated weighted sampling and empirically outperform JL and CountSketch. Provides new approximation guarantees and achieves state-of-the-art accuracy on tasks such as estimating column correlations in unjoined tables.
(summarized by gpt-5-mini on Feb 09 2026)
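To make the summary concrete, here is a minimal sketch of how a threshold-sampling inner-product estimator can work. It is an illustration under stated assumptions, not the paper's reference implementation: the inclusion rule (keep index i when a shared hash value is at most m * a_i^2 / ||a||^2) and the inverse-probability estimator are assumptions based on how threshold sampling is commonly described, and the names `unit_hash`, `threshold_sketch`, and `estimate_inner_product` are hypothetical.

```python
import hashlib


def unit_hash(index, seed=0):
    """Map an index to a pseudo-random value in [0, 1]; shared by all sketches with the same seed."""
    digest = hashlib.blake2b(f"{seed}:{index}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") / 2**64


def threshold_sketch(vec, m, seed=0):
    """Sketch a sparse vector given as {index: value}; expected size is at most m entries.

    Assumed rule: keep entry i when unit_hash(i) <= m * v_i^2 / ||v||^2.
    """
    norm_sq = sum(v * v for v in vec.values())
    kept = {
        i: v
        for i, v in vec.items()
        if v != 0 and unit_hash(i, seed) <= m * v * v / norm_sq
    }
    return kept, norm_sq, m


def estimate_inner_product(sketch_a, sketch_b):
    """Estimate <a, b> from two sketches built with the same seed."""
    kept_a, norm_a, m_a = sketch_a
    kept_b, norm_b, m_b = sketch_b
    total = 0.0
    for i in kept_a.keys() & kept_b.keys():
        a, b = kept_a[i], kept_b[i]
        # Because the hash is shared, index i is in both sketches
        # with probability min(1, p_a, p_b).
        p = min(1.0, m_a * a * a / norm_a, m_b * b * b / norm_b)
        total += a * b / p
    return total


a = {0: 1.0, 1: 2.0}
b = {1: 3.0, 2: 1.0}
estimate = estimate_inner_product(threshold_sketch(a, 10), threshold_sketch(b, 10))
```

With `m = 10` every entry of these tiny vectors has inclusion probability 1, so the estimate coincides with the exact inner product; for smaller `m` the estimate is random and only correct in expectation.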
- Paper ID: 13450
- Venue: VLDB
- Year: 2024
- Pagerank: 4.1945683e-05
- Overall Rank: 11,025 | 23.31%
- DOI: 10.14778/3665844.3665850
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 13 of 13 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 549 | Tracking Join and Self-Join Sizes in Limited Storage | 1999 | PODS | 0.00020376603 |
| 727 | On Synopses for Distinct-Value Estimation Under Multiset Operations | 2007 | SIGMOD | 0.00017508726 |
| 1,193 | Join Size Estimation Subject to Filter Conditions | 2015 | VLDB | 0.00013414989 |
| 1,392 | Sketching Streams Through the Net: Distributed Approximate Query Tracking | 2005 | VLDB | 0.00012229045 |
| 1,463 | ARDA: Automatic Relational Data Augmentation for Machine Learning | 2020 | VLDB | 0.00011869295 |
| 2,141 | LSH Ensemble: Internet-Scale Domain Search | 2016 | VLDB | 9.4542625e-05 |
| 2,254 | Two-Level Sampling for Join Size Estimation | 2017 | SIGMOD | 9.1897043e-05 |
| 3,708 | Is Min-Wise Hashing Optimal for Summarizing Set Intersection? | 2014 | PODS | 6.8247903e-05 |
| 3,824 | Correlation Sketches for Approximate Join-Correlation Queries | 2021 | SIGMOD | 6.7260705e-05 |
| 4,955 | Estimating arbitrary subset sums with few probes | 2005 | PODS | 5.8053317e-05 |
| 5,415 | Coordinated Weighted Sampling for Estimating Aggregates Over Multiple Weight Assignments | 2009 | VLDB | 5.5196338e-05 |
| 8,470 | Sampling Big Ideas in Query Optimization | 2023 | PODS | 4.5038423e-05 |
| 11,168 | Weighted Minwise Hashing Beats Linear Sketching for Inner Product Estimation | 2023 | PODS | 4.1945683e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 7,771 | Modeling High-Dimensional Index Structures using Sampling | 2001 | SIGMOD | 4.6560482e-05 |
| 1,105 | Cardinality Estimation Done Right: Index-Based Join Sampling | 2017 | CIDR | 0.00013990395 |
| 3,928 | Tighter Estimation using Bottom-k Sketches | 2008 | VLDB | 6.6254568e-05 |
| 8,470 | Sampling Big Ideas in Query Optimization | 2023 | PODS | 4.5038423e-05 |
| 3,271 | Data Sketches for Disaggregated Subset Sum and Frequent Item Estimation | 2018 | SIGMOD | 7.2968732e-05 |
| 1,193 | Join Size Estimation Subject to Filter Conditions | 2015 | VLDB | 0.00013414989 |
| 5,415 | Coordinated Weighted Sampling for Estimating Aggregates Over Multiple Weight Assignments | 2009 | VLDB | 5.5196338e-05 |
| 3,702 | Every Row Counts: Combining Sketches and Sampling for Accurate Group-By Result Estimates | 2019 | CIDR | 6.8295759e-05 |
| 9,082 | JoinSketch: A Sketch Algorithm for Accurate and Unbiased Inner-Product Estimation | 2023 | SIGMOD | 4.3998984e-05 |
| 11,168 | Weighted Minwise Hashing Beats Linear Sketching for Inner Product Estimation | 2023 | PODS | 4.1945683e-05 |