Towards Better Bounds for Finding Quasi-Identifiers *
Summary: Improves bounds for epsilon-separation keys (quasi-identifiers): Θ(m/√ε) uniform-sampling suffices, beating Motwani–Xu’s Θ(m/ε) baseline and speeding discovery. Provides tight uniform-sampling limits and sketching costs: strong failure requires Θ(√(m/ε)) samples; constant δ needs Ω(√(log m/ε)); coordinate-sketching uses Ω(mk log(1/ε)) space, vs Θ(mk/(αε^2)) with sampling. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
- 1. Ryan Hildebrant
- 2. Quoc-Tung Le
- 3. Duy-Hoang Ta
- 4. Hoa T. Vu
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|
Previous
Page 1 / 1
Next
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896 |
| 280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044 |
| 7,496 | Subspace Exploration: Bounds on Projected Frequency Estimation | 2021 | PODS | 4.7180617e-05 |
| 7,504 | Space Lower Bounds for Itemset Frequency Sketches | 2016 | PODS | 4.7180617e-05 |
Previous
Page 1 / 1
Next
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,953 | On Join Sampling and the Hardness of Combinatorial Output-Sensitive Join Algorithms | 2023 | PODS | 5.8085795e-05 |
| 2,749 | Towards Tight Bounds for the Streaming Set Cover Problem | 2016 | PODS | 8.1773535e-05 |
| 5,682 | Optimal Indexing Using Near-Minimal Space [Extended Abstract] | 2003 | PODS | 5.372736e-05 |
| 126 | Space-Efficient Online Computation of Quantile Summaries | 2001 | SIGMOD | 0.00044744986 |
| 8,130 | Simple & Optimal Quantile Sketch: Combining Greenwald-Khanna with Khanna-Greenwald | 2024 | PODS | 4.5784634e-05 |
| 5,956 | A Tight Lower Bound for Comparison-Based Quantile Summaries | 2020 | PODS | 5.2566971e-05 |
| 4,966 | Relative Error Streaming Quantiles | 2021 | PODS | 5.7959749e-05 |
| 7,504 | Space Lower Bounds for Itemset Frequency Sketches | 2016 | PODS | 4.7180617e-05 |
| 2,203 | Independent Range Sampling | 2014 | PODS | 9.2981095e-05 |
| 1,094 | Tight Bounds for Lp Samplers, Finding Duplicates in Streams, and Related Problems | 2011 | PODS | 0.00014129658 |