Cost-efficient Data Acquisition on Online Data Marketplaces for Correlation Analysis
Summary: DANCE is a middleware for cost-efficient data acquisition from online data marketplaces; it aims to maximize the correlation between requested attributes. An offline phase builds a two-layer join graph from data samples. An online phase then searches under quality, budget, and join informativeness constraints, using an MCMC-based heuristic because the underlying problem is NP-hard. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Yanying Li
2. Haipei Sun
3. Boxiang Dong
4. Hui (Wendy) Wang
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,976 | Responsible Data Integration: Next-generation Challenges | 2022 | SIGMOD | 5.245976e-05 |
| 10,981 | Enabling Adaptive Sampling for Intra-Window Join: Simultaneously Optimizing Quantity and Quality | 2024 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 49 | Consistent Query Answers in Inconsistent Databases | 1999 | PODS | 0.00067660624 |
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 1,193 | Join Size Estimation Subject to Filter Conditions | 2015 | VLDB | 0.00013414989 |
| 1,401 | Extending Dependencies with Conditions | 2007 | VLDB | 0.00012187775 |
| 1,572 | Reverse Engineering Complex Join Queries | 2013 | SIGMOD | 0.00011298251 |
| 1,660 | Data Markets in the Cloud: An Opportunity for the Database Community | 2011 | VLDB | 0.00010979534 |
| 1,796 | Summary Graphs for Relational Database Schemas | 2011 | VLDB | 0.00010524897 |
| 2,078 | Sample-Driven Schema Mapping | 2012 | SIGMOD | 9.599707e-05 |
| 5,800 | QueryMarket Demonstration: Pricing for Online Data Markets | 2012 | VLDB | 5.3211601e-05 |
| 8,989 | Stochastic Data Acquisition for Answering Queries as Time Goes by | 2017 | VLDB | 4.413361e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,755 | Multivariate Correlations Discovery in Static and Streaming Data | 2022 | VLDB | 4.456315e-05 |
| 5,800 | QueryMarket Demonstration: Pricing for Online Data Markets | 2012 | VLDB | 5.3211601e-05 |
| 4,619 | Crowd-Based Deduplication: An Adaptive Approach | 2015 | SIGMOD | 6.0444854e-05 |
| 3,750 | Data Acquisition for Improving Machine Learning Models | 2021 | VLDB | 6.7895763e-05 |
| 8,869 | Sharing-Aware Horizontal Partitioning for Exploiting Correlations During Query Processing | 2010 | VLDB | 4.4320338e-05 |
| 2,837 | Correlation Maps: A Compressed Access Method for Exploiting Soft Functional Dependencies | 2009 | VLDB | 8.0414149e-05 |
| 9,444 | Online Optimization and Fair Costing for Dynamic Data Sharing in a Cloud Data Market | 2014 | SIGMOD | 4.3408772e-05 |
| 5,024 | Towards Distribution-aware Query Answering in Data Markets | 2022 | VLDB | 5.7535043e-05 |
| 2,370 | Query-Based Data Pricing | 2012 | PODS | 8.9488834e-05 |
| 3,824 | Correlation Sketches for Approximate Join-Correlation Queries | 2021 | SIGMOD | 6.7260705e-05 |