Arnold: Declarative Crowd-Machine Data Integration
Summary: Proposes Labor Independence, a declarative data‑independence layer that separates logical cleaning operators from their physical implementations, so the system can choose between crowd and machine implementations for each operator and each record. Implements Arnold, an architecture that uses this model to optimize the quality–cost tradeoff in large-scale data cleaning and integration. (summarized by gpt-5-mini on Feb 09 2026)
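To make the idea in the summary concrete, here is a minimal sketch of one logical operator backed by two interchangeable physical implementations, with the choice made per record. It is an illustration only and is not taken from the paper: the names `PhysicalImpl`, `machine_normalize`, `crowd_normalize`, and `run_logical_operator`, the confidence heuristic, the costs, and the escalate-on-low-confidence policy are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PhysicalImpl:
    """One physical implementation of a logical operator (hypothetical type)."""
    name: str
    cost: int  # assumed cost per record, in arbitrary integer units
    run: Callable[[str], Tuple[str, float]]  # returns (output, confidence)


def machine_normalize(value: str) -> Tuple[str, float]:
    # Automated implementation: collapse whitespace and lowercase.
    cleaned = " ".join(value.split()).lower()
    # Assumed heuristic: purely alphabetic inputs are "easy" for the machine.
    confidence = 0.95 if value.replace(" ", "").isalpha() else 0.40
    return cleaned, confidence


def crowd_normalize(value: str) -> Tuple[str, float]:
    # Stand-in for a human task; a real system would post work to a crowd.
    cleaned = " ".join(value.split()).lower()
    return cleaned, 0.99


def run_logical_operator(
    records: List[str],
    machine: PhysicalImpl,
    crowd: PhysicalImpl,
    min_confidence: float,
) -> Tuple[List[Tuple[str, str]], int]:
    """Run the machine on every record; escalate to the crowd when unsure.

    The caller states only the logical operation and a quality threshold;
    which implementation handles each record is decided here.
    """
    results: List[Tuple[str, str]] = []
    total_cost = 0
    for record in records:
        output, confidence = machine.run(record)
        chosen = machine.name
        total_cost += machine.cost
        if confidence < min_confidence:
            # Low machine confidence: pay the higher crowd cost for this record.
            output, confidence = crowd.run(record)
            chosen = crowd.name
            total_cost += crowd.cost
        results.append((output, chosen))
    return results, total_cost
```

Raising `min_confidence` sends more records to the crowd, trading cost for quality; lowering it does the reverse. An actual optimizer would presumably weigh this tradeoff more carefully than a fixed threshold does.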
Authors
1. Shawn R. Jeffery
2. Liwen Sun
3. Matt DeLand
4. Nick Pendar
5. Rick Barber
6. Andrew Galdi
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 866 | Leveraging Transitive Relations for Crowdsourced Joins | 2013 | SIGMOD | 0.00015801196 |
| 1,841 | Crowdsourcing Algorithms for Entity Resolution | 2014 | VLDB | 0.00010348858 |
| 4,619 | Crowd-Based Deduplication: An Adaptive Approach | 2015 | SIGMOD | 0.000060444854 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 267 | Human-powered Sorts and Joins | 2012 | VLDB | 0.00029690405 |
| 3,840 | Revisiting Prompt Engineering via Declarative Crowdsourcing | 2024 | CIDR | 0.000067106924 |
| 7,867 | Learning Over Dirty Data Without Cleaning | 2020 | SIGMOD | 0.000046320452 |
| 263 | CrowdER: Crowdsourcing Entity Resolution | 2012 | VLDB | 0.00029862413 |
| 7,237 | CleanM: An Optimizable Query Language for Unified Scale-Out Data Cleaning | 2017 | VLDB | 0.000047928651 |
| 8,007 | A Grammar-based Entity Representation Framework for Data Cleaning | 2009 | SIGMOD | 0.000046068018 |
| 4,619 | Crowd-Based Deduplication: An Adaptive Approach | 2015 | SIGMOD | 0.000060444854 |
| 2,797 | Query-Oriented Data Cleaning with Oracles | 2015 | SIGMOD | 0.000081108589 |
| 4,665 | Argonaut: Macrotask Crowdsourcing for Complex Data Processing | 2015 | VLDB | 0.000060125329 |
| 199 | Declarative Data Cleaning: Language, Model, and Algorithms | 2001 | VLDB | 0.00035041015 |