Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services
Summary: Falcon scales hands-off crowdsourced entity matching (EM) beyond Corleone by applying RDBMS-style planning on Hadoop. It defines a set of EM operators, translates EM workflows into executable plans that mix machine and crowd tasks, and uses crowd time to mask machine time, enabling cloud-scale EM over tables with millions of tuples. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Sanjib Das
2. Paul Suganthan G. C.
3. AnHai Doan
4. Jeffrey F. Naughton
5. Ganesh Krishnan
6. Rohit Deep
7. Esteban Arcaute
8. Vijay Raghavendra
9. Youngchoon Park
Incoming Citations (Sorted by Pagerank)
Showing 30 of 30 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 30 of 30 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,402 | Smurf: Self-Service String Matching Using Random Forests | 2019 | VLDB | 6.2195162e-05 |
| 263 | CrowdER: Crowdsourcing Entity Resolution | 2012 | VLDB | 0.00029862413 |
| 11,788 | CDB: Optimizing Queries with Crowd-Based Selections and Joins | 2017 | SIGMOD | 4.1945683e-05 |
| 1,326 | Starling: A Scalable Query Engine on Cloud Functions | 2020 | SIGMOD | 0.00012576952 |
| 5,279 | CDB: A Crowd-Powered Database System | 2018 | VLDB | 5.5902418e-05 |
| 3,645 | Large-Scale Collective Entity Matching | 2011 | VLDB | 6.8853274e-05 |
| 5,362 | Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach | 2016 | SIGMOD | 5.5473503e-05 |
| 3,500 | FalconDB: Blockchain-based Collaborative Database | 2020 | SIGMOD | 7.0373486e-05 |
| 11,739 | CloudMatcher: A Hands-Off Cloud/Crowd Service for Entity Matching | 2018 | VLDB | 4.1945683e-05 |
| 643 | Corleone: Hands-Off Crowdsourcing for Entity Matching | 2014 | SIGMOD | 0.00018754451 |