WOO: A Scalable and Multi-tenant Platform for Continuous Knowledge Base Synthesis
Summary: WOO is a Hadoop-based, scalable, multi-tenant platform for continuous knowledge-base synthesis: it ingests, disambiguates, and enriches entities from structured and unstructured data. Deployed at Yahoo! scale, it supports multi-domain, multi-version KBs with hundreds of millions of entities. The paper describes the architecture and a real-world evaluation. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Kedar Bellare
2. Carlo Curino
3. Ashwin Machanavajjhala
4. Peter Mika
5. Mandar Rahurkar
6. Aamod Sane
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,391 | Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads | 2018 | VLDB | 0.0001223506 |
| 7,833 | Dependency-Driven Analytics: a Compass for Uncharted Data Oceans | 2017 | CIDR | 4.6382648e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 319 | Evaluation of entity resolution approaches on real-world match problems | 2010 | VLDB | 0.00027781866 |
| 447 | Efficient Parallel Set-Similarity Joins Using MapReduce | 2010 | SIGMOD | 0.00022900171 |
| 494 | Data Exchange: Getting to the Core | 2003 | PODS | 0.00021805832 |
| 1,221 | A Web of Concepts | 2009 | PODS | 0.00013219242 |
| 1,851 | An Analysis of Structured Data on the Web | 2012 | VLDB | 0.00010327871 |
| 2,231 | Dedoop: Efficient Deduplication with Hadoop | 2012 | VLDB | 9.2304499e-05 |
| 4,443 | PRIMA: Archiving and Querying Historical Data with Evolving Schemas | 2009 | SIGMOD | 6.1853525e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,326 | Peta-Scale Data Warehousing at Yahoo! | 2009 | SIGMOD | 5.1091146e-05 |
| 4,676 | Extracting large-scale knowledge bases from the web | 1999 | VLDB | 6.0052781e-05 |
| 12,223 | Schema Clustering and Retrieval for Multi-domain Pay-As-You-Go Data Integration Systems | 2010 | SIGMOD | 4.1945683e-05 |
| 12,044 | Knowledge Harvesting in the Big-Data Era | 2013 | SIGMOD | 4.1945683e-05 |
| 127 | Querying Heterogeneous Information Sources Using Source Descriptions | 1996 | VLDB | 0.00044642203 |
| 8,307 | Automatic Web-Scale Information Extraction | 2012 | SIGMOD | 4.5435639e-05 |
| 7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05 |
| 2,420 | From Data Fusion to Knowledge Fusion | 2014 | VLDB | 8.8530994e-05 |
| 13,399 | ONTOCUBO: Cube-based Ontology Construction and Exploration | 2014 | SIGMOD | - |
| 1,221 | A Web of Concepts | 2009 | PODS | 0.00013219242 |