Entity Extraction, Linking, Classification, and Tagging for Social Media: A Wikipedia-Based Approach
Summary: An end-to-end industrial system for social data that performs entity extraction, linking to Wikipedia, classification, and tagging on Twitter-scale streams. A real-time Wikipedia-based knowledge base, interleaving of the tasks, and social signals improve accuracy. The system scales to the Twitter firehose and was deployed at Kosmix/WalmartLabs, with empirical gains reported. (summarized by gpt-5-nano on Feb 09 2026)
Authors
- 1. Abhishek Gattani
- 2. Digvijay S. Lamba
- 3. Nikesh Garera
- 4. Mitul Tiwari
- 5. Xiaoyong Chai
- 6. Sanjib Das
- 7. Sri Subramaniam
- 8. Anand Rajaraman
- 9. Venky Harinarayan
- 10. AnHai Doan
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 712 | Magellan: Toward Building Entity Matching Management Systems | 2016 | VLDB | 0.00017732426 |
| 6,111 | Why Big Data Industrial Systems Need Rules and What We Can Do About It | 2015 | SIGMOD | 5.2049579e-05 |
| 9,171 | InsightNotes: Summary-Based Annotation Management in Relational Databases | 2014 | SIGMOD | 4.3848773e-05 |
| 11,507 | TQEL: Framework for Query-Driven Linking of Top-K Entities in Social Media Blogs | 2021 | VLDB | 4.1945683e-05 |
| 11,847 | Automatic Entity Recognition and Typing in Massive Text Data | 2016 | SIGMOD | 4.1945683e-05 |
| 11,908 | Even Metadata is Getting Big: Annotation Summarization using InsightNotes | 2015 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 62 | Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge | 2008 | SIGMOD | 0.0006429466 |
| 637 | Automatic segmentation of text into structured records | 2001 | SIGMOD | 0.00018824614 |
| 1,722 | Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach | 2007 | VLDB | 0.00010757784 |
| 2,847 | Building, Maintaining, and Using Knowledge Bases: A Report from the Trenches | 2013 | SIGMOD | 8.0224023e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 12,353 | Improved Search for Socially Annotated Data | 2009 | VLDB | 4.1945683e-05 |
| 12,121 | Surfacing Time-critical Insights from Social Media | 2012 | SIGMOD | 4.1945683e-05 |
| 12,025 | A Social Network Database that Learns How to Answer Queries | 2013 | CIDR | 4.1945683e-05 |
| 8,929 | Towards Social Data Platform: Automatic Topic-focused Monitor for Twitter Stream | 2013 | VLDB | 4.427232e-05 |
| 11,947 | Structured Analytics in Social Media | 2015 | VLDB | 4.1945683e-05 |
| 11,775 | Building Structured Databases of Factual Knowledge from Massive Text Corpora | 2017 | SIGMOD | 4.1945683e-05 |
| 5,900 | Partitioning and Ranking Tagged Data Sources | 2013 | VLDB | 5.281001e-05 |
| 11,847 | Automatic Entity Recognition and Typing in Massive Text Data | 2016 | SIGMOD | 4.1945683e-05 |
| 11,507 | TQEL: Framework for Query-Driven Linking of Top-K Entities in Social Media Blogs | 2021 | VLDB | 4.1945683e-05 |
| 5,479 | Microblog Entity Linking with Social Temporal Context | 2015 | SIGMOD | 5.4850984e-05 |