Towards the Web of Concepts: Extracting Concepts from Large Datasets
Summary: The paper reframes concept extraction as a market-basket mining problem over large corpora, with the goal of building a Web of Concepts for search. It applies market-basket-style measures of support and confidence to extract high-precision concept sequences, and evaluates the approach on AOL-scale query logs. (summarized by gpt-5-nano on Feb 09 2026)
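To make the support and confidence measures mentioned in the summary concrete, here is a minimal Python sketch. It assumes the textbook association-rule definitions applied to contiguous word sequences in queries; the paper's exact definitions are not given on this page and may differ. The function names (`sequence_counts`, `support`, `confidence`) and the toy query list are hypothetical, chosen only for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous word sequences of length n as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sequence_counts(queries, max_len=3):
    """Count, for each word sequence, the number of queries that contain it.

    Each query is treated as one "basket": a sequence is counted at most
    once per query, however often it repeats inside that query.
    """
    counts = Counter()
    for query in queries:
        tokens = query.lower().split()
        for n in range(1, max_len + 1):
            counts.update(set(ngrams(tokens, n)))
    return counts


def support(counts, seq):
    """Support (assumed definition): number of queries containing seq."""
    return counts[tuple(seq)]


def confidence(counts, seq):
    """Confidence (assumed definition): among queries containing the prefix
    seq[:-1], the fraction that also contain the full sequence seq."""
    seq = tuple(seq)
    prefix = seq[:-1]
    if not prefix or counts[prefix] == 0:
        return 0.0
    return counts[seq] / counts[prefix]


# Toy query log, invented for illustration only.
queries = [
    "new york hotels",
    "new york pizza",
    "new york",
    "new car",
    "york university",
]
counts = sequence_counts(queries)

print(support(counts, ("new", "york")))     # 3
print(confidence(counts, ("new", "york")))  # 0.75
```

In this toy log, "new york" appears in three queries and follows "new" in three of the four queries that contain "new", so a candidate filter that thresholds on both measures would keep it while dropping rarer sequences such as "new car". Counting each sequence once per query follows the market-basket convention of treating a query as a set of items; it is a design choice of this sketch, not a claim about the paper.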
Incoming Non-self Citations Over Time
Authors
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,912 | Mining Quality Phrases from Massive Text Corpora | 2015 | SIGMOD | 4.6183486e-05 |
| 11,591 | GIANT: Scalable Creation of a Web-scale Ontology | 2020 | SIGMOD | 4.1945683e-05 |
| 11,775 | Building Structured Databases of Factual Knowledge from Massive Text Corpora | 2017 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,221 | A Web of Concepts | 2009 | PODS | 0.00013219242 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,379 | Scalable Ad-hoc Entity Extraction from Text Collections | 2008 | VLDB | 5.5405989e-05 |
| 7,890 | Mining a Search Engine’s Corpus: Efficient Yet Unbiased Sampling and Aggregate Estimation | 2011 | SIGMOD | 4.6249533e-05 |
| 1,140 | EntityRank: Searching Entities Directly and Holistically | 2007 | VLDB | 0.00013720706 |
| 4,676 | Extracting large-scale knowledge bases from the web | 1999 | VLDB | 6.0052781e-05 |
| 4,474 | Measure-driven Keyword-Query Expansion | 2009 | VLDB | 6.1528736e-05 |
| 4,092 | Structured Annotations of Web Queries | 2010 | SIGMOD | 6.4561959e-05 |
| 2,319 | Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language | 2010 | SIGMOD | 9.0387108e-05 |
| 11,975 | Which Concepts Are Worth Extracting? | 2014 | SIGMOD | 4.1945683e-05 |
| 7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05 |
| 1,221 | A Web of Concepts | 2009 | PODS | 0.00013219242 |