Which Concepts Are Worth Extracting?
Summary: Introduces cost-effective conceptual design: selecting a budget-limited subset of concepts to annotate in order to boost query effectiveness. Proposes the APM and AAM algorithms with provable guarantees (APM: a PTAS without overlap and a constant-factor approximation when concepts are exclusive; AAM: a PTAS for exclusive concepts). Experiments on Wikipedia and query logs validate their performance and guide the choice between them. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,224 | The SphereSearch Engine for Unified Ranked Retrieval of Heterogeneous XML and Web Documents | 2005 | VLDB | 9.251962e-05 |
| 2,915 | Brainwash: A Data System for Feature Engineering | 2013 | CIDR | 7.9078385e-05 |
| 3,477 | Toward Best-Effort Information Extraction | 2008 | SIGMOD | 7.0583481e-05 |
| 3,820 | Enterprise Information Extraction: Recent Developments and Open Challenges | 2010 | SIGMOD | 6.7299199e-05 |
| 6,586 | Web Data Management | 2011 | SIGMOD | 5.0023398e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,964 | Classifier Construction Under Budget Constraints | 2022 | SIGMOD | 4.2269436e-05 |
| 5,734 | Efficient Algorithms for Crowd-Aided Categorization | 2020 | VLDB | 5.3482904e-05 |
| 8,206 | Query Expansion Based on Clustered Results | 2011 | VLDB | 4.5586037e-05 |
| 6,080 | Answering Top-k Representative Queries on Graph Databases | 2014 | SIGMOD | 5.2214553e-05 |
| 759 | To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks | 2006 | SIGMOD | 0.00017064615 |
| 7,475 | Optimizing Index for Taxonomy Keyword Search | 2012 | SIGMOD | 4.7191809e-05 |
| 7,588 | Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases | 2013 | VLDB | 4.7030914e-05 |
| 5,520 | Towards the Web of Concepts: Extracting Concepts from Large Datasets | 2010 | VLDB | 5.4614656e-05 |
| 11,595 | Minimization of Classifier Construction Cost for Search Queries | 2020 | SIGMOD | 4.1945683e-05 |
| 8,148 | When Speed Has a Price: Fast Information Extraction Using Approximate Algorithms | 2013 | VLDB | 4.5754467e-05 |