Database Paper Browser

Mining Quality Phrases from Massive Text Corpora

Summary: Proposes a scalable framework for mining quality phrases from massive text corpora by integrating phrasal segmentation with limited supervision. Demonstrates near-human phrase quality and linear time/space scalability, validated on large corpora. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5087
Venue: SIGMOD
Year: 2015
Pagerank: 4.6183486e-05
Overall Rank: 7,912 | 44.96%
DOI: 10.1145/2723372.2751523

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 5 of 5 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 4 of 4 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
3,256 | Multidimensional Content eXploration | 2008 | VLDB | 7.3158557e-05
5,520 | Towards the Web of Concepts: Extracting Concepts from Large Datasets | 2010 | VLDB | 5.4614656e-05
6,684 | Interesting-Phrase Mining for Ad-Hoc Text Analytics | 2010 | VLDB | 4.9629004e-05
11,954 | Scalable Topical Phrase Mining from Text Corpora | 2015 | VLDB | 4.1945683e-05

Semantically Similar Papers