Database Paper Browser

Finding Frequent Items in Probabilistic Data

Summary: The paper uses possible-world semantics to define likely frequent items in probabilistic data, capturing structure that simple expected frequency misses. It presents exact offline algorithms (quadratic and cubic time) and a sampling-based streaming approach that uses sublinear memory, with provable accuracy and confidence-based ranking. The methods are validated on real and synthetic data. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 4035
Venue: SIGMOD
Year: 2008
Pagerank: 5.3240234e-05
Overall Rank: 5,796 | 59.68%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 3 of 3 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
4,080 | Sliding-Window Top-k Queries on Uncertain Streams | 2008 | VLDB | 6.4652983e-05
7,633 | Mining Frequent Itemsets over Uncertain Databases | 2012 | VLDB | 4.6914549e-05
11,952 | Beyond Itemsets: Mining Frequent Featuresets over Structured Items | 2015 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 20 of 20 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
74 | Efficient Query Evaluation on Probabilistic Databases | 2004 | VLDB | 0.00057857292
101 | ULDBs: Databases with Uncertainty and Lineage | 2006 | VLDB | 0.0004955674
155 | Robust and Efficient Fuzzy Match for Online Data Cleaning | 2003 | SIGMOD | 0.00040637896
166 | Approximate Frequency Counts over Data Streams | 2002 | VLDB | 0.00039361552
184 | New Sampling-Based Summary Statistics for Improving Approximate Query Answers | 1998 | SIGMOD | 0.00036625711
299 | Trio: A System for Data, Uncertainty, and Lineage | 2006 | VLDB | 0.00028525071
467 | Evaluating Probabilistic Queries over Imprecise Data | 2003 | SIGMOD | 0.00022443768
477 | Model-Driven Data Acquisition in Sensor Networks | 2004 | VLDB | 0.00022221803
597 | Computing Iceberg Queries Efficiently | 1998 | VLDB | 0.00019475592
678 | ConQuer: Efficient Management of Inconsistent Databases | 2005 | SIGMOD | 0.00018253213
865 | What’s Hot and What’s Not: Tracking Most Frequent Items Dynamically | 2003 | PODS | 0.00015808172
893 | Data Integration: The Teenage Years | 2006 | VLDB | 0.00015558352
1,179 | Probabilistic Skylines on Uncertain Data | 2007 | VLDB | 0.00013457451
1,586 | Indexing Multi-Dimensional Uncertain Data with Arbitrary Probability Density Functions | 2005 | VLDB | 0.00011250856
1,955 | Efficient Computation of Iceberg Cubes with Complex Measures | 2001 | SIGMOD | 9.9629452e-05
2,491 | From Complete to Incomplete Information and Back | 2007 | SIGMOD | 8.655056e-05
2,759 | A Simpler and More Efficient Deterministic Scheme for Finding Frequent Items over Sliding Windows | 2006 | PODS | 8.1636123e-05
3,041 | Sketching Probabilistic Data Streams | 2007 | SIGMOD | 7.6697078e-05
3,385 | Estimating Statistical Aggregates on Probabilistic Data Streams | 2007 | PODS | 7.1580391e-05
4,334 | Diamond in the Rough: Finding Hierarchical Heavy Hitters in Multi-Dimensional Data | 2004 | SIGMOD | 6.2798179e-05

Semantically Similar Papers