Language-Model Based Informed Partition of Databases to Speed Up Pattern Mining
Summary: Proposes horizontal partitioning for frequent itemset mining driven by language models and word embeddings: transactions are treated as sentences and items as words, and the transactions are then clustered to form informed partitions. The goal is not only parallelism but also a smaller per-partition vocabulary and lower entropy, which makes mining scalable on large, sparse databases (e.g., graph propositionalizations). (summarized by gpt-5.4-mini on May 24 2026)
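The partitioning idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: as a stand-in for a learned language model or word embedding, it assumes plain item co-occurrence counts as item vectors, averages them into one vector per transaction, and clusters with a small deterministic k-means. The names `item_vectors`, `kmeans`, `informed_partition`, and `vocabulary`, and the toy transactions, are hypothetical and chosen for illustration only. Transactions are assumed to be non-empty.

```python
from itertools import combinations


def item_vectors(transactions):
    """Co-occurrence count vector per item (assumed stand-in for a learned embedding)."""
    vocab = sorted({item for t in transactions for item in t})
    index = {item: i for i, item in enumerate(vocab)}
    vectors = {item: [0.0] * len(vocab) for item in vocab}
    for t in transactions:
        for a, b in combinations(sorted(set(t)), 2):
            vectors[a][index[b]] += 1.0
            vectors[b][index[a]] += 1.0
    return vectors


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iterations=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean(members)
    return labels


def informed_partition(transactions, k):
    """Split transactions into k partitions by clustering their mean item vectors."""
    vectors = item_vectors(transactions)
    points = [mean([vectors[item] for item in set(t)]) for t in transactions]
    labels = kmeans(points, k)
    partitions = [[] for _ in range(k)]
    for t, label in zip(transactions, labels):
        partitions[label].append(t)
    return partitions


def vocabulary(transactions):
    """Set of distinct items appearing in a collection of transactions."""
    return {item for t in transactions for item in t}


# Toy database with two unrelated item groups (hypothetical data).
transactions = [
    ["bread", "milk"], ["bread", "butter"], ["milk", "butter"],
    ["gpu", "ram"], ["gpu", "ssd"], ["ram", "ssd"],
]
parts = informed_partition(transactions, k=2)
for part in parts:
    print(len(part), sorted(vocabulary(part)))
```

On this toy input the full database has six distinct items, while each of the two partitions contains only three, which is the per-partition vocabulary reduction the summary refers to. A real pipeline would presumably replace the co-occurrence vectors with trained embeddings and use a library clustering routine.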
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Carlos Bobed
2. Jorge Bernad
3. Pierre Maillot
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 36 | Fast Algorithms for Mining Association Rules | 1994 | VLDB | 0.00076161096 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 547 | An Efficient Algorithm for Mining Association Rules in Large Databases | 1995 | VLDB | 0.00020420717 |
| 4,175 | Scalable Parallel Data Mining for Association Rules | 1997 | SIGMOD | 6.3851209e-05 |
| 3,989 | Mind the Gap: Large-Scale Frequent Sequence Mining | 2013 | SIGMOD | 6.5583327e-05 |
| 5,772 | Mining Frequent Patterns with Differential Privacy | 2013 | VLDB | 5.3322378e-05 |
| 13,817 | Communication-Efficient Distributed Mining of Association Rules | 2001 | SIGMOD | - |
| 13,889 | Towards Data Mining Benchmarking: A Test Bed for Performance Study of Frequent Pattern Mining | 2000 | SIGMOD | - |
| 181 | Mining Frequent Patterns without Candidate Generation | 2000 | SIGMOD | 0.00036992674 |
| 12,729 | Parallel Mining Algorithms for Generalized Association Rules with Classification Hierarchy | 1998 | SIGMOD | 4.1945683e-05 |
| 840 | Efficiently Mining Long Patterns from Databases | 1998 | SIGMOD | 0.00016058396 |
| 9,064 | Feasible Itemset Distributions in Data Mining: Theory and Application | 2003 | PODS | 4.4039656e-05 |