SQLEM: Fast Clustering in SQL using the EM Algorithm
Summary: Describes SQLEM, an SQL implementation of the EM clustering algorithm for very large databases, aimed at high-dimensional data, large numbers of clusters, and very large record counts. Proposes horizontal, vertical, and hybrid SQL strategies for running EM inside a relational DBMS for scalable data mining. (summarized by gpt-5-nano on Feb 09 2026)
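The summary names the strategies but does not describe them. As a rough illustration only, the sketch below assumes that a "vertical" layout stores one row per (record, dimension) pair, and runs a single EM E-step for a Gaussian mixture with a shared diagonal covariance using plain SQL joins and `GROUP BY`. The table names (`y`, `c`, `r`, `w`, `yd`), the function `e_step_vertical`, and the use of SQLite are all assumptions made for this example; none of it is taken from the paper's actual SQL.

```python
import math
import sqlite3


def e_step_vertical(points, means, weights, variances):
    """One EM E-step in SQL over a hypothetical 'vertical' layout.

    points:    {rid: [x_1..x_d]}   data records
    means:     {j: [c_1..c_d]}     cluster means
    weights:   {j: w_j}            mixture weights
    variances: [r_1..r_d]          shared diagonal covariance (assumption)
    Returns {(rid, j): responsibility}.
    """
    conn = sqlite3.connect(":memory:")
    # Register exp() explicitly; not every SQLite build ships math functions.
    conn.create_function("exp", 1, math.exp)
    conn.executescript("""
        CREATE TABLE y (rid INTEGER, v INTEGER, val REAL);  -- one row per record and dimension
        CREATE TABLE c (j INTEGER, v INTEGER, val REAL);    -- one row per cluster and dimension
        CREATE TABLE r (v INTEGER, val REAL);               -- variance per dimension
        CREATE TABLE w (j INTEGER, val REAL);               -- weight per cluster
    """)
    conn.executemany(
        "INSERT INTO y VALUES (?, ?, ?)",
        [(rid, v, x) for rid, xs in points.items() for v, x in enumerate(xs)],
    )
    conn.executemany(
        "INSERT INTO c VALUES (?, ?, ?)",
        [(j, v, m) for j, ms in means.items() for v, m in enumerate(ms)],
    )
    conn.executemany("INSERT INTO r VALUES (?, ?)", list(enumerate(variances)))
    conn.executemany("INSERT INTO w VALUES (?, ?)", list(weights.items()))

    # Squared Mahalanobis distance of every record to every cluster:
    # join on the dimension index, then sum over dimensions.
    conn.execute("""
        CREATE TABLE yd AS
        SELECT y.rid AS rid, c.j AS j,
               SUM((y.val - c.val) * (y.val - c.val) / r.val) AS dist
        FROM y
        JOIN c ON y.v = c.v
        JOIN r ON y.v = r.v
        GROUP BY y.rid, c.j
    """)

    # Responsibilities: w_j * exp(-dist/2), normalized per record.
    # The Gaussian normalizing constant is identical for all clusters
    # under a shared covariance, so it cancels in the ratio.
    # Note: if every term underflows to 0, SQLite returns NULL here.
    rows = conn.execute("""
        SELECT u.rid, u.j, u.num / s.total
        FROM (SELECT yd.rid AS rid, yd.j AS j,
                     w.val * exp(-0.5 * yd.dist) AS num
              FROM yd JOIN w ON yd.j = w.j) AS u
        JOIN (SELECT yd.rid AS rid,
                     SUM(w.val * exp(-0.5 * yd.dist)) AS total
              FROM yd JOIN w ON yd.j = w.j
              GROUP BY yd.rid) AS s
          ON u.rid = s.rid
        ORDER BY u.rid, u.j
    """).fetchall()
    conn.close()
    return {(rid, j): resp for rid, j, resp in rows}
```

In this layout the number of dimensions never appears in the SQL text, which is one plausible motivation for a vertical design on high-dimensional data; the trade-off is a much larger row count in the join.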
Authors
1. Carlos Ordonez
2. Paul Cereghini
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 140 | The MADlib Analytics Library or MAD Skills, the SQL | 2012 | VLDB | 0.00042270404 |
| 3,455 | A Comparison of Platforms for Implementing and Running Very Large Scale Machine Learning Algorithms | 2014 | SIGMOD | 7.0771839e-05 |
| 5,079 | Combi-Operator – Database Support for Data Mining Applications | 2003 | VLDB | 5.7140516e-05 |
| 8,466 | Building Statistical Models and Scoring with UDFs | 2007 | SIGMOD | 4.5050696e-05 |
| 9,420 | Local Search Methods for k-Means with Outliers | 2017 | VLDB | 4.3441378e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 27 | Efficient and Effective Clustering Methods for Spatial Data Mining | 1994 | VLDB | 0.00080736878 |
| 33 | BIRCH: An Efficient Data Clustering Method for Very Large Databases | 1996 | SIGMOD | 0.00077324389 |
| 277 | Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications | 1998 | SIGMOD | 0.00029311426 |
| 1,595 | Fast Algorithms for Projected Clustering | 1999 | SIGMOD | 0.00011222442 |
| 3,475 | Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering | 1999 | VLDB | 7.0614822e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 547 | An Efficient Algorithm for Mining Association Rules in Large Databases | 1995 | VLDB | 0.00020420717 |
| 1,816 | Incremental Clustering for Mining in a Data Warehousing Environment | 1998 | VLDB | 0.0001045313 |
| 5,079 | Combi-Operator – Database Support for Data Mining Applications | 2003 | VLDB | 5.7140516e-05 |
| 13,926 | Clustering Methods for Large Databases: From the Past to the Future | 1999 | SIGMOD | - |
| 277 | Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications | 1998 | SIGMOD | 0.00029311426 |
| 9,585 | One-pass Data Mining Algorithms in a DBMS with UDFs | 2011 | SIGMOD | 4.3218691e-05 |
| 13,425 | Data Mining Algorithms as a Service in the Cloud: Exploiting Relational Database Systems | 2013 | SIGMOD | - |
| 10,924 | Improved Approximation Algorithms for Relational Clustering | 2024 | PODS | 4.1945683e-05 |
| 1,595 | Fast Algorithms for Projected Clustering | 1999 | SIGMOD | 0.00011222442 |
| 13,513 | Database Systems Research on Data Mining | 2010 | SIGMOD | - |