Building Statistical Models and Scoring with UDFs
Summary: Computes multidimensional statistical models (correlation, regression, PCA, clustering) inside the DBMS using SQL and UDFs, with a single table scan for both model building and scoring. The models are derived from compact summary matrices (sums and cross-products). Aggregate and scalar UDFs outperform plain SQL, and they outperform C++ because no data export is needed. (summarized by gpt-5-nano on Feb 09 2026)
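The summary-matrix idea can be illustrated with a minimal Python sketch. This is not the paper's UDF code: the names `summarize`, `correlation`, `n`, `L`, and `Q` are chosen here for illustration, and the sketch assumes a non-empty input where every dimension has nonzero variance. One pass accumulates the row count, the per-dimension sums, and the cross-products; the correlation matrix is then derived from those summaries alone, without touching the data again.

```python
import math

def summarize(rows):
    """Single pass over the rows: accumulate the count n, the sum
    vector L, and the cross-product matrix Q (illustrative names)."""
    n, L, Q = 0, None, None
    for x in rows:
        if L is None:
            # Size the accumulators from the first row.
            d = len(x)
            L = [0.0] * d
            Q = [[0.0] * d for _ in range(d)]
        n += 1
        for a in range(d):
            L[a] += x[a]
            for b in range(d):
                Q[a][b] += x[a] * x[b]
    return n, L, Q

def correlation(n, L, Q):
    """Pearson correlation matrix computed only from n, L, and Q.
    Assumes every dimension has nonzero variance."""
    d = len(L)
    rho = [[0.0] * d for _ in range(d)]
    for a in range(d):
        for b in range(d):
            num = n * Q[a][b] - L[a] * L[b]
            den = (math.sqrt(n * Q[a][a] - L[a] ** 2)
                   * math.sqrt(n * Q[b][b] - L[b] ** 2))
            rho[a][b] = num / den
    return rho

# Usage: the second column is exactly twice the first.
n, L, Q = summarize([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
rho = correlation(n, L, Q)
```

Because the summaries are plain sums, partial results from separate partitions of a table could be added together before the final step, which is what makes this shape a natural fit for an aggregate function.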
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,644 | A Relational Matrix Algebra and its Implementation in a Column Store | 2020 | SIGMOD | 4.9782839e-05 |
| 9,585 | One-pass Data Mining Algorithms in a DBMS with UDFs | 2011 | SIGMOD | 4.3218691e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13 | Mining Association Rules between Sets of Items in Large Databases | 1993 | SIGMOD | 0.0010864752 |
| 33 | BIRCH: An Efficient Data Clustering Method for Very Large Databases | 1996 | SIGMOD | 0.00077324389 |
| 277 | Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications | 1998 | SIGMOD | 0.00029311426 |
| 454 | An Overview of Query Optimization in Relational Systems | 1998 | PODS | 0.00022734812 |
| 904 | Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications | 1998 | SIGMOD | 0.00015469655 |
| 1,372 | SQLEM: Fast Clustering in SQL using the EM Algorithm | 2000 | SIGMOD | 0.00012318334 |
| 1,549 | Spreadsheets in RDBMS for OLAP | 2003 | SIGMOD | 0.00011428835 |
| 2,216 | On Parallel Processing of Aggregate and Scalar Functions in Object-Relational DBMS | 1998 | SIGMOD | 9.2699038e-05 |
| 2,687 | BOAT—Optimistic Decision Tree Construction | 1999 | SIGMOD | 8.3050259e-05 |
| 5,079 | Combi-Operator – Database Support for Data Mining Applications | 2003 | VLDB | 5.7140516e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,459 | UDFBench: A Tool for Benchmarking UDF Queries on SQL Engines | 2025 | SIGMOD | 4.1945683e-05 |
| 13,513 | Database Systems Research on Data Mining | 2010 | SIGMOD | - |
| 6,189 | Accelerating Python UDFs in Vectorized Query Execution | 2022 | CIDR | 5.1647573e-05 |
| 4,014 | Exploiting Correlations for Expensive Predicate Evaluation | 2015 | SIGMOD | 6.5273084e-05 |
| 658 | Towards a Unified Architecture for in-RDBMS Analytics | 2012 | SIGMOD | 0.00018506577 |
| 1,355 | SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions | 2009 | VLDB | 0.00012404572 |
| 9,763 | The UDFBench Benchmark for General-purpose UDF Queries | 2025 | VLDB | 4.2856106e-05 |
| 8,583 | Efficient Execution of User-Defined Functions in SQL Queries | 2023 | VLDB | 4.4919445e-05 |
| 12,316 | Fast and Dynamic OLAP Exploration Using UDFs | 2009 | SIGMOD | 4.1945683e-05 |
| 9,585 | One-pass Data Mining Algorithms in a DBMS with UDFs | 2011 | SIGMOD | 4.3218691e-05 |