To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks
Summary: The paper presents a fully decomposed, modular processor for semi-structured data whose engine internals (indexes, I/O, aggregations) can be extended in plain Python, going beyond UDFs. The demo shows rapid prototyping that embeds ML libraries and pretrained models into query pipelines (e.g., tweet sentiment analysis) for customizable data wrangling. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 140 | The MADlib Analytics Library or MAD Skills, the SQL | 2012 | VLDB | 0.00042270404 |
| 1,355 | SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions | 2009 | VLDB | 0.00012404572 |
| 1,873 | An Architecture for Compiling UDF-centric Workflows | 2015 | VLDB | 0.00010253002 |
| 1,882 | Tuplex: Data Science in Python at Native Code Speed | 2021 | SIGMOD | 0.0001021625 |
| 2,122 | SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle | 2020 | CIDR | 9.4989076e-05 |
| 2,237 | Procedural Extensions of SQL: Understanding their usage in the wild | 2021 | VLDB | 9.2212748e-05 |
| 3,080 | Compiling PL/SQL Away | 2020 | CIDR | 7.603389e-05 |
| 3,648 | One WITH RECURSIVE is Worth Many GOTOs | 2021 | SIGMOD | 6.8831123e-05 |
| 4,924 | User-Defined Operators: Efficiently Integrating Custom Algorithms into Modern Databases | 2022 | VLDB | 5.822682e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,213 | Udon: Efficient Debugging of User-Defined Functions in Big Data Systems with Line-by-Line Control | 2023 | SIGMOD | 4.1945683e-05 |
| 8,583 | Efficient Execution of User-Defined Functions in SQL Queries | 2023 | VLDB | 4.4919445e-05 |
| 9,585 | One-pass Data Mining Algorithms in a DBMS with UDFs | 2011 | SIGMOD | 4.3218691e-05 |
| 4,924 | User-Defined Operators: Efficiently Integrating Custom Algorithms into Modern Databases | 2022 | VLDB | 5.822682e-05 |
| 1,532 | Data Management in Machine Learning: Challenges, Techniques, and Systems | 2017 | SIGMOD | 0.00011472681 |
| 11,888 | Synthesizing Data Programs | 2015 | CIDR | 4.1945683e-05 |
| 7,384 | The VADA Architecture for Cost-Effective Data Wrangling | 2017 | SIGMOD | 4.7445719e-05 |
| 9,608 | Unified Data Analytics: State-of-the-art and Open Problems | 2022 | VLDB | 4.3177432e-05 |
| 1,873 | An Architecture for Compiling UDF-centric Workflows | 2015 | VLDB | 0.00010253002 |
| 4,813 | Putting Pandas in a Box | 2021 | CIDR | 5.9049746e-05 |