Database Paper Browser

Arun Kumar

Author ID: 871
ORCID: -
Links: (found by gpt-5.2 on Feb 8th, 2026)
Most Frequent Institution: University of California San Diego
Pagerank: 0.34458215
Overall Rank: 122 | 99.43%
Paper Count: 42

Incoming Non-self Citations Over Time

Total yearly non-self incoming citations across all papers by this author.

Publications by Paper Pagerank

Showing 42 of 42 publications.

Rank Title Year Venue Pagerank
140 The MADlib Analytics Library or MAD Skills, the SQL 2012 VLDB 0.00042270404
658 Towards a Unified Architecture for in-RDBMS Analytics 2012 SIGMOD 0.00018506577
683 Cerebro: A Data System for Optimized Deep Learning Model Selection 2020 VLDB 0.00018195476
761 Materialization Optimizations for Feature Selection Workloads 2014 SIGMOD 0.00017053783
903 To Join or Not to Join? Thinking Twice about Joins before Feature Selection 2016 SIGMOD 0.0001547016
1,167 Learning Generalized Linear Models Over Normalized Data 2015 SIGMOD 0.00013547713
1,279 Towards Linear Algebra over Normalized Data 2017 VLDB 0.00012868394
1,532 Data Management in Machine Learning: Challenges, Techniques, and Systems 2017 SIGMOD 0.00011472681
1,891 Towards Model-based Pricing for Machine Learning in a Data Marketplace 2019 SIGMOD 0.00010194092
2,194 Enabling and Optimizing Non-linear Feature Interactions in Factorized Linear Algebra 2019 SIGMOD 9.3138337e-05
2,863 Incremental and Approximate Inference for Faster Occlusion-based Deep CNN Explanations 2019 SIGMOD 7.9877991e-05
2,886 VISTA: Optimized System for Declarative Feature Transfer from Deep CNNs at Scale 2020 SIGMOD 7.9612767e-05
2,915 Brainwash: A Data System for Feature Engineering 2013 CIDR 7.9078385e-05
3,206 Panorama: A Data System for Unbounded Vocabulary Querying over Video 2020 VLDB 7.3826363e-05
3,638 Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics 2017 SIGMOD 6.8952488e-05
3,948 A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics 2018 VLDB 6.5959084e-05
4,033 In-RDBMS Hardware Acceleration of Advanced Analytics 2018 VLDB 6.5113267e-05
4,129 Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers? 2018 VLDB 6.428887e-05
4,291 The future of data(base) education: Is the "cow book" dead? 2021 VLDB 6.2885419e-05
4,377 Understanding and Benchmarking the Impact of GDPR on Database Systems 2020 VLDB 6.2404627e-05
4,467 Demonstration of SpeakQL: Speech-driven Multimodal Querying of Structured Data 2019 SIGMOD 6.1585143e-05
4,557 Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches 2021 VLDB 6.087611e-05
4,785 Demonstration of Santoku: Optimizing Machine Learning over Normalized Data 2015 VLDB 5.9236989e-05
5,242 Towards Benchmarking Feature Type Inference for AutoML Platforms 2021 SIGMOD 5.6074743e-05
5,437 SNAILS: Schema Naming Assessments for Improved LLM-Based SQL Inference 2025 SIGMOD 5.5033018e-05
5,523 SpeakQL: Towards Speech-driven Multimodal Querying of Structured Data 2020 SIGMOD 5.4605231e-05
6,538 Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent 2019 SIGMOD 5.023239e-05
6,549 Demonstration of Nimbus: Model-based Pricing for Machine Learning in a Data Marketplace 2019 SIGMOD 5.0175568e-05
6,553 How do Categorical Duplicates Affect ML? A New Benchmark and Empirical Analyses 2024 VLDB 5.0157344e-05
6,884 Lotan: Bridging the Gap between GNNs and Scalable Graph Analytics Engines 2023 VLDB 4.8955332e-05
7,273 Feature Selection in Enterprise Analytics: A Demonstration using an R-based Data Analytics System 2013 VLDB 4.7810804e-05
7,656 Nautilus: An Optimized System for Deep Transfer Learning over Evolving Training Datasets 2022 SIGMOD 4.6871575e-05
8,121 Automation of Data Prep, ML, and Data Science: New Cure or Snake Oil? 2021 SIGMOD 4.5809305e-05
8,378 Probabilistic Management of OCR Data using an RDBMS 2012 VLDB 4.5320288e-05
8,595 Towards A Polyglot Framework for Factorized ML 2021 VLDB 4.4889397e-05
8,864 Cerebro: A Layered Data Platform for Scalable Deep Learning 2021 CIDR 4.4326439e-05
9,222 Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning 2021 VLDB 4.3698672e-05
9,223 Intermittent Human-in-the-Loop Model Selection using Cerebro: A Demonstration 2021 VLDB 4.3698672e-05
9,603 Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads 2024 VLDB 4.3177432e-05
13,171 Reimagining Deep Learning Systems Through the Lens of Data Systems 2024 VLDB -
13,271 Errata for “Cerebro: A Data System for Optimized Deep Learning Model Selection” 2021 VLDB -
13,313 Demonstration of Krypton: Optimized CNN Inference for Occlusion-based Deep CNN Explanations 2019 VLDB -

Frequent Co-authors

Co-authored at least 5 papers.

Co-author Shared Papers Rank Pagerank
Supun Nakandala 8 887 0.074394824
Jeffrey Naughton 6 7 1.0202982
Christopher Ré 6 60 0.50138748
Yuhao Zhang 6 1,658 0.043538221
Jignesh Patel 5 28 0.68787443
Lingjiao Chen 5 1,591 0.045262248
Vraj Shah 5 1,593 0.045134595
Side Li 5 1,609 0.044632832