I'm Ryan Marcus, an assistant professor of computer science at the University of Pennsylvania. I'm using machine learning to build the next generation of data management tools that automatically adapt to new hardware and user workloads, invent novel processing strategies, and understand user intention.

I am especially interested in query optimization, index structures, intelligent clouds, programming language runtimes, program synthesis for data processing, and applications of reinforcement learning to systems problems.

Email: rcmarcus@seas.upenn.edu
Office: AGH 407

News

30 Jun 2025: Our 📄 theory of generalization in learned cardinality estimation, along with our paper on 📄 learning cardinality estimates from incomplete data, will appear at VLDB '25! Both papers are from 🎓 final-year PhD student Peizhi Wu.
05 May 2025: We'll present two papers on learned offline query optimization at SIGMOD '25: Jeff Tao's 📄 BayesQO work on using Bayesian optimization to find "super-optimized" query plans, and Zixuan Yi's 📄 LimeQO work on optimizing entire query workloads at once.
15 Apr 2025: Our demonstration of 🛠️ ScaleLLM, which combines embeddings and small models to emulate using an LLM on each row of a large database, will be presented at SIGMOD '25.
01 Mar 2025: The 📄 BFTBrain paper, which uses reinforcement learning to maximize the performance of adversary-tolerant distributed systems, will be presented at NSDI '25.


06 Dec 2024: Our work on 📄 LLMSteer, a system for steering query optimizers with large language models, will be presented during a 🔦 spotlight talk at the NeurIPS ML4Sys workshop!
20 Jul 2024: We'll be presenting our 📄 vision for full stack adaptivity via machine learning for blockchain systems at VLDB '24, along with a 🛠️ demo of BFTGym, our environment for performance testing BFT protocols under various fault conditions.
01 Jun 2024: Two fresh takes on query planning presented at SIGMOD '24: first, 📄 Stage, the cache-based multistage query latency predictor used in Redshift, and second, 📄 LimeQO (aiDM workshop), a workload-level query steering technique using linear methods.
20 May 2024: I appeared on the 🎙️ Disseminate podcast.
06 Dec 2023: I gave a 🗣️ talk at PrestoCon about learned query optimization and 📄 AutoSteer (abstract).
16 Aug 2023: Our 📄 AutoSteer paper, an extensible learned query optimizer for any SQL database, was published in VLDB '23. We're also presenting a demo of 🛠️ QO-Insight, our tool for exploring and understanding learned query optimizers.
19 Jun 2023: Our 📄 Kepler (robust learned parametric query optimization) and 📄 Auto-WLM (learning enhanced workload management) papers were published at SIGMOD '23.
07 Apr 2023: Our 📄 AdaChain paper, the first adaptive blockchain that switches architectures in order to optimize throughput for dynamic workloads, was published at VLDB '23.
20 Feb 2023: Our 📄 paper on robust cardinality estimation under dynamic workloads was published at VLDB '23.
15 Sep 2022: Our 📄 SageDB paper, the first complete data system built with instance optimization as a foundational design principle, was published at VLDB '22.
30 Apr 2022: I will be 👋 joining the CIS faculty at the University of Pennsylvania in Fall 2023!
15 Jun 2021: Our 📄 Bao paper, a practical approach to learned query optimization, 🏆 wins the Best Paper Award at SIGMOD '21.
18 Mar 2021: Our 📄 paper presenting the first 🛠️ benchmark of learned indexes has been accepted to VLDB '21.

Blog Posts

Applying Bao to distributed systems: We can apply Bao, a technique for learned query optimization, to a number of distributed cloud databases (17 Jun 2021).
Machine learning for systems: A recent groundswell of research has been pushing machine learning into computer systems (06 Jun 2019).
Good comment, bad comment: A few collected tips on writing readable code (05 Nov 2018).
Overflow in consistent hashing: Exploring the implications of fixed-capacity machines in consistent hashing schemes (14 Sep 2018).
Pretty pictures with Perlin noise fields: Procedurally generated flurry-like graphics and videos with particle tracing in a Perlin noise field (04 Mar 2018).
