Teaching, research, and advising

The three pillars of machine programming: invention, intention, and adaptation.

Our group’s research focuses on the implementation (as opposed to the applications) of data systems. Broadly, we work on machine programming for data systems, investigating how to build data systems that automatically adapt to new hardware and user workloads, invent novel processing strategies, and understand user intention.

Active projects

My group's projects include:

Offline query optimization: important queries deserve additional optimization. How can we find the best possible query plan given a time budget?
Scalable semantic operators: how can we scale semantic operators to datasets with billions of rows? Even calling a small LLM per-row is too slow.
Rethinking execution engines: as hardware and query workloads become more complex, how should we design execution engines for future DBMSes?
LLMs for query optimization: LLMs have great reasoning capabilities in some domains. How can we leverage those abilities for QO?

I also collaborate on:

TIDES: how can organizations share data while ensuring compliance with privacy regulations? TIDES is a privacy-preserving cross-organization data integration service.
TruthTable: how can someone verify that an external database is giving them the correct answer? TruthTable is a verifiable query engine that builds proofs that a result is correct.

Are you interested in working with us? Fill out this application. Note that we generally work with students who have strong systems backgrounds.

If you are a current Penn student who has questions about working with us, feel free to book a quick 15-minute chat with me. Please include a link to your website, and use the phrase “march of the penguins” in the description so I know you read this!

Current Ph.D. students

Noopur Bhatt (co-advised with Sebastian Angel and Mike Hicks)
Jeff Tao (co-advised with Andrew Head)
Zixuan Yi (co-advised with Zack Ives)
Zijie Zhao

Current undergraduate RAs

Joshua Ahn
Rachel Lee

Ph.D. Alumni

Dr. Peizhi Wu (Ph.D. co-advised with Zack Ives, first employment: Bytedance US, 2026)

Thesis alumni

Paul Kotys M.S. thesis (first employment: PhD student at MIT, 2026)
Ashwin Alaparthi MA/MS thesis (first employment: SDE at VAPI, 2025)
Daniel Xue B.S. thesis (first employment: SWE at Roblox, 2025)
Peter Akioyamen MA/MS thesis (first employment: Senior ML Engineer at Bain, 2024)

Courses

At the University of Pennsylvania:

Fall 2026, CIS 5450, Big Data Analytics (upcoming)
Spring 2026, CIS 5450, Big Data Analytics, joint with Harry Smith (syllabus)
Spring 2026, CIS 8000, AI Systems Seminar, joint with Zack Ives and Varun Jana (syllabus) (website)
Fall 2025, CIS 6500, Advanced Topics in Database Systems (syllabus) (website)
Spring 2025, CIS 5450, Big Data Analytics (syllabus)
Fall 2024, CIS 5450, Big Data Analytics (syllabus)
Spring 2024, CIS 5450, Big Data Analytics, joint with Jacob Gardner (syllabus)
Fall 2023, CIS 6500, Advanced Topics in Database Systems (syllabus)

At Brandeis University:

Spring 2019, COSI 127b, Introduction to Database Systems (syllabus)

Other collaborators

One of our best and brightest collaborators, Mishmish is an expert in computational winguistics.