Ryan Marcus, assistant professor at the University of Pennsylvania.
Using machine learning to build the next generation of data systems.
Scalable Semantic Operators

Semantic operators, generally powered by LLMs, are taking the database world by storm. Queries that were previously out of reach, like "does this review appear fake?", are now possible. Unfortunately, naive implementations of semantic operators generally involve calling an expensive LLM for every row of data. How can we scale semantic operators to datasets with billions of rows?
Papers
- ScaleLLM: A technique for scalable LLM-augmented data systems Demo.
- Ashwin Alaparthi
- Paul Loh
- Ryan Marcus
SIGMOD '25 (pdf) (doi)
People