Ryan Marcus, assistant professor at the University of Pennsylvania. Using machine learning to build the next generation of data systems.
      
    ____                       __  ___                          
   / __ \__  ______ _____     /  |/  /___ _____________  _______
  / /_/ / / / / __ `/ __ \   / /|_/ / __ `/ ___/ ___/ / / / ___/
 / _, _/ /_/ / /_/ / / / /  / /  / / /_/ / /  / /__/ /_/ (__  ) 
/_/ |_|\__, /\__,_/_/ /_/  /_/  /_/\__,_/_/   \___/\__,_/____/  
      /____/                                                    
        

VLDB 2023

List of 7 papers (repeated below) presented at VLDB '23

I’m excited to be part of seven different papers, demos, and workshop papers at VLDB in Vancouver this year!

AdaChain

Every permissioned blockchain architecture has different performance characteristics. AdaChain automatically switches between architectures to optimize performance online, compensating for changes in workload and network conditions.

AutoSteer

Bringing a learned steering optimizer to a new database can be difficult, since optimizers can have thousands of knobs. AutoSteer automatically finds a good set of knobs, and optimizes your queries as well! We tested a large deployment of AutoSteer at Meta.

QO-Insight

Alongside AutoSteer, we developed a tool called QO-Insight to help DBAs understand the decisions of learned query optimizers. We will present a demo of our tool, which enables side-by-side query plan analysis!

Robust cardinality estimation

Query-driven cardinality estimators learn powerful, workload-tailored strategies, but have a hard time dealing with data drift. We show robust techniques that can tune a learned cardinality estimator online, as data changes.

SageDB

The culmination of several years of work on instance-optimized systems, SageDB is a prototype analytic database that combines several learned techniques at once for the first time.

(Technically, SageDB was published in the VLDB '22 proceedings, but the presentation is happening this year!)

RLShard

Almost every distributed transactional database today can tolerate crashes, but not Byzantine failures. Here, we take a first look at building a distributed, sharded database that can tolerate – and adapt to – Byzantine adversaries.

Learned query superoptimization

Modern analytics databases frequently run the same query multiple times. Could it be worth spending a long time – hours – optimizing such queries? I argue that doing so might allow DBMSes to capture some of the performance of bespoke systems.