Auto-Pipeline: Synthesizing Complex Data Pipelines By-Target Using Reinforcement Learning and Search
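Summary: Auto-Pipeline synthesizes data pipelines that combine string transformations and table operations, using deep reinforcement learning (DRL) and search. Synthesis is guided by a user-specified target output (the by-target interface), and functional dependencies (FDs) and keys are used to prune the search space. Evaluated on 700 real pipelines, it solves about 70% of them, with pipelines of up to 10 steps. (summarized by gpt-5-nano on Feb 09 2026)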
Authors
1. Junwen Yang
2. Yeye He
3. Surajit Chaudhuri
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,275 | Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples | 2023 | VLDB | 5.5905507e-05 |
| 5,280 | Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V | 2023 | VLDB | 5.5896735e-05 |
| 8,828 | HAIPipe: Combining Human-generated and Machine-generated Pipelines for Data Preparation | 2023 | SIGMOD | 4.4407488e-05 |
| 9,371 | Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations | 2024 | SIGMOD | 4.3480692e-05 |
| 10,168 | FlowPilot: A Suggestion System for Designing Scientific Workflows | 2026 | SIGMOD | 4.1945683e-05 |
| 10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05 |
| 11,103 | LucidScript: Bottom-up Standardization for Data Preparation | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 12 of 12 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.