Helix: Accelerating Human-in-the-loop Machine Learning
Summary: Helix is a declarative ML system that speeds up iterative development through end-to-end optimization and selective materialization of prior results. It adds a DAG-based UI for comparing workflow versions and achieves up to 10x runtime reductions on classification and structured prediction tasks. (summarized by gpt-5-nano on Feb 09 2026)
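The reuse idea in the summary can be illustrated with a toy sketch. This is not Helix's implementation or API: the `Workflow` class, the `materialize` set, and the three example operators are hypothetical names made up for illustration, and the fixed `materialize` set is only a stand-in for whatever policy a real system uses to decide which intermediate results to keep. Each operator gets a signature built from its own version and its parents' signatures, so a result is reused only when nothing upstream has changed.

```python
import hashlib


class Workflow:
    """Toy DAG runner that reuses selectively materialized results (illustrative only)."""

    def __init__(self, materialize):
        self.materialize = set(materialize)  # operators whose outputs are kept (assumed policy)
        self.store = {}                      # signature -> stored result (in-memory stand-in)
        self.executed = []                   # operators actually run in the latest iteration

    def run(self, ops, order):
        """ops: name -> (version, fn, parent names); order: a topological order of names."""
        self.executed = []
        sigs, results = {}, {}
        for name in order:
            version, fn, parents = ops[name]
            # The signature covers this operator's version and its parents' signatures,
            # so any upstream change invalidates every downstream result.
            raw = "|".join([name, version] + [sigs[p] for p in parents])
            sigs[name] = hashlib.sha256(raw.encode()).hexdigest()
            if sigs[name] in self.store:
                results[name] = self.store[sigs[name]]
            else:
                results[name] = fn(*[results[p] for p in parents])
                self.executed.append(name)
                if name in self.materialize:
                    self.store[sigs[name]] = results[name]
        return results


ops = {
    "load": ("v1", lambda: [1, 2, 3, 4], []),
    "features": ("v1", lambda rows: [r * 2 for r in rows], ["load"]),
    "model": ("v1", lambda feats: sum(feats) / len(feats), ["features"]),
}
order = ["load", "features", "model"]

wf = Workflow(materialize={"load", "features"})
first = wf.run(ops, order)
first_executed = list(wf.executed)

# Second iteration: only the final operator is edited, so upstream results are reused.
ops["model"] = ("v2", lambda feats: max(feats), ["features"])
second = wf.run(ops, order)
second_executed = list(wf.executed)
```

In this sketch the first run executes all three operators, while the second run executes only `model`. Shrinking `materialize` to `{"load"}` would force `features` to be recomputed as well, which is the trade-off between storage and recomputation that a materialization policy has to weigh.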
Authors
1. Doris Xin
2. Litian Ma
3. Jialin Liu
4. Stephen Macke
5. Shuchen Song
6. Aditya Parameswaran
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,666 | HELIX: Holistic Optimization for Accelerating Iterative Machine Learning | 2019 | VLDB | 0.0001096361 |
| 2,170 | tf.data: A Machine Learning Data Processing Framework | 2021 | VLDB | 9.3821603e-05 |
| 5,684 | Dagger: A Data (not code) Debugger | 2020 | CIDR | 5.3720749e-05 |
| 6,053 | Optimizing Machine Learning Workloads in Collaborative Environments | 2020 | SIGMOD | 5.2326838e-05 |
| 7,704 | ExDRa: Exploratory Data Science on Federated Raw Data | 2021 | SIGMOD | 4.6733838e-05 |
| 8,000 | Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics | 2019 | VLDB | 4.6092803e-05 |
| 9,306 | Debugging Large-Scale Data Science Pipelines using Dagger | 2020 | VLDB | 4.3572942e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 543 | MLbase: A Distributed Machine-learning System | 2013 | CIDR | 0.00020526854 |
| 658 | Towards a Unified Architecture for in-RDBMS Analytics | 2012 | SIGMOD | 0.00018506577 |
| 761 | Materialization Optimizations for Feature Selection Workloads | 2014 | SIGMOD | 0.00017053783 |
| 1,666 | HELIX: Holistic Optimization for Accelerating Iterative Machine Learning | 2019 | VLDB | 0.0001096361 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 254 | Snorkel: Rapid Training Data Creation with Weak Supervision | 2018 | VLDB | 0.00030540555 |
| 11,958 | Shared Execution of Recurring Workloads in MapReduce | 2015 | VLDB | 4.1945683e-05 |
| 8,121 | Automation of Data Prep, ML, and Data Science: New Cure or Snake Oil? | 2021 | SIGMOD | 4.5809305e-05 |
| 2,384 | Oracle AutoML: A Fast and Predictive AutoML Pipeline | 2020 | VLDB | 8.925354e-05 |
| 11,511 | HyMAC: A Hybrid Matrix Computation System | 2021 | VLDB | 4.1945683e-05 |
| 3,958 | MLog: Towards Declarative In-Database Machine Learning | 2017 | VLDB | 6.5897636e-05 |
| 13,268 | From ML Models to Intelligent Applications: The Rise of MLOps | 2021 | VLDB | - |
| 13,303 | International Workshop on Human-In-the-Loop Data Analytics (HILDA) | 2019 | SIGMOD | - |
| 2,172 | Spinning Fast Iterative Data Flows | 2012 | VLDB | 9.3706587e-05 |
| 1,666 | HELIX: Holistic Optimization for Accelerating Iterative Machine Learning | 2019 | VLDB | 0.0001096361 |