Data Debugging and Exploration with Vizier
Summary: Vizier unifies Python, SQL, and automated data curation and debugging for multi-modal exploration on Spark, handling large, multi-format data. It combines a notebook and a spreadsheet interface with visualizations; native provenance and versioning support collaboration and uncertainty management on real-data tasks. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Mike Brachmann
2. Carlos Bautista
3. Sonia Castelo
4. Su Feng
5. Juliana Freire
6. Boris Glavic
7. Oliver Kennedy
8. Heiko Müller
9. Rémi Rampin
10. William Spoth
11. Ying Yang
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,993 | Automatically Generating Data Exploration Sessions Using Deep Reinforcement Learning | 2020 | SIGMOD | 9.8453334e-05 |
| 5,976 | Responsible Data Integration: Next-generation Challenges | 2022 | SIGMOD | 5.245976e-05 |
| 6,291 | Lightweight Inspection of Data Preprocessing in Native Machine Learning Pipelines | 2021 | CIDR | 5.1269764e-05 |
| 6,944 | DataPrism: Exposing Disconnect between Data and Systems | 2022 | SIGMOD | 4.8912787e-05 |
| 7,941 | Efficient Uncertainty Tracking for Complex Queries with Attribute-level Bounds | 2021 | SIGMOD | 4.613363e-05 |
| 9,306 | Debugging Large-Scale Data Science Pipelines using Dagger | 2020 | VLDB | 4.3572942e-05 |
| 10,888 | Kishu: Time-Traveling for Computational Notebooks | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 2 of 2 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,413 | VisTrails: Visualization meets Data Management | 2006 | SIGMOD | 0.00012121257 |
| 5,779 | Lenses: An On-Demand Approach to ETL | 2015 | VLDB | 5.3307398e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,463 | PyExplore: Query Recommendations for Data Exploration without Query Logs | 2021 | SIGMOD | 4.1945683e-05 |
| 5,107 | SeeDB: Automatically Generating Query Visualizations | 2014 | VLDB | 5.6925578e-05 |
| 10,433 | DataDazzle: Intelligent Data Exploration through Natural Language | 2025 | SIGMOD | 4.1945683e-05 |
| 2,251 | Vizdom: Interactive Analytics through Pen and Touch | 2015 | VLDB | 9.1986441e-05 |
| 2,160 | DEVise: Integrated Querying and Visual Exploration of Large Datasets | 1997 | SIGMOD | 9.4065027e-05 |
| 9,306 | Debugging Large-Scale Data Science Pipelines using Dagger | 2020 | VLDB | 4.3572942e-05 |
| 1,918 | VizDeck: Self-Organizing Dashboards for Visual Analytics | 2012 | SIGMOD | 0.00010097599 |
| 5,058 | A Demo of the Data Civilizer System | 2017 | SIGMOD | 5.7280139e-05 |
| 12,766 | DEVise: Integrated Querying and Visual Exploration of Large Datasets (DEMO ABSTRACT) | 1997 | SIGMOD | 4.1945683e-05 |
| 6,295 | Your notebook is not crumby enough, REPLace it | 2020 | CIDR | 5.1249204e-05 |