Database Paper Browser

Towards Automated Cross-domain Exploratory Data Analysis through Large Language Models

Summary: TiInsight is an end-to-end, LLM-driven system for cross-domain, SQL-based exploratory data analysis (EDA). It introduces a hierarchical data context (HDC) that summarizes the database schema and enables open-world generalization. The pipeline has four stages: HDC generation, question clarification and decomposition, text-to-SQL (TiSQL), and visualization (TiChart). The system is deployed in production with open APIs, achieves 86.3% execution accuracy on Spider with GPT‑4, and shows strong user-study gains versus experts. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 14109
Venue: VLDB
Year: 2025
Pagerank: 4.1945683e-05
Overall Rank: 10,784 | 24.98%
DOI: 10.14778/3750601.3750629

Incoming Non-self Citations Over Time

No non-self incoming citations found for this paper in this database.

Authors

Incoming Citations (Sorted by Pagerank)

Showing 0 of 0 citing papers.


Outgoing Citations (Sorted by Pagerank)

Showing 28 of 28 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
369 | Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation | 2024 | VLDB | 0.0002547515
460 | SeeDB: Efficient Data-Driven Visualization Recommendations to Support Visual Analytics | 2015 | VLDB | 0.00022516069
535 | ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores | 2016 | VLDB | 0.00020727678
984 | Natural language to SQL: Where are we today? | 2020 | VLDB | 0.00014857465
998 | CodeS: Towards Building Open-source Language Models for Text-to-SQL | 2024 | SIGMOD | 0.00014729379
1,350 | Northstar: An Interactive Data Science System | 2018 | VLDB | 0.00012431059
1,430 | Duoquest: A Dual-Specification System for Expressive SQL Queries | 2020 | SIGMOD | 0.00012031061
1,552 | Overview of Data Exploration Techniques | 2015 | SIGMOD | 0.00011408814
2,945 | Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning | 2023 | SIGMOD | 7.8377395e-05
2,988 | NL2SQL is a solved problem... Not! | 2024 | CIDR | 7.7761714e-05
3,393 | Lux: Always-on Visualization Recommendations for Exploratory Dataframe Workflows | 2022 | VLDB | 7.1483239e-05
3,546 | Extracting Top-K Insights from Multi-dimensional Data | 2017 | SIGMOD | 6.9870745e-05
3,661 | Example-Driven Query Intent Discovery: Abductive Reasoning using Semantic Similarity | 2019 | VLDB | 6.8689912e-05
3,970 | HAIChart: Human and AI Paired Visualization System | 2024 | VLDB | 6.5784767e-05
4,540 | Automating Exploratory Data Analysis via Machine Learning: An Overview | 2020 | SIGMOD | 6.1033443e-05
4,739 | AutoTQA: Towards Autonomous Tabular Question Answering through Multi-Agent Large Language Models | 2024 | VLDB | 5.959592e-05
4,908 | Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 2024 | VLDB | 5.8339245e-05
5,033 | FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis | 2024 | SIGMOD | 5.7486224e-05
5,217 | QuickInsights: Quick and Automatic Discovery of Insights from Multi-Dimensional Data | 2019 | SIGMOD | 5.6227959e-05
5,313 | XInsight: eXplainable Data Analysis Through The Lens of Causality | 2023 | SIGMOD | 5.573009e-05
5,981 | DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python | 2021 | SIGMOD | 5.2448986e-05
7,989 | RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems | 2025 | VLDB | 4.6124681e-05
8,388 | FEDEX: An Explainability Framework for Data Exploration Steps | 2022 | VLDB | 4.5297787e-05
8,996 | MetaInsight: Automatic Discovery of Structured Knowledge for Exploratory Data Analysis | 2021 | SIGMOD | 4.4124959e-05
9,219 | Intelligent Agents for Data Exploration | 2024 | VLDB | 4.3702863e-05
9,829 | Sevi: Speech-to-Visualization through Neural Machine Translation | 2022 | SIGMOD | 4.2751057e-05
9,830 | Towards Autonomous, Hands-Free Data Exploration | 2020 | CIDR | 4.2751057e-05
10,460 | UNITQA: A Unified Automated Tabular Question Answering System with Multi-Agent Large Language Models | 2025 | SIGMOD | 4.1945683e-05

Semantically Similar Papers