Database Paper Browser

Unveiling Challenges for LLMs in Enterprise Data Engineering

Summary: Identifies enterprise-specific obstacles for LLM-driven tabular data engineering—large tables, more complex tasks, and dependence on internal background knowledge. Systematic evaluation shows substantial accuracy degradation and practical limits of current LLMs in real-world enterprise settings. (summarized by gpt-5-mini on Mar 13 2026)

Paper ID: 14311
Venue: VLDB
Year: 2026
Pagerank: 4.456315e-05
Overall Rank: 8,736 | 39.23%
DOI: 10.14778/3773749.3773758

Incoming Citations (Sorted by Pagerank)

Showing 1 of 1 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
10,183 | Mixtera: A Data Plane for Foundation Model Training | 2026 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 17 of 17 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466
369 | Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation | 2024 | VLDB | 0.0002547515
513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342
517 | Can Foundation Models Wrangle Your Data? | 2023 | VLDB | 0.00021169035
916 | On Schema Matching with Opaque Column Names and Data Values | 2003 | SIGMOD | 0.00015379422
1,082 | CAESURA: Language Models as Multi-Modal Query Planners | 2024 | CIDR | 0.00014214232
1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639
2,106 | Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing | 2025 | CIDR | 9.5342543e-05
2,587 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks | 2024 | SIGMOD | 8.4924618e-05
2,888 | Sato: Contextual Semantic Type Detection in Tables | 2020 | VLDB | 7.9594996e-05
3,015 | Chorus: Foundation Models for Unified Data Discovery and Exploration | 2024 | VLDB | 7.7092391e-05
3,520 | GitTables: A Large-Scale Corpus of Relational Tables | 2023 | SIGMOD | 7.0131061e-05
4,464 | Magellan: Toward Building Entity Matching Management Systems over Data Science Stacks | 2016 | VLDB | 6.1606042e-05
7,026 | Mind the Data Gap: Bridging LLMs to Enterprise Data Integration | 2025 | CIDR | 4.8570811e-05
7,048 | Magneto: Combining Small and Large Language Models for Schema Matching | 2025 | VLDB | 4.8520651e-05
8,052 | Generating Succinct Descriptions of Database Schemata for Cost-Efficient Prompting of Large Language Models | 2024 | VLDB | 4.5953106e-05
9,515 | Automating the Enterprise with Foundation Models | 2024 | VLDB | 4.3335877e-05

Semantically Similar Papers