Database Paper Browser

How Large Language Models Will Disrupt Data Management

Summary: LLMs provide semantic grounding of tuples, schemas, and queries, enabling automation breakthroughs in tasks where progress had stalled (entity resolution, schema matching, data discovery, query synthesis). They also blur the boundary between predictive models and information retrieval (IR), prompting new DB/architecture designs. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 13166
Venue: VLDB
Year: 2023
Pagerank: 6.5513237e-05
Overall Rank: 3,995 | 72.21%
DOI: 10.14778/3611479.3611527

Authors

Incoming Citations (Sorted by Pagerank)

Showing 12 of 12 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 22 of 22 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 31 | Provenance Semirings | 2007 | PODS | 0.0007857786 |
| 221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824 |
| 513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342 |
| 517 | Can Foundation Models Wrangle Your Data? | 2023 | VLDB | 0.00021169035 |
| 518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934 |
| 567 | NaLIR: An Interactive Natural Language Interface for Querying Relational Databases | 2014 | SIGMOD | 0.00019966681 |
| 667 | Incremental Knowledge Base Construction Using DeepDive | 2015 | VLDB | 0.00018440557 |
| 893 | Data Integration: The Teenage Years | 2006 | VLDB | 0.00015558352 |
| 1,147 | Web-scale Data Integration: You can only afford to Pay As You Go | 2007 | CIDR | 0.00013677658 |
| 1,407 | DB-BERT: A Database Tuning Tool that "Reads the Manual" | 2022 | SIGMOD | 0.00012146739 |
| 1,643 | CodexDB: Synthesizing Code for Query Processing from Natural Language Instructions using GPT-3 Codex | 2022 | VLDB | 0.0001104256 |
| 2,152 | MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis | 2018 | SIGMOD | 9.4239787e-05 |
| 2,352 | MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud | 2023 | VLDB | 8.9766205e-05 |
| 2,888 | Sato: Contextual Semantic Type Detection in Tables | 2020 | VLDB | 7.9594996e-05 |
| 2,902 | PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel | 2023 | VLDB | 7.93939e-05 |
| 3,617 | Ava: From Data to Insights Through Conversation | 2017 | CIDR | 6.9091789e-05 |
| 4,180 | FastFlow: Accelerating Deep Learning Model Training with Smart Offloading of Input Data Pipeline | 2023 | VLDB | 6.3793352e-05 |
| 4,630 | Knowledge Graphs 2021: A Data Odyssey | 2021 | VLDB | 6.0348379e-05 |
| 4,967 | Leva: Boosting Machine Learning Performance with Relational Embedding Data Augmentation | 2022 | SIGMOD | 5.7956612e-05 |
| 6,377 | Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism | 2023 | VLDB | 5.0911095e-05 |
| 7,868 | Solo: Data Discovery Using Natural Language Questions Via A Self-Supervised Approach | 2023 | SIGMOD | 4.6319504e-05 |
| 8,615 | The Case for NLP-Enhanced Database Tuning: Towards Tuning Tools that "Read the Manual" | 2021 | VLDB | 4.484683e-05 |

Semantically Similar Papers