Towards NLP-Enhanced Data Profiling Tools
Summary: Proposes NLP-enhanced data profiling that uses the semantics of column and table names to prioritize profiling methods and targets when the time or compute budget is limited. Signals derived from names steer targeted analyses, which reduces cost and improves the relevance of the discovered metadata. (summarized by gpt-5-mini on Feb 09 2026)
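To make the idea of name-guided prioritization concrete, here is a minimal Python sketch. It is not the paper's method: it assumes a simple token-overlap (Jaccard) score as a stand-in for whatever NLP model the authors use. The names `tokenize`, `name_similarity`, and `prioritize_pairs`, as well as the example columns, are hypothetical. Given a budget of column pairs that can be profiled (for example, tested for correlation), it ranks the pairs whose names look most related first.

```python
import re
from itertools import combinations


def tokenize(name):
    """Split a column name such as 'orderDate' or 'order_date' into lowercase tokens."""
    # Insert a space at lower-to-upper transitions so camelCase splits too.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t}


def name_similarity(a, b):
    """Jaccard overlap of name tokens; a crude stand-in for a learned NLP score (assumption)."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def prioritize_pairs(columns, budget):
    """Return at most `budget` column pairs, most name-similar first."""
    scored = [(name_similarity(a, b), a, b) for a, b in combinations(columns, 2)]
    scored.sort(key=lambda s: -s[0])  # stable sort keeps input order among ties
    return [(a, b) for _, a, b in scored[:budget]]


# Hypothetical schema; only the two highest-ranked pairs fit the budget.
columns = ["order_date", "ship_date", "customer_id", "customerName", "total_price"]
for pair in prioritize_pairs(columns, budget=2):
    print(pair)
```

A real tool would presumably replace `name_similarity` with a language-model score and would rank profiling methods as well as targets. The budgeted top-k selection is the part this sketch is meant to show.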
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,015 | Chorus: Foundation Models for Unified Data Discovery and Exploration | 2024 | VLDB | 7.7092391e-05 |
| 5,509 | Can Large Language Models Predict Data Correlations from Column Names? | 2023 | VLDB | 5.4703368e-05 |
| 7,026 | Mind the Data Gap: Bridging LLMs to Enterprise Data Integration | 2025 | CIDR | 4.8570811e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 0 of 0 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,635 | A Deep Dive into Deep Learning Approaches for Text-to-SQL Systems | 2021 | SIGMOD | 6.8981006e-05 |
| 6,826 | Natural Language Interfaces for Databases with Deep Learning | 2023 | VLDB | 4.9142824e-05 |
| 10,837 | Natural Language to SQL: State of the Art and Open Problems | 2025 | VLDB | 4.1945683e-05 |
| 11,344 | Simplifying Access to Large-scale Structured Datasets by Meta-Profiling with Scalable Training Set Enrichment | 2022 | SIGMOD | 4.1945683e-05 |
| 1,625 | Data Profiling with Metanome | 2015 | VLDB | 0.00011094926 |
| 984 | Natural language to SQL: Where are we today? | 2020 | VLDB | 0.00014857465 |
| 3,467 | Data Profiling – A Tutorial | 2017 | SIGMOD | 7.069081e-05 |
| 5,449 | Transformers for Tabular Data Representation: A Tutorial on Models and Applications | 2022 | VLDB | 5.5008652e-05 |
| 5,509 | Can Large Language Models Predict Data Correlations from Column Names? | 2023 | VLDB | 5.4703368e-05 |
| 8,615 | The Case for NLP-Enhanced Database Tuning: Towards Tuning Tools that "Read the Manual" | 2021 | VLDB | 4.484683e-05 |