Database Paper Browser

Transform-Data-by-Example (TDE): An Extensible Search Engine for Data Transformations

Summary: Transform-Data-by-Example (TDE) is an extensible search engine that indexes 50K transformation functions drawn from code, DLLs, web services, and mapping tables. Users provide input/output examples, and TDE uses them to synthesize matching transformation programs. On 200 tasks, TDE reaches 72% correctness and outperforms existing systems; an Excel beta and a Power BI integration extend its practical adoption. (summarized by gpt-5-nano on Feb 09 2026)
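
The by-example idea in the summary can be illustrated with a toy sketch. This is not TDE's actual algorithm or index: the `FUNCTION_INDEX` registry, the `search_by_example` helper, and the handful of string functions below are all hypothetical stand-ins, and the search is a plain linear scan that keeps every function consistent with all of the user's input/output pairs.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical miniature "index" of candidate transformations.
# The system described above indexes about 50K functions; this toy has six.
FUNCTION_INDEX: Dict[str, Callable[[str], str]] = {
    "upper": str.upper,
    "lower": str.lower,
    "strip": str.strip,
    "title": str.title,
    "last_token": lambda s: s.split()[-1],
    "initials": lambda s: "".join(w[0] for w in s.split()),
}


def search_by_example(examples: List[Tuple[str, str]]) -> List[str]:
    """Return names of indexed functions that map every input to its output."""
    matches = []
    for name, fn in FUNCTION_INDEX.items():
        try:
            if all(fn(inp) == out for inp, out in examples):
                matches.append(name)
        except Exception:
            # A candidate that fails on some input simply does not match.
            continue
    return matches


if __name__ == "__main__":
    examples = [("ada lovelace", "Ada Lovelace"), ("alan turing", "Alan Turing")]
    print(search_by_example(examples))
```

With these two example pairs, only the `title` entry is consistent with both, so it is the single function returned. A realistic system would also need ranking and composition of functions, which this sketch leaves out.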

Paper ID: 11609
Venue: VLDB
Year: 2018
Pagerank: 7.054159e-05
Overall Rank: 3,478 | 75.81%
DOI: 10.14778/3231751.3231766

Authors

Incoming Citations (Sorted by Pagerank)

Showing 19 of 19 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
517 | Can Foundation Models Wrangle Your Data? | 2023 | VLDB | 0.00021169035
2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05
2,587 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks | 2024 | SIGMOD | 8.4924618e-05
3,252 | Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks | 2020 | SIGMOD | 7.3178277e-05
5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05
5,242 | Towards Benchmarking Feature Type Inference for AutoML Platforms | 2021 | SIGMOD | 5.6074743e-05
5,275 | Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples | 2023 | VLDB | 5.5905507e-05
5,280 | Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V | 2023 | VLDB | 5.5896735e-05
5,383 | Auto-Pipeline: Synthesizing Complex Data Pipelines By-Target Using Reinforcement Learning and Search | 2021 | VLDB | 5.5393038e-05
5,525 | QueryBooster: Improving SQL Performance Using Middleware Services for Human-Centered Query Rewriting | 2023 | VLDB | 5.4600815e-05
6,553 | How do Categorical Duplicates Affect ML? A New Benchmark and Empirical Analyses | 2024 | VLDB | 5.0157344e-05
6,800 | DTT: An Example-Driven Tabular Transformer for Joinability by Leveraging Large Language Models | 2024 | SIGMOD | 4.9231471e-05
9,371 | Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations | 2024 | SIGMOD | 4.3480692e-05
9,399 | TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations | 2025 | VLDB | 4.3441378e-05
10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05
10,610 | Weak-to-Strong Prompts with Lightweight-to-Powerful LLMs for High-Accuracy, Low-Cost, and Explainable Data Transformation | 2025 | VLDB | 4.1945683e-05
11,178 | LinCQA: Faster Consistent Query Answering with Linear Time Guarantees | 2023 | SIGMOD | 4.1945683e-05
11,297 | DataRinse: Semantic Transforms for Data preparation based on Code Mining | 2023 | VLDB | 4.1945683e-05
11,343 | SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation | 2022 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 9 of 9 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers