Leva: Boosting Machine Learning Performance with Relational Embedding Data Augmentation
Summary: Leva builds a relational embedding by converting the database into a graph and learning vectors that summarize the entire dataset. Downstream supervision then filters out noisy graph signals, which reduces the burden of cross-relational feature engineering and data discovery and improves ML performance on classification and regression tasks. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,836 | Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning | 2023 | VLDB | 8.0443826e-05 |
| 3,995 | How Large Language Models Will Disrupt Data Management | 2023 | VLDB | 6.5513237e-05 |
| 5,429 | DiffPrep: Differentiable Data Preprocessing Pipeline Search for Learning over Tabular Data | 2023 | SIGMOD | 5.5087325e-05 |
| 6,217 | Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System | 2025 | SIGMOD | 5.1534752e-05 |
| 7,868 | Solo: Data Discovery Using Natural Language Questions Via A Self-Supervised Approach | 2023 | SIGMOD | 4.6319504e-05 |
| 8,852 | Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation | 2023 | SIGMOD | 4.4356508e-05 |
| 10,754 | OmniMatch: Joinability Discovery in Data Products | 2025 | VLDB | 4.1945683e-05 |
| 10,973 | Unstructured Data Fusion for Schema and Data Extraction | 2024 | SIGMOD | 4.1945683e-05 |
| 11,054 | Enriching Relations with Additional Attributes for ER | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342 |
| 754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211 |
| 903 | To Join or Not to Join? Thinking Twice about Joins before Feature Selection | 2016 | SIGMOD | 0.0001547016 |
| 1,463 | ARDA: Automatic Relational Data Augmentation for Machine Learning | 2020 | VLDB | 0.00011869295 |
| 1,751 | Auctus: A Dataset Search Engine for Data Discovery and Augmentation | 2021 | VLDB | 0.00010683295 |
| 1,914 | Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks | 2020 | SIGMOD | 0.00010109102 |
| 2,141 | LSH Ensemble: Internet-Scale Domain Search | 2016 | VLDB | 9.4542625e-05 |
| 3,824 | Correlation Sketches for Approximate Join-Correlation Queries | 2021 | SIGMOD | 6.7260705e-05 |
| 4,129 | Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers? | 2018 | VLDB | 6.428887e-05 |
| 7,867 | Learning Over Dirty Data Without Cleaning | 2020 | SIGMOD | 4.6320452e-05 |