Solo: Data Discovery Using Natural Language Questions Via A Self-Supervised Approach
Summary: Solo enables data discovery through natural-language questions using self-supervised training, so no labeled data is needed. It develops self-supervised data generation, table representations, and relevance models for end-to-end learned discovery that outperforms baselines. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,995 | How Large Language Models Will Disrupt Data Management | 2023 | VLDB | 6.5513237e-05 |
| 6,217 | Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System | 2025 | SIGMOD | 5.1534752e-05 |
| 9,928 | Fainder: A Fast and Accurate Index for Distribution-Aware Dataset Search | 2024 | VLDB | 4.2511622e-05 |
| 10,064 | Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees | 2026 | SIGMOD | 4.1945683e-05 |
| 10,197 | Qualitative Join Discovery in Data Lakes using Examples | 2026 | SIGMOD | 4.1945683e-05 |
| 10,589 | Birdie: Natural Language-Driven Table Discovery Using Differentiable Search Index | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 0.00048377684 |
| 276 | Efficient IR-Style Keyword Search over Relational Databases | 2003 | VLDB | 0.00029336949 |
| 610 | Goods: Organizing Google's Datasets | 2016 | SIGMOD | 0.00019232674 |
| 1,644 | Finding Related Tables in Data Lakes for Interactive Data Science | 2020 | SIGMOD | 0.00011041787 |
| 1,751 | Auctus: A Dataset Search Engine for Data Discovery and Augmentation | 2021 | VLDB | 0.00010683295 |
| 4,967 | Leva: Boosting Machine Learning Performance with Relational Embedding Data Augmentation | 2022 | SIGMOD | 5.7956612e-05 |