| 3,152 | AnalyticDB: Real-time OLAP Database System at Alibaba Cloud | 2019 | VLDB | 7.4711766e-05 |
| 3,248 | A Learned Query Rewrite System using Monte Carlo Tree Search | 2022 | VLDB | 7.3258782e-05 |
| 3,662 | The Dawn of Natural Language to SQL: Are We Fully Ready? | 2024 | VLDB | 6.8672143e-05 |
| 3,727 | Cost-based or Learning-based? A Hybrid Query Optimizer for Query Plan Selection | 2022 | VLDB | 6.8141709e-05 |
| 4,102 | GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data | 2023 | SIGMOD | 6.4522929e-05 |
| 4,543 | FACE: A Normalizing Flow based Cardinality Estimator | 2022 | VLDB | 6.1011198e-05 |
| 4,825 | Synthesizing Natural Language to Visualization (NL2VIS) Benchmarks from NL2SQL Benchmarks | 2021 | SIGMOD | 5.8946721e-05 |
| 5,279 | CDB: A Crowd-Powered Database System | 2018 | VLDB | 5.5902418e-05 |
| 5,362 | Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach | 2016 | SIGMOD | 5.5473503e-05 |
| 5,371 | LearnedSQLGen: Constraint-aware SQL Generation using Reinforcement Learning | 2022 | SIGMOD | 5.5428776e-05 |
| 5,381 | Selective Data Acquisition in the Wild for Model Charging | 2022 | VLDB | 5.5399508e-05 |
| 5,963 | Automatic Data Acquisition for Deep Learning | 2021 | VLDB | 5.2526794e-05 |
| 6,569 | Domain Adaptation for Deep Entity Resolution | 2022 | SIGMOD | 5.0065379e-05 |
| 7,179 | Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning | 2023 | VLDB | 4.8078895e-05 |
| 7,575 | Human-in-the-loop Outlier Detection | 2020 | SIGMOD | 4.7068909e-05 |
| 7,582 | LakeCompass: An End-to-End System for Data Maintenance, Search and Analysis in Data Lakes | 2024 | VLDB | 4.7046388e-05 |
| 8,116 | LakeBench: A Benchmark for Discovering Joinable and Unionable Tables in Data Lakes | 2024 | VLDB | 4.581507e-05 |
| 8,268 | Learned Data-aware Image Representations of Line Charts for Similarity Search | 2023 | SIGMOD | 4.5456668e-05 |
| 8,406 | DADER: Hands-Off Entity Resolution with Domain Adaptation | 2022 | VLDB | 4.5220083e-05 |
| 8,828 | HAIPipe: Combining Human-generated and Machine-generated Pipelines for Data Preparation | 2023 | SIGMOD | 4.4407488e-05 |
| 9,152 | Doctopus: Budget-aware Structural Table Extraction from Unstructured Documents | 2025 | VLDB | 4.3849295e-05 |
| 9,213 | PACE: Poisoning Attacks on Learned Cardinality Estimation | 2024 | SIGMOD | 4.3721075e-05 |
| 9,221 | VisClean: Interactive Cleaning for Progressive Visualization | 2020 | VLDB | 4.3699444e-05 |
| 9,475 | OIE: An Interpretable System for Outlier Explanation and Summarization | 2025 | SIGMOD | 4.3341665e-05 |
| 9,479 | Data Imputation with Limited Data Redundancy Using Data Lakes | 2025 | VLDB | 4.3341665e-05 |
| 10,239 | BRIEF: Bi-level Coreset Selection for Efficient Instruction Tuning in LLMs | 2026 | VLDB | 4.1945683e-05 |
| 10,289 | LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning | 2026 | VLDB | 4.1945683e-05 |
| 10,438 | Doctopus: A System for Budget-aware Structural Data Extraction from Unstructured Documents | 2025 | SIGMOD | 4.1945683e-05 |
| 10,528 | Two Birds with One Stone: Efficient Deep Learning over Mislabeled Data through Subset Selection | 2025 | SIGMOD | 4.1945683e-05 |
| 10,752 | QUEST: Query Optimization in Unstructured Document Analysis | 2025 | VLDB | 4.1945683e-05 |
| 10,837 | Natural Language to SQL: State of the Art and Open Problems | 2025 | VLDB | 4.1945683e-05 |
| 11,000 | MisDetect: Iterative Mislabel Detection using Early Loss | 2024 | VLDB | 4.1945683e-05 |
| 11,582 | Interactively Discovering and Ranking Desired Tuples without Writing SQL Queries | 2020 | SIGMOD | 4.1945683e-05 |
| 11,788 | CDB: Optimizing Queries with Crowd-Based Selections and Joins | 2017 | SIGMOD | 4.1945683e-05 |
| 13,134 | DocDB: A Database for Unstructured Document Analysis | 2025 | VLDB | - |