Database Paper Browser

Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins

Summary: Ember enables no-code context enrichment via a general keyless-join operator. It learns task-specific embeddings with Transformer-based representations, builds a similarity index over these embeddings, and delivers up to 39% recall gains across five domains with minimal configuration. (summarized by gpt-5-nano on Feb 09 2026)
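The pipeline the summary describes (embed records, index the embeddings, join by nearest neighbor rather than by key) can be illustrated with a minimal sketch. This is not Ember's code or API: the `embed`, `cosine`, and `keyless_join` names are hypothetical, the bag-of-words vectors are a toy stand-in for the learned Transformer-based embeddings, a brute-force scan replaces a real similarity index, and the sample records are invented for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding: lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyless_join(left, right, k=1):
    """For each left record, return the k most similar right records.

    No shared key column is needed: matching relies only on similarity
    between embeddings. The "index" here is a brute-force list scan.
    """
    index = [(row, embed(row)) for row in right]
    result = {}
    for query in left:
        query_vec = embed(query)
        ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        result[query] = [row for row, _ in ranked[:k]]
    return result


# Invented sample data: enrich product records with the closest review.
products = ["apple iphone 13 128gb", "samsung galaxy s21"]
reviews = [
    "great camera on the galaxy s21",
    "iphone 13 battery lasts all day",
    "a cookbook about apple pie",
]
matches = keyless_join(products, reviews, k=1)
for product, matched in matches.items():
    print(product, "->", matched)
```

In a setting closer to the one the summary describes, `embed` would presumably be replaced by a trained encoder and the list scan by an approximate nearest-neighbor index; the join logic itself would keep the same shape.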

Paper ID: 12942
Venue: VLDB
Year: 2022
Pagerank: 6.6114622e-05
Overall Rank: 3,942 | 72.58%
DOI: 10.14778/3494124.3494149

Authors

Incoming Citations (Sorted by Pagerank)

Showing 9 of 9 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 21 of 21 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
206 | Constructing an Interactive Natural Language Interface for Relational Databases | 2015 | VLDB | 0.00034667032
221 | Deep Entity Matching with Pre-Trained Language Models | 2021 | VLDB | 0.00033121824
254 | Snorkel: Rapid Training Data Creation with Weak Supervision | 2018 | VLDB | 0.00030540555
300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466
518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934
610 | Goods: Organizing Google's Datasets | 2016 | SIGMOD | 0.00019232674
712 | Magellan: Toward Building Entity Matching Management Systems | 2016 | VLDB | 0.00017732426
754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211
903 | To Join or Not to Join? Thinking Twice about Joins before Feature Selection | 2016 | SIGMOD | 0.0001547016
1,198 | Crossing the Structure Chasm | 2003 | CIDR | 0.00013366708
1,281 | DataHub: Collaborative Data Science & Dataset Version Management at Scale | 2015 | CIDR | 0.00012854744
1,463 | ARDA: Automatic Relational Data Augmentation for Machine Learning | 2020 | VLDB | 0.00011869295
1,751 | Auctus: A Dataset Search Engine for Data Discovery and Augmentation | 2021 | VLDB | 0.00010683295
3,640 | Deep Learning for Blocking in Entity Matching: A Design Space Exploration | 2021 | VLDB | 6.8891671e-05
4,129 | Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers? | 2018 | VLDB | 6.428887e-05
4,196 | Overton: A Data System for Monitoring and Improving Machine-Learned Products | 2020 | CIDR | 6.3686231e-05
5,058 | A Demo of the Data Civilizer System | 2017 | SIGMOD | 5.7280139e-05
5,434 | Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples | 2021 | SIGMOD | 5.5045402e-05
8,137 | Customizable and Scalable Fuzzy Join for Big Data | 2019 | VLDB | 4.5774794e-05
9,438 | Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation | 2021 | CIDR | 4.3425082e-05
11,629 | Leveraging Organizational Resources to Adapt Models to New Data Modalities | 2020 | VLDB | 4.1945683e-05

Semantically Similar Papers