GOGGLES: Automatic Image Labeling with Affinity Coding
Summary: GOGGLES introduces affinity coding, a domain-agnostic approach to automatic image labeling using affinity functions to compare instance pairs and separate same-class from different-class pairs. A hierarchical generative model infers labels from a small development set, delivering 71-98% accuracy and outperforming Snuba and few-shot baselines while approaching fully supervised performance. (summarized by gpt-5-nano on Feb 09 2026)
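To make the idea in the summary concrete, here is a minimal sketch of labeling from pairwise affinities. It is not the GOGGLES implementation. It assumes each image has already been reduced to a feature vector, uses cosine similarity as a hypothetical affinity function, and replaces the paper's hierarchical generative model with a much simpler rule: each instance gets the class whose development-set examples it has the highest mean affinity to. The names `cosine_affinity`, `affinity_matrix`, and `infer_labels` are illustrative only.

```python
import math


def cosine_affinity(a, b):
    """Hypothetical affinity function: cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def affinity_matrix(features):
    """Score every instance pair; same-class pairs should score higher."""
    n = len(features)
    return [[cosine_affinity(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def infer_labels(aff, dev_labels):
    """Assign each instance the class with the highest mean affinity to the
    labeled development examples (a simple stand-in for the generative model).

    dev_labels maps instance index -> class for the small development set.
    """
    by_class = {}
    for idx, cls in dev_labels.items():
        by_class.setdefault(cls, []).append(idx)

    labels = []
    for i, row in enumerate(aff):
        if i in dev_labels:
            # Development-set instances keep their known label.
            labels.append(dev_labels[i])
            continue
        scores = {cls: sum(row[j] for j in idxs) / len(idxs)
                  for cls, idxs in by_class.items()}
        labels.append(max(scores, key=scores.get))
    return labels


# Toy feature vectors (assumed, not from the paper): two loose groups.
features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0],
            [0.2, 0.9], [0.8, 0.3], [0.3, 0.8]]
dev_labels = {0: "a", 2: "b"}  # two labeled development examples

labels = infer_labels(affinity_matrix(features), dev_labels)
print(labels)
```

With only two labeled examples, the remaining four instances are labeled purely from their affinity scores, which is the property the summary attributes to affinity coding. A real system would combine many affinity functions and model their reliability rather than trust a single score.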
Authors
1. Nilaksh Das
2. Sanya Chaba
3. Renzhi Wu
4. Sakshi Gandhi
5. Duen Horng Chau
6. Xu Chu
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,935 | OmniFair: A Declarative System for Model-Agnostic Group Fairness in Machine Learning | 2021 | SIGMOD | 5.8198727e-05 |
| 7,796 | CHEF: A Cheap and Fast Pipeline for Iteratively Cleaning Label Uncertainties | 2021 | VLDB | 4.6482625e-05 |
| 8,292 | Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming | 2022 | VLDB | 4.5435639e-05 |
| 8,714 | LANCET: Labeling Complex Data at Scale | 2021 | VLDB | 4.4619818e-05 |
| 9,409 | Ground Truth Inference for Weakly Supervised Entity Matching | 2023 | SIGMOD | 4.3441378e-05 |
| 9,806 | The Image Calculator: 10x Faster Image-AI Inference by Replacing JPEG with Self-designing Storage Format | 2024 | SIGMOD | 4.2805224e-05 |
| 10,465 | A Cost-Effective LLM-based Approach to Identify Wildlife Trafficking in Online Marketplaces | 2025 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858 |
| 254 | Snorkel: Rapid Training Data Creation with Weak Supervision | 2018 | VLDB | 0.00030540555 |
| 908 | Fusing Data with Correlations | 2014 | SIGMOD | 0.00015431241 |
| 1,215 | Snuba: Automating Weak Supervision to Label Training Data | 2019 | VLDB | 0.0001323375 |
| 3,897 | SLiMFast: Guaranteed Results for Data Fusion and Source Reliability | 2017 | SIGMOD | 6.6554845e-05 |
| 7,178 | Towards Globally Optimal Crowdsourcing Quality Management: The Uniform Worker Setting | 2016 | SIGMOD | 4.8085946e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,047 | Blocker and Matcher Can Mutually Benefit: A Co-Learning Framework for Low-Resource Entity Resolution | 2024 | VLDB | 4.1945683e-05 |
| 5,963 | Automatic Data Acquisition for Deep Learning | 2021 | VLDB | 5.2526794e-05 |
| 6,868 | Cost-Effective Data Annotation using Game-Based Crowdsourcing | 2019 | VLDB | 4.9010083e-05 |
| 8,908 | Deep Active Alignment of Knowledge Graph Entities and Schemata | 2023 | SIGMOD | 4.427232e-05 |
| 5,304 | A Scalable AutoML Approach Based on Graph Neural Networks | 2022 | VLDB | 5.5779335e-05 |
| 8,343 | CrowdGame: A Game-Based Crowdsourcing System for Cost-Effective Data Labeling | 2019 | SIGMOD | 4.5429217e-05 |
| 254 | Snorkel: Rapid Training Data Creation with Weak Supervision | 2018 | VLDB | 0.00030540555 |
| 1,215 | Snuba: Automating Weak Supervision to Label Training Data | 2019 | VLDB | 0.0001323375 |
| 6,955 | Inspector Gadget: A Data Programming-based Labeling System for Industrial Images | 2021 | VLDB | 4.8864297e-05 |
| 8,714 | LANCET: Labeling Complex Data at Scale | 2021 | VLDB | 4.4619818e-05 |