Adaptive Rule Discovery for Labeling Text Data
Summary: Darwin is a weakly supervised system for labeling text with feedback. Starting from an initial cue, it automatically generates and refines labeling rules, and it scales to 1M+ sentences. Its labeling functions are CFG-based, and it yields ~40% more positives than Snuba with the same effort. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Sainyam Galhotra
2. Behzad Golshan
3. Wang-Chiew Tan
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,872 | Explainable AI: Foundations, Applications, Opportunities for Data Management Research | 2022 | SIGMOD | 5.8609352e-05 |
| 7,288 | Witan: Unsupervised Labelling Function Generation for Assisted Data Programming | 2022 | VLDB | 4.7762276e-05 |
| 8,292 | Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming | 2022 | VLDB | 4.5435639e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 254 | Snorkel: Rapid Training Data Creation with Weak Supervision | 2018 | VLDB | 0.00030540555 |
| 398 | Big Data Integration | 2013 | VLDB | 0.00024372588 |
| 1,215 | Snuba: Automating Weak Supervision to Label Training Data | 2019 | VLDB | 0.0001323375 |
| 3,897 | SLiMFast: Guaranteed Results for Data Fusion and Source Reliability | 2017 | SIGMOD | 6.6554845e-05 |
| 6,868 | Cost-Effective Data Annotation using Game-Based Crowdsourcing | 2019 | VLDB | 4.9010083e-05 |
| 7,766 | ICARUS: Minimizing Human Effort in Iterative Data Completion | 2018 | VLDB | 4.6564959e-05 |
| 8,585 | Robust Entity Resolution using Random Graphs | 2018 | SIGMOD | 4.4905755e-05 |
| 11,755 | Scalable Semantic Querying of Text | 2018 | VLDB | 4.1945683e-05 |