ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models
Summary: ArcheType uses LLMs for fully zero-shot column type annotation via context sampling, prompt serialization, and label remapping. It establishes new zero-shot SOTA, ships three domain benchmarks, and when combined with classical CTA beats fine-tuned DoDuo. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Benjamin Feuer
2. Yurong Liu
3. Chinmay Hegde
4. Juliana Freire
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 11 of 11 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 4.8377684e-04 |
| 112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 4.7045036e-04 |
| 513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 2.1288342e-04 |
| 517 | Can Foundation Models Wrangle Your Data? | 2023 | VLDB | 2.1169035e-04 |
| 2,517 | Annotating Columns with Pre-trained Language Models | 2022 | SIGMOD | 8.6092139e-05 |
| 2,888 | Sato: Contextual Semantic Type Detection in Tables | 2020 | VLDB | 7.9594996e-05 |
| 3,000 | SANTOS: Relationship-based Semantic Table Union Search | 2023 | SIGMOD | 7.7462128e-05 |
| 3,015 | Chorus: Foundation Models for Unified Data Discovery and Exploration | 2024 | VLDB | 7.7092391e-05 |
| 3,520 | GitTables: A Large-Scale Corpus of Relational Tables | 2023 | SIGMOD | 7.0131061e-05 |
| 4,212 | Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | 2023 | SIGMOD | 6.3555142e-05 |
| 5,529 | Data-Driven Domain Discovery for Structured Datasets | 2020 | VLDB | 5.4566641e-05 |