Database Paper Browser

AnHai Doan

Author ID: 8
ORCID: -
Links: (found by gpt-5.2 on Feb 8th, 2026)
Most Frequent Institution: University of Wisconsin
Pagerank: 0.39738072
Overall Rank: 91 | 99.58%
Paper Count: 43

Affiliation Timeline

Incoming Non-self Citations Over Time

Total yearly non-self incoming citations across all papers by this author.

Publications by Paper Pagerank

Showing 43 of 43 publications.

Rank Title Year Venue Pagerank
208 Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach 2001 SIGMOD 0.0003460594
221 Deep Entity Matching with Pre-Trained Language Models 2021 VLDB 0.00033121824
287 Declarative Information Extraction Using Datalog with Embedded Extraction Predicates 2007 VLDB 0.00028971272
300 Deep Learning for Entity Matching: A Design Space Exploration 2018 SIGMOD 0.00028441466
643 Corleone: Hands-Off Crowdsourcing for Entity Matching 2014 SIGMOD 0.00018754451
652 On the Provenance of Non-Answers to Queries over Extracted Data 2008 VLDB 0.00018634477
672 An Interactive Clustering-based Approach to Integrating Source Query Interfaces on the Deep Web 2004 SIGMOD 0.00018355746
712 Magellan: Toward Building Entity Matching Management Systems 2016 VLDB 0.00017732426
1,014 Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS 2011 VLDB 0.00014640258
1,198 Crossing the Structure Chasm 2003 CIDR 0.00013366708
1,716 Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing 2014 VLDB 0.00010795718
1,722 Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach 2007 VLDB 0.00010757784
1,762 Tuning Schema Matching Software using Synthetic Scenarios 2005 VLDB 0.00010646894
2,066 DBLife: A Community Information Management Platform for the Database Research Community 2007 CIDR 9.6399561e-05
2,174 iMAP: Discovering Complex Semantic Matches between Database Schemas 2004 SIGMOD 9.3672342e-05
2,175 Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services 2017 SIGMOD 9.3644117e-05
2,605 Muppet: MapReduce-Style Processing of Fast Data 2012 VLDB 8.4646171e-05
2,771 A Relational Approach to Incrementally Extracting and Querying Structure in Unstructured Data 2007 VLDB 8.1421432e-05
2,847 Building, Maintaining, and Using Knowledge Bases: A Report from the Trenches 2013 SIGMOD 8.0224023e-05
2,984 Efficiently Incorporating User Feedback into Information Extraction and Integration Programs 2009 SIGMOD 7.7796344e-05
3,372 OLAP over Imprecise Data with Domain Constraints 2007 VLDB 7.1683982e-05
3,477 Toward Best-Effort Information Extraction 2008 SIGMOD 7.0583481e-05
3,640 Deep Learning for Blocking in Entity Matching: A Design Space Exploration 2021 VLDB 6.8891671e-05
4,402 Smurf: Self-Service String Matching Using Random Forests 2019 VLDB 6.2195162e-05
4,464 Magellan: Toward Building Entity Matching Management Systems over Data Science Stacks 2016 VLDB 6.1606042e-05
5,056 Crowds, Clouds, and Algorithms: Exploring the Human Side of "Big Data" Applications 2010 SIGMOD 5.73044e-05
5,174 Mapping Maintenance for Data Integration Systems 2005 VLDB 5.6443463e-05
5,419 Combining Keyword Search and Forms for Ad Hoc Querying of Databases 2009 SIGMOD 5.5176475e-05
5,431 Entity Extraction, Linking, Classification, and Tagging for Social Media: A Wikipedia-Based Approach 2013 VLDB 5.5076946e-05
5,450 Crowdsourcing Applications and Platforms: A Data Management Perspective 2011 VLDB 5.5003491e-05
6,111 Why Big Data Industrial Systems Need Rules and What We Can Do About It 2015 SIGMOD 5.2049579e-05
6,747 Entity Matching Meets Data Science: A Progress Report from the Magellan Project 2019 SIGMOD 4.9408824e-05
6,754 Modeling Entity Evolution for Temporal Record Matching 2014 SIGMOD 4.9384574e-05
7,706 Tracking Entities in the Dynamic World: A Fast Algorithm for Matching Temporal Records 2014 VLDB 4.6723595e-05
8,099 Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching 2023 VLDB 4.5859317e-05
8,766 Toward Scalable Keyword Search over Relational Data 2010 VLDB 4.456315e-05
8,824 Analyzing and Revising Data Integration Schemas to Improve Their Matchability 2008 VLDB 4.4415658e-05
9,577 CoClean: Collaborative Data Cleaning 2020 SIGMOD 4.3248438e-05
9,635 Optimizing Complex Extraction Programs over Evolving Text Data 2009 SIGMOD 4.3118125e-05
11,739 CloudMatcher: A Hands-Off Cloud/Crowd Service for Entity Matching 2018 VLDB 4.1945683e-05
12,286 The Case for a Structured Approach to Managing Unstructured Data 2009 CIDR 4.1945683e-05
13,227 Cloud Data Systems: What are the Opportunities for the Database Research Community? 2022 VLDB -
13,626 Managing Information Extraction [Tutorial Outline] 2006 SIGMOD -

Frequent Co-authors

Co-authors who share at least 5 papers with this author.

Co-author Shared Papers Rank Pagerank
Jeffrey Naughton 13 7 1.0202982
Raghu Ramakrishnan 7 10 0.94534432
Sanjib Das 7 989 0.067944257
Ganesh Krishnan 5 1,253 0.054921394
Rohit Deep 5 1,254 0.054921394
Vijay Raghavendra 5 1,255 0.054921394
Han Li 5 1,298 0.053359107
Warren Shen 5 1,351 0.052089198
Xiaoyong Chai 5 2,043 0.036810751