Probabilistic Management of OCR Data using an RDBMS
Summary: Stores probabilistic OCR models in an RDBMS to preserve uncertainty, instead of materializing ASCII text. Introduces Staccato, a controllable approximation that trades recall for speed, with formal guarantees and integration with standard RDBMS text indexing. (summarized by gpt-5-nano on Feb 09 2026)
Authors
- 1. Arun Kumar
- 2. Christopher Ré
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,434 | Local Structure and Determinism in Probabilistic Databases | 2012 | SIGMOD | 4.7314358e-05 |
| 8,864 | Cerebro: A Layered Data Platform for Scalable Deep Learning | 2021 | CIDR | 4.4326439e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 12 of 12 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,581 | Anytime Approximation in Probabilistic Databases via Scaled Dissociations | 2019 | SIGMOD | 4.492241e-05 |
| 2,186 | Scalable Probabilistic Databases with Factor Graphs and MCMC | 2010 | VLDB | 9.3378109e-05 |
| 7,623 | Optimizing Probabilistic Query Processing on Continuous Uncertain Data | 2011 | VLDB | 4.6933659e-05 |
| 2,118 | Using Probabilistic Models for Data Management in Acquisitional Environments | 2005 | CIDR | 9.5100739e-05 |
| 1,992 | Probabilistic Ranking of Database Query Results | 2004 | VLDB | 9.8462684e-05 |
| 8,090 | Probabilistic Histograms for Probabilistic Data | 2009 | VLDB | 4.5888589e-05 |
| 4,387 | Hybrid In-Database Inference for Declarative Information Extraction | 2011 | SIGMOD | 6.2320072e-05 |
| 74 | Efficient Query Evaluation on Probabilistic Databases | 2004 | VLDB | 0.00057857292 |
| 760 | Creating Probabilistic Databases from Information Extraction Models | 2006 | VLDB | 0.00017053935 |
| 5,874 | Incrementally Maintaining Classification using an RDBMS | 2011 | VLDB | 5.2930628e-05 |