Online Deduplication for Databases
Summary: dbDedup is a similarity-based online deduplication scheme for DBMSs that uses byte-level delta encoding of records. Its single-pass encoding integrates with storage and the oplog, delivering reductions of up to 37x in storage and 61x in replication with negligible impact on throughput. (summarized by gpt-5-nano on Feb 09 2026)
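To make the idea of byte-level delta encoding concrete, here is a minimal sketch, not dbDedup's actual algorithm. It assumes a similar base record has already been found, and it uses Python's standard `difflib.SequenceMatcher` as a stand-in for the paper's encoder. The names `encode_delta` and `decode_delta` and the sample records are hypothetical.

```python
from difflib import SequenceMatcher


def encode_delta(base: bytes, target: bytes) -> list:
    """Describe target as copy/insert operations relative to base.

    Illustrative only: SequenceMatcher stands in for a real delta encoder.
    """
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes shared with the base: store only (offset, length).
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Bytes that appear only in the target: store them literally.
            ops.append(("insert", target[j1:j2]))
        # "delete" spans exist only in the base, so nothing is emitted.
    return ops


def decode_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the target record from the base record and the operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


# Hypothetical records: two revisions of the same document.
base = b'{"id": 7, "body": "hello world", "rev": 1}'
target = b'{"id": 7, "body": "hello world!", "rev": 2}'

ops = encode_delta(base, target)
literal_bytes = sum(len(op[1]) for op in ops if op[0] == "insert")
restored = decode_delta(base, ops)
```

Only the literal bytes and the small copy descriptors would need to be stored or shipped to a replica. The more similar the chosen base record, the smaller the delta becomes.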
Authors
1. Lianghong Xu
2. Andrew Pavlo
3. Sudipta Sengupta
4. Gregory R. Ganger
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,972 | ForkBase: An Efficient Storage Engine for Blockchain and Forkable Applications | 2018 | VLDB | 7.79259e-05 |
| 5,048 | Put an Elephant into a Fridge: Optimizing Cache Efficiency for In-memory Key-value Stores | 2020 | VLDB | 5.7378052e-05 |
| 6,367 | Good to the Last Bit: Data-Driven Encoding with CodecDB | 2021 | SIGMOD | 5.0941072e-05 |
| 8,698 | Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask | 2024 | VLDB | 4.4657846e-05 |
| 10,192 | Performant Synchronization in Geo-Distributed Databases | 2026 | SIGMOD | 4.1945683e-05 |
| 10,291 | Morphing-based Compression for Data-centric ML Pipelines | 2026 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497 |
| 131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331 |
| 241 | DB2 with BLU Acceleration: So Much More than Just a Column Store | 2013 | VLDB | 0.00031420034 |
| 710 | Performance Tradeoffs in Read-Optimized Databases | 2006 | VLDB | 0.00017765454 |
| 898 | Data Compression Support in Databases | 1994 | VLDB | 0.00015525779 |
| 1,134 | Dictionary-based Order-preserving String Compression for Main Memory Column Stores | 2009 | SIGMOD | 0.00013761456 |
| 1,417 | Data Compression in Oracle | 2003 | VLDB | 0.00012104308 |
| 8,274 | XANADUE: A System for Detecting Changes to XML Data in Tree-Unaware Relational Databases | 2007 | SIGMOD | 4.544567e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 800 | An Efficient, Fault-Tolerant Protocol For Replicated Data Management | 1985 | PODS | 0.00016543841 |
| 131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331 |
| 6,157 | Compression Aware Physical Database Design | 2011 | VLDB | 5.1801143e-05 |
| 3,689 | Compacting Transactional Data in Hybrid OLTP&OLAP Databases | 2012 | VLDB | 6.8396366e-05 |
| 4,597 | Scalable Replay-Based Replication For Fast Databases | 2017 | VLDB | 6.0588467e-05 |
| 941 | Performance Enhancement Through Replication in an Object-Oriented DBMS | 1989 | SIGMOD | 0.00015158552 |
| 9,918 | Shared Load(ing): Efficient Bulk Loading into Optimized Storage | 2020 | CIDR | 4.2561557e-05 |
| 7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05 |
| 7,429 | CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases | 2022 | SIGMOD | 4.7320139e-05 |
| 3,528 | Distributed Data Deduplication | 2016 | VLDB | 7.0066139e-05 |