Online Template Induction for Machine-Generated Emails
Summary: Crusher discovers email templates online, as machine-generated emails arrive. This cuts template discovery delay from weeks to minutes and achieves order-of-magnitude throughput gains over batch systems, while exposing the limitations of conventional stream engines for this task. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Michael Whittaker
2. Nick Edmonds
3. Sandeep Tata
4. James B. Wendt
5. Marc Najork
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,543 | Migrating a Privacy-Safe Information Extraction System to a Software 2.0 Design | 2020 | CIDR | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 142 | TelegraphCQ: Continuous Dataflow Processing for an Uncertain World | 2003 | CIDR | 0.00041725802 |
| 191 | The Design of the Borealis Stream Processing Engine | 2005 | CIDR | 0.00035738595 |
| 288 | Storm @Twitter | 2014 | SIGMOD | 0.00028939871 |
| 314 | MillWheel: Fault-Tolerant Stream Processing at Internet Scale | 2013 | VLDB | 0.00028084774 |
| 587 | Extracting Structured Data from Web Pages | 2003 | SIGMOD | 0.00019648348 |
| 824 | Twitter Heron: Stream Processing at Scale | 2015 | SIGMOD | 0.0001623129 |
| 1,467 | SPADE: The System S Declarative Stream Processing Engine | 2008 | SIGMOD | 0.00011849864 |
| 2,338 | Samza: Stateful Scalable Stream Processing at LinkedIn | 2017 | VLDB | 9.00711e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,240 | Autonomously Computable Information Extraction | 2023 | VLDB | 4.1945683e-05 |
| 4,137 | Exploiting Content Redundancy for Web Information Extraction | 2010 | VLDB | 6.4181549e-05 |
| 979 | Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads | 2012 | VLDB | 0.0001488055 |
| 13,360 | Faster Evaluation of Labor-Intensive Features | 2015 | CIDR | - |
| 12,400 | Ad-Hoc Data Processing in the Cloud | 2008 | VLDB | 4.1945683e-05 |
| 587 | Extracting Structured Data from Web Pages | 2003 | SIGMOD | 0.00019648348 |
| 8,313 | Resource-Adaptive Real-Time New Event Detection | 2007 | SIGMOD | 4.5435639e-05 |
| 2,476 | A Platform for Scalable One-Pass Analytics using MapReduce | 2011 | SIGMOD | 8.6960139e-05 |
| 1,464 | Online Aggregation for Large MapReduce Jobs | 2011 | VLDB | 0.00011865546 |
| 12,193 | Auto-Grouping Emails For Faster E-Discovery | 2011 | VLDB | 4.1945683e-05 |