Pluto: Sample Selection for Robust Anomaly Detection on Polluted Log Data
Summary: Pluto automatically selects a clean subset of polluted log data on which to train a Transformer-based anomaly detector. It fits Gaussian mixtures to identify and discard polluted regions of the embedding space, purifies the remaining samples with a greedy facility-location procedure that carries a (1−1/e) approximation guarantee, and repeats this selection as training iterates. (summarized by gpt-5-nano on Feb 09 2026)
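The greedy facility-location step mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_facility_location`, the `budget` parameter, the RBF similarity, and the toy 2-D "embeddings" are all assumptions made for illustration. It only shows the generic greedy rule for maximizing a facility-location objective, which is where a (1−1/e) guarantee comes from when similarities are non-negative.

```python
import numpy as np


def greedy_facility_location(similarity: np.ndarray, budget: int) -> list[int]:
    """Greedily pick up to `budget` indices S that maximize
    sum_i max_{j in S} similarity[i, j] (hypothetical helper).

    Assumes `similarity` is a square matrix with non-negative entries.
    """
    n = similarity.shape[0]
    selected: list[int] = []
    # best_cover[i] is the similarity of point i to its best selected representative.
    best_cover = np.zeros(n)
    for _ in range(min(budget, n)):
        # Marginal gain in the objective from adding each candidate column j.
        gains = np.maximum(similarity, best_cover[:, None]).sum(axis=0) - best_cover.sum()
        if selected:
            gains[selected] = -np.inf  # never pick the same index twice
        j = int(np.argmax(gains))
        selected.append(j)
        best_cover = np.maximum(best_cover, similarity[:, j])
    return selected


# Toy data (assumed): two tight clusters of three points each.
points = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
)
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
similarity = np.exp(-sq_dists)  # RBF similarity, an illustrative choice

representatives = greedy_facility_location(similarity, budget=2)
print(representatives)
```

With a budget of two, the greedy rule picks one representative from each cluster, because a second point from an already covered cluster adds almost nothing to the objective.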
Authors
1. Lei Ma
2. Lei Cao
3. Peter M. VanNostrand
4. Dennis M. Hofmann
5. Yao Su
6. Elke A. Rundensteiner
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,218 | Unseen Anomaly Detection from System Logs | 2026 | SIGMOD | 4.1945683e-05 |
| 10,713 | CoLA: Model Collaboration for Log-based Anomaly Detection | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 161 | LOF: Identifying Density-Based Local Outliers | 2000 | SIGMOD | 0.00039846974 |
| 4,154 | Robust and Transferable Log-based Anomaly Detection | 2023 | SIGMOD | 6.4032498e-05 |
| 4,911 | Unsupervised Contextual Anomaly Detection for Database Systems | 2022 | SIGMOD | 5.8328593e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,713 | CoLA: Model Collaboration for Log-based Anomaly Detection | 2025 | VLDB | 4.1945683e-05 |
| 5,984 | Streaming Anomaly Detection Using Randomized Matrix Sketching | 2016 | VLDB | 5.244512e-05 |
| 6,440 | An Experimental Evaluation of Anomaly Detection in Time Series | 2024 | VLDB | 5.0603878e-05 |
| 10,658 | LLMLog: Advanced Log Template Generation via LLM-driven Multi-Round Annotation | 2025 | VLDB | 4.1945683e-05 |
| 2,290 | TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data | 2022 | VLDB | 9.0934125e-05 |
| 10,876 | MLP-Mixer based Masked Autoencoders Are Effective, Explainable and Robust for Time Series Anomaly Detection | 2025 | VLDB | 4.1945683e-05 |
| 6,897 | PreLog: A Pre-trained Model for Log Analytics | 2024 | SIGMOD | 4.8925595e-05 |
| 10,218 | Unseen Anomaly Detection from System Logs | 2026 | SIGMOD | 4.1945683e-05 |
| 9,872 | Substructure-aware Log Anomaly Detection | 2025 | VLDB | 4.2667743e-05 |
| 4,154 | Robust and Transferable Log-based Anomaly Detection | 2023 | SIGMOD | 6.4032498e-05 |