Algorithms for Mining Distance-Based Outliers in Large Datasets
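Summary: Proposes distance-based outliers (DB-outliers) for large, high-dimensional datasets, extending outlier discovery beyond two-dimensional data. Presents two simple algorithms with complexity O(k N^2), where k is the dimensionality and N is the number of objects; a cell-based algorithm that is linear in N but exponential in k; and a variant of the cell-based algorithm for disk-resident data that needs at most 3 passes over the dataset. Experiments show that the cell-based algorithms perform best for k ≤ 4.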
(summarized by gpt-5-nano on Feb 09 2026)
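To make the simple O(k N^2) approach in the summary concrete, here is a minimal brute-force sketch. It is not the paper's own pseudocode. It assumes the usual DB(p, D) definition, under which an object is an outlier if at least a fraction p of the dataset lies farther than distance D from it, so that an outlier has at most M = N(1 − p) objects within distance D. The function name `db_outliers`, the integer parameter `max_neighbors` (standing in for M), and the choice not to count an object as its own neighbor are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def db_outliers(points, D, max_neighbors):
    """Return indices of points with at most `max_neighbors` other points within distance D.

    Brute force: each of the N points is compared with up to N - 1 others,
    and each distance costs O(k) for k-dimensional points, so O(k N^2) overall.
    `max_neighbors` plays the role of M = N(1 - p); it is passed as an integer
    to avoid floating-point rounding when deriving it from p (an assumption of
    this sketch, not something stated in the source).
    """
    outliers = []
    for i, a in enumerate(points):
        neighbors = 0
        for j, b in enumerate(points):
            if i != j and dist(a, b) <= D:
                neighbors += 1
                if neighbors > max_neighbors:
                    break  # too many close neighbors: not an outlier
        else:
            # Inner loop finished without exceeding the limit.
            outliers.append(i)
    return outliers


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(db_outliers(points, D=2.0, max_neighbors=0))  # → [4]
```

The early `break` stops scanning as soon as a point is known to be a non-outlier, but the worst case remains quadratic in N. The summary's cell-based algorithms trade this for a cost that is linear in N and exponential in k.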
- Paper ID: 8505
- Venue: VLDB
- Year: 1998
- Pagerank: 0.00016865771
- Overall Rank: 774 | 94.62%
- DOI:
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 28 of 28 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 161 | LOF: Identifying Density-Based Local Outliers | 2000 | SIGMOD | 0.00039846974 |
| 701 | Efficient Algorithms for Mining Outliers from Large Data Sets | 2000 | SIGMOD | 0.00017938417 |
| 1,854 | Distance-based Outlier Detection in Data Streams | 2016 | VLDB | 0.00010317762 |
| 1,931 | Efficient Processing of k Nearest Neighbor Joins using MapReduce | 2012 | VLDB | 0.00010040427 |
| 2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05 |
| 2,281 | Epsilon Grid Order: An Algorithm for the Similarity Join on Massive High-Dimensional Data | 2001 | SIGMOD | 9.1077704e-05 |
| 2,506 | Auto-Detect: Data-Driven Error Detection in Tables | 2018 | SIGMOD | 8.6335464e-05 |
| 2,629 | Online Outlier Detection in Sensor Data Using Non-Parametric Models | 2006 | VLDB | 8.4160309e-05 |
| 2,822 | Finding Intensional Knowledge of Distance-Based Outliers | 1999 | VLDB | 8.0608136e-05 |
| 3,012 | NETS: Extremely Fast Outlier Detection from a Data Stream via Set-Based Processing | 2019 | VLDB | 7.7153586e-05 |
| 3,171 | Interactive Outlier Exploration in Big Data Streams | 2014 | VLDB | 7.4447236e-05 |
| 3,468 | Real-Time Distance-Based Outlier Detection in Data Streams | 2021 | VLDB | 7.0686044e-05 |
| 3,920 | Continuous Outlier Detection in Data Streams: An Extensible Framework and State-Of-The-Art Algorithms | 2013 | SIGMOD | 6.6309693e-05 |
| 4,456 | AutoOD: Automatic Outlier Detection | 2023 | SIGMOD | 6.1704203e-05 |
| 4,552 | Outlier Detection for High Dimensional Data | 2001 | SIGMOD | 6.0922282e-05 |
| 4,554 | A Demonstration of AutoOD: A Self-Tuning Anomaly Detection System | 2022 | VLDB | 6.0911296e-05 |
| 6,107 | Continuously Adaptive Similarity Search | 2020 | SIGMOD | 5.2066612e-05 |
| 6,544 | A Framework for Measuring Changes in Data Characteristics | 1999 | PODS | 5.0202405e-05 |
| 6,991 | Sharing-Aware Outlier Analytics over High-Volume Data Streams | 2016 | SIGMOD | 4.8702811e-05 |
| 8,083 | A New Distributional Treatment for Time Series and An Anomaly Detection Investigation | 2022 | VLDB | 4.5903492e-05 |
| 8,707 | Multiple Dynamic Outlier-Detection from a Data Stream by Exploiting Duality of Data and Queries | 2021 | SIGMOD | 4.463922e-05 |
| 9,035 | Data-Driven Insight Synthesis for Multi-Dimensional Data | 2024 | VLDB | 4.4039656e-05 |
| 9,420 | Local Search Methods for k-Means with Outliers | 2017 | VLDB | 4.3441378e-05 |
| 9,924 | On Saving Outliers for Better Clustering over Noisy Data | 2021 | SIGMOD | 4.2544238e-05 |
| 10,019 | Guardrail: Automated Integrity Constraint Synthesis From Noisy Data | 2026 | SIGMOD | 4.1945683e-05 |
| 10,029 | Outliers: The Good, the Bad and the Ugly | 2026 | SIGMOD | 4.1945683e-05 |
| 10,512 | Auto-Test: Learning Semantic-Domain Constraints for Unsupervised Error Detection in Tables | 2025 | SIGMOD | 4.1945683e-05 |
| 11,467 | Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach | 2021 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,480 | Towards Metric DBSCAN: Exact, Approximate, and Streaming Algorithms | 2024 | SIGMOD | 4.7180617e-05 |
| 9,420 | Local Search Methods for k-Means with Outliers | 2017 | VLDB | 4.3441378e-05 |
| 10,003 | Clustering with Set Outliers and Applications in Relational Clustering | 2026 | PODS | 4.1945683e-05 |
| 9,924 | On Saving Outliers for Better Clustering over Noisy Data | 2021 | SIGMOD | 4.2544238e-05 |
| 3,920 | Continuous Outlier Detection in Data Streams: An Extensible Framework and State-Of-The-Art Algorithms | 2013 | SIGMOD | 6.6309693e-05 |
| 2,822 | Finding Intensional Knowledge of Distance-Based Outliers | 1999 | VLDB | 8.0608136e-05 |
| 11,467 | Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach | 2021 | SIGMOD | 4.1945683e-05 |
| 4,552 | Outlier Detection for High Dimensional Data | 2001 | SIGMOD | 6.0922282e-05 |
| 9,787 | Distance-Based Outlier Detection: Consolidation and Renewed Bearing | 2010 | VLDB | 4.2823546e-05 |
| 701 | Efficient Algorithms for Mining Outliers from Large Data Sets | 2000 | SIGMOD | 0.00017938417 |