references/anomaly_detection.md

Anomaly Detection

Aeon provides anomaly detection methods for identifying unusual patterns in time series at both series and collection levels.

Collection Anomaly Detectors

Detect anomalous time series within a collection:

  • ClassificationAdapter - Adapts classifiers for anomaly detection
    • Train on normal data, flag outliers during prediction
    • Use when: Have labeled normal data, want classification-based approach
  • OutlierDetectionAdapter - Wraps sklearn outlier detectors
    • Works with IsolationForest, LOF, OneClassSVM
    • Use when: Want to use sklearn anomaly detectors on collections (see the sketch after this list)
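A minimal sketch of the OutlierDetectionAdapter item above, wrapping sklearn's IsolationForest to flag anomalous series in a collection. The import path and the fit_predict interface are assumptions about aeon's collection-detector API, so check the API reference for your installed version.

from sklearn.ensemble import IsolationForest
import numpy as np

# Assumed import path; collection detectors may live in a submodule
from aeon.anomaly_detection.collection import OutlierDetectionAdapter

# Collection of 20 univariate series of length 50; make one clearly anomalous
X = np.random.default_rng(0).normal(size=(20, 1, 50))
X[7] += 5.0

# Wrap an sklearn outlier detector so it scores whole series in the collection
detector = OutlierDetectionAdapter(IsolationForest(random_state=0))
labels = detector.fit_predict(X)  # one label per series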

Series Anomaly Detectors

Detect anomalous points or subsequences within a single time series.

Distance-Based Methods

Use similarity metrics to identify anomalies:

  • CBLOF - Cluster-Based Local Outlier Factor
    • Clusters the data, identifies outliers from cluster size and distance
    • Use when: Anomalies form small, sparse clusters
  • KMeansAD - K-means based anomaly detection
    • Distance to the nearest cluster center indicates how anomalous a window is (see the sketch after this list)
    • Use when: Normal patterns cluster well
  • LeftSTAMPi - Incremental left STAMP
    • Computes a left matrix profile online for streaming anomaly detection
    • Use when: Streaming data, need online detection
  • STOMP - Scalable Time series Ordered-search Matrix Profile
    • Computes the matrix profile to score subsequence anomalies (discords)
    • Use when: Discord discovery; the same profile also reveals motifs
  • MERLIN - Parameter-free discord discovery
    • Searches for discords across a range of subsequence lengths
    • Use when: Large time series, or the anomaly length is not known in advance

  • LOF - Local Outlier Factor adapted for time series
    • Density-based outlier detection
    • Use when: Anomalies in low-density regions
  • ROCKAD - ROCKET-based semi-supervised detection
    • Uses ROCKET features for anomaly identification
    • Use when: Have labeled normal data, want a feature-based approach
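To make the clustering-based approach concrete, here is a minimal KMeansAD sketch following the Quick Start's import style. The n_clusters and window_size values are illustrative, and the exact parameter names should be confirmed against the KMeansAD docstring.

from aeon.anomaly_detection import KMeansAD
import numpy as np

rng = np.random.default_rng(1)
y = np.sin(np.linspace(0, 40, 400)) + 0.1 * rng.normal(size=400)
y[200:210] += 3.0  # inject a subsequence anomaly

# Windows far from every cluster centre receive high anomaly scores
detector = KMeansAD(n_clusters=5, window_size=20)
scores = detector.fit_predict(y)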

Distribution-Based Methods

Analyze statistical distributions:

  • COPOD - Copula-Based Outlier Detection
    • Models the marginal distributions and combines them with an empirical copula
    • Use when: Multivariate series with complex dependencies between channels
  • DWT_MLEAD - Discrete Wavelet Transform with Maximum-Likelihood Estimation Anomaly Detection
    • Decomposes the series into frequency bands and flags unlikely coefficients (see the sketch after this list)
    • Use when: Anomalies concentrated at specific frequencies
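A minimal DWT_MLEAD sketch with its default construction, again following the Quick Start's import style; consult the docstring for the wavelet-decomposition options, which are not shown here.

from aeon.anomaly_detection import DWT_MLEAD
import numpy as np

rng = np.random.default_rng(2)
y = np.sin(np.linspace(0, 60, 600)) + 0.05 * rng.normal(size=600)
y[300] += 4.0  # point anomaly

# Decomposes the series with discrete wavelet transforms and scores
# unlikely coefficients; defaults used here
detector = DWT_MLEAD()
scores = detector.fit_predict(y)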

Isolation-Based Methods

Use isolation principles:

  • IsolationForest - Ensemble of random isolation trees
    • Anomalies are isolated with fewer random splits than normal points (see the sketch after this list)
    • Use when: High-dimensional data, no assumptions about the distribution
  • OneClassSVM - Support vector machine for novelty detection
    • Learns a boundary around the normal data
    • Use when: Well-defined normal region, need a robust boundary
  • STRAY - Streaming Robust Anomaly Detection
    • Robust to changes in the data distribution
    • Use when: Streaming data, distribution shifts
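A minimal sketch of the windowed IsolationForest detector. The window_size argument is an assumption modelled on the other windowed detectors; verify it against the class signature in your aeon version.

from aeon.anomaly_detection import IsolationForest
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(size=500)
y[250:255] += 6.0  # short collective spike

# window_size (assumed parameter name) sets the subsequence length scored
# by the isolation trees
detector = IsolationForest(window_size=16)
scores = detector.fit_predict(y)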

External Library Integration

  • PyODAdapter - Bridges the PyOD library to aeon
    • Access 40+ PyOD anomaly detectors (see the sketch after this list)
    • Use when: Need a specific PyOD algorithm
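A minimal PyODAdapter sketch wrapping PyOD's LOF detector; it requires the optional pyod dependency. Passing a PyOD model instance plus a window_size is the assumed calling convention, so confirm it in the PyODAdapter docstring.

from aeon.anomaly_detection import PyODAdapter
from pyod.models.lof import LOF  # optional pyod dependency
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(size=300)
y[150] += 8.0  # point anomaly

# The adapter slides a window over the series so the tabular PyOD model
# scores subsequences rather than raw points
detector = PyODAdapter(LOF(n_neighbors=20), window_size=16)
scores = detector.fit_predict(y)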

Quick Start

from aeon.anomaly_detection import STOMP
import numpy as np

# Create time series with anomaly
y = np.concatenate([
    np.sin(np.linspace(0, 10, 100)),
    [5.0],  # Anomaly spike
    np.sin(np.linspace(10, 20, 100))
])

# Detect anomalies
detector = STOMP(window_size=10)
anomaly_scores = detector.fit_predict(y)

# Higher scores indicate more anomalous points
threshold = np.percentile(anomaly_scores, 95)
anomalies = anomaly_scores > threshold

Point vs Subsequence Anomalies

  • Point anomalies: Single unusual values
    • Use: COPOD, DWT_MLEAD, IsolationForest
  • Subsequence anomalies (discords): Unusual patterns over consecutive points
    • Use: STOMP, LeftSTAMPi, MERLIN
  • Collective anomalies: Groups of points forming an unusual pattern
    • Use: Matrix profile methods, clustering-based methods

Evaluation Metrics

Specialized metrics for anomaly detection:

from aeon.benchmarking.metrics.anomaly_detection import (
    range_precision,
    range_recall,
    range_f_score,
    roc_auc_score
)
import numpy as np

# Binary ground-truth labels and binary predictions (1 = anomalous point)
y_true = np.array([0, 0, 1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 0])

# Range-based metrics score detected anomaly ranges rather than single points;
# alpha weights the reward for finding a range at all against the overlap size
precision = range_precision(y_true, y_pred, alpha=0.5)
recall = range_recall(y_true, y_pred, alpha=0.5)
# range_f_score takes separate alpha weights for its precision and recall parts
f1 = range_f_score(y_true, y_pred, p_alpha=0.5, r_alpha=0.5)
# roc_auc_score expects continuous anomaly scores rather than binary predictions

Algorithm Selection

  • Speed priority: KMeansAD, IsolationForest
  • Accuracy priority: STOMP, COPOD
  • Streaming data: LeftSTAMPi, STRAY
  • Discord discovery: STOMP, MERLIN
  • Multi-dimensional: COPOD, PyODAdapter
  • Semi-supervised: ROCKAD, OneClassSVM
  • No training data: IsolationForest, STOMP

Best Practices

  1. Normalize data: Many methods are sensitive to scale
  2. Choose the window size: For matrix profile methods the window size is critical
  3. Set a threshold: Use percentile-based or domain-specific thresholds
  4. Validate results: Visualize detections to verify they are meaningful
  5. Handle seasonality: Detrend/deseasonalize before detection (see the sketch below)
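A minimal preprocessing sketch for practices 1 and 5 using only numpy: remove a linear trend and z-normalize before running any detector. Seasonal adjustment (e.g. with statsmodels) is omitted to keep the example dependency-free.

import numpy as np

rng = np.random.default_rng(5)
t = np.arange(500)
y = 0.01 * t + np.sin(t / 10) + 0.1 * rng.normal(size=500)  # trend + cycle + noise

# Practice 5 (simplest case): subtract a least-squares linear trend
slope, intercept = np.polyfit(t, y, deg=1)
detrended = y - (slope * t + intercept)

# Practice 1: z-normalize so scale-sensitive detectors behave consistently
y_clean = (detrended - detrended.mean()) / detrended.std()
# y_clean can now be passed to any detector above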
← Back to aeon