references/transformations.md

Transformations

Aeon provides extensive transformation capabilities for preprocessing, feature extraction, and representation learning from time series data.

Transformation Types

Aeon distinguishes between:

  • CollectionTransformers - Transform multiple time series (collections)
  • SeriesTransformers - Transform individual time series

Collection Transformers

Convolution-Based Feature Extraction

Fast, scalable feature generation using random kernels:

  • RocketTransformer - Random convolutional kernels
  • MiniRocketTransformer - Simplified ROCKET for speed
  • MultiRocketTransformer - Enhanced ROCKET variant
  • HydraTransformer - Multi-resolution dilated convolutions
  • MultiRocketHydraTransformer - Combines ROCKET and Hydra
  • ROCKETGPU - GPU-accelerated variant

Use when: Need fast, scalable features for any ML algorithm, strong baseline performance.

Statistical Feature Extraction

Domain-agnostic features based on time series characteristics:

  • Catch22 - 22 canonical time-series characteristics
  • TSFresh - Comprehensive automated feature extraction (hundreds of features)
  • TSFreshRelevant - Feature extraction with relevance filtering
  • SevenNumberSummary - Descriptive statistics (mean, std, quantiles)

Use when: Need interpretable features, domain-agnostic approach, or feeding traditional ML.
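
For illustration, a minimal Catch22 sketch; the feature_based import path is an assumption and may differ between aeon versions:

import numpy as np
from aeon.transformations.collection.feature_based import Catch22

# 20 univariate series, 100 time points each
X = np.random.default_rng(0).normal(size=(20, 1, 100))

# Summarize each series with the 22 canonical characteristics
c22 = Catch22()
X_features = c22.fit_transform(X)  # expected shape: (20, 22)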

Dictionary-Based Representations

Symbolic approximations for discrete representations:

  • SAX - Symbolic Aggregate approXimation
  • PAA - Piecewise Aggregate Approximation
  • SFA - Symbolic Fourier Approximation
  • SFAFast - Optimized SFA
  • SFAWhole - SFA on entire series (no windowing)
  • BORF - Bag-of-Receptive-Fields

Use when: Need discrete/symbolic representation, dimensionality reduction, interpretability.
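
A minimal SAX sketch; the import path and the n_segments/alphabet_size parameter names are assumptions that may vary by aeon version:

import numpy as np
from aeon.transformations.collection.dictionary_based import SAX

X = np.random.default_rng(0).normal(size=(10, 1, 128))

# Discretize each series into 8 segments over a 4-letter alphabet
sax = SAX(n_segments=8, alphabet_size=4)
X_symbolic = sax.fit_transform(X)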

Shapelet-Based Features

Discriminative subsequence extraction:

  • RandomShapeletTransform - Random discriminative shapelets
  • RandomDilatedShapeletTransform - Dilated shapelets for multi-scale
  • SAST - Scalable And Accurate Subsequence Transform
  • RSAST - Randomized SAST

Use when: Need interpretable discriminative patterns, phase-invariant features.
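
Shapelet transforms are supervised, so labels are passed at fit time. A minimal sketch, assuming the shapelet_based import path below:

from aeon.datasets import load_classification
from aeon.transformations.collection.shapelet_based import RandomShapeletTransform

X_train, y_train = load_classification("GunPoint", split="train")

# Labels are required at fit time to score candidate shapelets
st = RandomShapeletTransform()
X_shapelet_features = st.fit_transform(X_train, y_train)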

Interval-Based Features

Statistical summaries from time intervals:

  • RandomIntervals - Features from random intervals
  • SupervisedIntervals - Supervised interval selection
  • QUANTTransformer - Quantile-based interval features

Use when: Predictive patterns localized to specific windows.
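
A minimal RandomIntervals sketch with default settings; the interval_based import path is an assumption:

import numpy as np
from aeon.transformations.collection.interval_based import RandomIntervals

X = np.random.default_rng(0).normal(size=(20, 1, 100))

# Pool summary statistics from randomly sampled intervals
ri = RandomIntervals()
X_interval_features = ri.fit_transform(X)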

Preprocessing Transformations

Data preparation and normalization:

  • MinMaxScaler - Scale to [0, 1] range
  • Normalizer - Z-normalization (zero mean, unit variance)
  • Centerer - Center to zero mean
  • SimpleImputer - Fill missing values
  • DownsampleTransformer - Reduce temporal resolution
  • Tabularizer - Convert time series to tabular format

Use when: Need standardization, missing value handling, format conversion.

Specialized Transformations

Advanced analysis methods:

  • MatrixProfile - Computes distance profiles for pattern discovery
  • DWTTransformer - Discrete Wavelet Transform
  • AutocorrelationFunctionTransformer - ACF computation
  • Dobin - Distance-based Outlier BasIs using Neighbors
  • SignatureTransformer - Path signature methods
  • PLATransformer - Piecewise Linear Approximation
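
As one example, a minimal MatrixProfile sketch; the import path is an assumption, and the transformer is taken to replace each series with its matrix profile as described above:

import numpy as np
from aeon.transformations.collection import MatrixProfile

X = np.random.default_rng(0).normal(size=(5, 1, 200))

# Replace each series with its matrix profile of nearest-neighbor distances
mp = MatrixProfile()
X_profiles = mp.fit_transform(X)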

Class Imbalance Handling

  • ADASYN - Adaptive Synthetic Sampling
  • SMOTE - Synthetic Minority Over-sampling
  • OHIT - Over-sampling with Highly Imbalanced Time series

Use when: Classification with imbalanced classes.

Pipeline Composition

  • CollectionTransformerPipeline - Chain multiple transformers

Series Transformers

Transform individual time series (e.g., for preprocessing in forecasting).

Statistical Analysis

  • AutoCorrelationSeriesTransformer - Autocorrelation
  • StatsModelsACF - ACF using statsmodels
  • StatsModelsPACF - Partial autocorrelation
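
A minimal autocorrelation sketch using the class listed above with default parameters; the import path is an assumption:

import numpy as np
from aeon.transformations.series import AutoCorrelationSeriesTransformer

# A periodic series: its ACF peaks at multiples of the period
y = np.sin(np.linspace(0, 20 * np.pi, 500))

acf = AutoCorrelationSeriesTransformer()
y_acf = acf.fit_transform(y)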

Smoothing and Filtering

  • ExponentialSmoothing - Exponentially weighted moving average
  • MovingAverage - Simple or weighted moving average
  • SavitzkyGolayFilter - Polynomial smoothing
  • GaussianFilter - Gaussian kernel smoothing
  • BKFilter - Baxter-King bandpass filter
  • DiscreteFourierApproximation - Fourier-based filtering

Use when: Need noise reduction, trend extraction, or frequency filtering.

Dimensionality Reduction

  • PCASeriesTransformer - Principal component analysis
  • PLASeriesTransformer - Piecewise Linear Approximation

Other Transformations

  • BoxCoxTransformer - Variance stabilization
  • LogTransformer - Logarithmic scaling
  • ClaSPTransformer - Classification Score Profile (used for segmentation)
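
For example, a minimal Box-Cox sketch for variance stabilization; the import path is an assumption, and note that Box-Cox requires strictly positive inputs:

import numpy as np
from aeon.transformations.series import BoxCoxTransformer

# A positive-valued series whose variance grows with its level
y = np.random.default_rng(0).exponential(scale=2.0, size=200) + 1.0

bc = BoxCoxTransformer()
y_stable = bc.fit_transform(y)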

Pipeline Composition

  • SeriesTransformerPipeline - Chain series transformers

Quick Start: Feature Extraction

from aeon.transformations.collection.convolution_based import RocketTransformer
from aeon.classification.sklearn import RotationForest
from aeon.datasets import load_classification

# Load data
X_train, y_train = load_classification("GunPoint", split="train")
X_test, y_test = load_classification("GunPoint", split="test")

# Extract ROCKET features
rocket = RocketTransformer()
X_train_features = rocket.fit_transform(X_train)
X_test_features = rocket.transform(X_test)

# Use with any sklearn classifier
clf = RotationForest()
clf.fit(X_train_features, y_train)
accuracy = clf.score(X_test_features, y_test)

Quick Start: Preprocessing Pipeline

from aeon.transformations.collection import (
    MinMaxScaler,
    SimpleImputer,
    CollectionTransformerPipeline
)

# Build preprocessing pipeline
pipeline = CollectionTransformerPipeline([
    ('imputer', SimpleImputer(strategy='mean')),
    ('scaler', MinMaxScaler())
])

X_transformed = pipeline.fit_transform(X_train)

Quick Start: Series Smoothing

import numpy as np
from aeon.transformations.series import MovingAverage

# Example input: a noisy sine wave
y = np.sin(np.linspace(0, 10, 200)) + np.random.default_rng(0).normal(0, 0.2, 200)

# Smooth the individual time series
smoother = MovingAverage(window_size=5)
y_smoothed = smoother.fit_transform(y)

Algorithm Selection

For Feature Extraction:

  • Speed + Performance: MiniRocketTransformer
  • Interpretability: Catch22, TSFresh
  • Dimensionality reduction: PAA, SAX, PCA
  • Discriminative patterns: Shapelet transforms
  • Comprehensive features: TSFresh (with longer runtime)

For Preprocessing:

  • Normalization: Normalizer, MinMaxScaler
  • Smoothing: MovingAverage, SavitzkyGolayFilter
  • Missing values: SimpleImputer
  • Frequency analysis: DWTTransformer, Fourier methods

For Symbolic Representation:

  • Fast approximation: PAA
  • Alphabet-based: SAX
  • Frequency-based: SFA, SFAFast

Best Practices

  1. Fit on training data only: Avoid data leakage by fitting once on the training set, then applying the fitted transformer to both splits.

     transformer.fit(X_train)
     X_train_tf = transformer.transform(X_train)
     X_test_tf = transformer.transform(X_test)

  2. Pipeline composition: Chain transformers for complex workflows.

     pipeline = CollectionTransformerPipeline([
         ('imputer', SimpleImputer()),
         ('scaler', Normalizer()),
         ('features', RocketTransformer())
     ])

  3. Feature selection: TSFresh can generate many features; consider selecting a subset.

     from sklearn.feature_selection import SelectKBest

     selector = SelectKBest(k=100)
     X_selected = selector.fit_transform(X_features, y)

  4. Memory considerations: Some transformers are memory-intensive on large datasets.

     • Use MiniRocketTransformer instead of RocketTransformer for speed
     • Consider downsampling very long series
     • Use ROCKETGPU for GPU acceleration

  5. Domain knowledge: Choose transformations that match the domain.

     • Periodic data: Fourier-based methods
     • Noisy data: smoothing filters
     • Spike detection: wavelet transforms