references/filters_preprocessing.md

Filters and Preprocessing

Overview

Histolab provides filters for preprocessing whole slide images and tiles. They support visualization, quality control, tissue detection, and artifact removal, and they can be chained into reusable preprocessing pipelines.

Filter Categories

Image Filters

Color space conversions, thresholding, and intensity adjustments

Morphological Filters

Structural operations like dilation, erosion, opening, and closing

Composition Filters

Utilities for combining multiple filters

Image Filters

RgbToGrayscale

Convert RGB images to grayscale.

from histolab.filters.image_filters import RgbToGrayscale

gray_filter = RgbToGrayscale()
gray_image = gray_filter(rgb_image)

Use cases:
- Preprocessing for intensity-based operations
- Simplifying color complexity
- Input for morphological operations

RgbToHsv

Convert RGB to HSV (Hue, Saturation, Value) color space.

from histolab.filters.image_filters import RgbToHsv

hsv_filter = RgbToHsv()
hsv_image = hsv_filter(rgb_image)

Use cases:
- Color-based tissue segmentation
- Detecting pen markings by hue
- Separating chromatic from achromatic content

RgbToHed

Convert RGB to HED (Hematoxylin-Eosin-DAB) color space for stain deconvolution.

from histolab.filters.image_filters import RgbToHed

hed_filter = RgbToHed()
hed_image = hed_filter(rgb_image)

Use cases:
- Separating H&E stain components
- Analyzing nuclear (hematoxylin) vs. cytoplasmic (eosin) staining
- Quantifying stain intensity

OtsuThreshold

Apply Otsu's automatic thresholding method to create binary images.

from histolab.filters.image_filters import OtsuThreshold

otsu_filter = OtsuThreshold()
binary_image = otsu_filter(grayscale_image)

How it works:
- Automatically determines optimal threshold
- Separates foreground from background
- Minimizes intra-class variance

Use cases:
- Tissue detection
- Nuclei segmentation
- Binary mask creation
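The threshold selection described above can be sketched in plain NumPy. This is an illustrative re-implementation, not histolab's code: the function name `otsu_threshold` and the synthetic image are assumptions. It maximizes the between-class variance, which is equivalent to minimizing the intra-class variance.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the 8-bit level that maximizes between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    omega = np.cumsum(prob)           # weight of the class with values <= t
    mu = np.cumsum(prob * levels)     # cumulative mean up to t
    mu_total = mu[-1]
    numer = (mu_total * omega - mu) ** 2
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, numer / denom, 0.0)
    return int(np.argmax(sigma_b))

# Synthetic bimodal image: dark "tissue" (60) on a bright background (220)
gray = np.full((20, 20), 220, dtype=np.uint8)
gray[5:15, 5:15] = 60

t = otsu_threshold(gray)
tissue_mask = gray <= t   # dark pixels are treated as tissue
```

On real slides the histogram is rarely this clean, so the resulting mask usually needs the morphological cleanup described later in this document.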

AdaptiveThreshold

Apply adaptive thresholding for local intensity variations.

from histolab.filters.image_filters import AdaptiveThreshold

adaptive_filter = AdaptiveThreshold(
    block_size=11,      # Size of local neighborhood
    offset=2            # Constant subtracted from mean
)
binary_image = adaptive_filter(grayscale_image)

Use cases:
- Non-uniform illumination
- Local contrast enhancement
- Handling variable staining intensity
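To make the roles of `block_size` and `offset` concrete, here is a minimal NumPy sketch of mean-based local thresholding. It is written independently of histolab; the helper names `local_mean` and `local_threshold_mask`, and the choice to mark pixels darker than the local threshold as foreground, are assumptions for illustration.

```python
import numpy as np

def local_mean(gray: np.ndarray, block_size: int) -> np.ndarray:
    """Mean over a block_size x block_size window (block_size must be odd)."""
    pad = block_size // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="reflect")
    # Integral image with a leading zero row and column
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = gray.shape
    k = block_size
    window_sum = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
                  - integral[k:k + h, :w] + integral[:h, :w])
    return window_sum / (k * k)

def local_threshold_mask(gray: np.ndarray, block_size: int = 11, offset: float = 2) -> np.ndarray:
    """Mark pixels darker than (local mean - offset) as foreground."""
    return gray < (local_mean(gray, block_size) - offset)

flat = np.full((11, 11), 128, dtype=np.uint8)
spot = flat.copy()
spot[5, 5] = 50   # one dark pixel on a uniform background

flat_mask = local_threshold_mask(flat, block_size=5, offset=2)
spot_mask = local_threshold_mask(spot, block_size=5, offset=2)
```

A larger `block_size` averages over more context and behaves more like a global threshold, while a larger `offset` demands a bigger difference from the neighborhood before a pixel is marked.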

Invert

Invert image intensity values.

from histolab.filters.image_filters import Invert

invert_filter = Invert()
inverted_image = invert_filter(image)

Use cases:
- Preprocessing for certain segmentation algorithms
- Visualization adjustments

StretchContrast

Enhance image contrast by stretching intensity range.

from histolab.filters.image_filters import StretchContrast

contrast_filter = StretchContrast()
enhanced_image = contrast_filter(image)

Use cases:
- Improving visibility of low-contrast features
- Preprocessing for visualization
- Enhancing faint structures
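As a rough illustration of what stretching the intensity range means, the sketch below applies a plain min-max stretch in NumPy. It is an assumption-laden simplification (the function name `stretch_contrast` and the min-max rule are chosen here for illustration) and may differ from how histolab's StretchContrast picks its input range.

```python
import numpy as np

def stretch_contrast(gray: np.ndarray) -> np.ndarray:
    """Linearly map the observed [min, max] range onto [0, 255]."""
    g = gray.astype(np.float64)
    g_min, g_max = g.min(), g.max()
    if g_max == g_min:
        return gray.copy()   # constant image: nothing to stretch
    out = (g - g_min) / (g_max - g_min) * 255.0
    return out.round().astype(np.uint8)

low_contrast = np.array([[100, 110], [150, 150]], dtype=np.uint8)
stretched = stretch_contrast(low_contrast)
```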

HistogramEqualization

Equalize image histogram for contrast enhancement.

from histolab.filters.image_filters import HistogramEqualization

hist_eq_filter = HistogramEqualization()
equalized_image = hist_eq_filter(grayscale_image)

Use cases:
- Standardizing image contrast
- Revealing hidden details
- Preprocessing for feature extraction
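Histogram equalization remaps intensities through the image's cumulative distribution so that the output levels are spread more evenly. The following NumPy sketch shows the classic lookup-table formulation for 8-bit images; the function name `equalize` is hypothetical and the code is not taken from histolab.

```python
import numpy as np

def equalize(gray: np.ndarray) -> np.ndarray:
    """Equalize an 8-bit grayscale image via its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest occupied level
    total = gray.size
    if total == cdf_min:
        return gray.copy()             # single intensity level: nothing to do
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = lut.clip(0, 255).astype(np.uint8)
    return lut[gray]                   # apply the lookup table per pixel

narrow = np.array([[100, 101], [102, 103]], dtype=np.uint8)
equalized = equalize(narrow)
```

Unlike a linear stretch, this mapping is nonlinear: densely populated intensity ranges are spread apart more than sparse ones.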

Morphological Filters

BinaryDilation

Expand white regions in binary images.

from histolab.filters.morphological_filters import BinaryDilation

dilation_filter = BinaryDilation(disk_size=5)
dilated_image = dilation_filter(binary_image)

Parameters:
- disk_size: Radius of the disk-shaped structuring element (default: 5)

Use cases:
- Connecting nearby tissue regions
- Filling small gaps
- Expanding tissue masks

BinaryErosion

Shrink white regions in binary images.

from histolab.filters.morphological_filters import BinaryErosion

erosion_filter = BinaryErosion(disk_size=5)
eroded_image = erosion_filter(binary_image)

Use cases:
- Removing small protrusions
- Separating connected objects
- Shrinking tissue boundaries

BinaryOpening

Erosion followed by dilation (removes small objects).

from histolab.filters.morphological_filters import BinaryOpening

opening_filter = BinaryOpening(disk_size=3)
opened_image = opening_filter(binary_image)

Use cases:
- Removing small artifacts
- Smoothing object boundaries
- Noise reduction

BinaryClosing

Dilation followed by erosion (fills small holes).

from histolab.filters.morphological_filters import BinaryClosing

closing_filter = BinaryClosing(disk_size=5)
closed_image = closing_filter(binary_image)

Use cases:
- Filling small holes in tissue regions
- Connecting nearby objects
- Smoothing internal boundaries
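The difference between opening and closing is easiest to see on a tiny mask. The sketch below implements dilation and erosion with a fixed 3x3 square structuring element in NumPy; the helper names are hypothetical, and histolab itself uses disk-shaped elements, so this only illustrates the order of operations.

```python
import numpy as np

def dilate(mask: np.ndarray) -> np.ndarray:
    """3x3 dilation: a pixel is set if any pixel in its neighborhood is set."""
    padded = np.pad(mask, 1, constant_values=False)
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(mask: np.ndarray) -> np.ndarray:
    """3x3 erosion: a pixel survives only if its whole neighborhood is set."""
    padded = np.pad(mask, 1, constant_values=False)  # outside counts as background
    h, w = mask.shape
    out = np.ones_like(mask)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

block = np.zeros((9, 9), dtype=bool)
block[2:7, 2:7] = True

with_speck = block.copy()
with_speck[8, 8] = True              # isolated one-pixel artifact
opened = dilate(erode(with_speck))   # opening: erosion, then dilation

with_hole = block.copy()
with_hole[4, 4] = False              # one-pixel hole inside the region
closed = erode(dilate(with_hole))    # closing: dilation, then erosion
```

Here opening deletes the speck and closing fills the hole, while the 5x5 region itself comes back unchanged in both cases.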

RemoveSmallObjects

Remove connected components smaller than a threshold.

from histolab.filters.morphological_filters import RemoveSmallObjects

remove_small_filter = RemoveSmallObjects(
    min_size=500  # Objects smaller than this many pixels are removed
)
cleaned_image = remove_small_filter(binary_image)

Use cases:
- Removing dust and artifacts
- Filtering noise
- Cleaning tissue masks

RemoveSmallHoles

Fill holes smaller than a threshold.

from histolab.filters.morphological_filters import RemoveSmallHoles

fill_holes_filter = RemoveSmallHoles(
    area_threshold=1000  # Maximum hole size to fill
)
filled_image = fill_holes_filter(binary_image)

Use cases:
- Filling small gaps in tissue
- Creating continuous tissue regions
- Removing internal artifacts

Filter Composition

Chaining Filters

Combine multiple filters in sequence:

from histolab.filters.image_filters import Compose, OtsuThreshold, RgbToGrayscale
from histolab.filters.morphological_filters import (
    BinaryDilation, RemoveSmallHoles, RemoveSmallObjects
)

# Create filter pipeline
tissue_detection_pipeline = Compose([
    RgbToGrayscale(),
    OtsuThreshold(),
    BinaryDilation(disk_size=5),
    RemoveSmallHoles(area_threshold=1000),
    RemoveSmallObjects(min_size=500)
])

# Apply pipeline
result = tissue_detection_pipeline(rgb_image)
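Conceptually, a composed pipeline calls each filter on the output of the previous one, so the order of the list matters. The sketch below models that behavior with a hypothetical `SimplePipeline` class and toy list-based "filters". It is not histolab's implementation, only an illustration of the call order.

```python
class SimplePipeline:
    """Hypothetical stand-in for a composition helper: applies callables in order."""

    def __init__(self, filters):
        self.filters = list(filters)

    def __call__(self, image):
        for apply_filter in self.filters:
            image = apply_filter(image)
        return image


# Toy "filters" operating on a flat list of 8-bit intensities
def invert(pixels):
    return [255 - p for p in pixels]

def threshold_at_128(pixels):
    return [p >= 128 for p in pixels]

pipeline = SimplePipeline([invert, threshold_at_128])
mask = pipeline([10, 200, 130])   # inverted first, then thresholded
```

Swapping the two steps would hand booleans to `invert`, which is the kind of ordering mistake that a quick preview on a thumbnail tends to reveal.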

Lambda Filters

Create custom filters inline:

from histolab.filters.image_filters import Lambda
import numpy as np

# Custom brightness adjustment (np.asarray accepts both PIL images and arrays)
brightness_filter = Lambda(
    lambda img: np.clip(np.asarray(img) * 1.2, 0, 255).astype(np.uint8)
)

# Custom color channel extraction (channel 0 is red for RGB input)
red_channel_filter = Lambda(lambda img: np.asarray(img)[:, :, 0])

Common Preprocessing Pipelines

Standard Tissue Detection

from histolab.filters.image_filters import Compose, OtsuThreshold, RgbToGrayscale
from histolab.filters.morphological_filters import (
    BinaryDilation, RemoveSmallHoles, RemoveSmallObjects
)

tissue_detection = Compose([
    RgbToGrayscale(),
    OtsuThreshold(),
    BinaryDilation(disk_size=5),
    RemoveSmallHoles(area_threshold=1000),
    RemoveSmallObjects(min_size=500)
])

Pen Mark Removal

from histolab.filters.image_filters import Compose, Lambda, RgbToHsv
import numpy as np

def remove_pen_marks(hsv_image):
    """Whiten pen markings with cyan-to-blue hues."""
    # Work on a float copy so the caller's array is not modified
    hsv = np.array(hsv_image, dtype=float)
    h, s = hsv[:, :, 0], hsv[:, :, 1]
    # Hue is assumed to be scaled to [0, 1]; 0.45-0.7 spans cyan to blue.
    # Pure green (h near 0.33) falls outside this range; widen it if needed.
    pen_mask = (h > 0.45) & (h < 0.7) & (s > 0.3)
    # Set pen regions to white (zero saturation, full value)
    hsv[pen_mask] = [0, 0, 1]
    return hsv

pen_removal = Compose([
    RgbToHsv(),
    Lambda(remove_pen_marks)
])
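When tuning the hue window, it helps to check where typical colors fall on the [0, 1] hue scale. The sketch below uses only the standard-library `colorsys` module. The sample colors, in particular the RGB triple standing in for eosin pink, are illustrative assumptions rather than measured values.

```python
import colorsys

def hue_of(rgb_255):
    """Return the hue in [0, 1) of an 8-bit RGB triple."""
    r, g, b = (channel / 255.0 for channel in rgb_255)
    return colorsys.rgb_to_hsv(r, g, b)[0]

def in_pen_window(hue, low=0.45, high=0.7):
    """Same hue window as the pen mask above."""
    return low < hue < high

samples = {
    "blue pen": (0, 0, 255),
    "green pen": (0, 255, 0),
    "eosin-like pink": (230, 120, 180),   # assumed representative tissue color
}
hues = {name: hue_of(rgb) for name, rgb in samples.items()}
flagged = {name: in_pen_window(h) for name, h in hues.items()}
```

With these samples, only the blue ink falls inside the window. Pure green sits near a hue of 0.33, so catching green ink needs a second window or a lower bound of about 0.25.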

Nuclei Enhancement

from histolab.filters.image_filters import (
    Compose, HistogramEqualization, Lambda, RgbToHed
)

nuclei_enhancement = Compose([
    RgbToHed(),
    Lambda(lambda hed: hed[:, :, 0]),  # Extract hematoxylin channel
    HistogramEqualization()
])

Contrast Normalization

from histolab.filters.image_filters import (
    Compose, HistogramEqualization, RgbToGrayscale, StretchContrast
)

contrast_normalization = Compose([
    RgbToGrayscale(),
    StretchContrast(),
    HistogramEqualization()
])

Applying Filters to Tiles

Filters can be applied to individual tiles:

from histolab.tile import Tile
from histolab.filters.image_filters import RgbToGrayscale

# Load or extract tile
tile = Tile(image=pil_image, coords=(x, y))

# Apply filter
gray_filter = RgbToGrayscale()
filtered_tile = tile.apply_filters(gray_filter)

# Chain multiple filters
from histolab.filters.image_filters import Compose, StretchContrast

filter_chain = Compose([
    RgbToGrayscale(),
    StretchContrast()
])
processed_tile = tile.apply_filters(filter_chain)

Custom Mask Filters

Integrate custom filters with tissue masks:

from histolab.masks import TissueMask
from histolab.filters.image_filters import OtsuThreshold, RgbToGrayscale
from histolab.filters.morphological_filters import BinaryDilation, RemoveSmallObjects

# Custom aggressive tissue detection
aggressive_filters = [
    RgbToGrayscale(),
    OtsuThreshold(),
    BinaryDilation(disk_size=10),  # Larger dilation
    RemoveSmallObjects(min_size=5000)  # Drop regions smaller than 5000 pixels
]

# Create mask with custom filters (passed as positional arguments)
custom_mask = TissueMask(*aggressive_filters)

Stain Normalization

Recent histolab releases include dedicated stain normalizers in the histolab.stain_normalizer module. Where those are unavailable or too heavy, filters can be used for a basic normalization:

from histolab.filters.image_filters import Compose, Lambda, RgbToHed
import numpy as np

def normalize_hed(hed_image, target_means=(0.65, 0.70), target_stds=(0.15, 0.13)):
    """Match the mean and standard deviation of the H and E channels to targets."""
    # Float copy: avoids mutating the input and integer truncation
    hed = np.array(hed_image, dtype=float)
    eps = 1e-8  # Guards against division by zero on constant channels

    # Channel 0 is hematoxylin, channel 1 is eosin
    for channel, (mean, std) in enumerate(zip(target_means, target_stds)):
        values = hed[:, :, channel]
        standardized = (values - values.mean()) / (values.std() + eps)
        hed[:, :, channel] = standardized * std + mean

    return hed

normalization_pipeline = Compose([
    RgbToHed(),
    Lambda(normalize_hed)
    # Convert back to RGB if needed
])

Best Practices

  1. Preview filters: Visualize filter outputs on thumbnails before applying to tiles
  2. Chain efficiently: Order filters logically (e.g., color conversion before thresholding)
  3. Tune parameters: Adjust thresholds and structuring element sizes for specific tissues
  4. Use composition: Build reusable filter pipelines with Compose
  5. Consider performance: Complex filter chains increase processing time
  6. Validate on diverse slides: Test filters across different scanners, stains, and tissue types
  7. Document custom filters: Clearly describe purpose and parameters of custom pipelines

Quality Control Filters

Blur Detection

from histolab.filters.image_filters import Lambda, RgbToGrayscale
import cv2
import numpy as np

def laplacian_blur_score(gray_image):
    """Return the variance of the Laplacian; lower values indicate more blur."""
    return cv2.Laplacian(np.array(gray_image), cv2.CV_64F).var()

# Returns a float score rather than an image, so use it as the last step
blur_detector = Lambda(lambda img: laplacian_blur_score(
    RgbToGrayscale()(img)
))
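To show why Laplacian variance tracks sharpness, the sketch below computes a 4-neighbor Laplacian on interior pixels with NumPy alone and compares a hard edge against a smooth ramp. The function name `laplacian_variance` and the synthetic images are assumptions. Absolute scores will not match OpenCV exactly because borders are handled differently.

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbor Laplacian over interior pixels."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

# Sharp image: hard vertical edge between black and white halves
sharp = np.zeros((32, 32), dtype=np.uint8)
sharp[:, 16:] = 255

# "Blurred" image: the same transition spread into a smooth horizontal ramp
blurred = np.tile(np.linspace(0, 255, 32), (32, 1))

sharp_score = laplacian_variance(sharp)
blurred_score = laplacian_variance(blurred)
```

Any cutoff that separates acceptable from blurred tiles depends on magnification and stain, so it is best calibrated on a few manually graded tiles.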

Tissue Coverage

from histolab.filters.image_filters import (
    Compose, Lambda, OtsuThreshold, RgbToGrayscale
)

def tissue_coverage(image):
    """Calculate percentage of tissue in image."""
    tissue_mask = Compose([
        RgbToGrayscale(),
        OtsuThreshold()
    ])(image)
    return tissue_mask.sum() / tissue_mask.size * 100

coverage_filter = Lambda(tissue_coverage)
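A coverage percentage is typically compared against a cutoff to decide whether a tile is worth keeping. The NumPy sketch below shows that decision on a synthetic mask. The helper names and the 80% default are hypothetical choices for illustration.

```python
import numpy as np

def coverage_percent(mask: np.ndarray) -> float:
    """Percentage of True pixels in a binary mask."""
    return float(np.count_nonzero(mask)) / mask.size * 100.0

def keep_tile(mask: np.ndarray, min_coverage: float = 80.0) -> bool:
    """Keep a tile when its tissue coverage reaches the cutoff (assumed 80%)."""
    return coverage_percent(mask) >= min_coverage

# Synthetic mask: 9 of 10 columns are tissue
mask = np.zeros((10, 10), dtype=bool)
mask[:, :9] = True

coverage = coverage_percent(mask)
accepted = keep_tile(mask)
rejected_at_95 = not keep_tile(mask, min_coverage=95.0)
```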

Troubleshooting

Issue: Tissue detection misses valid tissue

Solutions:
- Reduce the size threshold in RemoveSmallObjects
- Decrease erosion/opening disk size
- Try adaptive thresholding instead of Otsu

Issue: Too many artifacts included

Solutions:
- Increase the size threshold in RemoveSmallObjects
- Add opening/closing operations
- Use custom color-based filtering for specific artifacts

Issue: Tissue boundaries too rough

Solutions:
- Add BinaryClosing or BinaryOpening for smoothing
- Adjust disk_size for morphological operations

Issue: Variable staining quality

Solutions:
- Apply histogram equalization
- Use adaptive thresholding
- Implement stain normalization pipeline
