Content-Based Image Retrieval Using Color Moments, Hu Invariants, and Gray-Level Co-occurrence Matrices
A typical content-based image retrieval pipeline transforms visual data into discriminative feature vectors through three primary stages: descriptor computation, database indexing, and similarity ranking. This implementation employs a multi-modal feature fusion strategy combining chromatic statistics, geometric invariants, and textural patterns to achieve robust matching performance.
The evaluation utilizes the Corel-1000 benchmark dataset, containing 1,000 images distributed across ten semantic categories including architecture, wildlife, landscapes, and cuisine. This diverse collection provides varying illumination conditions, perspectives, and subject compositions necessary for validating descriptor effectiveness.
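The three stages can be sketched at a high level as follows. This is an illustrative skeleton, not the system's actual code: compute_descriptor is a hypothetical stand-in (a coarse intensity histogram) for the color, shape, and texture extractors developed in the sections below.

```python
import numpy as np

def compute_descriptor(image):
    # Hypothetical stand-in descriptor: an 8-bin intensity histogram,
    # normalized to sum to 1. The real system uses richer features.
    hist, _ = np.histogram(image, bins=8, range=(0, 256))
    return hist / max(hist.sum(), 1)

def index_database(images):
    """Stage 2: precompute one descriptor per database image."""
    return {name: compute_descriptor(img) for name, img in images.items()}

def rank_by_similarity(query_image, index):
    """Stage 3: sort database entries by Euclidean distance to the query."""
    q = compute_descriptor(query_image)
    return sorted(index, key=lambda name: np.linalg.norm(index[name] - q))
```

Indexing runs once, offline; only the ranking stage executes per query, which is what makes retrieval over the 1,000-image collection practical.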
Color Feature Extraction
Color distribution characteristics are captured using statistical moments computed in the HSV color space. The first moment represents the average color, the second (standard deviation) quantifies chromatic spread, and the third (skewness) measures distribution asymmetry. These nine values (three moments across three channels) provide a compact representation of the image's palette.
import cv2
import numpy as np

def extract_chromatic_profile(image_file):
    """Compute statistical moments from the HSV color space."""
    bgr_array = cv2.imread(image_file)
    if bgr_array is None:
        raise FileNotFoundError(f"Image not found: {image_file}")
    hsv_array = cv2.cvtColor(bgr_array, cv2.COLOR_BGR2HSV)
    channels = cv2.split(hsv_array)
    profile = []
    # First-order moments (mean)
    profile.extend([np.mean(ch) for ch in channels])
    # Second-order moments (standard deviation)
    profile.extend([np.std(ch) for ch in channels])
    # Third-order moments (skewness); the signed cube root preserves
    # the direction of the asymmetry
    for channel in channels:
        mean_val = np.mean(channel)
        skewness = np.cbrt(np.mean((channel - mean_val) ** 3))
        profile.append(skewness)
    return profile
Shape Feature Extraction
Geometric properties are encoded using Hu moment invariants, which derive from normalized central moments and remain constant under translation, scaling, and rotation. The seven resulting values undergo logarithmic transformation to improve numerical stability and dynamic range.
def compute_geometric_descriptors(file_path):
    """Calculate rotation- and scale-invariant shape descriptors."""
    grayscale = cv2.imread(file_path, cv2.IMREAD_GRAYSCALE)
    if grayscale is None:
        return None
    # Calculate spatial moments and derive the seven Hu invariants
    spatial_moments = cv2.moments(grayscale)
    invariants = cv2.HuMoments(spatial_moments).flatten()
    # Apply a log transformation to compress the value ranges
    scaled_features = []
    for moment in invariants:
        if moment != 0:
            scaled_features.append(-np.sign(moment) * np.log10(abs(moment)))
        else:
            scaled_features.append(0.0)
    return scaled_features
Texture Feature Extraction
Surface characteristics are quantified using the Gray-Level Co-occurrence Matrix (GLCM), which analyzes spatial relationships between pixel intensity pairs. Four key statistical measures capture textural properties: contrast (local intensity variation), homogeneity (local gray-level similarity), energy (uniformity), and entropy (randomness).
def analyze_texture_pattern(image_path, levels=64):
    """Extract GLCM-based texture characteristics."""
    gray_img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray_img is None:
        return []
    # Reduce gray levels for computational efficiency
    quantized = (gray_img.astype(np.float32) / 256.0 * levels).astype(np.uint8)
    rows, cols = quantized.shape
    # Build co-occurrence matrix (horizontal neighbors at distance 1)
    co_matrix = np.zeros((levels, levels), dtype=np.float64)
    for r in range(rows):
        for c in range(cols - 1):
            current = quantized[r, c]
            neighbor = quantized[r, c + 1]
            co_matrix[current, neighbor] += 1
    # Normalize to create a probability distribution; guard against
    # degenerate single-column images with no horizontal pixel pairs
    total = np.sum(co_matrix)
    if total == 0:
        return []
    co_matrix /= total
    # Calculate Haralick features
    contrast = 0.0
    homogeneity = 0.0
    energy = 0.0
    entropy = 0.0
    for i in range(levels):
        for j in range(levels):
            probability = co_matrix[i, j]
            if probability > 0:
                contrast += probability * ((i - j) ** 2)
                homogeneity += probability / (1.0 + (i - j) ** 2)
                energy += probability ** 2
                entropy -= probability * np.log2(probability)
    return [contrast, homogeneity, energy, entropy]
These complementary feature vectors are concatenated to form a comprehensive image signature. Distance metrics such as Euclidean or Manhattan distance then facilitate similarity comparisons between query images and database entries, enabling efficient retrieval of visually similar content.
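The fusion and ranking step can be sketched as follows. The function names and the toy database are illustrative assumptions, not part of the original implementation; in practice, the concatenated vectors would come from the three extractors above.

```python
import numpy as np

def fuse_descriptors(color, shape, texture):
    """Concatenate the three feature groups into one signature vector."""
    return np.concatenate([np.asarray(color, dtype=np.float64),
                           np.asarray(shape, dtype=np.float64),
                           np.asarray(texture, dtype=np.float64)])

def euclidean_distance(a, b):
    # L2 distance: square root of the summed squared differences
    return float(np.sqrt(np.sum((a - b) ** 2)))

def manhattan_distance(a, b):
    # L1 distance: sum of absolute differences
    return float(np.sum(np.abs(a - b)))

def rank_database(query, database, metric=euclidean_distance):
    """Return database keys sorted by ascending distance to the query."""
    scores = {name: metric(query, sig) for name, sig in database.items()}
    return sorted(scores, key=scores.get)
```

Because the color, shape, and texture features live on very different numeric scales, normalizing each feature group (e.g., z-scores over the database) before fusion is usually advisable so that no single modality dominates the distance.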