Implementing Image Classification with RDNet: A Practical Guide
RDNet is an enhanced version of the DenseNet architecture, designed to improve performance and computational efficiency through key modifications. The model emphasizes concatenation operations over additive shortcuts, expands intermediate channel dimensions by adjusting the expansion ratio independently of the growth rate, and incorporates memory-efficient design principles to reduce resource usage while maintaining high accuracy.
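To make the channel bookkeeping concrete, here is a minimal sketch in plain Python of how concatenation grows the feature dimension in a dense block, and how an expansion ratio chosen independently of the growth rate widens the intermediate channels. The specific numbers are illustrative, not RDNet's actual configuration:

```python
# Illustrative channel arithmetic for a concatenation-based (dense) block.
# Values below are hypothetical examples, not RDNet's real hyperparameters.

def dense_block_channels(in_channels, growth_rate, num_layers):
    """Each layer appends growth_rate channels via concatenation."""
    channels = [in_channels]
    for _ in range(num_layers):
        channels.append(channels[-1] + growth_rate)
    return channels

def intermediate_width(input_channels, expansion_ratio):
    """Intermediate (bottleneck) width scales with an expansion ratio
    that is decoupled from the growth rate."""
    return input_channels * expansion_ratio

print(dense_block_channels(64, 32, 4))   # [64, 96, 128, 160, 192]
print(intermediate_width(64, 4))         # 256
```

Unlike an additive shortcut, which keeps the channel count fixed, concatenation makes later layers see all earlier feature maps, which is why the count grows linearly with depth.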
For image classification tasks, such as plant seedling classification, RDNet can achieve over 97% accuracy with the rdnet_tiny variant. This guide covers essential techniques for training and optimizing RDNet models.
Data Augmentation Strategies
Data augmentation enhances model generalization. Use PyTorch's transforms for basic augmentations and integrate advanced methods like CutOut and MixUp.
Install required packages:
pip install timm torchtoolbox
Implement CutOut in the transformation pipeline:
from torchvision import transforms
from torchtoolbox.transform import Cutout

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    Cutout(),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
Define MixUp with SoftTargetCrossEntropy loss:
from timm.data.mixup import Mixup
from timm.loss import SoftTargetCrossEntropy

mixup_func = Mixup(
    mixup_alpha=0.8, cutmix_alpha=1.0, cutmix_minmax=None,
    prob=0.1, switch_prob=0.5, mode='batch',
    label_smoothing=0.1, num_classes=12
)
loss_func = SoftTargetCrossEntropy()
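At its core, MixUp blends two samples and their one-hot labels with a coefficient lam drawn from a Beta distribution; the sketch below (plain Python with scalar "images" for brevity) shows the blending rule that timm's Mixup applies batch-wise:

```python
def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def mixup_pair(x1, y1, x2, y2, num_classes, lam):
    """Blend two samples and their one-hot targets with coefficient lam."""
    x = lam * x1 + (1.0 - lam) * x2
    t1, t2 = one_hot(y1, num_classes), one_hot(y2, num_classes)
    y = [lam * a + (1.0 - lam) * b for a, b in zip(t1, t2)]
    return x, y

# Two scalar "images" with labels 0 and 2, mixed at lam = 0.7:
x, y = mixup_pair(1.0, 0, 3.0, 2, num_classes=3, lam=0.7)
print(x)  # ~1.6  (0.7*1.0 + 0.3*3.0)
print(y)  # ~[0.7, 0.0, 0.3]
```

The soft target y is why SoftTargetCrossEntropy is paired with MixUp: the loss must accept a probability vector rather than a single class index.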
Exponential Moving Average (EMA)
EMA smooths model parameters to improve stability and generalization. It updates parameters using a decay factor, balancing historical and current values.
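The update rule is shadow = decay * shadow + (1 - decay) * current. A quick scalar sketch shows how a large decay keeps the shadow value anchored to its history, drifting only slowly toward new values:

```python
def ema_update(shadow, current, decay):
    """One EMA step: weighted blend of history and the new value."""
    return decay * shadow + (1.0 - decay) * current

# With decay=0.9 (illustrative; 0.9999 is typical in practice),
# the shadow crawls from 1.0 toward a constant target of 2.0:
shadow = 1.0
for step in range(3):
    shadow = ema_update(shadow, 2.0, decay=0.9)
print(round(shadow, 4))  # 1.271 -- still far from 2.0 after 3 steps
```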
Implementation:
import torch
from copy import deepcopy

class ExponentialMovingAverage:
    """Maintains a shadow copy of a model whose parameters are an
    exponential moving average of the training model's parameters."""

    def __init__(self, network, decay_factor=0.9999, device=''):
        self.shadow_model = deepcopy(network)
        self.shadow_model.eval()
        self.decay = decay_factor
        self.device = device
        if device:
            self.shadow_model.to(device=device)
        # The shadow model is never trained directly.
        for param in self.shadow_model.parameters():
            param.requires_grad = False

    def update_parameters(self, network):
        # shadow = decay * shadow + (1 - decay) * current
        with torch.no_grad():
            model_state = network.state_dict()
            for key, shadow_param in self.shadow_model.state_dict().items():
                if self.device:
                    model_param = model_state[key].detach().to(device=self.device)
                else:
                    model_param = model_state[key].detach()
                shadow_param.copy_(shadow_param * self.decay + (1.0 - self.decay) * model_param)
Integrate EMA during training:
if use_ema:
    ema_handler = ExponentialMovingAverage(model, decay_factor=0.9999, device='cpu')

# In the training loop, update the shadow model after each optimizer step
def train_step():
    optimizer.step()
    if ema_handler is not None:
        ema_handler.update_parameters(model)

# Use the EMA (shadow) model for validation
validate(ema_handler.shadow_model, device, validation_loader)
Dataset Preparation
Organize data into ImageNet format with separate train and validation directories. Compute mean and standard deviation for normalization to accelerate convergence.
Calculate mean and std:
from torchvision import transforms
from torchvision.datasets import ImageFolder
import torch

def compute_statistics(dataset):
    # Averages the per-image channel means and stds across the dataset.
    # Averaging per-image stds only approximates the true dataset-wide
    # std, but it is adequate for normalization.
    loader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False)
    mean = torch.zeros(3)
    std = torch.zeros(3)
    for images, _ in loader:
        for channel in range(3):
            mean[channel] += images[:, channel, :, :].mean()
            std[channel] += images[:, channel, :, :].std()
    mean /= len(dataset)
    std /= len(dataset)
    return mean.tolist(), std.tolist()

if __name__ == '__main__':
    dataset = ImageFolder(root='data1', transform=transforms.ToTensor())
    print(compute_statistics(dataset))
Convert dataset structure:
import os
import shutil
import glob
from sklearn.model_selection import train_test_split

image_paths = glob.glob('data1/*/*.png')
base_dir = 'data'
if os.path.exists(base_dir):
    shutil.rmtree(base_dir)
os.makedirs(base_dir)

train_paths, val_paths = train_test_split(image_paths, test_size=0.3, random_state=42)

def copy_files(file_list, target_dir):
    for path in file_list:
        class_name = os.path.basename(os.path.dirname(path))
        file_name = os.path.basename(path)
        dest_dir = os.path.join(target_dir, class_name)
        os.makedirs(dest_dir, exist_ok=True)
        shutil.copy(path, os.path.join(dest_dir, file_name))

copy_files(train_paths, os.path.join(base_dir, 'train'))
copy_files(val_paths, os.path.join(base_dir, 'val'))
Training Techniques
- Use mixed precision training with PyTorch's torch.cuda.amp to speed up training and reduce memory usage.
- Apply gradient clipping to prevent exploding gradients.
- Implement distributed data parallel (DDP) for multi-GPU training.
- Monitor training with loss and accuracy curves.
- Adjust the learning rate dynamically using cosine annealing.
- Track metrics with custom utilities like AverageMeter.
- Evaluate models using Top-1 and Top-5 accuracy metrics.
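Two of the items above can be sketched in plain Python: an AverageMeter that tracks the running average of a metric, and the cosine-annealing rule that decays the learning rate from lr_max to lr_min over total_steps. Both are illustrative implementations, not taken from a specific library:

```python
import math

class AverageMeter:
    """Tracks the running average of a metric (e.g. loss or accuracy)."""
    def __init__(self):
        self.sum = 0.0
        self.count = 0

    def update(self, value, n=1):
        # value is the per-sample average over a batch of n samples
        self.sum += value * n
        self.count += n

    @property
    def avg(self):
        return self.sum / max(self.count, 1)

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine decay from lr_max at step 0 down to lr_min at total_steps."""
    cos = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos)

meter = AverageMeter()
meter.update(0.8, n=2)   # batch loss 0.8 averaged over 2 samples
meter.update(0.4, n=2)
print(meter.avg)         # ~0.6

print(cosine_annealing_lr(0, 100, 0.1))    # 0.1 at the start
print(cosine_annealing_lr(100, 100, 0.1))  # 0.0 at the end
```

In practice PyTorch's torch.optim.lr_scheduler.CosineAnnealingLR implements the same schedule; the function above just makes the formula visible.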
Project Structure
RDNet_Project
├── data
│ ├── train
│ └── val
├── models
│ └── rdnet.py
├── compute_stats.py
├── prepare_data.py
├── train_model.py
└── evaluate_model.py
Key files:
- compute_stats.py: Calculates mean and std for normalization.
- prepare_data.py: Organizes the dataset into train/val splits.
- train_model.py: Contains the training loop with augmentation, EMA, and optimization.
- evaluate_model.py: Tests the model and generates performance reports.