Visualizing Positive Sample Selection in YOLOv5
Positive sample selection determines which grid cells and anchors are trained to predict each ground truth box in YOLOv5. This guide shows how to visualize the selected samples at each detection scale, which makes the underlying mechanism easier to follow.
Visual Outputs
When visualized, the positive sample selection process reveals several key components:
- Original Image with Bounding Boxes: The input image with its ground truth boxes drawn in pixel coordinates.
- Grid Cell Visualization: Highlights the specific grid cells selected as positive samples on the feature maps (e.g., 80x80, 40x40, and 20x20 for a 640x640 input).
- Anchor Box Visualization: Displays the anchor boxes associated with the positive samples, both on the feature map grid and projected onto the original image.
Implementation Code
The visualization logic is integrated into the training pipeline, primarily within the loss computation function.
1. Drawing Bounding Boxes on the Original Image
To provide context, ground truth boxes are drawn onto the input image. This can be added in the data loading function (e.g., load_mosaic in dataloaders.py).
if label_data.size:
    # Convert normalized coordinates to pixel coordinates
    label_data[:, 1:] = xywhn2xyxy(label_data[:, 1:], img_w, img_h, pad_w, pad_h)
    for _, x1, y1, x2, y2 in label_data:
        cv2.rectangle(composite_img, (int(x1), int(y1)), (int(x2), int(y2)), (255, 0, 0), 2)
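To make the coordinate conversion above concrete, here is a minimal pure-Python sketch of the arithmetic involved in going from a normalized (center x, center y, width, height) box to pixel corner coordinates. The function name xywhn_to_xyxy_single is hypothetical and handles a single box; it illustrates what xywhn2xyxy is used for here and is not the library's implementation.

```python
def xywhn_to_xyxy_single(box, img_w, img_h, pad_w=0.0, pad_h=0.0):
    """Convert one normalized (xc, yc, w, h) box to pixel (x1, y1, x2, y2).

    Hypothetical helper for illustration. pad_w and pad_h shift the box,
    for example to account for the tile position inside a mosaic.
    """
    xc, yc, w, h = box
    x1 = img_w * (xc - w / 2) + pad_w  # left edge
    y1 = img_h * (yc - h / 2) + pad_h  # top edge
    x2 = img_w * (xc + w / 2) + pad_w  # right edge
    y2 = img_h * (yc + h / 2) + pad_h  # bottom edge
    return x1, y1, x2, y2


# A box centered in a 640x480 image, a quarter as wide and half as tall
print(xywhn_to_xyxy_single((0.5, 0.5, 0.25, 0.5), 640, 480))
```

The resulting corners are what cv2.rectangle expects once they are cast to integers.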
2. Core Visualization in the Loss Function
The main visualizations are generated inside the build_targets function in loss.py, with the calls placed at the point where positive samples are identified. The calls below assume that the image batch (batch_imgs) is accessible at that point.
# Visualize selected grid cells on the feature map
plot_grid_on_featuremap(grid_indices, feat_map_size)
# Visualize matched anchor boxes on the feature map
plot_anchors_on_featuremap(grid_indices, anchor_set, feat_map_size)
# Save the original input image for reference
save_original_image(batch_imgs)
# Project and visualize selected grid cells onto the original image
plot_grid_on_image(grid_indices, feat_map_size, batch_imgs)
# Project and visualize matched anchor boxes onto the original image
plot_anchors_on_image(grid_indices, anchor_set, feat_map_size, batch_imgs)
# These lists store the positive sample data for loss calculation
sample_indices.append((batch_id, anchor_id, grid_y.clamp_(0, feat_map_size - 1), grid_x.clamp_(0, feat_map_size - 1)))
target_boxes.append(torch.cat((gt_xy - grid_indices, gt_wh), 1))
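The last line stores each box center relative to its assigned grid cell rather than in absolute grid coordinates. A small pure-Python sketch of that subtraction, using made-up values and a hypothetical helper name (cell_offset), suggests why the offsets for neighboring cells can fall outside the 0 to 1 range:

```python
def cell_offset(gt_xy, cell_xy):
    """Offset of a box center from a grid cell's top-left corner, in grid units."""
    return (gt_xy[0] - cell_xy[0], gt_xy[1] - cell_xy[1])


gt_center = (12.3, 7.8)  # illustrative box center in grid units

# The cell containing the center, plus two neighbors picked for illustration
for cell in [(12, 7), (11, 7), (12, 8)]:
    dx, dy = cell_offset(gt_center, cell)
    print(cell, round(dx, 2), round(dy, 2))
```

For the cell that contains the center, both offsets lie between 0 and 1. For a neighboring cell, one offset can be negative or greater than 1.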
3. Helper Functions for Visualization
Saving the Original Image:
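def save_original_image(image_batch):
    import os
    import uuid
    import numpy as np
    from PIL import Image
    # Create the output directory if it does not exist yet
    os.makedirs('vis_results', exist_ok=True)
    # First image of the batch: CHW tensor (assumed to hold values in [0, 1]) -> HWC uint8
    img_array = image_batch[0].cpu().numpy().transpose(1, 2, 0) * 255
    img_pil = Image.fromarray(np.uint8(img_array))
    # A random suffix keeps successive calls from overwriting each other
    img_pil.save(f'vis_results/original_{uuid.uuid4().hex}.png')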
Plotting Grid Cells on Feature Map:
def plot_grid_on_featuremap(grid_inds, grid_scale):
    import os
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches
    # Create the output directory if it does not exist yet
    os.makedirs('vis_results', exist_ok=True)
    fig, axis = plt.subplots()
    # Draw grid lines
    for i in range(grid_scale + 1):
        axis.plot([0, grid_scale], [i, i], color='gray', linewidth=0.5)
        axis.plot([i, i], [0, grid_scale], color='gray', linewidth=0.5)
    axis.set_xlim(0, grid_scale)
    axis.set_ylim(grid_scale, 0)  # Flip the y-axis so the origin is at the top-left
    axis.set_aspect('equal')
    axis.xaxis.set_ticks_position('top')
    # Highlight positive grid cells (convert the index tensor to plain Python numbers first)
    for x_coord, y_coord in grid_inds.cpu().tolist():
        rect = patches.Rectangle((x_coord, y_coord), 1, 1, linewidth=1, edgecolor='red', facecolor='red', alpha=0.5)
        axis.add_patch(rect)
    plt.savefig(f"vis_results/feat_grid_{grid_scale}.png")
    plt.close()
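The plot_anchors_on_featuremap helper called in the loss function is not listed in this guide. The following is a hypothetical sketch of what it might look like, under two assumptions: anchor sizes are already in grid units, and each anchor is drawn centered on its grid cell. It takes plain Python lists (for example from tensor.cpu().tolist()), and the output file name is also an assumption.

```python
def anchor_rects_on_grid(grid_inds, anchors):
    """Return (x, y, w, h) rectangles in grid units, one per positive sample.

    Assumption: each anchor is centered on its grid cell and anchor sizes
    are already expressed in grid units.
    """
    rects = []
    for (gx, gy), (aw, ah) in zip(grid_inds, anchors):
        cx, cy = gx + 0.5, gy + 0.5  # center of the grid cell
        rects.append((cx - aw / 2, cy - ah / 2, aw, ah))
    return rects


def plot_anchors_on_featuremap(grid_inds, anchors, grid_scale):
    """Hypothetical sketch; expects plain lists rather than tensors."""
    import os
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    os.makedirs("vis_results", exist_ok=True)
    fig, axis = plt.subplots()
    axis.set_xlim(0, grid_scale)
    axis.set_ylim(grid_scale, 0)  # origin at the top-left, like an image
    axis.set_aspect("equal")
    for x, y, w, h in anchor_rects_on_grid(grid_inds, anchors):
        axis.add_patch(patches.Rectangle((x, y), w, h, linewidth=1,
                                         edgecolor="blue", facecolor="none"))
    fig.savefig(f"vis_results/feat_anchors_{grid_scale}.png")
    plt.close(fig)
```

Keeping the rectangle geometry in its own function makes it possible to check the box placement without rendering a figure.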
Plotting Anchors on Original Image:
def plot_anchors_on_image(grid_inds, anchors, grid_scale, image_batch):
    import os
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches
    from PIL import Image
    import numpy as np
    # Create the output directory if it does not exist yet
    os.makedirs('vis_results', exist_ok=True)
    img_array = image_batch[0].cpu().numpy().transpose(1, 2, 0) * 255
    img_pil = Image.fromarray(np.uint8(img_array))
    img_w, img_h = img_pil.size
    # Pixels per grid cell (assumes a square image and a square feature map)
    cell_size = img_w / grid_scale
    fig, axis = plt.subplots()
    axis.imshow(img_pil)
    # Draw image grid
    for i in range(grid_scale + 1):
        axis.plot([0, img_w], [i * cell_size, i * cell_size], color='gray', linewidth=0.2)
        axis.plot([i * cell_size, i * cell_size], [0, img_h], color='gray', linewidth=0.2)
    axis.set_xlim(0, img_w)
    axis.set_ylim(img_h, 0)
    axis.xaxis.set_ticks_position('top')
    # Draw anchor boxes (sizes given in grid units), each centered on its grid cell
    for (gx, gy), (aw, ah) in zip(grid_inds.cpu().tolist(), anchors.cpu().tolist()):
        box_center_x = (gx + 0.5) * cell_size
        box_center_y = (gy + 0.5) * cell_size
        box_x = box_center_x - (aw * cell_size) / 2
        box_y = box_center_y - (ah * cell_size) / 2
        rect = patches.Rectangle((box_x, box_y), aw * cell_size, ah * cell_size, linewidth=1, edgecolor='red', facecolor='none')
        axis.add_patch(rect)
    plt.savefig(f"vis_results/img_anchors_{grid_scale}.png")
    plt.close()
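The plot_grid_on_image helper called in the loss function is also not listed here. A hypothetical sketch follows, assuming the selected cells are simply scaled from grid units to pixels. Unlike the call in the loss function, this version takes an already converted HWC image array instead of the raw batch tensor, and the output file name is an assumption.

```python
def cell_rects_on_image(grid_inds, grid_scale, img_w, img_h):
    """Return pixel-space (x, y, w, h) rectangles for the selected grid cells."""
    cell_w = img_w / grid_scale
    cell_h = img_h / grid_scale
    return [(gx * cell_w, gy * cell_h, cell_w, cell_h) for gx, gy in grid_inds]


def plot_grid_on_image(grid_inds, grid_scale, image):
    """Hypothetical sketch; image is an HWC array, grid_inds a plain list."""
    import os
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    os.makedirs("vis_results", exist_ok=True)
    img_h, img_w = np.asarray(image).shape[:2]
    fig, axis = plt.subplots()
    axis.imshow(image)
    for x, y, w, h in cell_rects_on_image(grid_inds, grid_scale, img_w, img_h):
        axis.add_patch(patches.Rectangle((x, y), w, h, linewidth=1,
                                         edgecolor="red", facecolor="red",
                                         alpha=0.5))
    fig.savefig(f"vis_results/img_grid_{grid_scale}.png")
    plt.close(fig)
```

On a 640x640 image with a 20x20 feature map, each highlighted cell covers a 32x32 pixel patch.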
Execution
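Run the training script with a batch size of 1. The helpers only draw the first image of each batch, so a single-image batch ensures that all plotted grid cells and anchors belong to the image that is saved.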
python train.py --data dataset.yaml --weights yolov5s.pt --batch-size 1
Visualizing the Selection Rules
The generated images directly illustrate YOLOv5's core positive sample assignment strategies:
- Multiple Anchors per Cell: A single ground truth box can match multiple anchors within the same grid cell, as long as its width and height are each within a fixed ratio of the anchor's (the anchor_t hyperparameter, 4.0 in the default configuration). This is visible as dense clusters of anchor boxes around objects.
- Adjacent Grid Prediction: A ground truth box is assigned not only to the grid cell containing its center but also to the two nearest neighboring cells. In the visualizations, selected grid cells therefore typically appear in groups of three, with fewer at the image border.
- Multi-Scale Prediction: A single ground truth object can match anchors across multiple detection layers (e.g., P3, P4, P5). The visualizations will show the same object being assigned positive samples at different feature map resolutions (80x80, 40x40, 20x20).
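The first two rules above can be restated as a simplified, scalar pure-Python sketch. The function names are hypothetical, the 4.0 threshold and 0.5 neighbor bias are assumed defaults, and the anchor sizes are illustrative values in grid units. The actual build_targets works on whole tensors at once, so this is only meant to make the logic easy to trace by hand.

```python
def matching_anchors(gt_wh, anchors, anchor_t=4.0):
    """Indices of anchors whose width and height ratios to the box stay below anchor_t."""
    matched = []
    for idx, (aw, ah) in enumerate(anchors):
        rw, rh = gt_wh[0] / aw, gt_wh[1] / ah
        # Worst-case ratio in either direction (box larger or smaller than anchor)
        worst = max(rw, 1 / rw, rh, 1 / rh)
        if worst < anchor_t:
            matched.append(idx)
    return matched


def positive_cells(gx, gy, grid_size, g=0.5):
    """Grid cells assigned to a box center (gx, gy), given in grid units."""
    cells = [(int(gx), int(gy))]  # cell containing the center
    if gx % 1 < g and gx > 1:  # center in the left half: add the left neighbor
        cells.append((int(gx - g), int(gy)))
    if gy % 1 < g and gy > 1:  # center in the top half: add the upper neighbor
        cells.append((int(gx), int(gy - g)))
    ix, iy = grid_size - gx, grid_size - gy  # distances from the far edges
    if ix % 1 < g and ix > 1:  # center in the right half: add the right neighbor
        cells.append((int(gx + g), int(gy)))
    if iy % 1 < g and iy > 1:  # center in the bottom half: add the lower neighbor
        cells.append((int(gx), int(gy + g)))
    return cells


# Illustrative anchors in grid units and a wide, flat box
anchors = [(1.25, 1.625), (2.0, 3.75), (4.125, 2.875)]
print(matching_anchors((6.0, 2.0), anchors))
print(positive_cells(12.3, 7.8, 20))
```

In this example the smallest anchor is rejected because the box is more than four times as wide, and the center at (12.3, 7.8) selects its own cell plus the left and lower neighbors.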