Fading Coder

Visualizing Positive Sample Selection in YOLOv5

Tech · May 12

Positive sample selection determines which grid cells and anchors are trained to predict each ground truth box, so understanding it is central to understanding how YOLOv5 learns. This guide shows how to visualize the selected samples at each detection scale to make the underlying mechanism visible.

Visual Outputs

When visualized, the positive sample selection process reveals several key components:

  • Original Image with Bounding Boxes: The base image with the ground truth annotations drawn on it.
  • Grid Cell Visualization: Highlights the specific grid cells selected as positive samples on the feature maps (e.g., 80x80, 40x40, and 20x20 for a 640x640 input).
  • Anchor Box Visualization: Displays the anchor boxes associated with the positive samples, both on the feature map grid and projected onto the original image.

Implementation Code

The visualization logic is integrated into the training pipeline, primarily within the loss computation function.

1. Drawing Bounding Boxes on the Original Image

To provide context, ground truth boxes are drawn onto the input image. This can be added in the data loading function (e.g., load_mosaic in dataloaders.py).

if label_data.size:
    # Convert normalized coordinates to pixel coordinates
    label_data[:, 1:] = xywhn2xyxy(label_data[:, 1:], img_w, img_h, pad_w, pad_h)
    for _, x1, y1, x2, y2 in label_data:
        cv2.rectangle(composite_img, (int(x1), int(y1)), (int(x2), int(y2)), (255, 0, 0), 2)
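
The conversion step in this snippet can be sketched in plain Python. This is a minimal illustration of the arithmetic that xywhn2xyxy is understood to perform: scale the normalized center and size to pixels, then add the mosaic padding. The function name xywhn_to_xyxy and the sample values are hypothetical and not part of YOLOv5.

```python
def xywhn_to_xyxy(box, img_w, img_h, pad_w=0, pad_h=0):
    """Convert one normalized (x_center, y_center, width, height) box
    to pixel (x1, y1, x2, y2) corners, shifted by the given padding."""
    xc, yc, bw, bh = box
    x1 = img_w * (xc - bw / 2) + pad_w
    y1 = img_h * (yc - bh / 2) + pad_h
    x2 = img_w * (xc + bw / 2) + pad_w
    y2 = img_h * (yc + bh / 2) + pad_h
    return x1, y1, x2, y2


# A box centered in a 640x640 tile and covering half of each dimension
corners = xywhn_to_xyxy((0.5, 0.5, 0.5, 0.5), 640, 640)
```

The resulting corner values are what cv2.rectangle needs once they are cast to integers.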

2. Core Visualization in the Loss Function

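The main visualizations are generated inside the build_targets function within loss.py, with the calls placed where positive samples are identified. Because several helpers draw on the input image, the image batch (batch_imgs below) has to be passed into build_targets as an extra argument; the stock function does not receive it.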

# Visualize selected grid cells on the feature map
plot_grid_on_featuremap(grid_indices, feat_map_size)

# Visualize matched anchor boxes on the feature map
plot_anchors_on_featuremap(grid_indices, anchor_set, feat_map_size)

# Save the original input image for reference
save_original_image(batch_imgs)

# Project and visualize selected grid cells onto the original image
plot_grid_on_image(grid_indices, feat_map_size, batch_imgs)

# Project and visualize matched anchor boxes onto the original image
plot_anchors_on_image(grid_indices, anchor_set, feat_map_size, batch_imgs)

# These lists store the positive sample data for loss calculation
sample_indices.append((batch_id, anchor_id, grid_y.clamp_(0, feat_map_size - 1), grid_x.clamp_(0, feat_map_size - 1)))
target_boxes.append(torch.cat((gt_xy - grid_indices, gt_wh), 1))
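
The second appended value holds the regression target: the offset of the ground truth center from the top-left corner of each selected cell, followed by the box size, all in grid units. A small sketch of that subtraction, assuming the center and the selected cells are already known (the helper name box_targets and the numbers are illustrative only):

```python
def box_targets(gt_xy, gt_wh, cells):
    """Return (dx, dy, w, h) for each selected cell, in grid units.

    gt_xy: ground truth center (x, y) on the feature map
    gt_wh: ground truth width and height on the feature map
    cells: list of integer (x, y) indices of the selected grid cells
    """
    gx, gy = gt_xy
    return [(gx - cx, gy - cy, gt_wh[0], gt_wh[1]) for cx, cy in cells]


# Center at (12.25, 7.75): the containing cell plus its left and lower neighbors
targets = box_targets((12.25, 7.75), (3.0, 2.0), [(12, 7), (11, 7), (12, 8)])
```

For the containing cell the offset lies in [0, 1). For a neighboring cell it falls outside that range (here 1.25 and -0.25), so the offsets overall span roughly -0.5 to 1.5.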

3. Helper Functions for Visualization

Saving the Original Image:

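def save_original_image(image_batch):
    import os
    import uuid
    import numpy as np
    from PIL import Image
    os.makedirs('vis_results', exist_ok=True)  # save() fails if the folder is missing
    # First image of the batch; assumes CHW float values in [0, 1] -> HWC in [0, 255]
    img_array = image_batch[0].cpu().numpy().transpose(1, 2, 0) * 255
    img_pil = Image.fromarray(np.uint8(img_array.clip(0, 255)))
    # The random suffix keeps images from different steps from overwriting each other
    img_pil.save(f'vis_results/original_{uuid.uuid4().hex}.png')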

Plotting Grid Cells on Feature Map:

def plot_grid_on_featuremap(grid_inds, grid_scale):
    import os
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches
    os.makedirs('vis_results', exist_ok=True)  # savefig fails if the folder is missing
    fig, axis = plt.subplots()
    # Draw grid lines
    for i in range(grid_scale + 1):
        axis.plot([0, grid_scale], [i, i], color='gray', linewidth=0.5)
        axis.plot([i, i], [0, grid_scale], color='gray', linewidth=0.5)
    axis.set_xlim(0, grid_scale)
    axis.set_ylim(grid_scale, 0)  # invert y so row 0 is at the top, as in image coordinates
    axis.set_aspect('equal')
    axis.xaxis.set_ticks_position('top')
    # Highlight positive grid cells; each row of grid_inds is an (x, y) cell index
    for x_coord, y_coord in grid_inds.cpu().tolist():
        rect = patches.Rectangle((x_coord, y_coord), 1, 1, linewidth=1, edgecolor='red', facecolor='red', alpha=0.5)
        axis.add_patch(rect)
    # The file name depends only on the scale, so each call overwrites the previous plot
    plt.savefig(f"vis_results/feat_grid_{grid_scale}.png")
    plt.close(fig)
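
The loss function snippet also calls plot_anchors_on_featuremap, which is not listed in this article. The following is one possible sketch, not the original implementation. It assumes the anchors are given in grid units and aligned row by row with the grid indices, and it centers each anchor on the middle of its cell. The anchor_rects helper, the out_dir parameter, and the output file name are hypothetical.

```python
def anchor_rects(cells, anchors):
    """Return (x, y, w, h) rectangles in grid units, one per positive sample.

    cells:   list of integer (x, y) grid cell indices
    anchors: list of (w, h) anchor sizes in grid units, aligned with cells
    Each rectangle is centered on the middle of its cell (an assumption).
    """
    rects = []
    for (cx, cy), (aw, ah) in zip(cells, anchors):
        rects.append((cx + 0.5 - aw / 2, cy + 0.5 - ah / 2, aw, ah))
    return rects


def plot_anchors_on_featuremap(grid_inds, anchors, grid_scale, out_dir="vis_results"):
    # Imports are local so the geometry helper above works without matplotlib
    import os
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    os.makedirs(out_dir, exist_ok=True)
    fig, axis = plt.subplots()
    for i in range(grid_scale + 1):
        axis.plot([0, grid_scale], [i, i], color="gray", linewidth=0.5)
        axis.plot([i, i], [0, grid_scale], color="gray", linewidth=0.5)
    axis.set_xlim(0, grid_scale)
    axis.set_ylim(grid_scale, 0)  # row 0 at the top
    axis.set_aspect("equal")
    # Accept tensors (as passed from build_targets) or plain lists
    cells = grid_inds.cpu().tolist() if hasattr(grid_inds, "cpu") else grid_inds
    sizes = anchors.cpu().tolist() if hasattr(anchors, "cpu") else anchors
    for x, y, w, h in anchor_rects(cells, sizes):
        axis.add_patch(patches.Rectangle((x, y), w, h, linewidth=1,
                                         edgecolor="blue", facecolor="none"))
    fig.savefig(os.path.join(out_dir, f"feat_anchors_{grid_scale}.png"))
    plt.close(fig)
```

Keeping the rectangle arithmetic in a separate function makes it easy to check the geometry without rendering anything.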

Plotting Anchors on Original Image:

def plot_anchors_on_image(grid_inds, anchors, grid_scale, image_batch):
    import os
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches
    from PIL import Image
    import numpy as np
    os.makedirs('vis_results', exist_ok=True)  # savefig fails if the folder is missing
    # First image of the batch; assumes CHW float values in [0, 1]
    img_array = image_batch[0].cpu().numpy().transpose(1, 2, 0) * 255
    img_pil = Image.fromarray(np.uint8(img_array.clip(0, 255)))
    img_w, img_h = img_pil.size
    # Pixel size of one grid cell (assumes a square input image)
    cell_size = img_w / grid_scale
    fig, axis = plt.subplots()
    axis.imshow(img_pil)
    # Draw image grid
    for i in range(grid_scale + 1):
        axis.plot([0, img_w], [i * cell_size, i * cell_size], color='gray', linewidth=0.2)
        axis.plot([i * cell_size, i * cell_size], [0, img_h], color='gray', linewidth=0.2)
    axis.set_xlim(0, img_w)
    axis.set_ylim(img_h, 0)
    axis.xaxis.set_ticks_position('top')
    # Draw anchor boxes; anchors are expected in grid units, so scale them by cell_size
    for (gx, gy), (aw, ah) in zip(grid_inds.cpu().tolist(), anchors.cpu().tolist()):
        # Center each anchor on the middle of its grid cell rather than on the cell corner
        box_center_x = (gx + 0.5) * cell_size
        box_center_y = (gy + 0.5) * cell_size
        box_x = box_center_x - (aw * cell_size) / 2
        box_y = box_center_y - (ah * cell_size) / 2
        rect = patches.Rectangle((box_x, box_y), aw * cell_size, ah * cell_size, linewidth=1, edgecolor='red', facecolor='none')
        axis.add_patch(rect)
    plt.savefig(f"vis_results/img_anchors_{grid_scale}.png")
    plt.close(fig)
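
The remaining helper called from the loss function, plot_grid_on_image, is not listed in this article either. A possible sketch follows, under the same assumptions as the other helpers: the first image of the batch is a CHW float tensor in [0, 1], and each row of grid_inds is an (x, y) cell index. The cell_to_pixels helper, the out_dir parameter, and the output file name are hypothetical.

```python
def cell_to_pixels(cell, grid_scale, img_w, img_h):
    """Map an integer (x, y) grid cell to its (x, y, w, h) pixel rectangle."""
    cell_w = img_w / grid_scale
    cell_h = img_h / grid_scale
    return (cell[0] * cell_w, cell[1] * cell_h, cell_w, cell_h)


def plot_grid_on_image(grid_inds, grid_scale, image_batch, out_dir="vis_results"):
    # Imports are local so the geometry helper above works without matplotlib
    import os
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    os.makedirs(out_dir, exist_ok=True)
    # First image of the batch, CHW float in [0, 1] -> HWC for imshow
    img = image_batch[0].float().cpu().numpy().transpose(1, 2, 0).clip(0, 1)
    img_h, img_w = img.shape[:2]
    fig, axis = plt.subplots()
    axis.imshow(img)
    # Shade every selected cell at its projected position on the image
    for cell in grid_inds.cpu().tolist():
        x, y, w, h = cell_to_pixels(cell, grid_scale, img_w, img_h)
        axis.add_patch(patches.Rectangle((x, y), w, h, linewidth=1,
                                         edgecolor="red", facecolor="red", alpha=0.4))
    axis.set_xlim(0, img_w)
    axis.set_ylim(img_h, 0)
    fig.savefig(os.path.join(out_dir, f"img_grid_{grid_scale}.png"))
    plt.close(fig)
```

On a 640x640 input, a cell on the 20x20 map covers a 32x32 pixel patch, which is why the coarse scale produces the largest highlighted regions.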

Execution

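Run the training script with a batch size of 1 to generate the visualizations for each step. The helpers draw only the first image of the batch, so a batch size of 1 keeps every plotted cell and anchor matched to the image shown.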

python train.py --data dataset.yaml --weights yolov5s.pt --batch-size 1

Visualizing the Selection Rules

The generated images directly illustrate YOLOv5's core positive sample assignment strategies:

  1. Multiple Anchors per Cell: A single ground truth box can match several anchors within the same grid cell. An anchor matches when the box-to-anchor ratios of both width and height stay within the anchor_t threshold (4.0 in the default hyperparameters). This is visible as dense clusters of anchor boxes around objects.
  2. Adjacent Grid Prediction: A ground truth box is assigned not only to the grid cell containing its center but also to the nearest horizontal and vertical neighboring cells, where those cells exist. Selected grid cells therefore typically appear in L-shaped groups of three, with fewer near the image border.
  3. Multi-Scale Prediction: A single ground truth object can match anchors across multiple detection layers (e.g., P3, P4, P5). The visualizations will show the same object being assigned positive samples at different feature map resolutions (80x80, 40x40, and 20x20 for a 640x640 input).
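
The three rules above can be approximated in a few lines of plain Python. This is a simplified sketch of the logic rather than the code in loss.py. The anchor sizes are the default yolov5s values, a 640x640 input is assumed, and the function names and the sample box are hypothetical.

```python
# Default yolov5s anchors in pixels (w, h), keyed by layer name with its stride
LAYERS = {
    "P3": (8, [(10, 13), (16, 30), (33, 23)]),
    "P4": (16, [(30, 61), (62, 45), (59, 119)]),
    "P5": (32, [(116, 90), (156, 198), (373, 326)]),
}


def matches_anchor(gt_wh, anchor_wh, anchor_t=4.0):
    """Rule 1: keep an anchor if neither side differs by a factor >= anchor_t."""
    worst = 0.0
    for g, a in zip(gt_wh, anchor_wh):
        r = g / a
        worst = max(worst, r, 1 / r)
    return worst < anchor_t


def select_cells(gx, gy, grid_size, bias=0.5):
    """Rule 2: the cell holding the center plus its nearest neighbors (normally two).

    gx, gy are the box center in grid units for one detection layer.
    """
    cx, cy = int(gx), int(gy)
    cells = [(cx, cy)]
    if gx % 1 < bias and gx > 1:             # center in left half -> left cell
        cells.append((cx - 1, cy))
    if gy % 1 < bias and gy > 1:             # center in upper half -> cell above
        cells.append((cx, cy - 1))
    ix, iy = grid_size - gx, grid_size - gy  # distances from the far edges
    if ix % 1 < bias and ix > 1:             # center in right half -> right cell
        cells.append((cx + 1, cy))
    if iy % 1 < bias and iy > 1:             # center in lower half -> cell below
        cells.append((cx, cy + 1))
    return cells


def assign(gt_box_px, img_size=640):
    """Rule 3: repeat rules 1 and 2 independently on every detection layer."""
    x, y, w, h = gt_box_px  # center and size in pixels
    result = {}
    for name, (stride, anchors) in LAYERS.items():
        # The ratio is the same in pixels and in grid units, so pixels are used here
        ids = [i for i, a in enumerate(anchors) if matches_anchor((w, h), a)]
        if ids:
            cells = select_cells(x / stride, y / stride, img_size // stride)
            result[name] = {"anchors": ids, "cells": cells}
    return result


# Hypothetical 100x80 pixel box centered at (197, 122)
assignment = assign((197, 122, 100, 80))
```

For this sample box the sketch matches one anchor on P3, three on P4, and two on P5, each with three cells, which gives 18 positive samples for a single object.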
