Fading Coder

Object Detection with MMDetection: Setup, Inference, and Custom Training

Tech Apr 19 10

MMDetection is a comprehensive deep learning toolkit for object detection and instance segmentation. It ships with more than 440 pre-trained models and implementations of more than 60 academic papers. The framework supports a diverse range of architectures, including two-stage, single-stage, cascade, anchor-free, and transformer-based detectors, and it provides streamlined utilities for training, testing, and inference.

Environment Configuration

The OpenMMLab ecosystem relies on foundational libraries like MMEngine and MMCV. These must be installed prior to setting up MMDetection.

pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"

Install the MMDetection package using MIM:

mim install mmdet

Verify the installation:

import mmdet
print(mmdet.__version__)
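
If the import fails, or the three libraries turn out to be mismatched, it can help to list what is actually installed. The following is a minimal sketch that uses only the standard library. The helper name installed_versions is hypothetical, and the sketch assumes the pip distribution names are mmengine, mmcv, and mmdet.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("mmengine", "mmcv", "mmdet")):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not present in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

A None entry means the corresponding mim install step has not been run in the active environment.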

Model Zoo and Inference

MMDetection offers out-of-the-box inference via Python APIs. You can search for models using the MIM tool:

mim search mmdet --model "mask r-cnn"

Download a specific configuration and its corresponding weights:

mim download mmdet --config mask-rcnn_r50_fpn_2x_coco --dest ./weights

Perform inference on an image using the downloaded assets:

import mmcv
from mmdet.apis import init_detector, inference_detector
from mmdet.registry import VISUALIZERS

# Both files were saved to ./weights by the download step above
cfg_path = 'weights/mask-rcnn_r50_fpn_2x_coco.py'
weights_path = 'weights/mask_rcnn_r50_fpn_2x_coco_bbox_mAP-0.392__segm_mAP-0.354_20200505_003907-3e542a40.pth'

detector = init_detector(cfg_path, weights_path, device='cpu')
detections = inference_detector(detector, 'sample_image.jpg')

vis = VISUALIZERS.build(detector.cfg.visualizer)
vis.dataset_meta = detector.dataset_meta

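# mmcv.imread returns BGR, while the visualizer expects RGB
input_img = mmcv.imread('sample_image.jpg')
input_img = mmcv.imconvert(input_img, 'bgr', 'rgb')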
vis.add_datasample(
    name='prediction',
    image=input_img,
    data_sample=detections,
    draw_gt=False,
    pred_score_thr=0.3,
    show=False,
    out_file='output_visualization.png'
)
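
The pred_score_thr=0.3 argument hides low-confidence detections in the rendered image. In MMDetection 3.x the raw predictions are typically exposed on detections.pred_instances as scores, labels, and bboxes tensors. The sketch below applies the same filtering idea to plain Python lists so that it runs without the library; the function name filter_by_score and the sample values are made up for illustration.

```python
def filter_by_score(bboxes, scores, labels, score_thr=0.3):
    """Keep only the detections whose score is above score_thr."""
    return [
        (box, score, label)
        for box, score, label in zip(bboxes, scores, labels)
        if score > score_thr
    ]


# Hypothetical predictions; boxes are [x1, y1, x2, y2] in pixels.
sample_bboxes = [[10, 20, 110, 220], [5, 5, 50, 50], [200, 80, 320, 240]]
sample_scores = [0.92, 0.12, 0.45]
sample_labels = [0, 16, 2]

kept = filter_by_score(sample_bboxes, sample_scores, sample_labels)
for box, score, label in kept:
    print(f"label={label} score={score:.2f} box={box}")
```

Raising the threshold trades recall for fewer false positives, so it is worth tuning per dataset rather than treating 0.3 as fixed.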

Configuration System

Deep learning experiments require defining several components: model architecture, dataset pipelines, training schedules (optimizers, learning rates, epochs), runtime environments (GPUs, distributed setups), and hooks (logging, checkpointing). In MMDetection, all these elements are consolidated into a single Python configuration file.

Key fields include:

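  • model: Defines the network structure.
  • train_dataloader, val_dataloader, and test_dataloader: Specify dataset paths, batch sizes, and augmentation pipelines (MMDetection 2.x grouped these under data).
  • optim_wrapper and param_scheduler: Manage the optimizer and the learning-rate schedule (optimizer and lr_config in MMDetection 2.x).
  • load_from: Points to pre-trained weight files (.pth files storing PyTorch parameters).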
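
To make the layout concrete, here is a trimmed skeleton of what such a file can look like, assuming the MMDetection 3.x field names (the release line that pairs with MMEngine and MMCV 2.x). The values are illustrative and heavily abbreviated, so this is not a config that would build a working model on its own.

```python
# Network structure (real configs also define neck, rpn_head, roi_head, ...).
model = dict(
    type='MaskRCNN',
    backbone=dict(type='ResNet', depth=50),
)

# Dataset location and loading; the paths here are placeholders.
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    dataset=dict(
        type='CocoDataset',
        data_root='data/coco/',
        ann_file='annotations/instances_train2017.json',
        data_prefix=dict(img='train2017/'),
    ),
)

# Optimizer and learning-rate schedule.
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001),
)
param_scheduler = [
    dict(type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
    dict(type='MultiStepLR', by_epoch=True, begin=0, end=24,
         milestones=[16, 22], gamma=0.1),
]
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=24, val_interval=1)

# Hooks for checkpointing and logging.
default_hooks = dict(
    checkpoint=dict(type='CheckpointHook', interval=1),
    logger=dict(type='LoggerHook', interval=50),
)

# Checkpoint to initialize from; None means start from scratch.
load_from = None
```

Because a config is ordinary Python, every field is a plain dict, list, or scalar that the framework reads after executing the file.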

Custom Training and Fine-tuning

Custom training typically involves fine-tuning a model pre-trained on a dataset such as COCO. Because the model starts from converged weights, the learning rate is usually reduced so that large updates do not overwrite the pre-trained features, a failure mode known as catastrophic forgetting.
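
One common heuristic for choosing the reduced rate is the linear scaling rule: scale the reference learning rate by the ratio of your total batch size to the batch size the reference was tuned for. The sketch below assumes a reference of lr=0.02 at a total batch size of 16; check your own base config, since that pairing is an assumption here. The helper scaled_lr is hypothetical.

```python
def scaled_lr(base_lr, base_batch_size, batch_size):
    """Linear scaling rule: lr changes in proportion to the total batch size."""
    if base_batch_size <= 0 or batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * batch_size / base_batch_size


# Assumed reference setup: lr tuned for 8 GPUs x 2 images = batch size 16.
reference_lr = 0.02
reference_batch = 16

# Fine-tuning on a single GPU with 2 images per batch.
finetune_lr = scaled_lr(reference_lr, reference_batch, batch_size=2)
print(finetune_lr)
```

This only corrects for batch size. When fine-tuning on a small dataset it is often reasonable to go lower still and compare validation metrics across a few candidate rates.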

To avoid duplicating configurations, MMDetection uses an inheritance mechanism. A custom config can inherit from a base config:

# custom_detector.py
_base_ = 'mask-rcnn_r50_fpn_2x_coco.py'

When loaded, the framework parses the base configuration and merges it with the custom settings.
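
The merge can be pictured as a recursive dictionary update. The following is a simplified model of that behaviour in plain Python, not the framework's actual implementation; merge_config, the field values, and the checkpoint path are all made up for illustration.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`.

    Nested dicts are merged key by key; any other value in `override`
    replaces the corresponding value in `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in for a few fields of the base config (values are illustrative).
base_cfg = {
    "model": {"roi_head": {"bbox_head": {"type": "Shared2FCBBoxHead",
                                         "num_classes": 80}}},
    "optim_wrapper": {"optimizer": {"type": "SGD", "lr": 0.02, "momentum": 0.9}},
    "load_from": None,
}

# What a custom config might override for a 3-class dataset.
custom_cfg = {
    "model": {"roi_head": {"bbox_head": {"num_classes": 3}}},
    "optim_wrapper": {"optimizer": {"lr": 0.0025}},
    "load_from": "weights/pretrained.pth",  # hypothetical path
}

final_cfg = merge_config(base_cfg, custom_cfg)
print(final_cfg["model"]["roi_head"]["bbox_head"])
print(final_cfg["optim_wrapper"]["optimizer"])
```

Only the overridden leaves change; sibling keys such as type and momentum are inherited from the base. The real loader has extra behaviour that this sketch leaves out, for example a _delete_=True key that replaces an inherited dict instead of merging into it.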

# With MMCV 2.x, the Config class lives in MMEngine
from mmengine.config import Config

parsed_cfg = Config.fromfile('custom_detector.py')
print(parsed_cfg.pretty_text)

Launch the training job using the MIM command-line interface:

mim train mmdet custom_detector.py

COCO Dataset Format

When preparing custom data, the COCO format is widely adopted. It consists of a JSON annotation file containing three primary keys:

  • images: Metadata for all images in the dataset.
  • annotations: Bounding boxes, segmentation masks, and labels for every object instance.
  • categories: Class definitions and mapping IDs.
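
As an illustration of how the three keys relate, the sketch below builds a one-image annotation dict and runs a few basic consistency checks on it. The file name, class name, and the check_coco helper are hypothetical; the bbox convention of [x, y, width, height] follows the COCO format.

```python
import json

coco = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # refers to images[].id
            "category_id": 1,    # refers to categories[].id
            "bbox": [100.0, 120.0, 200.0, 150.0],  # [x, y, width, height]
            "area": 30000.0,
            # One polygon as a flat list of x, y vertex coordinates.
            "segmentation": [[100, 120, 300, 120, 300, 270, 100, 270]],
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "balloon"}],
}


def check_coco(dataset):
    """Return a list of problems found by a few basic consistency checks."""
    problems = [
        f"missing key: {key}"
        for key in ("images", "annotations", "categories")
        if key not in dataset
    ]
    if problems:
        return problems
    image_ids = {img["id"] for img in dataset["images"]}
    category_ids = {cat["id"] for cat in dataset["categories"]}
    for ann in dataset["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        _, _, width, height = ann["bbox"]
        if width <= 0 or height <= 0:
            problems.append(f"annotation {ann['id']}: empty bbox")
    return problems


print(check_coco(coco))
annotation_json = json.dumps(coco, indent=2)  # text to save as the .json file
```

Running a check like this before training tends to surface dangling IDs early, which are otherwise easy to miss in a large annotation file.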
