Object Detection with MMDetection: Setup, Inference, and Custom Training

MMDetection is a comprehensive deep learning toolkit designed for object detection and instance segmentation. It features an extensive library of over 440 pre-trained models and reproduces more than 60 academic papers. The framework supports a diverse range of architectures, including two-stage, single-stage, cascade, anchor-free, and transformer-based detectors. It provides streamlined utilities for training, testing, and inference.

Environment Configuration

The OpenMMLab ecosystem relies on foundational libraries like MMEngine and MMCV. These must be installed prior to setting up MMDetection.

```bash
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
```

Install the MMDetection package using MIM:

```bash
mim install mmdet
```

Verify the installation:

```python
import mmdet

# __version__ holds the version string; mmdet.version is a module, not the string
print(mmdet.__version__)
```

Model Zoo and Inference

MMDetection offers out-of-the-box inference via Python APIs. You can search for models using the MIM tool:

```bash
mim search mmdet --model "mask r-cnn"
```

Download a specific configuration and its corresponding weights:

```bash
mim download mmdet --config mask-rcnn_r50_fpn_2x_coco --dest ./weights
```

Perform inference on an image using the downloaded assets:

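```python
import mmcv
from mmdet.apis import init_detector, inference_detector
from mmdet.registry import VISUALIZERS

# Both files were saved to ./weights by the `mim download` command above
cfg_path = 'weights/mask-rcnn_r50_fpn_2x_coco.py'
weights_path = 'weights/mask_rcnn_r50_fpn_2x_coco_bbox_mAP-0.392__segm_mAP-0.354_20200505_003907-3e542a40.pth'

# Build the model and run it on a single image
detector = init_detector(cfg_path, weights_path, device='cpu')
detections = inference_detector(detector, 'sample_image.jpg')

# Build the visualizer from the config and give it the class names
vis = VISUALIZERS.build(detector.cfg.visualizer)
vis.dataset_meta = detector.dataset_meta

# mmcv.imread returns BGR; the visualizer expects RGB
input_img = mmcv.imread('sample_image.jpg')
input_img = mmcv.imconvert(input_img, 'bgr', 'rgb')

vis.add_datasample(
    name='prediction',
    image=input_img,
    data_sample=detections,
    draw_gt=False,
    pred_score_thr=0.3,  # only draw predictions above this confidence
    show=False,
    out_file='output_visualization.png',
)
```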
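The `pred_score_thr=0.3` argument only affects what is drawn. If you need the filtered predictions as data, the same thresholding can be sketched in plain Python. The helper name `filter_by_score` and the sample values below are hypothetical; they stand in for the `bboxes`, `scores`, and `labels` fields of `detections.pred_instances`.

```python
def filter_by_score(bboxes, scores, labels, score_thr=0.3):
    """Keep only predictions whose confidence exceeds score_thr."""
    kept = []
    for box, score, label in zip(bboxes, scores, labels):
        if score > score_thr:
            kept.append({
                "bbox": [float(v) for v in box],  # [x1, y1, x2, y2]
                "score": float(score),
                "label": int(label),
            })
    return kept


# Hypothetical values standing in for a real detection result
example_boxes = [[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 50.0, 60.0]]
example_scores = [0.92, 0.12]
example_labels = [0, 16]

kept = filter_by_score(example_boxes, example_scores, example_labels)
print(kept)
```

With a real result, the inputs could be obtained by calling `.tolist()` on `detections.pred_instances.bboxes`, `.scores`, and `.labels`.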

Configuration System

Deep learning experiments require defining several components: model architecture, dataset pipelines, training schedules (optimizers, learning rates, epochs), runtime environments (GPUs, distributed setups), and hooks (logging, checkpointing). In MMDetection, all these elements are consolidated into a single Python configuration file.

Key fields include:

  • model: Defines the network structure.
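  • train_dataloader, val_dataloader, and test_dataloader: Specify dataset paths and augmentation pipelines (these replace the single data field used in MMDetection 2.x).
  • optim_wrapper and param_scheduler: Manage the optimizer and learning-rate schedule (these replace optimizer and lr_config from MMDetection 2.x).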
  • load_from: Points to pre-trained weight files (.pth files storing PyTorch parameters).

Custom Training and Fine-tuning

Custom training typically involves fine-tuning a model pre-trained on a dataset such as COCO. Because the pre-trained weights have already converged, the learning rate is usually lowered so that early updates do not overwrite the learned features (catastrophic forgetting).

To avoid duplicating configurations, MMDetection uses an inheritance mechanism. A custom config can inherit from a base config:

```python
# custom_detector.py
# Path is resolved relative to this file
_base_ = 'mask-rcnn_r50_fpn_2x_coco.py'
```

When loaded, the framework parses the base configuration and merges it with the custom settings.

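```python
# In the MMEngine-based stack (mmcv>=2.0.0), Config lives in mmengine, not mmcv
from mmengine.config import Config

parsed_cfg = Config.fromfile('custom_detector.py')
print(parsed_cfg.pretty_text)  # fully merged config as formatted text
```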
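To make the merge behaviour concrete, here is a simplified sketch of a recursive dictionary merge in plain Python. It is not the framework's actual implementation, and the function name `merge_config` and the sample dictionaries are hypothetical; it only illustrates the idea that nested keys in the custom config override matching keys in the base while the rest is inherited.

```python
def merge_config(base, override):
    """Return a new dict where override's keys are merged into base recursively."""
    merged = dict(base)  # shallow copy so the base dict itself is not modified
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, or new keys: the override wins
            merged[key] = value
    return merged


# Hypothetical stand-ins for a base config and a custom config
base_cfg = {
    "optim_wrapper": {"optimizer": {"type": "SGD", "lr": 0.02, "momentum": 0.9}},
    "load_from": None,
}
custom_cfg = {
    "optim_wrapper": {"optimizer": {"lr": 0.002}},
    "load_from": "weights/checkpoint.pth",
}

merged_cfg = merge_config(base_cfg, custom_cfg)
print(merged_cfg)
```

In this sketch the optimizer type and momentum survive from the base while only the learning rate and checkpoint path change, which mirrors what `pretty_text` shows for an inherited config.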

Launch the training job using the MIM command-line interface:

```bash
mim train mmdet custom_detector.py
```

COCO Dataset Format

When preparing custom data, the COCO format is widely adopted. It consists of a JSON annotation file containing three primary keys:

  • images: Metadata for all images in the dataset.
  • annotations: Bounding boxes, segmentation masks, and labels for every object instance.
  • categories: Class definitions and mapping IDs.
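
The three keys can be illustrated with a minimal annotation file built in Python. The file name, image size, category name, and coordinates below are made-up sample values; the helper `check_references` is a hypothetical sanity check, not part of any library. Bounding boxes follow the COCO convention of [x, y, width, height] measured from the top-left corner.

```python
import json

# A minimal COCO-style annotation structure with one image and one object
coco = {
    "images": [
        {"id": 1, "file_name": "sample_image.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # must match an entry in "images"
            "category_id": 1,    # must match an entry in "categories"
            "bbox": [100.0, 120.0, 200.0, 150.0],  # [x, y, width, height]
            "area": 200.0 * 150.0,
            "iscrowd": 0,
            # Polygon as a flat list of x, y pairs
            "segmentation": [[100.0, 120.0, 300.0, 120.0,
                              300.0, 270.0, 100.0, 270.0]],
        },
    ],
    "categories": [
        {"id": 1, "name": "widget"},
    ],
}


def check_references(dataset):
    """Return True if every annotation points at a known image and category."""
    image_ids = {img["id"] for img in dataset["images"]}
    category_ids = {cat["id"] for cat in dataset["categories"]}
    return all(
        ann["image_id"] in image_ids and ann["category_id"] in category_ids
        for ann in dataset["annotations"]
    )


coco_json = json.dumps(coco, indent=2)
print(check_references(coco))
```

Writing `coco_json` to a file gives an annotation file that a COCO-style dataset loader could read, provided the image paths and class names match your data.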