Understanding the MP4 File Format and Analysis
Table of Contents
- Overview
- Fundamentals of the MP4 Format
- Key Concepts of the Container Format
- Box Structure
- Track
- Samples
- Sample Tables
- Chunks
- Detailed Explanation of Core Boxes
- Additional Boxes and Their Usage
- Practical Technical Insights
- Open Source Tools
Overview
MP4, officially known as MPEG-4 Part 14, is a widely used multimedia container format. Files have the .mp4 extension, and the format evolved from Apple’s QuickTime file format.
Historical Development
- 2001: QuickTime’s .mov format was adapted to the MPEG standard, introducing a box-based structure.
- 2004: The MPEG-4 specification separated the generic container format (Part 12: ISO Base Media File Format) from the MP4-specific definitions built on top of it (Part 14).
- MP4 uses a hierarchical box system to organize data and metadata.
Fundamentals of the MP4 Format
Core Tools for MP4 Analysis
Several tools can parse MP4 files efficiently:
- mp4box.js: A browser-based MP4 parser.
- Bento4: Provides tools like mp4dump, mp4edit, and mp4encrypt.
- MP4Box by GPAC: Comprehensive multimedia processing.
- mp4info.exe: A Windows tool for inspecting MP4 files.
Key Concepts
Box Structure
The MP4 file structure hinges on nested boxes, which contain metadata and media data. Each box starts with a header defining its type and size. Major box types:
- ftyp: Defines the file type.
- moov: Contains media metadata.
- mdat: Holds raw media data.
All multi-byte integer fields in an MP4 file, including the box size in each header, are stored in big-endian byte order.
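The header layout described above can be walked with a few lines of Python. This is a minimal sketch, not a full parser: it assumes the standard header of a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size after the type and size 0 meaning "to the end of the enclosing data". The names iter_boxes and make_box are hypothetical helpers introduced for illustration.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, box_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        # 32-bit big-endian size, then a 4-byte type code
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # Box extends to the end of the enclosing data
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size

def make_box(box_type, payload=b""):
    """Build a synthetic box for the example below (illustration only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

sample = make_box(b"ftyp", b"isom" + b"\x00\x00\x02\x00" + b"isom") + make_box(b"mdat", b"\x00" * 4)
print([(t, e - s) for t, s, e in iter_boxes(sample)])
# prints [('ftyp', 12), ('mdat', 4)]
```

Container boxes such as moov can be descended into by calling iter_boxes again with the payload_start and box_end values of the parent.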
Detailed Explanation of Core Boxes
Sample Table Box (stbl)
This box holds the tables that map every media sample to its timing, size, and position in the file.
Sub-boxes:
- stsd: Describes codec details.
- stts: Maps decoding time to sample sequence.
- ctts: Maps composition time adjustments.
- stss: Identifies keyframes.
- stsz/stz2: Specifies sample sizes.
- stsc: Maps samples to chunks.
- stco/co64: Locates chunk offsets.
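To show how these tables combine, the sketch below resolves a sample's byte offset from stsc, stsz, and stco. It assumes the tables have already been parsed into plain Python lists, and it simplifies each stsc entry to (first_chunk, samples_per_chunk), leaving out the sample description index that the real entry also carries. The function name sample_offset is hypothetical.

```python
def sample_offset(sample_index, stsc, stsz, stco):
    """Return the file offset of a 0-based sample index.

    stsc: list of (first_chunk, samples_per_chunk); first_chunk is 1-based
    stsz: list of sample sizes in bytes, one per sample
    stco: list of chunk offsets in bytes, one per chunk
    """
    first_sample = 0  # index of the first sample covered by the current stsc run
    for i, (first_chunk, per_chunk) in enumerate(stsc):
        # A run lasts until the next entry's first_chunk, or until the last chunk
        next_first = stsc[i + 1][0] if i + 1 < len(stsc) else len(stco) + 1
        run_samples = (next_first - first_chunk) * per_chunk
        if sample_index < first_sample + run_samples:
            within = sample_index - first_sample
            chunk = first_chunk - 1 + within // per_chunk   # 0-based chunk index
            first_in_chunk = sample_index - within % per_chunk
            # Chunk offset plus the sizes of the samples that precede it in the chunk
            return stco[chunk] + sum(stsz[first_in_chunk:sample_index])
        first_sample += run_samples
    raise IndexError("sample index out of range")

# Made-up tables: chunks 1-2 hold 3 samples each, chunks 3-4 hold 2 each
stsc = [(1, 3), (3, 2)]
stco = [100, 400, 900, 1200]
stsz = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(sample_offset(4, stsc, stsz, stco))  # prints 440
```

Sample 4 is the second sample of chunk 2, so its offset is the chunk offset 400 plus the 40-byte sample that precedes it.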
File Type Box (ftyp)
Identifies the specification the file primarily adheres to (the major brand) and lists the other brands it is compatible with.
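As a rough illustration, the ftyp payload can be decoded as a four-character major brand, a 32-bit minor version, and a run of four-character compatible brands. The function name parse_ftyp and the example payload are assumptions made for this sketch.

```python
import struct

def parse_ftyp(payload):
    """Decode an ftyp payload (the bytes after the 8-byte box header)."""
    if len(payload) < 8:
        raise ValueError("ftyp payload too short")
    major_brand = payload[0:4].decode("latin-1")
    minor_version = struct.unpack_from(">I", payload, 4)[0]
    # The remaining bytes are a list of 4-byte compatible brands
    compatible = [payload[i:i + 4].decode("latin-1")
                  for i in range(8, len(payload) - 3, 4)]
    return major_brand, minor_version, compatible

# Hypothetical payload for illustration
example = b"isom" + struct.pack(">I", 512) + b"isomiso2avc1mp41"
print(parse_ftyp(example))
# prints ('isom', 512, ['isom', 'iso2', 'avc1', 'mp41'])
```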
Movie Box (moov)
Contains the movie header (mvhd) and one trak box per track, which together describe how the media data is organized and timed.
Metadata and Header Boxes:
Examples include tkhd (the track header) and mdhd (the media header, which carries the track's timescale and duration).
Practical Technical Insights
Optimizing MP4 File Positioning
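Many muxers, FFmpeg included, write the moov box after the media data (mdat) by default, so a player has to read the end of the file before it can start playback. FFmpeg's -movflags faststart option relocates moov to the beginning of the file, which lets playback begin while the rest of the file is still downloading.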
# Moving the moov box to the beginning
ffmpeg -i input.flv -c copy -movflags faststart output.mp4
Seek Implementation
To locate media from a specific timestamp:
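- Convert the time to the track's timescale units: e.g., 30 seconds * 90,000 = 2,700,000.
- Use stts to find the approximate sample number: with a constant sample delta of 3,000 ticks (30 fps at a timescale of 90,000), 2,700,000 ÷ 3,000 = 900.
- Use stss to step back to the nearest preceding keyframe, locate the chunk that contains it via stsc, and find the chunk's file offset via stco/co64.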
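The first two steps, plus the keyframe lookup, can be sketched as follows. The code assumes stts has been parsed into a non-empty list of (sample_count, sample_delta) pairs and stss into a sorted list of 1-based sync sample numbers. The function names and the example tables are hypothetical.

```python
import bisect

def sample_at_time(seconds, timescale, stts):
    """Return the 0-based index of the sample being decoded at the given time."""
    target = max(0, int(seconds * timescale))  # time in timescale ticks
    elapsed = 0  # ticks covered by earlier stts entries
    index = 0    # samples covered by earlier stts entries
    for count, delta in stts:
        span = count * delta
        if target < elapsed + span:
            return index + (target - elapsed) // delta
        elapsed += span
        index += count
    return index - 1  # past the end: clamp to the last sample

def keyframe_at_or_before(sample_index, stss):
    """Return the 0-based index of the nearest sync sample at or before sample_index."""
    pos = bisect.bisect_right(stss, sample_index + 1)  # stss numbers are 1-based
    return stss[pos - 1] - 1 if pos else 0

# Made-up track: 60 seconds at 30 fps, timescale 90,000, a keyframe every 250 samples
stts = [(1800, 3000)]
stss = [1, 251, 501, 751, 1001, 1251, 1501, 1751]
target_sample = sample_at_time(30, 90000, stts)
print(target_sample, keyframe_at_or_before(target_sample, stss))  # prints 900 750
```

Decoding would then start at the keyframe (index 750 here) and run forward to the target sample. The keyframe's byte position comes from the stsc and stco/co64 lookup in the last step.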
Open Source Tools
- GPAC: A complete multimedia toolkit featuring MP4Box.
- mp4v2: A library that provides an API for creating and editing MP4 files.
Author: smallest_one Source: Jianshu