Fading Coder

One Final Commit for the Last Sprint

Home > Notes > Content

Understanding MySQL VARCHAR Length Units: Character-Based Storage Explained

Notes May 9 4

Character vs Byte: The Fundamental Difference

A character represents a human-readable symbol—the abstract unit of written language. A Chinese character like '中', an English letter like 'a', and a digit like '1' each count as one character.

A byte is the basic unit of digital storage in computers—an 8-bit binary sequence (such as 0x4F or 0xA3).

The critical distinction: different character encodings consume different amounts of storage space per character.

VARCHAR Length Definition in MySQL

MySQL's VARCHAR(n) declaration specifies the maximum number of characters, not bytes. This means you can store up to n characters regardless of whether each character occupies 1, 2, or 3 bytes under your chosen encoding.

UTF-8 Encoding Behavior

CREATE TABLE user_accounts (
  user_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  username VARCHAR(10) NOT NULL DEFAULT '',
  PRIMARY KEY (user_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Inserting mixed content:

INSERT INTO user_accounts (username) VALUES
  ('Alice'),
  ('李明'),
  ('HelloWorld'),
  ('数据库教程');

Query results using length functions:

SELECT 
  user_id,
  username,
  LENGTH(username) AS byte_count,
  CHAR_LENGTH(username) AS char_count
FROM user_accounts;
user_id username byte_count char_count
1 Alice 5 5
2 李明 6 2
3 HelloWorld 10 10
4 数据库教程 12 4

Key observations for UTF-8:

  • English letters and ASCII characters: 1 byte per character
  • Chinese characters: 3 bytes per character
  • LENGTH() measures bytes; CHAR_LENGTH() measures characters

GBK Encoding Behavior

CREATE TABLE product_catalog (
  product_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  product_name VARCHAR(10) CHARACTER SET gbk COLLATE gbk_bin NOT NULL,
  PRIMARY KEY (product_id)
) ENGINE=InnoDB DEFAULT CHARSET=gbk;
INSERT INTO product_catalog (product_name) VALUES
  ('手机'),
  ('Phone'),
  ('平板电脑');
SELECT 
  product_id,
  product_name,
  LENGTH(product_name) AS byte_count,
  CHAR_LENGTH(product_name) AS char_count
FROM product_catalog;
product_id product_name byte_count char_count
1 手机 4 2
2 Phone 5 5
3 平板电脑 8 4

Key observations for GBK:

  • English letters and ASCII characters: 1 byte per character
  • Chinese characters: 2 bytes per character

Maximum VARCHAR Lengths by Encoding

The row size limit in MySQL is 65,535 bytes (shared across all columns). This constraint directly impacts maximum character capacity:

Encoding Bytes per Character Maximum Characters
latin1 1 65,533
gbk 2 32,766
utf8 3 21,845
utf8mb4 4 16,383

Note: The actual limit is 65,532 bytes (not 65,535) because VARCHAR columns require 2 bytes to store the length prefix.

Version Differences in Storage Rules

MySQL 4.x and earlier:

  • VARCHAR(20) meant 20 bytes
  • UTF-8 text with 3-byte characters: only 6 characters fit in 20 bytes

MySQL 5.x and later:

  • VARCHAR(20) means 20 characters
  • Maximum row size: 65,532 bytes (after accounting for overhead)

CHAR vs VARCHAR: Storage Mechanisms

CHAR is a fixed-length type:

  • Always occupies exactly n bytes per value
  • Trailing spaces shorter than n are padded with spaces
  • Spaces are trimmed upon retrieval
  • Suitable for short, fixed-length data (UUIDs, state codes, phone prefixes)

VARCHAR is a variable-length type:

  • Stores only the actual bytes needed plus 1-2 bytes for length metadata
  • No padding; stores data as-is
  • Ideal for varible-length content (names, descriptions, addresses)
-- CHAR storage example
CHAR(10) storing 'ABC'  →  'ABC      ' (10 bytes, spaces padded)

-- VARCHAR storage example  
VARCHAR(10) storing 'ABC' → 'ABC' (4 bytes: 1 length byte + 3 content bytes)

MySQL Row Storage Rules

  1. If all columns have fixed lengths, every row has the same size
  2. If any column has variable length, rows may have different sizes
  3. MySQL may automatically convert fixed-length columns to variable-length when row size becomes variable—except CHARR columns under 4 characters, which remain fixed-length by design

These storage characteristics affect index performance and page utilization in InnoDB and MyISAM engines.

Related Articles

Designing Alertmanager Templates for Prometheus Notifications

How to craft Alertmanager templates to format alert messages, improving clarity and presentation. Alertmanager uses Go’s text/template engine with additional helper functions. Alerting rules referenc...

Deploying a Maven Web Application to Tomcat 9 Using the Tomcat Manager

Tomcat 9 does not provide a dedicated Maven plugin. The Tomcat Manager interface, however, is backward-compatible, so the Tomcat 7 Maven Plugin can be used to deploy to Tomcat 9. This guide shows two...

Skipping Errors in MySQL Asynchronous Replication

When a replica halts because the SQL thread encounters an error, you can resume replication by skipping the problematic event(s). Two common approaches are available. Methods to Skip Errors 1) Skip a...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.