Generating Realistic Test Data in Python with Faker

Tech May 15 13

The Faker library enables the creation of realistic synthetic data for testing, prototyping, and anonymization tasks in Python applications.

Installation

Install via pip:

pip install Faker

Basic Usage

Instantiate a generator using either the Faker class or the legacy Factory:

from faker import Faker
fake = Faker()

print(fake.name())        # e.g., 'John Smith' (the default locale is en_US)
print(fake.address())     # e.g., a multi-line US postal address
print(fake.text())        # Random paragraph of placeholder text

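Each method call returns a new random value. The generator also supports localization: pass a locale code such as 'zh_CN' to produce Chinese data. The Chinese-flavored sample outputs in the sections below assume a generator created this way:

fake = Faker('zh_CN')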

Data Categories and Examples

Addresses

fake.city()              # 'Hangzhou'
fake.street_address()    # 'No. 205 Nanjing Road'
fake.postcode()          # '200001'
fake.latitude()          # Decimal('31.2304')

Person Details

fake.name()              # 'Zhang Min'
fake.first_name_male()   # 'Jian'
fake.last_name_female()  # 'Wang'

Barcodes

fake.ean13()             # '6901234567892'
fake.ean(length=8)       # '69012341'

Colors

fake.hex_color()         # '#a3c14b'
fake.color_name()        # 'Crimson'

Companies

fake.company()           # 'FutureLink Digital Technology Ltd.'
fake.company_suffix()    # 'Group Co., Ltd.'

Credit Cards

fake.credit_card_number()      # '4532123456789012'
fake.credit_card_expire()      # '09/28'
fake.credit_card_full()        # Full formatted card info

Dates and Times

fake.date_this_year()          # datetime.date(2024, 5, 12)
fake.iso8601()                 # '2008-07-14T13:45:22'
fake.unix_time()               # 1256789012

Internet Data

fake.ipv4()                    # '192.168.1.105'
fake.email()                   # 'li.xiaoming@example.com'
fake.url()                     # 'http://www.chen.org/'
fake.user_agent()              # Browser user agent string

Text Generation (Lorem Ipsum)

fake.sentence()                # '系统支持用户登录功能。'
fake.paragraph()               # Multi-sentence Chinese paragraph
fake.words(5)                  # ['数据', '分析', '模型', '结果', '验证']

Miscellaneous

fake.password()                # 'Kx!9@mQz#L' (10 characters by default)
fake.uuid4()                   # 'f47ac10b-58cc-4372-a567-0e02b2c3d479'
fake.boolean()                 # True or False
fake.language_code()           # 'zh'

Phone Numbers

fake.phone_number()            # '13812345678'

Python Objects

fake.pyint()                   # 42
fake.pystr(max_chars=10)       # 'aBcxYzqLmN' (letters only)
fake.pylist(nb_elements=3)     # Mixed-type list
fake.pydict(nb_elements=2)     # {'key1': 'value', 'key2': 123}

User Profiles

fake.profile()                 # Dict with name, address, job, etc.
fake.simple_profile(sex='F')   # Minimal profile with gender constraint

Chinese ID Numbers (SSN)

fake.ssn()                     # '310101199003072316' (18-digit)

Browser and Platform Strings

fake.chrome()                  # Chrome UA string
fake.windows_platform_token()  # 'Windows NT 10.0'
fake.mac_processor()           # 'Intel'

Custom Providers

Extend functionality by creating custom providers:

from faker import Faker
from faker.providers import BaseProvider

class BookProvider(BaseProvider):
    def book_title(self):
        titles = ['Data Engineering', 'Python Tricks', 'Cloud Architecture']
        return self.random_element(titles)

fake = Faker()
fake.add_provider(BookProvider)
print(fake.book_title())  # e.g., 'Cloud Architecture'

Reproducible Results

Set a seed to generate consistent output across runs:

fake = Faker()
fake.seed_instance(12345)
print(fake.name())  # Same name on every run with this seed

Command-Line Interface

Generate data directly in the terminal:

# Generate one Chinese address
faker -l zh_CN address

# Output three names separated by semicolons
faker -r 3 -s ';' name

# Get only ssn and name from profile
faker profile ssn,name

Common CLI options:

-l <locale>: Set language (e.g., zh_CN)
-r <count>: Repeat output N times
-s <sep>: Append separator after each result
-o <file>: Write output to file
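
The same command line can be driven from a Python script, for example in a build step. This sketch assumes Faker is installed for the current interpreter so that python -m faker resolves to the CLI:

```python
import subprocess
import sys

# Equivalent to: faker -r 3 -s ';' name
result = subprocess.run(
    [sys.executable, '-m', 'faker', '-r', '3', '-s', ';', 'name'],
    capture_output=True,
    text=True,
    check=True,
)

# Each generated value is followed by the separator, so split on it
# and drop the empty trailing piece.
names = [part.strip() for part in result.stdout.split(';') if part.strip()]
print(names)
```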

Tags: Python

Copyright © fadingcoder.top
