Fading Coder


Python Fundamentals: Variables, Data Types, and Building Word Clouds with jieba

Tech May 9 3

Variables in Python

A variable is a name that refers to a stored value. Python creates a variable the moment you assign a value to it; no separate declaration is needed. An assignment has three parts: a descriptive name, the assignment operator (=), and a value.
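As a minimal sketch of the three parts of an assignment (the names page_views and site_name are made up for illustration):

```python
# name = value: the variable exists as soon as the assignment runs
page_views = 120
site_name = "Fading Coder"

# Assigning again replaces the value the name refers to
page_views = page_views + 1

label = site_name + ": " + str(page_views)
print(label)                      # Fading Coder: 121
print(type(page_views).__name__)  # int
```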

Naming Conventions

  • Variable names must start with a letter or underscore, never a number
  • Names can only contain alphanumeric characters and underscores
  • Reserved keywords cannot be used as variable names
  • Names should be meaningful and describe the data they hold
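The rules above can be checked programmatically. The sketch below uses the standard str.isidentifier() method and the keyword module; the candidate names and the is_valid_name helper are hypothetical examples.

```python
import keyword

candidates = ["user_name", "_cache", "2nd_place", "total-count", "class"]

def is_valid_name(name):
    """Return True if `name` is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

results = {name: is_valid_name(name) for name in candidates}
for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

Here "2nd_place" fails because it starts with a digit, "total-count" because a hyphen is not allowed, and "class" because it is a reserved keyword.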

Comments

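Python supports single-line comments using the # symbol. It has no dedicated multi-line comment syntax; a triple-quoted string that is not assigned to anything is evaluated and then discarded, so it is commonly used as one.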
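One detail worth seeing in code: triple-quoted text is a string literal, and when it is the first statement in a function it becomes that function's docstring, while a # comment is ignored entirely. A minimal sketch, with a hypothetical area function:

```python
def area(width, height):
    """Return the area of a rectangle."""
    # This comment is ignored by the interpreter
    return width * height

print(area.__doc__)  # Return the area of a rectangle.
print(area(3, 4))    # 12
```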

# This is a single-line comment

'''
This is a multi-line comment
spanning multiple lines
'''

"""
Another way to write
multi-line comments
"""

Data Types

Numeric Types

Integers

Integers are whole numbers without decimal points. They support standard arithmetic operations.

user_id = 1001
phone_number = int('13800138000')  # Converts the string to an integer

# Arithmetic operations
num_a = 15
num_b = 4

print(num_a + num_b)   # Addition: 19
print(num_a - num_b)   # Subtraction: 11
print(num_a * num_b)   # Multiplication: 60
print(num_a / num_b)   # Division: 3.75
print(num_a % num_b)   # Modulus: 3
print(num_a // num_b)  # Floor division: 3
print(num_a ** num_b)  # Exponentiation: 50625
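A few integer behaviors are easy to miss. The sketch below shows divmod(), floor division with a negative operand, and arbitrary-precision integers; the variable names are illustrative.

```python
# divmod returns the floor-division result and the remainder together
quotient, remainder = divmod(15, 4)
print(quotient, remainder)  # 3 3

# Floor division rounds toward negative infinity, not toward zero
neg_quotient = -15 // 4
neg_remainder = -15 % 4
print(neg_quotient, neg_remainder)  # -4 1

# Python integers have arbitrary precision, so they do not overflow
big = 2 ** 100
print(big)  # 1267650600228229401496703205376
```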

Floats

Floating-point numbers represent decimal values.

monthly_salary = 15000.50
height_meters = float(180)  # Converts to 180.0

# Mathematical operations
import math
result = math.sqrt(16)  # 4.0
log_value = math.log(10)  # Natural logarithm, about 2.3026
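Floats are stored in binary, so some decimal values are only approximated. A short sketch of the usual consequence and a common way to compare floats, using math.isclose:

```python
import math

total = 0.1 + 0.2

exactly_equal = (total == 0.3)
nearly_equal = math.isclose(total, 0.3)

print(exactly_equal)     # False
print(nearly_equal)      # True
print(round(total, 2))   # 0.3
```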

String Type

Strings are sequences of characters enclosed in single, double, or triple quotes.

first_name = 'John'
last_name = "Doe"
address = '''
123 Main Street,
New York, NY 10001
'''

# String methods
text = "Hello World"
print(text.startswith("Hello"))  # True
print(text.endswith("World"))    # True

# Indexing and slicing
message = "Python Programming"
print(message[0])     # P (first character)
print(message[-1])    # g (last character)
print(message[0:6])   # Python
print(message[::2])   # Pto rgamn (every second character)
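Slices can also omit a bound or use a negative step. The sketch below builds on the same sample string and also shows that strings cannot be changed in place:

```python
message = "Python Programming"

first_word = message[:6]          # from the start up to index 5
second_word = message[7:]         # from index 7 to the end
reversed_message = message[::-1]  # a negative step walks backwards

print(first_word.upper())  # PYTHON
print(second_word)         # Programming
print(reversed_message)    # gnimmargorP nohtyP

# Strings are immutable: item assignment raises TypeError
try:
    message[0] = "J"
except TypeError as error:
    print("Cannot modify a string in place:", error)
```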

String Operations: join() and split()

# Join - combines list elements into a string
char_list = ['P', 'y', 't', 'h', 'o', 'n']
joined_string = '-'.join(char_list)
print(joined_string)  # P-y-t-h-o-n

# Split - breaks string into list
sentence = "apple,banana,cherry"
fruits = sentence.split(',')
print(fruits)  # ['apple', 'banana', 'cherry']
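Two variations come up often in text cleanup: calling split() with no argument, and limiting the number of splits. A minimal sketch with made-up input strings:

```python
raw = "  the quick   brown fox  "

# With no argument, split() splits on any run of whitespace
words = raw.split()
print(words)  # ['the', 'quick', 'brown', 'fox']

# join() then rebuilds the text with single spaces
normalized = " ".join(words)
print(normalized)  # the quick brown fox

# The second argument (maxsplit) limits how many splits are made
key, value = "path=/usr/bin=local".split("=", 1)
print(key, value)  # path /usr/bin=local
```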

List Type

Lists are ordered, mutable collections whose items can be of different types.

# Creating lists
hobbies = ['reading', 'gaming', 'coding']
empty_list = []
converted_list = list('abc')  # ['a', 'b', 'c']

# Accessing elements
print(hobbies[0])    # reading
print(hobbies[-1])   # coding
print(hobbies[1:3])  # ['gaming', 'coding']

# Modifying lists
hobbies.append('music')
hobbies.remove('gaming')
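Because lists are mutable, two names can refer to the same list. The sketch below shows a few more list methods and the difference between an alias and a copy; the names alias and snapshot are illustrative.

```python
hobbies = ['reading', 'gaming', 'coding']

hobbies.insert(1, 'hiking')  # insert at index 1
last = hobbies.pop()         # remove and return the last item
lengths = [len(h) for h in hobbies]  # list comprehension

alias = hobbies              # same list object, second name
snapshot = hobbies.copy()    # independent shallow copy
alias.append('music')

print(last)             # coding
print(lengths)          # [7, 6, 6]
print(hobbies)          # ['reading', 'hiking', 'gaming', 'music']
print(snapshot)         # ['reading', 'hiking', 'gaming']
print(sorted(hobbies))  # ['gaming', 'hiking', 'music', 'reading']
```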

Dictionary Type

Dictionaries store data in key-value pairs, providing fast lookups by key.

# Creating a dictionary
person_info = {
    'name': 'Alice',
    'age': 28,
    'city': 'Beijing',
    'skills': ['Python', 'Java', 'SQL']
}

# Accessing values by key
print(person_info['name'])       # Alice
print(person_info.get('age'))    # 28

# Modifying values
person_info['age'] = 29
person_info['email'] = 'alice@example.com'  # Add new key

# Deleting a key-value pair
del person_info['city']
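Accessing a missing key with square brackets raises KeyError, so the safer access patterns are worth a quick sketch. The default values used here are illustrative.

```python
person_info = {'name': 'Alice', 'age': 28}

# get() returns a default instead of raising KeyError
email = person_info.get('email', 'unknown')
print(email)  # unknown

# Membership tests check keys
has_age = 'age' in person_info
print(has_age)  # True

# Iterate over key-value pairs
for key, value in person_info.items():
    print(f"{key}: {value}")

# pop() removes a key and returns its value; a default avoids KeyError
removed_age = person_info.pop('age')
removed_city = person_info.pop('city', None)
print(removed_age, removed_city)  # 28 None
```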

Configuring Pip Mirror Sources

When installing Python packages, downloads from the default index (pypi.org) can be slow if you are far from its servers. Pointing pip at a mirror closer to you, such as a domestic mirror for users in China, usually improves download speeds.

A commonly used Chinese mirror is the one hosted by Tsinghua University (TUNA). To configure it permanently, edit the pip configuration file or run:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
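The configuration-file route can be sketched as follows. The file location is an assumption that depends on your platform: it is commonly ~/.config/pip/pip.conf on Linux and macOS, and %APPDATA%\pip\pip.ini on Windows.

```ini
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
```

Running pip config list afterwards shows the settings pip has picked up.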

Jieba Library for Chinese Word Segmentation

jieba is a widely used Python library for Chinese word segmentation. Install it using pip:

pip install jieba

Segmentation Methods

import jieba

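text = "Artificial intelligence is changing the world"

# Accurate mode (default) - returns a list
# jieba targets Chinese text; for English input it keeps the spaces
# between words as separate tokens.
seg_result = jieba.lcut(text)
print(seg_result)
# ['Artificial', ' ', 'intelligence', ' ', 'is', ' ', 'changing', ' ', 'the', ' ', 'world']

# Full mode - scans all possible words
full_seg = jieba.lcut(text, cut_all=True)
print(full_seg)

# Search engine mode - better for search indexing
search_seg = jieba.lcut_for_search(text)
print(search_seg)

# Adding custom words
# add_word() registers a term so that it is kept as a single token. Text is
# split at whitespace before the dictionary lookup, so a term that contains a
# space (as here) still comes back as separate words; the call is meant for
# terms without spaces, such as Chinese phrases.
jieba.add_word("Artificial intelligence")
custom_seg = jieba.lcut("Artificial intelligence is developing rapidly")
print(custom_seg)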
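jieba returns whitespace and punctuation as tokens of their own, so a cleanup pass is usually needed before further processing. The sketch below uses only the standard library; the token list is hand-written to mimic the shape of jieba.lcut output, and clean_tokens is a hypothetical helper.

```python
import string

# Hand-written stand-in for the kind of list jieba.lcut returns
tokens = ['Artificial', ' ', 'intelligence', ' ', 'is', ' ',
          'changing', ' ', 'the', ' ', 'world', '.']

def clean_tokens(tokens):
    """Drop tokens that are only whitespace or only ASCII punctuation."""
    return [
        t for t in tokens
        if t.strip() and not all(ch in string.punctuation for ch in t)
    ]

cleaned = clean_tokens(tokens)
print(cleaned)
# ['Artificial', 'intelligence', 'is', 'changing', 'the', 'world']
```

For Chinese text the same idea applies, with full-width punctuation added to the set of characters to drop.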

Generating Word Clouds with wordcloud

The wordcloud library creates visual representations of text data. Install required packages:

pip install wordcloud
pip install imageio
pip install pillow

Basic Word Cloud Generation

import jieba
import wordcloud

# Sample text
content = "Python is a powerful programming language used for web development, data analysis, machine learning, and automation. Python has a simple syntax and is easy to learn."

# Segment the text with jieba (required for Chinese, which has no spaces between words)
words = jieba.lcut(content)
text_processed = ' '.join(words)

# Create word cloud
wc = wordcloud.WordCloud(
    width=800,
    height=400,
    background_color='white'
)

wc.generate(text_processed)
wc.to_file('wordcloud_output.png')
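A word cloud sizes each word by how often it occurs. The sketch below counts frequencies with collections.Counter from the standard library; the token list and the stopword set are illustrative assumptions, not output from the code above.

```python
from collections import Counter

# Hand-written tokens standing in for segmented text
tokens = ['Python', 'is', 'a', 'powerful', 'programming', 'language',
          'Python', 'has', 'a', 'simple', 'syntax']

# Words to ignore; this small set is only an example
stopwords = {'is', 'a', 'has'}

frequencies = Counter(
    t.lower() for t in tokens if t.lower() not in stopwords
)

print(frequencies['python'])       # 2
print(frequencies.most_common(1))  # [('python', 2)]
```

A mapping like this can be passed to WordCloud.generate_from_frequencies() instead of calling generate() on raw text, which gives you control over filtering and counting.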

Customized Word Cloud with Mask

Using a mask allows you to shape the word cloud into custom forms.

import jieba
import wordcloud
from imageio.v2 import imread  # v2 API; the top-level imageio.imread emits a deprecation warning

# Load mask image (words are drawn only where the mask is not pure white)
mask_image = imread('star_shape.png')

# Text content
sample_text = "Technology innovation drives progress and creates opportunities for development and growth"

# Add custom vocabulary (effective only for terms without spaces, such as Chinese phrases)
jieba.add_word("Technology innovation")

# Process text
word_list = jieba.lcut(sample_text)
processed_text = ' '.join(word_list)

# Configure word cloud with mask
cloud = wordcloud.WordCloud(
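    font_path='C:/Windows/Fonts/simhei.ttf',  # Chinese font support (Windows path; adjust for your system)
    mask=mask_image,
    background_color='white',
    width=1000,   # width and height are ignored when a mask is given;
    height=800,   # the output takes the size of the mask image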
    max_words=100
)

cloud.generate(processed_text)
cloud.to_file('custom_shaped_wordcloud.png')

Key parameters for WordCloud customization:

  • font_path: Path to font file for Chinese character support
  • mask: Image array defining the shape
  • background_color: Background color (default: black)
  • max_words: Maximum number of words to display
  • width, height: Output image dimensions (ignored when a mask is supplied)
  • colormap: Color scheme for words
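To make the mask parameter concrete: in a mask, pure white (255) marks areas that stay empty and any other value marks areas where words may be drawn. The sketch below builds a small circular mask as nested lists using only the standard library; circle_mask is a hypothetical helper, and the tiny size is chosen so the result can be printed.

```python
def circle_mask(size):
    """Return a size x size grid: 0 inside the circle (drawable), 255 outside (left empty)."""
    center = (size - 1) / 2
    radius = size / 2
    return [
        [0 if (x - center) ** 2 + (y - center) ** 2 <= radius ** 2 else 255
         for x in range(size)]
        for y in range(size)
    ]

mask = circle_mask(9)

# Print a text preview: '#' is drawable, '.' is masked out
for row in mask:
    print("".join("#" if value == 0 else "." for value in row))
```

To use such a grid with WordCloud, convert it to an array first, for example numpy.array(mask, dtype="uint8"), and build it at the pixel size you want the output image to have.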
