Essential Python Development Patterns and LLM Integration Strategies
Legacy String Handling in Python 2
In Python 2, managing text requires careful distinction between byte strings and Unicode objects. To prevent encoding conflicts during concatenation, ensure all strings share the same type.
# Converting to Unicode using the 'u' prefix
text_unicode = u'\u4e2d\u56fd'
print(type(text_unicode)) # <type 'unicode'>
# Exporting for storage (UTF-8 encoding)
serialized_text = text_unicode.encode('utf-8')
print(type(serialized_text)) # <type 'str'>
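For comparison, the same round-trip in Python 3 uses str (Unicode by default) and bytes; a minimal sketch:

```python
# Python 3: text is str (Unicode) by default; bytes hold encoded data
text = '\u4e2d\u56fd'              # str
encoded = text.encode('utf-8')     # bytes, suitable for storage/transport
decoded = encoded.decode('utf-8')  # back to str
print(type(text), type(encoded), decoded == text)
# <class 'str'> <class 'bytes'> True
```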
Modern Python 3 Utilities
Advanced Slicing Behavior
When using indices in string slicing, if the start index is negative and the end index is positive in a way that creates an invalid range, Python returns an empty string rather than an error.
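A quick demonstration of how mixed-sign indices resolve:

```python
s = "python"           # len(s) == 6
print(s[-2:5])   # start -2 resolves to index 4; 4 < 5, so 'o'
print(s[-1:2])   # start resolves to 5, end is 2: invalid range, so ''
print(s[2:-4])   # end -4 resolves to index 2; 2 >= 2, so ''
```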
Debugging and Diagnostics
Trigger an interactive debugger at a specific line of code:
import pdb; pdb.set_trace()
Data Partitioning and Randomization
Splitting a list for machine learning tasks (e.g., training vs. validation) can be handled efficiently using random.shuffle and slicing.
import random
def partition_dataset(items, split_ratio=0.1, seed_value=42):
    random.seed(seed_value)
    shuffled_items = list(items)
    random.shuffle(shuffled_items)
    split_point = int(len(shuffled_items) * split_ratio)
    if split_point < 1 or len(shuffled_items) == 0:
        return shuffled_items, []
    val_set = shuffled_items[:split_point]
    train_set = shuffled_items[split_point:]
    return train_set, val_set
# Random sampling
population = list(range(100))
sampled_unique = random.sample(population, k=5) # No replacement
sampled_repeats = random.choices(population, k=5) # With replacement
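Seeding is what makes such a split reproducible across runs. A quick check, using a local random.Random instance (so the global RNG state is untouched; shuffled_copy is an illustrative name):

```python
import random

def shuffled_copy(items, seed):
    rng = random.Random(seed)  # local RNG avoids mutating global state
    out = list(items)
    rng.shuffle(out)
    return out

a = shuffled_copy(range(10), seed=42)
b = shuffled_copy(range(10), seed=42)
print(a == b)                         # True: same seed, same permutation
print(sorted(a) == list(range(10)))   # True: a permutation, not a resample
```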
Formatted Logging
Standardizing timestamp output for logs improves readability and traceability.
from datetime import datetime
def log_with_timestamp(message):
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(f"[{timestamp}] - {message}")
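The standard logging module can produce the same timestamped format without a manual print wrapper; a minimal sketch:

```python
import logging

logging.basicConfig(
    format="[%(asctime)s] - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=logging.INFO,
)
logging.info("pipeline started")  # e.g. [2024-01-01 12:00:00] - pipeline started
```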
Logic Flow: The Loop-Else Construction
The else block in a Python loop executes only if the loop completes naturally without hitting a break statement.
def find_element(target, collection):
    for item in collection:
        if item == target:
            print(f"Found: {item}")
            break
    else:
        print("Target not found in collection.")
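A classic use of for-else is a search that needs a "no match" fallback, such as a simple primality test (illustrative sketch):

```python
def is_prime(n):
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            break          # found a divisor; the else block is skipped
    else:
        return True        # loop finished without break: no divisors found
    return False

print([x for x in range(2, 20) if is_prime(x)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```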
Robust Exception Handling
Handling multiple error types within a single block allows for cleaner error recovery strategies.
def safe_division():
    try:
        numerator = float(input("Enter numerator: "))
        denominator = float(input("Enter denominator: "))
        result = numerator / denominator
    except (ZeroDivisionError, ValueError) as err:
        if isinstance(err, ZeroDivisionError):
            print("Error: Division by zero.")
        else:
            print("Error: Invalid numerical input.")
    else:
        print(f"Calculation successful: {result}")
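For non-interactive code, the same try/except/else pattern works with arguments instead of input(); a sketch (the function name is illustrative):

```python
def safe_divide(numerator, denominator):
    try:
        result = float(numerator) / float(denominator)
    except ZeroDivisionError:
        return "Error: Division by zero."
    except ValueError:
        return "Error: Invalid numerical input."
    else:
        return result

print(safe_divide("10", "4"))   # 2.5
print(safe_divide(1, 0))        # Error: Division by zero.
print(safe_divide("x", 2))      # Error: Invalid numerical input.
```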
Integrating with Large Language Models
Azure OpenAI GPT-4 API Wrapper
This implementation targets the legacy openai Python SDK (pre-1.0, where openai.ChatCompletion and the Azure engine parameter are available) and includes retry logic to handle transient API failures.
import time
import openai
from datetime import datetime
def fetch_gpt_response(messages, config, retries=5, backoff=5):
    attempt = 0
    while attempt < retries:
        try:
            response = openai.ChatCompletion.create(
                engine=config['engine'],
                messages=messages,
                temperature=config.get('temperature', 0.7),
                max_tokens=config.get('max_tokens', 1024)
            )
            return response.choices[0]["message"]["content"]
        except Exception as e:
            attempt += 1
            print(f"[{datetime.now()}] Error: {e}. Retrying in {backoff}s...")
            time.sleep(backoff)
    return "[API_ERROR]"
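The retry logic can also be factored into a reusable helper that is independent of any particular API, with exponential rather than fixed backoff; a minimal sketch (all names are illustrative):

```python
import time

def with_retries(fn, retries=5, backoff=1.0, factor=2.0, sleep=time.sleep):
    """Call fn(); on exception, wait and retry with exponential backoff."""
    delay = backoff
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise                 # out of attempts: surface the error
            sleep(delay)
            delay *= factor

# Demonstration: a flaky callable that succeeds on the third try
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(with_retries(flaky, retries=5, backoff=0.01))  # ok
```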
Baidu Wenxin (Ernie Bot) API Integration
Interacting with Wenxin requires an access token and specific model endpoint mapping.
import requests
import json
def get_baidu_token(api_key, secret_key):
    url = f"https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id={api_key}&client_secret={secret_key}"
    res = requests.post(url)
    return res.json().get("access_token")

def call_ernie_bot(prompt, token, model_url):
    headers = {'Content-Type': 'application/json'}
    payload = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.1
    })
    endpoint = f"{model_url}?access_token={token}"
    response = requests.post(endpoint, headers=headers, data=payload)
    return response.json().get("result")
Infrastructure and Database Operations
MySQL Connection Management
Using pymysql to manage relational data, ensuring connections are refreshed to avoid timeouts.
import pymysql
def execute_query(connection, sql_statement, is_write=False):
    try:
        connection.ping(reconnect=True)
        with connection.cursor() as cursor:
            cursor.execute(sql_statement)
            if is_write:
                connection.commit()
                return True
            return cursor.fetchall()
    except Exception as e:
        connection.rollback()
        print(f"Database error: {e}")
        return None
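Whenever the SQL contains user-supplied values, pass them as driver-bound parameters rather than formatting them into the statement string; the placeholder style differs (%s in pymysql, ? in sqlite3), but the pattern is the same. A self-contained sketch using sqlite3 purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
# Values are bound by the driver, never interpolated into the SQL string
conn.execute("INSERT INTO users VALUES (?, ?)", (1, "alice"))
conn.commit()
rows = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchall()
print(rows)  # [('alice',)]
```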
Redis Cache Interaction
Standardized methods for writing and expiring keys in Redis.
import redis
import json
def cache_data(client, key, value, ttl=3600):
    payload = json.dumps(value) if not isinstance(value, str) else value
    client.set(key, payload)
    client.expire(key, ttl)
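The read path mirrors the write: fetch, then JSON-decode if a structured value was stored. A sketch (fetch_cached and FakeClient are illustrative names; the stub stands in for a real redis-py client, whose set also accepts ex= to write value and TTL in one call):

```python
import json

def fetch_cached(client, key):
    raw = client.get(key)
    if raw is None:
        return None
    if isinstance(raw, bytes):      # redis-py returns bytes by default
        raw = raw.decode("utf-8")
    try:
        return json.loads(raw)      # structured values were stored as JSON
    except json.JSONDecodeError:
        return raw                  # a plain string was stored as-is

# Minimal in-memory stand-in for the Redis client, for demonstration only
class FakeClient:
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value, ex=None):
        self.store[key] = value.encode("utf-8") if isinstance(value, str) else value

client = FakeClient()
client.set("user:1", json.dumps({"name": "alice"}), ex=3600)
print(fetch_cached(client, "user:1"))  # {'name': 'alice'}
```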
LLM Fine-Tuning Frameworks
LLaMA Factory Configuration
For pre-training or fine-tuning, shell scripts automate the environment setup and DeepSpeed integration.
# Example Training Launch Script
deepspeed --num_gpus 8 --master_port=9901 src/train_bash.py \
--stage pt \
--model_name_or_path /path/to/base_model \
--do_train \
--dataset my_custom_corpus \
--finetuning_type lora \
--output_dir ./checkpoints/output_model \
--overwrite_cache \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 4 \
--learning_rate 5e-5 \
--num_train_epochs 2.0 \
--fp16 \
--deepspeed ds_config.json
MS-Swift Compatibility Note
When performing Supervised Fine-Tuning (SFT) on Qwen-VL models, specific transformers versions are required. transformers==4.47.3 may cause input_ids errors; downgrading to 4.47.1 is a verified resolution for pipeline stability.