Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

Automating Financial Reports: Merging Alipay and WeChat Transaction Data with Python

Tech May 13 1

Managing personal finances becomes much easier when you can consolidate transaction data from different sources. Manually processing Alipay and WeChat payment records is tedious due to different CSV formats. This solution automates the process by combining both platforms' data into a unified financial report.

The script performs two main functions:

  1. Merging CSV transaction records from both payment platforms
  2. Automatically generating a categorized Markdown report suitable for note-taking applications

Key technical challenges addressed:

  • Alipay exports 24 header rows that need to be skipped during CSV parsing
  • Column names differ between platforms (e.g., "Transaction ID" vs "Order Number")
  • Currency symbols and thousand separators need normalization
  • Missing fields in one platform's data require proper handling

Implementation details:

The script relies on two Python packages: pandas for data manipulation and chardet for character encoding detection.

pip install pandas chardet

Exporting Transaction Records

  1. Alipay Records:

    • Open Alipay app → My → Bills → Tap the "···" icon in the top right → Generate transaction proof → Select "For personal reconciliation"
  2. WeChat Records:

    • Open WeChat app → Me → Services → Wallet → Bills → FAQs → Download bills → Select "For personnel reconciliation"

Place both CSV files in the same directory as the script.

Modify the script configuration at the bottom:

# Function to read CSV files and generate merged output
process_transaction_files('alipay_records_path.csv', 'wechat_records_path.csv', 'combined_transactions.csv')
# Function to generate Markdown report
create_financial_report('combined_transactions.csv', 'financial_summary.md')

Execute the script to generate the final report:

python financial_analyzer.py

Complete Implementation Code:

import pandas as pd

def process_transaction_files(alipay_file, wechat_file, output_file):
    try:
        # Process Alipay data, skipping header rows
        alipay_data = pd.read_csv(alipay_file, skiprows=24, encoding='GBK')
        
        # Process WeChat data
        wechat_data = pd.read_csv(wechat_file, skiprows=16)

        # Define required columns for consistency
        required_fields = ['OrderID', 'TransactionType', 'Counterparty', 'AccountNumber', 
                          'ProductDescription', 'TransactionDirection', 'Amount', 
                          'PaymentMethod', 'Status', 'Notes', 'TransactionDate']

        # Validate and select Alipay columns
        if all(field in alipay_data.columns for field in required_fields):
            alipay_processed = alipay_data[required_fields]
        else:
            print("Alipay CSV file is missing required columns.")
            raise ValueError("Incomplete columns in Alipay data")

        # Map WeChat columns to match required fields
        column_mapping = {
            'TransactionID': 'OrderID', 'Category': 'TransactionType', 'Item': 'ProductDescription', 
            'Value': 'Amount', 'PaymentChannel': 'PaymentMethod', 'CurrentStatus': 'Status'
        }
        
        # Apply column renaming to WeChat data
        wechat_data.rename(columns=column_mapping, inplace=True)

        # Create standardized DataFrame for WeChat data
        wechat_processed = pd.DataFrame(columns=required_fields)
        
        # Copy existing columns
        for field in wechat_data.columns:
            if field in required_fields:
                wechat_processed[field] = wechat_data[field]
        
        # Fill missing fields with placeholder
        for field in required_fields:
            if field not in wechat_processed.columns:
                wechat_processed[field] = pd.NA

        # Filter out non-transaction records
        alipay_processed = alipay_processed[alipay_processed['TransactionDirection'] != 'Non-transaction']
        wechat_processed = wechat_processed[wechat_processed['TransactionDirection'] != 'Non-transaction']

        # Clean amount values by removing currency symbols and thousand separators
        wechat_processed['Amount'] = wechat_processed['Amount'].str.replace(r'[¥¥,]', '', regex=True).astype(float)

        # Add platform identifier
        alipay_processed['Platform'] = 'Alipay'
        wechat_processed['Platform'] = 'WeChat'

        # Combine all transaction data
        unified_data = pd.concat([alipay_processed, wechat_processed], ignore_index=True)

        # Save merged data to CSV
        unified_data.to_csv(output_file, index=False)
        print(f"Successfully saved merged data to '{output_file}'")
    except FileNotFoundError:
        print("File not found. Please verify file paths.")
    except pd.errors.ParserError:
        print("Error reading CSV file. Please check file format or encoding.")
    except Exception as e:
        print(f"An error occurred: {e}")

def create_financial_report(csv_file, output_file):
    # Detect file encoding automatically
    import chardet
    with open(csv_file, 'rb') as f:
        encoding_result = chardet.detect(f.read())
        file_encoding = encoding_result['encoding']
    
    # Load transaction data
    transaction_data = pd.read_csv(csv_file, encoding=file_encoding)
    
    # Clean amount values
    transaction_data['Amount'] = transaction_data['Amount'].replace({'¥': '', ',': ''}, regex=True).astype(float)

    # Calculate total expenses and income
    total_expenses = transaction_data[transaction_data['TransactionDirection'] == 'Expense']['Amount'].sum()
    total_income = transaction_data[transaction_data['TransactionDirection'] == 'Income']['Amount'].sum()

    # Calculate category-wise amounts
    expenses_by_category = transaction_data[transaction_data['TransactionDirection'] == 'Expense'].groupby('TransactionType')['Amount'].sum().sort_values(ascending=False)
    income_by_category = transaction_data[transaction_data['TransactionDirection'] == 'Income'].groupby('TransactionType')['Amount'].sum().sort_values(ascending=False)

    # Calculate monthly balance
    monthly_balance = total_income - total_expenses
    
    # Debug information
    print(f"Total Expenses: {total_expenses}")
    print(f"Total Income: {total_income}")
    print(f"Monthly Balance: {monthly_balance}")
    
    # Generate markdown content
    markdown_content = f"**Monthly Expenses**: ¥{total_expenses:.2f}  |  **Monthly Income**: ¥{total_income:.2f}  |  **Net Balance**: ¥{monthly_balance:.2f}\n\n"


    # Expense analysis section
    markdown_content += "## Expense Analysis 💸\n\n"
    markdown_content += "Transaction categories and corresponding expenses:\n\n"
    markdown_content += "| Category   | Amount   |\n"
    markdown_content += "| ---------- | ---------- |\n"
    for category, amount in expenses_by_category.items():
        markdown_content += f"| {category} | ¥{amount:.2f} |\n"
    
    markdown_content += "\n### Detailed records by category:\n"
    for category in expenses_by_category.index:
        markdown_content += f"\n#### {category} Expense Details 💳\n"
        category_data = transaction_data[(transaction_data['TransactionDirection'] == 'Expense') & (transaction_data['TransactionType'] == category)]
        markdown_content += "| Counterparty  |  Amount  | Platform | Date |\n"
        markdown_content += "| -------- | ----- | ------ | -------- |\n"
        for _, row in category_data.iterrows():
            markdown_content += f"| {row['Counterparty']} | ¥{row['Amount']:.2f} | {row['Platform']} | {row['TransactionDate']} |\n"

    # Income analysis section
    markdown_content += "\n## Income Analysis 💵\n\n"
    markdown_content += "Transaction categories and corresponding income:\n\n"
    markdown_content += "| Category   | Amount   |\n"
    markdown_content += "| ---------- | ---------- |\n"
    for category, amount in income_by_category.items():
        markdown_content += f"| {category} | ¥{amount:.2f} |\n"
    
    markdown_content += "\n### Detailed records by category:\n"
    for category in income_by_category.index:
        markdown_content += f"\n#### {category} Income Details 💼\n"
        category_data = transaction_data[(transaction_data['TransactionDirection'] == 'Income') & (transaction_data['TransactionType'] == category)]
        markdown_content += "| Counterparty  |  Amount  | Platform | Date |\n"
        markdown_content += "| -------- | ----- | ------ | -------- |\n"
        for _, row in category_data.iterrows():
            markdown_content += f"| {row['Counterparty']} | ¥{row['Amount']:.2f} | {row['Platform']} | {row['TransactionDate']} |\n"

    # Transaction details section
    markdown_content += "\n## Transaction Details\n"
    sorted_data = transaction_data.sort_values(by='TransactionDate')
    markdown_content += "| Type | Platform | Direction | Amount | Counterparty | Description | Account | Method | Status | Notes | Date |\n"
    markdown_content += "| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |\n"
    
    for _, row in sorted_data.iterrows():
        markdown_content += f"| {row['TransactionType']} | {row['Platform']} | {row['TransactionDirection']} | ¥{row['Amount']:.2f} | {row['Counterparty']} | {row['ProductDescription']} | {row['AccountNumber']} | {row['PaymentMethod']} | {row['Status']} | {row['Notes']} | {row['TransactionDate']} |\n"
    
    # Save markdown to file
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(markdown_content)

    print(f"Markdown report successfully saved as '{output_file}'")

# Example usage
# Process transaction files and create merged CSV
process_transaction_files('./data/alipay_records.csv', './data/wechat_records.csv', './data/combined_transactions.csv')
# Generate financial report in Markdown format
create_financial_report('./data/combined_transactions.csv', './data/financial_report.md')

Tags: Pythonpandas

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.