Automated Meme Battles: Scraping, Searching, and Sending Stickers with Python

Tech May 9 3

Meme battles (斗图) demand instant replies with the right reaction image. We can automate the entire pipeline in three stages: collect a large sticker dataset from a public website, search it locally by keyword, and integrate with a WeChat messaging interface to send images automatically.

Scraping Stickers from Doutula

The website http://www.doutula.com hosts thousands of stickers across many paginated gallery pages. Each page follows a simple structure, making it easy to parse with requests and a regex. The script below fetches images from a range of pages concurrently using ThreadPoolExecutor, extracts the image URL and caption, cleans the filename, and saves to a local doutula folder.

import requests
import re
import os
from concurrent.futures import ThreadPoolExecutor

def fetch_and_save(page):
    base_url = 'http://www.doutula.com/photo/list/?page='
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
        'Accept': 'text/html,application/xhtml+xml,*/*',
        'Accept-Language': 'zh-CN,zh;q=0.9'
    }
    target = f"{base_url}{page}"
    resp = requests.get(target, headers=headers, timeout=10)
    content = resp.text

    # Capture image URL and alt text simultaneously
    pattern = re.compile(r'data-original="(.*?)".*?alt="(.*?)"', re.DOTALL)
    matches = pattern.findall(content)

    os.makedirs('doutula', exist_ok=True)
    for img_url, alt_text in matches:
        # Clean alt text to be a valid filename
        safe_name = re.sub(r'[\\/:*?"<>|《》。？！.!&#()（）]', '', alt_text)
        ext = img_url.split('.')[-1].split('?')[0]  # handle possible query strings
        file_path = os.path.join('doutula', f"{safe_name}.{ext}")
        try:
            # Download and save the image
            img_data = requests.get(img_url, headers=headers, timeout=10).content
            with open(file_path, 'wb') as f:
                f.write(img_data)
            print(f"Saved {file_path}")
        except Exception as e:
            print(f"Failed {img_url}: {e}")

if __name__ == '__main__':
    pages = range(1, 51)    # adjust range as needed
    with ThreadPoolExecutor(max_workers=10) as pool:
        pool.map(fetch_and_save, pages)
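
The extraction and filename-cleaning steps can be checked offline, without hitting the site. The sketch below runs the same regex over a hand-written HTML fragment (the markup is an assumption for illustration, not copied from doutula.com) and adds two safeguards the script above lacks: a fallback name for captions that end up empty, and a numeric suffix for duplicate captions so later downloads do not overwrite earlier ones. The helper names `extract_stickers` and `unique_name` are hypothetical.

```python
import re

# Same pattern as the scraper: lazy URL capture followed by the alt caption.
IMG_PATTERN = re.compile(r'data-original="(.*?)".*?alt="(.*?)"', re.DOTALL)
# Characters that are unsafe in filenames on common filesystems.
UNSAFE = re.compile(r'[\\/:*?"<>|]')

def extract_stickers(html):
    """Return (url, caption) pairs found in an HTML fragment."""
    return IMG_PATTERN.findall(html)

def unique_name(caption, ext, seen):
    """Build a safe filename, avoiding empty and duplicate names."""
    base = UNSAFE.sub('', caption).strip() or 'untitled'
    name = f"{base}.{ext}"
    count = 1
    while name in seen:
        name = f"{base}_{count}.{ext}"
        count += 1
    seen.add(name)
    return name

# Hypothetical markup, for illustration only.
sample = '''
<img data-original="http://example.com/a/1.jpg" alt="失望">
<img data-original="http://example.com/a/2.jpg?x=1" alt="失望">
<img data-original="http://example.com/a/3.png" alt="?">
'''

seen = set()
names = []
for url, caption in extract_stickers(sample):
    ext = url.split('.')[-1].split('?')[0]  # drop any query string
    names.append(unique_name(caption, ext, seen))
print(names)
```

In `fetch_and_save`, the `seen` set would need to be shared across threads (or replaced with an existence check on disk) for the suffix logic to hold under concurrency.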

Local Fuzzy Search with Glob

After downloading thousands of stickers, we want to quickly find images whose filenames contain a given keyword. The standard library's glob module can do this with a wildcard pattern:

import glob
import os

keyword = "失望"  # "disappointed"
sticker_dir = os.path.join(os.getcwd(), "doutula")
pattern = os.path.join(sticker_dir, f"*{keyword}*.*")
for path in glob.glob(pattern):
    print(path)

Alternatively, with pathlib:

from pathlib import Path

keyword = "失望"  # same keyword as in the glob example
folder = Path('doutula')
for img in folder.glob(f'*{keyword}*.*'):
    print(img)

Both approaches return the matching sticker paths, ready to be sent.
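
A keyword that contains glob metacharacters (`*`, `?`, `[`) would be interpreted as part of the pattern by both snippets above. Here is a minimal sketch of a safer helper, assuming the same flat folder of sticker files; the function name `search_stickers` is hypothetical. It escapes the keyword with `glob.escape` and sorts the results so repeated searches return stickers in a stable order.

```python
import glob
import os
import tempfile

def search_stickers(folder, keyword, limit=None):
    """Return sorted paths in `folder` whose filename contains `keyword`."""
    # glob.escape neutralises *, ? and [ so they match literally.
    pattern = os.path.join(glob.escape(folder), f"*{glob.escape(keyword)}*.*")
    matches = sorted(glob.glob(pattern))
    return matches if limit is None else matches[:limit]

# Demonstration against a throwaway folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("失望的猫.jpg", "非常失望.gif", "开心.png", "what[1].jpg"):
        open(os.path.join(tmp, name), "wb").close()
    hits = [os.path.basename(p) for p in search_stickers(tmp, "失望")]
    literal = [os.path.basename(p) for p in search_stickers(tmp, "[1]")]
    print(hits)
    print(literal)
```

This is still substring matching on filenames, so search quality depends on how descriptive the scraped captions are.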

WeChat Automation with itchat

To participate in a meme battle, we use itchat to log into Web WeChat, listen for incoming text messages, and reply with up to three relevant stickers. The script matches the message text against local filenames and sends the first matches with a small delay for a natural feel.

import itchat
import glob
import time
import os

def find_stickers(keyword, limit=3):
    directory = os.path.join(os.getcwd(), 'doutula')
    pattern = os.path.join(directory, f'*{keyword}*.*')
    results = []
    for path in glob.glob(pattern):
        results.append(path)
        if len(results) >= limit:
            break
    return results

# itchat's text message type constant is itchat.content.TEXT (the string 'Text')
@itchat.msg_register(itchat.content.TEXT)
def reply_with_meme(msg):
    kw = msg.text.strip()
    if not kw:
        return
    stickers = find_stickers(kw)
    if not stickers:
        # fallback: send a random sticker or do nothing
        pass
    else:
        for sticker in stickers:
            msg.user.send_image(sticker)
            time.sleep(0.3)

if __name__ == '__main__':
    itchat.auto_login(hotReload=True)
    itchat.run()
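
The handler above leaves the no-match branch empty. One way to fill it, sketched under the assumption that any sticker is better than silence, is to fall back to a random file from the library. The helper name `pick_stickers` is hypothetical, and it is kept free of itchat calls so it can be tried without logging in.

```python
import glob
import os
import random
import tempfile

def pick_stickers(folder, keyword, limit=3, rng=random):
    """Return up to `limit` matching sticker paths, or one random fallback."""
    safe_folder = glob.escape(folder)
    pattern = os.path.join(safe_folder, f"*{glob.escape(keyword)}*.*")
    matches = sorted(glob.glob(pattern))[:limit]
    if matches:
        return matches
    # No filename contains the keyword: fall back to any one sticker.
    everything = sorted(glob.glob(os.path.join(safe_folder, "*.*")))
    return [rng.choice(everything)] if everything else []

# Demonstration against throwaway folders with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("失望.jpg", "开心.png"):
        open(os.path.join(tmp, name), "wb").close()
    matched = [os.path.basename(p) for p in pick_stickers(tmp, "失望")]
    fallback = [os.path.basename(p) for p in pick_stickers(tmp, "no-such-keyword")]
    print(matched, fallback)

with tempfile.TemporaryDirectory() as empty:
    empty_case = pick_stickers(empty, "anything")
```

Inside `reply_with_meme`, the loop could then iterate over the result of `pick_stickers` instead of branching on an empty list.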

This setup automates meme responses end to end. The collection and search steps are independent, so you can rebuild the local library as needed, and the bot replies with matching (and sometimes hilarious) images whenever a sticker filename contains the incoming text.

Tags: Python