Automating Markdown Table Generation with Python

Tech, May 10

Keeping documentation in sync with version control history by hand is tedious. This article shows how to generate a Markdown table of Git commit data automatically with Python, so the table no longer has to be assembled and updated manually.

Problem Statement

The goal was to extract commit history from a GitHub repository and automatically generate a formatted Markdown table. For the devchat-ai/gopool project, this meant capturing:

  • Commit messages (first line only)
  • Abbreviated commit hashes
  • Author information
  • Links to full commit details

In addition, each commit entry needed a pair of empty placeholder files, one for the English prompt and one for the Chinese prompt.
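
As a rough sketch of the target output, the snippet below turns one commit record, shaped like an entry in the GitHub commits API response, into a single table row. The sample record and the helper name `commit_to_row` are made up for illustration and are not taken from the gopool repository.

```python
from typing import Any, Dict


def commit_to_row(item: Dict[str, Any], owner: str, repo: str) -> str:
    """Build one Markdown table row from a commit record (illustrative helper)."""
    # Keep the first line of the message only, and escape pipes so a
    # commit title cannot break the table row.
    title = item["commit"]["message"].split("\n")[0].replace("|", "\\|")
    sha = item["sha"]
    author = item["commit"]["author"]["name"]
    url = f"https://github.com/{owner}/{repo}/commit/{sha}"
    # The link uses the full SHA; the Hash column shows the 7-character form.
    return f"| [{title}]({url}) | {sha[:7]} | {author} |"


# Hypothetical sample record, not real repository data.
sample = {
    "sha": "0123456789abcdef0123456789abcdef01234567",
    "commit": {
        "message": "Add worker pool\n\nLonger description here.",
        "author": {"name": "Jane Doe"},
    },
}

row = commit_to_row(sample, "devchat-ai", "gopool")
print(row)
```

The full script further down builds the same kind of cells but hands the row layout to pandas instead of formatting the pipes by hand.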

Implementation Approach

A Python script was developed to interface with the GitHub API, retrieve commit data, process it according to specifications, and output a properly formatted Markdown table. The solution also handles creating associated documentation files.

import os
import requests
import pandas as pd
from typing import Dict, List

def fetch_repository_commits(owner: str, repository: str) -> List[Dict[str, str]]:
    api_endpoint = f"https://api.github.com/repos/{owner}/{repository}/commits"
    # Only the first page is requested; the API returns 30 commits per page by default.
    response = requests.get(api_endpoint, timeout=30)
    response.raise_for_status()
    
    processed_commits = []
    raw_data = response.json()
    
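    for item in raw_data:
        # Keep only the first line of the commit message as the title.
        title_line = item['commit']['message'].split('\n')[0]
        author_name = item['commit']['author']['name']
        # The top-level "author" is null when the commit email is not linked
        # to a GitHub account, so fall back to the plain name in that case.
        github_user = item.get('author') or {}
        if github_user.get('login'):
            author_cell = f"[{author_name}](https://github.com/{github_user['login']})"
        else:
            author_cell = author_name

        processed_commits.append({
            "Commit": f"[{title_line}](https://github.com/{owner}/{repository}/commit/{item['sha']})",
            # Full SHA: the stub file names use it. The table writer shortens it for display.
            "Hash": item["sha"],
            "Author": author_cell,
            "English Prompt": f"[Prompt Link](./commits/{item['sha']}.md)",
            "Chinese Prompt": f"[提示链接](./commits/{item['sha']}_zh.md)",
        })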
    return processed_commits

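def write_table_to_markdown(data: List[Dict[str, str]], output_file: str) -> None:
    dataframe = pd.DataFrame(data)
    # Show the abbreviated 7-character hash in the table.
    dataframe['Hash'] = dataframe['Hash'].apply(lambda x: x[:7])

    # Create the parent directory first; open() fails if it does not exist.
    os.makedirs(os.path.dirname(output_file) or ".", exist_ok=True)
    # to_markdown needs the optional "tabulate" package. Mode "a" appends, so
    # each run adds another copy of the table. UTF-8 covers the Chinese link text.
    with open(output_file, "a", encoding="utf-8") as file:
        file.write(dataframe.to_markdown(index=False) + "\n")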

def initialize_commit_documents(record: Dict[str, str], folder: str = "devchat-ai/gopool/commits") -> None:
    os.makedirs(folder, exist_ok=True)
    
    for lang_suffix in ["", "_zh"]:
        path = os.path.join(folder, f"{record['Hash']}{lang_suffix}.md")
        if not os.path.exists(path):
            with open(path, "w") as new_file:
                new_file.write("")

def execute() -> None:
    organization = "devchat-ai"
    project = "gopool"
    target_file = "devchat-ai/gopool/index.md"
    
    commit_list = fetch_repository_commits(organization, project)
    write_table_to_markdown(commit_list, target_file)
    
    for entry in commit_list:
        initialize_commit_documents(entry)

if __name__ == "__main__":
    execute()

Workflow Benefits

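This automated approach replaces tedious manual formatting. Instead of copying individual commit details and constructing hyperlinks by hand, a single run produces the entire table. Using pandas for Markdown serialization (DataFrame.to_markdown, which relies on the tabulate package) keeps the output formatting simple. Stub-file creation is idempotent, since existing files are never overwritten. The table itself is appended to the index file, so re-running the script adds a second copy unless the earlier one is removed first.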
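
Because `write_table_to_markdown` opens its target in append mode, one way to make that step re-runnable is to wrap the table in marker comments and replace whatever sits between them. This is a minimal sketch under assumptions, not part of the original script: the marker strings and the function name `upsert_table` are hypothetical.

```python
import os
import tempfile

# Hypothetical markers; HTML comments stay invisible in rendered Markdown.
START = "<!-- commits:start -->"
END = "<!-- commits:end -->"


def upsert_table(path: str, table: str) -> None:
    """Write `table` between the markers, replacing any earlier copy."""
    block = f"{START}\n{table}\n{END}\n"
    text = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            text = fh.read()

    start, end = text.find(START), text.find(END)
    if start != -1 and end > start:
        # Replace the old block and keep the text on either side of it.
        after = text[end + len(END):]
        if after.startswith("\n"):
            after = after[1:]
        text = text[:start] + block + after
    else:
        # First run: add the block after any existing content.
        if text and not text.endswith("\n"):
            text += "\n"
        text += block

    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)


# Demonstration in a temporary directory with made-up table content.
with tempfile.TemporaryDirectory() as tmp:
    index = os.path.join(tmp, "index.md")
    with open(index, "w", encoding="utf-8") as fh:
        fh.write("# Commits\n")
    upsert_table(index, "| Commit |\n| --- |\n| first |")
    upsert_table(index, "| Commit |\n| --- |\n| second |")
    with open(index, encoding="utf-8") as fh:
        result = fh.read()

print(result)
```

After the second call the file still contains one table, and the heading written before the first call is untouched.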

Beyond basic table generation, the script creates stub documentation files for each commit, preparing a framework for detailed prompt histories. This dual functionality shows how one script can handle both structured data extraction and preparatory filesystem tasks in a single run.
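
The stub-creation step can also be written more compactly with pathlib. The sketch below is an alternative to the `os.path` version, not the author's code: `create_stubs` is a hypothetical name and the SHA is a made-up value. `Path.touch(exist_ok=True)` creates a missing file and leaves the content of an existing one alone.

```python
import tempfile
from pathlib import Path
from typing import List


def create_stubs(sha: str, folder: Path) -> List[Path]:
    """Create the English and Chinese stub files for one commit."""
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for suffix in ("", "_zh"):
        path = folder / f"{sha}{suffix}.md"
        # Creates the file if missing; existing content is not modified.
        path.touch(exist_ok=True)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "commits"
    sha = "0123456789abcdef0123456789abcdef01234567"  # hypothetical SHA
    first = create_stubs(sha, folder)
    # Simulate notes written by hand between two runs of the script.
    first[0].write_text("notes", encoding="utf-8")
    create_stubs(sha, folder)
    names = sorted(p.name for p in folder.iterdir())
    kept = first[0].read_text(encoding="utf-8")

print(names, kept)
```

The second call leaves the hand-written notes in place, which is the property that makes re-running the stub step safe.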

Tags: Python
