Automating Markdown Table Generation with Python
Keeping documentation in sync with version control history is tedious when done manually. This article shows how a short Python script can generate a Markdown table of Git commit data, so commit details no longer need to be copied by hand.
Problem Statement
The goal was to extract commit history from a GitHub repository and automatically generate a formatted Markdown table. For the devchat-ai/gopool project, this meant capturing:
- Commit messages (first line only)
- Abbreviated commit hashes
- Author information
- Links to full commit details
In addition, each commit entry needed a pair of placeholder files (one English, one Chinese) to hold its documentation later.
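The field mapping listed above can be sketched on a single commit object before any network code is involved. The sample dictionary below is made up for illustration and only mimics the shape of one element in a GitHub commits response; `summarize_commit` is a hypothetical helper name, not part of the script.

```python
from typing import Dict

# Hypothetical sample shaped like one element of a GitHub commits response;
# every value here is invented for illustration.
SAMPLE_COMMIT = {
    "sha": "0123456789abcdef0123456789abcdef01234567",
    "commit": {
        "message": "Add worker pool\n\nLonger description that is not shown.",
        "author": {"name": "Example Author"},
    },
    "author": {"login": "example-author"},
}


def summarize_commit(item: dict, owner: str, repository: str) -> Dict[str, str]:
    """Reduce one API commit object to the four fields the table needs."""
    sha = item["sha"]
    return {
        # Only the first line of a multi-line commit message is kept.
        "title": item["commit"]["message"].split("\n")[0],
        # Seven characters is the conventional abbreviated hash length.
        "short_hash": sha[:7],
        "author": item["commit"]["author"]["name"],
        "url": f"https://github.com/{owner}/{repository}/commit/{sha}",
    }


summary = summarize_commit(SAMPLE_COMMIT, "devchat-ai", "gopool")
print(summary)
```

Keeping this step as a pure function makes it easy to check against saved sample data without calling the API.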
Implementation Approach
A Python script was developed to interface with the GitHub API, retrieve commit data, process it according to specifications, and output a properly formatted Markdown table. The solution also handles creating associated documentation files.
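import os
from typing import Dict, List

import pandas as pd  # DataFrame.to_markdown() also requires the "tabulate" package
import requests


def fetch_repository_commits(owner: str, repository: str) -> List[Dict[str, str]]:
    """Fetch recent commits and shape each one into a table row."""
    api_endpoint = f"https://api.github.com/repos/{owner}/{repository}/commits"
    # The API is paginated; without extra parameters only the first page is returned.
    response = requests.get(api_endpoint, timeout=30)
    response.raise_for_status()
    processed_commits = []
    for item in response.json():
        sha = item["sha"]
        title_line = item["commit"]["message"].split("\n")[0]
        author_name = item["commit"]["author"]["name"]
        # "author" is null when the commit email is not linked to a GitHub account.
        account = item.get("author")
        if account:
            author_cell = f"[{author_name}](https://github.com/{account['login']})"
        else:
            author_cell = author_name
        processed_commits.append({
            "Commit": f"[{title_line}](https://github.com/{owner}/{repository}/commit/{sha})",
            "Hash": sha,  # full hash; shortened only when the table is rendered
            "Author": author_cell,
            "English Prompt": f"[Prompt Link](./commits/{sha}.md)",
            "Chinese Prompt": f"[提示链接](./commits/{sha}_zh.md)",
        })
    return processed_commits


def write_table_to_markdown(data: List[Dict[str, str]], output_file: str) -> None:
    """Write the commit rows to output_file as a Markdown table."""
    dataframe = pd.DataFrame(data)
    dataframe["Hash"] = dataframe["Hash"].str[:7]  # display the abbreviated hash
    # Create the target directory first; open() fails if it does not exist.
    os.makedirs(os.path.dirname(output_file) or ".", exist_ok=True)
    # Mode "w" replaces the table on each run; "a" would append a duplicate.
    with open(output_file, "w", encoding="utf-8") as file:
        file.write(dataframe.to_markdown(index=False))
        file.write("\n")


def initialize_commit_documents(record: Dict[str, str], folder: str = "devchat-ai/gopool/commits") -> None:
    """Create empty English and Chinese stub files for one commit, if missing."""
    os.makedirs(folder, exist_ok=True)
    for lang_suffix in ["", "_zh"]:
        path = os.path.join(folder, f"{record['Hash']}{lang_suffix}.md")
        # Skip files that already exist so notes written earlier are never overwritten.
        if not os.path.exists(path):
            with open(path, "w", encoding="utf-8") as new_file:
                new_file.write("")


def execute() -> None:
    organization = "devchat-ai"
    project = "gopool"
    target_file = "devchat-ai/gopool/index.md"
    commit_list = fetch_repository_commits(organization, project)
    write_table_to_markdown(commit_list, target_file)
    for entry in commit_list:
        initialize_commit_documents(entry)


if __name__ == "__main__":
    execute()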
Workflow Benefits
This automated approach replaces hours of meticulous manual formatting. Instead of copying individual commit details and constructing hyperlinks by hand, the entire table is generated in a single run. Using pandas for Markdown serialization keeps the output formatting code to a single call, and because the script checks whether each stub file exists before creating it, re-running the script never overwrites notes that have already been written.
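To make the serialization step less of a black box, here is a dependency-free sketch of turning a list of row dictionaries into a pipe table. It is not what pandas does internally (pandas delegates to the tabulate package, which also pads columns for alignment); `render_markdown_table` is a hypothetical helper and the rows are invented sample data.

```python
from typing import Dict, List


def render_markdown_table(rows: List[Dict[str, str]]) -> str:
    """Render a list of same-keyed dicts as a Markdown pipe table."""
    if not rows:
        return ""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        # Escape pipes so a "|" inside a commit title cannot split the cell.
        cells = [str(row[h]).replace("|", "\\|") for h in headers]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"


# Invented sample rows, using three of the column names from the article's table.
rows = [
    {"Commit": "Add worker pool", "Hash": "0123456", "Author": "Example Author"},
    {"Commit": "Fix a|b parsing", "Hash": "89abcde", "Author": "Example Author"},
]
table = render_markdown_table(rows)
print(table)
```

The pipe escaping is the main thing a hand-rolled version has to get right, since commit titles are free text.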
Beyond basic table generation, the script creates stub documentation files for each commit, preparing a framework for detailed prompt histories. This dual functionality shows how one automation script can handle both structured data extraction and preparatory filesystem tasks in the same run.
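The safe re-run behavior of the stub files can be demonstrated in a temporary directory. The sketch below is a variant of the script's existence check, not the script's own code: it uses the exclusive-creation mode `"x"` of `open()` in place of `os.path.exists`. `ensure_stub` and the file name are hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_stub(path: Path) -> bool:
    """Create an empty file at `path` unless it exists; return True if created."""
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "x" raises FileExistsError if the file is already there,
        # so existing content is never truncated.
        with open(path, "x", encoding="utf-8"):
            pass
        return True
    except FileExistsError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    stub = Path(tmp) / "commits" / "0123456.md"  # hypothetical commit hash
    first_run = ensure_stub(stub)  # creates the empty stub
    stub.write_text("notes written by hand", encoding="utf-8")
    second_run = ensure_stub(stub)  # leaves the edited file alone
    preserved = stub.read_text(encoding="utf-8")

print(first_run, second_run, preserved)
```

Exclusive creation folds the check and the write into one step, which avoids the small window between testing for the file and opening it.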