A Python Script to Parse Email Data from Phonebook JSON API Response

During a security assessment, collecting email addresses is a common task. Manually copying them out of API responses, such as those returned by a banking site's phonebook function, does not scale when the JSON data is large.

Network inspection (via browser developer tools) reveals that the target endpoint returns email information in a structured JSON format. While regex is a possible solution, parsing the JSON directly is more reliable, because it reads only the intended fields rather than every string that happens to look like an address.
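To illustrate the difference, the sketch below runs both approaches on a small hypothetical payload. The sample data, the "note" field, and the EMAIL_RE pattern are assumptions made for illustration; they are not taken from the real API response.

```python
import json
import re

# Hypothetical payload: the "note" field contains an address that is
# not one of the entries we actually want to collect.
raw = json.dumps({
    "selectors": [
        {"selectorvalue": "alice@example.com", "note": "cc bob@example.com"},
        {"selectorvalue": "carol@example.com", "note": ""},
    ]
})

# Deliberately simple pattern for illustration; real address syntax is broader.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Regex scans the raw text and picks up every address-like string.
regex_hits = EMAIL_RE.findall(raw)

# JSON parsing reads only the field we care about.
parsed_hits = [item["selectorvalue"] for item in json.loads(raw)["selectors"]]

print(regex_hits)
print(parsed_hits)
```

On this sample the regex also returns the address buried in the "note" field, while the parsed version returns only the two "selectorvalue" entries.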

Preparation

Save the complete JSON response from the relevant API call to a local file named email.json. Viewing this file in a browser or editor helps identify the data structure. The target email addresses are nested within an array under the key "selectors", each stored in a "selectorvalue" field.
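For reference, a response with the layout described above might look like the sample below. The addresses and the helper name describe_structure are hypothetical, and a real response will likely carry additional fields.

```python
import json

# Hypothetical sample mirroring the described layout: a top-level
# "selectors" array whose items each hold a "selectorvalue" field.
sample_text = """
{
  "selectors": [
    {"selectorvalue": "user.one@example.com"},
    {"selectorvalue": "user.two@example.com"}
  ]
}
"""

def describe_structure(data):
    """Return the top-level keys and the keys of the first "selectors" item."""
    top_keys = sorted(data.keys())
    selectors = data.get("selectors", [])
    first_item_keys = sorted(selectors[0].keys()) if selectors else []
    return top_keys, first_item_keys

sample = json.loads(sample_text)
print(describe_structure(sample))
```

Running the same helper on the saved email.json is a quick way to confirm the key names before writing the extraction code.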

Core Parsing Logic

The essential code to extract these values is straightforward.

import json

with open('email.json', 'r', encoding='utf-8') as json_file:
    json_data = json.load(json_file)

for entry in json_data["selectors"]:
    email_address = entry["selectorvalue"]
    print(email_address)

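Enhanced Script with File Management

To create a reusable utility, the script is extended with a command-line argument that names the output file, error handling for a missing or malformed input file, and replacement of any output left by a previous run.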

import json
import os
import sys

INTERMEDIATE_FILE = "temp_output.txt"

def extract_emails(json_filename):
    """Reads JSON file and extracts email addresses from the selectors array."""
    try:
        with open(json_filename, 'r', encoding='utf-8') as f:
            dataset = json.load(f)
    except FileNotFoundError:
        # Report errors on stderr so they do not mix with normal output
        print(f"Error: File '{json_filename}' not found.", file=sys.stderr)
        sys.exit(1)
    except json.JSONDecodeError:
        print(f"Error: File '{json_filename}' contains invalid JSON.", file=sys.stderr)
        sys.exit(1)

    email_list = []
    for item in dataset.get("selectors", []):
        email = item.get("selectorvalue")
        if email:
            email_list.append(email)
    return email_list

def write_results(data_list, output_filename):
    """Writes a list of strings to a file, each on a new line."""
    with open(output_filename, 'w', encoding='utf-8') as f:
        for line in data_list:
            f.write(line + '\n')

if __name__ == '__main__':
    # Validate command-line argument for host identifier
    if len(sys.argv) != 2:
        print('Usage: python email_parser.py <host_identifier>')
        sys.exit(1)

    host_tag = sys.argv[1]
    final_filename = f"{host_tag}-email.txt"

    print(f"Target output file: {final_filename}")

    # No explicit clearing is needed: write_results() opens the
    # intermediate file in 'w' mode, which truncates any previous content.

    # Core process: Extract and write emails
    emails = extract_emails('email.json')
    write_results(emails, INTERMEDIATE_FILE)

    # Move the intermediate file into place. os.replace overwrites an
    # existing destination on both POSIX and Windows.
    os.replace(INTERMEDIATE_FILE, final_filename)

    print(f"Extraction complete. Results saved to '{final_filename}'.")
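The final step, moving the intermediate file over any existing output, can be tried in isolation. The following is a minimal sketch under stated assumptions: the function name publish and the file names are hypothetical, and everything runs in a throwaway temporary directory so no real output is touched.

```python
import os
import tempfile

def publish(intermediate_path, final_path):
    """Move the intermediate file to final_path, overwriting an existing file."""
    # os.replace overwrites the destination on both POSIX and Windows.
    os.replace(intermediate_path, final_path)

with tempfile.TemporaryDirectory() as workdir:
    final_path = os.path.join(workdir, "demo-email.txt")
    temp_path = os.path.join(workdir, "temp_output.txt")

    # Simulate two consecutive runs that produce different results.
    for content in ["first@example.com\n", "second@example.com\n"]:
        with open(temp_path, "w", encoding="utf-8") as f:
            f.write(content)
        publish(temp_path, final_path)

    with open(final_path, encoding="utf-8") as f:
        final_content = f.read()
    leftover_exists = os.path.exists(temp_path)

print(repr(final_content), leftover_exists)
```

After the second simulated run, the final file holds only the newer content and the intermediate file no longer exists, which matches the overwrite behavior described for repeated runs of the script.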

Execution Examples

# Running without an argument prints the usage message and exits
python email_parser.py

# First run creates 'ccb-email.txt'
python email_parser.py ccb

# Subsequent run overwrites the existing 'ccb-email.txt'
python email_parser.py ccb
