A Python Script to Parse Email Data from Phonebook JSON API Response
During a security assessment, email address collection is a critical task. Manual extraction from API responses, such as those from a banking site's phonebook function, is inefficient when dealing with large volumes of JSON data.
Network inspection (via browser developer tools) reveals the target endpoint returns email information in a structured JSON format. While regex is a possible solution, parsing the JSON directly is more reliable and efficient.
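To illustrate why direct parsing tends to be more reliable than a regex, the following is a minimal sketch. The response fragment and the email pattern are made up for illustration, and the escaped "@" is an assumption about what a server could legally emit, not something observed on the target.

```python
import json
import re

# Hypothetical response fragment. The "@" in the second address is written
# as the JSON escape \u0040, which is valid JSON for the same character.
raw = (
    '{"selectors": ['
    '{"selectorvalue": "a.user@example.com"}, '
    '{"selectorvalue": "b.user\\u0040example.com"}'
    ']}'
)

# A simple (illustrative, not exhaustive) email pattern applied to the raw text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
regex_hits = EMAIL_RE.findall(raw)

# Parsing decodes escapes first, so both values come back intact.
parsed_hits = [entry["selectorvalue"] for entry in json.loads(raw)["selectors"]]

print("regex :", regex_hits)
print("parsed:", parsed_hits)
```

In this sketch the regex only sees the address whose "@" appears literally, while the parser returns both values.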
Preparation
Save the complete JSON response from the relevant API call to a local file named email.json. Viewing this file in a browser or editor helps identify the data structure. The target email addresses are nested within an array under the key "selectors", each stored in a "selectorvalue" field.
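As a rough way to confirm that structure before writing the parser, the sketch below summarizes a response's shape. The sample data and the helper name summarize_structure are hypothetical; a real response will likely carry additional keys alongside "selectorvalue".

```python
import json

# Hypothetical sample that mirrors the structure described above.
SAMPLE_RESPONSE = """
{
  "selectors": [
    {"selectorvalue": "alice@example.com"},
    {"selectorvalue": "bob@example.com"}
  ]
}
"""


def summarize_structure(data):
    """Return the top-level keys, entry count, and keys of the first entry."""
    selectors = data.get("selectors", [])
    first_entry_keys = sorted(selectors[0].keys()) if selectors else []
    return sorted(data.keys()), len(selectors), first_entry_keys


top_keys, count, entry_keys = summarize_structure(json.loads(SAMPLE_RESPONSE))
print("Top-level keys:", top_keys)
print("Entries under 'selectors':", count)
print("Keys in first entry:", entry_keys)
```

To run the same check against the saved file, load it with json.load on an open file handle instead of json.loads on the sample string.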
Core Parsing Logic
The essential code to extract these values is straightforward.
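import json

# Load the saved API response
with open('email.json', 'r', encoding='utf-8') as json_file:
    json_data = json.load(json_file)

# Each entry in the "selectors" array holds one address in "selectorvalue"
for entry in json_data["selectors"]:
    email_address = entry["selectorvalue"]
    print(email_address)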
Enhanced Script with File Management
To create a reusable utility, the script is enhanced with a command-line argument and more robust file handling.
import json
import os
import sys

INTERMEDIATE_FILE = "temp_output.txt"


def extract_emails(json_filename):
    """Reads JSON file and extracts email addresses from the selectors array."""
    try:
        with open(json_filename, 'r', encoding='utf-8') as f:
            dataset = json.load(f)
    except FileNotFoundError:
        print(f"Error: File '{json_filename}' not found.")
        sys.exit(1)
    except json.JSONDecodeError:
        print(f"Error: File '{json_filename}' contains invalid JSON.")
        sys.exit(1)

    email_list = []
    for item in dataset.get("selectors", []):
        email = item.get("selectorvalue")
        if email:
            email_list.append(email)
    return email_list


def write_results(data_list, output_filename):
    """Writes a list of strings to a file, each on a new line."""
    with open(output_filename, 'w', encoding='utf-8') as f:
        for line in data_list:
            f.write(line + '\n')


if __name__ == '__main__':
    # Validate command-line argument for host identifier
    if len(sys.argv) != 2:
        print('Usage: python email_parser.py <host_identifier>')
        sys.exit(1)

    host_tag = sys.argv[1]
    final_filename = f"{host_tag}-email.txt"
    print(f"Target output file: {final_filename}")

    # Clear intermediate file if it exists from a previous run
    if os.path.exists(INTERMEDIATE_FILE):
        open(INTERMEDIATE_FILE, 'w').close()

    # Core process: Extract and write emails
    emails = extract_emails('email.json')
    write_results(emails, INTERMEDIATE_FILE)

    # Manage final file: replace if exists, otherwise rename.
    if os.path.exists(final_filename):
        os.remove(final_filename)
    os.rename(INTERMEDIATE_FILE, final_filename)

    print(f"Extraction complete. Results saved to '{final_filename}'.")
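The remove-then-rename step above could plausibly be collapsed into a single call, since os.replace overwrites an existing destination. The sketch below is one possible variant, not the script's own method: the helper name publish_results and the use of a temporary file from tempfile.mkstemp are assumptions introduced for illustration.

```python
import os
import tempfile


def publish_results(lines, final_path):
    """Write lines to a temp file next to final_path, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file in the same directory so the final swap stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            for line in lines:
                f.write(line + "\n")
        # os.replace overwrites final_path if it already exists.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    # Demonstration in a throwaway directory: the second call overwrites the first.
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "demo-email.txt")
        publish_results(["old@example.com"], target)
        publish_results(["new@example.com"], target)
        with open(target, encoding="utf-8") as f:
            print(f.read().strip())
```

With this approach a fixed intermediate filename and the explicit existence checks would no longer be needed, at the cost of a slightly longer helper.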
Execution Examples
# Incorrect usage prompts help message
python email_parser.py
# First run creates 'ccb-email.txt'
python email_parser.py ccb
# Subsequent run overwrites the existing 'ccb-email.txt'
python email_parser.py ccb