Fading Coder

Exporting Pod Images from Kubernetes Clusters: A Practical Guide

Tech May 9 3

Problem Statement and Solution

Changes to the Docker Hub registry often cause image pulls to fail when Pods restart, which keeps applications from starting. Guarding against this requires a systematic audit of the container images in use across a Kubernetes cluster. A Python utility can scan Pod specifications and extract the image references, producing the list of images to migrate to a private registry such as Harbor.

The solution presented here queries the Kubernetes API to enumerate the containers declared in every Pod, collecting the namespace, Pod name, container name, image reference, and image pull policy for each. The scan can be limited to a single namespace or run across all namespaces, and the results are written as CSV, either to a file or to standard output.

Python Implementation

#!/usr/bin/env python3
"""
Kubernetes Pod Image Exporter
Scans cluster resources and extracts container image references.
"""

import argparse
import csv
import logging
import sys

from kubernetes import client, config
from kubernetes.client.rest import ApiException


def initialize_logging():
    """Configure application logging with timestamp and severity."""
    logging.basicConfig(
        # INFO keeps the script's progress messages without the client library's debug output.
        level=logging.INFO,
        format='%(asctime)s | %(levelname)-8s | %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    return logging.getLogger(__name__)


def parse_cli_arguments():
    """Process command-line interface parameters."""
    parser = argparse.ArgumentParser(
        description='Extract container image references from Kubernetes Pods.',
        epilog='Example: python3 exporter.py --namespace kube-system --output images.csv'
    )
    parser.add_argument(
        '--namespace', '-n',
        default='all',
        help='Target namespace or "all" for comprehensive scan (default: all)'
    )
    parser.add_argument(
        '--output', '-o',
        default=None,
        help='Output file path for CSV format (prints to stdout if omitted)'
    )
    parser.add_argument(
        '--kubeconfig', '-k',
        default=None,
        help='Path to kubeconfig file (uses default location if omitted)'
    )
    return parser.parse_args()


def establish_api_connection(kubeconfig_path):
    """Authenticate with Kubernetes cluster using specified configuration."""
    try:
        if kubeconfig_path:
            config.load_kube_config(config_file=kubeconfig_path)
        else:
            config.load_kube_config()
        return client.CoreV1Api()
    except config.ConfigException as error:
        logger.error(f'Configuration loading failed: {error}')
        sys.exit(2)


def retrieve_namespace_list(core_api, scope):
    """Fetch namespaces based on scope parameter."""
    try:
        if scope.lower() == 'all':
            response = core_api.list_namespace()
            return [item.metadata.name for item in response.items]
        else:
            response = core_api.read_namespace(name=scope)
            return [response.metadata.name]
    except ApiException as error:
        logger.error(f'Namespace query failed: {error.status} - {error.reason}')
        sys.exit(3)


def collect_pod_images(core_api, namespace):
    """Enumerate containers and their images within a namespace."""
    results = []
    try:
        pod_list = core_api.list_namespaced_pod(namespace=namespace)
        for pod in pod_list.items:
            pod_name = pod.metadata.name
            for container in pod.spec.containers:
                results.append({
                    'namespace': namespace,
                    'pod': pod_name,
                    'container': container.name,
                    'image': container.image,
                    'image_pull_policy': container.image_pull_policy
                })
    except ApiException as error:
        logger.warning(f'Skipped namespace {namespace}: {error.status}')
    return results


def write_csv_output(records, filepath):
    """Persist extracted data to CSV format."""
    fieldnames = ['namespace', 'pod', 'container', 'image', 'image_pull_policy']
    with open(filepath, 'w', newline='') as target:
        writer = csv.DictWriter(target, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    logger.info(f'Records exported to {filepath}')


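def display_stdout(records):
    """Write records as CSV to standard output."""
    fieldnames = ['namespace', 'pod', 'container', 'image', 'image_pull_policy']
    # csv.DictWriter quotes any field that contains a comma. Log messages go to
    # stderr by default, so stdout carries only the CSV and can be redirected.
    writer = csv.DictWriter(sys.stdout, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    writer.writerows(records)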


def main():
    """Orchestrate the image export workflow."""
    global logger
    logger = initialize_logging()
    
    cli_args = parse_cli_arguments()
    logger.info(f'Starting image inventory for namespace: {cli_args.namespace}')
    
    api_connection = establish_api_connection(cli_args.kubeconfig)
    target_namespaces = retrieve_namespace_list(api_connection, cli_args.namespace)
    
    aggregated_data = []
    for ns_name in target_namespaces:
        logger.info(f'Processing namespace: {ns_name}')
        namespace_records = collect_pod_images(api_connection, ns_name)
        aggregated_data.extend(namespace_records)
    
    if cli_args.output:
        write_csv_output(aggregated_data, cli_args.output)
    else:
        display_stdout(aggregated_data)
    
    logger.info(f'Export complete. Total containers discovered: {len(aggregated_data)}')


if __name__ == '__main__':
    main()
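
The script reads only `pod.spec.containers`, so init containers are left out of the inventory even though their images are also pulled when a Pod starts. The following is a minimal sketch of a helper that covers both lists. It assumes Pod objects shaped like those returned by the Kubernetes Python client, where `init_containers` on the Pod spec may be `None`. The helper name `extract_pod_records` and the extra `kind` column are illustrative and not part of the script above.

```python
from types import SimpleNamespace


def extract_pod_records(pod, namespace):
    """Return one record per init container and regular container in a Pod."""
    records = []
    # Either list may be None on a real Pod object, so fall back to an empty list.
    groups = (
        ('init', pod.spec.init_containers or []),
        ('container', pod.spec.containers or []),
    )
    for kind, containers in groups:
        for container in containers:
            records.append({
                'namespace': namespace,
                'pod': pod.metadata.name,
                'kind': kind,
                'container': container.name,
                'image': container.image,
                'image_pull_policy': container.image_pull_policy,
            })
    return records


# Stand-in object shaped like a Pod, so the sketch runs without a cluster.
demo_pod = SimpleNamespace(
    metadata=SimpleNamespace(name='web-0'),
    spec=SimpleNamespace(
        init_containers=[
            SimpleNamespace(name='migrate', image='busybox:1.36',
                            image_pull_policy='IfNotPresent'),
        ],
        containers=[
            SimpleNamespace(name='app', image='nginx:1.25',
                            image_pull_policy='Always'),
        ],
    ),
)

for record in extract_pod_records(demo_pod, 'default'):
    print(record['kind'], record['container'], record['image'])
```

If this approach is adopted, the loop body in collect_pod_images could call such a helper for each Pod, and the CSV field list would need the additional `kind` column.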

Usage Scenarios

Namespace-Specific Export

Target a particular namespace for container image inventory:

python3 exporter.py --namespace kube-system

Or using short flags:

python3 exporter.py -n monitoring

Cluster-Wide Scan

Enumerate images across all namespaces:

python3 exporter.py --namespace all
python3 exporter.py -n all

Persistent CSV Export

Direct results to a spreadsheet-compatible file for further analysis:

python3 exporter.py --namespace all --output /tmp/pod_images.csv
python3 exporter.py -n all -o cluster_inventory.csv
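
As one example of further analysis, the exported file can be read back with the standard library to see how many containers use each image. This is a small sketch, not part of the exporter. The function name `count_images` is hypothetical, and the inline sample rows stand in for a real export with the same header.

```python
import csv
import io
from collections import Counter


def count_images(csv_file):
    """Count how many containers use each image in an exported CSV."""
    reader = csv.DictReader(csv_file)
    return Counter(row['image'] for row in reader)


# Inline sample standing in for a file produced with --output.
sample = io.StringIO(
    'namespace,pod,container,image,image_pull_policy\n'
    'default,web-0,app,nginx:1.25,Always\n'
    'default,web-1,app,nginx:1.25,Always\n'
    'kube-system,dns-0,coredns,registry.k8s.io/coredns/coredns:v1.11.1,IfNotPresent\n'
)

# Most frequently used images first.
for image, uses in count_images(sample).most_common():
    print(f'{uses:>3}  {image}')
```

With a real export, open the file with `open('cluster_inventory.csv', newline='')` and pass the file object to `count_images` in place of `sample`.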

Custom Kubeconfig Path

When managing multiple clusters, specify authentication configuration explicitly:

python3 exporter.py --namespace production --kubeconfig /path/to/cluster.yaml

Output Structure

The exported data includes the following fields per container:

Field              Description
namespace          Kubernetes namespace containing the Pod
pod                Name of the Pod hosting the container
container          Container name within the Pod specification
image              Container image reference as written in the Pod specification (registry, repository, and tag where given)
image_pull_policy  Image pull behavior (Always, Never, IfNotPresent)

This information enables identification of images requiring migration, detection of outdated versions, and verification of registry configurations across the cluster.
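
To pick out the images that still come from Docker Hub, each image reference can be mapped to its registry host. The sketch below follows the usual Docker convention: the first path component counts as a registry only if it contains a dot or a colon or is `localhost`, and otherwise the image defaults to Docker Hub. The function name `registry_of` and the host `harbor.example.com` are assumptions for illustration.

```python
def registry_of(image):
    """Return the registry host of an image reference ('docker.io' if none is given)."""
    first, slash, _rest = image.partition('/')
    # Without a slash the reference is a bare name such as 'nginx:1.25'.
    # With a slash, the first component is a registry host only if it looks
    # like one (contains '.' or ':' or is exactly 'localhost').
    if slash and ('.' in first or ':' in first or first == 'localhost'):
        return first
    return 'docker.io'


# Sample references; harbor.example.com is a hypothetical private registry.
images = [
    'nginx:1.25',
    'bitnami/redis:7.2',
    'quay.io/prometheus/node-exporter:v1.8.0',
    'harbor.example.com/library/nginx:1.25',
]

# Images served from Docker Hub are the migration candidates.
candidates = [image for image in images if registry_of(image) == 'docker.io']
for image in candidates:
    print(image)
```

Feeding the `image` column of the exported CSV through such a filter gives a first migration list. References pinned by digest are handled by the same rule, since only the part before the first slash is inspected.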

Tags: Kubernetes
