Fading Coder

Kubernetes Service Implementation: A Deep Dive into iptables Mode

Tech May 18 2

Overview

This article continues our exploration of Kubernetes networking by examining how Service objects are implemented using iptables. Building on our previous discussion of CNI plugins and overlay networks, we'll trace how kube-proxy configures iptables rules to enable Service-to-Pod traffic routing.

Preparing Service and Pod Resources

The example setup uses a simple NGINX deployment with two replicas:

nginx-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      component: web
  template:
    metadata:
      labels:
        component: web
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

nginx-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: NodePort
  selector:
    component: web
  ports:
  - protocol: TCP
    port: 80
    nodePort: 30007
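
The Service finds its Pods through the label selector `component: web`. As a rough illustration of that matching rule (not the actual Kubernetes implementation), the sketch below checks equality-based selectors against a Pod's labels. The function name `selector_matches` and the extra `pod-template-hash` value are assumptions made for the example.

```python
def selector_matches(selector, labels):
    """Return True when every selector key/value pair appears in the Pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Selector from nginx-service.yaml and labels from the Deployment's Pod template.
service_selector = {"component": "web"}
pod_labels = {"component": "web", "pod-template-hash": "9d7f5c"}  # hash value is illustrative

print(selector_matches(service_selector, pod_labels))            # True
print(selector_matches(service_selector, {"component": "db"}))   # False
```

Extra labels on the Pod do not prevent a match; only the keys named in the selector are compared.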

Kubernetes Service Implementation Principles

Kubernetes supports multiple Service types:

  • ClusterIP: Assigns an internal cluster IP, making the Service accessible only within the cluster
  • NodePort: Exposes a static port on every node, allowing external access via <NodeIP>:<NodePort>
  • LoadBalancer: Integrates with cloud load balancers to distribute external traffic
  • ExternalName: Maps a Service to an external DNS name

kube-proxy Component

Each node in a Kubernetes cluster runs kube-proxy, which maintains network connectivity for Services. The component operates in several proxy modes:

iptables Mode

This mode programs iptables rules to intercept Service traffic and redirect it to backend Pods. kube-proxy rewrites the rules whenever Services or their endpoints change. While simple and widely adopted, this approach can degrade in large clusters: rules are evaluated sequentially, so both the cost of matching a new connection and the cost of updating the rule set grow with the number of Services.
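
To make the scaling concern concrete, here is a deliberately simplified model of first-match rule evaluation. It assumes one rule per Service that matches only on the ClusterIP; real rules also match protocol and port, and connection tracking means the nat table is consulted only for the first packet of a connection. The IP ranges and the helper `rules_checked` are hypothetical.

```python
def rules_checked(service_ips, destination_ip):
    """Scan rules in order and count comparisons until the first match."""
    for checked, ip in enumerate(service_ips, start=1):
        if ip == destination_ip:
            return checked
    return len(service_ips)  # no match: every rule was evaluated


# Hypothetical rule sets: one ClusterIP rule per Service, in insertion order.
small = [f"10.96.0.{i}" for i in range(1, 11)]                     # 10 Services
large = [f"10.96.{i // 250}.{i % 250 + 1}" for i in range(5000)]   # 5000 Services

print(rules_checked(small, small[-1]))  # 10
print(rules_checked(large, large[-1]))  # 5000
```

The worst-case number of comparisons grows linearly with the number of Services in this model.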

IPVS (IP Virtual Server) Mode

IPVS leverages the kernel's IPVS subsystem for built-in load balancing. It handles higher traffic volumes with better performance and supports sophisticated scheduling algorithms like least connections. In this mode, kube-proxy creates a virtual server with a VIP for each Service and distributes traffic across backend Pods.
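
As a small illustration of what "least connections" means, the sketch below picks the backend with the fewest active connections. It is a toy model, not IPVS code; the connection counts and the function name `pick_least_connections` are assumptions for the example.

```python
def pick_least_connections(active):
    """Return the backend with the fewest active connections (ties: first in dict order)."""
    return min(active, key=active.get)


# Hypothetical connection counts for two backend Pods.
active = {"10.244.1.5:80": 7, "10.244.2.4:80": 3}

backend = pick_least_connections(active)
active[backend] += 1  # the new connection is now counted against the chosen backend
print(backend)  # 10.244.2.4:80
```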

This article focuses on iptables mode to illustrate how traffic flows from a Service to its Pods.

iptables Fundamentals

iptables is a user-space tool for configuring the Linux kernel's netfilter framework. It lets administrators define packet filtering, NAT, and port forwarding rules. Key capabilities include:

  • Packet filtering: Controlling which packets pass through network interfaces
  • Network Address Translation (NAT): Modifying source or destination addresses
  • Port forwarding: Redirecting traffic to specific ports

iptables Rule Analysis

Service, Pod, and Host Configuration

After deploying the resources above, the environment shows:

Services:

NAME          TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
web-service   NodePort    10.107.33.105  <none>        80:30007/TCP   152m

Pods:

NAME                      READY   STATUS    RESTARTS   AGE   IP           NODE
web-frontend-9d7f5c-lmnop  1/1     Running   0          155m  10.244.2.4   node-03
web-frontend-9d7f5c-qrstu  1/1     Running   0          155m  10.244.1.5   node-02

Cluster Nodes:

192.168.49.2    node-01
192.168.49.3    node-02
192.168.49.4    node-03

Tracing iptables Rules from NodePort

On the primary node, examine the nat table filtered by the NodePort 30007:

sudo iptables -t nat -L | grep 30007

Output:

KUBE-EXT-ABCD1234EFGH5678  tcp  --  anywhere anywhere /* web/web-service */ tcp dpt:30007

This reveals a chain named KUBE-EXT-ABCD1234EFGH5678. Inspect its contents:

sudo iptables -t nat -L KUBE-EXT-ABCD1234EFGH5678 -v

Result:

Chain KUBE-EXT-ABCD1234EFGH5678 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-SVC-ABCD1234EFGH5678  all  --  any    any     anywhere             anywhere

The chain references KUBE-SVC-ABCD1234EFGH5678. Examining that chain:

sudo iptables -t nat -L KUBE-SVC-ABCD1234EFGH5678 -v

Output:

Chain KUBE-SVC-ABCD1234EFGH5678 (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-SEP-PQRSTU9876543210  all  --  any    any     anywhere             anywhere             /* web/web-service -> 10.244.1.5:80 */ statistic mode random probability 0.50000000000
    0     0 KUBE-SEP-VWXYZA1234567890  all  --  any    any     anywhere             anywhere             /* web/web-service -> 10.244.2.4:80 */

This chain branches into two KUBE-SEP-* endpoints corresponding to the two Pods. Inspect the first endpoint:

sudo iptables -t nat -L KUBE-SEP-PQRSTU9876543210 -v

Result:

Chain KUBE-SEP-PQRSTU9876543210 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DNAT       tcp  --  any    any     anywhere             anywhere             /* web/web-service */ tcp to:10.244.1.5:80

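The chain terminates with a DNAT target that rewrites the destination to 10.244.1.5:80. The second endpoint chain, KUBE-SEP-VWXYZA1234567890, does the same for 10.244.2.4:80, so traffic matching this Service ultimately reaches one of the two Pod endpoints.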
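
The effect of DNAT on a single packet can be sketched as follows. This is only a model of the address rewrite, assuming a hypothetical client address and a made-up `Packet` type; the kernel also records the mapping so that replies are translated back.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def dnat(packet, to_ip, to_port):
    """Return a copy of the packet with only the destination rewritten."""
    return replace(packet, dst_ip=to_ip, dst_port=to_port)


# A client (hypothetical address) connects to the Service's ClusterIP on port 80.
original = Packet("10.244.1.9", 41000, "10.107.33.105", 80)
rewritten = dnat(original, "10.244.1.5", 80)

print(rewritten.dst_ip, rewritten.dst_port)  # 10.244.1.5 80
print(rewritten.src_ip == original.src_ip)   # True
```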

Tracing from PREROUTING and OUTPUT Chains

Examining PREROUTING:

sudo iptables -t nat -L PREROUTING -v

Output:

Chain PREROUTING (policy ACCEPT 5 packets, 300 bytes)
 pkts bytes target     prot opt in     out     source               destination
   99  5965 KUBE-SERVICES  all  --  any    any     anywhere             anywhere             /* kubernetes service portals */

Every packet arriving at the node is handed to KUBE-SERVICES. Checking OUTPUT:

sudo iptables -t nat -L OUTPUT -v

Output:

Chain OUTPUT (policy ACCEPT 3355 packets, 202K bytes)
 pkts bytes target     prot opt in     out     source               destination
29961 1801K KUBE-SERVICES  all  --  any    any     anywhere             anywhere             /* kubernetes service portals */

Similarly, traffic generated by processes on the node itself is handed to KUBE-SERVICES. Inspecting that chain:

sudo iptables -t nat -L KUBE-SERVICES -v

Output:

Chain KUBE-SERVICES (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-SVC-ABCD1234EFGH5678  tcp  --  any    any     anywhere             10.107.33.105        /* web/web-service cluster IP */ tcp dpt:http
 3391  203K KUBE-NODEPORTS  all  --  any    any     anywhere             anywhere             /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Two branches exist: one for the Service's Cluster IP (KUBE-SVC-ABCD1234EFGH5678) and another for node ports (KUBE-NODEPORTS). Since we analyzed the first branch earlier, let's examine KUBE-NODEPORTS:

sudo iptables -t nat -L KUBE-NODEPORTS -v

Output:

Chain KUBE-NODEPORTS (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-EXT-ABCD1234EFGH5678  tcp  --  any    any     anywhere             anywhere             /* web/web-service */ tcp dpt:30007

When traffic targets TCP port 30007, it matches the rule that jumps to KUBE-EXT-ABCD1234EFGH5678, which we already traced forward to the same KUBE-SVC-* chain. Each newly created NodePort Service adds an entry to KUBE-NODEPORTS linking to a KUBE-EXT-* chain.
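
The jumps traced in this section can be modeled as a small rule table. The sketch below is a simplification under stated assumptions: chain names drop their hash suffixes, the `dst-type LOCAL` match is reduced to a set of addresses assumed to belong to node-01, and each rule is a plain Python predicate rather than a real iptables match.

```python
import random

CLUSTER_IP = "10.107.33.105"
LOCAL_IPS = {"127.0.0.1", "192.168.49.2"}  # addresses assumed local to node-01

# Each chain is an ordered list of (predicate, target) rules.
CHAINS = {
    "KUBE-SERVICES": [
        (lambda p: p["dst_ip"] == CLUSTER_IP and p["dst_port"] == 80, "KUBE-SVC"),
        (lambda p: p["dst_ip"] in LOCAL_IPS, "KUBE-NODEPORTS"),  # last rule in the chain
    ],
    "KUBE-NODEPORTS": [(lambda p: p["dst_port"] == 30007, "KUBE-EXT")],
    "KUBE-EXT": [(lambda p: True, "KUBE-SVC")],
    "KUBE-SVC": [
        (lambda p: random.random() < 0.5, "KUBE-SEP-1"),  # statistic mode random 0.5
        (lambda p: True, "KUBE-SEP-2"),                   # unconditional fallback
    ],
    "KUBE-SEP-1": [(lambda p: True, "DNAT 10.244.1.5:80")],
    "KUBE-SEP-2": [(lambda p: True, "DNAT 10.244.2.4:80")],
}


def trace(packet, chain="KUBE-SERVICES"):
    """Follow the first matching rule in each chain and return every step visited."""
    path = [chain]
    while chain in CHAINS:
        target = next((t for match, t in CHAINS[chain] if match(packet)), None)
        if target is None:
            break  # no rule matched: the packet is left untouched
        path.append(target)
        chain = target
    return path


print(trace({"dst_ip": CLUSTER_IP, "dst_port": 80}))
print(trace({"dst_ip": "192.168.49.2", "dst_port": 30007}))
```

Both calls end in one of the two DNAT targets; which one varies from run to run because of the random match in the KUBE-SVC chain.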

Rule Hierarchy Summary

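Packets arriving at a node (PREROUTING) and packets generated on it (OUTPUT) both traverse KUBE-SERVICES. What happens next depends on the destination:

  1. Traffic destined for a Service's Cluster IP and port jumps directly to the matching KUBE-SVC-* chain
  2. Traffic addressed to one of the node's own addresses falls through to KUBE-NODEPORTS, which matches on the TCP destination port and jumps to the corresponding KUBE-EXT-* chain, which in turn jumps to KUBE-SVC-*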

Both paths converge at KUBE-SVC-* chains. These chains use probabilistic distribution to spread traffic across KUBE-SEP-* endpoint chains, each representing a Pod. KUBE-SEP-* chains perform the final DNAT operation, redirecting original Service requests to actual Pod endpoints.
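
The probabilities are chosen so that every endpoint receives an equal share even though the rules are tried in order. The listing earlier shows 0.5 on the first rule and no probability on the last, which is consistent with giving rule i (counting from 0) a probability of 1/(n - i) for n endpoints. The sketch below assumes that scheme and verifies the resulting shares; the function names are illustrative.

```python
from fractions import Fraction


def rule_probabilities(n):
    """Probability attached to rule i (0-based) when n endpoints are tried in order."""
    return [Fraction(1, n - i) for i in range(n)]


def effective_shares(n):
    """Overall chance that a new connection lands on each endpoint."""
    shares, remaining = [], Fraction(1)
    for p in rule_probabilities(n):
        shares.append(remaining * p)  # reached this rule and matched it
        remaining *= 1 - p            # fell through to the next rule
    return shares


print(rule_probabilities(2))  # [Fraction(1, 2), Fraction(1, 1)]
print(effective_shares(3))    # [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
```

The final rule always has probability 1, which is why it appears in the listing without a statistic match.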

With two replicas, accessing the Service via NodePort or ClusterIP triggers this traversal sequence, achieving load-balanced distribution across backend Pods.
