Troubleshooting CoreDNS Readiness Failures Caused by RBAC Restrictions
CoreDNS pods may report a Running status while showing 0/1 in the READY column. This combination means the container is up and passing its liveness probe, but the readiness probe is failing, so the pods are excluded from the DNS service's endpoints and receive no queries.
Querying the kube-system namespace reveals the state of the system components:
kubectl get po -n kube-system
Example output highlighting the issue:
NAME                      READY   STATUS    RESTARTS   AGE
calico-node-x7z9a         1/1     Running   0          20d
coredns-74ff55c5b-abc12   0/1     Running   0          5d
coredns-74ff55c5b-def34   0/1     Running   0          5d
kube-apiserver-master     1/1     Running   0          20d
To diagnose the readiness failure, inspect the container logs. Using a label selector is often more efficient than targeting a specific pod name:
kubectl logs -l k8s-app=kube-dns -n kube-system --tail=100
The output typically contains permission denial errors related to the discovery API group. Look for entries similar to the following:
E1125 06:56:14.489039 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167: Failed to watch *v1.EndpointSlice: failed to list *v1.EndpointSlice: endpointslices.discovery.k8s.io is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "endpointslices" in API group "discovery.k8s.io" at the cluster scope
[INFO] plugin/ready: Still waiting on: "kubernetes"
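On a busy cluster the relevant lines can be buried among query logs. A minimal sketch of a filter follows, assuming the error text matches the sample above; the function name coredns_log_errors is illustrative and not part of kubectl.

```shell
# Hypothetical helper: show only the RBAC denials and readiness-plugin
# messages from the last 100 log lines of each CoreDNS pod.
coredns_log_errors() {
  kubectl logs -l k8s-app=kube-dns -n kube-system --tail=100 |
    grep -E 'forbidden|plugin/ready'
}
```

Calling coredns_log_errors with no arguments prints the matching lines. If it prints nothing, the readiness failure probably has a different cause than the one described here.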
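This error confirms that the ServiceAccount associated with CoreDNS lacks the RBAC permissions required to access EndpointSlice resources. Because the kubernetes plugin cannot complete its initial list and watch, the ready plugin keeps reporting it as pending. The gap typically appears when CoreDNS is upgraded to a release that watches EndpointSlices while the existing ClusterRole is left unchanged.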
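The permission gap can also be confirmed directly against the API server by impersonating the service account named in the log line. The following is a minimal sketch, assuming your own user is permitted to impersonate service accounts (a cluster-admin usually is); the function name check_coredns_rbac is illustrative.

```shell
# Hypothetical helper: asks the API server whether the CoreDNS service
# account may list and watch EndpointSlices in all namespaces.
check_coredns_rbac() {
  sa="system:serviceaccount:kube-system:coredns"
  for verb in list watch; do
    # "kubectl auth can-i" prints "yes" or "no"; --all-namespaces mirrors
    # the cluster-scope check reported in the log message.
    answer=$(kubectl auth can-i "$verb" endpointslices.discovery.k8s.io \
      --as="$sa" --all-namespaces)
    echo "$verb endpointslices: $answer"
  done
}
```

On an affected cluster, check_coredns_rbac should report "no" for both verbs. Once the ClusterRole grants the missing rule, both should report "yes".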
Resolve the issue by modifying the ClusterRole bound to the CoreDNS service account. Open the role definition for editing:
kubectl edit clusterrole system:coredns
Insert the following rule set into the rules array within the YAML manifest. Ensure the indentation aligns with existing entries:
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
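Where an interactive editor is inconvenient, for example in a script, the same rule can be appended with a JSON Patch. This is a sketch rather than the only way to do it; the function name patch_coredns_clusterrole is illustrative, and it assumes the ClusterRole is named system:coredns as above.

```shell
# Hypothetical helper: appends the EndpointSlice rule to the rules array
# of the system:coredns ClusterRole without opening an editor.
patch_coredns_clusterrole() {
  rule='{"apiGroups":["discovery.k8s.io"],"resources":["endpointslices"],"verbs":["list","watch"]}'
  # In JSON Patch, the path "/rules/-" appends to the end of the array.
  kubectl patch clusterrole system:coredns --type=json \
    -p "[{\"op\":\"add\",\"path\":\"/rules/-\",\"value\":${rule}}]"
}
```

Note that the patch appends unconditionally, so running it twice leaves a duplicate rule in the role. A duplicate is harmless to RBAC evaluation but worth cleaning up.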
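Save and exit the editor. RBAC changes take effect immediately, and CoreDNS retries the failed list and watch calls on its own, so the pods should become ready shortly without a restart. If they remain at 0/1, restart them with kubectl rollout restart deployment coredns -n kube-system (the deployment name is inferred from the pod names above). Verify that the READY column updates to 1/1 by querying the namespace again: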
kubectl get po -n kube-system
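Instead of re-running the query by hand, kubectl can block until the pods report Ready. A minimal sketch, assuming the k8s-app=kube-dns label used earlier; the function name wait_for_coredns_ready and the 120s default are illustrative choices.

```shell
# Hypothetical helper: waits until every CoreDNS pod has the Ready
# condition, or fails once the timeout (default 120s) expires.
wait_for_coredns_ready() {
  timeout="${1:-120s}"
  kubectl wait --for=condition=Ready pod -l k8s-app=kube-dns \
    -n kube-system --timeout="$timeout"
}
```

Calling wait_for_coredns_ready 60s shortens the wait to one minute. A non-zero exit status means at least one pod did not become ready in time, in which case the logs are worth checking again.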