Real Time Kubernetes - Vol-1

This volume collects Kubernetes questions that come up in real-world operations, together with common issues and their solutions. They cover troubleshooting, best practices, and scenarios you are likely to face in a production setup.

Real-Time Kubernetes Questions

  1. What steps do you take when a pod is stuck in Pending state?

    • Answer:
      • Check for resource availability (CPU/memory) using kubectl describe node.
      • Ensure that there are sufficient resources in the cluster.
      • Check for pod scheduling constraints such as node selectors, taints, and tolerations.
      • Verify if there is a PersistentVolumeClaim (PVC) issue if the pod requires storage.
      • Look at the events section in kubectl describe pod <pod-name> for more details.
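
The checks above map fairly directly onto the reasons the scheduler reports in a FailedScheduling event. As a rough sketch, the helper below matches fragments of such a message to the checklist items. The fragments are assumptions based on commonly seen scheduler messages and their wording varies between Kubernetes versions; `suggest_checks` and the sample message are hypothetical.

```python
# Map fragments of a FailedScheduling message to the checks listed above.
# The fragments are assumptions; exact wording differs between versions.
HINTS = [
    ("Insufficient cpu", "CPU: compare pod requests with allocatable CPU on the nodes."),
    ("Insufficient memory", "Memory: compare pod requests with allocatable memory on the nodes."),
    ("untolerated taint", "Taints: add a matching toleration or remove the taint."),
    ("node affinity/selector", "Placement: review nodeSelector and node affinity rules."),
    ("unbound immediate PersistentVolumeClaims", "Storage: check why the PVC is not bound."),
]


def suggest_checks(event_message: str) -> list[str]:
    """Return the follow-up checks whose fragment appears in the message."""
    lowered = event_message.lower()
    return [advice for fragment, advice in HINTS if fragment.lower() in lowered]


# Hypothetical message copied from the Events section of `kubectl describe pod`.
message = (
    "0/3 nodes are available: 1 node(s) had untolerated taint {dedicated: gpu}, "
    "2 Insufficient cpu."
)
for advice in suggest_checks(message):
    print(advice)
```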
  2. How do you handle a pod in CrashLoopBackOff state?

    • Answer:
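      • Inspect the pod’s logs using kubectl logs <pod-name>; add --previous to see the output of the container instance that crashed.
      • Check for configuration errors or environment variable issues.
      • Verify that the container image has the correct entry point and is not crashing due to application bugs.
      • Review the liveness and startup probe configurations: a failing liveness probe restarts the container, whereas a failing readiness probe only removes the pod from Service endpoints.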
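
The exit code of the last terminated container (shown under Last State in kubectl describe pod) often narrows down which of the checks above applies. The sketch below interprets a few exit codes using the usual shell convention that codes above 128 mean "terminated by signal (code - 128)". The `explain_exit_code` helper and its wording are illustrative assumptions, not part of any Kubernetes tool.

```python
# A small subset of signal numbers as used on Linux.
SIGNALS = {9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code: int) -> str:
    """Return a rough interpretation of a container exit code."""
    if code == 0:
        return "exited cleanly; check why a long-running process ends on its own"
    if code == 126:
        return "command found but not executable; check entry point permissions"
    if code == 127:
        return "command not found; check the image entry point and args"
    if code > 128:
        name = SIGNALS.get(code - 128, f"signal {code - 128}")
        hint = "; commonly an out-of-memory kill, check memory limits" if name == "SIGKILL" else ""
        return f"terminated by {name}{hint}"
    return "application error; read the logs of the previous container instance"


for code in (1, 127, 137):
    print(code, "->", explain_exit_code(code))
```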
  3. How would you investigate high CPU usage on a node?

    • Answer:
      • Use kubectl top node <node-name> to check the CPU usage (this requires the Metrics Server to be installed in the cluster).
      • Use kubectl top pod --all-namespaces to identify which pods are consuming the most CPU.
      • Check the resource requests and limits set on the pods.
      • Investigate logs and application metrics for potential causes of high CPU usage.
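
To make the second step concrete, here is a minimal sketch that ranks pods by CPU from the text output of kubectl top pod --all-namespaces. It assumes the four-column layout shown in the sample; the pod names and numbers are made up, and `parse_cpu` and `top_cpu_pods` are hypothetical helpers.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '250m' or '2' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def top_cpu_pods(top_output: str, limit: int = 3) -> list[tuple[str, str, float]]:
    """Return (namespace, pod, cores) tuples sorted by CPU usage, highest first."""
    rows = []
    for line in top_output.strip().splitlines()[1:]:  # skip the header row
        namespace, name, cpu, _memory = line.split()
        rows.append((namespace, name, parse_cpu(cpu)))
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]


# Made-up sample in the layout of `kubectl top pod --all-namespaces`.
SAMPLE = """\
NAMESPACE     NAME                       CPU(cores)   MEMORY(bytes)
default       api-7d9c5b6f4-abcde        850m         512Mi
default       worker-5f7d8c9b6-fghij     1200m        1024Mi
kube-system   coredns-565d847f94-klmno   3m           20Mi
"""

for namespace, name, cores in top_cpu_pods(SAMPLE):
    print(f"{namespace}/{name}: {cores} cores")
```

For a quick look without any scripting, kubectl top pod also accepts --sort-by=cpu.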
  4. What is a service mesh and why would you use one?

    • Answer:
      • A service mesh is an infrastructure layer that manages communication between microservices.
      • It provides features such as traffic management, load balancing, service discovery, retries, circuit breaking, and security (e.g., mTLS).
      • Popular service mesh implementations include Istio, Linkerd, and Consul.
  5. How do you perform zero-downtime deployments?

    • Answer:
      • Use Deployments with the RollingUpdate strategy.
      • Configure readiness probes to ensure new pods are ready before sending traffic to them.
      • Set appropriate values for maxUnavailable and maxSurge in the deployment spec.
      • Use versioned container images and test them in a staging environment before deploying to production.
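
The effect of maxUnavailable and maxSurge is easier to judge with numbers. The sketch below computes the minimum number of available pods and the maximum total number of pods during a rollout, assuming the documented rounding rules (percentages round up for maxSurge and down for maxUnavailable) and the default of 25% for both. `rollout_bounds` is a hypothetical helper written for this illustration.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Resolve an absolute number or a percentage string against replicas."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas: int, max_surge="25%", max_unavailable="25%") -> tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # With both at zero the rollout could neither add nor remove a pod.
        raise ValueError("maxSurge and maxUnavailable cannot both be zero")
    return replicas - unavailable, replicas + surge


print(rollout_bounds(10))                                  # defaults of 25% each
print(rollout_bounds(4, max_surge=1, max_unavailable=0))   # never drop below 4 pods
```

Setting maxUnavailable to 0 with a positive maxSurge keeps full capacity throughout the rollout, at the cost of temporarily running extra pods.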
  6. How would you debug network connectivity issues in a Kubernetes cluster?

    • Answer:
      • Check the pod’s IP with kubectl get pod <pod-name> -o wide and test DNS resolution using kubectl exec <pod-name> -- nslookup <service-name>.
      • Verify network policies that might be restricting traffic.
      • Use tools like ping, curl, or telnet inside pods to test connectivity.
      • Inspect the logs of network plugins (like Calico, Flannel) for any errors.
      • Ensure that kube-proxy is running correctly on all nodes.
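
When a container image ships Python but none of the tools listed above, a name lookup plus TCP handshake can be tested with the standard library. This is a minimal sketch: the service and namespace names in the usage line are hypothetical, and `cluster.local` is only the default cluster domain, so adjust it if your cluster uses a different one.

```python
import socket


def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if name resolution and a TCP handshake both succeed."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections and timeouts
        return False


if __name__ == "__main__":
    # Hypothetical Service "backend" in namespace "shop", listening on 8080.
    target = service_fqdn("backend", "shop")
    print(target, "reachable:", can_connect(target, 8080))
```

A False result does not say which layer failed; comparing the result for the Service name with the result for a pod IP helps separate DNS problems from network policy or routing problems.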
  7. What are common security practices in Kubernetes?

    • Answer:
      • Use Role-Based Access Control (RBAC) to manage permissions.
      • Apply network policies to control traffic between pods.
      • Use Pod Security Admission (PSA) to enforce the Pod Security Standards; Pod Security Policies (PSP) were deprecated in Kubernetes 1.21 and removed in 1.25.
      • Scan container images for vulnerabilities.
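      • Use Secrets to manage sensitive data, and enable encryption at rest, because Secrets are stored unencrypted in etcd by default (base64 is an encoding, not encryption).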
      • Implement mutual TLS (mTLS) for service-to-service communication using a service mesh.
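
As an illustration of the kind of rule that pod-level security standards enforce, the sketch below checks a pod spec (as a Python dict) for three common problems. It is a simplified, hypothetical linter covering only a few securityContext fields, not a reimplementation of any Kubernetes admission controller.

```python
def lint_pod_spec(pod_spec: dict) -> list[str]:
    """Return human-readable findings for risky container settings."""
    findings = []
    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged", False):
            findings.append(f"{name}: runs privileged")
        if ctx.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        # A container-level setting overrides the pod-level one.
        if not ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False)):
            findings.append(f"{name}: runAsNonRoot is not true")
    return findings


# Hypothetical pod spec with one hardened and one risky container.
SAMPLE_SPEC = {
    "containers": [
        {
            "name": "app",
            "image": "example/app:1.4.2",
            "securityContext": {"runAsNonRoot": True, "allowPrivilegeEscalation": False},
        },
        {
            "name": "debug",
            "image": "example/debug:0.1",
            "securityContext": {"privileged": True},
        },
    ]
}

for finding in lint_pod_spec(SAMPLE_SPEC):
    print(finding)
```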
  8. How do you handle a Kubernetes version upgrade?

    • Answer:
      • Plan and test the upgrade in a staging environment.
      • Review the release notes for deprecations and breaking changes.
      • Upgrade etcd and the control plane components first.
      • Gradually upgrade worker nodes by cordoning and draining them.
      • Verify application functionality and cluster stability after the upgrade.
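
Part of planning is confirming that the target version is a supported step from the current one. The sketch below assumes the rule, stated in the kubeadm upgrade guidance, that minor versions must not be skipped; it only compares major and minor numbers, and `is_supported_step` is a hypothetical helper.

```python
def parse_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as 'v1.28.4'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def is_supported_step(current: str, target: str) -> bool:
    """Return True if target is the same minor version or exactly one ahead."""
    cur_major, cur_minor = parse_version(current)
    tgt_major, tgt_minor = parse_version(target)
    return cur_major == tgt_major and 0 <= tgt_minor - cur_minor <= 1


print(is_supported_step("v1.28.4", "v1.29.1"))  # one minor version ahead
print(is_supported_step("v1.27.9", "v1.29.0"))  # skips v1.28
```

When the check fails, the usual approach is to upgrade one minor version at a time and verify the cluster after each step.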
  9. What is a StatefulSet and when would you use it?

    • Answer:
      • A StatefulSet is a Kubernetes controller that manages the deployment and scaling of stateful applications.
      • It ensures that pods have a unique, stable identity and persistent storage across rescheduling.
      • Use StatefulSets for applications that require stable network identities and persistent storage, such as databases and distributed file systems.
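
The stable identity follows a predictable naming scheme, which the sketch below reproduces: pods are named `<statefulset>-<ordinal>`, each pod gets a DNS name under the governing headless Service, and each PersistentVolumeClaim is named `<claim template>-<pod>`. The example names are hypothetical, ordinals are assumed to start at 0, and `cluster.local` is assumed as the cluster domain.

```python
def statefulset_identities(
    name: str,
    service: str,
    namespace: str,
    replicas: int,
    claim_template: str,
    cluster_domain: str = "cluster.local",
) -> list[dict]:
    """List the pod name, DNS name and PVC name for every replica."""
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        identities.append(
            {
                "pod": pod,
                "dns": f"{pod}.{service}.{namespace}.svc.{cluster_domain}",
                "pvc": f"{claim_template}-{pod}",
            }
        )
    return identities


# Hypothetical StatefulSet "db" behind the headless Service "db-headless".
for identity in statefulset_identities("db", "db-headless", "data", 3, "storage"):
    print(identity)
```

Because these names survive rescheduling, a replacement for pod db-1 reattaches to the claim storage-db-1 instead of starting with empty storage.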
  10. How do you manage configuration changes across multiple environments (dev, staging, prod)?

    • Answer:
      • Use ConfigMaps and Secrets for managing configuration data.
      • Employ Helm charts or Kustomize to template and manage environment-specific configurations.
      • Use GitOps tools like Argo CD or Flux to automate and manage deployments based on Git repositories.
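
The idea behind templating tools is a shared base plus small per-environment overrides. The sketch below shows that idea as a recursive dictionary merge; it is only a conceptual illustration, roughly similar in spirit to layering values files, and not the actual merge logic of Helm or Kustomize. All keys and values are made up.

```python
import copy


def merge(base: dict, overlay: dict) -> dict:
    """Return a new dict where overlay values override base values recursively."""
    result = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical shared base and per-environment overrides.
BASE = {
    "replicas": 1,
    "image": {"repository": "example/app", "tag": "1.4.2"},
    "env": {"LOG_LEVEL": "info"},
}
OVERLAYS = {
    "dev": {"env": {"LOG_LEVEL": "debug"}},
    "staging": {"replicas": 2},
    "prod": {"replicas": 3, "env": {"LOG_LEVEL": "warn"}},
}

for environment, overlay in OVERLAYS.items():
    print(environment, merge(BASE, overlay))
```

Keeping the overrides this small makes differences between environments easy to review in Git.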

Common Real-Time Issues and Solutions

Issue: Pods Stuck in Terminating State

  • Solution:
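    • Identify the pod using kubectl get pod <pod-name>.
    • Check for finalizers that might be preventing the pod from terminating properly.
    • As a last resort, force delete the pod using kubectl delete pod <pod-name> --grace-period=0 --force. This removes the API object immediately, without waiting for confirmation that the containers have stopped.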
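
Finalizers are listed in the pod's metadata, so they can be read from the output of kubectl get pod <pod-name> -o json. The sketch below extracts them together with the deletion marker; the sample JSON and the finalizer name `example.com/cleanup` are made up, and `termination_blockers` is a hypothetical helper.

```python
import json


def termination_blockers(pod_json: str) -> dict:
    """Report whether deletion was requested and which finalizers remain."""
    metadata = json.loads(pod_json).get("metadata", {})
    return {
        "deleting": "deletionTimestamp" in metadata,
        "finalizers": metadata.get("finalizers", []),
    }


# Made-up excerpt of `kubectl get pod <pod-name> -o json`.
SAMPLE_POD = json.dumps(
    {
        "metadata": {
            "name": "worker-0",
            "deletionTimestamp": "2024-01-01T12:00:00Z",
            "finalizers": ["example.com/cleanup"],
        }
    }
)

print(termination_blockers(SAMPLE_POD))
```

A pod that is marked for deletion but still lists finalizers is waiting for the controller that owns each finalizer to finish its cleanup and remove the entry.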

Issue: Cluster Nodes Not Ready

  • Solution:
    • Check the node status using kubectl describe node <node-name>.
    • Verify that the kubelet is running on the node (on systemd-based hosts, for example, with systemctl status kubelet).
    • Ensure there are no network connectivity issues between the node and the control plane.
    • Investigate any disk pressure, memory pressure, or other resource issues.
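
The node's conditions summarize most of these checks. The sketch below reads them from the output of kubectl get node <node-name> -o json and reports the ones that look unhealthy, treating Ready as healthy when its status is "True" and every other condition as healthy when its status is "False". The sample data is made up and `unhealthy_conditions` is a hypothetical helper.

```python
import json


def unhealthy_conditions(node_json: str) -> list[str]:
    """Return descriptions of node conditions that are not in a healthy state."""
    problems = []
    conditions = json.loads(node_json).get("status", {}).get("conditions", [])
    for cond in conditions:
        healthy_status = "True" if cond["type"] == "Ready" else "False"
        if cond["status"] != healthy_status:
            problems.append(f'{cond["type"]}={cond["status"]}: {cond.get("message", "")}')
    return problems


# Made-up excerpt of `kubectl get node <node-name> -o json`.
SAMPLE_NODE = json.dumps(
    {
        "status": {
            "conditions": [
                {"type": "MemoryPressure", "status": "False",
                 "message": "kubelet has sufficient memory available"},
                {"type": "DiskPressure", "status": "True",
                 "message": "kubelet has disk pressure"},
                {"type": "Ready", "status": "Unknown",
                 "message": "Kubelet stopped posting node status."},
            ]
        }
    }
)

for problem in unhealthy_conditions(SAMPLE_NODE):
    print(problem)
```

A Ready status of "Unknown" usually means the control plane has stopped hearing from the kubelet, which points to the kubelet or network checks above.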

Issue: PersistentVolumeClaim (PVC) Pending

  • Solution:
    • Check the PVC status using kubectl describe pvc <pvc-name>.
    • Ensure that there is a matching PersistentVolume (PV) available.
    • Verify storage class configurations and ensure the storage backend is working correctly.
    • Create a PV manually if dynamic provisioning is not configured.
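
What "matching" means can be sketched as three comparisons: storage class, access modes, and capacity. The example below uses simplified, hypothetical dictionaries rather than real manifests, handles only binary suffixes such as Gi, and ignores other binding criteria such as label selectors, volume mode, and node affinity.

```python
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_storage(quantity: str) -> int:
    """Convert a quantity such as '10Gi' to bytes (binary suffixes only)."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def pv_can_satisfy(pvc: dict, pv: dict) -> bool:
    """Check storage class, access modes and capacity of a PV against a PVC."""
    return (
        pv.get("storageClassName", "") == pvc.get("storageClassName", "")
        and set(pvc["accessModes"]) <= set(pv["accessModes"])
        and parse_storage(pv["capacity"]) >= parse_storage(pvc["request"])
    )


# Hypothetical, flattened views of a claim and a volume.
claim = {"storageClassName": "standard", "accessModes": ["ReadWriteOnce"], "request": "5Gi"}
volume = {"storageClassName": "standard", "accessModes": ["ReadWriteOnce"], "capacity": "10Gi"}

print(pv_can_satisfy(claim, volume))
```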

Issue: ImagePullBackOff Error

  • Solution:
    • Check the pod’s events using kubectl describe pod <pod-name>.
    • Verify the image name and tag are correct.
    • Ensure the image is available in the container registry.
    • Check for image pull secrets if the registry requires authentication.
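
Many pull failures come from an image reference that resolves to something other than intended. The sketch below splits a reference into registry, repository, tag, and digest using Docker-style conventions (a missing tag means `latest`, a missing registry means `docker.io`). These defaults are assumptions; the registry used for unqualified names depends on the container runtime's configuration, and `parse_image` is a hypothetical helper.

```python
def parse_image(reference: str) -> dict:
    """Split an image reference into registry, repository, tag and digest."""
    name, _, digest = reference.partition("@")
    first, slash, rest = name.partition("/")
    # The first path component is a registry if it looks like a host name.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", name
    repository, colon, tag = path.rpartition(":")
    if not colon:
        repository, tag = path, ""
    if registry == "docker.io" and "/" not in repository:
        repository = f"library/{repository}"
    if not tag and not digest:
        tag = "latest"
    return {"registry": registry, "repository": repository, "tag": tag, "digest": digest}


for ref in ("nginx", "team/app:2.0", "registry.example.com:5000/team/app:1.2"):
    print(ref, "->", parse_image(ref))
```

Seeing the implicit `latest` tag or an unexpected default registry in the result is often enough to explain an ImagePullBackOff.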

Issue: Network Policy Blocking Traffic

  • Solution:
    • Review the network policies applied in the namespace using kubectl get networkpolicies.
    • Check the policy rules to ensure they allow the desired traffic.
    • Use kubectl describe networkpolicy <policy-name> to understand the applied rules.
    • Adjust the network policies to allow necessary traffic between pods and services.
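
When reading policy rules, it helps to remember the basic evaluation model: a pod that no policy selects accepts all ingress, while a pod selected by at least one policy accepts only what some selecting policy allows. The sketch below models just that, using a simplified, hypothetical policy shape with label matching only; it ignores ports, namespace selectors, IP blocks, and egress.

```python
def selects(labels: dict, match_labels: dict) -> bool:
    """Return True if all selector labels match; an empty selector matches everything."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def ingress_allowed(pod_labels: dict, peer_labels: dict, policies: list[dict]) -> bool:
    """Decide whether traffic from a peer pod to a target pod is allowed."""
    applicable = [p for p in policies if selects(pod_labels, p["podSelector"])]
    if not applicable:
        return True  # no policy selects the pod, so ingress is unrestricted
    return any(
        selects(peer_labels, allowed)
        for policy in applicable
        for allowed in policy["allowFrom"]
    )


# Hypothetical policy: only pods labelled app=api may reach pods labelled app=db.
POLICIES = [{"podSelector": {"app": "db"}, "allowFrom": [{"app": "api"}]}]

print(ingress_allowed({"app": "db"}, {"app": "api"}, POLICIES))
print(ingress_allowed({"app": "db"}, {"app": "web"}, POLICIES))
print(ingress_allowed({"app": "cache"}, {"app": "web"}, POLICIES))
```

In this model a policy with an empty allowFrom list acts as a default deny for the pods it selects, which is a common cause of unexpectedly blocked traffic.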

These questions and issues should help you prepare for real-time scenarios and demonstrate your ability to handle practical challenges in a Kubernetes environment.
