SYLLABUS

CKA Exam Domains & Weightage

Official CKA Domains (2026)
🔧 Troubleshooting 30%
  • Troubleshoot clusters and nodes
  • Troubleshoot cluster components
  • Monitor cluster and application resource usage
  • Manage and evaluate container output streams
  • Troubleshoot services and networking
🏗 Cluster Architecture, Installation & Configuration 25%
  • Manage role based access control (RBAC)
  • Prepare underlying infrastructure for installing a K8s cluster
  • Create and manage Kubernetes clusters using kubeadm
  • Manage the lifecycle of Kubernetes clusters
  • Implement and configure a highly-available control plane
  • Use Helm and Kustomize to install cluster components
  • Understand extension interfaces (CNI, CSI, CRI, etc.)
  • Understand CRDs, install and configure operators
🌐 Services & Networking 20%
  • Understand connectivity between Pods
  • Define and enforce Network Policies
  • Use ClusterIP, NodePort, LoadBalancer service types
  • Use the Gateway API to manage Ingress traffic
  • Know how to use Ingress controllers and Ingress resources
  • Understand and use CoreDNS
📦 Workloads & Scheduling 15%
  • Understand deployments and rolling update/rollbacks
  • Use ConfigMaps and Secrets to configure applications
  • Configure workload autoscaling
  • Understand primitives for robust, self-healing deployments
  • Configure Pod admission and scheduling
💾 Storage 10%
  • Implement storage classes and dynamic volume provisioning
  • Configure volume types, access modes and reclaim policies
  • Manage persistent volumes and persistent volume claims
💡 Exam Strategy: Focus on Troubleshooting (30%) and Cluster Architecture (25%) first — together they make up 55% of the exam. All topics are fully covered in this guide.
OVERVIEW

Kubernetes Deployment Order

In the CKA exam, create resources in this order:
1. Namespace
2. ConfigMap
3. Secret
4. PV
5. PVC
6. Pod / Deploy
7. Service
8. HPA/VPA
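Why this order? Pods reference ConfigMaps, Secrets, and PVCs, so create those first; a Pod that references a missing one will not start. A PVC binds to a PV. A Service can be created before its Pods, but it has no endpoints until matching Pods exist. An HPA targets a Deployment, so create the Deployment first.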
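As a sketch of the last step in this order, an HPA manifest could look like the following. It assumes a Deployment named app-deployment in the my-app namespace, a metrics-server running in the cluster, and CPU requests set on the containers. The HPA name, replica bounds, and 70% target are illustrative values, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa                # hypothetical name
  namespace: my-app
spec:
  scaleTargetRef:              # the workload to scale; create it first
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment       # assumed Deployment name
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70 # percent of the containers' CPU request
```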
01

Namespace

Concept

A Namespace is a virtual cluster inside a physical Kubernetes cluster. It provides logical isolation — you can have the same resource names in different namespaces without conflict. Think of it like folders on a computer. By default K8s has: default, kube-system, kube-public, kube-node-lease.

"Namespace provides a mechanism for isolating groups of resources within a single cluster. Resources like Deployments, Services, and Pods are namespace-scoped, while Nodes, PersistentVolumes, and ClusterRoles are cluster-scoped. This lets teams share a cluster without stepping on each other — dev, staging, prod can all live in one cluster but in separate namespaces with their own resource quotas and RBAC policies."

YAML
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    env: production
    team: backend
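The concept text mentions per-namespace resource quotas. A minimal sketch of one for the my-app namespace follows; the quota name and all the numbers are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-app-quota           # hypothetical name
  namespace: my-app
spec:
  hard:
    pods: "20"                 # max number of Pods in the namespace
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```

Once a quota constrains CPU or memory requests and limits, Pods in that namespace must declare them or they are rejected at admission.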
Imperative Commands ⚡

kubectl commands

# Create namespace
kubectl create namespace my-app
kubectl create ns my-app

# List all namespaces
kubectl get ns

# Run all commands in a namespace
kubectl get pods -n my-app
kubectl get all -n my-app

# Set default namespace for session
kubectl config set-context --current --namespace=my-app
02

ConfigMap

Concept

A ConfigMap stores non-sensitive configuration data as key-value pairs. It decouples environment-specific configuration from container images, so the same image works in dev/staging/prod by just swapping the ConfigMap. Pods consume ConfigMaps as env vars, command-line args, or mounted files.

"ConfigMap is a K8s API object used to store non-confidential data in key-value pairs. The main benefit is separation of concerns — your app image stays the same but the config changes per environment. Pods can consume ConfigMap values as environment variables via envFrom or env.valueFrom, or as files mounted via a volume. One important thing — ConfigMap updates don't automatically restart pods. If you mount it as a volume, K8s will eventually update the file, but if you use it as an env var, you need to manually restart the pod."

YAML — Full example (all 3 usage patterns)
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: my-app
data:
  APP_ENV: "production"
  LOG_LEVEL: "info"
  DB_HOST: "postgres-service"
  config.yaml: |            # file-style key
    server:
      port: 8080
      timeout: 30s

---
# Pod consuming ConfigMap in all 3 ways
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
  namespace: my-app
spec:
  containers:
  - name: app
    image: myapp:1.0
    # Pattern 1: Load ALL keys as env vars
    envFrom:
    - configMapRef:
        name: app-config
    # Pattern 2: Load specific key as env var
    env:
    - name: DATABASE_HOST
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: DB_HOST
    # Pattern 3: Mount as file in container
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config
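The concept text also lists command-line args as a way to consume a ConfigMap, which the example above does not show. A possible sketch, assuming the same app-config ConfigMap; the Pod name and the --log-level flag are hypothetical and depend on what the application accepts.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod-args           # hypothetical name
  namespace: my-app
spec:
  containers:
  - name: app
    image: myapp:1.0
    env:
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: LOG_LEVEL
    # $(VAR) is expanded by Kubernetes from the container's env
    args: ["--log-level=$(LOG_LEVEL)"]   # flag name is hypothetical
```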
Imperative Commands ⚡

kubectl commands

# From literal values
kubectl create cm app-config --from-literal=APP_ENV=prod --from-literal=LOG_LEVEL=info

# From a file
kubectl create cm app-config --from-file=config.yaml

# From env file (KEY=VALUE format)
kubectl create cm app-config --from-env-file=.env

# Generate YAML without creating (exam trick!)
kubectl create cm app-config --from-literal=K=V --dry-run=client -o yaml

# View configmap data
kubectl describe cm app-config -n my-app
03

Secret

Concept

A Secret stores sensitive data like passwords, tokens, SSH keys. Values are base64 encoded (NOT encrypted by default — just encoded). For real security, use encryption at rest (EncryptionConfiguration) + RBAC. Secret types: Opaque (generic), kubernetes.io/tls, kubernetes.io/dockerconfigjson, kubernetes.io/service-account-token.

"Secrets are similar to ConfigMaps but designed for sensitive data. Values are base64 encoded — which is encoding not encryption, so anyone who can access the Secret object can decode it. Best practice is to enable etcd encryption at rest and use strict RBAC. Secrets are mounted as tmpfs (in-memory) volumes so they never hit disk in the container. One key difference from ConfigMap — when you create a secret imperatively, kubectl auto base64-encodes the values. But in YAML, you need to base64-encode yourself unless you use the stringData field."

YAML
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
  namespace: my-app
type: Opaque
# Option A: base64 encoded values
data:
  DB_PASSWORD: cGFzc3dvcmQxMjM=   # echo -n "password123" | base64
  API_KEY: c2VjcmV0a2V5

# Option B: plain text (K8s encodes automatically; if a key is in both, stringData wins)
stringData:
  DB_PASSWORD: "password123"
  API_KEY: "secretkey"

---
# Consuming in Pod
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app
    image: myapp:1.0
    envFrom:
    - secretRef:
        name: app-secret
    env:
    - name: DB_PASS
      valueFrom:
        secretKeyRef:
          name: app-secret
          key: DB_PASSWORD
    volumeMounts:
    - name: secret-vol
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-vol
    secret:
      secretName: app-secret
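The concept text points to encryption at rest via EncryptionConfiguration. A minimal sketch of such a file follows. It is not applied with kubectl; kube-apiserver reads it through its --encryption-provider-config flag. The key name and the key value are placeholders you must supply yourself.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1             # hypothetical key name
        secret: <base64-encoded 32-byte key>   # placeholder, e.g. head -c 32 /dev/urandom | base64
  - identity: {}               # fallback so existing unencrypted Secrets stay readable
```

The first provider in the list is used for writes, so Secrets created before the change stay unencrypted until they are rewritten.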
Imperative Commands ⚡

kubectl commands

kubectl create secret generic app-secret --from-literal=DB_PASSWORD=pass123
kubectl create secret generic app-secret --from-file=ssh-key=id_rsa

# TLS secret
kubectl create secret tls my-tls --cert=cert.pem --key=key.pem

# Docker registry secret
kubectl create secret docker-registry regcred \
  --docker-server=gcr.io --docker-username=user --docker-password=pass

# Decode a secret value
kubectl get secret app-secret -o jsonpath='{.data.DB_PASSWORD}' | base64 -d
04

PersistentVolume & PersistentVolumeClaim

Concept

PV = actual storage provisioned by admin (NFS, EBS, HostPath). It's cluster-scoped.

PVC = user's request for storage. A Pod uses the PVC, not the PV directly. K8s binds a PVC to a PV when storageClassName, accessModes, and capacity are compatible.

Access Modes: ReadWriteOnce (RWO — 1 node), ReadOnlyMany (ROX — many nodes read), ReadWriteMany (RWX — many nodes write).
Reclaim Policy: Retain (keep data), Delete (delete on PVC delete), Recycle (deprecated).

"PV and PVC implement a two-tier abstraction for storage in K8s. Admin creates PVs that represent actual storage infrastructure. Developers create PVCs to claim that storage without needing to know the underlying infrastructure. K8s binds a PVC to a PV when capacity and accessModes match. The Pod then references the PVC. This separation means infrastructure and application code are decoupled. StorageClass adds dynamic provisioning — the PV gets created automatically when PVC is submitted, no manual PV creation needed."

YAML — PV + PVC + Pod using it
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce          # RWO
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual   # must match PVC
  hostPath:                  # for local/dev (use NFS/EBS in prod)
    path: /mnt/data

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: my-app
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi           # request <= PV capacity
  storageClassName: manual   # must match PV

---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
  namespace: my-app
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: storage
      mountPath: /data
  volumes:
  - name: storage
    persistentVolumeClaim:
      claimName: my-pvc      # reference PVC, not PV
Imperative Commands ⚡

kubectl commands

# No direct imperative command for PV/PVC creation; use YAML
# But useful commands:
kubectl get pv                           # list all PVs
kubectl get pvc -n my-app                # list PVCs in ns
kubectl describe pvc my-pvc -n my-app    # check binding

# Check PVC status; should be "Bound"
kubectl get pvc my-pvc -n my-app -o wide
PVC stays in Pending state if no PV matches. Check: storage class name, access mode, and capacity must all match!
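The concept text notes that a StorageClass adds dynamic provisioning, so no PV has to be written by hand. A sketch under stated assumptions: the class name fast, the PVC name, and the provisioner string are placeholders; substitute the CSI driver your cluster actually runs.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                   # hypothetical class name
provisioner: example.com/provisioner   # placeholder; use your cluster's CSI driver
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc            # hypothetical name
  namespace: my-app
spec:
  storageClassName: fast       # selects the class above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

With WaitForFirstConsumer, the PVC stays Pending until a Pod that uses it is scheduled; that is expected and not a binding failure.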
05

Pod

Concept

A Pod is the smallest deployable unit in K8s. It wraps one or more containers that share the same network namespace and storage. Containers in a Pod communicate via localhost. Pods are ephemeral — when they die, they're gone. That's why you use Deployments/StatefulSets to manage them. Key concepts: init containers (run before app), sidecar containers (run alongside app), resource requests/limits.

"A Pod is the atomic unit of scheduling in Kubernetes. It hosts one or more tightly-coupled containers that share an IP address, hostname, and storage volumes. In practice, most pods have one container — multi-container pods are used for sidecar patterns like log shipping or service mesh proxies. Pods are ephemeral by design; you never manage them directly in production. Instead, controllers like Deployment or StatefulSet manage pods and ensure the desired count is always running. Resource requests are used for scheduling decisions, while limits enforce runtime constraints."

YAML — Full-blown Pod
apiVersion: v1
kind: Pod
metadata:
  name: full-app-pod
  namespace: my-app
  labels:
    app: myapp
    version: "1.0"
spec:
  # Init container runs first, completes, then app starts
  initContainers:
  - name: init-db-check
    image: busybox
    command: ['sh', '-c', 'until nc -z postgres-service 5432; do sleep 2; done']

  containers:
  - name: app
    image: myapp:1.0
    ports:
    - containerPort: 8080
    # Resources
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"
    # Env from ConfigMap + Secret
    envFrom:
    - configMapRef:
        name: app-config
    - secretRef:
        name: app-secret
    # Liveness + Readiness probes
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: storage
      mountPath: /data
    - name: config-volume
      mountPath: /etc/config
    - name: shared-logs        # app writes logs here so the sidecar can read them
      mountPath: /var/log/app

  # Sidecar container
  - name: log-shipper
    image: fluentd:latest
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log

  volumes:
  - name: storage
    persistentVolumeClaim:
      claimName: my-pvc
  - name: config-volume
    configMap:
      name: app-config
  - name: shared-logs
    emptyDir: {}

  restartPolicy: Always   # Always | OnFailure | Never
Imperative Commands ⚡

kubectl commands

kubectl run nginx-pod --image=nginx --port=80
kubectl run nginx-pod --image=nginx --dry-run=client -o yaml > pod.yaml

# With env vars
kubectl run app --image=myapp --env="ENV=prod" --env="PORT=8080"

# Execute command in pod
kubectl exec -it nginx-pod -- /bin/bash
kubectl exec nginx-pod -- env | grep APP

# Logs
kubectl logs nginx-pod -f               # follow
kubectl logs nginx-pod -c log-shipper   # specific container
kubectl logs nginx-pod --previous       # crashed pod logs
06

ReplicaSet

Concept

A ReplicaSet ensures a specified number of Pod replicas are running at all times. If a pod dies, RS creates a new one. Uses a label selector to track pods.

⚠️ In practice — don't use RS directly! Use Deployment instead which manages RS and adds rolling updates + rollback capabilities. RS is the underlying mechanism Deployment uses.

"ReplicaSet is a controller that maintains a stable set of replica pods running at any given time. It uses label selectors to identify which pods it manages. If you manually delete a pod, the ReplicaSet notices the actual count doesn't match desired count and creates a new one. However, ReplicaSet alone has no rollout strategy — you can't do rolling updates with it. That's why in production we always use Deployment, which owns a ReplicaSet and adds update strategies, rollback, and pause/resume capabilities on top."

YAML
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: app-rs
  namespace: my-app
spec:
  replicas: 3
  selector:            # RS uses this to find/track pods
    matchLabels:
      app: myapp       # MUST match pod template labels
  template:
    metadata:
      labels:
        app: myapp     # MUST match selector
    spec:
      containers:
      - name: app
        image: myapp:1.0
        ports:
        - containerPort: 8080
Imperative Commands ⚡

kubectl commands

# No direct imperative command for RS; use YAML. Useful commands:
kubectl get rs -n my-app
kubectl describe rs app-rs -n my-app

# Scale replicaset
kubectl scale rs app-rs --replicas=5 -n my-app
07

Deployment

Concept

A Deployment manages ReplicaSets and adds declarative updates. It's the most common workload for stateless apps. Key features: rolling updates (zero downtime), rollback, scaling, pause/resume.

Strategy types:
RollingUpdate: gradual replacement (default) — configurable via maxSurge / maxUnavailable
Recreate: kill all pods then create new (causes downtime)

"Deployment is the standard way to run stateless applications in Kubernetes. It manages one or more ReplicaSets — when you update the Deployment, it creates a new RS and gradually shifts traffic from old to new RS, that's your rolling update. You can control the speed with maxSurge (extra pods during update) and maxUnavailable (pods that can go down). Every update creates a new RS, and Kubernetes keeps old RSes for rollback. You can roll back with kubectl rollout undo, which just swaps which RS is active. It's an immutable history of your rollouts."

YAML — Full Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
  namespace: my-app
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # max extra pods during update
      maxUnavailable: 0    # no downtime (0 = zero-downtime)
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myapp:1.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
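The full example above uses RollingUpdate. For contrast, a minimal sketch of the Recreate strategy described in the concept text; the Deployment name, the label value, and the revisionHistoryLimit of 5 are illustrative assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-recreate           # hypothetical name
  namespace: my-app
spec:
  replicas: 3
  revisionHistoryLimit: 5      # old ReplicaSets kept for rollback (default 10)
  selector:
    matchLabels:
      app: myapp-recreate
  strategy:
    type: Recreate             # no rollingUpdate block with this type
  template:
    metadata:
      labels:
        app: myapp-recreate
    spec:
      containers:
      - name: app
        image: myapp:1.0
```

Recreate fits workloads that cannot run two versions side by side, at the cost of downtime during each update.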
Imperative Commands ⚡

kubectl commands

kubectl create deployment app-deploy --image=myapp:1.0 --replicas=3
kubectl create deployment app-deploy --image=myapp:1.0 --dry-run=client -o yaml

# Scale
kubectl scale deployment app-deploy --replicas=5 -n my-app

# Update image (triggers rolling update); format is <container-name>=<image>,
# so "app" must be the name of a container in the Deployment
kubectl set image deployment/app-deploy app=myapp:2.0 -n my-app

# Check rollout status
kubectl rollout status deployment/app-deploy -n my-app

# Rollback to previous version
kubectl rollout undo deployment/app-deploy -n my-app
kubectl rollout undo deployment/app-deploy --to-revision=2

# View history
kubectl rollout history deployment/app-deploy

# Pause/Resume rolling update
kubectl rollout pause deployment/app-deploy
kubectl rollout resume deployment/app-deploy
08

StatefulSet

Concept

A StatefulSet manages stateful applications that need stable identity and persistent storage. Unlike Deployment, each pod gets:
Sticky identity: pod-0, pod-1, pod-2 (not random names)
Stable DNS: pod-0.service.ns.svc.cluster.local
Per-pod PVC: via volumeClaimTemplates

Use for: databases (MySQL, PostgreSQL, MongoDB), Kafka, Zookeeper, Elasticsearch. Requires a Headless Service.

"StatefulSet is for stateful workloads where pods need a stable network identity and persistent storage that survives pod restarts. The key difference from Deployment is that pods are created and deleted in order — pod-0 must be Running before pod-1 starts. Each pod has a predictable DNS name through a headless service. volumeClaimTemplates creates a separate PVC for each pod, so pod-0 always gets pvc-0 even after rescheduling. This is essential for databases where each replica node needs its own dedicated storage and stable hostname for replication configuration."

YAML
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless         # Headless service for DNS
  namespace: my-app
spec:
  clusterIP: None              # Makes it headless!
  selector:
    app: mysql
  ports:
  - port: 3306

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: my-app
spec:
  serviceName: mysql-headless  # MUST reference headless service
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  # Each pod gets its own PVC!
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
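The StatefulSet above reads MYSQL_ROOT_PASSWORD from a Secret named mysql-secret with the key password, which is not defined elsewhere in this guide. A minimal sketch of that Secret; the password value is a placeholder.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
  namespace: my-app
type: Opaque
stringData:
  password: "change-me"        # placeholder value
```

Create it before the StatefulSet, otherwise the mysql pods stay in a config error state.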
Imperative Commands ⚡

kubectl commands

# No direct imperative; use YAML. Useful debugging:
kubectl get statefulset -n my-app
kubectl get pods -l app=mysql -n my-app    # see mysql-0, mysql-1, mysql-2
kubectl scale statefulset mysql --replicas=5 -n my-app

# Access specific pod by DNS
# mysql-0.mysql-headless.my-app.svc.cluster.local
09

DaemonSet

Concept

A DaemonSet ensures one pod runs on every node (or a subset via node selectors). When nodes are added to cluster, pods are added automatically. When nodes are removed, pods are garbage collected.

Use cases: log collectors (Fluentd, Filebeat), monitoring agents (Prometheus node-exporter, Datadog), network plugins (CNI like Calico, Weave), storage daemons.

"DaemonSet guarantees that a copy of a pod runs on every node — or a subset if you use nodeSelector or affinity rules. It's cluster infrastructure stuff — log shippers, monitoring agents, network proxies, anything that needs to run at the node level. Unlike Deployment where you specify replica count, DaemonSet is implicitly replicas=number-of-nodes. When you add a node, K8s automatically schedules the DaemonSet pod on it. It also tolerates the control-plane taint by default in newer K8s versions so it can run on master nodes too for system-level agents."

YAML
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      tolerations:             # Run on control-plane nodes too
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluentd:latest
        resources:
          limits:
            memory: "200Mi"
            cpu: "100m"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
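The concept text says a DaemonSet can target a subset of nodes through node selectors. A minimal sketch, assuming some nodes carry the label disktype=ssd; the label, the DaemonSet name, and the placeholder container are illustrative.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ssd-monitor            # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: ssd-monitor
  template:
    metadata:
      labels:
        app: ssd-monitor
    spec:
      nodeSelector:
        disktype: ssd          # assumed node label
      containers:
      - name: monitor
        image: busybox
        command: ["sh", "-c", "while true; do sleep 3600; done"]
```

Pods appear only on nodes that have the label; add it with kubectl label node <node-name> disktype=ssd.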
Imperative Commands ⚡

kubectl commands

kubectl get daemonset -n kube-system
kubectl describe ds log-collector -n kube-system

# Trick: generate YAML from deployment then change kind
kubectl create deployment log-ds --image=fluentd --dry-run=client -o yaml \
  | sed 's/kind: Deployment/kind: DaemonSet/' \
  | sed '/replicas/d' | sed '/strategy/d' > ds.yaml
10

Job

Concept

A Job runs pods to completion (not indefinitely like Deployment). It guarantees that a specified number of pods successfully terminate. Key params:
completions: how many successful pod completions needed
parallelism: how many pods run simultaneously
backoffLimit: retry limit before marking job failed
activeDeadlineSeconds: max job duration
Use restartPolicy: OnFailure or Never (not Always).

"Job creates one or more pods and tracks successful completions. When completions are reached, the job is done. For batch processing, you set parallelism to run multiple pods simultaneously and completions to total tasks. The key difference from a Deployment is that a Job terminates — pods aren't restarted after success. You use restartPolicy: OnFailure so failed pods retry, or Never to get a new pod each attempt. BackoffLimit controls how many times a failed pod retries before the whole job fails."

YAML
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
  namespace: my-app
spec:
  completions: 10          # total successful completions needed
  parallelism: 3           # 3 pods at a time
  backoffLimit: 4          # retry 4 times before fail
  activeDeadlineSeconds: 300  # fail if not done in 5min
  template:
    spec:
      restartPolicy: OnFailure  # OnFailure or Never (NOT Always!)
      containers:
      - name: processor
        image: python:3.9
        command: ["python", "-c", "print('Processing batch job')"]
        resources:
          requests:
            cpu: "100m"
            memory: "64Mi"
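The example above uses restartPolicy: OnFailure. For contrast, a small sketch of the Never variant mentioned in the concept text, where each retry runs in a fresh Pod; the Job name and command are hypothetical.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: one-shot               # hypothetical name
  namespace: my-app
spec:
  backoffLimit: 2              # at most 2 retries, each in a new Pod
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: busybox
        command: ["sh", "-c", "echo running once && exit 0"]
```

With Never, failed Pods are left behind, which makes their logs easy to inspect after the Job finishes.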
Imperative Commands ⚡

kubectl commands

kubectl create job my-job --image=busybox -- echo "hello"
kubectl create job my-job --image=busybox --dry-run=client -o yaml -- echo "hello"
kubectl get jobs -n my-app
kubectl describe job data-processor
kubectl logs job/data-processor
11

CronJob

Concept

A CronJob creates Jobs on a schedule (like Linux cron). Uses standard cron syntax: * * * * * (minute hour day month weekday). Key params:
successfulJobsHistoryLimit: keep N successful jobs (default 3)
failedJobsHistoryLimit: keep N failed jobs (default 1)
concurrencyPolicy: Allow | Forbid | Replace
startingDeadlineSeconds: deadline to start if missed window

"CronJob is a layer on top of Job that runs it on a schedule using cron syntax. CronJob creates a Job, which creates Pods. So it's CronJob → Job → Pod. The concurrencyPolicy is important — Allow means multiple scheduled jobs can run simultaneously, Forbid skips new job if previous is still running, Replace kills the old and starts new. You need to manage job history to avoid accumulating too many completed jobs — use successfulJobsHistoryLimit and failedJobsHistoryLimit to control that."

YAML
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
  namespace: my-app
spec:
  schedule: "0 2 * * *"        # Every day at 2 AM
  # schedule: "*/5 * * * *"   # Every 5 minutes
  concurrencyPolicy: Forbid    # Don't run if previous still running
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  startingDeadlineSeconds: 60  # Start within 60s of schedule or skip
  jobTemplate:                 # Job spec goes here
    spec:
      backoffLimit: 2
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["/bin/sh", "-c", "pg_dump -h postgres > /backup/dump.sql"]
            volumeMounts:
            - name: backup-storage
              mountPath: /backup
          volumes:
          - name: backup-storage
            persistentVolumeClaim:
              claimName: backup-pvc
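The CronJob above mounts a claim named backup-pvc that is not defined elsewhere in this guide. A minimal sketch of that PVC; the 5Gi size is an assumption, and leaving out storageClassName relies on the cluster having a default StorageClass.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
  namespace: my-app
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi             # assumed size
```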
Imperative Commands ⚡

kubectl commands

kubectl create cronjob daily-backup --image=busybox --schedule="0 2 * * *" -- echo backup
kubectl create cronjob my-cj --image=busybox --schedule="*/5 * * * *" --dry-run=client -o yaml
kubectl get cronjob -n my-app
kubectl get jobs -n my-app    # see jobs created by cronjob

# Manually trigger a cronjob
kubectl create job manual-run --from=cronjob/daily-backup
12

Service Types

Concept

A Service exposes pods via a stable network endpoint. Pods come and go but Service IP stays. Uses label selectors to find target pods and kube-proxy to route traffic.

Type          Access          Use case
ClusterIP     Internal only   Default. Pod-to-pod communication
NodePort      Node IP:Port    Dev/test external access (30000-32767)
LoadBalancer  Cloud LB IP     Production cloud external access
ExternalName  DNS alias       Alias external service (no selector)
Headless      Direct pod IPs  StatefulSets, service discovery

"Service provides stable DNS and IP for a dynamic set of pods. ClusterIP is default — only reachable within the cluster. NodePort extends it and opens a port on every node's IP. LoadBalancer extends NodePort and provisions a cloud load balancer in front. The key insight is they're additive — LoadBalancer creates NodePort creates ClusterIP. For StatefulSets you use a headless service — clusterIP: None — which returns the actual pod IPs from DNS instead of a virtual IP, so clients can connect directly to individual pods."

YAML — All Service Types
# 1. ClusterIP (default - internal only)
apiVersion: v1
kind: Service
metadata:
  name: app-svc
  namespace: my-app
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
  - port: 80         # service port
    targetPort: 8080 # container port
    protocol: TCP

---
# 2. NodePort (external via node IP)
apiVersion: v1
kind: Service
metadata:
  name: app-nodeport
spec:
  type: NodePort
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080   # optional, auto-assigned if omitted (30000-32767)

---
# 3. LoadBalancer (cloud provider LB)
apiVersion: v1
kind: Service
metadata:
  name: app-lb
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080

---
# 4. Headless (StatefulSet / direct pod access)
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless
spec:
  clusterIP: None   # This makes it headless!
  selector:
    app: mysql
  ports:
  - port: 3306

---
# 5. ExternalName (alias to external DNS)
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  type: ExternalName
  externalName: my-database.rds.amazonaws.com  # no selector!
Imperative Commands ⚡

kubectl commands

# Expose a deployment as ClusterIP
kubectl expose deployment app-deploy --port=80 --target-port=8080

# Expose as NodePort
kubectl expose deployment app-deploy --type=NodePort --port=80 --target-port=8080

# Create service directly
kubectl create service clusterip my-svc --tcp=80:8080

# NodePort: pass --node-port to pin the port, or omit it for auto-assignment
kubectl create service nodeport my-svc --tcp=80:8080 --node-port=30080
kubectl create service loadbalancer my-svc --tcp=80:8080

# Get service endpoints
kubectl get endpoints app-svc -n my-app

# Test connectivity from inside cluster
kubectl run test --image=busybox --rm -it -- wget -qO- http://app-svc
13

Ingress & Gateway API

Concept

Ingress manages external HTTP/HTTPS access to services in the cluster. It provides URL-based routing, SSL/TLS termination, and virtual hosting. Unlike NodePort/LoadBalancer (L4), Ingress operates at Layer 7.

Key components:
Ingress Controller: The actual proxy (NGINX, Traefik, HAProxy) — must be installed separately
Ingress Resource: Rules that define routing — which host/path maps to which service
IngressClass: Selects which controller handles the resource

Gateway API is the newer replacement for Ingress — more expressive, supports TCP/UDP, and separates concerns between infra and app teams via GatewayClass → Gateway → HTTPRoute.

"Ingress is K8s Layer 7 load balancing. Instead of creating a LoadBalancer per service, you have one Ingress controller handling all external traffic and routing based on host headers or URL paths. The Ingress resource is just the config — the actual work is done by the Ingress controller which is a pod running NGINX or Traefik. Gateway API is the evolution — it splits responsibilities into GatewayClass (infra provider), Gateway (cluster operator), and HTTPRoute (developer), giving better separation of concerns. In the exam, you need to know both Ingress resources and Gateway API basics."

YAML — Ingress (path + host + TLS) & Gateway API
# Path-based routing with TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  namespace: my-app
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx         # which controller to use
  tls:
  - hosts:
    - myapp.example.com
    secretName: tls-secret        # kubectl create secret tls ...
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix          # Prefix | Exact | ImplementationSpecific
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80

---
# Gateway API — HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
  namespace: my-app
spec:
  parentRefs:
  - name: my-gateway              # references a Gateway object
  hostnames:
  - "myapp.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    backendRefs:
    - name: api-service
      port: 80
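The HTTPRoute above attaches to a Gateway named my-gateway, which is not shown. A minimal sketch of that Gateway follows. The gatewayClassName is a placeholder, since the real value depends on which Gateway API implementation is installed in the cluster.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
  namespace: my-app
spec:
  gatewayClassName: example-class   # placeholder; list real ones with: kubectl get gatewayclass
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    hostname: "myapp.example.com"
    allowedRoutes:
      namespaces:
        from: Same             # only routes in this namespace may attach
```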
Imperative Commands ⚡

kubectl commands

# Create ingress imperatively
kubectl create ingress app-ingress --rule="myapp.example.com/api*=api-svc:80"
kubectl create ingress app-ingress \
  --rule="myapp.example.com/api*=api-svc:80,tls=tls-secret"

# Generate YAML
kubectl create ingress my-ing --rule="host/path=svc:port" --dry-run=client -o yaml

# List and inspect
kubectl get ingress -n my-app
kubectl describe ingress app-ingress -n my-app

# Check IngressClass
kubectl get ingressclass
14

Network Policies

Concept

NetworkPolicy controls traffic flow at the pod level, like a firewall for pods. By default, all pods can talk to all pods (allow-all). Once a NetworkPolicy selects a pod, that pod becomes default-deny for the directions listed in the policy's policyTypes (Ingress, Egress, or both), and only explicitly allowed traffic is permitted in those directions.

Key concepts:
Ingress rules: Who can send traffic TO the pod
Egress rules: Where the pod can send traffic TO
• Selectors: podSelector, namespaceSelector, ipBlock
• Requires a CNI that supports NetworkPolicy (Calico, Cilium, Weave — NOT Flannel)

"NetworkPolicy is K8s native firewall at the pod level. Without any policy, it's a flat network — every pod can reach every other pod. The moment you create a NetworkPolicy targeting a pod, that pod switches to default-deny for the policy types specified. You then whitelist specific traffic. The critical thing is your CNI must support it — Calico and Cilium do, Flannel doesn't. In the exam, you'll likely need to create a policy that allows traffic only from specific pods or namespaces. Remember: NetworkPolicy is namespaced and additive — multiple policies combine with OR logic."

YAML — Default Deny + Allow Specific
# 1. Default deny ALL ingress in namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-app
spec:
  podSelector: {}            # empty = selects ALL pods in namespace
  policyTypes:
  - Ingress                  # no ingress rules = deny all incoming

---
# 2. Allow traffic from specific pods + namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: api-server         # apply to api-server pods
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:            # from pods with role=frontend
        matchLabels:
          role: frontend
    - namespaceSelector:      # OR from monitoring namespace
        matchLabels:
          kubernetes.io/metadata.name: monitoring   # label set automatically on every namespace
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:                       # allow DNS
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP           # DNS falls back to TCP for large responses
      port: 53
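
One detail of the allow-frontend policy is easy to miss: podSelector and namespaceSelector are separate list items under from, so they are OR-ed. Putting both selectors into a single list item turns the rule into an AND. A sketch of that variant follows; the policy name and the app: prometheus label are hypothetical, and the namespace label is the one Kubernetes sets automatically.

```yaml
# 3. Allow only prometheus pods that live in the monitoring namespace (AND)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring-scrape     # hypothetical name
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:            # ONE list item holding both selectors = AND
        matchLabels:
          kubernetes.io/metadata.name: monitoring
      podSelector:
        matchLabels:
          app: prometheus           # hypothetical label
    ports:
    - protocol: TCP
      port: 8080
```

The only difference from the OR form is the missing dash in front of podSelector, which is why this is a frequent source of over-permissive policies.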
Imperative Commands ⚡

kubectl commands

# No direct imperative command; use YAML. Useful commands:
kubectl get networkpolicy -n my-app
kubectl describe netpol allow-frontend -n my-app

# Test connectivity
kubectl run test --image=busybox --rm -it -- wget -qO- --timeout=2 http://api-svc:8080

# Debug: check if policy is applied
kubectl get pods -n my-app --show-labels
kubectl get netpol -n my-app -o yaml
CKA Tip: Always allow DNS egress (UDP and TCP port 53) in your egress policies, otherwise pods can't resolve service names!
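
As a sketch of that tip, a namespace-wide egress lockdown that still permits DNS might look like this. It assumes the cluster DNS pods run in kube-system with the label k8s-app=kube-dns (the kubeadm default); the policy name is hypothetical.

```yaml
# Default deny egress for all pods in my-app, except DNS to the cluster DNS pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress-allow-dns   # hypothetical name
  namespace: my-app
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Further egress policies can then be layered on top, since policies are additive.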
15

CoreDNS

Concept

CoreDNS is the cluster DNS server in Kubernetes. It provides service discovery — every Service gets a DNS name automatically.

DNS record formats:
• Service: svc-name.namespace.svc.cluster.local
• Pod: pod-ip-dashed.namespace.pod.cluster.local
• Headless: pod-name.svc-name.namespace.svc.cluster.local

CoreDNS runs as a Deployment in kube-system namespace. Config stored in a ConfigMap called coredns.
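
To illustrate how that ConfigMap can carry custom DNS entries, here is a trimmed sketch that uses the CoreDNS hosts plugin. The IP address and hostname are made up, and in practice the block would be added to the existing Corefile (kubectl edit cm coredns -n kube-system) without dropping the other plugins.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        hosts {
            10.0.0.50 legacy-db.internal    # hypothetical static entry
            fallthrough                     # other names go to the next plugins
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf
        cache 30
        reload
    }
```

With the reload plugin enabled, CoreDNS picks up the change on its own after a short delay; otherwise kubectl rollout restart deployment coredns -n kube-system applies it.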

"CoreDNS replaced kube-dns as the default DNS in K8s 1.13. It runs as a Deployment with 2 replicas in kube-system, backed by a Service called kube-dns on ClusterIP. When you create a Service, K8s automatically creates a DNS record. Pods use the kube-dns Service IP (set in /etc/resolv.conf) to resolve names. For troubleshooting DNS, you run a debug pod with nslookup or dig. The Corefile in the coredns ConfigMap controls behavior — you can add custom DNS entries, forward to upstream DNS, etc."

YAML — CoreDNS ConfigMap & DNS Debug
# CoreDNS Corefile (in ConfigMap coredns)
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf    # upstream DNS
        cache 30
        loop
        reload
        loadbalance
    }

---
# DNS Debug Pod
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
spec:
  containers:
  - name: dnsutils
    image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
    command: ["sleep", "infinity"]
  restartPolicy: Always
Imperative Commands ⚡

kubectl commands

# Check CoreDNS pods
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns

# Check CoreDNS ConfigMap
kubectl get cm coredns -n kube-system -o yaml

# DNS lookup from debug pod
kubectl exec -it dnsutils -- nslookup kubernetes.default
kubectl exec -it dnsutils -- nslookup api-svc.my-app.svc.cluster.local

# Quick DNS test without debug pod (busybox 1.28: nslookup in newer busybox images is unreliable)
kubectl run dns-test --image=busybox:1.28 --rm -it --restart=Never -- nslookup kubernetes.default

# Check /etc/resolv.conf inside a pod
kubectl exec my-pod -- cat /etc/resolv.conf
13

HPA — Horizontal Pod Autoscaler

Concept

HPA automatically scales pod count based on observed metrics. Runs as a control loop (every 15s by default).

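CPU-based HPA: Built-in. Uses Metrics Server. Scales when average CPU utilization across pods, measured as a percentage of each pod's CPU request, crosses the target. The target pods must therefore set resources.requests.cpu.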
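
Because a Utilization target is computed against container requests, the scaled workload needs requests defined or the HPA reports unknown metrics. A minimal sketch of a matching Deployment follows; the label, image and resource figures are illustrative assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
  namespace: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myapp:1.0
        resources:
          requests:
            cpu: 200m        # a 70% target means scale out above ~140m average usage
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```

If kubectl get hpa shows <unknown> in the TARGETS column, missing requests or a missing metrics-server are the first things to check.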

KEDA (Kubernetes Event Driven Autoscaler): External operator. Scales based on event sources — queue length (RabbitMQ, SQS, Kafka), cron, Prometheus metrics, etc. Can scale to 0 (saves cost). CPU HPA cannot scale to 0.

"HPA watches resource metrics and adjusts replica count. The standard HPA uses CPU and memory from the metrics server — if average CPU exceeds your target, it scales up. KEDA extends this with 50+ scalers for event-driven sources. The critical advantage of KEDA over CPU-HPA is scale-to-zero — if there are no messages in your queue, KEDA scales to 0 replicas. CPU-HPA can only scale to minReplicas (minimum 1). For microservices processing async jobs from a queue, KEDA is the right choice. For typical HTTP services, CPU-based HPA is simpler and sufficient."

YAML — CPU HPA & KEDA
# CPU-based HPA (requires metrics-server installed)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
  namespace: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70    # scale up if CPU > 70%
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

---
# KEDA ScaledObject (scale based on queue length)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: app-keda-scaler
  namespace: my-app
spec:
  scaleTargetRef:
    name: app-deployment
  minReplicaCount: 0            # KEDA can scale to ZERO!
  maxReplicaCount: 20
  triggers:
  - type: rabbitmq
    metadata:
      host: amqp://rabbitmq:5672
      queueName: tasks
      mode: QueueLength         # scale on message count (replaces the deprecated queueLength key)
      value: "5"                # 1 pod per 5 messages in queue
Imperative Commands ⚡

kubectl commands

# Create HPA imperatively (CPU)
kubectl autoscale deployment app-deployment --cpu-percent=70 --min=2 --max=10 \
  --name=app-hpa -n my-app
kubectl get hpa -n my-app
kubectl describe hpa app-hpa -n my-app   # see current metrics

# Generate YAML
kubectl autoscale deployment app-deployment --cpu-percent=70 --min=2 --max=10 \
  --dry-run=client -o yaml
HPA vs KEDA Comparison
Feature | HPA (CPU) | KEDA
Scale to zero | ❌ min 1 | ✅ yes
Event sources | CPU, Memory | 50+ (queue, cron, DB, Prometheus...)
Installation | Built-in | Separate operator
Best for | HTTP services | Async/event-driven workloads
Lag before scale | ~15-30s | Set by pollingInterval (default 30s)
14

VPA — Vertical Pod Autoscaler

Concept

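VPA automatically adjusts CPU and memory requests/limits for containers (vertical scaling = more resources per pod, not more pods). It is an add-on that must be installed separately (CRDs plus recommender, updater and admission controller). Update modes:
• Off: Only recommend, no auto-apply
• Initial: Apply only at pod creation
• Recreate: Apply at pod creation and evict running pods when the recommendation changes
• Auto: Apply and restart pods when needed (currently behaves like Recreate)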

⚠️ VPA + HPA conflict! Don't use both on same target for CPU/Memory. Use VPA for vertical, HPA for horizontal, or use HPA with custom metrics + VPA on non-conflicting resources.

"VPA solves the problem of right-sizing: developers often over-provision CPU/memory to be safe, wasting resources. VPA observes actual usage over time and recommends or automatically adjusts resource requests and limits. In Auto mode, it will evict and restart pods with updated resources. The main drawback is that restarting pods causes brief disruption. So for production, many teams run VPA in Off (recommendation-only) mode to get suggestions and apply them during maintenance windows. VPA and HPA shouldn't both act on the same CPU or memory metric, or they'll fight each other. A safe combo is HPA on CPU + VPA in Off mode."

YAML
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
  namespace: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  updatePolicy:
    updateMode: "Auto"    # "Off" | "Initial" | "Recreate" | "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"   # apply to all containers
      minAllowed:
        cpu: 100m
        memory: 50Mi
      maxAllowed:
        cpu: "2"
        memory: 2Gi
      controlledResources: ["cpu", "memory"]
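
For the recommendation-only setup mentioned in the concept notes, a minimal sketch is the same object with updateMode set to "Off". The object name is hypothetical, and the VPA components are assumed to be installed in the cluster.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa-recommend     # hypothetical name
  namespace: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  updatePolicy:
    updateMode: "Off"         # compute recommendations, never evict pods
```

The recommendations then appear in the object's status, for example via kubectl describe vpa app-vpa-recommend -n my-app.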
15

Taints & Tolerations

Concept

Taints are applied to Nodes — they repel pods that can't tolerate them.
Tolerations are applied to Pods — they allow pods to schedule on tainted nodes.

Taint effects:
• NoSchedule: New pods without a matching toleration won't schedule. Existing pods stay.
• PreferNoSchedule: The scheduler tries to avoid the node but will use it if nothing else fits (soft).
• NoExecute: Also evicts running pods that lack a matching toleration (hard).

⚠️ Taints/Tolerations only say "this pod CAN go to tainted node" — it doesn't FORCE it there. Use NodeAffinity for forcing.

"Taints and tolerations are a push mechanism — taints push pods away from nodes. A taint on a node means 'no pod is allowed here unless they tolerate this taint'. Tolerations on pods say 'I can handle that taint, don't reject me.' NoSchedule prevents future pods, NoExecute additionally evicts existing pods that don't tolerate it. A classic use case is GPU nodes — you taint them with gpu=true:NoSchedule so general workloads don't land on expensive GPU nodes, and only your ML pods that have the matching toleration get scheduled there. But important — tolerations alone don't guarantee the pod goes to that node. You still need node affinity or nodeSelector for that positive selection."

YAML + Commands
# Taints are applied with kubectl (node level)
# kubectl taint nodes node1 gpu=true:NoSchedule
# kubectl taint nodes node1 gpu=true:NoSchedule-  (remove taint with -)

# Pod with Toleration
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"     # Equal | Exists
    value: "true"
    effect: "NoSchedule"  # NoSchedule | NoExecute | PreferNoSchedule
  # - key: "gpu"
  #   operator: "Exists"  # tolerates any value with key "gpu"
  containers:
  - name: ml-app
    image: tensorflow/tensorflow:latest-gpu
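
Since a toleration only permits scheduling onto the tainted node, pinning the pod there needs a positive selector as well. A sketch combining both follows; it assumes the node was also labelled with kubectl label nodes node1 gpu=true, and the second toleration is an optional illustration of tolerationSeconds with NoExecute.

```yaml
# Pod that tolerates the GPU taint AND is pinned to GPU nodes
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-pinned          # hypothetical name
spec:
  nodeSelector:
    gpu: "true"                 # assumes the node label gpu=true exists
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 60       # stay bound for 60s after the node becomes unreachable
  containers:
  - name: ml-app
    image: tensorflow/tensorflow:latest-gpu
```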
Imperative Commands ⚡

kubectl commands

# Add taint to node
kubectl taint nodes node1 gpu=true:NoSchedule
kubectl taint nodes node1 env=prod:NoExecute
kubectl taint nodes node1 dedicated=backend:PreferNoSchedule

# Remove taint (append -)
kubectl taint nodes node1 gpu=true:NoSchedule-

# View taints on nodes
kubectl describe node node1 | grep -i taint
kubectl get nodes -o json | jq '.items[].spec.taints'
16

Node Affinity & Pod Affinity

Concept

Node Affinity: PULL pods TOWARD specific nodes (based on node labels). Advanced version of nodeSelector.
requiredDuringSchedulingIgnoredDuringExecution: Hard rule (MUST match)
preferredDuringSchedulingIgnoredDuringExecution: Soft rule (try to match)

Pod Affinity / Anti-Affinity: Schedule pods RELATIVE to other pods.
Affinity: "Schedule near pods with label X" (co-location)
Anti-Affinity: "Don't schedule near pods with label X" (spread out, HA)

"Node affinity is the positive complement to taints: taints repel, affinity attracts. You label your nodes (disk=ssd, region=us-east) and use requiredDuringScheduling for hard constraints or preferredDuringScheduling for best-effort. Pod affinity is more nuanced: it says 'schedule me on the same node or same zone as pods matching this selector'. Anti-affinity is the opposite: 'spread my replicas across zones'. For high availability, a common pattern is pod anti-affinity with topologyKey=topology.kubernetes.io/zone so that replicas land in different availability zones and one zone failure doesn't take down your whole app. Keep in mind that a required rule leaves extra replicas Pending once every zone already hosts one."

YAML — Node Affinity + Pod Anti-Affinity (HA setup)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ha-deployment
  namespace: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      affinity:
        # NODE AFFINITY — schedule only on SSD nodes
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disk-type
                operator: In           # In | NotIn | Exists | DoesNotExist | Gt | Lt
                values:
                - ssd
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: region
                operator: In
                values: [us-east-1]

        # POD ANTI-AFFINITY — spread replicas across zones (HA!)
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: myapp             # don't schedule with other myapp pods
            topologyKey: topology.kubernetes.io/zone  # one per zone

        # POD AFFINITY — schedule near cache pods (co-location)
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: redis-cache
              topologyKey: kubernetes.io/hostname  # same node as redis

      containers:
      - name: app
        image: myapp:1.0
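
With the required anti-affinity rule above, a cluster that has fewer zones than replicas leaves the surplus pods Pending. A softer variant, sketched here as a fragment of the pod template's spec, asks the scheduler to spread across zones without blocking placement.

```yaml
# Fragment of spec.template.spec: best-effort zone spreading
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: myapp
        topologyKey: topology.kubernetes.io/zone
```

The trade-off is that two replicas may share a zone when capacity is tight, so the hard rule remains the safer choice where strict separation matters.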
Cheatsheet — Operator Types
Operator | Meaning | Example
In | Label value is in list | disk-type In [ssd, nvme]
NotIn | Label value NOT in list | env NotIn [dev, staging]
Exists | Key exists (any value) | gpu Exists
DoesNotExist | Key doesn't exist | spot DoesNotExist
Gt | Greater than | cores Gt 4
Lt | Less than | memory Lt 8
Taints vs Affinity — Quick Reference
Concept | Applied To | Direction | Type
Taint | Node | REPEL pods | Push (node says "go away")
Toleration | Pod | Accept taint | "I can handle that taint"
Node Affinity | Pod | ATTRACT to node | Pull (pod says "go here")
Pod Affinity | Pod | Near other pods | Co-locate
Pod Anti-Affinity | Pod | Away from pods | Spread for HA
nodeSelector | Pod | Exact label match | Simpler node affinity
⚡ CKA Exam Power Tips

Must-know shortcuts for the exam

# 1. ALWAYS use --dry-run to generate YAML fast
kubectl run pod --image=nginx --dry-run=client -o yaml > pod.yaml
kubectl create deploy app --image=nginx --dry-run=client -o yaml > deploy.yaml

# 2. Edit running resource
kubectl edit deployment app-deployment

# 3. Apply changes from a file
kubectl apply -f pod.yaml
kubectl replace --force -f pod.yaml     # delete + recreate

# 4. jsonpath: extract specific field
kubectl get pod my-pod -o jsonpath='{.status.podIP}'
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'

# 5. Watch resources in real-time
kubectl get pods -w

# 6. Check what went wrong
kubectl describe pod my-pod | tail -20  # events section
kubectl events --for pod/my-pod

# 7. Copy file to/from pod
kubectl cp my-pod:/etc/config/app.conf ./app.conf
kubectl cp ./app.conf my-pod:/etc/config/app.conf

# 8. Port forward for quick testing
kubectl port-forward pod/my-pod 8080:80
kubectl port-forward svc/my-svc 8080:80
20

RBAC — Role-Based Access Control

Concept

RBAC controls who can do what in the cluster. It uses 4 objects:

Role: Permissions within a namespace (verbs + resources)
ClusterRole: Permissions cluster-wide (nodes, PVs, non-namespaced resources)
RoleBinding: Binds a Role/ClusterRole to a user/group/SA in a namespace
ClusterRoleBinding: Binds a ClusterRole to a user/group/SA cluster-wide

ServiceAccount: Identity for pods. Every pod runs as a ServiceAccount (default: default SA). Used to grant pods API access.

"RBAC is the authorization model in K8s. You define WHAT actions are allowed (Role/ClusterRole with verbs like get, list, create, delete on resources like pods, deployments) and then bind those permissions to WHO needs them (users, groups, or ServiceAccounts via RoleBinding/ClusterRoleBinding). A key exam pattern is: create a ServiceAccount, create a Role with specific permissions, then bind them. ClusterRole + RoleBinding is a common combo — define permissions once at cluster level, then grant them namespace-by-namespace. Always follow least privilege — don't give cluster-admin unless absolutely needed."

YAML — Role + RoleBinding + ClusterRole
# ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
  namespace: my-app

---
# Role (namespace-scoped)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: my-app
rules:
- apiGroups: [""]             # core API group
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "create", "update", "patch"]

---
# RoleBinding — binds Role to ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods-binding
  namespace: my-app
subjects:
- kind: ServiceAccount
  name: app-sa
  namespace: my-app
# - kind: User               # for user binding
#   name: jane
#   apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role                  # Role or ClusterRole
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

---
# ClusterRole (cluster-wide)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-viewer
rules:
- apiGroups: [""]
  resources: ["nodes", "persistentvolumes"]
  verbs: ["get", "list", "watch"]

---
# ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: view-nodes-binding
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: node-viewer
  apiGroup: rbac.authorization.k8s.io

---
# Pod using ServiceAccount
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
  namespace: my-app
spec:
  serviceAccountName: app-sa  # use custom SA
  containers:
  - name: app
    image: myapp:1.0
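
The ClusterRole + RoleBinding combination mentioned in the concept notes is not shown above. A minimal sketch follows, assuming the built-in view ClusterRole is what you want to grant; the binding name is hypothetical.

```yaml
# RoleBinding that grants a ClusterRole inside ONE namespace only
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-sa-view           # hypothetical name
  namespace: my-app
subjects:
- kind: ServiceAccount
  name: app-sa
  namespace: my-app
roleRef:
  kind: ClusterRole           # permissions defined once, cluster-wide
  name: view                  # built-in read-only ClusterRole
  apiGroup: rbac.authorization.k8s.io
```

Here app-sa gets read access in my-app only; the same ClusterRole can be bound again in other namespaces without redefining the rules.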
Imperative Commands ⚡

kubectl commands

# Create ServiceAccount
kubectl create sa app-sa -n my-app

# Create Role
kubectl create role pod-reader --verb=get,list,watch --resource=pods -n my-app

# Create ClusterRole
kubectl create clusterrole node-viewer --verb=get,list --resource=nodes

# Create RoleBinding
kubectl create rolebinding read-pods --role=pod-reader --serviceaccount=my-app:app-sa -n my-app

# Create ClusterRoleBinding
kubectl create clusterrolebinding view-nodes --clusterrole=node-viewer --user=jane

# Check permissions (can-i)
kubectl auth can-i get pods -n my-app --as=system:serviceaccount:my-app:app-sa
kubectl auth can-i create deployments --as=jane
kubectl auth can-i '*' '*'              # am I cluster-admin?

# Generate YAML
kubectl create role pod-reader --verb=get,list --resource=pods --dry-run=client -o yaml
21

kubeadm — Cluster Bootstrap

Concept

kubeadm is the official tool to bootstrap Kubernetes clusters. It handles: certificate generation, etcd setup, control plane components, kubelet configuration, and node joining.

Workflow:
1. kubeadm init on master → initializes control plane
2. Copy /etc/kubernetes/admin.conf to ~/.kube/config
3. Install CNI network plugin (Calico, Flannel, etc.)
4. kubeadm join on worker nodes with token
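
Step 1 of the workflow can also be driven by a config file instead of flags. The sketch below mirrors the flags used in this guide; the file name kubeadm-config.yaml and the service subnet are assumptions, and the v1beta4 API is served by kubeadm 1.31 and later (older releases use v1beta3).

```yaml
# kubeadm-config.yaml (hypothetical file name)
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.10    # same as --apiserver-advertise-address
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.31.0
networking:
  podSubnet: 10.244.0.0/16          # same as --pod-network-cidr
  serviceSubnet: 10.96.0.0/12       # kubeadm default, shown for clarity
```

It would be used as kubeadm init --config kubeadm-config.yaml, which keeps the cluster settings reviewable in one file.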

"kubeadm is the standard tool for creating production-grade K8s clusters. Init generates all certificates, creates static pod manifests for kube-apiserver, kube-scheduler, kube-controller-manager, and etcd in /etc/kubernetes/manifests. It bootstraps the cluster CA and creates the kubeconfig files. After init, you install a CNI plugin — without it, pods can't communicate across nodes. Then you join workers using the token from kubeadm init. For upgrades, you use kubeadm upgrade plan to see available versions, then kubeadm upgrade apply on master, then upgrade kubelet on each node."

Commands — Init, Join & Upgrade

kubeadm commands

# Initialize control plane
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.1.10

# Setup kubeconfig after init
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

# Generate join token
kubeadm token create --print-join-command

# Join worker node
kubeadm join 192.168.1.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>

# Upgrade cluster
kubeadm upgrade plan                    # check available versions
kubeadm upgrade apply v1.31.0           # upgrade control plane
kubeadm upgrade node                    # on worker nodes
22

Cluster Lifecycle — Upgrade, Backup, Drain

Concept

Cluster lifecycle management covers upgrades, etcd backup/restore, and node maintenance:

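Upgrade strategy: Upgrade control plane first, then workers one at a time, moving one minor version per upgrade (kubeadm does not support skipping minor versions). Max skew: kubelet must never be newer than the API server and may be up to three minor versions older (two for kubelets older than 1.25)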
etcd backup: etcd stores ALL cluster state. Take regular snapshots with etcdctl snapshot save. Critical for disaster recovery
Node drain: Safely evict pods before maintenance. cordon = mark unschedulable, drain = cordon + evict pods
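
For the etcd item above, the restore half has one step that is easy to forget on a kubeadm cluster: after restoring a snapshot into a new directory, the etcd static pod must be pointed at it. A sketch of the relevant excerpt of /etc/kubernetes/manifests/etcd.yaml, assuming the snapshot was restored to /var/lib/etcd-restored:

```yaml
# /etc/kubernetes/manifests/etcd.yaml (excerpt, other fields unchanged)
spec:
  containers:
  - name: etcd
    volumeMounts:
    - mountPath: /var/lib/etcd          # path inside the container stays the same
      name: etcd-data
  volumes:
  - hostPath:
      path: /var/lib/etcd-restored      # was /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
```

When the manifest is saved, kubelet recreates the etcd pod with the restored data; the API server may be unavailable for a minute or so while that happens.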

"For upgrades, you always go control plane first — upgrade kubeadm, then run kubeadm upgrade apply, then upgrade kubelet and kubectl. For workers, drain each node, upgrade, uncordon. etcd backup is critical — I'd take snapshots before any upgrade. Use etcdctl snapshot save with the correct cert paths. For restore, stop API server, restore snapshot, restart. In the exam, you'll likely get a drain/cordon question or an etcd backup/restore question — those are high-value topics."

Commands — Drain, Backup, Upgrade

Essential commands

# Node maintenance: drain/cordon/uncordon
kubectl cordon node01                   # mark unschedulable
kubectl drain node01 --ignore-daemonsets --delete-emptydir-data
kubectl uncordon node01                 # mark schedulable again

# etcd backup
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# etcd restore (writes a new data dir; etcd must then be pointed at it)
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
  --data-dir=/var/lib/etcd-restored

# Verify snapshot (newer etcd releases prefer etcdutl for restore/status)
ETCDCTL_API=3 etcdctl snapshot status /tmp/etcd-backup.db --write-out=table

# Upgrade steps (on control plane node)
apt-get update && apt-get install -y kubeadm='1.31.0-*'
kubeadm upgrade plan
kubeadm upgrade apply v1.31.0
apt-get install -y kubelet='1.31.0-*' kubectl='1.31.0-*'
systemctl daemon-reload && systemctl restart kubelet
23

Highly-Available Control Plane

Concept

HA control plane means multiple master nodes for fault tolerance. Two topologies:

Stacked etcd: etcd runs on each control plane node. Simpler setup, fewer servers. But if a node fails, both a control plane member and an etcd member are lost
External etcd: etcd runs on separate dedicated hosts. Better resilience — etcd and control plane failures are independent. Needs more infrastructure
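
For the external etcd topology, kubeadm is told where etcd lives through its ClusterConfiguration. A minimal sketch follows; the three endpoint addresses are made up, and the certificate paths assume the etcd CA and an API-server client certificate were copied to the first control plane node beforehand.

```yaml
apiVersion: kubeadm.k8s.io/v1beta4      # v1beta3 on kubeadm releases before 1.31
kind: ClusterConfiguration
kubernetesVersion: v1.31.0
controlPlaneEndpoint: "lb.example.com:6443"
networking:
  podSubnet: 10.244.0.0/16
etcd:
  external:
    endpoints:
    - https://10.0.0.11:2379            # hypothetical etcd hosts
    - https://10.0.0.12:2379
    - https://10.0.0.13:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```

In the stacked topology the etcd section is simply omitted and kubeadm runs etcd as a static pod on each control plane node.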

Both require a load balancer in front of API servers (HAProxy, nginx, cloud LB). Use at least 3 control plane nodes so etcd keeps quorum after one failure; an odd member count is recommended because an even count adds no extra fault tolerance.

"For production, you need at least 3 control plane nodes for etcd quorum — etcd uses Raft consensus which requires a majority to function, so 3 nodes can survive 1 failure. The load balancer distributes API requests across all API servers. In stacked topology, you init the first node with --control-plane-endpoint pointing to the LB, then join additional control planes with --control-plane flag. kube-scheduler and kube-controller-manager use leader election — only one is active, others are standby. API servers are all active behind the load balancer."

Commands — HA Setup

kubeadm HA commands

# Init first control plane with LB endpoint
kubeadm init --control-plane-endpoint "lb.example.com:6443" \
  --upload-certs --pod-network-cidr=10.244.0.0/16

# Join additional control plane nodes
kubeadm join lb.example.com:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <cert-key>

# Check control plane components
kubectl get nodes
kubectl get pods -n kube-system | grep -E 'apiserver|scheduler|controller|etcd'

# Check etcd cluster health
ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
24

Helm & Kustomize

Concept

Helm is the package manager for Kubernetes. A Chart is a package of K8s manifests with templating. A Release is a deployed instance of a chart.

Kustomize is a template-free way to customize K8s manifests using overlays. Built into kubectl (kubectl apply -k). It patches base manifests without modifying them — great for per-environment config (dev/staging/prod).
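
As a sketch of the overlay idea, a production overlay's kustomization.yaml might look like the following. The directory layout (overlays/production next to base), the env label and the replica count are assumptions for illustration.

```yaml
# overlays/production/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base                  # unmodified base manifests
namePrefix: prod-
labels:
- pairs:
    env: production
patches:
- target:
    kind: Deployment
    name: app-deployment
  patch: |-
    - op: replace
      path: /spec/replicas
      value: 5
```

The base files stay untouched; each environment gets its own overlay directory with a file like this one.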

"Helm uses Go templates and values.yaml for parameterization — you install charts from repos like ArtifactHub. Key commands: helm install, upgrade, rollback, uninstall. Kustomize takes a different approach — no templating. You have a base directory with your standard manifests and overlay directories that patch those manifests. Use namePrefix, labels, patches to customize per environment. In the CKA exam, you'll likely need to install something via Helm (e.g., a CNI or ingress controller) or use Kustomize to modify existing manifests."

Commands — Helm & Kustomize

Helm commands

# Add repo and install chart
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install my-nginx bitnami/nginx -n my-app

# Install with custom values (two alternatives for the same release)
helm install my-app ./my-chart --values=prod-values.yaml -n my-app
helm install my-app ./my-chart --set replicaCount=3 -n my-app

# Upgrade and rollback (release name and chart must match the install)
helm upgrade my-nginx bitnami/nginx --set image.tag=1.25 -n my-app
helm rollback my-nginx 1 -n my-app      # rollback to revision 1
helm history my-nginx -n my-app

# List and uninstall
helm list -n my-app
helm uninstall my-app -n my-app

Kustomize commands

# Apply with Kustomize (built into kubectl)
kubectl apply -k ./overlays/production/

# Preview generated YAML
kubectl kustomize ./overlays/production/
kustomize build ./overlays/production/  # needs the standalone kustomize binary
25

CNI, CSI, CRI — Extension Interfaces

Concept

Kubernetes uses plugin interfaces to keep the core modular:

CNI (Container Network Interface): Network plugin that assigns IPs to pods and enables pod-to-pod communication across nodes. Examples: Calico, Flannel, Cilium, Weave
CSI (Container Storage Interface): Storage plugin for dynamic volume provisioning. Examples: AWS EBS CSI, GCE PD CSI, NFS CSI
CRI (Container Runtime Interface): Runtime plugin. The kubelet talks CRI to the container runtime. Examples: containerd, CRI-O (dockershim was removed in K8s 1.24, so Docker Engine now needs the cri-dockerd adapter)

CNI plugins live in /opt/cni/bin/. CNI config in /etc/cni/net.d/.
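
On the CRI side, crictl needs to know which runtime socket to talk to. A minimal sketch of /etc/crictl.yaml for a containerd node follows; the socket path is the common containerd default and may differ on your distribution (CRI-O typically uses unix:///var/run/crio/crio.sock).

```yaml
# /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
```

Without this file crictl falls back to probing default endpoints and prints warnings, which can be distracting during node-level troubleshooting.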

"These three interfaces are what make K8s extensible. CNI handles networking — when a pod is created, kubelet calls the CNI plugin to set up the network namespace, assign an IP, and configure routes. Without a CNI, pods can't communicate. CSI standardizes how storage vendors integrate — StorageClass references a CSI driver, and the provisioner creates volumes on demand. CRI abstracts the container runtime — kubelet uses CRI to pull images, create containers. After K8s 1.24, Docker is no longer supported directly; containerd and CRI-O are the standard runtimes."

Imperative Commands ⚡

kubectl commands

# Check which CNI is installed
ls /etc/cni/net.d/
ls /opt/cni/bin/
kubectl get pods -n kube-system | grep -E 'calico|flannel|cilium|weave'

# Install Calico CNI (the manifest URL is release-specific; confirm it in the Calico install docs)
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

# Check container runtime
kubectl get nodes -o wide               # CONTAINER-RUNTIME column
crictl ps                               # list running containers via CRI
crictl images                           # list images via CRI

# Check CSI drivers
kubectl get csidrivers
kubectl get csinodes
26

CRDs & Operators

Concept

CRD (Custom Resource Definition) lets you extend the K8s API with your own resource types. After creating a CRD, you can create instances (Custom Resources) just like pods or services.

Operator = CRD + custom controller. The controller watches for changes to Custom Resources and takes action (create pods, configure databases, etc.). Think of it as encoding operational knowledge into software. Examples: Prometheus Operator, cert-manager, MySQL Operator.

"CRDs extend the K8s API — you define a new resource type, and K8s API server handles the rest: validation, RBAC, storage in etcd, kubectl support. An Operator packages a CRD with a controller that knows how to manage the lifecycle of a complex application. For example, a PostgreSQL Operator knows how to handle replication, failover, backup — things that would normally require a DBA. In the exam, you should know how to view CRDs, create custom resources, and understand the operator pattern."

YAML — CRD Example
# Custom Resource Definition
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.stable.example.com
spec:
  group: stable.example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              schedule:
                type: string
              database:
                type: string
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
    shortNames:
    - bk

---
# Custom Resource (instance of the CRD)
apiVersion: stable.example.com/v1
kind: Backup
metadata:
  name: daily-db-backup
  namespace: my-app
spec:
  schedule: "0 2 * * *"
  database: "production-db"
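
Custom resources are covered by RBAC like any built-in type, using the CRD's group and plural name. A sketch of a Role that lets a subject manage Backup objects in my-app follows; the role name is hypothetical.

```yaml
# Role granting access to the custom resource defined above
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: backup-editor           # hypothetical name
  namespace: my-app
rules:
- apiGroups: ["stable.example.com"]   # spec.group of the CRD
  resources: ["backups"]              # spec.names.plural of the CRD
  verbs: ["get", "list", "watch", "create", "update", "delete"]
```

An operator's own ServiceAccount usually needs a rule like this, bound through a RoleBinding or ClusterRoleBinding, before its controller can watch the resources.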
Imperative Commands ⚡

kubectl commands

# List all CRDs in cluster
kubectl get crd
kubectl describe crd backups.stable.example.com

# Work with custom resources
kubectl get backups -n my-app
kubectl get bk -n my-app                # using shortName
kubectl describe backup daily-db-backup -n my-app

# List all API resources (shows CRDs too)
kubectl api-resources | grep stable.example.com
27

StorageClass & Dynamic Provisioning

Concept

StorageClass enables dynamic volume provisioning — PVs are created automatically when a PVC is submitted, no manual PV creation needed.

provisioner: The driver that creates the volume, normally a CSI driver (e.g., ebs.csi.aws.com, pd.csi.storage.gke.io); names such as kubernetes.io/aws-ebs belong to the legacy in-tree provisioners
reclaimPolicy: Delete (default) or Retain
volumeBindingMode: Immediate (bind PV right away) or WaitForFirstConsumer (wait until a pod uses the PVC — better for topology-aware scheduling)
allowVolumeExpansion: Allow PVC resize

"StorageClass abstracts storage infrastructure. Instead of pre-creating PVs, you define a StorageClass with a provisioner, and when a PVC references that StorageClass, the provisioner automatically creates the volume. WaitForFirstConsumer is important for multi-zone setups — it delays PV creation until a pod is scheduled, so the volume is created in the same zone as the pod. Without it, you might get a PV in zone-a but a pod scheduled in zone-b, which won't work."

YAML — StorageClass + Dynamic PVC
# StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: pd.csi.storage.gke.io    # cloud-specific provisioner
parameters:
  type: pd-ssd
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

---
# PVC using dynamic provisioning (no manual PV needed!)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: my-app
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd       # references StorageClass
  resources:
    requests:
      storage: 20Gi
Imperative Commands ⚡

kubectl commands

kubectl get storageclass
kubectl get sc    # shorthand
kubectl describe sc fast-ssd

# Check default StorageClass (marked "(default)" next to its name)
kubectl get sc -o wide

# Set default StorageClass
kubectl patch sc fast-ssd -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
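With volumeBindingMode: WaitForFirstConsumer, the app-data PVC above stays Pending until a pod references it. The sketch below prints a minimal consumer Pod manifest. The pod name pvc-consumer, the mount path /data, and the sleep command are hypothetical placeholders, not part of the original setup.

```shell
# Print a minimal Pod manifest that mounts the app-data PVC.
# Assumptions: pod name, mount path, and command are illustrative only.
consumer_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer
  namespace: my-app
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data   # must match the PVC name
EOF
}
```

Piping the output to kubectl (consumer_pod_manifest | kubectl apply -f -) should let the PVC bind once the pod is scheduled; kubectl get pvc app-data -n my-app would then show the status change.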
28

Troubleshooting — Clusters & Nodes

Concept

Cluster troubleshooting covers node issues, control plane components, and resource monitoring:

Node NotReady: Check kubelet status, container runtime, network
Control plane issues: Static pods in /etc/kubernetes/manifests/ — API server, scheduler, controller-manager, etcd
Monitoring: kubectl top (requires metrics-server), events, component logs
Container logs: kubectl logs, --previous for crashed containers, crictl logs for runtime-level logs

"When troubleshooting, start broad then narrow down. For a NotReady node: SSH in, check kubelet (systemctl status kubelet, journalctl -u kubelet), check container runtime, check /var/log/. For control plane issues: check if static pod manifests are correct in /etc/kubernetes/manifests/, check kube-apiserver logs. For monitoring: install metrics-server, then use kubectl top nodes/pods. Always check events first — kubectl get events --sort-by=.lastTimestamp gives you the timeline of what happened."

Commands — Cluster Troubleshooting

Essential debugging commands

# Node status and issues
kubectl get nodes
kubectl describe node node01    # check Conditions, Events

# Control plane components (pod names end in the node name, e.g. kube-apiserver-<node-name>)
kubectl get pods -n kube-system
kubectl logs kube-apiserver-master -n kube-system
kubectl logs kube-scheduler-master -n kube-system

# kubelet (on the node via SSH)
systemctl status kubelet
journalctl -u kubelet -f    # follow logs
systemctl restart kubelet

# Static pod manifests
ls /etc/kubernetes/manifests/
grep staticPodPath /var/lib/kubelet/config.yaml

# Resource monitoring (requires metrics-server)
kubectl top nodes
kubectl top pods -n my-app --sort-by=cpu
kubectl top pods -A --sort-by=memory

# Events: timeline of what happened
kubectl get events -n my-app --sort-by=.lastTimestamp
kubectl get events -A --field-selector reason=Failed
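On a cluster with many nodes, the first step (finding which nodes are not Ready) can be scripted. The helper below is a minimal sketch, assuming the default kubectl get nodes column layout with NAME first and STATUS second; the function name not_ready_nodes is made up for illustration.

```shell
# List nodes whose STATUS column does not start with "Ready".
# Reads `kubectl get nodes` output (header line included) on stdin.
# Cordoned nodes ("Ready,SchedulingDisabled") are treated as Ready.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}
```

Typical use would be kubectl get nodes | not_ready_nodes, followed by kubectl describe node on each name it prints.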
29

Troubleshooting — Workloads

Concept

Common pod failure states:

CrashLoopBackOff: Container starts, crashes, restarts repeatedly. Check logs: kubectl logs pod --previous
ImagePullBackOff: Can't pull image. Wrong image name, tag, or registry auth issue
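Pending: The scheduler cannot place the pod on any node. Insufficient resources, no matching nodeSelector/affinity, taints without matching tolerations, or an unbound PVC
CreateContainerConfigError: A ConfigMap or Secret (or a key inside one) referenced by the container's env or envFrom is missing. A missing volume source usually shows up as ContainerCreating with FailedMount events instead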
OOMKilled: Container exceeded memory limit. Increase limits or fix memory leak

"For pod debugging, follow this flow: 1) kubectl get pods — see status. 2) kubectl describe pod — check Events section at the bottom for scheduling or volume errors. 3) kubectl logs — check app logs for crashes. 4) kubectl exec — get a shell and debug from inside. For CrashLoopBackOff, 90% of the time the answer is in kubectl logs --previous. For Pending, check events for scheduling failures. For ImagePullBackOff, verify the image exists and check imagePullSecrets if it's a private registry."

Commands — Pod Debugging Flow

Debugging workflow

# Step 1: Check pod status
kubectl get pods -n my-app -o wide

# Step 2: Describe for events and details
kubectl describe pod my-pod -n my-app

# Step 3: Check logs
kubectl logs my-pod -n my-app
kubectl logs my-pod -n my-app --previous    # crashed container
kubectl logs my-pod -n my-app -c sidecar    # specific container
kubectl logs my-pod -n my-app --all-containers=true

# Step 4: Get shell inside pod
kubectl exec -it my-pod -n my-app -- /bin/sh
kubectl exec my-pod -n my-app -- env    # check env vars
kubectl exec my-pod -n my-app -- cat /etc/config/app.yaml

# Deployment rollout issues
kubectl rollout status deployment/my-deploy -n my-app
kubectl rollout history deployment/my-deploy -n my-app
kubectl rollout undo deployment/my-deploy -n my-app    # fix bad deploy

# Debug with ephemeral container (--target is the name of an existing container in the pod)
kubectl debug my-pod -n my-app -it --image=busybox --target=app
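The failure states listed above each point to a different first command. As a memory aid, that mapping can be written as a small lookup. This is a sketch only: the function name first_check is hypothetical, and the choice of command per state follows the guidance in this section rather than any official rule.

```shell
# Map a pod STATUS value to the first command worth running.
# The pod name is passed through as-is; add -n <namespace> as needed.
first_check() {
  status=$1
  pod=$2
  case "$status" in
    CrashLoopBackOff)
      # The crashed container's output is usually the fastest clue.
      echo "kubectl logs $pod --previous" ;;
    ImagePullBackOff|Pending|CreateContainerConfigError|OOMKilled)
      # Events and Last State in describe explain these states.
      echo "kubectl describe pod $pod" ;;
    *)
      echo "kubectl get pod $pod -o wide" ;;
  esac
}
```

For example, first_check CrashLoopBackOff my-pod prints the logs command with --previous, while first_check Pending my-pod points to describe, where the scheduling events appear.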
30

Troubleshooting — Networking

Concept

Network troubleshooting covers service connectivity, DNS, and pod networking:

Service not reachable: Check selector labels match pod labels. Check endpoints exist (kubectl get endpoints). Verify target port matches container port
DNS issues: Check CoreDNS pods are running. Test with nslookup from a debug pod. Check /etc/resolv.conf
Pod-to-pod issues: Check CNI plugin is installed and running. Check NetworkPolicies aren't blocking traffic
Ingress issues: Check Ingress controller pods, IngressClass, annotations, backend service

"For Service issues, the most common problem is label mismatch — the service selector doesn't match any pod labels, so endpoints are empty. Always check kubectl get endpoints. For DNS, verify CoreDNS is running, then test from inside a pod. For pod connectivity, check if CNI is healthy and if NetworkPolicies are blocking traffic. A good debug pattern: run a test pod with network tools (busybox or nicolaka/netshoot) and test connectivity step by step — can you reach the pod IP? The Service IP? The DNS name?"

Commands — Network Debugging

Network troubleshooting commands

# Check Service and Endpoints
kubectl get svc -n my-app
kubectl get endpoints -n my-app    # empty usually means a selector/label mismatch or no Ready pods
kubectl describe svc my-svc -n my-app

# Test connectivity from debug pod
kubectl run netdebug --image=busybox --rm -it -- sh
# Inside pod:
#   wget -qO- http://my-svc.my-app.svc:80
#   nslookup my-svc.my-app.svc.cluster.local
#   ping 10.244.1.5

# DNS troubleshooting
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns
# busybox:1.28 is pinned because nslookup in newer busybox images can misreport results
kubectl run dns-test --image=busybox:1.28 --rm -it --restart=Never -- nslookup kubernetes.default

# Check NetworkPolicies blocking traffic
kubectl get netpol -n my-app
kubectl describe netpol -n my-app

# Ingress troubleshooting (namespace and label assume a default ingress-nginx install)
kubectl get ingress -n my-app
kubectl describe ingress my-ingress -n my-app
kubectl get pods -n ingress-nginx    # check controller
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx
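Since a selector/label mismatch is the most common cause of empty endpoints, it helps to compare the two strings directly. The sketch below assumes the comma-separated key=value format that kubectl get svc -o wide prints in its SELECTOR column and kubectl get pods --show-labels prints in its LABELS column; the function name selector_matches is hypothetical.

```shell
# Succeed only if every key=value pair in the Service selector
# also appears in the pod's label list.
# Runs in a subshell so the IFS and globbing changes do not leak out.
selector_matches() (
  selector=$1
  labels=$2
  IFS=,
  set -f
  for pair in $selector; do
    case ",$labels," in
      *",$pair,"*) ;;   # this selector pair is present on the pod
      *) exit 1 ;;      # missing pair: the Service will not select the pod
    esac
  done
  exit 0
)
```

A call such as selector_matches "app=web,tier=fe" "app=web" fails because tier=fe is missing from the pod, which is the situation that leaves the endpoints list empty.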
CKA Debug Pattern: get pods → describe pod (Events) → logs → exec into pod. For services: get endpoints → test from inside cluster. 90% of problems are found in Events or logs!