Katacoda Online Course: Liveness and Readiness Healthchecks
This tutorial series aims to make learning more efficient and approachable by combining an interactive learning site with traditional material. The tutorials in this series are completed on the Katacoda online learning platform.
In this scenario, you will learn how Kubernetes uses Readiness and Liveness Probes to check container health.
A Readiness Probe checks whether an application is ready to start handling traffic. It covers the case where the container has started but the process is still warming up and configuring itself, meaning it is not yet ready to receive traffic.
A Liveness Probe ensures the application is healthy and able to handle requests.
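Both probes are defined on the container spec. The sketch below, with illustrative values that mirror the manifest used later in this scenario, shows the shape of an httpGet probe and what each field controls:
# Minimal sketch of both probe types on a container spec (illustrative values).
readinessProbe:            # gate traffic until the app responds successfully
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 1   # wait this long before the first probe
  timeoutSeconds: 1        # fail the probe if no response within 1 second
livenessProbe:             # restart the container if the app stops responding
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 1
  timeoutSeconds: 1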
Launch the Cluster
First, we need to start a Kubernetes cluster.
Run the following command to start the cluster components and download the Kubectl CLI:
controlplane $ launch.sh
Waiting for Kubernetes to start...
Kubernetes started
Once the cluster has started, deploy the demo application with kubectl apply -f deploy.yaml.
deploy.yaml
kind: List
apiVersion: v1
items:
- kind: ReplicationController
  apiVersion: v1
  metadata:
    name: frontend
    labels:
      name: frontend
  spec:
    replicas: 1
    selector:
      name: frontend
    template:
      metadata:
        labels:
          name: frontend
      spec:
        containers:
        - name: frontend
          image: katacoda/docker-http-server:health
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 1
            timeoutSeconds: 1
          livenessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 1
            timeoutSeconds: 1
- kind: ReplicationController
  apiVersion: v1
  metadata:
    name: bad-frontend
    labels:
      name: bad-frontend
  spec:
    replicas: 1
    selector:
      name: bad-frontend
    template:
      metadata:
        labels:
          name: bad-frontend
      spec:
        containers:
        - name: bad-frontend
          image: katacoda/docker-http-server:unhealthy
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 1
            timeoutSeconds: 1
          livenessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 1
            timeoutSeconds: 1
- kind: Service
  apiVersion: v1
  metadata:
    labels:
      app: frontend
      kubernetes.io/cluster-service: "true"
    name: frontend
  spec:
    type: NodePort
    ports:
    - port: 80
      nodePort: 30080
    selector:
      app: frontend
controlplane $ kubectl apply -f deploy.yaml
replicationcontroller/frontend created
replicationcontroller/bad-frontend created
service/frontend created
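To confirm everything was created, you can list the objects from deploy.yaml (a quick check, assuming the default namespace):
# List the replication controllers, pods and the service created above
kubectl get rc,pods,svc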
Readiness Probe
When the cluster was deployed, two Pods were also deployed to demonstrate health checks. You can view the deployment with cat deploy.yaml.
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 1
  timeoutSeconds: 1
You can adjust these settings to suit your application, for example probing a different endpoint such as /ping.
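For example, a liveness probe pointed at a hypothetical /ping endpoint with a less aggressive schedule might look like the sketch below; periodSeconds and failureThreshold are standard probe fields that this scenario's manifest leaves at their defaults:
# Hypothetical tuning: probe a dedicated /ping endpoint less aggressively.
livenessProbe:
  httpGet:
    path: /ping            # application-specific health endpoint (illustrative)
    port: 80
  initialDelaySeconds: 5   # give the application longer to start
  periodSeconds: 10        # probe every 10 seconds
  timeoutSeconds: 1
  failureThreshold: 3      # restart only after 3 consecutive failures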
Get the Status
The first Pod, bad-frontend, is an HTTP service that always returns a 500 error, indicating that it has not started correctly. You can check the Pod's status with kubectl get pods --selector="name=bad-frontend".
controlplane $ kubectl get pods --selector="name=bad-frontend"
NAME                 READY   STATUS    RESTARTS   AGE
bad-frontend-jk5z2   0/1     Running   2          78s
Kubectl returns the Pods deployed with our specific label. Because the health check is failing, no containers are reported as ready, and the output also shows how many times the container has been restarted.
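If you only need the restart count, it can also be read straight from the Pod's status with a JSONPath query (a sketch, assuming a single matching Pod):
# Print the restart count of the first container in the first matching Pod
kubectl get pods --selector="name=bad-frontend" \
  -o jsonpath='{.items[0].status.containerStatuses[0].restartCount}'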
For more detail on why it is failing, use the describe command against the Pod.
controlplane $ pod=$(kubectl get pods --selector="name=bad-frontend" --output=jsonpath={.items..metadata.name})
controlplane $ kubectl describe pod $pod
Name: bad-frontend-jk5z2
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: controlplane/172.17.0.67
Start Time: Fri, 23 Jul 2021 05:54:59 +0000
Labels: name=bad-frontend
Annotations: <none>
Status: Running
IP: 10.32.0.6
Controlled By: ReplicationController/bad-frontend
Containers:
  bad-frontend:
    Container ID:   docker://811c3fa6d76c13f4b5bc7a6b2d6f514292e9673be46572a79b8f2d3d5e36bc62
    Image:          katacoda/docker-http-server:unhealthy
    Image ID:       docker-pullable://katacoda/docker-http-server@sha256:bea95c69c299c690103c39ebb3159c39c5061fee1dad13aa1b0625e0c6b52f22
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 23 Jul 2021 05:57:07 +0000
      Finished:     Fri, 23 Jul 2021 05:57:37 +0000
    Ready:          False
    Restart Count:  4
    Liveness:       http-get http://:80/ delay=1s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:80/ delay=1s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5n24z (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-5n24z:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5n24z
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m53s default-scheduler Successfully assigned default/bad-frontend-jk5z2 to controlplane
Normal Pulling 2m52s kubelet, controlplane Pulling image "katacoda/docker-http-server:unhealthy"
Normal Pulled 2m45s kubelet, controlplane Successfully pulled image "katacoda/docker-http-server:unhealthy"
Normal Created 105s (x3 over 2m45s) kubelet, controlplane Created container bad-frontend
Normal Started 105s (x3 over 2m45s) kubelet, controlplane Started container bad-frontend
Warning Unhealthy 105s (x6 over 2m35s) kubelet, controlplane Liveness probe failed: HTTP probe failed with statuscode: 500
Normal Killing 105s (x2 over 2m15s) kubelet, controlplane Container bad-frontend failed liveness probe, will be restarted
Normal Pulled 105s (x2 over 2m15s) kubelet, controlplane Container image "katacoda/docker-http-server:unhealthy" already present on machine
Warning Unhealthy 103s (x7 over 2m43s) kubelet, controlplane Readiness probe failed: HTTP probe failed with statuscode: 500
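The same probe failures can also be pulled from the event stream directly, which is handy once the describe output grows long (a sketch reusing the $pod variable set above):
# Show only the events recorded for this Pod
kubectl get events --field-selector involvedObject.name=$pod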
Readiness
Our second Pod, frontend, returns an OK status when it starts.
controlplane $ kubectl get pods --selector="name=frontend"
NAME             READY   STATUS    RESTARTS   AGE
frontend-54czd   1/1     Running   0          4m9s
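You can also confirm readiness from the Pod's conditions rather than the READY column; the sketch below uses a JSONPath filter to select the Ready condition and should print True:
# Print the status of the frontend Pod's Ready condition
pod=$(kubectl get pods --selector="name=frontend" --output=jsonpath={.items..metadata.name})
kubectl get pod $pod -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'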
Liveness Probe
Since our second Pod is currently healthy, we can simulate a failure.
At the moment, the Pod should not have crashed.
controlplane $ kubectl get pods --selector="name=frontend"
NAME             READY   STATUS    RESTARTS   AGE
frontend-54czd   1/1     Running   0          7m17s
Crash the Service
The HTTP server has an extra endpoint that causes it to start returning 500 errors. The endpoint can be called with kubectl exec.
controlplane $ pod=$(kubectl get pods --selector="name=frontend" --output=jsonpath={.items..metadata.name})
controlplane $ kubectl exec $pod -- /usr/bin/curl -s localhost/unhealthy
Liveness
Kubernetes executes the Liveness Probe according to the configuration. If the probe fails, Kubernetes destroys and re-creates the failing container. Run the command above to crash the service and watch Kubernetes automatically recover it.
controlplane $ kubectl get pods --selector="name=frontend"
NAME             READY   STATUS    RESTARTS   AGE
frontend-54czd   1/1     Running   1          9m42s
It may take a few moments for the probe to detect the failure.
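To watch the recovery as it happens, follow the Pod list with --watch and wait for the RESTARTS column to increase (a sketch; press Ctrl+C to stop watching):
# Watch the frontend Pod; RESTARTS increments once the liveness probe fails
kubectl get pods --selector="name=frontend" --watch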