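Background

The developers reported that a pod in their environment was misbehaving. They had tried to handle it themselves first, got stuck, and only then came to me.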

    $ kubectl -n xxx get pod -o wide | grep 0/1
    xstorage-6d959cd8c8-rdscv   0/1   Init:0/1   0   15m   xxxx
    $ kubectl -n xxx delete po xstorage-6d959cd8c8-rdscv
    pod "xstorage-6d959cd8c8-rdscv" deleted
    ^[[A^[[A^[[A^[[A

Troubleshooting

Check the kubelet logs:

    $ journalctl -xe -u kubelet
    ...
    kubelet_pods.go:1090] Killing unwanted pod "xstorage-6d959cd8c8-rdscv"
    kuberuntime_container.go:581] Killing container "docker://7e33d0cdfe694050d3a7ef2a2553792eaff78286c92c52674320b2be23eb793e" with 30 second grace period

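
To avoid copying the long container ID out of the log by hand, it can be pulled out with a small filter. This is only a sketch: extract_container_ids is a helper name I made up, and it assumes the kubelet logs the ID in the docker://<64 hex chars> form seen above.

```shell
# Hypothetical helper: read kubelet log lines on stdin and print the unique
# container IDs that appear as docker://<64 hex chars>.
extract_container_ids() {
  grep -o 'docker://[0-9a-f]\{64\}' | sed 's#^docker://##' | sort -u
}

# Example usage on a node (not executed here):
#   journalctl -u kubelet --no-pager | grep 'Killing container' | extract_container_ids
```

The first few characters of the printed ID are enough for the docker commands that follow.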
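The container's status still showed it as running. Since the kubelet logs showed that it could not be removed through the API, I did not bother with docker rm -f and the like, and tried at the PID level instead: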

    $ docker ps -a | grep 7e33
    7e33d0cdfe69   ed29c9ed519e   "bash -ceux '\\cp -rf…"   23 minutes ago   Up 23 minutes   k8s_copy-configmap_xstorage-6d959cd8c8-rdscv_xxx_ee4423f2-de76-424c-b685-8d888446b2af_0
    $ docker inspect 7e33 | grep -i pid
                "Pid": 15370,
                "PidMode": "",
                "PidsLimit": 0,
    $ kill -9 15370
    -bash: kill: (15370) - No such process

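
The mismatch above (docker reports the container as Up, but the PID it records does not exist) can be checked without sending any signal. A minimal sketch, assuming a Linux host with /proc mounted; pid_alive and container_pid_alive are names I made up for illustration:

```shell
# Hypothetical helper: succeed if a process with the given numeric PID
# exists on this host. Uses /proc, so it is Linux-only.
pid_alive() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ -d "/proc/$1" ]
}

# Hypothetical wrapper: ask docker for the PID it has recorded for a
# container and check whether that process is really there.
container_pid_alive() {
  pid=$(docker inspect -f '{{.State.Pid}}' "$1") || return 2
  pid_alive "$pid"
}

# Example usage on a node (not executed here):
#   container_pid_alive 7e33 || echo "docker's recorded PID is gone"
```

If docker says Up and this check fails, the state docker holds is stale, which matches what kill -9 reported here.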
Unbelievable. Let's see what exec says:

    $ docker exec 7e33 ps aux
    cannot exec in a stopped state: unknown
    $ docker ps -a | grep 7e33
    7e33d0cdfe69   ed29c9ed519e   "bash -ceux '\\cp -rf…"   23 minutes ago   Up 23 minutes   k8s_copy-configmap_xstorage-6d959cd8c8-rdscv_xxx_ee4423f2-de76-424c-b685-8d888446b2af_0

Fine, let's deal with it one level up, at the containerd-shim. Search the process list by a prefix of the container ID; here I use 7e33:

    $ ps aux | grep 7e33
    root     15251  0.0  0.0  10724  3984 ?        Sl   10:43   0:00 containerd-shim -namespace moby -workdir /data/kube/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/7e33d0cdfe694050d3a7ef2a2553792eaff78286c92c52674320b2be23eb793e -address /var/run/docker/containerd/containerd.sock -containerd-binary /data/kube/bin/containerd -runtime-root /var/run/docker/runtime-runc
    root     28330  0.0  0.0 112728   984 pts/5    S+   11:07   0:00 grep --color=auto 7e33
    $ kill 15251
    $ docker ps -a | grep 7e33

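
Picking the shim PID out of ps by eye works, but grep also matches itself, as the second line of output shows. A rough sketch of a stricter lookup follows. It assumes the same layout as above (a process whose command ends in containerd-shim and whose -workdir contains /moby/<container id>); other Docker or containerd versions may lay this out differently. find_shim_pids is a made-up name.

```shell
# Hypothetical helper: read "PID COMMAND ARGS..." lines on stdin (for example
# from `ps -eo pid,args`) and print the PIDs of containerd-shim processes
# whose arguments contain /moby/<container id prefix>.
find_shim_pids() {
  awk -v id="$1" '$2 ~ /containerd-shim$/ && index($0, "/moby/" id) { print $1 }'
}

# Example usage on a node (not executed here):
#   ps -eo pid,args | find_shim_pids 7e33
```

I would still read the printed PID before killing anything, and start with a plain kill (SIGTERM) as above rather than kill -9.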
After that the pod was gone as well.

For reference, docker info on that node:

    $ docker info
    Containers: 388
     Running: 301
     Paused: 0
     Stopped: 87
    Images: 378
    Server Version: 18.09.3
    Storage Driver: overlay2
     Backing Filesystem: xfs
     Supports d_type: true
     Native Overlay Diff: true
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
     Volume: local
     Network: bridge host macvlan null overlay
     Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
    Swarm: inactive
    Runtimes: runc
    Default Runtime: runc
    Init Binary: docker-init
    containerd version: e6b3f5632f50dbc4e9cb6288d911bf4f5e95b18e
    runc version: 6635b4f0c6af3810594d2770f662f34ddc15b40d
    init version: fec3683
    Security Options:
     seccomp
      Profile: default
    Kernel Version: 3.10.0-693.el7.x86_64
    Operating System: Red Hat Enterprise Linux Server 7.4 (Maipo)
    OSType: linux
    Architecture: x86_64
    CPUs: 8
    Total Memory: 31.26GiB
    Name: redhat7.4
    ID: JJTI:G4BO:2VFT:PGNY:VYFD:OYHD:MNK3:5L6U:RY4Q:4WUC:NQDA:HP2D
    Docker Root Dir: /data/kube/docker
    Debug Mode (client): false
    Debug Mode (server): false
    Registry: https://index.docker.io/v1/
    Labels:
    Experimental: false
    Insecure Registries:
     reg.xxx.lan:5000
     treg.yun.xxx.cn
     127.0.0.0/8
    Registry Mirrors:
     https://registry.docker-cn.com/
     https://docker.mirrors.ustc.edu.cn/
    Live Restore Enabled: false
    Product License: Community Engine