zhangguanzhang's Blog

Troubleshooting pods that cannot be created on k8s with NFS

2023/08/18

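A member of a chat group paid me to figure out why his pods could not be created. I am writing the process down as a reference for others.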

Background

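In a k8s chat group, someone asked why pods scheduled to a certain node stayed in ContainerCreating for a long time. He was told to look at the logs but could not make anything of them. He later added me as a contact and paid me to take a look.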

Troubleshooting process

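At first the failing pod was a plain nginx deployment with no volumes mounted, and describe really showed nothing useful. Later on, pods with an NFS PVC could not start on the node either.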

Environment:

$ kubectl get node -o wide
NAME     STATUS   ROLES    AGE      VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
master   Ready    master   2y129d   v1.18.3   xxx.xx.xx.9   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5
work01   Ready    <none>   2y129d   v1.18.3   xxx.xx.xx.1   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5
work02   Ready    <none>   2y129d   v1.18.3   xxx.xx.xx.3   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5
work03   Ready    <none>   2y129d   v1.18.3   xxx.xx.xx.4   <none>        CentOS Linux 7 (Core)   3.10.0-1160.24.1.el7.x86_64   docker://19.3.5

orphaned pod xxx found, but

The kubelet log was flooded with messages like this:

kubelet_volumes.go:154] orphaned pod "xxx" found, but volume paths are still present on disk : There were a total of 84 errors similar to this. Turn up verbosity to see them.

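Before 1.20 (or thereabouts, I do not remember the exact version), when a pod moved to another node or was deleted, some of its directories could be left behind on the node under the kubelet --root-dir, which by default means UUID-named directories under /var/lib/kubelet/pods. You can run find on such a directory to check what is in it, then read its etc-hosts file for the hostname and use kubectl get pod to see whether a pod with that name still exists. If it does not, the directory is a leftover and can be cleaned up by hand. As far as I remember, someone later submitted a PR so that kubelet cleans up these directories periodically.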
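The manual check described above can be approximated with a small helper. This is only a sketch and not the procedure used in this case: the function name `list_orphan_pod_dirs` and the file /tmp/live-uids.txt are made up for illustration, and it compares directory names against pod UIDs from the API instead of reading etc-hosts. Static pods may be reported as well, because their on-disk directory name may not match the UID shown by the API, so treat the output as candidates to inspect and never as a delete list.

```shell
# list_orphan_pod_dirs PODS_DIR LIVE_UIDS_FILE  (hypothetical helper)
# Prints every directory name under PODS_DIR that is not listed in
# LIVE_UIDS_FILE (one pod UID per line).
list_orphan_pod_dirs() (
  pods_dir=$1
  live_uids=$2
  for dir in "$pods_dir"/*/; do
    [ -d "$dir" ] || continue          # the glob matched nothing
    uid=$(basename "$dir")
    # -q quiet, -x match the whole line, -F treat the UID as a fixed string
    grep -qxF -- "$uid" "$live_uids" || printf '%s\n' "$uid"
  done
)

# On a node (not executed here):
#   kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.uid}{"\n"}{end}' > /tmp/live-uids.txt
#   list_orphan_pod_dirs /var/lib/kubelet/pods /tmp/live-uids.txt
```

The body runs in a subshell so the loop variables do not leak into the calling shell.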

Failed to get system container stats

After working through the errors in the log above one by one, I saw the following:

summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "system.slice/docker.service": unknown container "/system.slice/docker.service"

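This means kubelet cannot get some information from docker, which usually points to a problem with docker itself, but the docker log looked normal. I asked whether docker could be restarted and was told it had already been restarted that morning, so the cause had to be something else.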

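He mentioned that df -h hangs. That had to be fixed before restarting docker again, because docker and kubelet perform filesystem operations when they collect information. I installed strace and took a look: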

stat("/var/lib/kubelet/pods/e3b61daa-86a7-4e2d-8c82-bdd96e7a6da2/volumes/kubernetes.io~secret/jenkins-admin-token-6479r", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=140, ...}) = 0
stat("/var/lib/kubelet/pods/c527ccff-e623-48e1-90be-11930facc11b/volumes/kubernetes.io~secret/default-token-r9mv9", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=140, ...}) = 0
stat("/sys/fs/bpf", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=0, ...}) = 0
stat("/var/lib/kubelet/pods/e3b61daa-86a7-4e2d-8c82-bdd96e7a6da2/volumes/kubernetes.io~nfs/jenkins", {st_mode=S_IFDIR|0777, st_size=3688, ...}) = 0
stat("/var/lib/kubelet/pods/c527ccff-e623-48e1-90be-11930facc11b/volumes/kubernetes.io~nfs/pv-es-master", ^C^Z
[1]+  Stopped                 strace df -h

This shows the path where it hangs, which can then be unmounted:

umount -lf /var/lib/kubelet/pods/c527ccff-e623-48e1-90be-11930facc11b/volumes/kubernetes.io~nfs/pv-es-master

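I repeated this until df -h no longer hung and then restarted docker, after which nginx could be scheduled onto this node. One container could not be removed; only after I deleted /var/lib/docker/containers/xxxxx and restarted did it disappear from the docker ps -a output.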
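Finding the hung mount points one strace run at a time gets tedious when there are many of them. A rough sketch of automating the search follows. The function names are hypothetical, the 3-second limit is an arbitrary assumption, and it relies on coreutils `timeout` being able to kill the stuck `stat`, which may not hold for every kind of hang.

```shell
# nfs_mount_points [MOUNTS_FILE]  (hypothetical helper)
# Prints the mount point of every nfs/nfs4 entry; reads /proc/mounts by default.
nfs_mount_points() {
  awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# find_hung_mounts [MOUNTS_FILE]  (hypothetical helper)
# Prints NFS mount points where stat fails or does not return within 3 seconds.
find_hung_mounts() {
  nfs_mount_points "$@" | while IFS= read -r mp; do
    timeout -s KILL 3 stat -- "$mp" >/dev/null 2>&1 || printf '%s\n' "$mp"
  done
}

# On a node (not executed here); inspect each path before running umount -lf on it:
#   find_hung_mounts
```

Reading the mount table directly avoids calling df, which is the command that hangs in the first place.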

nfs

With nginx schedulable again, it turned out that pods with a PVC still could not start on this node. After waiting a while, describe showed:

$ kubectl   -n content-dev  get pod content-754c9964bc-8dbxw -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
content-754c9964bc-8dbxw 0/1 ContainerCreating 0 63s <none> work02 <none> <none>

$ kubectl -n content-dev describe pod content-754c9964bc-8dbxw
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 8s kubelet Unable to attach or mount volumes: unmounted volumes=[nfs], unattached volumes=[nfs default-token-vdjpz]: timed out waiting for the condition

Check the PVC used by the pod:

$ kubectl get deploy content -n content-dev -o yaml
...
      volumes:
      - name: nfs
        persistentVolumeClaim:
          claimName: datanfs-pvc
$ kubectl -n content-dev get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
datanfs-pvc Bound pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf 10Gi RWO nfs-client 39d

Check the PV behind this PVC:

$ kubectl   -n content-dev  get pv pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf 10Gi RWO Delete Bound content-dev/datanfs-pvc nfs-client 39d
$ kubectl -n content-dev get pv pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: cluster.local/nfs-client-nfs-client-provisioner
  creationTimestamp: "2023-07-12T05:03:19Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
  resourceVersion: "269296403"
  selfLink: /api/v1/persistentvolumes/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
  uid: add520cb-6506-47b0-b21a-24b1c92e9d56
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: datanfs-pvc
    namespace: content-dev
    resourceVersion: "253312376"
    uid: 9865b525-2cb5-4a2a-b7a7-036ca9f524cf
  nfs:
    path: /volume3/cloudxxx-pre/content-dev-datanfs-pvc-pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
    server: xxx.xx.xx.50
  persistentVolumeReclaimPolicy: Delete
  storageClassName: nfs-client
  volumeMode: Filesystem
status:
  phase: Bound

It uses NFS. On work02, use showmount to check whether this host is allowed to mount the export:

$ showmount -e xxx.xx.xx.50
Export list for xxx.xx.xx.50:
/volume1/secondary *
/volume1/primary *
/volume3/devnfs *
/volume4/xxxxxxxx-demo *
/volume4/xxx2 *
/volume4/xxx *
/volume4/xxxxxxxx *
/volume4/xxxxxxxx-xxx *
/volume4/xxxxxxxx-xxxsoft *
/volume4/k8s-nfs *
/volume3/cicd *
/volume3/xxxxxxxx-pre *
/volume2/web xxx.xxx.xxx.235,xxx.xxx.xxx.211,xxx.xxx.xxx.51,xxx.xxx.xxx.50,xxx.xxx.xxx.55,xxx.xxx.xxx.238
/volume3/VSPHERE-NFS-LUN1 xxx.xx.xx.167,xxx.xx.xx.166,xxx.xx.xx.165,xxx.xx.xx.164,xxx.xx.xx.163,xxx.xx.xx.162,xxx.xx.xx.161,xxx.xx.xx.160
/volume1/文件共享目录 xxx.xx.xx.11

It is allowed, so /etc/exports on the NFS server is configured correctly. Next, look at the mount process started by kubelet:

$ ps aux | grep pvc-9865b525
root 163806 0.0 0.0 123632 1056 ? S Aug18 0:00 /usr/bin/mount -t nfs xxx.xx.xx.50:/volume3/cloudxxx-pre/content-dev-datanfs-pvc-pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf /var/lib/kubelet/pods/cd743cb6-839c-4a15-9203-5321a0ed0666/volumes/kubernetes.io~nfs/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf

Mounting by hand hangs as well:

$ mkdir test1111
$ mount.nfs xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
^C

Check whether the NFS kernel modules are loaded:

$ lsmod |grep nfs
nfsv3 43720 0
nfsd 351321 13
nfs_acl 12837 2 nfsd,nfsv3
auth_rpcgss 59415 2 nfsd,rpcsec_gss_krb5
nfsv4 584056 3
dns_resolver 13140 1 nfsv4
nfs 262045 4 nfsv3,nfsv4
lockd 98048 3 nfs,nfsd,nfsv3
grace 13515 2 nfsd,lockd
fscache 64980 2 nfs,nfsv4
sunrpc 358543 33 nfs,nfsd,rpcsec_gss_krb5,auth_rpcgss,lockd,nfsv3,nfsv4,nfs_acl

They are. From earlier work with NAS devices I knew that different kernels support different NFS versions, so I tried each one:

$ mount.nfs -o vers=3 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
$ umount test1111
$ mount.nfs -o vers=4 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
^C
$ mount.nfs -o vers=4.0 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
$ umount test1111
$ mount.nfs -o vers=4.1 xxx.xx.xx.50:/volume3/cloudxxx-pre/ test1111
^C

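As shown above, a default NFS mount uses the newest version, and on this host only versions 3 and 4.0 currently work, so the mount version has to be added to the PV.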

$ kubectl   -n content-dev  edit pv pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
spec:
  mountOptions:
  - nfsvers=4.0

After that the pod could be created:

$ kubectl   -n content-dev  describe  pod content-754c9964bc-8dbxw
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 4m18s (x4 over 11m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs], unattached volumes=[nfs default-token-vdjpz]: timed out waiting for the condition
Warning FailedMount 2m kubelet Unable to attach or mount volumes: unmounted volumes=[nfs], unattached volumes=[default-token-vdjpz nfs]: timed out waiting for the condition
Warning FailedMount 11s kubelet MountVolume.SetUp failed for volume "pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf" : mount failed: signal: terminated
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/c61fd1a0-cb8f-4cc9-84c9-122a7d24cde6/volumes/kubernetes.io~nfs/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf --scope -- mount -t nfs xxx.xx.xx.50:/volume3/cloudxxx-pre/content-dev-datanfs-pvc-pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf /var/lib/kubelet/pods/c61fd1a0-cb8f-4cc9-84c9-122a7d24cde6/volumes/kubernetes.io~nfs/pvc-9865b525-2cb5-4a2a-b7a7-036ca9f524cf
Output: Running scope as unit run-164051.scope.
Normal Pulled 10s kubelet Container image "xxx.xx.xx.215/content/content:dev-247" already present on machine
Normal Created 10s kubelet Created container spring-boot
Normal Started 10s kubelet Started container spring-boot
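Once the pod is running, it can be worth confirming on the node that the volume really was mounted with the requested version. A minimal sketch, assuming the version in use appears as a `vers=` (or `nfsvers=`) entry in the options column of /proc/mounts; the function name `nfs_mount_versions` is hypothetical.

```shell
# nfs_mount_versions [MOUNTS_FILE]  (hypothetical helper)
# Prints "MOUNTPOINT VERSION" for every nfs/nfs4 entry, or "unknown" when the
# options column carries no vers=/nfsvers= entry. Reads /proc/mounts by default.
nfs_mount_versions() {
  awk '$3 == "nfs" || $3 == "nfs4" {
    ver = "unknown"
    n = split($4, opts, ",")           # options are comma separated
    for (i = 1; i <= n; i++)
      if (opts[i] ~ /^(nfs)?vers=/) {
        ver = opts[i]
        sub(/^(nfs)?vers=/, "", ver)   # keep only the version number
      }
    print $2, ver
  }' "${1:-/proc/mounts}"
}

# On a node (not executed here):
#   nfs_mount_versions | grep pvc-9865b525
```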

Then add mountOptions to the PVs behind all the other existing PVCs as well. The mount processes that are stuck on each node also need to be cleaned up:

ps aux | grep -P 'mount.+nf[s]'

Once you have the PIDs, clean them up with kill -9.
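The grep pattern can also match unrelated command lines that merely contain both words. A slightly stricter filter could look like the sketch below. `stuck_nfs_mount_pids` is a hypothetical name, and it expects `PID ARGS` lines on stdin, such as those printed by `ps -e -o pid= -o args=`.

```shell
# stuck_nfs_mount_pids  (hypothetical helper)
# Reads "PID ARGS..." lines on stdin and prints the PID of every mount.nfs /
# mount.nfs4 helper and of every "mount -t nfs|nfs4" invocation.
stuck_nfs_mount_pids() {
  awk '{
    pid = $1
    exe = $2
    sub(/^.*\//, "", exe)              # /usr/bin/mount -> mount
    if (exe == "mount.nfs" || exe == "mount.nfs4") { print pid; next }
    if (exe == "mount")
      for (i = 3; i < NF; i++)
        if ($i == "-t" && $(i + 1) ~ /^nfs4?$/) { print pid; next }
  }'
}

# On a node (not executed here); review the list before killing anything:
#   ps -e -o pid= -o args= | stuck_nfs_mount_pids
```

This lists every NFS mount process, not only hung ones, so it is best run while no legitimate mounts are in progress.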

Handling PVs created later

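It turned out that nfs-provisioner was in use and every PVC was provisioned from the StorageClass. The existing volumes were fixed by hand above; to keep future ones from being created without mountOptions, the StorageClass needs the same setting: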

$ kubectl get sc nfs-client
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nfs-client cluster.local/nfs-client-nfs-client-provisioner Delete Immediate true 499d

$ kubectl edit sc nfs-client
mountOptions:
- nfsvers=4.0
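Because the existing PVs were edited one by one, it is easy to miss some. The sketch below lists PVs of a given StorageClass whose mountOptions carry no NFS version. The function name and the tab-separated input format are assumptions made for this example, and the `kubectl` commands in the comments are shown for context rather than executed.

```shell
# pvs_missing_nfs_version STORAGE_CLASS  (hypothetical helper)
# Reads "NAME<TAB>STORAGECLASS<TAB>MOUNTOPTIONS" lines on stdin and prints the
# names of PVs in STORAGE_CLASS whose mount options contain no vers=/nfsvers=.
pvs_missing_nfs_version() {
  awk -F '\t' -v sc="$1" '$2 == sc && $3 !~ /vers=/ { print $1 }'
}

# On a machine with cluster access (not executed here):
#   kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.storageClassName}{"\t"}{.spec.mountOptions}{"\n"}{end}' \
#     | pvs_missing_nfs_version nfs-client
# Each reported PV can then be patched; note that a merge patch replaces the whole list:
#   kubectl patch pv PV_NAME --type merge -p '{"spec":{"mountOptions":["nfsvers=4.0"]}}'
```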