zhangguanzhang's Blog

Killing the annoying open /run/xtables.lock: is a directory

Word count: 3.4k · Reading time: 19 min
2021/08/27

Environment

SUSE is a garbage distro: Docker installed from the official docker-static binaries per the docs won't even start on it, so on SUSE we run Docker from an ancient rpm instead.

$ cat /etc/os-release 
NAME="SLES"
VERSION="12-SP5"
VERSION_ID="12.5"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP5"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp5"

$ rpm -qa | grep docker
docker-17.09.1_ce-98.18.1.x86_64
docker-libnetwork-0.7.0.1+gitr2066_7b2b1feb1de4-10.1.x86_64
docker-runc-1.0.0rc4+gitr3338_3f2f8b84a77f-1.3.1.x86_64

$ docker info
Containers: 62
Running: 35
Paused: 0
Stopped: 27
Images: 89
Server Version: 17.09.1-ce
Storage Driver: btrfs
Build Version: Btrfs v3.18.2+20150430
Library Version: 101
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: oci runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 06b9cb35161009dcb7123345749fef02f7cea8e0
runc version: 3f2f8b84a77f73d38244dd690525642a72156c64
init version: v0.1.3_catatonit (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
apparmor
Kernel Version: 4.12.14-120-default
Operating System: SUSE Linux Enterprise Server 12 SP5
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.67GiB
Name: SUSESP5
ID: GZWM:UWPN:INDR:SICH:UD6H:RKPG:WRYK:5YDY:6723:I2HR:UPWI:KI6W
Docker Root Dir: /data/kube/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
reg.xxx.lan:5000
treg.yun.xxx.cn
127.0.0.0/8
Registry Mirrors:
https://registry.docker-cn.com/
https://docker.mirrors.ustc.edu.cn/
Live Restore Enabled: false

WARNING: No swap limit support

The symptom after deployment was that workloads could not resolve domain names. Digging in, the iptables rules for the svc were never generated, and the logs of kube-proxy (deployed as a binary) finally showed the following:

Aug 27 10:41:58 SUSESP5 kube-proxy[1666]: E0827 10:41:58.480578    1666 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:41:58 SUSESP5 kube-proxy[1666]: I0827 10:41:58.480614 1666 proxier.go:876] Sync failed; retrying in 18s
Aug 27 10:42:16 SUSESP5 kube-proxy[1666]: E0827 10:42:16.502207 1666 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:42:16 SUSESP5 kube-proxy[1666]: I0827 10:42:16.502266 1666 proxier.go:876] Sync failed; retrying in 18s
Aug 27 10:42:24 SUSESP5 kube-proxy[1666]: E0827 10:42:24.806734 1666 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:42:24 SUSESP5 kube-proxy[1666]: I0827 10:42:24.806768 1666 proxier.go:876] Sync failed; retrying in 18s

Our ipset container bind-mounts this iptables lock file with -v. As everyone knows, if the src of a -v mount does not exist, Docker creates it with mkdir -p. On a clean machine I confirmed that the iptables command creates the lock file (even when invoked with no options at all):
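The pitfall can be illustrated without Docker. When the source of a bind mount is missing, the daemon effectively does the equivalent of the following (a simplified sketch using a temp path as a stand-in; the real daemon calls Go's MkdirAll):

```shell
# Simulate what dockerd does for `-v /run/xtables.lock:/run/xtables.lock`
# when the source path does not exist yet: it gets created as a DIRECTORY.
src=$(mktemp -d)/xtables.lock      # stand-in for /run/xtables.lock

[ -e "$src" ] || mkdir -p "$src"   # the daemon's MkdirAll on a missing source

# iptables later expects a regular file here and fails with "is a directory"
if [ -d "$src" ]; then
  echo "is a directory"
fi
```

So the race is simply: if the container starts before anything has run iptables on the host, the lock path is materialized as a directory and iptables is broken from then on.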

$ rm -f /run/xtables.lock
$ ll /run/xtables.lock
ls: cannot access '/run/xtables.lock': No such file or directory
$ iptables
iptables v1.8.2 (legacy): no command specified
Try `iptables -h' or 'iptables --help' for more information.
$ ll /run/xtables.lock
-rw------- 1 root root 0 Aug 27 10:27 /run/xtables.lock

As an aside, we do not see this problem on any other OS.

Solution

I had actually added the line below to kube-proxy's systemd unit yesterday, but after today's reboot test the problem still occurred.

ExecStartPre=/bin/bash -c 'test -d /run/xtables.lock && rmdir /run/xtables.lock || true'

Deleting the directory by hand works: since kube-proxy invokes iptables constantly, the lock file is recreated right after the rmdir. But doing this manually on every boot is not an option.

$ rmdir /run/xtables.lock
$ ll /run/xtables.lock
ls: cannot access '/run/xtables.lock': No such file or directory
$ ll /run/xtables.lock
-rw------- 1 root root 0 Aug 27 10:43 /run/xtables.lock

Our SUSE install already ships a systemd drop-in for docker; the existing ExecStartPre in it works around a different issue (guess why yourself). So my first idea was to add another ExecStartPre:

$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
Restart=always
RestartSec=10
$ vi /etc/systemd/system/docker.service.d/10.docker.conf
$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'test -d /run/xtables.lock && rmdir /run/xtables.lock || true'
ExecStartPre=/bin/bash -c 'iptables -V &>/dev/null || true'
Restart=always
RestartSec=10

$ reboot

After rebooting, boot behaved exactly the same...

Aug 27 10:47:31 SUSESP5 kube-proxy[1680]: I0827 10:47:31.265099    1680 shared_informer.go:247] Caches are synced for node config
Aug 27 10:47:31 SUSESP5 kube-proxy[1680]: I0827 10:47:31.865070 1680 shared_informer.go:247] Caches are synced for service config
Aug 27 10:47:31 SUSESP5 kube-proxy[1680]: E0827 10:47:31.893976 1680 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:47:31 SUSESP5 kube-proxy[1680]: I0827 10:47:31.894015 1680 proxier.go:876] Sync failed; retrying in 18s
Aug 27 10:47:32 SUSESP5 kube-proxy[1680]: E0827 10:47:32.632956 1680 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:47:32 SUSESP5 kube-proxy[1680]: I0827 10:47:32.632986 1680 proxier.go:876] Sync failed; retrying in 18s
Aug 27 10:47:39 SUSESP5 kube-proxy[1680]: E0827 10:47:39.109972 1680 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:47:39 SUSESP5 kube-proxy[1680]: I0827 10:47:39.110010 1680 proxier.go:876] Sync failed; retrying in 18s
Aug 27 10:47:39 SUSESP5 kube-proxy[1680]: E0827 10:47:39.128309 1680 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:47:39 SUSESP5 kube-proxy[1680]: I0827 10:47:39.128340 1680 proxier.go:876] Sync failed; retrying in 18s
Aug 27 10:47:40 SUSESP5 kube-proxy[1680]: E0827 10:47:40.150827 1680 proxier.go:1595] Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: is a directory
Aug 27 10:47:40 SUSESP5 kube-proxy[1680]: I0827 10:47:40.150873 1680 proxier.go:876] Sync failed; retrying in 18s

I confirmed the drop-in had taken effect:

$ systemctl cat docker
# /etc/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target containerd.socket containerd.service lvm2-monitor.service SuSEfirewall2.service
Requires=containerd.socket containerd.service

[Service]
EnvironmentFile=/etc/sysconfig/docker

# While Docker has support for socket activation (-H fd://), this is not
# enabled by default because enabling socket activation means that on boot your
# containers won't start until someone tries to administer the Docker daemon.
Type=notify
ExecStart=/usr/bin/dockerd --containerd /run/containerd/containerd.sock --add-runtime oci=/usr/sbin/docker-runc $DOCKER_NETWORK_OPTIONS $DOCKER_OPTS
ExecReload=/bin/kill -s HUP $MAINPID

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this property.
TasksMax=infinity

# Set delegate yes so that systemd does not reset the cgroups of docker containers
# Only systemd 218 and above support this property.
Delegate=yes

# This is not necessary because of how we set up containerd.
#KillMode=process

[Install]
WantedBy=multi-user.target


# /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'test -d /run/xtables.lock && rmdir /run/xtables.lock || true'
ExecStartPre=/bin/bash -c 'iptables -V &>/dev/null || true'
Restart=always
RestartSec=10

Checking the status confirms the commands did run.

$ systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/etc/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10.docker.conf
Active: active (running) since Fri 2021-08-27 10:47:31 CST; 50s ago
Docs: http://docs.docker.com
Process: 1685 ExecStartPre=/bin/bash -c iptables -V &>/dev/null || true (code=exited, status=0/SUCCESS)
Process: 1681 ExecStartPre=/bin/bash -c test -d /run/xtables.lock && rmdir /run/xtables.lock || true (code=exited, status=0/SUCCESS)
Process: 1675 ExecStartPre=/bin/bash -c test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true (code=exited, status=0/SUCCESS)
Main PID: 1740 (dockerd)
Tasks: 81
Memory: 318.9M
CPU: 11.734s
CGroup: /system.slice/docker.service
└─1740 /usr/bin/dockerd --containerd /run/containerd/containerd.sock --add-runtime oci=/usr/sbin/docker-runc

At first I suspected an ordering problem, but testing showed the order was correct.

$ vi /etc/systemd/system/docker.service.d/10.docker.conf
$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'test -d /run/xtables.lock && rmdir /run/xtables.lock || true'
ExecStartPre=/bin/bash -c 'iptables -V &>/dev/null || true'
ExecStartPre=/bin/bash -c 'echo 1 >> /root/test.file'
ExecStartPre=/bin/bash -c 'echo 2 >> /root/test.file'
ExecStartPre=/bin/bash -c 'echo 3 >> /root/test.file'
ExecStartPre=/bin/bash -c 'echo 4 >> /root/test.file'
ExecStartPre=/bin/bash -c 'echo 5 >> /root/test.file'
Restart=always
RestartSec=10

$ reboot

$ cat /root/test.file
1
2
3
4
5
$ systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/etc/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10.docker.conf
Active: active (running) since Fri 2021-08-27 11:03:18 CST; 28s ago
Docs: http://docs.docker.com
Process: 1777 ExecStartPre=/bin/bash -c echo 5 >> /root/test.file (code=exited, status=0/SUCCESS)
Process: 1769 ExecStartPre=/bin/bash -c echo 4 >> /root/test.file (code=exited, status=0/SUCCESS)
Process: 1766 ExecStartPre=/bin/bash -c echo 3 >> /root/test.file (code=exited, status=0/SUCCESS)
Process: 1758 ExecStartPre=/bin/bash -c echo 2 >> /root/test.file (code=exited, status=0/SUCCESS)
Process: 1750 ExecStartPre=/bin/bash -c echo 1 >> /root/test.file (code=exited, status=0/SUCCESS)
Process: 1684 ExecStartPre=/bin/bash -c iptables -V &>/dev/null || true (code=exited, status=0/SUCCESS)
Process: 1681 ExecStartPre=/bin/bash -c test -d /run/xtables.lock && rmdir /run/xtables.lock || true (code=exited, status=0/SUCCESS)
Process: 1676 ExecStartPre=/bin/bash -c test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true (code=exited, status=0/SUCCESS)
Main PID: 1783 (dockerd)
Tasks: 77
Memory: 304.5M
CPU: 10.183s
CGroup: /system.slice/docker.service
└─1783 /usr/bin/dockerd --containerd /run/containerd/containerd.sock --add-runtime oci=/usr/sbin/docker-runc


$ ll /run/xtables.lock
total 0

Remove the output redirection on iptables to see whether it actually ran:

$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'rmdir /run/xtables.lock &>/dev/null && iptables -V || true'
Restart=always
RestartSec=10

$ reboot

It did run:


$ journalctl -xe -u docker |& grep iptables
Aug 27 11:08:20 SUSESP5 bash[1705]: iptables v1.4.21

This was getting absurd. I then tried the audit subsystem, but it turned up nothing useful, so the digging continued.

echo '-w /run/xtables.lock -prw ' > /etc/audit/rules.d/zgz.rules

Add a stat to take a look:

$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'stat /run/xtables.lock'
ExecStartPre=/bin/bash -c 'rmdir /run/xtables.lock &>/dev/null && iptables -V || true'
Restart=always
RestartSec=10

$ reboot

$ journalctl -xe -u docker |& grep /run/xta
Aug 27 11:08:25 SUSESP5 dockerd[1765]: time="2021-08-27T11:08:25.564282010+08:00" level=error msg="Failed to start container e74f20c98617fa60b13981513b434cdb3139f2e3b406ec521b48ec81958cebe0: oci runtime error: container_linux.go:265: starting container process caused \"process_linux.go:368: container init caused \\"rootfs_linux.go:57: mounting \\\\"/run/xtables.lock\\\\" to rootfs \\\\"/data/kube/docker/btrfs/subvolumes/7035bbf1baba2ef7cee552cd2fdda6841a22d4bdb62bd66b81f4b62c78b65735\\\\" at \\\\"/data/kube/docker/btrfs/subvolumes/7035bbf1baba2ef7cee552cd2fdda6841a22d4bdb62bd66b81f4b62c78b65735/run/xtables.lock\\\\" caused \\\\"not a directory\\\\"\\"\"
Aug 27 11:11:32 SUSESP5 dockerd[1679]: time="2021-08-27T11:11:32.556235556+08:00" level=error msg="Failed to start container e74f20c98617fa60b13981513b434cdb3139f2e3b406ec521b48ec81958cebe0: oci runtime error: container_linux.go:265: starting container process caused \"process_linux.go:368: container init caused \\"rootfs_linux.go:57: mounting \\\\"/run/xtables.lock\\\\" to rootfs \\\\"/data/kube/docker/btrfs/subvolumes/7035bbf1baba2ef7cee552cd2fdda6841a22d4bdb62bd66b81f4b62c78b65735\\\\" at \\\\"/data/kube/docker/btrfs/subvolumes/7035bbf1baba2ef7cee552cd2fdda6841a22d4bdb62bd66b81f4b62c78b65735/run/xtables.lock\\\\" caused \\\\"not a directory\\\\"\\"\"
Aug 27 11:33:03 SUSESP5 dockerd[1705]: time="2021-08-27T11:33:03.048807247+08:00" level=error msg="Failed to start container e74f20c98617fa60b13981513b434cdb3139f2e3b406ec521b48ec81958cebe0: oci runtime error: container_linux.go:265: starting container process caused \"process_linux.go:368: container init caused \\"rootfs_linux.go:57: mounting \\\\"/run/xtables.lock\\\\" to rootfs \\\\"/data/kube/docker/btrfs/subvolumes/7035bbf1baba2ef7cee552cd2fdda6841a22d4bdb62bd66b81f4b62c78b65735\\\\" at \\\\"/data/kube/docker/btrfs/subvolumes/7035bbf1baba2ef7cee552cd2fdda6841a22d4bdb62bd66b81f4b62c78b65735/run/xtables.lock\\\\" caused \\\\"not a directory\\\\"\\"\"
Aug 27 11:36:28 SUSESP5 bash[1684]: stat: cannot stat '/run/xtables.lock': No such file or directory
Aug 27 11:36:38 SUSESP5 bash[2265]: stat: cannot stat '/run/xtables.lock': No such file or directory
Aug 27 11:36:48 SUSESP5 bash[2272]: stat: cannot stat '/run/xtables.lock': No such file or directory
Aug 27 11:36:59 SUSESP5 bash[2295]: stat: cannot stat '/run/xtables.lock': No such file or directory
Aug 27 11:37:09 SUSESP5 bash[2327]: stat: cannot stat '/run/xtables.lock': No such file or directory
Aug 27 11:37:19 SUSESP5 bash[2356]: stat: cannot stat '/run/xtables.lock': No such file or directory
Aug 27 11:37:29 SUSESP5 bash[2385]: stat: cannot stat '/run/xtables.lock': No such file or directory
Aug 27 11:37:40 SUSESP5 bash[2475]: stat: cannot stat '/run/xtables.lock': No such file or directory

The logs show that /run/xtables.lock really does not exist before the rmdir runs, so chaining iptables behind the rmdir is pointless; just run iptables unconditionally, like this:

$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'rmdir /run/xtables.lock &>/dev/null || true; iptables -V 1>/dev/null'
Restart=always
RestartSec=10
$ reboot

It still ended up as a directory. My guess was that on SUSE, iptables does not create the lock file in some situations, so just append a touch at the end:

$ stat /run/xtables.lock
File: '/run/xtables.lock'
Size: 40 Blocks: 0 IO Block: 4096 directory
Device: 16h/22d Inode: 14292 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-08-27 11:41:08.309257362 +0800
Modify: 2021-08-27 11:41:08.309257362 +0800
Change: 2021-08-27 11:41:08.309257362 +0800
Birth: -

$ vi /etc/systemd/system/docker.service.d/10.docker.conf
$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'rmdir /run/xtables.lock &>/dev/null || true; iptables -V 1>/dev/null; touch -m 0600 /run/xtables.lock'
Restart=always
RestartSec=10

$ reboot

That worked, but the permissions from touch did not take effect (touch has no mode option: -m only updates the modification time, so the 0600 above was parsed as another filename, and the lock file got default umask permissions):

$ stat /run/xtables.lock 
File: '/run/xtables.lock'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 16h/22d Inode: 12019 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-08-27 11:50:30.056416389 +0800
Modify: 2021-08-27 11:50:30.056416389 +0800
Change: 2021-08-27 11:50:30.056416389 +0800
Birth: -

The final fix: add a chmod:

$ cat /etc/systemd/system/docker.service.d/10.docker.conf
[Service]
ExecStartPre=/bin/bash -c 'test -d /var/run/docker.sock && rmdir /var/run/docker.sock || true'
ExecStartPre=/bin/bash -c 'rmdir /run/xtables.lock &>/dev/null || true; iptables -V 1>/dev/null; touch /run/xtables.lock'
ExecStartPre=/bin/bash -c 'chmod 0600 /run/xtables.lock'
Restart=always
RestartSec=10

$ reboot

$ stat /run/xtables.lock
File: '/run/xtables.lock'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 16h/22d Inode: 26741 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-08-27 12:18:53.097143334 +0800
Modify: 2021-08-27 12:18:53.097143334 +0800
Change: 2021-08-27 12:18:53.101143334 +0800
Birth: -
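As a side note, the touch + chmod pair can be collapsed into a single step with install(1), which creates the target file with the requested mode in one command. A sketch, demonstrated on a temp path instead of /run/xtables.lock so it runs unprivileged:

```shell
# Alternative to `touch` + `chmod`: install(1) creates the file and sets
# the mode in one step. Here `$lock` is a stand-in for /run/xtables.lock.
lock=$(mktemp -u)                   # reserve an unused temp path
install -m 0600 /dev/null "$lock"   # create an empty file with mode 0600

stat -c '%a' "$lock"                # prints: 600
```

In the drop-in that would be a single `ExecStartPre=/bin/bash -c 'rmdir /run/xtables.lock &>/dev/null || true; install -m 0600 /dev/null /run/xtables.lock'` line (untested on this box; the two-line version above is what I actually verified).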

In theory ConditionPathIsDirectory=!/run/xtables.lock could also be used, but I am not going to try it. The main point of this post is to share the troubleshooting process.
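Another declarative option (an assumption on my part, not tested here) is a systemd-tmpfiles entry, which pre-creates the lock as a regular file with the right mode early in boot, before docker or kube-proxy start:

```
# /etc/tmpfiles.d/xtables.conf  (hypothetical file name)
# "f" creates a regular file if the path is missing;
# systemd-tmpfiles-setup.service runs early at boot
f /run/xtables.lock 0600 root root -
```

This sidesteps the ExecStartPre juggling entirely, at the cost of depending on systemd-tmpfiles being enabled on the host.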

2021-08-30

The reason SUSE never created the lock file is probably that the lock file was only introduced in some iptables v1.4.x release; the iptables shipped on SUSE 12 SP5 and CentOS 7.8 is v1.4.21. Testing shows that on this version iptables -V does not create the lock file but iptables -S does, whereas iptables v1.8 creates the lock file whenever it is missing, no matter which options you pass.

CATALOG
  1. Environment
  2. Solution
  3. 2021-08-30