zhangguanzhang's Blog

docker info shows no warnings and the iptables rules look fine, but the host just won't forward packets

2022/06/03

Background

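In the early hours of 06/02 I was woken up to help with an incident: after the customer rebooted some of their k8s node machines, some of the application's API endpoints started failing. The environment could not be reached with remote-access tools such as Sunlogin (向日葵), so I could only send commands and have the on-site staff run them.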

Symptoms

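The business pod logs showed that the pods could not connect to port 3306 of a MySQL instance running on a non-k8s machine. The docker info command printed no warnings, which should mean the following kernel parameters were fine: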

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.ipv4.ip_forward = 1
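
To confirm these four values directly rather than inferring them from docker info, a small helper could print them straight from /proc/sys. This is only a minimal sketch: the function name check_forward_sysctls and its optional root-directory argument are made up for illustration and are not part of any tool.

```shell
# check_forward_sysctls [ROOT]
# Prints "name = value" for each of the four parameters listed above.
# ROOT defaults to /proc/sys; the argument is a hypothetical convenience
# so the function can be pointed at a copy of the tree.
check_forward_sysctls() {
    root="${1:-/proc/sys}"
    for key in \
        net.bridge.bridge-nf-call-ip6tables \
        net.bridge.bridge-nf-call-iptables \
        net.bridge.bridge-nf-call-arptables \
        net.ipv4.ip_forward
    do
        # A sysctl name maps to a path by replacing dots with slashes.
        path="$root/$(printf '%s' "$key" | tr . /)"
        if [ -r "$path" ]; then
            printf '%s = %s\n' "$key" "$(cat "$path")"
        else
            # For the bridge keys this usually means br_netfilter is not loaded.
            printf '%s = missing\n' "$key"
        fi
    done
}
```

Running check_forward_sysctls with no argument on a node prints one line per parameter, which is easy for on-site staff to read back.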

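iptables -S showed no high-priority DROP-style rules, and the default policy of the FORWARD chain was not DROP either. Because CNI plugins are in use, the pod containers are all attached to the cni0 bridge, so a rule matching -i cni0 can be inserted with -I to accept that traffic first: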

iptables --wait -I INPUT -i cni0 -j ACCEPT

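The on-site staff inserted the rule and it still did not work. I then asked them to look for security software; running ps aux | grep xxx with a few keywords turned up a process named titanagent, which a quick search identified as Qingteng Cloud Security (青藤云安全). I had run into security software blocking pod network requests before, but nobody from Qingteng could be reached in the middle of the night, so we drained the affected nodes for the time being.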

Investigation

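On 06/03 the customer had someone uninstall Qingteng, but the problem persisted. They then provided remote access, and I logged in to take a look.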

Fault symptoms

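I found that containers on the rebooted nodes could not reach any IP or port other than those of their own host; even pinging another k8s host failed. Capturing packets with tcpdump showed that the host was not forwarding them. Starting a container on Docker's default bridge network failed in the same way: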

$ docker run --rm -ti --entrypoint bash xxx telnet xxx 3306

Then I looked at the iptables state:

$ iptables -vnL -w | grep -A10 FORWARD
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target                    prot opt in       out       source          destination
    0     0 KUBE-FORWARD              all  --  *        *         0.0.0.0/0       0.0.0.0/0       /* kubernetes forwarding rules */
    0     0 KUBE-SERVICES             all  --  *        *         0.0.0.0/0       0.0.0.0/0       ctstate NEW /* kubernetes service portals */
    0     0 DOCKER-USER               all  --  *        *         0.0.0.0/0       0.0.0.0/0
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  *        *         0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  *        docker0   0.0.0.0/0       0.0.0.0/0       ctstate RELATED,ESTABLISHED
    0     0 DOCKER                    all  --  *        docker0   0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  docker0  !docker0  0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  docker0  docker0   0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  *        *         0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  *        *         172.27.0.0/16   0.0.0.0/0
    0     0 ACCEPT                    all  --  *        *         0.0.0.0/0       172.27.0.0/16
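
Whether anything at all is being counted in the FORWARD chain can also be checked mechanically instead of by eye. The sketch below sums the pkts column of iptables listing output read from stdin. The helper name forward_pkts_total is hypothetical, and it assumes exact counters (the -x flag of iptables), because abbreviated values such as 12K are simply skipped.

```shell
# forward_pkts_total: read `iptables -vnxL FORWARD` output on stdin and
# print the sum of the pkts column over all rule lines.
forward_pkts_total() {
    awk '
        /^Chain / { next }                  # chain header line
        $1 == "pkts" { next }               # column header line
        $1 ~ /^[0-9]+$/ { total += $1 }     # rule line with an exact counter
        END { print total + 0 }             # "+ 0" prints 0 when no rule matched
    '
}
```

On a node it could be used as iptables -vnxL FORWARD -w | forward_pkts_total. A result of 0 while containers are actively trying to reach the outside suggests the packets never reach the chain, which matches the listing above.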

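Sure enough, nothing was being forwarded; looking closely, every counter was zero. I wondered whether the customer had changed some forwarding-related kernel parameter in /etc/sysctl.conf that docker info does not check, so that forwarding stopped working after the reboot.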

cp /etc/sysctl.conf /etc/sysctl.conf.bak
vi /etc/sysctl.conf

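There were a lot of parameters, so I took the brute-force route: back up the file, then bisect. I commented out everything from the first line to the middle, rebooted, and tested with the docker command above, repeating until I narrowed it down to this parameter: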

net.ipv4.conf.default.forwarding=0
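
The bisection described above was done by hand in vi. One round of it could be sketched as follows. The helper name comment_range is hypothetical; it only prints the modified file to stdout, so nothing is changed until the output is redirected somewhere.

```shell
# comment_range FILE START END
# Print FILE with lines START..END prefixed by "#".
# Blank lines and lines that are already comments are left untouched.
comment_range() {
    awk -v s="$2" -v e="$3" '
        NR >= s && NR <= e && $0 !~ /^[ \t]*#/ && $0 !~ /^[ \t]*$/ {
            print "#" $0
            next
        }
        { print }
    ' "$1"
}
```

For example, comment_range /etc/sysctl.conf.bak 1 20 > /etc/sysctl.conf would disable the first 20 lines of the backup copy, after which the node is rebooted and the docker test repeated. If forwarding works, the culprit is in lines 1 to 20; otherwise it is in the rest, and the range is halved again.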

Testing net.ipv4.conf.default.forwarding

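That was not the end of it, though. I found that if the parameter is 0 at boot and a container is then started, setting it back to 1 still does not help. So I reproduced the problem on a clean environment with the following steps: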

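# Set the default to 0 in /etc/sysctl.conf (assumes the line already exists with value 1), then reboot
sed -ri '/net.ipv4.conf.default.forwarding/s#1#0#' /etc/sysctl.conf
reboot

# After the reboot: start a test container and ping out (fails)
docker run --rm -tid --name test --entrypoint bash nicolaka/netshoot
timeout 3 docker exec test ping -c 1 114.114.114.114

# Set the default back to 1 at runtime and retry (still fails)
sysctl -w net.ipv4.conf.default.forwarding=1
timeout 3 docker exec test ping -c 1 114.114.114.114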

Then look at the parameters:

$ sysctl -a |& grep forwarding 
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.conf.docker0.forwarding = 0
net.ipv4.conf.docker0.mc_forwarding = 0
net.ipv4.conf.eth0.forwarding = 1
net.ipv4.conf.eth0.mc_forwarding = 0
net.ipv4.conf.lo.forwarding = 1
net.ipv4.conf.lo.mc_forwarding = 0
net.ipv4.conf.veth10d8150.forwarding = 0
net.ipv4.conf.veth10d8150.mc_forwarding = 0
net.ipv6.conf.all.forwarding = 0
net.ipv6.conf.all.mc_forwarding = 0
net.ipv6.conf.default.forwarding = 0
net.ipv6.conf.default.mc_forwarding = 0
net.ipv6.conf.docker0.forwarding = 0
net.ipv6.conf.docker0.mc_forwarding = 0
net.ipv6.conf.eth0.forwarding = 0
net.ipv6.conf.eth0.mc_forwarding = 0
net.ipv6.conf.lo.forwarding = 0
net.ipv6.conf.lo.mc_forwarding = 0
net.ipv6.conf.veth10d8150.forwarding = 0
net.ipv6.conf.veth10d8150.mc_forwarding = 0
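
In a long listing like this, the interfaces that still have IPv4 forwarding off are easy to miss. As a rough sketch, the output can be filtered down to just those interface names; non_forwarding_ifaces is a made-up helper name, and it expects the "key = value" format that sysctl -a prints.

```shell
# non_forwarding_ifaces: read `sysctl -a` output on stdin and print the
# name of every net.ipv4.conf.<iface> entry whose forwarding value is 0.
# mc_forwarding and IPv6 entries are ignored.
non_forwarding_ifaces() {
    awk -F' = ' '
        /^net\.ipv4\.conf\..*\.forwarding = / && $2 == "0" {
            name = $1
            sub(/^net\.ipv4\.conf\./, "", name)   # strip the key prefix
            sub(/\.forwarding$/, "", name)        # strip the key suffix
            print name
        }
    '
}
```

With the listing above as input (sysctl -a 2>/dev/null | non_forwarding_ifaces), this would print docker0 and veth10d8150, the bridge and the container's veth.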

Check with the bridge tool; only Docker is installed here, with the single container from above:

$ brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242b6e42b7f       no              veth10d8150

Enable forwarding on docker0, the bridge that the container's interface is attached to, and try again:

$ sysctl -w net.ipv4.conf.docker0.forwarding=1
$ timeout 3 docker exec test ping -c 1 114.114.114.114
PING 114.114.114.114 (114.114.114.114) 56(84) bytes of data.
64 bytes from 114.114.114.114: icmp_seq=1 ttl=67 time=17.5 ms

--- 114.114.114.114 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 17.500/17.500/17.500/0.000 ms

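As the sysctl output above shows, changing the default at runtime did not touch docker0 or the veth, which had already been created with forwarding off. So if the machine booted with the parameter disabled, both the default and the relevant bridges need to be set: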

net.ipv4.conf.default.forwarding=1
net.ipv4.conf.docker0.forwarding=1
net.ipv4.conf.cni0.forwarding=1
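
To apply the same settings on a running node without a reboot, the values can be written through /proc/sys, which is what sysctl -w does under the hood. The following is a sketch under stated assumptions: enable_iface_forwarding is a hypothetical helper, the root directory is passed explicitly (normally /proc/sys), and interfaces that do not exist on the node are skipped with a message.

```shell
# enable_iface_forwarding ROOT IFACE...
# Write 1 to ROOT/net/ipv4/conf/<IFACE>/forwarding for each IFACE given.
# Must run as root when ROOT is /proc/sys.
enable_iface_forwarding() {
    root="$1"
    shift
    for iface in "$@"; do
        f="$root/net/ipv4/conf/$iface/forwarding"
        if [ -w "$f" ]; then
            echo 1 > "$f"
        else
            # e.g. cni0 on a Docker-only host, or not running as root
            echo "skip $iface: $f is not writable" >&2
        fi
    done
}
```

For the nodes in this incident the call would be enable_iface_forwarding /proc/sys default docker0 cni0. The /etc/sysctl.conf entries above are still needed so that the values persist across the next reboot.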

Summary

I don't know why docker info does not check this parameter.
