Background
In the early hours of 06/02 I was woken up to help look at a problem: after the customer rebooted some of the k8s node machines, some of the business's API endpoints started failing. The environment could not be reached with Sunflower (向日葵) or similar remote-desktop tools, so all we could do was send commands for the on-site staff to execute.
Symptoms
The business pod's logs showed it could not connect to port 3306 of a MySQL instance on a non-k8s machine. `docker info` printed no warnings, which means the following kernel parameters were all fine:
```
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.ipv4.ip_forward = 1
```
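For reference, these are also quick to confirm by hand; a minimal check, assuming the br_netfilter module is loaded so the net.bridge.* keys exist:

```bash
# The four parameters docker info warns about; each should print "= 1"
sysctl net.bridge.bridge-nf-call-ip6tables \
       net.bridge.bridge-nf-call-iptables \
       net.bridge.bridge-nf-call-arptables \
       net.ipv4.ip_forward
```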
Checking `iptables -S` showed no high-priority DROP-style rules, and the default policy of the FORWARD chain was not DROP either. Since cni plugins are in use, the pods' containers all hang off cni0, so we can use `-I` to insert a high-priority accept rule:
```
iptables --wait -I INPUT -i cni0 -j ACCEPT
```
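If a filter rule had been eating the traffic, the counters on this inserted rule would climb as the application retried; a quick way to confirm the rule landed at the top and see whether it matches anything (a sketch):

```bash
# Show the first INPUT rules with their packet/byte counters
iptables -vnL INPUT -w | head -n 4
```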
Had the on-site staff insert it; still no luck. Next I asked them to look for security software: grepping the `ps aux` output for the usual `xxx` keywords turned up a `titanagent` process, which a quick search identified as Qingteng Cloud Security (青藤云安全). I had run into security software intercepting pods' network requests before, but nobody from Qingteng could be reached in the middle of the night, so we temporarily drained the affected nodes.
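For completeness, draining looked roughly like this; the node name is a placeholder:

```bash
# Evict workloads so they reschedule onto healthy nodes
# (--delete-emptydir-data is --delete-local-data on older kubectl versions)
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
```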
Investigation
On 06/03 the customer had someone uninstall Qingteng, but the problem persisted. They provided remote access, and I logged in to look myself.
Fault behavior
A quick check showed that containers on the rebooted nodes could not reach any IP or port outside their own host; even pinging another k8s host failed. Capturing with tcpdump showed the host was not forwarding the packets. Starting a container on the default docker bridge network failed the same way:
```
$ docker run --rm -ti --entrypoint bash xxx telnet xxx 3306
```
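To see where the packets die, one can listen on the bridge and the uplink at the same time; a sketch, assuming the default interface names docker0 and eth0 (adjust to the host):

```bash
# In one terminal: the container's SYNs are visible on the bridge
tcpdump -ni docker0 'tcp port 3306'
# In another: nothing ever leaves the uplink, i.e. the host receives
# the packets and then refuses to forward them
tcpdump -ni eth0 'tcp port 3306'
```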
Then looked at the iptables state:
```
$ iptables -vnL -w | grep -A10 FORWARD
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target                    prot opt in      out      source          destination
    0     0 KUBE-FORWARD              all  --  *       *        0.0.0.0/0       0.0.0.0/0       /* kubernetes forwarding rules */
    0     0 KUBE-SERVICES             all  --  *       *        0.0.0.0/0       0.0.0.0/0       ctstate NEW /* kubernetes service portals */
    0     0 DOCKER-USER               all  --  *       *        0.0.0.0/0       0.0.0.0/0
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  *       *        0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  *       docker0  0.0.0.0/0       0.0.0.0/0       ctstate RELATED,ESTABLISHED
    0     0 DOCKER                    all  --  *       docker0  0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  docker0 !docker0 0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  docker0 docker0  0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  *       *        0.0.0.0/0       0.0.0.0/0
    0     0 ACCEPT                    all  --  *       *        172.27.0.0/16   0.0.0.0/0
    0     0 ACCEPT                    all  --  *       *        0.0.0.0/0       172.27.0.0/16
```
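One caveat when reading those counters: zero can also just mean they were reset recently. To confirm that traffic really never traverses FORWARD, zero the counters and reproduce; a quick sketch:

```bash
# Reset the FORWARD counters, re-run the failing telnet/ping, then re-check
iptables -Z FORWARD -w
iptables -vnL FORWARD -w | head -n 6
```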
Sure enough, nothing was being forwarded; looking closely, every counter was zero, so nothing at all was being forwarded. I wondered whether the customer had changed some forwarding-related kernel parameter in /etc/sysctl.conf that `docker info` does not check, which only took effect after the reboot.
```
cp /etc/sysctl.conf /etc/sysctl.conf.bak
vi /etc/sysctl.conf
```
The file contained a lot of parameters, so I handled it crudely: back up the file, then bisect. Comment out from the first line to the midpoint, reboot, re-test with the docker command above, and repeat. It eventually narrowed down to this parameter:
```
net.ipv4.conf.default.forwarding=0
```
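Each bisection round can be scripted rather than edited by hand; a rough sketch, where the midpoint line number 60 is an arbitrary placeholder:

```bash
# Comment out the first half of the file, then reboot and re-run the test;
# restore from the .bak copy between rounds
sed -i '1,60s/^/#/' /etc/sysctl.conf
reboot
```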
Testing net.ipv4.conf.default.forwarding
The problem did not actually end there. I found that if the machine boots with this set to 0 and a container is then started, setting it back to 1 still does not help. So I took a clean environment and reproduced it with the following steps:
```
sed -ri '/net.ipv4.conf.default.forwarding/s#1#0#' /etc/sysctl.conf
reboot

docker run --rm -tid --name test --entrypoint bash nicolaka/netshoot
timeout 3 docker exec test ping -c 1 114.114.114.114

sysctl -w net.ipv4.conf.default.forwarding=1
timeout 3 docker exec test ping -c 1 114.114.114.114
```
Both pings fail. Then check the parameters:
```
$ sysctl -a |& grep forwarding
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.conf.docker0.forwarding = 0
net.ipv4.conf.docker0.mc_forwarding = 0
net.ipv4.conf.eth0.forwarding = 1
net.ipv4.conf.eth0.mc_forwarding = 0
net.ipv4.conf.lo.forwarding = 1
net.ipv4.conf.lo.mc_forwarding = 0
net.ipv4.conf.veth10d8150.forwarding = 0
net.ipv4.conf.veth10d8150.mc_forwarding = 0
net.ipv6.conf.all.forwarding = 0
net.ipv6.conf.all.mc_forwarding = 0
net.ipv6.conf.default.forwarding = 0
net.ipv6.conf.default.mc_forwarding = 0
net.ipv6.conf.docker0.forwarding = 0
net.ipv6.conf.docker0.mc_forwarding = 0
net.ipv6.conf.eth0.forwarding = 0
net.ipv6.conf.eth0.mc_forwarding = 0
net.ipv6.conf.lo.forwarding = 0
net.ipv6.conf.lo.mc_forwarding = 0
net.ipv6.conf.veth10d8150.forwarding = 0
net.ipv6.conf.veth10d8150.mc_forwarding = 0
```
This is the key: `net.ipv4.conf.default.*` only seeds interfaces created after the value is set, so docker0 and the container's veth, both created while it was 0, keep forwarding disabled even after default is flipped back to 1. Check with the bridge tool; only docker is running here, with the single container from above:
```
$ brctl show
bridge name     bridge id           STP enabled     interfaces
docker0         8000.0242b6e42b7f   no              veth10d8150
```
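brctl comes from bridge-utils, which newer distros often no longer ship; the iproute2 equivalent, as a sketch:

```bash
# List all bridge ports, then just docker0's
bridge link
ip link show master docker0
```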
Enable forwarding on docker0, the bridge the container's NIC hangs off, and try again:
```
$ sysctl -w net.ipv4.conf.docker0.forwarding=1
$ timeout 3 docker exec test ping -c 1 114.114.114.114
PING 114.114.114.114 (114.114.114.114) 56(84) bytes of data.
64 bytes from 114.114.114.114: icmp_seq=1 ttl=67 time=17.5 ms

--- 114.114.114.114 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 17.500/17.500/17.500/0.000 ms
```
So if the machine was booted with this switch off, you have to set both the master switch and every relevant bridge:
```
net.ipv4.conf.default.forwarding=1
net.ipv4.conf.docker0.forwarding=1
net.ipv4.conf.cni0.forwarding=1
```
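Applied live, that is just three `sysctl -w` calls; a repair sketch for an affected node. Note that the per-bridge keys are only needed for bridges that already exist; interfaces created after default is set to 1 inherit it:

```bash
# Fix the "default" template plus the bridges created while it was 0
sysctl -w net.ipv4.conf.default.forwarding=1
sysctl -w net.ipv4.conf.docker0.forwarding=1
sysctl -w net.ipv4.conf.cni0.forwarding=1   # only where the cni0 bridge exists
```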
Summary
I do not know why `docker info` does not check this parameter: it warns about `net.ipv4.ip_forward` and the `bridge-nf-call-*` keys, yet `net.ipv4.conf.default.forwarding` can silently break forwarding for every interface created after boot.
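Until it does, a belt-and-braces check after node reboots is easy to script; a minimal sketch:

```bash
# Warn if any forwarding-related master switch is off
for k in net.ipv4.ip_forward \
         net.ipv4.conf.all.forwarding \
         net.ipv4.conf.default.forwarding; do
  v=$(sysctl -n "$k")
  [ "$v" = "1" ] || echo "WARN: $k = $v"
done
```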