<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Zhangguanzhang</title>
  
  <subtitle>Standing on the shoulders of giants</subtitle>
  <link href="http://zhangguanzhang.github.io/atom.xml" rel="self"/>
  
  <link href="http://zhangguanzhang.github.io/"/>
  <updated>2026-04-09T09:10:30.000Z</updated>
  <id>http://zhangguanzhang.github.io/</id>
  
  <author>
    <name>Zhangguanzhang</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>sudo gets killed inside an openEuler 22.03 container</title>
    <link href="http://zhangguanzhang.github.io/2026/04/09/openeuler-sudo-killed/"/>
    <id>http://zhangguanzhang.github.io/2026/04/09/openeuler-sudo-killed/</id>
    <published>2026-04-09T09:10:30.000Z</published>
    <updated>2026-04-09T09:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>Notes from troubleshooting a case where sudo could not be used inside an openEuler container.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>The field team reported that sudo does not work inside our deployment container in a customer environment (the shell prints 已杀死, the localized form of <code>Killed</code>):</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">sudo</span> <span class="built_in">ls</span> -l</span></span><br><span class="line">已杀死</span><br></pre></td></tr></table></figure><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Troubleshooting</h2><p>The deployment container is built on <code>openeuler/openeuler:22.03</code>. Earlier, the field engineer and another colleague had found that sudo works in the deployment image built on an ubuntu base image; after a later upgrade the base went back to openEuler, so I was asked to look into it.</p><h3 id="权限？"><a href="#权限？" class="headerlink" title="权限？"></a>Permissions?</h3><p>After connecting remotely:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run --<span class="built_in">rm</span> -ti --entrypoint bash xxx</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">sudo</span> <span class="built_in">ls</span> -l</span></span><br><span class="line">已杀死</span><br></pre></td></tr></table></figure><p>Suspecting something like seccomp, I tried starting the container in privileged mode:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run --<span class="built_in">rm</span> -ti --privileged --entrypoint bash xxx</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">sudo</span> 
<span class="built_in">ls</span> -l</span></span><br><span class="line">已杀死</span><br></pre></td></tr></table></figure><p>Same result. The ubuntu-based image works fine, but this openEuler image does not ship the strace command.</p><h3 id="制作验证"><a href="#制作验证" class="headerlink" title="制作验证"></a>Building test images</h3><p>Locally I built images based on openEuler and a few other base images, adding sudo and strace:</p><figure class="highlight dockerfile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">FROM</span> openeuler/openeuler:<span class="number">22.03</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">RUN</span><span class="language-bash"> <span class="built_in">set</span> -eux; \</span></span><br><span class="line"><span class="language-bash">    sed -ri <span class="string">&#x27;s|://repo.openeuler.org/|://repo.huaweicloud.com/openeuler/|g&#x27;</span> /etc/yum.repos.d/openEuler.repo; \</span></span><br><span class="line"><span class="language-bash">    sed -ri <span class="string">&#x27;s|https://mirrors.openeuler.org|https://repo.huaweicloud.com/openeuler/|g&#x27;</span> /etc/yum.repos.d/openEuler.repo; \</span></span><br><span class="line"><span class="language-bash">    yum makecache; \</span></span><br><span class="line"><span class="language-bash">    dnf install -y --<span class="built_in">setopt</span>=install_weak_deps=False    \</span></span><br><span class="line"><span class="language-bash">        <span class="built_in">sudo</span>   \</span></span><br><span class="line"><span class="language-bash">        strace   \</span></span><br><span class="line"><span class="language-bash">    ;</span></span><br></pre></td></tr></table></figure><p>Once the images were built:</p><figure 
class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker save sudo:oe sudo:xxx | gzip -&gt; sudo-docker.tar.gz</span><br></pre></td></tr></table></figure><p>Loaded and tested them in the customer environment:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker load -i sudo-docker.tar.gz</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run --<span class="built_in">rm</span> -ti --entrypoint bash <span class="built_in">sudo</span>:oe</span></span><br><span class="line">[root@71ec45b79f63 /]# sudo ls -l</span><br><span class="line">Killed</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run --<span class="built_in">rm</span> -ti --privileged --entrypoint bash <span class="built_in">sudo</span>:oe</span></span><br><span class="line">[root@fea66e40292f /]# sudo -V</span><br><span class="line">Sudo version 1.9.8p2</span><br><span class="line">Configure options: --build=x86_64-openEuler-linux-gnu --host=x86_64-openEuler-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --prefix=/usr --sbindir=/usr/sbin --libdir=/usr/lib64 --docdir=/usr/share/doc/sudo --disable-root-mailer --disable-intercept --disable-log-server 
--disable-log-client --with-logging=syslog --with-logfac=authpriv --with-pam --with-pam-login --with-editor=/bin/vi --with-env-editor --with-ignore-dot --with-tty-tickets --with-ldap --with-selinux --with-passprompt=[sudo] password for %p:  --with-linux-audit --with-sssd</span><br><span class="line">Killed</span><br></pre></td></tr></table></figure><p>Even <code>-V</code> gets killed, so the only option left is to trace the syscalls with strace:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">[root@fea66e40292f /]# strace sudo -V</span><br><span class="line">....</span><br><span class="line">mprotect(0x7f605d240000, 4096, PROT_READ) = 0</span><br><span class="line">mprotect(0x7f605d2bf000, 4096, PROT_READ) = 0</span><br><span class="line">openat(AT_FDCWD, &quot;/proc/sys/crypto/fips_enabled&quot;, O_RDONLY) = 3</span><br><span class="line">read(3, &quot;0\n&quot;, 2)                       = 2</span><br><span class="line">close(3)                                = 0</span><br><span class="line">access(&quot;/etc/system-fips&quot;, F_OK)        = -1 ENOENT (No such file or directory)</span><br><span class="line">munmap(0x7f605d5eb000, 10235)           = 0</span><br><span class="line">newfstatat(AT_FDCWD, &quot;/usr/libexec/sudo/sudoers.so&quot;, &#123;st_mode=S_IFREG|0644, st_size=525888, ...&#125;, 0) = 0</span><br><span class="line">newfstatat(AT_FDCWD, 
&quot;/usr/libexec/sudo/sudoers.so&quot;, &#123;st_mode=S_IFREG|0644, st_size=525888, ...&#125;, 0) = 0</span><br><span class="line">pipe2([3, 4], O_NONBLOCK|O_CLOEXEC)     = 0</span><br><span class="line">getpid()                                = 80</span><br><span class="line">getrandom(0x7fff8e404980, 40, 0)        = -1 ENOSYS (Function not implemented)</span><br><span class="line">gettid()                                = 80</span><br><span class="line">getpid()                                = 80</span><br><span class="line">tgkill(80, 80, SIGKILL)                 = ?</span><br><span class="line">+++ killed by SIGKILL +++</span><br><span class="line">Killed</span><br></pre></td></tr></table></figure><p>It looks like the <code>getrandom</code> syscall fails with ENOSYS (not implemented), after which the process sends itself SIGKILL. The openEuler image is RHEL-like and ships python, so I tried calling the syscall from python:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">[root@fea66e40292f /]# python3 -c &#x27;import os;os.getrandom(3)&#x27;</span><br><span class="line">Traceback (most recent call last):</span><br><span class="line">  File &quot;&lt;string&gt;&quot;, line 1, in &lt;module&gt;</span><br><span class="line">OSError: [Errno 38] Function not implemented</span><br></pre></td></tr></table></figure><p>That confirms the problem is the <code>getrandom</code> syscall. Next, the Linux shuf random-number command:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">[root@fea66e40292f /]# shuf -i 1-10 -n 1</span><br><span class="line">shuf: getrandom: Function not implemented</span><br><span class="line">[root@fea66e40292f /]# uname -a</span><br><span class="line">Linux 
fea66e40292f 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux</span><br><span class="line">[root@fea66e40292f /]# rpm -qa | grep glibc</span><br><span class="line">glibc-common-2.34-170.oe2203sp4.x86_64</span><br><span class="line">glibc-2.34-170.oe2203sp4.x86_64</span><br></pre></td></tr></table></figure><h3 id="结论和验证"><a href="#结论和验证" class="headerlink" title="结论和验证"></a>Conclusion and verification</h3><p>The customer machine runs CentOS 7.3 with kernel <code>3.10.0-514.el7.x86_64</code>. Upstream Linux added <code>getrandom</code> in 3.17, and this 3.10-based kernel evidently does not have it. My guess is that openEuler's glibc hard-depends on the getrandom syscall and has no fallback to <code>openat(/dev/urandom)</code>. The same customer environment also has CentOS 7.9 machines; testing there showed no problem, while other 7.3 machines reproduced the failure:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">on another 7.9 machine</span></span><br><span class="line">[root@fea66e40292f /]# shuf -i 1-10 -n 1</span><br><span class="line">4</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">on another 7.3 machine</span></span><br><span class="line">[root@fea66e40292f /]# shuf -i 1-10 -n 1</span><br><span class="line">shuf: getrandom: Function not implemented</span><br></pre></td></tr></table></figure><p>Other business services also use the openEuler base image, so fixing this at the glibc level would be too costly:</p><ul><li>every container image at the customer site would need its glibc updated individually</li><li>going forward we would have to maintain a glibc source tree and build it into openEuler rpm packages</li></ul><p>Having the customer move to CentOS 7.9 is the safest option. The issue has also been reported to the openEuler repository: <a href="https://atomgit.com/openeuler/openeuler-docker-images/issues/60">https://atomgit.com/openeuler/openeuler-docker-images/issues/60</a></p>]]></content>
    
    
    <summary type="html">&lt;p&gt;Notes from troubleshooting a case where sudo could not be used inside an openEuler container.&lt;/p&gt;</summary>
    
    
    
    
    <category term="openeuler" scheme="http://zhangguanzhang.github.io/tags/openeuler/"/>
    
    <category term="glibc" scheme="http://zhangguanzhang.github.io/tags/glibc/"/>
    
  </entry>
  
  <entry>
    <title>[Continuously updated] - Notes on capturing HTTPS traffic on Android</title>
    <link href="http://zhangguanzhang.github.io/2026/02/23/android-https-capture/"/>
    <id>http://zhangguanzhang.github.io/2026/02/23/android-https-capture/</id>
    <published>2026-02-23T20:10:30.000Z</published>
    <updated>2026-02-23T20:10:30.000Z</updated>
    
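    <content type="html"><![CDATA[<p>A long-running, regularly updated set of notes on capturing HTTPS traffic on Android.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Many articles online are shallow and just copy steps from one another. Here I write down the essentials of Android HTTPS capture as an explainer, a list of key points, and notes, so that readers with various backgrounds can get started quickly.</p><h2 id="原理相关"><a href="#原理相关" class="headerlink" title="原理相关"></a>How it works</h2><p>Reverse engineering, protocol analysis, and automating certain apps often require capturing the HTTPS traffic of a specific app. This article only covers HTTPS capture itself; how apks detect rooted devices and debuggers is out of scope.</p><h3 id="中间人"><a href="#中间人" class="headerlink" title="中间人"></a>Man-in-the-middle</h3><p>Ways to capture HTTPS fall roughly into two categories: hack and non-hack.</p><ul><li><strong>Hack approach</strong>: for example the eBPF-based <a href="https://github.com/gojue/ecapture">gojue&#x2F;ecapture</a>, which hooks the relevant SSL functions on the client side to read HTTPS payloads in plaintext; another example is hooking the app with frida.</li><li><strong>Non-hack approach</strong>: for example a man-in-the-middle attack.</li></ul><p>eBPF needs a fairly recent kernel with the corresponding kernel support enabled. Plenty of people use root, KSU and the like, but few rebuild the kernel to add eBPF support, so this approach is hard to popularize. frida requires writing injection logic, usually needs a computer alongside, and assumes some programming and reverse-engineering background. A man-in-the-middle attack works like this:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">Client                  Attacker            Server (https://example.com)</span><br><span class="line">  │                        │                         │</span><br><span class="line">  │──── ClientHello ─────▶│                         │</span><br><span class="line">  │                        │──── ClientHello ─────▶ │</span><br><span class="line">  │                        │◀── Server Certificate ─│</span><br><span class="line">  │◀ Fake Certificate ────│                         │</span><br><span class="line">  │                        │                         │</span><br><span class="line">  │== encrypted traffic == │ == encrypted traffic == │</span><br><span class="line">  │ (decrypted by attacker)│     (regular HTTPS)     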
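│</span><br></pre></td></tr></table></figure><ol><li>The client sends an HTTPS request to the server. The man in the middle poses as the server, completes the TLS handshake, and intercepts and parses the data; it then talks to the real server as a client, so from the server's point of view it is just an ordinary client.</li><li>The attacker signs its own TLS certificate for example.com and returns it to the client.</li><li>For subsequent requests such as <code>https://example.com/api</code>, the attacker decrypts them, forwards them to the server, and returns the response to the client over HTTPS.</li><li>From a developer's perspective, the man in the middle sees HTTPS requests in plaintext; it can intercept requests and responses and apply rules to specific URL paths.</li></ol><p>When HTTPS first became widespread, many clients did not verify that the certificate was issued by a trusted CA (a browser would show a red warning). Now that verification is the norm, acting as a man in the middle on Linux requires adding your own CA to the system trust store, such as <code>/etc/ssl/certs/ca-certificates.crt</code>.</p><h3 id="安卓的凭据"><a href="#安卓的凭据" class="headerlink" title="安卓的凭据"></a>Credentials on Android</h3><p>For example, with Fiddler on a Windows PC, once Fiddler's CA is added to the machine, Fiddler can capture HTTPS traffic from browsers or processes. Linux, Windows, and Android all rely on the same man-in-the-middle principle and differ only in the details. On Android, CAs exist as "credentials" under <code>Settings</code> → <code>System security</code> → <code>Credential storage</code> → <code>Trusted credentials</code>, split into "System" and "User".</p><p>Since Android 7.0, apps no longer trust user-installed CAs by default, so the CA has to go into the "System" credentials. This requires a rooted phone; use <a href="https://mt2.cn/download/">MT Manager</a> or a root adb shell to place the CA in <code>/system/etc/security/cacerts/</code>.</p><h3 id="SSL-Pinning"><a href="#SSL-Pinning" class="headerlink" title="SSL Pinning"></a>SSL Pinning</h3><p>Some apps do not trust the system credentials, or validate the server certificate themselves. In that case you may also need LSPosed and Magisk modules (such as <code>JustTrustMe</code>, <code>TrustMeAlready</code>) to hook <code>X509TrustManager</code>, <code>okhttp3</code>, etc. inside the app and skip certificate verification. If you know Frida, you can also hook with Frida directly.</p><h3 id="高版本安卓的凭据"><a href="#高版本安卓的凭据" class="headerlink" title="高版本安卓的凭据"></a>Credentials on newer Android versions</h3><p>I tried <a href="https://reqable.com/">Reqable</a> on Android 15 + KSU: top-left menu → Certificate management → Install certificate. I chose to copy the certificate file manually, which saves it to <code>Download/Reqable/xxxxx.0</code>. Even after granting root to <code>com.android.shell</code> in KSU, the adb root shell still could not push the certificate to <code>/system/etc/security/cacerts/</code>, and moving it with MT Manager failed with a "failed to mount read-write" error.</p><p>Some digging showed that someone had analyzed <code>framework.jar</code> and found that since Android 14 the certificate directory is <code>/apex/com.android.conscrypt/cacerts</code>, but writing there directly still failed. Further searching showed that <code>/apex</code> has stricter system-partition protection and cannot be modified even by remounting as root. The approach I eventually found: first install <a href="https://github.com/KernelSU-Modules-Repo/meta-overlayfs">OverlayFS MetaModule</a>, then install <a 
href="https://github.com/ys1231/MoveCertificate">MoveCertificate</a>, put the certificate under <code>/data/local/tmp/cert/</code>, and reboot; the new CA then shows up in the system credentials.</p><p>I looked at <a href="https://github.com/ys1231/MoveCertificate/blob/iyue/post-fs-data.sh">MoveCertificate's post-fs-data.sh</a>: it mounts a tmpfs over <code>/apex/com.android.conscrypt/cacerts</code> and, together with SELinux permissions, injects the CA into the credential list, which requires <code>OverlayFS MetaModule</code> to be installed. I also tried disabling only the OverlayFS module in KSU while leaving the others enabled, and the phone failed to boot after restart.</p><p>Then I set Reqable's recording mode to <code>V-P-N</code>, selected WeChat as the target app, enabled enhanced mode, and started capturing. After opening WeChat and a mini program, every captured entry was a <code>CONNECT</code>, and the request details showed an SSL handshake failure, meaning the CA was not properly trusted. Later I flashed the Magisk module generated by Reqable and <strong>disabled</strong> the <code>MoveCertificate</code> module; after a reboot, capture worked normally.</p><p>On Android 16 it seems the CA additionally has to be installed into the user credentials.</p><h2 id="安卓原理和工具选项"><a href="#安卓原理和工具选项" class="headerlink" title="安卓原理和工具选项"></a>Android mechanics and tool choices</h2><p>Capturing via emulators or a VM app on Android (such as 光速虚机) is not covered here. If you understand the principle but find rooting too much trouble, search for the relevant articles yourself.</p><h3 id="指定-app-抓包原理"><a href="#指定-app-抓包原理" class="headerlink" title="指定 app 抓包原理"></a>How per-app capture works</h3><p>For how "routing a specific app through a proxy" is implemented on Android, see my earlier article <a href="https://zhangguanzhang.github.io/2025/04/27/android-tun2sock/">开发一个让指定应用走代理的安卓 app 过程</a> (building an Android app that routes selected apps through a proxy). Android's VpnService natively supports sending selected apps through a channel similar to Linux's <code>/dev/tun</code>; the app developer only has to implement the corresponding logic.</p><h3 id="工具选型"><a href="#工具选型" class="headerlink" title="工具选型"></a>Choosing a tool</h3><p>Some people import Fiddler's CA into the phone, run Fiddler on a PC, and point the phone's WiFi proxy at Fiddler's port. But many apks check whether a WiFi proxy is set and bypass it if so, which means that app's HTTPS cannot be captured. The more reliable approach is to install a capture&#x2F;proxy app on Android, select the target app via VpnService, and then intercept as a man in the middle.</p><p>In WeChat mini program authentication, a <code>code</code> cannot be reused (QQ's apparently can), so you often need to intercept the request so that it is never sent, and then use the code from that request in your own program. I did some research; <strong>the comparison below is based on what I found</strong>:</p><table><thead><tr><th>Software</th><th>Open source</th><th>UI and features</th><th>CA handling</th></tr></thead><tbody><tr><td><a href="https://reqable.com/">Reqable</a></td><td>Closed source; successor to HttpCanary</td><td>Modern UI with a good request-inspection experience. Full functionality requires VIP (only pricing is listed, with no breakdown of which features)</td><td>The CA is randomly generated after installation, which is good, and it can generate a magisk module. The PC client's install-certificate-over-USB feature appears to work via adb root with no manual steps, which makes me worry about something unwanted being injected</td></tr><tr><td><a href="https://github.com/wanghongenpin/proxypin">proxypin</a></td><td>Open source</td><td>Modern UI; request inspection is less detailed than Reqable's, but it supports breakpoints and request rewriting</td><td>The apk could perfectly well do what 
Reqable does and generate a magisk module itself, instead of making users download the separate CA-injecting <a href="https://github.com/wanghongenpin/Magisk-ProxyPinCA">Magisk-ProxyPinCA</a> module. The CA generated by the apk and the CA inside Magisk-ProxyPinCA are fixed rather than random, which may be a security risk</td></tr></tbody></table><p>Android 16 seems to have quite a few restrictions; you could try a virtual machine. In my case, Android 15 + <code>proxypin</code> and its CA module can capture traffic. Once the principle is clear, you can even build your own distributed capture setup, for example:</p><ol><li>Write&#x2F;deploy a SOCKS5 server with man-in-the-middle capability on Linux;</li><li>Reuse or develop an open-source Android proxy app (such as appproxy);</li><li>Install the CA above on the phone, start appproxy, route the target app through the proxy, and point the upstream at that SOCKS5 server;</li><li>The SOCKS5 server decrypts the traffic as a man in the middle and displays it in a web UI. This leverages Android's per-app proxying while letting you focus on the server side without worrying much about Android.</li></ol><p>The point of this example is that tools differ only in feature details; there is nothing one tool can do that another fundamentally cannot.</p><h3 id="一些题外话"><a href="#一些题外话" class="headerlink" title="一些题外话"></a>A side note</h3><p>These modules may create directories under <code>/data/local/tmp</code> to inject the CA. If an apk checks for these paths, or for such CA names in the system credentials, that can serve as root&#x2F;environment detection. So remember to disable the related modules when you are not capturing.</p>]]></content>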
    
    
    <summary type="html">&lt;p&gt;A long-running, regularly updated set of notes on capturing HTTPS traffic on Android.&lt;/p&gt;</summary>
    
    
    
    
    <category term="android" scheme="http://zhangguanzhang.github.io/tags/android/"/>
    
    <category term="https" scheme="http://zhangguanzhang.github.io/tags/https/"/>
    
  </entry>
  
  <entry>
    <title>[Continuously updated] - Why you should not shell out to commands just to implement a requirement</title>
    <link href="http://zhangguanzhang.github.io/2025/12/18/replacing-shell-commands-with-library-calls/"/>
    <id>http://zhangguanzhang.github.io/2025/12/18/replacing-shell-commands-with-library-calls/</id>
    <published>2025-12-18T14:37:30.000Z</published>
    <updated>2025-12-18T14:37:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>A summary of case studies showing the problems that shelling out to commands for speed and convenience causes in on-premises (private) deployments.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Drawing on on-premises development and on problems reported by customers on site, this post discusses the consequences of calling external commands and how it wastes other people's time.</p><h2 id="案例"><a href="#案例" class="headerlink" title="案例"></a>Cases</h2><h3 id="命令注入"><a href="#命令注入" class="headerlink" title="命令注入"></a>Command injection</h3><p>A dashboard offers a URL check, and the backend logic is:</p><ol><li>take the ip from the POST body</li><li>concatenate it into a curl command</li></ol><p>The security team then injected <code>ip; whoami; rm -rf /</code>, and it was showcased as a security case study. Another feature, an IP ping check, likewise concatenated a ping command. The URL check could simply use the <code>requests</code> package, and the ICMP check could use the <code>socket</code> package.</p><p>Take ping as an example. Some release adds an ipv6 requirement, and on your own ubuntu development system ipv6 tests fine:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping ::1</span></span><br><span class="line">PING ::1(::1) 56 data bytes</span><br><span class="line">64 bytes from ::1: icmp_seq=1 ttl=64 time=0.505 ms</span><br><span class="line">^C</span><br></pre></td></tr></table></figure><p>After you commit, QA builds a package, tests it, and comes back saying it fails on some systems. You investigate and find that the older ping on centos7 is the culprit:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping ::1</span></span><br><span class="line">ping: ::1: Address family for hostname not supported</span><br></pre></td></tr></table></figure><p>If you had used <code>socket</code> programming from the start, the initial icmp implementation would be:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, </span><br><span class="line">                            
         socket.getprotobyname(<span class="string">&quot;icmp&quot;</span>))</span><br></pre></td></tr></table></figure><p>When the ipv6 requirement arrives, you use the <code>IPy</code> library on the incoming ip to decide whether the first argument to socket is <code>socket.AF_INET</code> or <code>socket.AF_INET6</code> (with the protocol switched to ICMPv6 accordingly). QA then never comes back to you over OS differences, which avoids the back-and-forth and does not eat into other people's time. For a golang icmp implementation, see <a href="https://github.com/go-ping/ping">https://github.com/go-ping/ping</a></p><h3 id="Popen-卡住"><a href="#Popen-卡住" class="headerlink" title="Popen 卡住"></a>Popen hangs</h3><ul><li><a href="https://www.markjour.com/article/20230927-python-popen-hang.html">https://www.markjour.com/article/20230927-python-popen-hang.html</a></li><li><a href="https://www.google.com/search?q=python+subprocess.popen+page_size&oq=python+subprocess.popen+page_size">https://www.google.com/search?q=python+subprocess.popen+page_size&amp;oq=python+subprocess.popen+page_size</a></li><li><a href="https://stackoverflow.com/questions/46494789/python-subprocess-hangs-as-popen-when-piping-output">https://stackoverflow.com/questions/46494789/python-subprocess-hangs-as-popen-when-piping-output</a></li></ul><h3 id="僵尸进程"><a href="#僵尸进程" class="headerlink" title="僵尸进程"></a>Zombie processes</h3><p>In Python, common ways to run a command include <code>os.system()</code>, <code>os.popen()</code>, and <code>subprocess.Popen()</code>. With <code>subprocess.Popen()</code>, if you call neither <code>wait()</code> nor <code>communicate()</code>, the child process may become a zombie after it exits.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ps aux | grep Z</span></span><br><span class="line">USER       PID  %CPU %MEM    VSZ   RSS TTY      STAT 
START   TIME COMMAND</span><br><span class="line">root       2214  0.0  0.0      0     0 ?        Z    03:35   0:00 [docker] &lt;defunct&gt;</span><br><span class="line">root       3727  0.0  0.0      0     0 ?        Z    14:21   0:00 [docker] &lt;defunct&gt;</span><br><span class="line">root       4908  0.0  0.0      0     0 ?        Z    11:52   0:00 [docker] &lt;defunct&gt;</span><br><span class="line">root       7979  0.0  0.0      0     0 ?        Z    14:31   0:00 [docker] &lt;defunct&gt;</span><br><span class="line">root      13301  0.0  0.0      0     0 ?        Z    13:07   0:00 [docker] &lt;defunct&gt;</span><br><span class="line">root      18508  0.0  0.0      0     0 ?        Z    10:57   0:00 [docker] &lt;defunct&gt;</span><br><span class="line">root      20993  0.0  0.0      0     0 ?        Z    14:41   0:00 [docker] &lt;defunct&gt;</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>Zombie processes consume no cpu or memory, but they do hold on to pids:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">default pid max</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /proc/sys/kernel/pid_max</span></span><br><span class="line">32768</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">the pid number is visibly not released</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /proc/21190/cmdline</span></span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Many on-premises customer machines run a monitoring agent that alerts when there are too many zombie processes. The customer then asks for a fix, the on-site PM first has colleague X, who knows linux fundamentals, investigate, and it eventually turns out to be a daemon-type service written by developer Y, 
which costs colleague X time to track down.</p><p>A command is really just something that, based on its cmdline, env, and config files, ends up doing its work over the network or through syscalls. For example, I once saw a file-processing service that called the tar command and left many zombie processes behind; the archive format is really just the following stream layout:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">┌────────────────────────────────────────────────────────────────────┐</span><br><span class="line">│                            TAR archive                             │</span><br><span class="line">├────────────────────────────────────────────────────────────────────┤</span><br><span class="line">│                                                                    │</span><br><span class="line">│  ┌─────────────────────────────────────────────────────────────┐   │</span><br><span class="line">│  │                 File 1 - header (512 bytes)                 │   │</span><br><span class="line">│  │  name(100)  mode(8)  uid(8)  gid(8)  size(12)  mtime(12) .. 
│   │</span><br><span class="line">│  └─────────────────────────────────────────────────────────────┘   │</span><br><span class="line">│                                                                    │</span><br><span class="line">│  ┌─────────────────────────────────────────────────────────────┐   │</span><br><span class="line">│  │                    File 1 - content                         │   │</span><br><span class="line">│  │    Actual data (e.g. &quot;Hello World!&quot;)                        │   │</span><br><span class="line">│  │                                                             │   │</span><br><span class="line">│  │   Padded to a multiple of 512 bytes                         │   │</span><br><span class="line">│  └─────────────────────────────────────────────────────────────┘   │</span><br><span class="line">│                                                                    │</span><br><span class="line">│  ┌─────────────────────────────────────────────────────────────┐   │</span><br><span class="line">│  │                 File 2 - header (512 bytes)                 │   │</span><br><span class="line">│  │  name(100)  mode(8)  uid(8)  gid(8)  size(12) ...           
│   │</span><br><span class="line">│  └─────────────────────────────────────────────────────────────┘   │</span><br><span class="line">│                                                                    │</span><br><span class="line">│  ┌─────────────────────────────────────────────────────────────┐   │</span><br><span class="line">│  │                    File 2 - content                         │   │</span><br><span class="line">│  │  Actual data (e.g. binary file contents)                    │   │</span><br><span class="line">│  │                                                             │   │</span><br><span class="line">│  │  Padded to a multiple of 512 bytes                          │   │</span><br><span class="line">│  └─────────────────────────────────────────────────────────────┘   │</span><br><span class="line">| .....</span><br></pre></td></tr></table></figure><p>Every programming language has a library for this, so the work can be done entirely with a library. I will not list the cases where tar behavior and options differ across rootfs here; with a library, switching base images no longer catches you off guard.</p><h3 id="客户监控"><a href="#客户监控" class="headerlink" title="客户监控"></a>Customer monitoring</h3><p>The dashboard has a port-probing feature implemented by calling the <code>nmap</code> command. Many government machines run an agent that treats nmap as a sign the machine has been compromised and is scanning other hosts as a bot, which is a very serious matter during 护网 (cyber-defense drill) periods. A group chat gets created, a crowd of people is pulled in, the cause has to be explained to the customer, and the customer demands remediation. This can be done entirely with the socket library:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">check_tcp_port</span>(<span class="params">ip, port, timeout=<span class="number">1</span></span>):</span><br><span class="line">    <span class="keyword">if</span> 
is_ipv6(ip) <span class="keyword">is</span> <span class="literal">True</span>:</span><br><span class="line">        sk = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        sk = socket.socket(socket.AF_INET, socket.SOCK_STREAM)</span><br><span class="line">    sk.settimeout(timeout)</span><br><span class="line">    <span class="keyword">try</span>:</span><br><span class="line">        sk.connect((ip, <span class="built_in">int</span>(port)))</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">True</span></span><br><span class="line">    <span class="keyword">except</span> Exception <span class="keyword">as</span> e:</span><br><span class="line">        <span class="built_in">print</span>(e)</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">False</span></span><br><span class="line">    <span class="keyword">finally</span>:</span><br><span class="line">        sk.close()</span><br></pre></td></tr></table></figure><h3 id="命令限制和禁止"><a href="#命令限制和禁止" class="headerlink" title="命令限制和禁止"></a>Restricted and forbidden commands</h3><p>At a bank customer, while troubleshooting we found that a pod could not be deleted:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl delete pod xxx</span><br></pre></td></tr></table></figure><p>Eventually we suspected that some software on the customer machine was blocking commands containing certain keywords, so we asked the field engineer to run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">echo</span> 111</span></span><br><span class="line">111</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span 
class="built_in">echo</span> 111 delete</span></span><br><span class="line"><span class="meta prompt_">$</span></span><br></pre></td></tr></table></figure><p>That confirmed it, so the pod was deleted from the dashboard instead (which goes through the k8s API behind the scenes). Besides keyword restrictions, some customers forbid running certain commands altogether, such as <code>ssh</code>; had we used Python's <code>paramiko</code> library from the start, this would not have been a problem.</p><h3 id="行为控制和错误处理"><a href="#行为控制和错误处理" class="headerlink" title="行为控制和错误处理"></a>Behavior control and error handling</h3><p>Take the ssh command as an example. On Linux, the ssh and sshd configuration lives in <code>ssh_config</code> and <code>sshd_config</code> under <code>/etc/ssh/</code>. Customers often change these files for compliance (等保, MLPS) or security reasons, and whenever something broke, another option was bolted onto the ssh invocation:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">find . 
-<span class="built_in">type</span> f -name <span class="string">&#x27;*.py&#x27;</span> -<span class="built_in">exec</span> grep -P <span class="string">&#x27;ssh .+-o&#x27;</span> &#123;&#125; \;</span></span><br><span class="line">        cmd = &quot;timeout 10 ssh -q -F /dev/null -o StrictHostKeyChecking=no &#123;&#125;@&#123;&#125; -p &#123;&#125; sudo LC_ALL=C &#123;&#125;&quot;.format(</span><br><span class="line">        cmd = &quot;timeout 5 ssh &#123;&#125;@&#123;&#125; -o StrictHostKeyChecking=no -p &#123;&#125; ls&quot;.format(host_obj.username, host_obj.ip, host_obj.port)</span><br><span class="line">        cmd1 = &quot;timeout 5 ssh &#123;&#125;@&#123;&#125; -o StrictHostKeyChecking=no -p &#123;&#125; &#123;&#125;&quot;.format(</span><br><span class="line">                &quot;ssh -p %s -o PubkeyAuthentication=yes -o stricthostkeychecking=no %s@%s cat /etc/hosts 2&gt;/dev/null | grep -m 1 &#x27; xxx-init-job &#x27; | awk &#x27;&#123;print $1&#125;&#x27;&quot;</span><br><span class="line">                create_cmd = &quot;ssh -p %s -o PubkeyAuthentication=yes -o stricthostkeychecking=no %s@%s &#x27;docker exec -i %s %s&#x27;&quot; % (</span><br><span class="line">                cmd = &quot;ssh -p &#123;&#125; -o PubkeyAuthentication=yes -o stricthostkeychecking=no &#123;&#125;@&#123;&#125; &#x27;docker exec -i &#123;&#125; ls /xxx/xxx/xxx-dc-main&#x27;&quot;.format(</span><br><span class="line">    ssh_cmd = &#x27;ssh -q -F /dev/null -o StrictHostKeyChecking=no &#123;&#125;@&#123;&#125; -p &#123;&#125; &quot;&#123;&#125;&quot;&#x27;.format(</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">searching without requiring the -o option turns up even more lines</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">find . 
-<span class="built_in">type</span> f -name <span class="string">&#x27;*.py&#x27;</span> -<span class="built_in">exec</span> grep -P <span class="string">&#x27;ssh .+@&#x27;</span> &#123;&#125; \; | <span class="built_in">wc</span> -l</span></span><br><span class="line">22</span><br></pre></td></tr></table></figure><p>The options differ from one call site to the next, which invites further problems, and then there is the timeout wrapper above. The <a href="https://docs.paramiko.org/en/stable/api/client.html">paramiko library</a> has parameters for all of this. It also allows more precise error handling: when running a command on a remote machine, the library can tell whether the ssh connection failed or the remote command failed:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">try</span>:</span><br><span class="line">    ssh.connect(...)</span><br><span class="line"><span class="keyword">except</span> paramiko.SSHException:</span><br><span class="line">    <span class="comment"># connection-level failure</span></span><br><span class="line">    <span class="keyword">raise</span></span><br></pre></td></tr></table></figure><p>It can even keep stdout and stderr fully separate:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">ssh = paramiko.SSHClient()</span><br><span class="line">ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())</span><br><span class="line">ssh.connect(ip, username=<span class="string">&quot;user&quot;</span>, port=<span class="number">2222</span>)</span><br><span class="line">stdin, stdout, stderr = ssh.exec_command(<span class="string">&quot;x&quot;</span>)</span><br><span class="line"></span><br><span class="line">exit_code = stdout.channel.recv_exit_status()</span><br><span class="line">out = stdout.read().decode()</span><br><span class="line">err = 
stderr.read().decode()</span><br></pre></td></tr></table></figure><h3 id="文件权限"><a href="#文件权限" class="headerlink" title="文件权限"></a>File permissions</h3><p>Many CLI commands need credentials. Passing them on the command line gets flagged by customer scanners, so they usually go into a config file under the home directory such as <code>~/.xxx/xx.conf</code>. Often, just to test or run a single command, a file has to be generated for the CLI to read. The container's PID 1 ends up switching to a non-root user via gosu, but docker exec enters as root, so files created at that moment are owned by root, and the daemon later fails with permission errors when it touches them.</p><p>For example, we used to test MinIO uploads with mc, when MinIO's Python library would have done the job. The same goes for kubeconfig file permissions: a few customers have raised requirements about them. Calling the Kubernetes API directly and keeping the credentials in the database would avoid this kind of extra remediation work.</p><h3 id="后续扩展性"><a href="#后续扩展性" class="headerlink" title="后续扩展性"></a>Future extensibility</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">kubexxx_deploy_user = config_dict[<span class="string">&quot;HOSTS&quot;</span>][<span class="number">0</span>][<span class="string">&quot;username&quot;</span>]</span><br><span class="line">kubexxx_deploy_ip = config_dict[<span class="string">&quot;HOSTS&quot;</span>][<span 
class="number">0</span>][<span class="string">&quot;ip&quot;</span>]</span><br><span class="line">kubexxx_deploy_port = config_dict[<span class="string">&quot;HOSTS&quot;</span>][<span class="number">0</span>][<span class="string">&quot;ssh_port&quot;</span>]</span><br><span class="line"></span><br><span class="line">ssh = paramiko.SSHClient()</span><br><span class="line">ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())</span><br><span class="line">ssh.connect(kubexxx_deploy_ip, kubexxx_deploy_port, kubexxx_deploy_user)</span><br><span class="line"></span><br><span class="line"><span class="comment"># update the configmap and config.yaml</span></span><br><span class="line">data_dir = config_dict[<span class="string">&quot;DATA_DIR&quot;</span>]</span><br><span class="line">apisix_config_path = os.path.join(data_dir, <span class="string">&quot;kube/apisix/config.yaml&quot;</span>)</span><br><span class="line">remote_cmd = <span class="string">f&quot;sudo docker cp <span class="subst">&#123;apisix_config_path&#125;</span> kubexxx:/root/kubexxx;sudo docker exec kubexxx chown xxx:xxx /root/kubexxx/config.yaml&quot;</span></span><br><span class="line">_, stdout, stderr = ssh.exec_command(remote_cmd)</span><br><span class="line"><span class="comment"># read the output</span></span><br><span class="line">out = stdout.read().decode().strip()</span><br><span class="line">err = stderr.read().decode().strip()</span><br><span class="line">logging.info(<span class="string">&quot;STDOUT:\n%s&quot;</span>, out)</span><br><span class="line"><span class="keyword">if</span> err:</span><br><span class="line">    logging.error(<span class="string">&quot;STDERR:\n%s&quot;</span>, err)</span><br><span class="line">    </span><br><span class="line"><span class="comment"># recreate the configmap and restart the pods</span></span><br><span class="line">configmap_cmd = <span class="string">&quot;kubectl delete cm -n default apisix; kubectl create cm -n default apisix --from-file=/root/kubexxx/config.yaml; kubectl get pods -n 
default | grep apisix | awk &#x27;&#123;print $1&#125;&#x27; | xargs -I &#123;&#125; kubectl delete pod -n default &#123;&#125;&quot;</span></span><br><span class="line">retcode, res = exec_cmd(configmap_cmd)</span><br><span class="line"><span class="keyword">if</span> retcode != <span class="number">0</span>:</span><br><span class="line">    logging.error(res)</span><br><span class="line">pod_cmd = <span class="string">&quot;kubectl get pods -n default | grep apisix | awk &#x27;&#123;print $1&#125;&#x27; | xargs -I &#123;&#125; kubectl delete pod -n default &#123;&#125;&quot;</span></span><br><span class="line">retcode, res = exec_cmd(pod_cmd)</span><br><span class="line"><span class="keyword">if</span> retcode != <span class="number">0</span>:</span><br><span class="line">    logging.error(res)</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>The same kind of configmap change, implemented entirely with the library:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">change_backend_type</span>(<span class="params">mode</span>):</span><br><span class="line"></span><br><span class="line">    config.load_kube_config()</span><br><span class="line">    api_instance = 
client.CoreV1Api()</span><br><span class="line"></span><br><span class="line">    cm_name = <span class="string">&quot;kube-flannel-cfg&quot;</span></span><br><span class="line"></span><br><span class="line">    cm = api_instance.read_namespaced_config_map(name=<span class="string">&quot;kube-flannel-cfg&quot;</span>, namespace=<span class="string">&quot;kube-system&quot;</span>)</span><br><span class="line">    cni_json_str = cm.data.get(<span class="string">&#x27;net-conf.json&#x27;</span>, <span class="string">&#x27;&#123;&#125;&#x27;</span>)</span><br><span class="line">    net_conf = json.loads(cni_json_str)</span><br><span class="line">    net_conf[<span class="string">&#x27;Backend&#x27;</span>][<span class="string">&#x27;Type&#x27;</span>] = mode</span><br><span class="line">    cm.data[<span class="string">&#x27;net-conf.json&#x27;</span>] = json.dumps(net_conf)</span><br><span class="line">    result_cm = api_instance.patch_namespaced_config_map(cm_name, <span class="string">&quot;kube-system&quot;</span>, cm, pretty=<span class="literal">True</span>)</span><br><span class="line">    <span class="keyword">if</span> get_backend_from_cm(result_cm) != mode:</span><br><span class="line">        logging.error(<span class="string">&quot;flannel backend Type修改为&#123;0&#125;失败: &quot;</span>.<span class="built_in">format</span>(mode))</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">False</span></span><br><span class="line">        </span><br><span class="line">    logging.info(<span class="string">&quot;开始删除 flannel pod&quot;</span>)</span><br><span class="line">    pods = api_instance.list_namespaced_pod(<span class="string">&quot;kube-system&quot;</span>, label_selector=<span class="string">&#x27;app=flannel&#x27;</span>).items</span><br><span class="line">    <span class="keyword">for</span> pod <span class="keyword">in</span> pods:</span><br><span class="line">        logging.info(<span class="string">f&quot;删除 
flannel Pod: <span class="subst">&#123;pod.metadata.name&#125;</span>&quot;</span>)</span><br><span class="line">        api_instance.delete_namespaced_pod(name=pod.metadata.name, namespace=<span class="string">&quot;kube-system&quot;</span>)</span><br></pre></td></tr></table></figure><p>If the dashboard later needs to manage several k8s clusters, the library version only requires changing how the client is loaded in order to target a specific cluster. Modifying the configmap also produces no temporary file, which avoids temp-file permission problems, whereas the command-based version means stitching together a pile of kubectl options.</p><h3 id="默认参数"><a href="#默认参数" class="headerlink" title="默认参数"></a>Default parameters</h3><p>Many commands have default parameters. Taking ssh again: the ssh calls above frequently omit the port, yet many customers do not run ssh on the default port 22. Wrapping the <code>paramiko</code> library in a function that requires the port as an argument would keep this from surfacing later in QA feedback or at customer sites.</p><h3 id="tty"><a href="#tty" class="headerlink" title="tty"></a>tty</h3><p>Many commands change their behavior based on <code>isatty(1), isatty(2)</code>. When invoked from a program there is no tty by default, which is a different world from a person typing the command by hand, and the same non-tty behavior applies inside a pipe:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span></span></span><br><span class="line">Dockerfile  Makefile  README.md</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span> | <span class="built_in">cat</span></span></span><br><span class="line">Dockerfile</span><br><span class="line">Makefile</span><br><span class="line">README.md</span><br></pre></td></tr></table></figure><p>Without this bit of Linux knowledge, time gets wasted debugging problems like this.</p><h3 id="输出变更"><a href="#输出变更" class="headerlink" title="输出变更"></a>Output changes</h3><p>Commands are often called just to retrieve information, but the command's version differs across systems or changes after switching the container OS:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl version</span></span><br><span class="line">OpenSSL 1.0.2k-fips  26 Jan 2017</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -noout -text -<span class="keyword">in</span> ca.pem</span> </span><br><span class="line">Certificate:</span><br><span class="line">    Data:</span><br><span class="line">        Version: 3 (0x2)</span><br><span class="line">        Serial Number:</span><br><span class="line">            77:b0:0a:e0:8c:5a:88:ee:89:9d:18:fa:48:94:c1:cf:28:f6:6d:d1</span><br><span class="line">    Signature Algorithm: ecdsa-with-SHA256</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl version</span></span><br><span class="line">OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -<span class="keyword">in</span> ca.pem -noout -text</span></span><br><span class="line">Certificate:</span><br><span class="line">    Data:</span><br><span class="line">        Version: 3 (0x2)</span><br><span class="line">        Serial Number:</span><br><span class="line">            
77:b0:0a:e0:8c:5a:88:ee:89:9d:18:fa:48:94:c1:cf:28:f6:6d:d1</span><br><span class="line">        Signature Algorithm: ecdsa-with-SHA256</span><br></pre></td></tr></table></figure><p>In the example above, the position of the <code>Signature</code> line changed in OpenSSL 3. Another example is the image listing in newer docker versions:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">docker images | <span class="built_in">head</span></span></span><br><span class="line">WARNING: This output is designed for human readability. 
For machine-readable output, please use --format.</span><br><span class="line">IMAGE                                                                   ID             DISK USAGE   CONTENT SIZE   EXTRA</span><br><span class="line">alpine:latest                                                           706db57fb206       8.32MB             0B        </span><br><span class="line">busybox:glibc                                                           08ef35a1c3f0       4.43MB             0B        </span><br><span class="line">busybox:latest                                                          08ef35a1c3f0       4.43MB             0B        </span><br><span class="line">cr.loongnix.cn/kubernetes/etcd:3.5.14                                   dcb2aaf9fcc7       85.3MB             0B        </span><br><span class="line">debian:11                                                               a20b5a7387bf        124MB             0B        </span><br><span class="line">debian:trixie-slim                                                      58c1f2a9fa85       78.6MB             0B        </span><br><span class="line">envoyproxy/envoy:v1.21.2                                                2c32b8d45d47        115MB             0B        </span><br><span class="line">gcr.io/k8s-staging-dns/k8s-dns-dnsmasq-amd64:1.26.5-1-gcf293f8e         60f63d70918c       21.2MB             0B        </span><br><span class="line">gcr.io/k8s-staging-dns/k8s-dns-dnsmasq-amd64:1.26.5-1-gcf293f8e-dirty   60f63d70918c       21.2MB             0B</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker images | grep none</span></span><br><span class="line">WARNING: This output is designed for human readability. 
For machine-readable output, please use --format.</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker images --format table -a | <span class="built_in">head</span></span></span><br><span class="line">REPOSITORY                                              TAG                                              IMAGE ID       CREATED         SIZE</span><br><span class="line">hub-mirror.xxx.xx/xxxx-run/python                       3.12-amd64-oe-v1                                 8634b85409a9   2 hours ago     477MB</span><br><span class="line">hub-mirror.xxx.xx/xxxx-run/python                       3.12.11-amd64-oe-v1                              8634b85409a9   2 hours ago     477MB</span><br><span class="line">&lt;none&gt;                                                  &lt;none&gt;                                           af0f5e291624   2 hours ago     477MB</span><br><span class="line">&lt;none&gt;                                                  &lt;none&gt;                                           1b7d81e61c72   2 hours ago     467MB</span><br><span class="line">&lt;none&gt;                                                  &lt;none&gt;                                           1affa42bade1   2 hours ago     467MB</span><br><span class="line">&lt;none&gt;                                                  &lt;none&gt;                                           3353b43ef79a   2 hours ago     467MB</span><br><span class="line">&lt;none&gt;                                                  &lt;none&gt;                                           789cff37497a   2 hours ago     467MB</span><br><span class="line">&lt;none&gt;                                                  &lt;none&gt;                                           410eb1000652   2 hours ago     467MB</span><br><span class="line">&lt;none&gt;                                                  &lt;none&gt;    
                                       c38df4d4aa02   3 hours ago     426MB</span><br></pre></td></tr></table></figure><p>With a library you are never put on the back foot like this.</p><p>Another case: <code>free -h</code> was called to work out how many gigabytes of memory a machine had, and a limit check was based on the result. Then a customer's machine had 1 TB of memory, the numeric part of the <code>free -h</code> output was 1, and the check concluded the machine had only 1 GB. In fact the free command just reads from the <code>/proc</code> directory:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">strace free -h |&amp; grep /proc/</span></span><br><span class="line">openat(AT_FDCWD, &quot;/proc/self/auxv&quot;, O_RDONLY) = 3</span><br><span class="line">openat(AT_FDCWD, &quot;/proc/sys/kernel/osrelease&quot;, O_RDONLY) = 3</span><br><span class="line">openat(AT_FDCWD, &quot;/proc/self/auxv&quot;, O_RDONLY) = 3</span><br><span class="line">openat(AT_FDCWD, &quot;/proc/sys/kernel/osrelease&quot;, O_RDONLY) = 3</span><br><span class="line">openat(AT_FDCWD, &quot;/proc/meminfo&quot;, O_RDONLY) = 3</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">free -h</span></span><br><span class="line">              total        used        free      shared  buff/cache   available</span><br><span class="line">Mem:            62G        9.8G         11G        3.0M         40G         52G</span><br><span class="line">Swap:            0B          0B          
0B</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">head</span> /proc/meminfo</span> </span><br><span class="line">MemTotal:       65806560 kB</span><br><span class="line">MemFree:        12365980 kB</span><br><span class="line">MemAvailable:   54861520 kB</span><br><span class="line">Buffers:            2104 kB</span><br><span class="line">Cached:         40519256 kB</span><br></pre></td></tr></table></figure><p>The same applies to <code>/proc/cpuinfo</code>, <code>/proc/mounts</code>, and the <code>/sys/</code> directory.</p><h3 id="引发事故"><a href="#引发事故" class="headerlink" title="引发事故"></a>Causing an incident</h3><p>An early standalone environment-check script called <code>sysctl -p</code> in order to check certain sysctl parameters. One day colleague A ran the check in an important customer's production environment; the environment went down right afterwards, and the customer angrily demanded an immediate fix. The cause turned out to be that the customer had set <code>net.ipv4.ip_forward = 0</code> in <code>/etc/sysctl.conf</code> to disable forwarding, and <code>sysctl -p</code> applied that file to the running kernel. The check only needs to look at the relevant kernel parameters, which can simply be read from (and written to) the <code>/proc</code> directory:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /proc/sys/net/ipv4/ip_forward</span></span><br><span class="line">1</span><br></pre></td></tr></table></figure><h3 id="没考虑到的场景"><a href="#没考虑到的场景" class="headerlink" title="没考虑到的场景"></a>Scenarios nobody anticipated</h3><p>Because most SSL certificate tutorials online generate certificates with <code>openssl rsa</code>, certificate validation used <code>openssl rsa</code> to check whether the public key in the crt matches the public key derived from the key file:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">check_crt = subprocess.Popen(</span><br><span class="line">    <span class="string">&quot;openssl x509 -pubkey -noout -in &#123;0&#125;&quot;</span>.<span class="built_in">format</span>(crt_path),</span><br><span class="line">    shell=<span class="literal">True</span>,</span><br><span class="line">    stdout=subprocess.PIPE,</span><br><span class="line">)</span><br><span class="line">check_crt.wait()</span><br><span class="line">check_key = subprocess.Popen(</span><br><span class="line">    <span class="string">&quot;openssl rsa -pubout  -in &#123;0&#125;&quot;</span>.<span class="built_in">format</span>(key_path),</span><br><span class="line">    shell=<span class="literal">True</span>,</span><br><span class="line">    stdout=subprocess.PIPE,</span><br><span class="line">)</span><br><span class="line">check_key.wait()</span><br><span class="line"><span class="keyword">if</span> check_crt.returncode == <span class="number">0</span>:</span><br><span class="line">    check_crt_result = check_crt.stdout.read()</span><br><span class="line"><span class="keyword">else</span>:</span><br><span class="line">    check_crt_result = <span class="literal">False</span></span><br><span class="line"><span class="keyword">if</span> check_key.returncode == <span class="number">0</span>:</span><br><span class="line">    check_key_result = check_key.stdout.read()</span><br><span class="line"><span class="keyword">else</span>:</span><br><span class="line">    check_key_result = <span class="literal">False</span></span><br></pre></td></tr></table></figure><p>Then a customer's certificate turned out to be generated with ECDSA (faster to generate and smaller). The on-site engineer tested it in an nginx container and it worked fine, so as a stopgap the check was changed to <code>openssl pkey -pubout -in</code>. If we stick with calling commands, are we supposed to loop through every openssl subcommand?</p><figure class="highlight 
python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> cryptography <span class="keyword">import</span> x509</span><br><span class="line"><span class="keyword">from</span> cryptography.hazmat.primitives.asymmetric <span class="keyword">import</span> rsa, ec</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">cert_key_type</span>(<span class="params">cert_path: <span class="built_in">str</span></span>) -&gt; <span class="built_in">str</span>:</span><br><span class="line">    <span class="keyword">with</span> <span class="built_in">open</span>(cert_path, <span class="string">&quot;rb&quot;</span>) <span class="keyword">as</span> f:</span><br><span class="line">        data = f.read()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">try</span>:</span><br><span class="line">        cert = x509.load_pem_x509_certificate(data)</span><br><span class="line">    <span class="keyword">except</span> ValueError:</span><br><span class="line">        cert = x509.load_der_x509_certificate(data)</span><br><span class="line"></span><br><span class="line">    pubkey = cert.public_key()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> <span 
class="built_in">isinstance</span>(pubkey, rsa.RSAPublicKey):</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;RSA&quot;</span></span><br><span class="line">    <span class="keyword">elif</span> <span class="built_in">isinstance</span>(pubkey, ec.EllipticCurvePublicKey):</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;ECDSA&quot;</span></span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">type</span>(pubkey).__name__</span><br></pre></td></tr></table></figure><p>The <code>cryptography</code> library can also generate SSL certificates, without the temporary files and repeated command invocations that openssl requires.</p><h3 id="参数废弃"><a href="#参数废弃" class="headerlink" title="参数废弃"></a>Deprecated options</h3><p>A deprecated command option only shows up at runtime, whereas with a library the error appears at lint time or at import, so the problem is not discovered late.</p><h2 id="一些-python-替代"><a href="#一些-python-替代" class="headerlink" title="一些 python 替代"></a>Some Python replacements</h2><h3 id="一些文件操作"><a href="#一些文件操作" class="headerlink" title="一些文件操作"></a>File operations</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">shutil.copy</span><br><span class="line">shutil.copyfile</span><br><span class="line">shutil.rmtree</span><br></pre></td></tr></table></figure><p>Wildcard file matching:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">glob.glob(<span class="string">f&quot;<span class="subst">&#123;IPVS_DIRS&#125;</span>/*.ipvs.conf&quot;</span>)</span><br></pre></td></tr></table></figure><p>Path matching similar to find:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">tag_dirs = [<span class="built_in">str</span>(p) <span 
class="keyword">for</span> p <span class="keyword">in</span> Path(registry_dir).rglob(<span class="string">&quot;*/_manifests/tags/*&quot;</span>)]</span><br></pre></td></tr></table></figure><h3 id="一些-sdk-经验"><a href="#一些-sdk-经验" class="headerlink" title="一些 sdk 经验"></a>Some SDK experience</h3><p>Open-source SDKs have their own loggers and may print output at import time; to silence them, set the level before the import.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">logging.getLogger(<span class="string">&#x27;docker&#x27;</span>).setLevel(logging.CRITICAL)</span><br><span class="line">logging.getLogger(<span class="string">&#x27;salt&#x27;</span>).setLevel(logging.CRITICAL)</span><br></pre></td></tr></table></figure><p>ansible runner keeps its tmp directory by default; this can be handled as follows:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span 
class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span 
class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="keyword">import</span> os</span><br><span class="line"><span class="keyword">import</span> ansible_runner</span><br><span class="line"><span class="keyword">import</span> logging</span><br><span class="line"><span class="keyword">import</span> shutil</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># Default artifacts handler: delete everything above artifacts/ (the runner's tmp directory)</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">default_artifacts_handler</span>(<span class="params">artifacts_dir</span>):</span><br><span class="line">    shutil.rmtree(artifacts_dir.split(<span class="string">&quot;artifacts/&quot;</span>)[<span class="number">0</span>], ignore_errors=<span class="literal">True</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Default event handler: log every non-empty stdout line</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">default_event_handler</span>(<span class="params">event</span>):</span><br><span class="line">    line = event.get(<span class="string">&quot;stdout&quot;</span>, <span class="string">&quot;&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> line.strip():</span><br><span class="line">        logging.info(line.strip())</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">run_ansible_playbook</span>(<span class="params"></span></span><br><span class="line"><span class="params">    playbook,</span></span><br><span class="line"><span class="params">    inventory=<span 
class="literal">None</span>,</span></span><br><span class="line"><span class="params">    user=<span class="literal">None</span>,</span></span><br><span class="line"><span class="params">    private_key=<span class="literal">None</span>,</span></span><br><span class="line"><span class="params">    limit_hosts=<span class="literal">None</span>,</span></span><br><span class="line"><span class="params">    extra_cmdline=<span class="string">&quot;&quot;</span>,</span></span><br><span class="line"><span class="params">    base_path=<span class="literal">None</span>,</span></span><br><span class="line"><span class="params">    become=<span class="literal">True</span>,</span></span><br><span class="line"><span class="params">    become_method=<span class="string">&quot;sudo&quot;</span>,</span></span><br><span class="line"><span class="params">    suppress_output=<span class="literal">True</span>,</span></span><br><span class="line"><span class="params">    artifacts_handler=<span class="literal">None</span>,</span></span><br><span class="line"><span class="params">    event_handler=<span class="literal">None</span>,</span></span><br><span class="line"><span class="params"></span>):</span><br><span class="line">    <span class="string">&quot;&quot;&quot;</span></span><br><span class="line"><span class="string">    Wrapper for running an Ansible playbook</span></span><br><span class="line"><span class="string">    </span></span><br><span class="line"><span class="string">    Args:</span></span><br><span class="line"><span class="string">        playbook (str): path to the playbook file</span></span><br><span class="line"><span class="string">        inventory (str|list): inventory path, either a single path or a list of paths</span></span><br><span class="line"><span class="string">        user (str): SSH user name, defaults to the USER environment variable</span></span><br><span class="line"><span class="string">        private_key (str): path to the SSH private key, defaults to /root/.ssh/id_rsa</span></span><br><span class="line"><span class="string">        limit_hosts (list): 
hosts to limit the run to</span></span><br><span class="line"><span class="string">        extra_cmdline (str): extra command-line arguments</span></span><br><span class="line"><span class="string">        base_path (str): base path</span></span><br><span class="line"><span class="string">        become (bool): whether to use become</span></span><br><span class="line"><span class="string">        become_method (str): become method</span></span><br><span class="line"><span class="string">        suppress_output (bool): whether to suppress ansible output</span></span><br><span class="line"><span class="string">        artifacts_handler (callable): custom artifacts handler</span></span><br><span class="line"><span class="string">        event_handler (callable): custom event handler</span></span><br><span class="line"><span class="string">    </span></span><br><span class="line"><span class="string">    Returns:</span></span><br><span class="line"><span class="string">        the ansible_runner run result</span></span><br><span class="line"><span class="string">    &quot;&quot;&quot;</span></span><br><span class="line">    </span><br><span class="line">    artifacts_handler = artifacts_handler <span class="keyword">or</span> default_artifacts_handler</span><br><span class="line">    event_handler = event_handler <span class="keyword">or</span> default_event_handler</span><br><span class="line">    private_key = private_key <span class="keyword">or</span> <span class="string">&quot;/root/.ssh/id_rsa&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment"># Build the ansible command-line arguments</span></span><br><span class="line">    cmdline_parts = []</span><br><span class="line">    </span><br><span class="line">    <span class="comment"># User name</span></span><br><span class="line">    user = user <span class="keyword">or</span> os.getenv(<span class="string">&quot;USER&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> user:</span><br><span class="line">        cmdline_parts.append(<span class="string">f&quot;-u <span 
class="subst">&#123;user&#125;</span>&quot;</span>)</span><br><span class="line">    </span><br><span class="line">    <span class="comment"># Private key</span></span><br><span class="line">    <span class="keyword">if</span> private_key:</span><br><span class="line">        cmdline_parts.append(<span class="string">f&quot;--private-key=<span class="subst">&#123;private_key&#125;</span>&quot;</span>)</span><br><span class="line">    </span><br><span class="line">    <span class="comment"># become</span></span><br><span class="line">    <span class="keyword">if</span> become:</span><br><span class="line">        cmdline_parts.append(<span class="string">&quot;-b&quot;</span>)</span><br><span class="line">        <span class="keyword">if</span> become_method:</span><br><span class="line">            cmdline_parts.append(<span class="string">f&quot;--become-method=<span class="subst">&#123;become_method&#125;</span>&quot;</span>)</span><br><span class="line">    </span><br><span class="line">    <span class="comment"># Limit hosts</span></span><br><span class="line">    <span class="keyword">if</span> limit_hosts:</span><br><span class="line">        <span class="keyword">if</span> <span class="built_in">isinstance</span>(limit_hosts, <span class="built_in">list</span>):</span><br><span class="line">            cmdline_parts.append(<span class="string">f&quot;--limit=<span class="subst">&#123;<span class="string">&#x27;,&#x27;</span>.join(limit_hosts)&#125;</span>&quot;</span>)</span><br><span class="line">        <span class="keyword">else</span>:</span><br><span class="line">            cmdline_parts.append(<span class="string">f&quot;--limit=<span class="subst">&#123;limit_hosts&#125;</span>&quot;</span>)</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">if</span> extra_cmdline:</span><br><span class="line">        cmdline_parts.append(extra_cmdline)</span><br><span class="line">    </span><br><span class="line">    ansible_cmdline = <span 
class="string">&quot; &quot;</span>.join(cmdline_parts)</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">isinstance</span>(inventory, <span class="built_in">str</span>):</span><br><span class="line">        inventory_list = [inventory]</span><br><span class="line">    <span class="keyword">elif</span> <span class="built_in">isinstance</span>(inventory, <span class="built_in">list</span>):</span><br><span class="line">        inventory_list = inventory</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        inventory_list = []</span><br><span class="line"></span><br><span class="line">    <span class="comment"># Run ansible</span></span><br><span class="line">    r = ansible_runner.run(</span><br><span class="line">        inventory=inventory_list,</span><br><span class="line">        playbook=playbook,</span><br><span class="line">        cmdline=ansible_cmdline,</span><br><span class="line">        artifacts_handler=artifacts_handler,</span><br><span class="line">        event_handler=event_handler,</span><br><span class="line">        settings=&#123;<span class="string">&quot;suppress_ansible_output&quot;</span>: suppress_output&#125;,</span><br><span class="line">    )</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">return</span> r</span><br></pre></td></tr></table></figure><p>For <code>salt -N xxx module.name args</code>, reading <code>cat $(which salt)</code> leads to the underlying source, and the same call can be made through <code>salt.client</code>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> salt.client</span><br><span class="line">salt_client = salt.client.LocalClient()</span><br><span 
class="line"><span class="comment"># result = &#123;&#x27;10.xx.xx.xxx&#x27;: True, &#x27;10.xxx.xx.xxx&#x27;: &#x27;Minion did not return. xxxx&#x27;&#125;</span></span><br><span class="line">result = salt_client.cmd(tgt=<span class="string">&#x27;*&#x27;</span>, fun=<span class="string">&#x27;pillar.items&#x27;</span>, arg=[<span class="string">&quot;data_dir&quot;</span>], tgt_type=<span class="string">&quot;glob&quot;</span>)</span><br><span class="line"><span class="built_in">print</span>(result)</span><br></pre></td></tr></table></figure>]]></content>
    
    
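    <summary type="html">&lt;p&gt;A summary of several cases where shelling out to commands for speed and convenience caused problems in private (on-premises) deployments.&lt;/p&gt;</summary>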
    
    
    
    
    <category term="linux" scheme="http://zhangguanzhang.github.io/tags/linux/"/>
    
    <category term="python" scheme="http://zhangguanzhang.github.io/tags/python/"/>
    
    <category term="shell" scheme="http://zhangguanzhang.github.io/tags/shell/"/>
    
  </entry>
  
  <entry>
    <title>Restarting a non-host-network container in docker breaks DNS: analysis and PR fix</title>
    <link href="http://zhangguanzhang.github.io/2025/11/12/docker-sandbox-dns/"/>
    <id>http://zhangguanzhang.github.io/2025/11/12/docker-sandbox-dns/</id>
    <published>2025-11-12T10:30:30.000Z</published>
    <updated>2025-11-12T10:30:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>An analysis of the DNS breakage caused by restarting a non-host-network container in docker, and the PR that fixes it.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Our internal products target toB and toG customers. toG environments are fully air-gapped, so QA configures a fake DNS server on every node of the K8S cluster and tests features as a black box, to catch anything that breaks because a service tries to reach the public network. Some newer services do depend on the public network, so once the offline part is done QA configures a real DNS server and tests those. Several times, after the node DNS was changed, the DNS configuration inside a few Pods was no longer the k8s one and instead looked like the host's, as below:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"># Generated by Docker Engine.</span><br><span class="line"># This file can be edited; Docker Engine will not make further changes once it</span><br><span class="line"># has been modified.</span><br><span class="line"></span><br><span class="line">nameserver 10.xx.xx.xxx</span><br><span class="line"></span><br><span class="line"># Based on host file: &#x27;/run/systemd/resolve/resolv.conf&#x27; (legacy)</span><br><span class="line"># Overrides: []</span><br></pre></td></tr></table></figure><p>We run k8s together with cri-dockerd.</p><h2 id="过程"><a href="#过程" class="headerlink" title="过程"></a>Investigation</h2><p>This had been reported several times, but nobody had a reliable way to reproduce it. I skimmed the relevant docker source and asked QA not to delete the Pod or its containers the next time it happened in a development environment, and to call me instead. On the afternoon of 2025&#x2F;11&#x2F;11 they reported a reliable reproduction. We have a dashboard, and a developer had used it in the test environment to restart the containers of his Pod. The Pod has a single entry in <code>spec.containers</code>. The dashboard lists containers the way <code>docker ps -a</code> does, and he had selected both the <code>/pause</code> container and his own container before clicking restart. I did the same on the dashboard and the problem did occur.</p><h3 id="日志"><a href="#日志" class="headerlink" title="日志"></a>Logs</h3><p>On the node running the container:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl get pod -o wide -A | grep ai-aixxxxapi</span><br><span class="line">default       ai-aixxxxapi-6698f696d8-5w8kw                                1/1     Running            1 (27m ago)      35m     10.187.x.34    10.1x.5x.251   &lt;none&gt;           &lt;none&gt;</span><br><span class="line">$ ip r g 1</span><br><span class="line">1.0.0.0 via 10.1x.5x.1 dev eth0 src 10.1x.5x.251 uid 0 </span><br><span class="line">    cache </span><br><span class="line">$ docker ps -a | grep  ai-aixxxxapi-6698f696d8-5w8kw </span><br><span class="line">8b4ae64cecc0   reg.xxx.lan:5000/xxx/ai-aixxxxapi                                                <span class="string">&quot;/usr/local/bin/star…&quot;</span>   27 minutes ago      Up 27 minutes                         k8s_ai-aixxxxapi_ai-aixxxxapi-6698f696d8-5w8kw_default_79fc032c-e1d3-4603-b21f-e5400ac6e3b6_1</span><br><span class="line">e8eb18aaca94   reg.xxx.lan:5000/xxx/pause:3.9                                                   <span class="string">&quot;/pause&quot;</span>                 27 minutes ago      Up 27 minutes                         k8s_POD_ai-aixxxxapi-6698f696d8-5w8kw_default_79fc032c-e1d3-4603-b21f-e5400ac6e3b6_1</span><br><span 
class="line">96ea991f9bb3   reg.xxx.lan:5000/xxx/ai-aixxxxapi                                                <span class="string">&quot;/usr/local/bin/star…&quot;</span>   34 minutes ago      Exited (2) 27 minutes ago             k8s_ai-aixxxxapi_ai-aixxxxapi-6698f696d8-5w8kw_default_79fc032c-e1d3-4603-b21f-e5400ac6e3b6_0</span><br><span class="line">e4c64897be98   reg.xxx.lan:5000/xxx/pause:3.9                                                   <span class="string">&quot;/pause&quot;</span>                 35 minutes ago      Exited (0) 27 minutes ago             k8s_POD_ai-aixxxxapi-6698f696d8-5w8kw_default_79fc032c-e1d3-4603-b21f-e5400ac6e3b6_0</span><br><span class="line">$ docker inspect 96ea991f9bb3 | grep Resolv</span><br><span class="line">        <span class="string">&quot;ResolvConfPath&quot;</span>: <span class="string">&quot;/data/kube/docker/containers/e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb/resolv.conf&quot;</span>,</span><br><span class="line">$ <span class="built_in">cat</span> /data/kube/docker/containers/e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb/resolv.conf</span><br><span class="line"><span class="comment"># Generated by Docker Engine.</span></span><br><span class="line"><span class="comment"># This file can be edited; Docker Engine will not make further changes once it</span></span><br><span class="line"><span class="comment"># has been modified.</span></span><br><span class="line"></span><br><span class="line">nameserver 10.xx.41.103</span><br><span class="line"></span><br><span class="line"><span class="comment"># Based on host file: &#x27;/run/systemd/resolve/resolv.conf&#x27; (legacy)</span></span><br><span class="line"><span class="comment"># Overrides: []</span></span><br><span class="line">$ <span class="built_in">stat</span> /data/kube/docker/containers/e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb/resolv.conf</span><br><span class="line">  File: 
/data/kube/docker/containers/e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb/resolv.conf</span><br><span class="line">  Size: 238       Blocks: 8          IO Block: 4096   regular file</span><br><span class="line">Device: 811h/2065d    Inode: 32449846    Links: 1</span><br><span class="line">Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)</span><br><span class="line">Access: 2025-11-11 15:52:45.692564762 +0800</span><br><span class="line">Modify: 2025-11-11 15:52:45.648562824 +0800</span><br><span class="line">Change: 2025-11-11 15:52:45.655896480 +0800</span><br><span class="line"> Birth: -</span><br></pre></td></tr></table></figure><p>Search the docker logs for container ID e4c64897be98:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">journalctl -xe --no-pager -u docker | grep -P <span class="string">&#x27;96ea991f9bb3|e4c64897be98&#x27;</span></span></span><br><span class="line">Nov 11 15:52:45 ubuntu2004chenxxxx7YX6V70 dockerd[3974]: time=&quot;2025-11-11T15:52:45.083407489+08:00&quot; level=warning msg=&quot;cleaning up after shim disconnected&quot; id=96ea991f9bb3454bec28712cb91c97f684e8f113e1b45697244190347a8c8305 namespace=moby</span><br><span class="line">Nov 11 15:52:45 ubuntu2004chenxxxx7YX6V70 dockerd[3974]: time=&quot;2025-11-11T15:52:45.587632737+08:00&quot; level=warning msg=&quot;cleaning up after shim disconnected&quot; id=e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb namespace=moby</span><br><span class="line">Nov 11 15:53:10 ubuntu2004chenxxxx7YX6V70 dockerd[3974]: time=&quot;2025-11-11T15:53:10.794469549+08:00&quot; level=warning msg=&quot;cleaning up after shim disconnected&quot; 
id=96ea991f9bb3454bec28712cb91c97f684e8f113e1b45697244190347a8c8305 namespace=moby</span><br><span class="line">Nov 11 15:53:11 ubuntu2004chenxxxx7YX6V70 dockerd[3974]: time=&quot;2025-11-11T15:53:11.417476169+08:00&quot; level=warning msg=&quot;cleaning up after shim disconnected&quot; id=e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb namespace=moby</span><br></pre></td></tr></table></figure><p>The container's timestamps:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker inspect e4c</span></span><br><span class="line">[</span><br><span class="line">    &#123;</span><br><span class="line">        &quot;Id&quot;: &quot;e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb&quot;,</span><br><span class="line">        &quot;Created&quot;: &quot;2025-11-11T07:44:57.093893019Z&quot;, 👈 creation time (UTC, i.e. 15:44:57 +08:00)</span><br><span class="line">        &quot;Path&quot;: &quot;/pause&quot;,</span><br><span class="line">        &quot;Args&quot;: [],</span><br><span class="line">        &quot;State&quot;: &#123;</span><br><span class="line">            &quot;Status&quot;: &quot;exited&quot;,</span><br><span class="line">            &quot;Running&quot;: false,</span><br><span class="line">            &quot;Paused&quot;: 
false,</span><br><span class="line">            &quot;Restarting&quot;: false,</span><br><span class="line">            &quot;OOMKilled&quot;: false,</span><br><span class="line">            &quot;Dead&quot;: false,</span><br><span class="line">            &quot;Pid&quot;: 0,</span><br><span class="line">            &quot;ExitCode&quot;: 0,</span><br><span class="line">            &quot;Error&quot;: &quot;&quot;,</span><br><span class="line">            &quot;StartedAt&quot;: &quot;2025-11-11T07:52:45.856187062Z&quot;,</span><br><span class="line">            &quot;FinishedAt&quot;: &quot;2025-11-11T07:53:11.408195153Z&quot;</span><br><span class="line">        &#125;,</span><br></pre></td></tr></table></figure><p>And the cri-dockerd logs:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">journalctl -xe --no-pager -u cri-dockerd | grep e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb</span></span><br><span class="line">Nov 11 15:45:03 ubuntu2004chenxxxx7YX6V70 cri-dockerd[51041]: time=&quot;2025-11-11T15:45:03+08:00&quot; level=info msg=&quot;Will attempt to re-write config file /data/kube/docker/containers/e4c64897be9891d88b999e81bfd55bb0cc1c21d626708749691d43158062f2bb/resolv.conf as [nameserver 10.186.0.2 search default.svc.cluster1.local. svc.cluster1.local. cluster1.local. 
options ndots:5]&quot;</span><br></pre></td></tr></table></figure><p>Timeline reconstructed from the logs above (local time, +08:00):</p><ol><li><code>15:44:57</code> the <code>/pause</code> container was created</li><li><code>15:45:03</code> after pulling the business image, cri-dockerd created the sandbox container and re-wrote the container's <code>resolv.conf</code>; search the source for <code>Will attempt to re-write</code> to find this logic</li><li><code>15:52:45</code> the containers were restarted</li><li>The container's <code>resolv.conf</code> changed at that moment, judging by its mtime (<code>15:52:45</code>)</li></ol><h3 id="最小复现"><a href="#最小复现" class="headerlink" title="最小复现"></a>Minimal reproduction</h3><p>The backend restart logic was written by a colleague. As far as I remember, it restarts containers through the Python docker client. So I bisected: if a plain <code>docker restart</code> reproduces the problem, the dashboard backend is not to blame. It reproduced in that environment, and again on a clean K8S cluster I set up myself.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> testpod.yml</span></span><br><span class="line">apiVersion: v1</span><br><span class="line">kind: Pod</span><br><span class="line">metadata:</span><br><span class="line">  name: testpod</span><br><span class="line">spec:</span><br><span class="line">  containers:</span><br><span class="line">  - name: testpod</span><br><span class="line">    image: m.daocloud.io/docker.io/library/nginx:latest</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash"> nodeName: xxx</span></span><br></pre></td></tr></table></figure><p>Mine is a single-node cluster; to pin the Pod to a specific node, set <code>nodeName</code>. Reproducing:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a | grep testpod</span></span><br><span class="line">73b346e11ef4   m.daocloud.io/docker.io/library/nginx                                            &quot;/docker-entrypoint.…&quot;   3 minutes ago       Up 3 minutes                          k8s_vulnerable-container_testpod_default_f8215913-32b2-4e18-8536-69e5ecce7c84_0</span><br><span class="line">ad255b51f1b3   reg.xxx.lan:5000/xxx/pause:3.9                                                   &quot;/pause&quot;                 3 minutes ago       Up 3 minutes                          k8s_POD_testpod_default_f8215913-32b2-4e18-8536-69e5ecce7c84_0</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker inspect 73b346e11ef4 | grep ResolvConfPath</span></span><br><span class="line">        &quot;ResolvConfPath&quot;: &quot;/data/kube/docker/containers/ad255b51f1b396fdea0d2579b373ac5c497fbd707031fa53936e805ac1b30cc9/resolv.conf&quot;,</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/kube/docker/containers/ad255b51f1b396fdea0d2579b373ac5c497fbd707031fa53936e805ac1b30cc9/resolv.conf</span></span><br><span class="line">nameserver 10.186.0.2</span><br><span class="line">search default.svc.cluster1.local. svc.cluster1.local. 
cluster1.local.</span><br><span class="line">options ndots:5</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker restart 73b346e11ef4 ad255b51f1b3</span></span><br><span class="line">73b346e11ef4</span><br><span class="line">ad255b51f1b3</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/kube/docker/containers/ad255b51f1b396fdea0d2579b373ac5c497fbd707031fa53936e805ac1b30cc9/resolv.conf</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Generated by Docker Engine.</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">This file can be edited; Docker Engine will not make further changes once it</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">has been modified.</span></span><br><span class="line"></span><br><span class="line">nameserver 10.x3.41.103</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Based on host file: <span class="string">&#x27;/run/systemd/resolve/resolv.conf&#x27;</span> (legacy)</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Overrides: []</span></span><br></pre></td></tr></table></figure><p>When a Pod is created, the container runtime first creates a &#x2F;pause container, and the containers in <code>pod.spec.containers</code> then join the <code>/pause</code> container. Under docker, a container's hosts, resolv.conf and hostname are handled in a separate init layer that creates these files, and every container in the Pod shares the same copy:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a | grep testpod</span></span><br><span class="line">f6ef003b4864   xxx</span><br><span class="line">340e173d875b   xxxx</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker inspect 340e173d875b | grep ResolvConfPath</span></span><br><span class="line">        &quot;ResolvConfPath&quot;: &quot;/data/kube/docker/containers/340e173d875b00b6aca32f8770493b9f1d86159340bcea6bc01b93992763bab7/resolv.conf&quot;,</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker inspect f6ef003b4864 | grep ResolvConfPath</span></span><br><span class="line">        &quot;ResolvConfPath&quot;: &quot;/data/kube/docker/containers/340e173d875b00b6aca32f8770493b9f1d86159340bcea6bc01b93992763bab7/resolv.conf&quot;,</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/kube/docker/containers/340e173d875b00b6aca32f8770493b9f1d86159340bcea6bc01b93992763bab7/resolv.conf</span></span><br><span class="line">nameserver 10.186.0.2</span><br><span class="line">search default.svc.cluster2.local. svc.cluster2.local. 
cluster2.local.</span><br><span class="line">options ndots:5</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span> -1 /data/kube/docker/containers/340e173d875b00b6aca32f8770493b9f1d86159340bcea6bc01b93992763bab7/</span></span><br><span class="line">340e173d875b00b6aca32f8770493b9f1d86159340bcea6bc01b93992763bab7-json.log</span><br><span class="line">checkpoints</span><br><span class="line">config.v2.json</span><br><span class="line">hostconfig.json</span><br><span class="line">hostname</span><br><span class="line">hosts</span><br><span class="line">mounts</span><br><span class="line">resolv.conf</span><br><span class="line">resolv.conf.hash</span><br></pre></td></tr></table></figure><p>I then found the minimal reproduction: restarting only the <code>/pause</code> container is enough to trigger the problem:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a | grep testpod</span></span><br><span class="line">ef9bb3c69dfa   reg.xxx.lan:5000/xxx/nginx                                                       &quot;/docker-entrypoint.…&quot;   2 minutes ago    Up 2 minutes                              
k8s_vulnerable-container_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_1</span><br><span class="line">e6868ab3d8ef   reg.xxx.lan:5000/xxx/pause:3.9                                                   &quot;/pause&quot;                  2 minutes ago    Up 2 minutes                              k8s_POD_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_1</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker inspect e6868ab3d8ef | grep ResolvConfPath</span></span><br><span class="line">        &quot;ResolvConfPath&quot;: &quot;/data/kube/docker/containers/e6868ab3d8ef8fa1238a82a15faa88b1d13967a71a1e16c99618663610d21286/resolv.conf&quot;,</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/kube/docker/containers/e6868ab3d8ef8fa1238a82a15faa88b1d13967a71a1e16c99618663610d21286/resolv.conf</span></span><br><span class="line">nameserver 10.186.0.2</span><br><span class="line">search default.svc.cluster2.local. svc.cluster2.local. 
cluster2.local.</span><br><span class="line">options ndots:5</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker restart e6868ab3d8ef</span></span><br><span class="line">e6868ab3d8ef</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/kube/docker/containers/e6868ab3d8ef8fa1238a82a15faa88b1d13967a71a1e16c99618663610d21286/resolv.conf</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Generated by Docker Engine.</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">This file can be edited; Docker Engine will not make further changes once it</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">has been modified.</span></span><br><span class="line"></span><br><span class="line">nameserver 223.5.5.5</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Based on host file: <span class="string">&#x27;/etc/resolv.conf&#x27;</span> (legacy)</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Overrides: []</span></span><br></pre></td></tr></table></figure><h3 id="相关源码"><a href="#相关源码" class="headerlink" title="相关源码"></a>Relevant source code</h3><p>Since the behavior clearly comes from Docker's own logic, the next step is to read the source for this part. The rewritten file carries a generated comment, so searching the code for the keyword <code>Generated by Docker Engine.</code> turns up:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/libnetwork/sandbox_dns_unix.go#L248-L278</span></span><br><span class="line"><span class="comment">// loadResolvConf reads the resolv.conf file at path, and merges in overrides for</span></span><br><span class="line"><span class="comment">// nameservers, options, and search domains.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(sb *Sandbox)</span></span> loadResolvConf(path <span class="type">string</span>) (*resolvconf.ResolvConf, <span class="type">error</span>) &#123;</span><br><span class="line">rc, err := resolvconf.Load(path)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &amp;&amp; !errors.Is(err, fs.ErrNotExist) &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Proceed with rc, which might be zero-valued if path does not exist.</span></span><br><span class="line"></span><br><span class="line">rc.SetHeader(<span class="string">`# Generated by Docker Engine.</span></span><br><span class="line"><span class="string"># This 
file can be edited; Docker Engine will not make further changes once it</span></span><br><span class="line"><span class="string"># has been modified.`</span>)</span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(sb.config.dnsList) &gt; <span class="number">0</span> &#123;</span><br><span class="line"><span class="keyword">var</span> dnsAddrs []netip.Addr</span><br><span class="line"><span class="keyword">for</span> _, ns := <span class="keyword">range</span> sb.config.dnsList &#123;</span><br><span class="line">addr, err := netip.ParseAddr(ns)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, errors.Wrapf(err, <span class="string">&quot;bad nameserver address %s&quot;</span>, ns)</span><br><span class="line">&#125;</span><br><span class="line">dnsAddrs = <span class="built_in">append</span>(dnsAddrs, addr)</span><br><span class="line">&#125;</span><br><span class="line">rc.OverrideNameServers(dnsAddrs)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(sb.config.dnsSearchList) &gt; <span class="number">0</span> &#123;</span><br><span class="line">rc.OverrideSearch(sb.config.dnsSearchList)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(sb.config.dnsOptionsList) &gt; <span class="number">0</span> &#123;</span><br><span class="line">rc.OverrideOptions(sb.config.dnsOptionsList)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> &amp;rc, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Working backwards, a search for <code>loadResolvConf(</code> shows that it is called from:</p><ul><li><code>func (sb *Sandbox) setupDNS() error 
{</code></li><li><code>func (sb *Sandbox) updateDNS(ipv6Enabled bool) error {</code></li><li><code>func (sb *Sandbox) rebuildDNS() error {</code></li></ul><h4 id="断点调试"><a href="#断点调试" class="headerlink" title="断点调试"></a>Debugging with breakpoints</h4><p>For dlv to debug a Go binary reliably, the binary has to be built with <code>-gcflags=all=-N -l</code>, which disables optimizations and inlining. Build dockerd with debugging enabled:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">DOCKER_DEBUG=1 ./hack/make.sh binary-daemon</span></span><br><span class="line"></span><br><span class="line">Removing bundles/</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">---&gt; </span><span class="language-bash">Making bundle: binary-daemon (<span class="keyword">in</span> bundles/binary-daemon)</span></span><br><span class="line">Building static bundles/binary-daemon/dockerd (linux/amd64)...</span><br><span class="line">+++ ./hack/with-go-mod.sh go build -mod=vendor -modfile=vendor.mod -o bundles/binary-daemon/dockerd -tags &#x27;netgo osusergo static_build &#x27; -ldflags &#x27; -X &quot;github.com/docker/docker/dockerversion.Version=dev&quot; -X &quot;github.com/docker/docker/dockerversion.GitCommit=de5c9cf0b96e4e172b96db54abababa4a328462f&quot; -X &quot;github.com/docker/docker/dockerversion.BuildTime=2025-11-11T10:42:16.000000000+00:00&quot; -X 
&quot;github.com/docker/docker/dockerversion.PlatformName=&quot; -X &quot;github.com/docker/docker/dockerversion.ProductName=&quot; -X &quot;github.com/docker/docker/dockerversion.DefaultProductLicense=&quot;  -extldflags -static &#x27; &#x27;-gcflags=all=-N -l&#x27; github.com/docker/docker/cmd/dockerd</span><br><span class="line">+ tee /root/github/moby/go.mod</span><br><span class="line">module github.com/docker/docker</span><br><span class="line"></span><br><span class="line">go 1.21</span><br><span class="line">+ trap &#x27;rm -f &quot;$&#123;ROOTDIR&#125;/go.mod&quot;&#x27; EXIT</span><br><span class="line">+ GO111MODULE=on</span><br><span class="line">+ GOTOOLCHAIN=local</span><br><span class="line">+ go build -mod=vendor -modfile=vendor.mod -o bundles/binary-daemon/dockerd -tags &#x27;netgo osusergo static_build &#x27; -ldflags &#x27; -X &quot;github.com/docker/docker/dockerversion.Version=dev&quot; -X &quot;github.com/docker/docker/dockerversion.GitCommit=de5c9cf0b96e4e172b96db54abababa4a328462f&quot; -X &quot;github.com/docker/docker/dockerversion.BuildTime=2025-11-11T10:42:16.000000000+00:00&quot; -X &quot;github.com/docker/docker/dockerversion.PlatformName=&quot; -X &quot;github.com/docker/docker/dockerversion.ProductName=&quot; -X &quot;github.com/docker/docker/dockerversion.DefaultProductLicense=&quot;  -extldflags -static &#x27; &#x27;-gcflags=all=-N -l&#x27; github.com/docker/docker/cmd/dockerd</span><br><span class="line">+ rm -f /root/github/moby/go.mod</span><br><span class="line">Created binary: bundles/binary-daemon/dockerd</span><br></pre></td></tr></table></figure><p>Swap in the new binary, start the daemon, and find its PID:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">systemctl stop docker</span><br><span 
class="line">d_dir=$(dirname $(which docker))</span><br><span class="line">cp $(which dockerd) $(which dockerd).bak</span><br><span class="line">cp bundles/binary-daemon/dockerd $&#123;d_dir&#125;</span><br><span class="line">systemctl start docker</span><br><span class="line">ps -ef | grep /docker[d]</span><br></pre></td></tr></table></figure><p>Then attach dlv to that PID. After setting a few breakpoints, the rewrite turned out to happen in <code>setupDNS()</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">go install github.com/go-delve/delve/cmd/dlv@master</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">dlv attach 19034</span></span><br><span class="line">Type &#x27;help&#x27; for list of commands.</span><br><span class="line">(dlv) b libnetwork/sandbox_dns_unix.go:285</span><br><span class="line">Breakpoint 1 set at 0x24c6c8b for github.com/docker/docker/libnetwork.(*Sandbox).setupDNS() ./libnetwork/sandbox_dns_unix.go:285</span><br><span class="line">(dlv) b libnetwork/sandbox_dns_unix.go:300</span><br><span class="line">Breakpoint 2 set at 0x24c6f52 for github.com/docker/docker/libnetwork.(*Sandbox).updateDNS() ./libnetwork/sandbox_dns_unix.go:300</span><br><span class="line">(dlv) c</span><br></pre></td></tr></table></figure><p>Next, restart the <code>/pause</code> container from another terminal, and dlv in this terminal stops at the breakpoint:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">(dlv) c</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">[Breakpoint 1] github.com/docker/docker/libnetwork.(*Sandbox).setupDNS() ./libnetwork/sandbox_dns_unix.go:285 (hits goroutine(2045):1 total:1) (PC: 0x24c6c8b)</span></span><br><span class="line">   280:// For a new sandbox, write an initial version of the container&#x27;s resolv.conf. It&#x27;ll</span><br><span class="line">   281:// be a copy of the host&#x27;s file, with overrides for nameservers, options and search</span><br><span class="line">   282:// domains applied.</span><br><span class="line">   283:func (sb *Sandbox) setupDNS() error &#123;</span><br><span class="line">   284:// Make sure the directory exists.</span><br><span class="line">=&gt; 285:sb.restoreResolvConfPath()</span><br><span class="line">   286:dir, _ := filepath.Split(sb.config.resolvConfPath)</span><br><span class="line">   287:if err := createBasePath(dir); err != nil &#123;</span><br><span class="line">   288:return err</span><br><span class="line">   289:&#125;</span><br><span class="line">   290:</span><br><span class="line">(dlv) p sb</span><br><span class="line">Sending output to pager...</span><br><span class="line">(&quot;*github.com/docker/docker/libnetwork.Sandbox&quot;)(0xc0028d2400)</span><br><span class="line">*github.com/docker/docker/libnetwork.Sandbox &#123;</span><br><span class="line">id: &quot;1695c2a715872e25bcaa9c0268eeea65e528fa3ec6c275dfbf567344cd2cb30c&quot;,</span><br><span class="line">containerID: 
&quot;176c492dc77f3ed8020c1d4e59896311fea359135be39c26c04d28460546cf38&quot;,</span><br><span class="line">config: github.com/docker/docker/libnetwork.containerConfig &#123;</span><br></pre></td></tr></table></figure><h4 id="分析"><a href="#分析" class="headerlink" title="分析"></a>Analysis</h4><p>A rough look at this part of Docker's logic:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/libnetwork/sandbox_dns_unix.go#L280C1-L296C2</span></span><br><span class="line"><span class="comment">// For a new sandbox, write an initial version of the container&#x27;s resolv.conf. 
It&#x27;ll</span></span><br><span class="line"><span class="comment">// be a copy of the host&#x27;s file, with overrides for nameservers, options and search</span></span><br><span class="line"><span class="comment">// domains applied.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(sb *Sandbox)</span></span> setupDNS() <span class="type">error</span> &#123;</span><br><span class="line"><span class="comment">// Make sure the directory exists.</span></span><br><span class="line">sb.restoreResolvConfPath()</span><br><span class="line">dir, _ := filepath.Split(sb.config.resolvConfPath)</span><br><span class="line"><span class="keyword">if</span> err := createBasePath(dir); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">rc, err := sb.loadResolvConf(sb.config.getOriginResolvConfPath())</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> rc.WriteFile(sb.config.resolvConfPath, sb.config.resolvConfHashFile, filePerm)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>sb.restoreResolvConfPath()</code> fills in the path variables, namely the actual <code>ResolvConfPath</code> and its <code>.hash</code> file:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(sb *Sandbox)</span></span> 
restoreResolvConfPath() &#123;</span><br><span class="line"><span class="keyword">if</span> sb.config.resolvConfPath == <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">sb.config.resolvConfPath = defaultPrefix + <span class="string">&quot;/&quot;</span> + sb.id + <span class="string">&quot;/resolv.conf&quot;</span></span><br><span class="line">&#125;</span><br><span class="line">sb.config.resolvConfHashFile = sb.config.resolvConfPath + <span class="string">&quot;.hash&quot;</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The <code>sb.loadResolvConf</code> method parses the contents of a Linux resolv.conf into a struct. The <code>sb.config.getOriginResolvConfPath()</code> value passed to it is the path of the host's DNS file:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">(dlv) n</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">github.com/docker/docker/libnetwork.(*containerConfig).getOriginResolvConfPath() ./libnetwork/sandbox_dns_unix.go:242 (PC: 0x24c64ab)</span></span><br><span class="line">   237:&#125;</span><br><span class="line">   238:&#125;</span><br><span class="line">   239:</span><br><span class="line">   240:func (c *containerConfig) getOriginResolvConfPath() string &#123;</span><br><span class="line">   241:if c.originResolvConfPath != &quot;&quot; &#123;</span><br><span class="line">=&gt; 242:return c.originResolvConfPath</span><br><span class="line">   
243:&#125;</span><br><span class="line">   244:// Fallback if not specified.</span><br><span class="line">   245:return resolvconf.Path()</span><br><span class="line">   246:&#125;</span><br><span class="line">   247:</span><br><span class="line">(dlv) p c.originResolvConfPath</span><br><span class="line">&quot;/etc/resolv.conf&quot;</span><br></pre></td></tr></table></figure><p>On a host that uses systemd-resolved, <code>originResolvConfPath</code> is <code>/run/systemd/resolve/resolv.conf</code> instead. Continuing the debugging session:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">(dlv) list</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">[Breakpoint 1] github.com/docker/docker/libnetwork.(*Sandbox).setupDNS() ./libnetwork/sandbox_dns_unix.go:291 (hits goroutine(52319):1 total:3) (PC: 0x24c6d7a)</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">github.com/docker/docker/libnetwork/internal/resolvconf.(*ResolvConf).WriteFile() ./libnetwork/internal/resolvconf/resolvconf.go:377 (PC: 0xfad513)</span></span><br><span class="line">   372:</span><br><span class="line">   373:// WriteFile generates content and writes it to path. 
If hashPath is non-zero, it</span><br><span class="line">   374:// also writes a file containing a hash of the content, to enable UserModified()</span><br><span class="line">   375:// to determine whether the file has been modified.</span><br><span class="line">   376:func (rc *ResolvConf) WriteFile(path, hashPath string, perm os.FileMode) error &#123;</span><br><span class="line">=&gt; 377:content, err := rc.Generate(true)</span><br><span class="line">   378:if err != nil &#123;</span><br><span class="line">   379:return err</span><br><span class="line">   380:&#125;</span><br><span class="line">   381:</span><br><span class="line">   382:// Write the resolv.conf file - it&#x27;s bind-mounted into the container, so can&#x27;t</span><br><span class="line">(dlv) p path</span><br><span class="line">&quot;/data/kube/docker/containers/c213ae91753573a17948caf3cfa6421045d11e7e269d57cbc129f784f188d99b/resolv.conf&quot;</span><br><span class="line">(dlv) p hashPath</span><br><span class="line">&quot;/data/kube/docker/containers/c213ae91753573a17948caf3cfa6421045d11e7e269d57cbc129f784f188d99b/resolv.conf.hash&quot;</span><br></pre></td></tr></table></figure><p>There is a hash file involved, so it looks like the <code>rc.WriteFile</code> call at the bottom of <code>setupDNS()</code> is what rewrites the content. Here is its logic:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/libnetwork/internal/resolvconf/resolvconf.go#L373-L402</span></span><br><span class="line"><span class="comment">// WriteFile generates content and writes it to path. If hashPath is non-zero, it</span></span><br><span class="line"><span class="comment">// also writes a file containing a hash of the content, to enable UserModified()</span></span><br><span class="line"><span class="comment">// to determine whether the file has been modified.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(rc *ResolvConf)</span></span> WriteFile(path, hashPath <span class="type">string</span>, perm os.FileMode) <span class="type">error</span> &#123;</span><br><span class="line">content, err := rc.Generate(<span class="literal">true</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Write the resolv.conf file - it&#x27;s bind-mounted into the container, so can&#x27;t</span></span><br><span class="line"><span class="comment">// move a temp file into place, just have to truncate and write it.</span></span><br><span class="line"><span class="keyword">if</span> err := os.WriteFile(path, content, perm); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> 
errdefs.System(err)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Write the hash file.</span></span><br><span class="line"><span class="keyword">if</span> hashPath != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">hashFile, err := ioutils.NewAtomicFileWriter(hashPath, perm)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> errdefs.System(err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">defer</span> hashFile.Close()</span><br><span class="line"></span><br><span class="line">digest := digest.FromBytes(content)</span><br><span class="line"><span class="keyword">if</span> _, err = hashFile.Write([]<span class="type">byte</span>(digest)); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here content is the generated text that is finally written to the file, and the hash of that same content is written to the <code>.hash</code> file. The idea is that when the Docker daemon finds that the hash computed from a container's <code>resolv.conf</code> no longer matches the one saved in <code>.hash</code>, the user must have modified the container's DNS settings, so the daemon stops changing the file.</p><p>That still leaves the callers further up from <code>setupDNS()</code>. A search leads to <code>setupResolutionFiles()</code>, which is called from two places:</p><ul><li><code>func (c *Controller) NewSandbox(</code></li><li><code>func (sb *Sandbox) Refresh(</code></li></ul><p>Set a breakpoint on both and see which one is hit:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">dlv attach 19035</span></span><br><span class="line">Type &#x27;help&#x27; for list of commands.</span><br><span class="line">(dlv) b libnetwork/sandbox.go:263</span><br><span class="line">Breakpoint 1 set at 0x24be2d8 for github.com/docker/docker/libnetwork.(*Sandbox).Refresh() ./libnetwork/sandbox.go:263</span><br><span class="line">(dlv) b libnetwork/controller.go:944</span><br><span class="line">Breakpoint 2 set at 0x2470d2f for github.com/docker/docker/libnetwork.(*Controller).NewSandbox() ./libnetwork/controller.go:944</span><br><span class="line">(dlv) c</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">[Breakpoint 2] github.com/docker/docker/libnetwork.(*Controller).NewSandbox() ./libnetwork/controller.go:944 (hits goroutine(58233):1 total:1) (PC: 0x2470d2f)</span></span><br><span class="line">   939:&#125;</span><br><span class="line">   940:c.mu.Unlock()</span><br><span class="line">   941:&#125;</span><br><span class="line">   942:&#125;()</span><br><span class="line">   943:</span><br><span class="line">=&gt; 944:if err := sb.setupResolutionFiles(); err != nil &#123;</span><br><span class="line">   945:return nil, err</span><br><span class="line">   946:&#125;</span><br><span class="line">   947:if err := c.setupOSLSandbox(sb); err != nil &#123;</span><br><span class="line">   948:return nil, err</span><br><span class="line">   949:&#125;</span><br></pre></td></tr></table></figure><p>Execution goes through 
<code>NewSandbox</code>. Print the backtrace:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">(dlv) bt</span><br><span class="line"> 0  0x0000000002470d2f in github.com/docker/docker/libnetwork.(*Controller).NewSandbox</span><br><span class="line">    at ./libnetwork/controller.go:944</span><br><span class="line"> 1  0x0000000002f855e5 in github.com/docker/docker/daemon.(*Daemon).connectToNetwork</span><br><span class="line">    at ./daemon/container_operations.go:762</span><br><span class="line"> 2  0x0000000002f82825 in github.com/docker/docker/daemon.(*Daemon).allocateNetwork</span><br><span class="line">    at ./daemon/container_operations.go:525</span><br><span class="line"> 3  0x0000000002f87d05 in github.com/docker/docker/daemon.(*Daemon).initializeNetworking</span><br><span class="line">    at ./daemon/container_operations.go:950</span><br><span class="line"> 4  0x000000000302d26b in github.com/docker/docker/daemon.(*Daemon).containerStart</span><br><span class="line">    at ./daemon/start.go:117</span><br><span class="line"> 5  0x0000000003028736 in github.com/docker/docker/daemon.(*Daemon).containerRestart</span><br><span class="line">    at ./daemon/restart.go:69</span><br><span class="line"> 6  0x00000000030281e5 in github.com/docker/docker/daemon.(*Daemon).ContainerRestart</span><br><span 
class="line">    at ./daemon/restart.go:24</span><br><span class="line"> 7  0x00000000028a9c15 in github.com/docker/docker/api/server/router/container.(*containerRouter).postContainersRestart</span><br><span class="line">    at ./api/server/router/container/container_routes.go:267</span><br><span class="line"> 8  0x00000000028b594c in github.com/docker/docker/api/server/router/container.(*containerRouter).postContainersRestart-fm</span><br><span class="line">    at &lt;autogenerated&gt;:1</span><br></pre></td></tr></table></figure><p>And here is the call stack of another, roughly related code path:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">0  0x000000000247da42 in github.com/docker/docker/libnetwork.(*Endpoint).sbJoin</span><br><span class="line">   at ./libnetwork/endpoint.go:529</span><br><span class="line">1  0x000000000247cc3a in github.com/docker/docker/libnetwork.(*Endpoint).Join</span><br><span class="line">   at ./libnetwork/endpoint.go:467</span><br><span class="line">2  0x0000000002f85ad1 in github.com/docker/docker/daemon.(*Daemon).connectToNetwork</span><br><span class="line">   at ./daemon/container_operations.go:780</span><br><span class="line">3  0x0000000002f82ac5 in github.com/docker/docker/daemon.(*Daemon).allocateNetwork</span><br><span class="line">   at ./daemon/container_operations.go:530</span><br><span class="line">4  0x0000000002f87fa5 in 
github.com/docker/docker/daemon.(*Daemon).initializeNetworking</span><br><span class="line">   at ./daemon/container_operations.go:955</span><br><span class="line">5  0x000000000302d50b in github.com/docker/docker/daemon.(*Daemon).containerStart</span><br><span class="line">   at ./daemon/start.go:117</span><br><span class="line">6  0x00000000030289d6 in github.com/docker/docker/daemon.(*Daemon).containerRestart</span><br><span class="line">   at ./daemon/restart.go:69</span><br><span class="line">7  0x0000000003028485 in github.com/docker/docker/daemon.(*Daemon).ContainerRestart</span><br><span class="line">   at ./daemon/restart.go:24</span><br><span class="line">8  0x00000000028a9c95 in github.com/docker/docker/api/server/router/container.(*containerRouter).postContainersRestart</span><br><span class="line">   at ./api/server/router/container/container_routes.go:267</span><br></pre></td></tr></table></figure><h4 id="sandbox-change"><a href="#sandbox-change" class="headerlink" title="sandbox change"></a>sandbox change</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">$ docker ps -a | grep testpod</span><br><span class="line">256cdbac0fc8   reg.xxx.lan:5000/xxx/nginx                                                       &quot;/docker-entrypoint.…&quot;   12 minutes ago      Up 12 minutes                             k8s_vulnerable-container_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_20</span><br><span class="line">2f79aa18d3f6   reg.xxx.lan:5000/xxx/pause:3.9                                                   &quot;/pause&quot;                  
12 minutes ago      Up 12 minutes                             k8s_POD_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_20</span><br><span class="line">$ docker inspect 2f79aa18d3f6 | grep SandboxI</span><br><span class="line">            &quot;SandboxID&quot;: &quot;08b8e08676b30c527f76ef551de3dff3fec048af4486b69f8eb0071b2af0a9cf&quot;,</span><br><span class="line">            &quot;SandboxKey&quot;: &quot;/var/run/docker/netns/08b8e08676b3&quot;,</span><br><span class="line">$ docker restart 2f79aa18d3f6</span><br><span class="line">2f79aa18d3f6</span><br><span class="line">$ docker inspect 2f79aa18d3f6 | grep SandboxI</span><br><span class="line">            &quot;SandboxID&quot;: &quot;c87e60bef5f4d95feb723a1ec8818bf2021c900fa5974e0e2107189ab90661ba&quot;,</span><br><span class="line">            &quot;SandboxKey&quot;: &quot;/var/run/docker/netns/c87e60bef5f4&quot;,</span><br></pre></td></tr></table></figure><p>The <code>SandboxID</code> and <code>SandboxKey</code> both changed after the restart: a new sandbox was created, and that is what caused the DNS configuration to be rebuilt.</p><h4 id="containerd-没复现"><a href="#containerd-没复现" class="headerlink" title="containerd 没复现"></a>Not reproducible with containerd</h4><p>I also asked someone to try the same thing with containerd + nerdctl, and there was no problem:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">nerdctl -n k8s.io ps -a | grep zgz</span></span><br><span class="line">d79ac72ea7a6    dockerhub.xxxxxxwan.cn/pub/nginx:latest                                                        &quot;/docker-entrypoint.…&quot;    57 seconds ago        Up                  k8s://default/test-zgz/nginx</span><br><span class="line">a2f547c59e8f    dockerhub.xxxxxxwan.cn/pub/pause:3.9                                                           &quot;/pause&quot;                  About a minute ago    Up                  k8s://default/test-zgz</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">nerdctl -n k8s.io inspect a2f547c59e8f | grep Resolv</span></span><br><span class="line">        &quot;ResolvConfPath&quot;: &quot;/data/lv/lib/io.containerd.grpc.v1.cri/sandboxes/a2f547c59e8fc1d0dd1b6e5a9e35232211e9d4ce811a9101c45ed4d6bc9cf343/resolv.conf&quot;,</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/lv/lib/io.containerd.grpc.v1.cri/sandboxes/a2f547c59e8fc1d0dd1b6e5a9e35232211e9d4ce811a9101c45ed4d6bc9cf343/resolv.conf</span></span><br><span class="line">search default.svc.cluster.local svc.cluster.local cluster.local</span><br><span class="line">nameserver 169.254.25.10</span><br><span class="line">options ndots:5</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">nerdctl -n k8s.io restart a2f547c59e8f</span></span><br><span class="line">a2f547c59e8f</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">nerdctl -n k8s.io 
ps -a | grep zgz</span></span><br><span class="line">13be2f142023    dockerhub.xxxxxxwan.cn/pub/nginx:latest                                                        &quot;/docker-entrypoint.…&quot;    Less than a second ago    Up                  k8s://default/test-zgz/nginx</span><br><span class="line">cabe19fd2fd7    dockerhub.xxxxxxwan.cn/pub/pause:3.9                                                           &quot;/pause&quot;                  1 second ago              Up                  k8s://default/test-zgz</span><br><span class="line">d79ac72ea7a6    dockerhub.xxxxxxwan.cn/pub/nginx:latest                                                        &quot;/docker-entrypoint.…&quot;    2 minutes ago             Created             k8s://default/test-zgz/nginx</span><br><span class="line">a2f547c59e8f    dockerhub.xxxxxxwan.cn/pub/pause:3.9                                                           &quot;/pause&quot;                  2 minutes ago             Up                  k8s://default/test-zgz</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">nerdctl -n k8s.io ps -a | grep zgz</span></span><br><span class="line">13be2f142023    dockerhub.xxxxxxwan.cn/pub/nginx:latest                                                        &quot;/docker-entrypoint.…&quot;    9 seconds ago     Up                  k8s://default/test-zgz/nginx</span><br><span class="line">cabe19fd2fd7    dockerhub.xxxxxxwan.cn/pub/pause:3.9                                                           &quot;/pause&quot;                  10 seconds ago    Up                  k8s://default/test-zgz</span><br><span class="line">d79ac72ea7a6    dockerhub.xxxxxxwan.cn/pub/nginx:latest                                                        &quot;/docker-entrypoint.…&quot;    2 minutes ago     Created             k8s://default/test-zgz/nginx</span><br><span class="line">a2f547c59e8f    dockerhub.xxxxxxwan.cn/pub/pause:3.9                                             
              &quot;/pause&quot;                  2 minutes ago     Up                  k8s://default/test-zgz</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">nerdctl -n k8s.io inspect cabe19fd2fd7 | grep Resolv</span></span><br><span class="line">        &quot;ResolvConfPath&quot;: &quot;/data/lv/lib/io.containerd.grpc.v1.cri/sandboxes/cabe19fd2fd75677f4ce77883697a76f66c425977c7cb358a3beb7da36d9d847/resolv.conf&quot;,</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/lv/lib/io.containerd.grpc.v1.cri/sandboxes/cabe19fd2fd75677f4ce77883697a76f66c425977c7cb358a3beb7da36d9d847/resolv.conf</span></span><br><span class="line">search default.svc.cluster.local svc.cluster.local cluster.local</span><br><span class="line">nameserver 169.254.25.10</span><br><span class="line">options ndots:5</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">nerdctl -n k8s.io inspect 13be2f142023 | grep Resolv</span></span><br><span class="line">        &quot;ResolvConfPath&quot;: &quot;/data/lv/lib/io.containerd.grpc.v1.cri/sandboxes/cabe19fd2fd75677f4ce77883697a76f66c425977c7cb358a3beb7da36d9d847/resolv.conf&quot;,</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /data/lv/lib/io.containerd.grpc.v1.cri/sandboxes/cabe19fd2fd75677f4ce77883697a76f66c425977c7cb358a3beb7da36d9d847/resolv.conf</span></span><br><span class="line">search default.svc.cluster.local svc.cluster.local cluster.local</span><br><span class="line">nameserver 169.254.25.10</span><br><span class="line">options ndots:5</span><br></pre></td></tr></table></figure><p>So the issue does not reproduce with containerd: after the restart, the new sandbox's resolv.conf still has the cluster DNS settings.</p><h4 id="重启偶尔重建Pod"><a href="#重启偶尔重建Pod" class="headerlink" title="重启偶尔重建Pod"></a>A restart occasionally recreates the Pod</h4><p>Occasionally, after restarting the <code>/pause</code> container, k8s recreated it:</p><figure class="highlight shell"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker restart c213ae917535</span></span><br><span class="line">c213ae917535</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a | grep testpod</span></span><br><span class="line">427dbeb17e72   reg.xxx.lan:5000/xxx/nginx                                                       &quot;/docker-entrypoint.…&quot;   2 seconds ago    Up 1 second                               k8s_vulnerable-container_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_8</span><br><span class="line">612bf2b23193   reg.xxx.lan:5000/xxx/pause:3.9                                                   &quot;/pause&quot;                  3 seconds ago    Up 1 second                               k8s_POD_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_8</span><br><span class="line">cbd04bc66cac   reg.xxx.lan:5000/xxx/nginx                                                       &quot;/docker-entrypoint.…&quot;   10 minutes ago   Exited (0) 2 seconds ago                  k8s_vulnerable-container_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_7</span><br><span class="line">51a41d9884e4   reg.xxx.lan:5000/xxx/pause:3.9                                                   &quot;/pause&quot;                  10 minutes ago   Exited (0) 2 seconds ago                  k8s_POD_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_7</span><br><span class="line">c213ae917535   reg.xxx.lan:5000/xxx/pause:3.9              
                                     &quot;/pause&quot;                  2 hours ago      Exited (0) 2 seconds ago                  k8s_POD_testpod_default_ebb8ace6-84ab-4a28-814c-109c41827908_6</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl get event | grep testpod</span></span><br><span class="line">3s         Normal    Pulling             pod/testpod          Pulling image &quot;reg.xxx.lan:5000/xxx/nginx&quot;</span><br><span class="line">3s         Normal    Created             pod/testpod          Created container vulnerable-container</span><br><span class="line">3s         Normal    Started             pod/testpod          Started container vulnerable-container</span><br><span class="line">4s         Normal    SandboxChanged      pod/testpod          Pod sandbox changed, it will be killed and re-created.</span><br></pre></td></tr></table></figure><p>I searched the K8S source for <code>Pod sandbox changed</code> and skimmed the relevant kubelet code. When kubelet's check of the Pod sandbox state happens to coincide with the restart, kubelet starts new containers, which avoids the problem. That overlap does not happen on every restart, though.</p><h3 id="修复"><a href="#修复" class="headerlink" title="修复"></a>Fix</h3><p>Based on the findings above, kubelet is not at fault, and neither is cri-dockerd (its re-write behaves the same as a manual modification), so the only place left to fix this is docker. According to the call chain above, restarting a container actually goes through:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">(dlv) bt</span><br><span class="line"> 0  0x0000000002470d2f in github.com/docker/docker/libnetwork.(*Controller).NewSandbox</span><br><span class="line">    at ./libnetwork/controller.go:944</span><br><span class="line"> 1  0x0000000002f855e5 in 
github.com/docker/docker/daemon.(*Daemon).connectToNetwork</span><br><span class="line">    at ./daemon/container_operations.go:762</span><br><span class="line"> 2  0x0000000002f82825 in github.com/docker/docker/daemon.(*Daemon).allocateNetwork</span><br><span class="line">    at ./daemon/container_operations.go:525</span><br><span class="line"> 3  0x0000000002f87d05 in github.com/docker/docker/daemon.(*Daemon).initializeNetworking</span><br><span class="line">    at ./daemon/container_operations.go:950</span><br><span class="line"> 4  0x000000000302d26b in github.com/docker/docker/daemon.(*Daemon).containerStart</span><br><span class="line">    at ./daemon/start.go:117</span><br><span class="line"> 5  0x0000000003028736 in github.com/docker/docker/daemon.(*Daemon).containerRestart</span><br></pre></td></tr></table></figure><p>Reading the code, it looks up the containerConfig by the ID being restarted. <code>containerRestart</code> first stops the container and then starts it again; because this container is a sandbox, a new sandbox gets created. The key part of that rebuild is in <code>container_operations.go</code>, which uses the <code>containerConfig</code> from before the stop to build a slice of <code>SandboxOption</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">sbOptions, err := buildSandboxOptions(cfg, ctr)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line">sb, err := daemon.netController.NewSandbox(ctx, ctr.ID, sbOptions...)</span><br></pre></td></tr></table></figure><p>The idea is to add a new <code>SandboxOption</code> that sets a new flag on the sandbox struct, recording whether the Sandbox was created by a restart, and then handle that case in the relevant logic. The PR changes are as follows:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span 
class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span 
class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span class="line">180</span><br><span class="line">181</span><br><span class="line">182</span><br><span class="line">183</span><br><span 
class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span class="line">192</span><br><span class="line">193</span><br><span class="line">194</span><br><span class="line">195</span><br><span class="line">196</span><br><span class="line">197</span><br><span class="line">198</span><br><span class="line">199</span><br><span class="line">200</span><br><span class="line">201</span><br><span class="line">202</span><br><span class="line">203</span><br><span class="line">204</span><br><span class="line">205</span><br><span class="line">206</span><br><span class="line">207</span><br><span class="line">208</span><br><span class="line">209</span><br><span class="line">210</span><br><span class="line">211</span><br><span class="line">212</span><br><span class="line">213</span><br><span class="line">214</span><br><span class="line">215</span><br><span class="line">216</span><br><span class="line">217</span><br><span class="line">218</span><br><span class="line">219</span><br><span class="line">220</span><br><span class="line">221</span><br><span class="line">222</span><br><span class="line">223</span><br><span class="line">224</span><br><span class="line">225</span><br><span class="line">226</span><br><span class="line">227</span><br><span class="line">228</span><br><span class="line">229</span><br><span class="line">230</span><br><span class="line">231</span><br><span class="line">232</span><br><span class="line">233</span><br><span class="line">234</span><br><span class="line">235</span><br><span class="line">236</span><br><span class="line">237</span><br><span class="line">238</span><br><span class="line">239</span><br><span class="line">240</span><br><span class="line">241</span><br><span class="line">242</span><br><span class="line">243</span><br><span 
class="line">244</span><br><span class="line">245</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">diff --git a/daemon/container_operations.go b/daemon/container_operations.go</span></span><br><span class="line"><span class="comment">index 1ed84bb04e4a0..c00b074881d31 100644</span></span><br><span class="line"><span class="comment">--- a/daemon/container_operations.go</span></span><br><span class="line"><span class="comment">+++ b/daemon/container_operations.go</span></span><br><span class="line"><span class="meta">@@ -149,6 +149,10 @@</span> func buildSandboxOptions(cfg *config.Config, ctr *container.Container) ([]libnet</span><br><span class="line"> </span><br><span class="line"> sboxOptions = append(sboxOptions, libnetwork.OptionPortMapping(publishedPorts), libnetwork.OptionExposedPorts(exposedPorts))</span><br><span class="line"> </span><br><span class="line"><span class="addition">+if !ctr.State.StartedAt.IsZero() &amp;&amp; !ctr.State.FinishedAt.IsZero() &#123;</span></span><br><span class="line"><span class="addition">+sboxOptions = append(sboxOptions, libnetwork.OptionCreateByRestart())</span></span><br><span class="line"><span class="addition">+&#125;</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"> return sboxOptions, nil</span><br><span class="line"> &#125;</span><br><span class="line"> </span><br><span class="line"><span class="comment">diff --git a/daemon/libnetwork/sandbox.go b/daemon/libnetwork/sandbox.go</span></span><br><span class="line"><span class="comment">index 4d0236f2e4628..606403e5d08f8 100644</span></span><br><span class="line"><span class="comment">--- a/daemon/libnetwork/sandbox.go</span></span><br><span class="line"><span class="comment">+++ b/daemon/libnetwork/sandbox.go</span></span><br><span class="line"><span class="meta">@@ -37,23 +37,24 @@</span> func (sb *Sandbox) processOptions(options ...SandboxOption) &#123;</span><br><span class="line"> // 
Sandbox provides the control over the network container entity.</span><br><span class="line"> // It is a one to one mapping with the container.</span><br><span class="line"> type Sandbox struct &#123;</span><br><span class="line"><span class="deletion">-id              string</span></span><br><span class="line"><span class="deletion">-containerID     string</span></span><br><span class="line"><span class="deletion">-config          containerConfig</span></span><br><span class="line"><span class="deletion">-extDNS          []extDNSEntry</span></span><br><span class="line"><span class="deletion">-osSbox          *osl.Namespace</span></span><br><span class="line"><span class="deletion">-controller      *Controller</span></span><br><span class="line"><span class="deletion">-resolver        *Resolver</span></span><br><span class="line"><span class="deletion">-resolverOnce    sync.Once</span></span><br><span class="line"><span class="deletion">-dbIndex         uint64</span></span><br><span class="line"><span class="deletion">-dbExists        bool</span></span><br><span class="line"><span class="deletion">-isStub          bool</span></span><br><span class="line"><span class="deletion">-inDelete        bool</span></span><br><span class="line"><span class="deletion">-ingress         bool</span></span><br><span class="line"><span class="deletion">-ndotsSet        bool</span></span><br><span class="line"><span class="deletion">-oslTypes        []osl.SandboxType // slice of properties of this sandbox</span></span><br><span class="line"><span class="deletion">-loadBalancerNID string            // NID that this SB is a load balancer for</span></span><br><span class="line"><span class="deletion">-mu              sync.Mutex</span></span><br><span class="line"><span class="addition">+id               string</span></span><br><span class="line"><span class="addition">+containerID      string</span></span><br><span class="line"><span class="addition">+config           
containerConfig</span></span><br><span class="line"><span class="addition">+extDNS           []extDNSEntry</span></span><br><span class="line"><span class="addition">+osSbox           *osl.Namespace</span></span><br><span class="line"><span class="addition">+controller       *Controller</span></span><br><span class="line"><span class="addition">+resolver         *Resolver</span></span><br><span class="line"><span class="addition">+resolverOnce     sync.Once</span></span><br><span class="line"><span class="addition">+dbIndex          uint64</span></span><br><span class="line"><span class="addition">+dbExists         bool</span></span><br><span class="line"><span class="addition">+isStub           bool</span></span><br><span class="line"><span class="addition">+inDelete         bool</span></span><br><span class="line"><span class="addition">+ingress          bool</span></span><br><span class="line"><span class="addition">+createdByRestart bool</span></span><br><span class="line"><span class="addition">+ndotsSet         bool</span></span><br><span class="line"><span class="addition">+oslTypes         []osl.SandboxType // slice of properties of this sandbox</span></span><br><span class="line"><span class="addition">+loadBalancerNID  string            // NID that this SB is a load balancer for</span></span><br><span class="line"><span class="addition">+mu               sync.Mutex</span></span><br><span class="line"> </span><br><span class="line"> // joinLeaveMu is required as well as mu to modify the following fields,</span><br><span class="line"> // acquire joinLeaveMu first, and keep it at-least until gateway changes</span><br><span class="line"><span class="comment">diff --git a/daemon/libnetwork/sandbox_dns_unix.go b/daemon/libnetwork/sandbox_dns_unix.go</span></span><br><span class="line"><span class="comment">index a5aac066e925b..c99b382b70894 100644</span></span><br><span class="line"><span class="comment">--- 
a/daemon/libnetwork/sandbox_dns_unix.go</span></span><br><span class="line"><span class="comment">+++ b/daemon/libnetwork/sandbox_dns_unix.go</span></span><br><span class="line"><span class="meta">@@ -264,8 +264,17 @@</span> func (sb *Sandbox) loadResolvConf(path string) (*resolvconf.ResolvConf, error) &#123;</span><br><span class="line"> // be a copy of the host&#x27;s file, with overrides for nameservers, options and search</span><br><span class="line"> // domains applied.</span><br><span class="line"> func (sb *Sandbox) setupDNS() error &#123;</span><br><span class="line"><span class="deletion">-// Make sure the directory exists.</span></span><br><span class="line"> sb.restoreResolvConfPath()</span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Fixes https://github.com/moby/moby/issues/51490</span></span><br><span class="line"><span class="addition">+// non-host network sandbox should check resolvconf.UserModified</span></span><br><span class="line"><span class="addition">+if sb.createdByRestart &amp;&amp; !sb.config.useDefaultSandBox &#123;</span></span><br><span class="line"><span class="addition">+if mod, err := resolvconf.UserModified(sb.config.resolvConfPath, sb.config.resolvConfHashFile); err != nil || mod &#123;</span></span><br><span class="line"><span class="addition">+return err</span></span><br><span class="line"><span class="addition">+&#125;</span></span><br><span class="line"><span class="addition">+&#125;</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Make sure the directory exists.</span></span><br><span class="line"> dir, _ := filepath.Split(sb.config.resolvConfPath)</span><br><span class="line"> if err := createBasePath(dir); err != nil &#123;</span><br><span class="line"> return err</span><br><span class="line"><span class="comment">diff --git a/daemon/libnetwork/sandbox_dns_unix_test.go 
b/daemon/libnetwork/sandbox_dns_unix_test.go</span></span><br><span class="line"><span class="comment">index 3bb64cf5ce5b5..93700c20ced08 100644</span></span><br><span class="line"><span class="comment">--- a/daemon/libnetwork/sandbox_dns_unix_test.go</span></span><br><span class="line"><span class="comment">+++ b/daemon/libnetwork/sandbox_dns_unix_test.go</span></span><br><span class="line"><span class="meta">@@ -14,12 +14,17 @@</span> import (</span><br><span class="line"> is &quot;gotest.tools/v3/assert/cmp&quot;</span><br><span class="line"> )</span><br><span class="line"> </span><br><span class="line"><span class="deletion">-func getResolvConfOptions(t *testing.T, rcPath string) []string &#123;</span></span><br><span class="line"><span class="addition">+func getResolvConf(t *testing.T, rcPath string) resolvconf.ResolvConf &#123;</span></span><br><span class="line"> t.Helper()</span><br><span class="line"> resolv, err := os.ReadFile(rcPath)</span><br><span class="line"> assert.NilError(t, err)</span><br><span class="line"> rc, err := resolvconf.Parse(bytes.NewBuffer(resolv), &quot;&quot;)</span><br><span class="line"> assert.NilError(t, err)</span><br><span class="line"><span class="addition">+return rc</span></span><br><span class="line"><span class="addition">+&#125;</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+func getResolvConfOptions(t *testing.T, rcPath string) []string &#123;</span></span><br><span class="line"><span class="addition">+rc := getResolvConf(t, rcPath)</span></span><br><span class="line"> return rc.Options()</span><br><span class="line"> &#125;</span><br><span class="line"> </span><br><span class="line"><span class="meta">@@ -90,3 +95,69 @@</span> func TestDNSOptions(t *testing.T) &#123;</span><br><span class="line"> dnsOptionsList = getResolvConfOptions(t, sb2.config.resolvConfPath)</span><br><span class="line"> assert.Check(t, 
is.DeepEqual([]string&#123;&quot;ndots:0&quot;&#125;, dnsOptionsList))</span><br><span class="line"> &#125;</span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+func TestNonHostNetDNSRestart(t *testing.T) &#123;</span></span><br><span class="line"><span class="addition">+c, err := New(context.Background(), config.OptionDataDir(t.TempDir()))</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Step 1: Create initial sandbox (simulating first container start)</span></span><br><span class="line"><span class="addition">+sb, err := c.NewSandbox(context.Background(), &quot;cnt1&quot;)</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+sb.startResolver(false)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+err = sb.setupDNS()</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+err = sb.rebuildDNS()</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Step 2: Simulate cri-dockerd modifying the resolv.conf for a Kubernetes pause container.</span></span><br><span class="line"><span class="addition">+// This mimics the behavior where external tools (like cri-dockerd) customize DNS</span></span><br><span class="line"><span class="addition">+// settings for K8s pods, which should be preserved during container restart/unpause.</span></span><br><span class="line"><span class="addition">+resolvConfPath 
:= sb.config.resolvConfPath</span></span><br><span class="line"><span class="addition">+modifiedContent := []byte(`nameserver 10.96.0.10</span></span><br><span class="line"><span class="addition">+search default.svc.cluster.local. svc.cluster.local. cluster.local.</span></span><br><span class="line"><span class="addition">+options ndots:5</span></span><br><span class="line"><span class="addition">+`)</span></span><br><span class="line"><span class="addition">+err = os.WriteFile(resolvConfPath, modifiedContent, 0644)</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Step 3: Delete the sandbox (simulating container stop)</span></span><br><span class="line"><span class="addition">+err = sb.Delete(context.Background())</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Step 4: Create a new sandbox with OptionRestartOperate (simulating container restart)</span></span><br><span class="line"><span class="addition">+sbRestart, err := c.NewSandbox(context.Background(), &quot;cnt1&quot;,</span></span><br><span class="line"><span class="addition">+OptionCreateByRestart(),</span></span><br><span class="line"><span class="addition">+OptionResolvConfPath(resolvConfPath),</span></span><br><span class="line"><span class="addition">+)</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+defer func() &#123;</span></span><br><span class="line"><span class="addition">+if err := sbRestart.Delete(context.Background()); err != nil &#123;</span></span><br><span class="line"><span class="addition">+t.Error(err)</span></span><br><span class="line"><span 
class="addition">+&#125;</span></span><br><span class="line"><span class="addition">+&#125;()</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+sbRestart.startResolver(false)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Step 5: Call setupDNS on restart - should preserve external modifications</span></span><br><span class="line"><span class="addition">+err = sbRestart.setupDNS()</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Verify that the DNS settings modified by cri-dockerd are preserved</span></span><br><span class="line"><span class="addition">+rc := getResolvConf(t, sbRestart.config.resolvConfPath)</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Len(rc.Options(), 1))</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Equal(&quot;10.96.0.10&quot;, rc.NameServers()[0].String()))</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.DeepEqual([]string&#123;&quot;default.svc.cluster.local.&quot;, &quot;svc.cluster.local.&quot;, &quot;cluster.local.&quot;&#125;, rc.Search()))</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Equal(&quot;ndots:5&quot;, rc.Options()[0]))</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+err = sbRestart.rebuildDNS()</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+rc = getResolvConf(t, sbRestart.config.resolvConfPath)</span></span><br><span class="line"><span class="addition">+assert.Check(t, 
is.Len(rc.Options(), 1))</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Equal(&quot;10.96.0.10&quot;, rc.NameServers()[0].String()))</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.DeepEqual([]string&#123;&quot;default.svc.cluster.local.&quot;, &quot;svc.cluster.local.&quot;, &quot;cluster.local.&quot;&#125;, rc.Search()))</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Equal(&quot;ndots:5&quot;, rc.Options()[0]))</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+&#125;</span></span><br><span class="line"><span class="comment">diff --git a/daemon/libnetwork/sandbox_options.go b/daemon/libnetwork/sandbox_options.go</span></span><br><span class="line"><span class="comment">index ba05582dec270..8210d3bd6845b 100644</span></span><br><span class="line"><span class="comment">--- a/daemon/libnetwork/sandbox_options.go</span></span><br><span class="line"><span class="comment">+++ b/daemon/libnetwork/sandbox_options.go</span></span><br><span class="line"><span class="meta">@@ -151,3 +151,11 @@</span> func OptionLoadBalancer(nid string) SandboxOption &#123;</span><br><span class="line"> sb.oslTypes = append(sb.oslTypes, osl.SandboxTypeLoadBalancer)</span><br><span class="line"> &#125;</span><br><span class="line"> &#125;</span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// OptionCreateByRestart function returns an option setter for marking a</span></span><br><span class="line"><span class="addition">+// sandbox was created by restart.</span></span><br><span class="line"><span class="addition">+func OptionCreateByRestart() SandboxOption &#123;</span></span><br><span class="line"><span class="addition">+return func(sb *Sandbox) &#123;</span></span><br><span class="line"><span class="addition">+sb.createdByRestart = true</span></span><br><span 
class="line"><span class="addition">+&#125;</span></span><br><span class="line"><span class="addition">+&#125;</span></span><br><span class="line"><span class="comment">diff --git a/integration/networking/resolvconf_test.go b/integration/networking/resolvconf_test.go</span></span><br><span class="line"><span class="comment">index c0119bd650be0..4a84077adbb6e 100644</span></span><br><span class="line"><span class="comment">--- a/integration/networking/resolvconf_test.go</span></span><br><span class="line"><span class="comment">+++ b/integration/networking/resolvconf_test.go</span></span><br><span class="line"><span class="meta">@@ -212,3 +212,47 @@</span> func TestNslookupWindows(t *testing.T) &#123;</span><br><span class="line"> // can only be changed in daemon.json using feature flag &quot;windows-dns-proxy&quot;.</span><br><span class="line"> assert.Check(t, is.Contains(res.Stdout.String(), &quot;Addresses:&quot;))</span><br><span class="line"> &#125;</span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// TestResolvConfPreservedOnRestart verifies that external modifications to</span></span><br><span class="line"><span class="addition">+// /etc/resolv.conf are preserved when a non-host network container is restarted.</span></span><br><span class="line"><span class="addition">+// Regression test for https://github.com/moby/moby/issues/51490</span></span><br><span class="line"><span class="addition">+func TestResolvConfPreservedOnRestart(t *testing.T) &#123;</span></span><br><span class="line"><span class="addition">+skip.If(t, testEnv.DaemonInfo.OSType == &quot;windows&quot;, &quot;No /etc/resolv.conf on Windows&quot;)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+ctx := setupTest(t)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+d := daemon.New(t, 
daemon.WithResolvConf(network.GenResolvConf(&quot;8.8.8.8&quot;)))</span></span><br><span class="line"><span class="addition">+d.StartWithBusybox(ctx, t)</span></span><br><span class="line"><span class="addition">+defer d.Stop(t)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+c := d.NewClientT(t)</span></span><br><span class="line"><span class="addition">+defer c.Close()</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+const ctrName = &quot;test-resolvconf-preserved-on-restart&quot;</span></span><br><span class="line"><span class="addition">+id := container.Run(ctx, t, c,</span></span><br><span class="line"><span class="addition">+container.WithName(ctrName),</span></span><br><span class="line"><span class="addition">+container.WithImage(&quot;busybox:latest&quot;),</span></span><br><span class="line"><span class="addition">+container.WithCmd(&quot;top&quot;),</span></span><br><span class="line"><span class="addition">+)</span></span><br><span class="line"><span class="addition">+defer c.ContainerRemove(ctx, id, client.ContainerRemoveOptions&#123;</span></span><br><span class="line"><span class="addition">+Force: true,</span></span><br><span class="line"><span class="addition">+&#125;)</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+appendContent := `# hello`</span></span><br><span class="line"><span class="addition">+res, err := container.Exec(ctx, c, ctrName, []string&#123;</span></span><br><span class="line"><span class="addition">+&quot;sh&quot;, &quot;-c&quot;,</span></span><br><span class="line"><span class="addition">+&quot;echo &#x27;&quot; + appendContent + &quot;&#x27; &gt;&gt; /etc/resolv.conf&quot;,</span></span><br><span class="line"><span class="addition">+&#125;)</span></span><br><span class="line"><span 
class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Equal(res.ExitCode, 0))</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Restart the container.</span></span><br><span class="line"><span class="addition">+_, err = c.ContainerRestart(ctx, ctrName, client.ContainerRestartOptions&#123;&#125;)</span></span><br><span class="line"><span class="addition">+assert.Assert(t, is.Nil(err))</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line"><span class="addition">+// Verify the modification was preserved</span></span><br><span class="line"><span class="addition">+res, err = container.Exec(ctx, c, ctrName, []string&#123;&quot;tail&quot;, &quot;-n&quot;, &quot;1&quot;, &quot;/etc/resolv.conf&quot;&#125;)</span></span><br><span class="line"><span class="addition">+assert.NilError(t, err)</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Equal(res.ExitCode, 0))</span></span><br><span class="line"><span class="addition">+assert.Check(t, is.Contains(res.Stdout(), appendContent))</span></span><br><span class="line"><span class="addition">+&#125;</span></span><br></pre></td></tr></table></figure><p>I submitted the PR <a href="https://github.com/moby/moby/pull/51507">https://github.com/moby/moby/pull/51507</a> on 2025&#x2F;11&#x2F;12. After the unit tests and the other checks had passed in GitHub Actions, a Docker member found a similar problem:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run -d --name hello nginx:alpine</span></span><br><span
class="line">2daa4fc6f6c6c708c394c9b490f80a83269108907b86059d7d472f1f735d8b34</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker <span class="built_in">exec</span> hello sh -c <span class="string">&#x27;echo &quot;nameserver 1.1.1.1&quot; &gt; /etc/resolv.conf&#x27;</span></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker <span class="built_in">exec</span> hello sh -c <span class="string">&#x27;tail -n 1 /etc/resolv.conf&#x27;</span></span></span><br><span class="line">nameserver 1.1.1.1</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker restart hello</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker <span class="built_in">exec</span> hello sh -c <span class="string">&#x27;tail -n 1 /etc/resolv.conf&#x27;</span></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Overrides: []</span></span><br></pre></td></tr></table></figure><p>I tested this case with my patched code and it is fixed as well. akerouanton, the docker libnetwork maintainer, said that my restart&#x2F;pause scenario is non-standard behavior, but since the case above also happens on the default bridge network, he did a code review for me.<br>The review comment pointed out that the <code>createdByRestart</code> flag would not take effect when docker crashes or in situations such as power loss, and asked me to remove the option.</p><p>Thinking it over, that is indeed right: after an unexpected failure the normal restart flow never runs, so the option is never applied. After further discussion the PR was merged on 2025&#x2F;11&#x2F;26, and it is expected to ship in docker <code>v29.0.5</code>.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;A walkthrough of the DNS breakage caused by docker restarting non-host-network containers, and the PR that fixes it&lt;/p&gt;</summary>
    
    
    
    
    <category term="docker" scheme="http://zhangguanzhang.github.io/tags/docker/"/>
    
    <category term="sandbox" scheme="http://zhangguanzhang.github.io/tags/sandbox/"/>
    
  </entry>
  
  <entry>
    <title>The right way to handle docker open /var/lib/docker/tmp/GetImageBlobXXX: no such file</title>
    <link href="http://zhangguanzhang.github.io/2025/11/06/docker-GetImageBlob-no-such/"/>
    <id>http://zhangguanzhang.github.io/2025/11/06/docker-GetImageBlob-no-such/</id>
    <published>2025-11-06T10:30:30.000Z</published>
    <updated>2025-11-06T10:30:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>Resolving docker open &#x2F;var&#x2F;lib&#x2F;docker&#x2F;tmp&#x2F;GetImageBlobXXX: no such file or directory.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>QA reported that the <code>04:33</code> package build failed. The relevant step reported:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run -d -p 48835:5000 --name daily-master-K8S_XC-2298 -v /xxx/images:/var/lib/registry harbor.xxx.cn/xxx-base/registry:2.6.1</span></span><br><span class="line">Unable to find image &#x27;harbor.xxx.cn/xxx-base/registry&#x27; locally</span><br><span class="line">2.6.1: Pulling from xxx-base/registry</span><br><span class="line">53478ce18e19: Pulling fs layer</span><br><span class="line">907370c150a1: Pulling fs layer</span><br><span class="line">ecd89ee27260: Pulling fs layer</span><br><span class="line">e4d3e6950197: Pulling fs layer</span><br><span class="line">a0c226b30c4f: Pulling fs layer</span><br><span class="line">d4bda1830450: Pulling fs layer</span><br><span class="line">f441bc34ec75: Pulling fs layer</span><br><span class="line">877c19e43805: Pulling fs layer</span><br><span class="line">docker: open /work/docker/tmp/GetImageBlob838250894: no such file or directory.</span><br><span class="line">See &#x27;docker run --help&#x27;.</span><br></pre></td></tr></table></figure><p>Our docker has data-root configured. With the default path the error would read <code>open /var/lib/docker/tmp/GetImageBlob</code>.</p><h2 id="处理过程"><a href="#处理过程"
class="headerlink" title="处理过程"></a>Troubleshooting process</h2><h3 id="复现"><a href="#复现" class="headerlink" title="复现"></a>Reproduction</h3><p>Following the build log, I logged in to the build machine. Pulling an image by hand reproduced the error as well:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker pull harbor.xxx.cn/xxxx-run/gosu:v1</span></span><br><span class="line">v1: Pulling from xxxx-run/gosu</span><br><span class="line">e9abf7e9593f: Pulling fs layer </span><br><span class="line">open /work/docker/tmp/GetImageBlob159490514: no such file or directory</span><br></pre></td></tr></table></figure><p>Jenkins showed no build jobs running on this machine, so I marked it offline to avoid affecting other builds.</p><h3 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Investigation</h3><p>The error is fairly self-explanatory. Rather than reinvent the wheel, I went to a search engine first, but every result just said to restart docker. If an open source project had a problem that reproduced every time, it would not stay a problem for long. When a Go project handles <code>return err</code> properly, the relevant logic can be found by searching the source directly. Searching the source for <code>GetImageBlob</code> turns up:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/distribution/pull_v2.go#L1070C1-L1072C2</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">createDownloadFile</span><span class="params">()</span></span> (*os.File, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="keyword">return</span> os.CreateTemp(<span class="string">&quot;&quot;</span>, <span class="string">&quot;GetImageBlob&quot;</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>With an empty directory argument, <code>os.CreateTemp</code> creates the file in the default temp directory, which is <code>/tmp/</code> unless <code>TMPDIR</code> is set. The actual path here is under the data-root, so something must be setting a tmp-related environment variable. First, check via proc:</p><figure
class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ps -ef | grep docker[d]</span></span><br><span class="line">root     23938     1  3 03:00 ?        00:19:00 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">xargs -0 -n1 &lt; /proc/23938/environ</span> </span><br><span class="line">LANG=en_US.UTF-8</span><br><span class="line">PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin</span><br><span class="line">NOTIFY_SOCKET=/run/systemd/notify</span><br><span class="line">LISTEN_PID=23938</span><br><span class="line">LISTEN_FDS=1</span><br></pre></td></tr></table></figure><p>The process environment in proc confirms that dockerd was started without any <code>TMPDIR</code>-related variable, so the process must be calling <code>os.Setenv</code> itself. Searching for <code>TMPDIR</code> finds the relevant logic:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/daemon/daemon.go#L841-L856</span></span><br><span
class="line"><span class="comment">// set up the tmpDir to use a canonical path</span></span><br><span class="line">tmp, err := prepareTempDir(config.Root)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;Unable to get the TempDir under %s: %s&quot;</span>, config.Root, err)</span><br><span class="line">&#125;</span><br><span class="line">realTmp, err := fileutils.ReadSymlinkedDirectory(tmp)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;Unable to get the full path to the TempDir (%s): %s&quot;</span>, tmp, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> isWindows &#123;</span><br><span class="line">        ...</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">os.Setenv(<span class="string">&quot;TMPDIR&quot;</span>, realTmp)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>That confirms how the path is built. Next, to find where the err is returned, search for the <code>createDownloadFile</code> shown above:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/distribution/pull_v2.go#L169-L199</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(ld *layerDescriptor)</span></span> Download(ctx context.Context, progressOutput progress.Output) (io.ReadCloser, <span class="type">int64</span>, <span class="type">error</span>) &#123;</span><br><span class="line">log.G(ctx).Debugf(<span class="string">&quot;pulling blob %q&quot;</span>, ld.digest)</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> (</span><br><span class="line">err    <span class="type">error</span></span><br><span class="line">offset <span class="type">int64</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> ld.tmpFile == <span class="literal">nil</span> &#123;</span><br><span class="line">ld.tmpFile, err = createDownloadFile()</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, <span class="number">0</span>, xfer.DoNotRetry&#123;Err: err&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">offset, err = ld.tmpFile.Seek(<span class="number">0</span>, 
io.SeekEnd)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.G(ctx).Debugf(<span class="string">&quot;error seeking to end of download file: %v&quot;</span>, err)</span><br><span class="line">offset = <span class="number">0</span></span><br><span class="line"></span><br><span class="line">ld.tmpFile.Close()</span><br><span class="line"><span class="keyword">if</span> err := os.Remove(ld.tmpFile.Name()); err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.G(ctx).Errorf(<span class="string">&quot;Failed to remove temp file: %s&quot;</span>, ld.tmpFile.Name())</span><br><span class="line">&#125;</span><br><span class="line">ld.tmpFile, err = createDownloadFile()</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, <span class="number">0</span>, xfer.DoNotRetry&#123;Err: err&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125; <span class="keyword">else</span> <span class="keyword">if</span> offset != <span class="number">0</span> &#123;</span><br><span class="line">log.G(ctx).Debugf(<span class="string">&quot;attempting to resume download of %q from %d bytes&quot;</span>, ld.digest, offset)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The err is returned somewhere around here. The exact spot is not yet clear, but there is logging nearby, so the docker daemon log should help confirm it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Nov 06 10:30:22 centos-xx dockerd[23938]: time=&quot;2025-11-06T10:30:22.290434484+08:00&quot; level=error msg=&quot;Download failed after 1 attempts: open /work/docker/tmp/GetImageBlob1655640392: no such file or
directory&quot;</span><br><span class="line">Nov 06 10:48:51 centos-xx dockerd[23938]: time=&quot;2025-11-06T10:48:51.619866730+08:00&quot; level=error msg=&quot;Download failed after 1 attempts: open /work/docker/tmp/GetImageBlob1626967315: no such file or directory&quot;</span><br></pre></td></tr></table></figure><p>Searching for <code>Download failed after</code> finds:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/distribution/xfer/download.go#L274-L293</span></span><br><span class="line"><span class="keyword">for</span> &#123;</span><br><span class="line">downloadReader, size, err = descriptor.Download(d.transfer.context(), progressOutput)</span><br><span class="line"><span class="keyword">if</span> err == <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">break</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// If an error was returned because the context</span></span><br><span class="line"><span class="comment">// was cancelled, we shouldn&#x27;t retry.</span></span><br><span class="line"><span class="keyword">select</span> &#123;</span><br><span class="line"><span
class="keyword">case</span> &lt;-d.transfer.context().Done():</span><br><span class="line">d.err = err</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line"><span class="keyword">default</span>:</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> _, isDNR := err.(DoNotRetry); isDNR || attempt &gt;= ldm.maxDownloadAttempts &#123;</span><br><span class="line">log.G(context.TODO()).Errorf(<span class="string">&quot;Download failed after %d attempts: %v&quot;</span>, attempt, err)</span><br><span class="line">d.err = err</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>So the err is the one returned by the Download method above. Judging from the daemon log, the err was not wrapped with extra context by something like <code>errors.Wrap</code>, so it should be the syscall-level error returned from <code>createDownloadFile()</code>.</p><p>I considered writing a small Go demo to reproduce it, but creating a temp file is a very common operation and the shell already ships a <code>mktemp</code> command, so I tried reproducing it with that:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">mktemp</span> /work/docker/tmp/test1111</span></span><br><span class="line">mktemp: too few X&#x27;s in template ‘/work/docker/tmp/test1111’</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">mktemp</span> /work/docker/tmp/testXXXX</span></span><br><span class="line">mktemp: failed to create file via template ‘/work/docker/tmp/testXXXX’: No such file or directory</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">mktemp</span>
/tmp/testXXX</span></span><br><span class="line">/tmp/testx1T</span><br></pre></td></tr></table></figure><p>The /work directory is a mount, and reads and writes on it tested fine. Then I noticed that the docker data-root has no tmp directory:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">touch</span> /work/docker/test1111</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">rm</span> -f /work/docker/test1111</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span> -l /work/docker/</span></span><br><span class="line">total 4</span><br><span class="line">drwx--x--x 4 root root 170 Nov  6 03:00 buildkit</span><br><span class="line">drwx--x--- 2 root root  10 Nov  6 03:00 containers</span><br><span class="line">-rw------- 1 root root  36 Nov  6 03:00 engine-id</span><br><span class="line">drwx------ 3 root root  30 Nov  6 03:00 image</span><br><span class="line">drwxr-x--- 3 root root  27 Nov  6 03:00 network</span><br><span class="line">drwx--x--- 3 root root  52 Nov  6 03:00 overlay2</span><br><span class="line">drwx------ 4 root root  44 Nov  6 03:00 plugins</span><br><span class="line">drwx------ 2 root root  10 Nov  6 03:00 swarm</span><br><span class="line">drwx-----x 2 root root  62 Nov  6 03:00 volumes</span><br></pre></td></tr></table></figure><p>After creating that directory, the pull works:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker pull harbor.xxx.cn/xxxx-run/gosu:v1</span></span><br><span class="line">v1: Pulling from xxxx-run/gosu</span><br><span class="line">e9abf7e9593f: Pulling fs layer </span><br><span class="line">open /work/docker/tmp/GetImageBlob1426925393: no such file or directory</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">mkdir</span> -p /work/docker/tmp</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker pull harbor.xxx.cn/xxxx-run/gosu:v1</span></span><br><span class="line">v1: Pulling from xxxx-run/gosu</span><br><span class="line">e9abf7e9593f: Pull complete </span><br><span class="line">Digest: sha256:06ff9bb691ce53498f7dda976e0028639fb320f71513f6a41b4dd6761e989e78</span><br><span class="line">Status: Downloaded newer image for harbor.xxx.cn/xxxx-run/gosu:v1</span><br><span class="line">harbor.xxx.cn/xxxx-run/gosu:v1</span><br></pre></td></tr></table></figure><h3 id="根因"><a href="#根因" class="headerlink" title="根因"></a>Root cause</h3><p>My first idea was to create the directory after docker starts:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Files in a .d drop-in directory are not removed by later docker 
upgrades, so this logic persists</span></span><br><span class="line">mkdir -p /etc/systemd/system/docker.service.d/</span><br><span class="line">cat &gt; /etc/systemd/system/docker.service.d/tmp.conf &lt;&lt; EOF</span><br><span class="line">[Service]</span><br><span class="line">ExecStartPost=/usr/bin/mkdir -p /work/docker/tmp</span><br><span class="line">EOF</span><br><span class="line"></span><br><span class="line">systemctl daemon-reload</span><br></pre></td></tr></table></figure><p>But on reflection, the fact that others fixed this by restarting docker means the docker daemon creates this directory on startup. A search of the source confirms that logic:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v26.1.4/daemon/daemon.go#L1420-L1442</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// prepareTempDir prepares and returns the default directory to use</span></span><br><span class="line"><span class="comment">// for temporary files.</span></span><br><span class="line"><span class="comment">// If it doesn&#x27;t exist, it is created. 
If it exists, its content is removed.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">prepareTempDir</span><span class="params">(rootDir <span class="type">string</span>)</span></span> (<span class="type">string</span>, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="keyword">var</span> tmpDir <span class="type">string</span></span><br><span class="line"><span class="keyword">if</span> tmpDir = os.Getenv(<span class="string">&quot;DOCKER_TMPDIR&quot;</span>); tmpDir == <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">tmpDir = filepath.Join(rootDir, <span class="string">&quot;tmp&quot;</span>)</span><br><span class="line">newName := tmpDir + <span class="string">&quot;-old&quot;</span></span><br><span class="line"><span class="keyword">if</span> err := os.Rename(tmpDir, newName); err == <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line"><span class="keyword">if</span> err := os.RemoveAll(newName); err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.G(context.TODO()).Warnf(<span class="string">&quot;failed to delete old tmp directory: %s&quot;</span>, newName)</span><br><span class="line">&#125;</span><br><span class="line">&#125;()</span><br><span class="line">&#125; <span class="keyword">else</span> <span class="keyword">if</span> !os.IsNotExist(err) &#123;</span><br><span class="line">log.G(context.TODO()).Warnf(<span class="string">&quot;failed to rename %s for background deletion: %s. 
Deleting synchronously&quot;</span>, tmpDir, err)</span><br><span class="line"><span class="keyword">if</span> err := os.RemoveAll(tmpDir); err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.G(context.TODO()).Warnf(<span class="string">&quot;failed to delete old tmp directory: %s&quot;</span>, tmpDir)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> tmpDir, idtools.MkdirAllAndChown(tmpDir, <span class="number">0o700</span>, idtools.CurrentIdentity())</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>So something external must have deleted the tmp directory. Checking the cron jobs:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">crontab -l</span></span><br><span class="line">0 3 * * *  systemctl stop docker &amp;&amp; mv /work/docker /work/docker-$(date +\%Y\%m\%d) &amp;&amp; systemctl start docker</span><br><span class="line">15 3 * * * rm -rf /work/docker-*</span><br><span class="line">20 3 * * * find /work -maxdepth 1 -name &#x27;*dev*&#x27; -type d -ctime +1 | xargs rm -rf</span><br><span class="line">23 3 * * * find /work -maxdepth 1 -name &#x27;*test*&#x27; -type d -ctime +1 | xargs rm -rf</span><br><span class="line">25 3 * * * find /work -maxdepth 1 -name &#x27;*release*&#x27; -type d -ctime +1 | xargs rm -rf</span><br><span class="line">27 3 * * * find /work -maxdepth 1 -name &#x27;*openxxx*&#x27; -type d -ctime +1 | xargs rm 
-rf</span><br><span class="line">0 9-23/2 * * * python /data/epy/clean_dir.py  /work</span><br><span class="line"><span class="meta prompt_">#</span><span class="language-bash">0 9-23/2  * * *  docker system prune -f</span></span><br><span class="line">0 */2 * * * echo 3 &gt; /proc/sys/vm/drop_caches</span><br><span class="line"></span><br><span class="line">*/30 * * * * python /data/prepullimage/pull_image.py </span><br></pre></td></tr></table></figure><p>None of these jobs runs around <code>04:33</code>. After asking a colleague, it turned out that Jenkins has a scheduled job at <code>04:30</code> that cleans up files on the machine. Adjusting that job fixed the problem.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;Fixing docker &quot;open &amp;#x2F;var&amp;#x2F;lib&amp;#x2F;docker&amp;#x2F;tmp&amp;#x2F;GetImageBlobXXX: no such file or directory&quot;.&lt;/p&gt;</summary>
    
    
    
    
    <category term="docker" scheme="http://zhangguanzhang.github.io/tags/docker/"/>
    
    <category term="GetImageBlob" scheme="http://zhangguanzhang.github.io/tags/GetImageBlob/"/>
    
  </entry>
  
  <entry>
    <title>Tracking down a panic when running skopeo copy from multiple threads</title>
    <link href="http://zhangguanzhang.github.io/2025/10/31/skopeo-copy-panic/"/>
    <id>http://zhangguanzhang.github.io/2025/10/31/skopeo-copy-panic/</id>
    <published>2025-10-31T17:30:30.000Z</published>
    <updated>2025-10-31T17:30:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>How I tracked down and fixed a panic when running skopeo copy from multiple threads.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Our internal build and packaging pipeline works roughly as follows:</p><ol><li>Start a registry container, say on random port 45678</li><li>Use skopeo copy to sync the required images from the caching harbor to the registry container</li><li>Pack the registry directory into <code>image-xxx.tgz</code></li></ol><p>Over the past few days packaging started going wrong: the errors from the multi-threaded skopeo copy calls were not handled, and the resulting <code>image-xxx.tgz</code> had the wrong size.</p><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Investigation</h2><h3 id="调用链分析"><a href="#调用链分析" class="headerlink" title="调用链分析"></a>Call chain analysis</h3><p>The packaging log is full of panics. Because skopeo is invoked from multiple threads, the Go panic stack traces are interleaved. The build machines all run CentOS 7, which is no longer maintained, and skopeo on them was installed through the package manager, so the version is quite old:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">skopeo --version</span></span><br><span class="line">skopeo version 0.1.40</span><br></pre></td></tr></table></figure><p>I went to the source for that version to trace the call chain. skopeo uses the cobra library; starting from <code>cmd/skopeo/copy.go</code>, I found this frame in the stack:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/builddir/build/BUILD/skopeo-be6146b0a8471b02e776134119a2c37dfb70d414/cmd/skopeo/copy.go:159 +0x94b fp=0xc0005f3920 sp=0xc0005f3678 pc=0x559635b91b0b</span><br></pre></td></tr></table></figure><p>The code:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;github.com/containers/image/v5/copy&quot;</span></span><br><span class="line">    ....</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment">// https://github.com/containers/skopeo/blob/v0.1.40/cmd/skopeo/copy.go#L159-L167</span></span><br><span class="line">    _, err = <span class="built_in">copy</span>.Image(ctx, policyContext, destRef, srcRef, &amp;<span class="built_in">copy</span>.Options&#123;</span><br><span class="line">        RemoveSignatures:      opts.removeSignatures,</span><br><span class="line">        SignBy:                opts.signByFingerprint,</span><br><span class="line">        ReportWriter:          stdout,</span><br><span class="line">        SourceCtx:             sourceCtx,</span><br><span class="line">        DestinationCtx:        destinationCtx,</span><br><span class="line">        ForceManifestMIMEType: manifestType,</span><br><span class="line">        ImageListSelection:    imageListSelection,</span><br><span class="line">    &#125;)</span><br></pre></td></tr></table></figure><p>Following the import leads to:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containers/skopeo/blob/v0.1.40/vendor/github.com/containers/image/v5/copy/copy.go#L173</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Image</span><span class="params">(ctx context.Context, policyContext *signature.PolicyContext, destRef, srcRef types.ImageReference, options *Options)</span></span> (copiedManifest []<span class="type">byte</span>, retErr <span class="type">error</span>) 
&#123;</span><br></pre></td></tr></table></figure><p>To make searching easier, I saved all the stack output to a file and searched the interleaved traces for <code>copy/copy.go:</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">$ grep -Po <span class="string">&#x27;vendor.*?/copy/copy.go:\d+&#x27;</span> txt | sort -u</span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">1337</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">258</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">578</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">740</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">755</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span 
class="number">765</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">766</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">770</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">771</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">860</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">948</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">949</span></span><br><span class="line">vendor/src/github.com/containers/image/v5/pkg/blobinfocache/boltdb/boltdb.go0x105/builddir/build/BUILD/skopeo-be6146b0a8471b02e776134119a2c37dfb70d414/vendor/src/github.com/containers/image/v5/<span class="built_in">copy</span>/<span class="built_in">copy</span>.<span class="keyword">go</span>:<span class="number">174</span></span><br></pre></td></tr></table></figure><p>From the output above and the source, the call chain is:</p><ul><li><code>copy/copy.go#L173</code>: <code>func Image(ctx context.Context,</code></li><li><code>copy/copy.go#L258</code>: <code>if copiedManifest, _, _, err = c.copyOneImage(</code></li><li><code>copy/copy.go#L473</code>: <code>func (c *copier) copyOneImage</code></li><li><code>copy/copy.go#L578</code>: <code>if 
err := ic.copyLayers(ctx);</code></li><li><code>copy/copy.go#L704</code>: <code>func (ic *imageCopier) copyLayers(ctx context.Context)</code></li><li><code>copy/copy.go:766</code>: <code>go copyLayerHelper(i, srcLayer, progressPool)</code>, where <code>copyLayerHelper</code> is a closure whose body appears in the next entry</li><li><code>copy/copy.go:755</code>: <code>cld.destInfo, cld.diffID, cld.err = ic.copyLayer(ctx, srcLayer, pool) </code></li><li><code>copy/copy.go:948</code>: <code>func (ic *imageCopier) copyLayer(ctx</code>, which then runs its first line</li><li><code>copy/copy.go:949</code>: <code>cachedDiffID := ic.c.blobInfoCache.UncompressedDigest(srcInfo.Digest)</code></li></ul><p>The <code>UncompressedDigest</code> method belongs to an interface:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containers/skopeo/blob/v0.1.40/vendor/github.com/containers/image/v5/types/types.go#L177-L198</span></span><br><span class="line"><span class="keyword">type</span> BlobInfoCache <span class="keyword">interface</span> &#123;</span><br><span class="line"></span><br><span class="line">    UncompressedDigest(anyDigest digest.Digest) digest.Digest</span><br><span class="line"></span><br><span class="line">    RecordDigestUncompressedPair(anyDigest digest.Digest, uncompressed digest.Digest)</span><br><span class="line">    RecordKnownLocation(transport ImageTransport, scope BICTransportScope, digest digest.Digest, location BICLocationReference)</span><br><span class="line">    CandidateLocations(transport ImageTransport, scope BICTransportScope, digest digest.Digest, canSubstitute <span class="type">bool</span>) []BICReplacementCandidate</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>搜 <code>ic.c.blobInfoCache</code> 里的 <code>blobInfoCache</code> 赋值，搜到：</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containers/skopeo/blob/v0.1.40/vendor/github.com/containers/image/v5/copy/copy.go#L173-L233</span></span><br><span class="line">blobInfoCache: blobinfocache.DefaultCache(options.DestinationCtx)</span><br></pre></td></tr></table></figure><p>然后发现是 boltdb 存储 cache：</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containers/skopeo/blob/v0.1.40/vendor/github.com/containers/image/v5/pkg/blobinfocache/default.go</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">DefaultCache</span><span class="params">(sys *types.SystemContext)</span></span> types.BlobInfoCache &#123;</span><br><span class="line">    dir, err := blobInfoCacheDir(sys, getRootlessUID())</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        logrus.Debugf(<span class="string">&quot;Error determining a location for %s, using a memory-only cache&quot;</span>, 
blobInfoCacheFilename)</span><br><span class="line">        <span class="keyword">return</span> memory.New()</span><br><span class="line">    &#125;</span><br><span class="line">    path := filepath.Join(dir, blobInfoCacheFilename)</span><br><span class="line">    <span class="keyword">if</span> err := os.MkdirAll(dir, <span class="number">0700</span>); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        logrus.Debugf(<span class="string">&quot;Error creating parent directories for %s, using a memory-only cache: %v&quot;</span>, blobInfoCacheFilename, err)</span><br><span class="line">        <span class="keyword">return</span> memory.New()</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    logrus.Debugf(<span class="string">&quot;Using blob info cache at %s&quot;</span>, path)</span><br><span class="line">    <span class="keyword">return</span> boltdb.New(path)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>On the build machine where the build failed:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cd</span> /var/lib/containers/cache</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span> -l</span></span><br><span class="line">total 25740</span><br><span class="line">-rw------- 1 root root 43134976 Oct 28 12:31 blob-info-cache-v1.boltdb</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span> 
-al</span></span><br><span class="line">total 25740</span><br><span class="line">drwx------ 2 root root       39 Aug 27 14:18 .</span><br><span class="line">drwxr-xr-x 4 root root       35 Aug 27 14:18 ..</span><br><span class="line">-rw------- 1 root root 43134976 Oct 28 12:31 blob-info-cache-v1.boltdb</span><br></pre></td></tr></table></figure><p>This also matches the boltdb frames in the stack traces:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">grep -Po <span class="string">&#x27;blobinfocache/boltdb/boltdb.go:\d+&#x27;</span> txt  | <span class="built_in">sort</span> -u</span></span><br><span class="line">blobinfocache/boltdb/boltdb.go:108</span><br><span class="line">blobinfocache/boltdb/boltdb.go:112</span><br><span class="line">blobinfocache/boltdb/boltdb.go:114</span><br><span class="line">blobinfocache/boltdb/boltdb.go:119</span><br><span class="line">blobinfocache/boltdb/boltdb.go:124</span><br><span class="line">blobinfocache/boltdb/boltdb.go:146</span><br><span class="line">blobinfocache/boltdb/boltdb.go:172</span><br><span class="line">blobinfocache/boltdb/boltdb.go:174</span><br><span class="line">blobinfocache/boltdb/boltdb.go:175</span><br><span class="line">blobinfocache/boltdb/boltdb.go:3010</span><br><span class="line">blobinfocache/boltdb/boltdb.go:54</span><br><span 
class="line">blobinfocache/boltdb/boltdb.go:56</span><br><span class="line">blobinfocache/boltdb/boltdb.go:58</span><br><span class="line">blobinfocache/boltdb/boltdb.go:65</span><br><span class="line">blobinfocache/boltdb/boltdb.go:66</span><br><span class="line">blobinfocache/boltdb/boltdb.go:67</span><br><span class="line">blobinfocache/boltdb/boltdb.go:84</span><br></pre></td></tr></table></figure><p>I won&#x27;t trace this part of the call chain in detail; the problem is clearly in boltdb, inside the <code>uncompressedDigest</code> method:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containers/skopeo/blob/v0.1.40/vendor/github.com/containers/image/v5/pkg/blobinfocache/boltdb/boltdb.go#L145C1-L146C57</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(bdc *cache)</span></span> uncompressedDigest(tx *bolt.Tx, anyDigest digest.Digest) digest.Digest &#123;</span><br><span class="line">    <span class="keyword">if</span> b := tx.Bucket(uncompressedDigestBucket); b != <span class="literal">nil</span> &#123;</span><br></pre></td></tr></table></figure><p>A failure here means the boltdb file is corrupted. Both docker and etcd can run into boltdb file corruption as well; searching the trace for the keyword confirms it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">grep invalid txt</span></span><br><span class="line">panic: invalid page type: 6432: 10</span><br></pre></td></tr></table></figure><h3 id="原因"><a href="#原因" class="headerlink" title="原因"></a>Cause</h3><p>I wrote a small CLI that lists the buckets in a boltdb file to reproduce it:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;fmt&quot;</span></span><br><span class="line">    <span class="string">&quot;log&quot;</span></span><br><span class="line">    <span class="string">&quot;os&quot;</span></span><br><span class="line"></span><br><span class="line">    bolt <span class="string">&quot;go.etcd.io/bbolt&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(os.Args) &lt; <span class="number">2</span> &#123;</span><br><span class="line">        fmt.Fprintf(os.Stderr, <span class="string">&quot;Usage: %s 
&lt;bolt-db-file&gt;\n&quot;</span>, os.Args[<span class="number">0</span>])</span><br><span class="line">        os.Exit(<span class="number">1</span>)</span><br><span class="line">    &#125;</span><br><span class="line">    filename := os.Args[<span class="number">1</span>]</span><br><span class="line"></span><br><span class="line">    db, err := bolt.Open(filename, <span class="number">0600</span>, &amp;bolt.Options&#123;ReadOnly: <span class="literal">true</span>&#125;)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        log.Fatalf(<span class="string">&quot;failed to open %s: %v&quot;</span>, filename, err)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">defer</span> db.Close()</span><br><span class="line"></span><br><span class="line">    err = db.View(<span class="function"><span class="keyword">func</span><span class="params">(tx *bolt.Tx)</span></span> <span class="type">error</span> &#123;</span><br><span class="line">        fmt.Printf(<span class="string">&quot;Top-level buckets in %s:\n&quot;</span>, filename)</span><br><span class="line">        <span class="keyword">return</span> tx.ForEach(<span class="function"><span class="keyword">func</span><span class="params">(name []<span class="type">byte</span>, _ *bolt.Bucket)</span></span> <span class="type">error</span> &#123;</span><br><span class="line">            fmt.Printf(<span class="string">&quot;- %s\n&quot;</span>, name)</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">        &#125;)</span><br><span class="line">    &#125;)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        log.Fatal(err)</span><br><span class="line">    &#125;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>Copy it to the build machine and run it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">./bbolt-tool blob-info-cache-v1.boltdb</span></span><br><span class="line">Top-level buckets in blob-info-cache-v1.boltdb:</span><br><span class="line">panic: invalid page type: 6432: 10</span><br><span class="line"></span><br><span class="line">goroutine 1 [running]:</span><br><span class="line">go.etcd.io/bbolt.(*Cursor).search(0xc0000a5cd8, &#123;0x7ff6b8ff8130, 0x47, 0x47&#125;, 0x0?)</span><br><span class="line">        /root/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/cursor.go:286 +0x279</span><br><span class="line">go.etcd.io/bbolt.(*Cursor).seek(0xc0000a5cd8, &#123;0x7ff6b8ff8130?, 0xc000080140?, 0xc0000ac040?&#125;)</span><br><span class="line">        /root/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/cursor.go:162 +0x2e</span><br><span class="line">go.etcd.io/bbolt.(*Bucket).Bucket(0xc0000aa018, &#123;0x7ff6b8ff8130, 0x47, 0x4e8c40?&#125;)</span><br><span class="line">        /root/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/bucket.go:97 
+0xb6</span><br><span class="line">main.main.func1.(*Tx).ForEach.2(&#123;0x7ff6b8ff8130, 0x47, 0x47&#125;, &#123;0xc0000a5dd8?, 0x1?, 0x1?&#125;)</span><br><span class="line">        /root/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/tx.go:158 +0x45</span><br><span class="line">go.etcd.io/bbolt.(*Bucket).ForEach(0x51c268?, 0xc0000a5de8)</span><br><span class="line">        /root/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/bucket.go:591 +0x89</span><br><span class="line">go.etcd.io/bbolt.(*Tx).ForEach(...)</span><br><span class="line">        /root/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/tx.go:157</span><br><span class="line">main.main.func1(0xc0000aa000)</span><br><span class="line">        /root/code/golang/bbolt/main.go:26 +0x9d</span><br><span class="line">go.etcd.io/bbolt.(*DB).View(0x7ffca6c567a8?, 0xc0000a5f00)</span><br><span class="line">        /root/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/db.go:939 +0x6c</span><br><span class="line">main.main()</span><br><span class="line">        /root/code/golang/bbolt/main.go:24 +0x1f0</span><br><span class="line">        </span><br><span class="line">        </span><br></pre></td></tr></table></figure><p>On a healthy build machine:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">./bbolt-tool blob-info-cache-v1.boltdb</span> </span><br><span class="line">Top-level buckets in blob-info-cache-v1.boltdb:</span><br><span class="line">- knownLocations</span><br></pre></td></tr></table></figure><h2 id="解决"><a href="#解决" class="headerlink" title="解决"></a>Solution</h2><p>Going by the <code>DefaultCache</code> function above, one workaround is to create the path as a regular file so that the in-memory cache is used instead. However, newer skopeo versions already use a <code>sqlite</code> cache by default:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// If the format changes in an incompatible way, increase the version number.</span></span><br><span class="line">blobInfoCacheFilename = <span class="string">&quot;blob-info-cache-v1.sqlite&quot;</span></span><br><span class="line"><span class="comment">// systemBlobInfoCacheDir is the directory containing the blob info cache (in blobInfocacheFilename) for root-running processes.</span></span><br><span class="line">systemBlobInfoCacheDir = <span class="string">&quot;/var/lib/containers/cache&quot;</span></span><br></pre></td></tr></table></figure><p>We switched to the newer skopeo and also added a prefix to the output of each concurrent worker, so that stack traces do not get interleaved the next time something similar happens. Older release branches still go through the old logic, so on the build machine we also moved the boltdb file out of the way to avoid affecting those branches.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;How we resolved a panic when running skopeo copy concurrently&lt;/p&gt;</summary>
    
    
    
    
    <category term="skopeo" scheme="http://zhangguanzhang.github.io/tags/skopeo/"/>
    
    <category term="boltdb" scheme="http://zhangguanzhang.github.io/tags/boltdb/"/>
    
  </entry>
  
  <entry>
    <title>How containerd gets started under an offline (binary) install versus a package-manager install of docker</title>
    <link href="http://zhangguanzhang.github.io/2025/10/23/docker-bin-containerd/"/>
    <id>http://zhangguanzhang.github.io/2025/10/23/docker-bin-containerd/</id>
    <published>2025-10-23T17:10:30.000Z</published>
    <updated>2025-10-23T17:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>A short primer on how docker startup relates to containerd.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Yesterday I handled an on-site issue where docker would not start, so I will use the troubleshooting process as a primer. The log from the failed start:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">journalctl -xe --no-pager -u docker</span></span><br><span class="line">Oct 23 17:28:33 XXX251023S00P systemd[1]: docker.service: Unit entered failed state.</span><br><span class="line">Oct 23 17:28:33 XXX251023S00P systemd[1]: docker.service: Failed with result &#x27;exit-code&#x27;.</span><br><span class="line">Oct 23 17:28:43 XXX251023S00P systemd[1]: docker.service: Service RestartSec=10s expired, scheduling restart.</span><br><span class="line">Oct 23 17:28:43 XXX251023S00P systemd[1]: Stopped Docker Application Container Engine.</span><br><span class="line">-- Subject: Unit docker.service has finished shutting down</span><br><span class="line">-- Defined-By: systemd</span><br><span class="line">-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel</span><br><span class="line">-- </span><br><span class="line">-- Unit docker.service has finished shutting down.</span><br><span class="line">Oct 23 17:28:43 XXX251023S00P systemd[1]: 
Starting Docker Application Container Engine...</span><br><span class="line">-- Subject: Unit docker.service has begun start-up</span><br><span class="line">-- Defined-By: systemd</span><br><span class="line">-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel</span><br><span class="line">-- </span><br><span class="line">-- Unit docker.service has begun starting up.</span><br><span class="line">Oct 23 17:28:43 XXX251023S00P dockerd[7355]: time=&quot;2025-10-23T17:28:43+08:00&quot; level=info msg=&quot;SUSE:secrets :: enabled&quot;</span><br><span class="line">Oct 23 17:28:44 XXX251023S00P dockerd[7355]: time=&quot;2025-10-23T17:28:44.000689797+08:00&quot; level=warning msg=&quot;The \&quot;graph\&quot; config file option is deprecated. Please use \&quot;data-root\&quot; instead.&quot;</span><br><span class="line">Oct 23 17:28:44 XXX251023S00P dockerd[7355]: time=&quot;2025-10-23T17:28:44.064839553+08:00&quot; level=warning msg=&quot;grpc: addrConn.createTransport failed to connect to &#123;unix:///run/containerd/containerd.sock 0  &lt;nil&gt;&#125;. Err :connection error: desc = \&quot;transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\&quot;. Reconnecting...&quot; module=grpc</span><br><span class="line">Oct 23 17:28:45 XXX251023S00P dockerd[7355]: time=&quot;2025-10-23T17:28:45.065163178+08:00&quot; level=warning msg=&quot;grpc: addrConn.createTransport failed to connect to &#123;unix:///run/containerd/containerd.sock 0  &lt;nil&gt;&#125;. Err :connection error: desc = \&quot;transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\&quot;. 
Reconnecting...&quot; module=grpc</span><br></pre></td></tr></table></figure><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Troubleshooting</h2><p>Scroll the log above to the right to read it in full. The key error is <code>Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused</code>. To fix it, you first need to understand how docker and containerd are started.</p><h3 id="包管理下的-docker-和-containerd"><a href="#包管理下的-docker-和-containerd" class="headerlink" title="包管理下的 docker 和 containerd"></a>docker and containerd installed via a package manager</h3><p>The docker daemon does its work by talking to containerd. When docker is installed through a package manager, there are two systemd service files:</p><ul><li>the containerd package provides <code>containerd.service</code></li><li>docker-ce provides <code>docker.service</code></li></ul><p>Taking the rpm packages as an example:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">rpm -qa | grep -P <span class="string">&#x27;containerd&#x27;</span></span></span><br><span class="line">containerd.io-1.6.33-3.1.el7.x86_64</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">rpm -ql containerd.io | grep -Ev <span class="string">&#x27;/(doc|licen|man)&#x27;</span></span></span><br><span class="line">/etc/containerd</span><br><span class="line">/etc/containerd/config.toml</span><br><span class="line">/usr/bin/containerd</span><br><span class="line">/usr/bin/containerd-shim</span><br><span class="line">/usr/bin/containerd-shim-runc-v1</span><br><span class="line">/usr/bin/containerd-shim-runc-v2</span><br><span class="line">/usr/bin/ctr</span><br><span 
class="line">/usr/bin/runc</span><br><span class="line">/usr/lib/systemd/system/containerd.service</span><br></pre></td></tr></table></figure><p>The packaged docker.service in turn declares a dependency on it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">systemctl <span class="built_in">cat</span> --no-pager  docker | grep containerd.service</span></span><br><span class="line">After=network-online.target docker.socket firewalld.service containerd.service time-set.target</span><br><span class="line">Wants=network-online.target containerd.service</span><br></pre></td></tr></table></figure><h3 id="二进制安装-docker"><a href="#二进制安装-docker" class="headerlink" title="二进制安装 docker"></a>docker installed from binaries</h3><p>Our on-premises deployments install docker offline: following the official documentation at <a href="https://docs.docker.com/engine/install/binaries/">https://docs.docker.com/engine/install/binaries/</a>, we download the binaries and install them, but that documentation does not say where to get the systemd service file. After I took this over I also found that no containerd.service had been created to manage containerd, yet docker still ran. Looking at docker's child processes shows:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">systemctl status docker</span></span><br><span class="line">● docker.service - Docker Application Container Engine</span><br><span class="line">   Loaded: loaded (/etc/systemd/system/docker.service; enabled; vendor preset: disabled)</span><br><span class="line">   Active: active 
(running) since 三 2025-10-22 16:44:38 CST; 1 day 1h ago</span><br><span class="line">     Docs: http://docs.docker.io</span><br><span class="line"> Main PID: 16487 (dockerd)</span><br><span class="line">    Tasks: 167</span><br><span class="line">   Memory: 6.1G</span><br><span class="line">   CGroup: /system.slice/docker.service</span><br><span class="line">...</span><br><span class="line">           ├─16506 containerd --config /var/run/docker/containerd/containerd.toml --log-level warn</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>So dockerd must be starting the containerd process itself from a goroutine; in older versions the binary may be named <code>docker-containerd</code>.<br>Working backwards through the source: if a goroutine starts the containerd process, the code has to assemble its command line, so searching the source for <code>--config</code> finds:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span 
class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v19.03.15/libcontainerd/supervisor/remote_daemon.go#L165C1-L212C5</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(r *remote)</span></span> startContainerd() <span class="type">error</span> &#123;</span><br><span class="line">    pid, err := r.getContainerdPid()</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> err</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> pid != <span class="number">-1</span> &#123;</span><br><span class="line">        r.daemonPid = pid</span><br><span class="line">        logrus.WithField(<span class="string">&quot;pid&quot;</span>, pid).</span><br><span class="line">            Infof(<span class="string">&quot;libcontainerd: %s is still running&quot;</span>, binaryName)</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    configFile, err := r.getContainerdConfig()</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> err</span><br><span class="line">    &#125;</span><br><span 
class="line"></span><br><span class="line">    args := []<span class="type">string</span>&#123;<span class="string">&quot;--config&quot;</span>, configFile&#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> r.Debug.Level != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">        args = <span class="built_in">append</span>(args, <span class="string">&quot;--log-level&quot;</span>, r.Debug.Level)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    cmd := exec.Command(binaryName, args...)</span><br><span class="line">    <span class="comment">// redirect containerd logs to docker logs</span></span><br><span class="line">    cmd.Stdout = os.Stdout</span><br><span class="line">    cmd.Stderr = os.Stderr</span><br><span class="line">    cmd.SysProcAttr = containerdSysProcAttr()</span><br><span class="line">    <span class="comment">// clear the NOTIFY_SOCKET from the env when starting containerd</span></span><br><span class="line">    cmd.Env = <span class="literal">nil</span></span><br><span class="line">    <span class="keyword">for</span> _, e := <span class="keyword">range</span> os.Environ() &#123;</span><br><span class="line">        <span class="keyword">if</span> !strings.HasPrefix(e, <span class="string">&quot;NOTIFY_SOCKET&quot;</span>) &#123;</span><br><span class="line">            cmd.Env = <span class="built_in">append</span>(cmd.Env, e)</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> err := cmd.Start(); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> err</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    r.daemonWaitCh = <span class="built_in">make</span>(<span class="keyword">chan</span> <span 
class="keyword">struct</span>&#123;&#125;)</span><br><span class="line">    <span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line">        <span class="comment">// Reap our child when needed</span></span><br><span class="line">        <span class="keyword">if</span> err := cmd.Wait(); err != <span class="literal">nil</span> &#123;</span><br><span class="line">            r.logger.WithError(err).Errorf(<span class="string">&quot;containerd did not exit successfully&quot;</span>)</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="built_in">close</span>(r.daemonWaitCh)</span><br><span class="line">    &#125;()</span><br></pre></td></tr></table></figure><p>Next, trace the callers of <code>startContainerd()</code> backwards:</p><ul><li><code>func (r *remote) monitorDaemon(ctx context.Context) { </code> in the same file</li><li><code>func Start(</code> in the same file</li><li>Because <code>Start</code> is capitalized (exported), it must be called from another package; searching for <code>supervisor.Start</code> finds:</li></ul><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v19.03.15/cmd/dockerd/daemon_unix.go#L152C1-L171</span></span><br><span 
class="function"><span class="keyword">func</span> <span class="params">(cli *DaemonCli)</span></span> initContainerD(ctx context.Context) (<span class="function"><span class="keyword">func</span><span class="params">(time.Duration)</span></span> <span class="type">error</span>, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="keyword">var</span> waitForShutdown <span class="function"><span class="keyword">func</span><span class="params">(time.Duration)</span></span> <span class="type">error</span></span><br><span class="line">    <span class="keyword">if</span> cli.Config.ContainerdAddr == <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">        systemContainerdAddr, ok, err := systemContainerdRunning(honorXDG)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span>, errors.Wrap(err, <span class="string">&quot;could not determine whether the system containerd is running&quot;</span>)</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span> !ok &#123;</span><br><span class="line">            logrus.Debug(<span class="string">&quot;Containerd not running, starting daemon managed containerd&quot;</span>)</span><br><span class="line">        opts, err := cli.getContainerdDaemonOpts()</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">          <span class="keyword">return</span> <span class="literal">nil</span>, errors.Wrap(err, <span class="string">&quot;failed to generate containerd options&quot;</span>)</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        r, err := supervisor.Start(ctx, filepath.Join(cli.Config.Root, <span 
class="string">&quot;containerd&quot;</span>), filepath.Join(cli.Config.ExecRoot, <span class="string">&quot;containerd&quot;</span>), opts...)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">          <span class="keyword">return</span> <span class="literal">nil</span>, errors.Wrap(err, <span class="string">&quot;failed to start containerd&quot;</span>)</span><br><span class="line">        &#125;</span><br><span class="line">        logrus.Debug(<span class="string">&quot;Started daemon managed containerd&quot;</span>)</span><br><span class="line">        cli.Config.ContainerdAddr = r.Address()</span><br></pre></td></tr></table></figure><p>The logic above is: <code>systemContainerdRunning</code> decides whether <code>containerd</code> is already running, and if it is not, <code>supervisor.Start</code> is called to start containerd. Here is the implementation of <code>systemContainerdRunning</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/moby/moby/blob/v19.03.15/cmd/dockerd/daemon.go#L691C1-L702C2</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">systemContainerdRunning</span><span class="params">(honorXDG <span class="type">bool</span>)</span></span> (<span class="type">string</span>, <span class="type">bool</span>, <span class="type">error</span>) &#123;</span><br><span class="line">    addr := containerddefaults.DefaultAddress</span><br><span class="line">    <span 
class="keyword">if</span> honorXDG &#123;</span><br><span class="line">        runtimeDir, err := homedir.GetRuntimeDir()</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="string">&quot;&quot;</span>, <span class="literal">false</span>, err</span><br><span class="line">        &#125;</span><br><span class="line">        addr = filepath.Join(runtimeDir, <span class="string">&quot;containerd&quot;</span>, <span class="string">&quot;containerd.sock&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line">    _, err := os.Lstat(addr)</span><br><span class="line">    <span class="keyword">return</span> addr, err == <span class="literal">nil</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>It decides whether containerd is running purely by checking whether the gRPC socket file <code>/run/containerd/containerd.sock</code> exists; it never tries to connect. In other words, if systemd has already started containerd, the docker daemon will not call <code>supervisor.Start</code> to launch a containerd child process.</p><p>The on-site machine has docker installed from the SUSE rpm packages. There is a containerd rpm, but it turns out to ship no service file, so startup goes through the child-process logic in the source above:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">rpm -ql containerd</span></span><br><span class="line">/etc/containerd</span><br><span class="line">/etc/containerd/config.toml</span><br><span class="line">/usr/sbin/containerd</span><br><span class="line">/usr/sbin/containerd-shim</span><br><span class="line">/usr/sbin/docker-containerd</span><br><span 
class="line">/usr/sbin/docker-containerd-shim</span><br><span class="line">/usr/share/doc/packages/containerd</span><br><span class="line">/usr/share/doc/packages/containerd/README</span><br><span class="line">/usr/share/licenses/containerd</span><br><span class="line">/usr/share/licenses/containerd/LICENSE</span><br></pre></td></tr></table></figure><p>Sure enough, the sock file exists but there is no containerd process:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span> -l /run/containerd/</span></span><br><span class="line">total 28</span><br><span class="line">srw-rw---- 1 root root     0 Oct 23 15:20 containerd.sock</span><br><span class="line">-rwxr-xr-x 1 root root 25651 Oct 23 15:26 events.log</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ps -ef | grep container[d]</span></span><br></pre></td></tr></table></figure><p>So dockerd saw the leftover socket, assumed containerd was already running, skipped starting its own child process, and kept getting <code>connection refused</code>. Deleting the stale sock file and restarting docker fixed it.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;A short primer on how docker startup relates to containerd&lt;/p&gt;</summary>
    
    
    
    
    <category term="docker" scheme="http://zhangguanzhang.github.io/tags/docker/"/>
    
    <category term="containerd" scheme="http://zhangguanzhang.github.io/tags/containerd/"/>
    
  </entry>
  
  <entry>
    <title>keepalived Locking pid file error 22 - Invalid argument</title>
    <link href="http://zhangguanzhang.github.io/2025/10/17/keepalived-locking-pid-file-invalid/"/>
    <id>http://zhangguanzhang.github.io/2025/10/17/keepalived-locking-pid-file-invalid/</id>
    <published>2025-10-17T18:10:30.000Z</published>
    <updated>2025-10-17T18:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>After our services were deployed on site, access problems showed up. Troubleshooting revealed that the service gateway could not reach etcd through its <a href="https://zhangguanzhang.github.io/2021/09/28/ipvs-svc/">keepalived IPVS svc</a>.</p><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Troubleshooting</h2><h3 id="基础排查"><a href="#基础排查" class="headerlink" title="基础排查"></a>Basic checks</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">curl 169.254.20.4:12379</span></span><br><span class="line">curl: (7) Failed to connect to 169.254.20.4 port 12379: Connection refused</span><br></pre></td></tr></table></figure><p>That address is the svc of our etcd. Curling the <code>real server</code> directly works:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ curl &lt;local-IP&gt;:12379</span><br></pre></td></tr></table></figure><p>Next I checked <code>iptables -t filter -S</code>, in case the customer's security hardening had inserted rules ahead of ours; there were no extra rules.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">sysctl --all |&amp; grep vs.conn</span></span><br><span class="line">net.ipv4.vs.conn_reuse_mode = 1</span><br><span class="line">net.ipv4.vs.conntrack = 1</span><br></pre></td></tr></table></figure><p><code>vs.conntrack</code> is fine (it is not 0), so next look at the ipvs rules:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker <span class="built_in">exec</span> keepalived-ipvs ipvsadm -<span class="built_in">ln</span> 
| grep -A3 169.254.20.4</span></span><br></pre></td></tr></table></figure><p>The on-site engineer reported that the output was empty. I had them drop the grep and look at the full output, which was also empty:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker <span class="built_in">exec</span> keepalived-ipvs ipvsadm -<span class="built_in">ln</span></span></span><br><span class="line">IP Virtual Server version 1.2.1 (size=4096)</span><br><span class="line">Prot LocalAddress:Port Scheduler Flags</span><br><span class="line"><span class="meta prompt_">  -&gt; </span><span class="language-bash">RemoteAddress:Port           Forward Weight ActiveConn InActConn</span></span><br></pre></td></tr></table></figure><p>I then asked them to check the keepalived log, which showed an error:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">daemon is already running</span><br><span class="line">Locking pid file /run/keepalived.pid error 22 - Invalid argument</span><br><span class="line">Opening file &#x27;/etc/keepalived/conf.d/xxx.conf&#x27;.</span><br><span class="line">Opening file &#x27;/etc/keepalived/conf.d/xxx2.conf&#x27;.</span><br><span class="line">...</span><br></pre></td></tr></table></figure><h3 id="源码"><a href="#源码" class="headerlink" title="源码"></a>Source code</h3><p>The keepalived container has to manipulate LVS rules, so it already runs with <code>privileged: true</code>. That made a kernel-related cause seem likely, so it was time to read the source. We use keepalived <code>v2.3.4</code>, the latest version; searching the source for <code>Locking pid file</code> finds:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/acassen/keepalived/blob/v2.3.4/keepalived/core/pidfile.c#L183-L194</span></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> HAVE_DECL_F_OFD_SETLK == 1</span></span><br><span class="line">    fl.l_pid = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">while</span> ((ret = fcntl(pidf-&gt;fd, F_OFD_SETLK, &amp;fl)) &amp;&amp; errno == EINTR);</span><br><span class="line">    <span class="keyword">if</span> (ret) &#123;</span><br><span class="line">      <span class="keyword">if</span> (errno == EAGAIN)</span><br><span class="line">        log_message(LOG_INFO, <span class="string">&quot;Another process has pid file %s locked&quot;</span>, pidf-&gt;path);</span><br><span class="line">      <span class="keyword">else</span></span><br><span class="line">        log_message(LOG_INFO, <span class="string">&quot;Locking pid file %s error %d - %m&quot;</span>, pidf-&gt;path, errno);</span><br><span class="line"></span><br><span class="line">      <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br></pre></td></tr></table></figure><p>The code inside this macro guard uses fcntl with <code>F_OFD_SETLK</code> to lock the pid file. This is a kernel feature (open file description locks, available since Linux 3.15), while the site runs CentOS 7.4 with kernel <code>3.10.0-693.el7.x86_64</code>, which predates it. That matches the <code>error 22 - Invalid argument</code> in the log. To see how this code came about, I downloaded the source and found the following two commits:</p><ul><li><a href="https://github.com/acassen/keepalived/commit/2c4cd3b927e5f12f59c62481261621b70375a304">https://github.com/acassen/keepalived/commit/2c4cd3b927e5f12f59c62481261621b70375a304</a></li><li><a 
href="https://github.com/acassen/keepalived/commit/7d2b85d1f03d3ae2944237c60c3eeb5edc4fa12a">https://github.com/acassen/keepalived/commit/7d2b85d1f03d3ae2944237c60c3eeb5edc4fa12a</a></li></ul><p>The first commit adds the pid file lock. The second uses autoconf to detect whether <code>F_OFD_SETLK</code> is supported and wraps the locking code in a macro guard, so keepalived can still be built on older systems without this feature.</p><h2 id="编译"><a href="#编译" class="headerlink" title="编译"></a>Build</h2><p>We deploy in containers and have switched the base image to openEuler. The keepalived shipped by openEuler's package manager is too old, so I compile and install keepalived inside the Dockerfile. The machine that builds the image runs a newer kernel, and the configure check only tests whether the build environment's headers declare <code>F_OFD_SETLK</code>, so the feature ends up enabled. The only option is to turn it off explicitly at build time.</p><p>The impact is not a concern: this runs in a container, and the image's startup script deletes the pid file before starting keepalived. Simply disable the feature at build time instead of hacking the source, since autoconf conventions are something everyone follows. The relevant part of <code>configure.ac</code> is:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">dnl -- Linux 3.15</span><br><span class="line">AC_CHECK_DECLS([F_OFD_SETLK], [], [],</span><br><span class="line">  [[</span><br><span class="line">    #include &lt;unistd.h&gt;</span><br><span class="line">    #include &lt;fcntl.h&gt;</span><br><span class="line">  ]])</span><br><span class="line">for flag in F_OFD_SETLK; do</span><br><span class="line">  AS_VAR_COPY([decl_var], [ac_cv_have_decl_$flag])</span><br><span class="line">  if test $&#123;decl_var&#125; = yes; then</span><br><span class="line">    add_system_opt[$&#123;flag&#125;]</span><br><span class="line">  fi</span><br><span class="line">done</span><br></pre></td></tr></table></figure><p>Per the logic above, the feature is enabled when <code>&quot;ac_cv_have_decl_&quot;+&quot;F_OFD_SETLK&quot;=&quot;yes&quot;</code>, so presetting that cache variable to <code>no</code> on the <code>./configure</code> command line is enough:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">...</span><br><span class="line">    ./autogen.sh; \</span><br><span class="line">    ./configure ac_cv_have_decl_F_OFD_SETLK=no \</span><br><span class="line">        --disable-dynamic-linking \</span><br><span class="line">        --prefix=/usr \</span><br><span class="line">        --exec-prefix=/usr \</span><br><span class="line">        --bindir=/usr/bin \</span><br><span class="line">        --sbindir=/usr/sbin \</span><br><span class="line">        --sysconfdir=/etc \</span><br><span class="line">        --enable-nftables \</span><br><span class="line">        --enable-regex \</span><br><span class="line">        --disable-systemd \</span><br><span class="line">        ; \</span><br></pre></td></tr></table></figure><p>After rebuilding and repackaging the image, testing showed no problems.</p>]]></content>
    
    
      
      
    <summary type="html">&lt;h2 id=&quot;由来&quot;&gt;&lt;a href=&quot;#由来&quot; class=&quot;headerlink&quot; title=&quot;由来&quot;&gt;&lt;/a&gt;由来&lt;/h2&gt;&lt;p&gt;现场部署业务后发现访问有问题，排查后发现业务网关访问 etcd 的 &lt;a href=&quot;https://zhangguanzhang.gith</summary>
      
    
    
    
    
    <category term="keepalived" scheme="http://zhangguanzhang.github.io/tags/keepalived/"/>
    
    <category term="fcntl" scheme="http://zhangguanzhang.github.io/tags/fnctl/"/>
    
  </entry>
  
  <entry>
    <title>From a home-grown NodePort whitelist to learning from calico's rules</title>
    <link href="http://zhangguanzhang.github.io/2025/10/15/k8s-NodePort-filter/"/>
    <id>http://zhangguanzhang.github.io/2025/10/15/k8s-NodePort-filter/</id>
    <published>2025-10-15T10:10:30.000Z</published>
    <updated>2025-10-15T10:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>A look at how calico implements a NodePort whitelist.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>We deliver on-premises deployments, and many customers care about security, so we need source whitelisting similar to <code>NetworkPolicy</code>. A minimal calico deployment, however, uses the following configuration:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">operator.tigera.io/v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Installation</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">default</span></span><br><span class="line"><span class="attr">spec:</span> <span class="comment"># https://docs.tigera.io/calico/latest/reference/installation/api#installationspec</span></span><br><span class="line">  <span class="attr">calicoNetwork:</span> <span class="comment"># https://docs.tigera.io/calico/latest/reference/installation/api#caliconetworkspec</span></span><br><span class="line">    <span class="attr">ipPools:</span></span><br><span class="line">    <span class="bullet">-</span> <span 
class="attr">name:</span> <span class="string">default-ipv4-ippool</span></span><br><span class="line">      <span class="attr">blockSize:</span> <span class="number">24</span> <span class="comment"># prefix length of the Pod IP block allocated to each node; the default is 26, I prefer 24 because it is easier to read</span></span><br><span class="line">      <span class="attr">cidr:</span> <span class="number">10.187</span><span class="number">.0</span><span class="number">.0</span><span class="string">/16</span></span><br><span class="line">      <span class="attr">encapsulation:</span> <span class="string">VXLANCrossSubnet</span> <span class="comment"># https://docs.tigera.io/calico/latest/reference/installation/api#encapsulationtype</span></span><br><span class="line">      <span class="attr">natOutgoing:</span> <span class="string">Enabled</span></span><br><span class="line">      <span class="attr">nodeSelector:</span> <span class="string">all()</span></span><br><span class="line">    <span class="attr">nodeAddressAutodetectionV4:</span> <span class="comment"># https://docs.tigera.io/calico/latest/reference/installation/api#nodeaddressautodetection</span></span><br><span class="line">      <span class="attr">canReach:</span> <span class="number">223.5</span><span class="number">.5</span><span class="number">.5</span></span><br><span class="line">  <span class="attr">registry:</span> <span class="string">m.daocloud.io/quay.io</span> <span class="comment"># https://docs.tigera.io/calico/latest/operations/image-options/alternate-registry</span></span><br><span class="line">  <span class="attr">flexVolumePath:</span> <span class="string">None</span> <span class="comment"># set to None to skip installing the CSI-related components</span></span><br><span class="line">  <span class="attr">kubeletVolumePluginPath:</span> <span class="string">None</span></span><br><span class="line"><span class="meta">---</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">operator.tigera.io/v1</span></span><br><span class="line"><span 
class="attr">kind:</span> <span class="string">APIServer</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">default</span></span><br><span class="line"><span class="attr">spec:</span> &#123;&#125;</span><br></pre></td></tr></table></figure><p>and even that deploys quite a few components:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl -n calico-system get deploy</span></span><br><span class="line">NAME                      READY   UP-TO-DATE   AVAILABLE   AGE</span><br><span class="line">calico-kube-controllers   1/1     1            1           6d3h</span><br><span class="line">calico-typha              1/1     1            1           6d3h</span><br><span class="line">goldmane                  1/1     1            1           6d1h</span><br><span class="line">whisker                   1/1     1            1           6d1h</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl -n calico-apiserver get deploy</span></span><br><span class="line">NAME               READY   UP-TO-DATE   AVAILABLE   AGE</span><br><span class="line">calico-apiserver   2/2     2            2           6d1h</span><br></pre></td></tr></table></figure><p>Many customers' machines are also low-spec, so we use flannel. For network policy we implemented an agent container that maintains iptables whitelist rules, which also works with plain docker outside K8S. Starting from some release, a few services needed to be exposed via NodePort so that backup files could be uploaded from outside, and given customers' security requirements the <code>NodePort</code> has to be restricted by a whitelist.<br>If you are not familiar with iptables, you might instinctively do this in the <code>INPUT</code> chain, but that does not work: NodePort traffic is matched and NATed first in <code>PREROUTING</code> of the nat table. Take the following svc as an example:</p><figure 
class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">Name:                     my-service</span><br><span class="line">Namespace:                default</span><br><span class="line">Labels:                   &lt;none&gt;</span><br><span class="line">Annotations:              &lt;none&gt;</span><br><span class="line">Selector:                 nodePort=test</span><br><span class="line">Type:                     NodePort</span><br><span class="line">IP Family Policy:         SingleStack</span><br><span class="line">IP Families:              IPv4</span><br><span class="line">IP:                       10.186.158.205</span><br><span class="line">IPs:                      10.186.158.205</span><br><span class="line">Port:                     &lt;unset&gt;  80/TCP</span><br><span class="line">TargetPort:               80/TCP</span><br><span class="line">NodePort:                 &lt;unset&gt;  30008/TCP</span><br><span class="line">Endpoints:                10.187.220.19:80</span><br><span class="line">Session Affinity:         None</span><br><span class="line">External Traffic Policy:  Cluster</span><br><span class="line">Events:                   &lt;none&gt;</span><br></pre></td></tr></table></figure><p>The relevant rules reachable from the <code>PREROUTING</code> chain of the nat table are:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">entry point</span></span><br><span class="line">-A PREROUTING -m comment --comment &quot;kubernetes service portals&quot; -j KUBE-SERVICES</span><br><span class="line">-A KUBE-SERVICES -m comment --comment &quot;kubernetes service nodeports; NOTE: this must be the last rule in this chain&quot; -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS</span><br><span class="line">-A KUBE-NODEPORTS -p tcp -m comment --comment &quot;default/my-service&quot; -m tcp --dport 30008 -j KUBE-EXT-FXIYY6OHUSNBITIX</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">once the nodePort has matched, the packet is first marked for SNAT, then jumps to the svc DNAT chain KUBE-SVC-FXIYY6OHUSNBITIX below</span></span><br><span class="line">-A KUBE-EXT-FXIYY6OHUSNBITIX -m comment --comment &quot;masquerade traffic for default/my-service external destinations&quot; -j KUBE-MARK-MASQ</span><br><span class="line">-A KUBE-EXT-FXIYY6OHUSNBITIX -j KUBE-SVC-FXIYY6OHUSNBITIX</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">DNAT chain of the svc</span></span><br><span class="line">-A KUBE-SVC-FXIYY6OHUSNBITIX -d 10.186.158.205/32 -p tcp -m comment --comment &quot;default/my-service cluster IP&quot; -m tcp --dport 80 -j KUBE-MARK-MASQ</span><br><span class="line">-A KUBE-SVC-FXIYY6OHUSNBITIX -m comment --comment &quot;default/my-service -&gt; 10.187.220.19:80&quot; -j KUBE-SEP-DPGHCWFA3YQKRCGQ</span><br><span class="line"><span 
class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">endpoint chain of the svc</span></span><br><span class="line">-A KUBE-SEP-DPGHCWFA3YQKRCGQ -s 10.187.220.19/32 -m comment --comment &quot;default/my-service&quot; -j KUBE-MARK-MASQ</span><br><span class="line">-A KUBE-SEP-DPGHCWFA3YQKRCGQ -p tcp -m comment --comment &quot;default/my-service&quot; -m tcp -j DNAT --to-destination 10.187.220.19:80</span><br></pre></td></tr></table></figure><p>By the time a packet reaches INPUT, its destination IP and port have already been rewritten by DNAT, so the NodePort cannot be matched and blocked in INPUT. The same applies to ports published with docker -p. That is why I previously did the filtering in <code>PREROUTING</code> of the raw table.</p><h2 id="规则问题"><a href="#规则问题" class="headerlink" title="规则问题"></a>Problem with the rules</h2><p>The design uses two ipsets: one holds the whitelist port list (the ports being protected) and the other holds the whitelisted IPs. The rules are:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t raw -S PREROUTING</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">allow return traffic based on conntrack state</span></span><br><span class="line">-A PREROUTING -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">drop when the source IP is not whitelisted but the destination port is in the whitelist port set</span></span><br><span class="line">-A PREROUTING -m set ! 
--match-set whiteiplist src -m set --match-set whiteportlist dst -j DROP</span><br></pre></td></tr></table></figure><p>Testing showed no problems at first. Later, the delivery team occasionally reported that in customer environments, services acting as clients would time out with low probability when accessing external servers. Packet captures showed that when the local host accessed an external server, the server's reply was being blocked:</p><ul><li>The local host, acting as the client, sends a SYN to the external server</li><li>The SYN-ACK that the external server sends back to the local host is blocked</li></ul><p>The iptables counters show:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t raw -nvL PREROUTING</span></span><br><span class="line">Chain PREROUTING (policy ACCEPT)</span><br><span class="line"> pkts bytes target     prot opt in     out     source               destination         </span><br><span class="line"> 162M   64G ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED</span><br><span class="line"> 20   1604 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            ! 
match-set whiteiplist src match-set whiteportlist dst</span><br></pre></td></tr></table></figure><p>So it is indeed this raw rule that matches and drops the packets. Further investigation showed that, starting from some release, many ports had been added to the whitelist port set, for example <code>49100-49500</code> (we also use whiteportlist in the INPUT chain). These fall inside <code>ip_local_port_range</code>, so the problem occurs whenever a client happens to use one of them:</p><ul><li>The local host connects to an external server and is assigned a <code>local_port</code> that is in <code>whiteportlist</code>, for example <code>49123</code></li><li>When the reply arrives, it has not yet been marked <code>ESTABLISHED</code> by <code>conntrack</code> at this point (the raw table is traversed before connection tracking), so it falls through to the next rule</li><li>It then matches the next rule and is dropped</li></ul><p>Reproducing it with a small TCP program:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> socket</span><br><span class="line"></span><br><span class="line">s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)</span><br><span class="line">s.bind((<span class="string">&#x27;0.0.0.0&#x27;</span>, <span class="number">49123</span>))</span><br><span class="line">s.connect((<span class="string">&#x27;39.156.70.37&#x27;</span>, <span class="number">80</span>))</span><br><span class="line">s.send(<span class="string">b&#x27;GET / HTTP/1.1\r\nHost: www.baidu.com\r\n\r\n&#x27;</span>)</span><br><span class="line"><span class="built_in">print</span>(s.recv(<span class="number">1024</span>))</span><br><span class="line">s.close()</span><br></pre></td></tr></table></figure><p><code>39.156.70.37</code> is an IP of Baidu's domain. The script hangs once it is run, and the rule counters increase accordingly:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">reset the PREROUTING 
counters</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t raw -Z PREROUTING</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">python test.py</span></span><br><span class="line">^C</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t raw -nvL PREROUTING</span></span><br></pre></td></tr></table></figure><p>It seems raw lives up to its name: it is too raw.</p><h2 id="calico"><a href="#calico" class="headerlink" title="calico"></a>calico</h2><p>To see how calico does it, I deployed calico on a clean single-node K8S cluster. The calico version installed by the operator is:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker images | grep calico/node</span></span><br><span class="line">m.daocloud.io/quay.io/calico/node                           v3.30.3                                  ce9c4ac0f175   7 weeks ago    401MB</span><br></pre></td></tr></table></figure><p>The rule analysis in this post is based on <code>v3.30.3</code>.</p><h3 id="准备工作"><a href="#准备工作" class="headerlink" title="准备工作"></a>Preparation</h3><p>First deploy a NodePort service:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span 
class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Pod</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">namespace:</span> <span class="string">default</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">test-hostname</span></span><br><span class="line">  <span class="attr">labels:</span></span><br><span class="line">    <span class="attr">nodePort:</span> <span class="string">test</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">containers:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">test</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">m.daocloud.io/docker.io/library/nginx:alpine</span></span><br><span class="line"><span class="meta">---</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Service</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">my-service</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">type:</span> <span class="string">NodePort</span></span><br><span class="line">  <span class="attr">selector:</span></span><br><span class="line">    <span class="attr">nodePort:</span> <span class="string">test</span></span><br><span class="line">  <span 
class="attr">ports:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">protocol:</span> <span class="string">TCP</span></span><br><span class="line">      <span class="attr">port:</span> <span class="number">80</span></span><br><span class="line">      <span class="attr">targetPort:</span> <span class="number">80</span></span><br><span class="line">      <span class="attr">nodePort:</span> <span class="number">30008</span></span><br></pre></td></tr></table></figure><p>Save the current state so it can be compared once the policy takes effect:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">iptables -w -t raw -S &gt; raw</span><br><span class="line">iptables -w -t nat -S &gt; nat</span><br><span class="line">iptables -w -t mangle -S &gt; mangle</span><br><span class="line">iptables -w -S &gt; filter</span><br><span class="line">ipset list &gt; ipset</span><br></pre></td></tr></table></figure><h3 id="GlobalNetworkPolicy"><a href="#GlobalNetworkPolicy" class="headerlink" title="GlobalNetworkPolicy"></a>GlobalNetworkPolicy</h3><p>A Google search led to the official documentation <a href="https://docs.tigera.io/calico/latest/network-policy/services/kubernetes-node-ports">network-policy kubernetes-node-ports</a>, which uses <code>GlobalNetworkPolicy</code>. According to the docs, this calico CRD has a wider scope than <code>NetworkPolicy</code>: it can control traffic at the host level, whereas <code>NetworkPolicy</code> only covers namespace and Pod policies. Based on the documentation example I wrote:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">projectcalico.org/v3</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">GlobalNetworkPolicy</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">allow-cluster-nodeport-only</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="comment"># Normally this is combined with a catch-all policy: higher-priority policies allow and the lowest-priority one denies (a whitelist), or the lowest-priority one allows and higher-priority ones drop (a blacklist)</span></span><br><span class="line"><span class="comment"># This is only a test, so I wrote just the rules below</span></span><br><span class="line">  <span class="attr">order:</span> <span class="number">20</span></span><br><span class="line">  <span class="attr">preDNAT:</span> <span class="literal">true</span></span><br><span class="line">  <span class="attr">applyOnForward:</span> <span class="literal">true</span></span><br><span class="line">  <span class="attr">ingress:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">action:</span> <span class="string">Allow</span></span><br><span class="line">      <span class="attr">source:</span></span><br><span class="line">        <span class="attr">nets:</span></span><br><span class="line">         <span class="bullet">-</span> <span class="number">10.</span><span class="string">xxx.41.110/32</span> <span class="comment"># the node's own IP</span></span><br><span class="line">         <span class="bullet">-</span> <span class="number">10.</span><span 
class="string">xxx.195.118/32</span> <span class="comment"># external test IP</span></span><br><span class="line">         <span class="bullet">-</span> <span class="number">10.187</span><span class="number">.0</span><span class="number">.0</span><span class="string">/16</span> <span class="comment"># Pod CIDR</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">action:</span> <span class="string">Deny</span></span><br><span class="line">      <span class="attr">protocol:</span> <span class="string">TCP</span></span><br><span class="line">      <span class="attr">destination:</span></span><br><span class="line">        <span class="attr">ports:</span> [<span class="number">30008</span>]</span><br><span class="line">  <span class="attr">selector:</span> <span class="string">has(kubernetes.io/os)</span></span><br></pre></td></tr></table></figure><p>After applying it, no new rules appeared in any iptables table. The official docs say the selector can select nodes, but that did not work in my test. Elsewhere the docs use <code>kind: HostEndpoint</code> together with <code>selector</code>, so I enabled automatic creation of host endpoints (hep), which still did not help:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl patch kubecontrollersconfigurations default \</span></span><br><span class="line"><span class="language-bash">  --<span class="built_in">type</span>=merge -p <span class="string">&#x27;&#123;&quot;spec&quot;: &#123;&quot;controllers&quot;: &#123;&quot;node&quot;:&#123;&quot;hostEndpoint&quot;:&#123;&quot;autoCreate&quot;: &quot;Enabled&quot;&#125;&#125;&#125;&#125;&#125;&#x27;</span></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl get hep -l kubernetes.io/os</span></span><br><span class="line">NAME                     CREATED AT</span><br><span 
class="line">10.xxx.xx.170-auto-hep   2025-10-15T09:18:37</span><br></pre></td></tr></table></figure><p>After reading the selector documentation I changed it to <code>selector: all()</code>, and it worked: an external IP that is not in the whitelist above can no longer curl the NodePort.</p><h2 id="规则研究"><a href="#规则研究" class="headerlink" title="规则研究"></a>Rule analysis</h2><p>Export the current rules:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">iptables -w -t raw -S &gt; raw2</span><br><span class="line">iptables -w -t nat -S &gt; nat2</span><br><span class="line">iptables -w -t mangle -S &gt; mangle2</span><br><span class="line">iptables -w -S &gt; filter2</span><br></pre></td></tr></table></figure><h3 id="新增规则"><a href="#新增规则" class="headerlink" title="新增规则"></a>Newly added rules</h3><p>Compare:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span 
class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">diff mangle*</span></span><br><span class="line">10a11</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-N cali-failsafe-in</span></span><br><span class="line">11a13</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-N cali-fh-any-interface-at-all</span></span><br><span class="line">12a15</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-N cali-pi-_Ddz2TLFtYPs0Zt3iUZs</span></span><br><span class="line">25a29,37</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:wWFQM43tJU7wwnFZ&quot;</span> -m multiport --dports 22 -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p udp -m comment --comment <span class="string">&quot;cali:LwNV--R8MjeUYacw&quot;</span> -m multiport --dports 68 -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:QOO5NUOqOSS1_Iw0&quot;</span> -m multiport --dports 179 -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:cwZWoBSwVeIAZmVN&quot;</span> -m multiport --dports 2379 -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:7FbNXT91kugE_upR&quot;</span> -m multiport --dports 2380 -j 
ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:8Ftbkk2dRH2eEeq1&quot;</span> -m multiport --dports 5473 -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:-JoRSaAQZPJAegMo&quot;</span> -m multiport --dports 6443 -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:PUKij4Rn9njHfVTi&quot;</span> -m multiport --dports 6666 -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-failsafe-in -p tcp -m comment --comment <span class="string">&quot;cali:vSprVE-4rient0wc&quot;</span> -m multiport --dports 6667 -j ACCEPT</span></span><br><span class="line">34a47,64</span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-fh-any-interface-at-all -m comment --comment <span class="string">&quot;cali:CCbcqJXqEISzSqnH&quot;</span> -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-fh-any-interface-at-all -m comment --comment <span class="string">&quot;cali:mmvu-cTJXJ7YH9Lp&quot;</span> -m conntrack --ctstate INVALID -j DROP</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-fh-any-interface-at-all -m comment --comment <span class="string">&quot;cali:NnqjZhu9yccY4C7-&quot;</span> -j cali-failsafe-in</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-fh-any-interface-at-all -m comment --comment <span 
class="string">&quot;cali:AtciE88iDfq0ah2L&quot;</span> -j MARK --set-xmark 0x0/0x30000</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-fh-any-interface-at-all -m comment --comment <span class="string">&quot;cali:BZMMxJKaVi8hIM9r&quot;</span> -m comment --comment <span class="string">&quot;Start of tier default&quot;</span> -j MARK --set-xmark 0x0/0x20000</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-fh-any-interface-at-all -m comment --comment <span class="string">&quot;cali:_hnIU4TYdSt--CFh&quot;</span> -m mark --mark 0x0/0x20000 -j cali-pi-_Ddz2TLFtYPs0Zt3iUZs</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-fh-any-interface-at-all -m comment --comment <span class="string">&quot;cali:-n3Ama1WlBcv-Yv9&quot;</span> -m comment --comment <span class="string">&quot;Return if policy accepted&quot;</span> -m mark --mark 0x10000/0x10000 -j RETURN</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-from-host-endpoint -m comment --comment <span class="string">&quot;cali:0MLuqUx2SPsTwgBS&quot;</span> -g cali-fh-any-interface-at-all</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment <span class="string">&quot;cali:5eFTXO3b0B-Tbiq8&quot;</span> -m comment --comment <span class="string">&quot;Policy default.allow-cluster-nodeport-only ingress&quot;</span> -j MARK --set-xmark 0x0/0x180000</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -s 10.xxx.41.110/32 -m comment --comment <span class="string">&quot;cali:L5TwSbHWsELZIAEd&quot;</span> -j MARK --set-xmark 0x80000/0x80000</span></span><br><span class="line"><span class="meta prompt_">&gt; 
</span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -s 10.xxx.195.118/32 -m comment --comment <span class="string">&quot;cali:noUEAlswvbgG5j7d&quot;</span> -j MARK --set-xmark 0x80000/0x80000</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -s 10.187.0.0/16 -m comment --comment <span class="string">&quot;cali:TxEjJz-IsLiJzVDK&quot;</span> -j MARK --set-xmark 0x80000/0x80000</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment <span class="string">&quot;cali:SBosizM5mtjxTsOe&quot;</span> -m mark --mark 0x80000/0x80000 -j MARK --set-xmark 0x10000/0x10000</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment <span class="string">&quot;cali:5w4NEetZaXhF7wjm&quot;</span> -m mark --mark 0x10000/0x10000 -j NFLOG --nflog-prefix  <span class="string">&quot;API0|default.allow-cluster-nodeport-only&quot;</span> --nflog-group 1 --nflog-range 80</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment <span class="string">&quot;cali:NMym66CfdBVWGhc6&quot;</span> -m mark --mark 0x10000/0x10000 -j RETURN</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -p tcp -m comment --comment <span class="string">&quot;cali:HONlGpSGnitWLUh-&quot;</span> -m multiport --dports 30008 -j MARK --set-xmark 0x40000/0x40000</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment <span class="string">&quot;cali:9vcFh92OaOMP06xg&quot;</span> -m mark --mark 0x40000/0x40000 -j NFLOG --nflog-prefix  <span 
class="string">&quot;DPI1|default.allow-cluster-nodeport-only&quot;</span> --nflog-group 1 --nflog-range 80</span></span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash">-A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment <span class="string">&quot;cali:URAeOCUsDbThanFp&quot;</span> -m mark --mark 0x40000/0x40000 -j DROP</span></span><br></pre></td></tr></table></figure><p>The main change is three new chains:</p><ul><li><code>cali-failsafe-in</code></li><li><code>cali-fh-any-interface-at-all</code></li><li><code>cali-pi-_Ddz2TLFtYPs0Zt3iUZs</code></li></ul><p>As its name suggests, the <code>cali-failsafe-in</code> chain is a safety net: it first accepts ports such as ssh&#x2F;etcd&#x2F;kube-apiserver, so that a misconfigured network policy cannot cut off access to the machines or the cluster. The ports involved are listed in the official <a href="https://docs.tigera.io/calico/latest/reference/host-endpoints/failsafe">failsafe</a> documentation. Traffic subject to the newly added filtering rules jumps to this chain first.</p><p>Of the other two chains, the actual policy handling happens in <code>cali-pi-_Ddz2TLFtYPs0Zt3iUZs</code>, which is reached from <code>cali-fh-any-interface-at-all</code>:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">&gt; -A cali-fh-any-interface-at-all -m comment --comment &quot;cali:_hnIU4TYdSt--CFh&quot; -m mark --mark 0x0/0x20000 -j cali-pi-_Ddz2TLFtYPs0Zt3iUZs</span><br></pre></td></tr></table></figure><p>Take the lazy approach: zero the chain's counters and see which rules get hit:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t mangle -Z cali-pi-_Ddz2TLFtYPs0Zt3iUZs</span></span><br><span class="line"><span class="meta prompt_">$ 
</span><span class="language-bash">iptables -t mangle -nvL cali-pi-_Ddz2TLFtYPs0Zt3iUZs</span></span><br><span class="line">Chain cali-pi-_Ddz2TLFtYPs0Zt3iUZs (1 references)</span><br><span class="line"> pkts bytes target     prot opt in     out     source               destination         </span><br><span class="line">   10   600 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:5eFTXO3b0B-Tbiq8 */ /* Policy default.allow-cluster-nodeport-only ingress */ MARK and 0xffe7ffff</span><br><span class="line">    0     0 MARK       all  --  *      *       10.xxx.41.110        0.0.0.0/0            /* cali:L5TwSbHWsELZIAEd */ MARK or 0x80000</span><br><span class="line">    0     0 MARK       all  --  *      *       10.xxx.195.118       0.0.0.0/0            /* cali:noUEAlswvbgG5j7d */ MARK or 0x80000</span><br><span class="line">    0     0 MARK       all  --  *      *       10.187.0.0/16        0.0.0.0/0            /* cali:TxEjJz-IsLiJzVDK */ MARK or 0x80000</span><br><span class="line">    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:SBosizM5mtjxTsOe */ mark match 0x80000/0x80000 MARK or 0x10000</span><br><span class="line">    0     0 NFLOG      all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:5w4NEetZaXhF7wjm */ mark match 0x10000/0x10000 nflog-prefix  &quot;API0|default.allow-cluster-nodeport-only&quot; nflog-group 1 nflog-range 80</span><br><span class="line">    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:NMym66CfdBVWGhc6 */ mark match 0x10000/0x10000</span><br><span class="line">    0     0 MARK       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:HONlGpSGnitWLUh- */ multiport dports 30008 MARK or 0x40000</span><br><span class="line">    0     0 NFLOG      all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:9vcFh92OaOMP06xg */ mark match 0x40000/0x40000 
nflog-prefix  &quot;DPI1|default.allow-cluster-nodeport-only&quot; nflog-group 1 nflog-range 80</span><br><span class="line">    0     0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:URAeOCUsDbThanFp */ mark match 0x40000/0x40000</span><br></pre></td></tr></table></figure><p>Then curl the NodePort from an external host that is not in the whitelist, and check the counters again:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t mangle -nvL cali-pi-_Ddz2TLFtYPs0Zt3iUZs</span></span><br><span class="line">Chain cali-pi-_Ddz2TLFtYPs0Zt3iUZs (1 references)</span><br><span class="line"> pkts bytes target     prot opt in     out     source               destination         </span><br><span class="line">   44  2640 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:5eFTXO3b0B-Tbiq8 */ /* Policy default.allow-cluster-nodeport-only ingress */ MARK and 0xffe7ffff</span><br><span class="line">    0     0 MARK       all  --  *      *       10.xxx.41.110        0.0.0.0/0            /* cali:L5TwSbHWsELZIAEd */ MARK or 0x80000</span><br><span class="line">    0     0 MARK       all  --  *      *       10.xxx.195.118       0.0.0.0/0            /* cali:noUEAlswvbgG5j7d */ MARK or 0x80000</span><br><span class="line">    0     0 MARK       all  --  *      *       10.187.0.0/16        0.0.0.0/0            /* cali:TxEjJz-IsLiJzVDK */ MARK or 0x80000</span><br><span class="line">    0     0 MARK       all 
 --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:SBosizM5mtjxTsOe */ mark match 0x80000/0x80000 MARK or 0x10000</span><br><span class="line">    0     0 NFLOG      all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:5w4NEetZaXhF7wjm */ mark match 0x10000/0x10000 nflog-prefix  &quot;API0|default.allow-cluster-nodeport-only&quot; nflog-group 1 nflog-range 80</span><br><span class="line">    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:NMym66CfdBVWGhc6 */ mark match 0x10000/0x10000</span><br><span class="line">    2   120 MARK       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:HONlGpSGnitWLUh- */ multiport dports 30008 MARK or 0x40000</span><br><span class="line">    2   120 NFLOG      all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:9vcFh92OaOMP06xg */ mark match 0x40000/0x40000 nflog-prefix  &quot;DPI1|default.allow-cluster-nodeport-only&quot; nflog-group 1 nflog-range 80</span><br><span class="line">    2   120 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:URAeOCUsDbThanFp */ mark match 0x40000/0x40000</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>In essence, the mark bits are used as condition flags for matching. These are the rules that matter:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment &quot;cali:5eFTXO3b0B-Tbiq8&quot; -m comment --comment &quot;Policy default.allow-cluster-nodeport-only ingress&quot; -j MARK --set-xmark 0x0/0x180000</span><br><span class="line">&gt; -A 
cali-pi-_Ddz2TLFtYPs0Zt3iUZs -s 10.xxx.41.110/32 -m comment --comment &quot;cali:L5TwSbHWsELZIAEd&quot; -j MARK --set-xmark 0x80000/0x80000</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -s 10.xxx.195.118/32 -m comment --comment &quot;cali:noUEAlswvbgG5j7d&quot; -j MARK --set-xmark 0x80000/0x80000</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -s 10.187.0.0/16 -m comment --comment &quot;cali:TxEjJz-IsLiJzVDK&quot; -j MARK --set-xmark 0x80000/0x80000</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment &quot;cali:SBosizM5mtjxTsOe&quot; -m mark --mark 0x80000/0x80000 -j MARK --set-xmark 0x10000/0x10000</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment &quot;cali:5w4NEetZaXhF7wjm&quot; -m mark --mark 0x10000/0x10000 -j NFLOG --nflog-prefix  &quot;API0|default.allow-cluster-nodeport-only&quot; --nflog-group 1 --nflog-range 80</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment &quot;cali:NMym66CfdBVWGhc6&quot; -m mark --mark 0x10000/0x10000 -j RETURN</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -p tcp -m comment --comment &quot;cali:HONlGpSGnitWLUh-&quot; -m multiport --dports 30008 -j MARK --set-xmark 0x40000/0x40000</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment &quot;cali:9vcFh92OaOMP06xg&quot; -m mark --mark 0x40000/0x40000 -j NFLOG --nflog-prefix  &quot;DPI1|default.allow-cluster-nodeport-only&quot; --nflog-group 1 --nflog-range 80</span><br><span class="line">&gt; -A cali-pi-_Ddz2TLFtYPs0Zt3iUZs -m comment --comment &quot;cali:URAeOCUsDbThanFp&quot; -m mark --mark 0x40000/0x40000 -j DROP</span><br></pre></td></tr></table></figure><ul><li>Whitelisted sources are tagged with the <code>0x80000/0x80000</code> mark</li><li><code>-m mark --mark 0x80000/0x80000 -j MARK --set-xmark 0x10000/0x10000</code>: packets that match <code>0x80000/0x80000</code> are given the additional mark <code>0x10000/0x10000</code>.
Read the marks as binary flags: both bits are now set</li><li><code>--mark 0x10000/0x10000 -j NFLOG</code>: packets that match <code>0x10000/0x10000</code> are logged through NFLOG, which can be captured with <code>tcpdump -i nflog:1</code>. Combined with the previous rule, only packets that hit the policy are logged</li><li><code>--mark 0x10000/0x10000 -j RETURN</code>: packets accepted by the whitelist return here and go no further down the chain</li><li><code>-m multiport --dports 30008 -j MARK --set-xmark 0x40000/0x40000</code>: traffic to the NodePort is marked</li><li><code>--mark 0x40000/0x40000 -j NFLOG</code>: log it, then continue</li><li><code>--mark 0x40000/0x40000 -j DROP</code>: drop the packet</li></ul><h3 id="mangle-的-PREROUTING"><a href="#mangle-的-PREROUTING" class="headerlink" title="mangle 的 PREROUTING"></a>PREROUTING in mangle</h3><p>The whole flow lives in the mangle table:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t mangle -S PREROUTING</span></span><br><span class="line">-P PREROUTING ACCEPT</span><br><span class="line">-A PREROUTING -m comment --comment &quot;cali:6gwbT8clXdHdC1b1&quot; -j cali-PREROUTING</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t mangle -S cali-PREROUTING</span></span><br><span class="line">-N cali-PREROUTING</span><br><span class="line">-A cali-PREROUTING -m comment --comment &quot;cali:6BJqBjBC7crtA-7-&quot; -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT</span><br><span class="line">-A cali-PREROUTING -m comment --comment &quot;cali:KX7AGNd6rMcDUai6&quot; -m mark --mark 0x10000/0x10000 -j ACCEPT</span><br><span class="line">-A cali-PREROUTING -m comment --comment &quot;cali:wNH7KsA3ILKJBsY9&quot; -j 
cali-from-host-endpoint</span><br><span class="line">-A cali-PREROUTING -m comment --comment &quot;cali:Cg96MgVuoPm7UMRo&quot; -m comment --comment &quot;Host endpoint policy accepted packet.&quot; -m mark --mark 0x10000/0x10000 -j ACCEPT</span><br></pre></td></tr></table></figure><p>Next comes <code>cali-from-host-endpoint</code>. If it does not return, the packet never reaches the rule after it, <code>&quot;Host endpoint policy accepted packet.&quot; -m mark --mark 0x10000/0x10000 -j ACCEPT</code>. The chain itself looks like this:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">iptables -t mangle -S cali-from-host-endpoint</span></span><br><span class="line">-N cali-from-host-endpoint</span><br><span class="line">-A cali-from-host-endpoint -m comment --comment &quot;cali:0MLuqUx2SPsTwgBS&quot; -g cali-fh-any-interface-at-all</span><br></pre></td></tr></table></figure><p>As you can see, it goes on to the newly added rules from the diff above. That is the whole flow.</p><h2 id="相关源码"><a href="#相关源码" class="headerlink" title="相关源码"></a>Relevant source code</h2><p>In Calico, the component responsible for iptables rules is Felix.</p><h3 id="链名字"><a href="#链名字" class="headerlink" title="链名字"></a>Chain names</h3><p><a href="https://github.com/projectcalico/calico/blob/v3.30.3/felix/rules/rule_defs.go">https://github.com/projectcalico/calico/blob/v3.30.3/felix/rules/rule_defs.go</a></p><h3 id="mark"><a href="#mark" class="headerlink" title="mark"></a>mark</h3><p>The mark values can be found in the source:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 
https://github.com/projectcalico/calico/blob/v3.30.3/felix/dataplane/driver.go#L156-L164</span></span><br><span class="line">log.WithFields(log.Fields&#123;</span><br><span class="line"><span class="string">&quot;acceptMark&quot;</span>:          markAccept,</span><br><span class="line"><span class="string">&quot;passMark&quot;</span>:            markPass,</span><br><span class="line"><span class="string">&quot;dropMark&quot;</span>:            markDrop,</span><br><span class="line"><span class="string">&quot;scratch0Mark&quot;</span>:        markScratch0,</span><br><span class="line"><span class="string">&quot;scratch1Mark&quot;</span>:        markScratch1,</span><br><span class="line"><span class="string">&quot;endpointMark&quot;</span>:        markEndpointMark,</span><br><span class="line"><span class="string">&quot;endpointMarkNonCali&quot;</span>: markEndpointNonCaliEndpoint,</span><br><span class="line">&#125;).Info(<span class="string">&quot;Calculated iptables mark bits&quot;</span>)</span><br></pre></td></tr></table></figure><p>Check the log (line breaks added below for readability):</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker logs d96d | grep <span class="string">&#x27;Calculated iptables mark bits&#x27;</span></span></span><br><span class="line">2025-10-15 08:54:06.100 [INFO][85] felix/driver.go 164: Calculated iptables mark bits </span><br><span class="line">acceptMark=0x10000 </span><br><span class="line">dropMark=0x40000 </span><br><span class="line">endpointMark=0xffe00000 </span><br><span class="line">endpointMarkNonCali=0x0 </span><br><span 
class="line">passMark=0x20000 </span><br><span class="line">scratch0Mark=0x80000 </span><br><span class="line">scratch1Mark=0x100000</span><br></pre></td></tr></table></figure><h3 id="host-ipset"><a href="#host-ipset" class="headerlink" title="host ipset"></a>host ipset</h3><p>Calico also keeps an ipset that stores the IPs of the network interfaces on the local host:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">Name: cali40this-host</span><br><span class="line">Type: hash:ip</span><br><span class="line">Revision: 4</span><br><span class="line">Header: family inet hashsize 1024 maxelem 1048576</span><br><span class="line">Size in memory: 456</span><br><span class="line">References: 0</span><br><span class="line">Number of entries: 7</span><br><span class="line">Members:</span><br><span class="line">127.0.0.1</span><br><span class="line">169.254.20.10</span><br><span class="line">10.187.220.0</span><br><span class="line">10.185.0.1</span><br><span class="line">10.xxx.xx.xxx # local host IP</span><br><span class="line">10.186.0.2</span><br></pre></td></tr></table></figure><p>The relevant code is at:</p><p><a href="https://github.com/projectcalico/calico/blob/v3.30.3/felix/daemon/daemon.go#L179">https://github.com/projectcalico/calico/blob/v3.30.3/felix/daemon/daemon.go#L179</a><br><a href="https://github.com/projectcalico/calico/blob/v3.30.3/felix/config/config_params.go#L1067">https://github.com/projectcalico/calico/blob/v3.30.3/felix/config/config_params.go#L1067</a></p><p>Check the related logs:</p><figure class="highlight shell"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker logs calico-node | grep int_dataplane.go</span></span><br><span class="line">2025-10-15 08:54:06.562 [INFO][85] felix/int_dataplane.go 2063: Started internal iptables dataplane driver loop</span><br><span class="line">2025-10-15 08:54:06.562 [INFO][85] felix/int_dataplane.go 2180: Will refresh IP sets on timer interval=1m30s</span><br><span class="line">2025-10-15 08:54:06.562 [INFO][85] felix/int_dataplane.go 2180: Will refresh routes on timer interval=1m30s</span><br><span class="line">2025-10-15 08:54:06.562 [INFO][85] felix/int_dataplane.go 2618: Started internal status report thread</span><br><span class="line">2025-10-15 08:54:06.562 [INFO][85] felix/int_dataplane.go 2620: Process status reports disabled</span><br><span class="line">2025-10-15 08:54:06.565 [INFO][85] felix/int_dataplane.go 1590: Linux interface state changed. ifIndex=1 ifaceName=&quot;lo&quot; state=&quot;up&quot;</span><br><span class="line">2025-10-15 08:54:06.565 [INFO][85] felix/int_dataplane.go 2259: Received interface update msg=&amp;intdataplane.ifaceStateUpdate&#123;Name:&quot;lo&quot;, State:&quot;up&quot;, Index:1&#125;</span><br><span class="line">2025-10-15 08:54:06.565 [INFO][85] felix/int_dataplane.go 1634: Linux interface addrs changed. addrs=set.Set&#123;127.0.0.0,127.0.0.1,::1,fe80::ecee:eeff:feee:eeee&#125; ifaceName=&quot;lo&quot;</span><br><span class="line">2025-10-15 08:54:06.565 [INFO][85] felix/int_dataplane.go 1590: Linux interface state changed. 
ifIndex=2 ifaceName=&quot;ens192&quot; state=&quot;up&quot;</span><br><span class="line">2025-10-15 08:54:06.565 [INFO][85] felix/int_dataplane.go 2286: Received interface addresses update msg=&amp;intdataplane.ifaceAddrsUpdate&#123;Name:&quot;lo&quot;, Addrs:set.Typed[string]&#123;&quot;127.0.0.0&quot;:set.v&#123;&#125;, &quot;127.0.0.1&quot;:set.v&#123;&#125;, &quot;::1&quot;:set.v&#123;&#125;, &quot;fe80::ecee:eeff:feee:eeee&quot;:set.v&#123;&#125;&#125;&#125;</span><br></pre></td></tr></table></figure><p>Reading the source, the logic that obtains the interface IPs is in <a href="https://github.com/projectcalico/calico/blob/v3.30.3/felix/ifacemonitor/iface_monitor.go">felix&#x2F;ifacemonitor&#x2F;iface_monitor.go</a>. It mainly uses the Linux netlink interface to list the interfaces and to listen for change, add, and delete messages, and then calls OnUpdate:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/projectcalico/calico/blob/v3.30.3/felix/dataplane/linux/hostip_mgr.go#L81-L103</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(m *hostIPManager)</span></span> OnUpdate(msg <span class="keyword">interface</span>&#123;&#125;) &#123;</span><br><span class="line"><span 
class="keyword">switch</span> msg := msg.(<span class="keyword">type</span>) &#123;</span><br><span class="line"><span class="keyword">case</span> *ifaceAddrsUpdate:</span><br><span class="line">log.WithField(<span class="string">&quot;update&quot;</span>, msg).Info(<span class="string">&quot;Interface addrs changed.&quot;</span>)</span><br><span class="line"><span class="keyword">if</span> m.nonHostIfacesRegexp.MatchString(msg.Name) &#123;</span><br><span class="line">log.WithField(<span class="string">&quot;update&quot;</span>, msg).Debug(<span class="string">&quot;Not a real host interface, ignoring.&quot;</span>)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> msg.Addrs != <span class="literal">nil</span> &#123;</span><br><span class="line">m.hostIfaceToAddrs[msg.Name] = msg.Addrs</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line"><span class="built_in">delete</span>(m.hostIfaceToAddrs, msg.Name)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Host ip update is a relative rare event. 
Flush entire ipsets to make it simple.</span></span><br><span class="line">metadata := ipsets.IPSetMetadata&#123;</span><br><span class="line">Type:    ipsets.IPSetTypeHashIP,</span><br><span class="line">SetID:   m.hostIPSetID,</span><br><span class="line">MaxSize: m.maxSize,</span><br><span class="line">&#125;</span><br><span class="line">m.ipsetsDataplane.AddOrReplaceIPSet(metadata, m.getCurrentMembers())</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="设计"><a href="#设计" class="headerlink" title="设计"></a>Design</h2><ul><li>One ipset, <code>whiteiplist</code>, stores the whitelist; a match is ACCEPTed</li><li>One ipset, <code>whiteportlist</code>, stores the ports; a packet that still matches at this point did not come from the whitelist, so it gets mark 2</li><li>A packet that matches mark 2 is DROPped</li></ul><p>If something goes wrong, a rule that sets mark 1 can be added somewhere as a hotfix.</p><h3 id="mangle-表"><a href="#mangle-表" class="headerlink" title="mangle 表"></a>The mangle table</h3><p>Since we only use flannel and do not need to cover that many cases, we skip the mark handling and simply hang a chain off PREROUTING in the mangle table, the way Calico does:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Create the chain first</span></span><br><span class="line">iptables --wait --table mangle --new test-PREROUTING</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Insert it at the front of the PREROUTING chain of the mangle table</span></span><br><span class="line">iptables --wait --table mangle --insert PREROUTING --jump 
test-PREROUTING</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Allow established connections</span></span><br><span class="line">iptables --wait --table mangle --insert test-PREROUTING -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Create a separate chain</span></span><br><span class="line">iptables --wait --table mangle --new test-from-host-endpoint</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Matching happens in test-from-host-endpoint; if something goes wrong, INSERT a rule ahead of it</span></span><br><span class="line">iptables --wait --table mangle --append test-PREROUTING -j test-from-host-endpoint</span><br></pre></td></tr></table></figure><h3 id="test-from-host-endpoint-链"><a href="#test-from-host-endpoint-链" class="headerlink" title="test-from-host-endpoint 链"></a>test-from-host-endpoint chain</h3><p>Splitting this into two chains means that later we can insert the local NIC IPs ahead of <code>test-from-host-endpoint</code>, or add an ipset similar to <code>failsafe-in</code>.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Accept whitelisted IPs directly</span></span><br><span class="line">iptables --wait --table mangle --append test-from-host-endpoint -m set --match-set whiteiplist src -j ACCEPT</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Drop non-whitelisted IPs that access a whitelisted port</span></span><br><span class="line">iptables --wait --table mangle --append 
test-from-host-endpoint -m set --match-set whiteportlist dst -j DROP</span><br></pre></td></tr></table></figure><p>In the INPUT chain of the filter table we also added logic similar to this-host:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">-A INPUT -j BASE-RULE</span><br><span class="line"></span><br><span class="line">-A BASE-RULE -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT</span><br><span class="line">-A BASE-RULE -m set --match-set whiteiplist src -j ACCEPT</span><br><span class="line">-A BASE-RULE -m set --match-set this-host src -j ACCEPT</span><br><span class="line">-A BASE-RULE -m set --match-set whiteportlist dst -j DROP</span><br></pre></td></tr></table></figure><h2 id="成品"><a href="#成品" class="headerlink" title="成品"></a>Final result</h2><p>It supports dual stack and picks up the NIC IPs; IPv6 is switched on or off based on ipv6list and the value of <code>cat /proc/sys/net/ipv6/conf/all/disable_ipv6</code>.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line">Name: this-host</span><br><span class="line">Type: hash:net</span><br><span class="line">Revision: 6</span><br><span class="line">Header: family inet hashsize 1024 maxelem 1000000</span><br><span class="line">Size in memory: 568</span><br><span class="line">References: 1</span><br><span class="line">Number of entries: 3</span><br><span class="line">Members:</span><br><span class="line">127.0.0.0/24</span><br><span class="line">10.xx.94.189</span><br><span class="line">169.254.0.0/16</span><br><span class="line"></span><br><span class="line">Name: whiteipv6list</span><br><span class="line">Type: hash:net</span><br><span class="line">Revision: 6</span><br><span class="line">Header: family inet6 hashsize 1024 maxelem 1000000</span><br><span class="line">Size in memory: 1608</span><br><span class="line">References: 2</span><br><span class="line">Number of entries: 4</span><br><span class="line">Members:</span><br><span class="line">2408:8656:22df:ff01::14:1620</span><br><span class="line">2408:8656:22df:ff01::14:1621</span><br><span class="line">::1</span><br><span class="line">2408:8656:22df:ff01::14:1622</span><br><span class="line"></span><br><span class="line">Name: this-host6</span><br><span class="line">Type: hash:net</span><br><span class="line">Revision: 6</span><br><span 
class="line">Header: family inet6 hashsize 1024 maxelem 1000000</span><br><span class="line">Size in memory: 1496</span><br><span class="line">References: 1</span><br><span class="line">Number of entries: 3</span><br><span class="line">Members:</span><br><span class="line">ee80:169:254:20::/64</span><br><span class="line">2408:8656:22df:ff01::14:1620</span><br><span class="line">::1</span><br><span class="line"></span><br><span class="line">Name: whiteportlist</span><br><span class="line">Type: bitmap:port</span><br><span class="line">Revision: 3</span><br><span class="line">Header: range 0-65535</span><br><span class="line">Size in memory: 8296</span><br><span class="line">References: 4</span><br><span class="line">Number of entries: 230</span><br><span class="line">Members:</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>一些仅供他人参考的 shell：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line">function get_if_inet()&#123;</span><br><span class="line">  local if=$1</span><br><span class="line">  ip -4 -o a s $if | awk &#x27;&#123;print $4&#125;&#x27;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">function this_host()&#123;</span><br><span class="line">  ipset create this-host hash:net maxelem 1000000 -exist</span><br><span class="line">  ipset add this-host 127.0.0.1/24 -exist</span><br><span class="line">  ipset add this-host 169.254.0.0/16 -exist</span><br><span class="line"></span><br><span class="line">  if [ -d /sys/devices/virtual/net/cni0/ ];then</span><br><span class="line">    ipset add this-host $(get_if_inet cni0| sed -E &#x27;s#/[0-9]+#/16#&#x27;) -exist</span><br><span class="line">  fi</span><br><span class="line"></span><br><span class="line">  ip -o -4 a s scope global | grep -Ev &#x27;:\s+(cali|tunl|vxlan|flannel|docker0|veth|wireguard|wg|cni0|kube|dummy|veth)&#x27; | awk -F&#x27;[ /]+&#x27; &#x27;&#123;print $4&#125;&#x27;| \</span><br><span class="line">  while read ip;do</span><br><span class="line">    ipset add this-host $ip -exist</span><br><span class="line">  done</span><br><span class="line"></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">function get_if_inet6()&#123;</span><br><span class="line">  local if=$1 ignore=$2 inet6</span><br><span class="line">  inet6=$(ip -6 -o a s $if | awk &#x27;&#123;print $4&#125;&#x27;)</span><br><span class="line">  if [ -n &quot;$ignore&quot; ];then</span><br><span class="line">    inet6=$(echo $inet6 | grep -Ev &quot;$2&quot;)</span><br><span class="line">  fi</span><br><span class="line">  echo $inet6</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">function this_host6()&#123;</span><br><span class="line">  ipset create this-host6 hash:net maxelem 1000000 family inet6 -exist</span><br><span class="line">  ipset add this-host6 ::1 -exist</span><br><span class="line"></span><br><span class="line">  if [ -d /sys/devices/virtual/net/cni0/ ];then</span><br><span class="line">    cni0_inet6=$(get_if_inet6 cni0| sed -E &#x27;s#/[0-9]+#/56#&#x27;)</span><br><span class="line">    if [ -n &quot;$cni0_inet6&quot; ];then</span><br><span class="line">      ipset add this-host6 $cni0_inet6 -exist</span><br><span class="line">    fi</span><br><span class="line">  fi</span><br><span class="line"></span><br><span class="line">  ip -o -6 a s scope global | grep -Ev &#x27;:\s+(cali|tunl|vxlan|flannel|docker0|veth|wireguard|wg|cni0|kube|dummy|veth)&#x27; |\</span><br><span class="line">   grep -Ev &#x27;^fe80::.+/64&#x27; | awk -F&#x27;[ /]+&#x27; &#x27;&#123;print $4&#125;&#x27;| \</span><br><span class="line">  while read ip;do</span><br><span class="line">    ipset add this-host6 $ip -exist</span><br><span class="line">  done</span><br><span class="line"></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;p&gt;A look at how calico implements a nodePort whitelist.&lt;/p&gt;</summary>
    
    
    
    
    <category term="kubernetes" scheme="http://zhangguanzhang.github.io/tags/kubernetes/"/>
    
    <category term="nodeport" scheme="http://zhangguanzhang.github.io/tags/nodeport/"/>
    
  </entry>
  
  <entry>
    <title>The hostname pitfall under hostNetwork</title>
    <link href="http://zhangguanzhang.github.io/2025/10/13/k8s-hostNetwork-hostname/"/>
    <id>http://zhangguanzhang.github.io/2025/10/13/k8s-hostNetwork-hostname/</id>
    <published>2025-10-13T10:10:30.000Z</published>
    <updated>2025-10-13T10:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>最近升级和压测遇到的 hostNetwork 下 hostname 的坑问题</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>由来</h2><p>性能压测团队压测后发现 kafka 的数据目录下有俩个很大的目录给机器目录占满：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">total 88</span><br><span class="line">drwxr-xr-x 568 nfsnobody nfsnobody 28672 Oct  1 17:42 kafka-logs-kafka-1</span><br><span class="line">drwxr-xr-x 535 nfsnobody nfsnobody 28672 Sep 26 19:51 kafka-logs-vm10-7-131-94</span><br></pre></td></tr></table></figure><h2 id="大小问题"><a href="#大小问题" class="headerlink" title="大小问题"></a>大小问题</h2><p>大小问题是因为 kafka 默认配置 <code>log.retention.bytes = -1</code>：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"> docker logs 2d21e7 |&amp; grep -P <span class="string">&#x27;^\s+log\.retention&#x27;</span></span></span><br><span class="line">log.retention.bytes = -1</span><br><span class="line">log.retention.check.interval.ms = 150000</span><br><span class="line">log.retention.hours = 168</span><br><span class="line">log.retention.minutes = null</span><br><span class="line">log.retention.ms = null</span><br></pre></td></tr></table></figure><p>查看了下 kafka 的启动脚本，支持 env 配置，配置下 <code>KAFKA_LOG_RETENTION_BYTES: &quot;1073741824&quot;</code> 后测试没问题：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">终端1写入topic</span></span><br><span class="line">for i in &#123;1..1000&#125;; do head -c 102400000 /dev/urandom | base64 |kafka-console-producer.sh --bootstrap-server 127.0.0.1:9092  --topic test; done</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">终端2 观察</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">du</span> -shx *; <span class="built_in">ls</span> -lh test-0</span></span><br><span 
class="line">0cleaner-offset-checkpoint</span><br><span class="line">4.0Klog-start-offset-checkpoint</span><br><span class="line">4.0Kmeta.properties</span><br><span class="line">4.0Krecovery-point-offset-checkpoint</span><br><span class="line">4.0Kreplication-offset-checkpoint</span><br><span class="line">3.5Gtest-0</span><br><span class="line">3.5Gtest-1</span><br><span class="line">总用量 3.5G</span><br><span class="line">-rwxr-xr-x 1 65534 65534 514K 10月 13 17:58 00000000000012487638.index.deleted</span><br><span class="line">-rwxr-xr-x 1 65534 65534 1.0G 10月 13 17:58 00000000000012487638.log.deleted</span><br><span class="line">-rwxr-xr-x 1 65534 65534 355K 10月 13 17:58 00000000000012487638.timeindex.deleted</span><br><span class="line">-rwxr-xr-x 1 65534 65534 517K 10月 13 19:02 00000000000024975185.index</span><br><span class="line">-rwxr-xr-x 1 65534 65534 1.0G 10月 13 19:02 00000000000024975185.log</span><br><span class="line">-rwxr-xr-x 1 65534 65534 4.0K 10月 13 17:58 00000000000024975185.snapshot</span><br><span class="line">-rwxr-xr-x 1 65534 65534 372K 10月 13 19:02 00000000000024975185.timeindex</span><br><span class="line">-rwxr-xr-x 1 65534 65534 4.1K 10月 13 18:02 00000000000027453871.snapshot</span><br><span class="line">-rw-r--r-- 1 65534 65534  10M 10月 13 19:02 00000000000037462545.index</span><br><span class="line">-rw-r--r-- 1 65534 65534 430M 10月 13 19:02 00000000000037462545.log</span><br><span class="line">-rw-r--r-- 1 65534 65534 4.6K 10月 13 19:02 00000000000037462545.snapshot</span><br><span class="line">-rw-r--r-- 1 65534 65534  10M 10月 13 19:02 00000000000037462545.timeindex</span><br><span class="line">-rw-r--r-- 1 65534 65534   15 10月 13 19:03 leader-epoch-checkpoint</span><br><span class="line">-rwxr-xr-x 1 65534 65534   43 10月 13 17:33 partition.metadata</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">du</span> -shx *; <span class="built_in">ls</span> -lh 
test-0</span></span><br><span class="line">0cleaner-offset-checkpoint</span><br><span class="line">4.0Klog-start-offset-checkpoint</span><br><span class="line">4.0Kmeta.properties</span><br><span class="line">4.0Krecovery-point-offset-checkpoint</span><br><span class="line">4.0Kreplication-offset-checkpoint</span><br><span class="line">1.5Gtest-0</span><br><span class="line">1.5Gtest-1</span><br><span class="line">总用量 1.5G</span><br><span class="line">-rwxr-xr-x 1 65534 65534 517K 10月 13 19:02 00000000000024975185.index</span><br><span class="line">-rwxr-xr-x 1 65534 65534 1.0G 10月 13 19:02 00000000000024975185.log</span><br><span class="line">-rwxr-xr-x 1 65534 65534 4.0K 10月 13 17:58 00000000000024975185.snapshot</span><br><span class="line">-rwxr-xr-x 1 65534 65534 372K 10月 13 19:02 00000000000024975185.timeindex</span><br><span class="line">-rwxr-xr-x 1 65534 65534 4.1K 10月 13 18:02 00000000000027453871.snapshot</span><br><span class="line">-rw-r--r-- 1 65534 65534  10M 10月 13 19:02 00000000000037462545.index</span><br><span class="line">-rw-r--r-- 1 65534 65534 430M 10月 13 19:02 00000000000037462545.log</span><br><span class="line">-rw-r--r-- 1 65534 65534 4.6K 10月 13 19:02 00000000000037462545.snapshot</span><br><span class="line">-rw-r--r-- 1 65534 65534  10M 10月 13 19:02 00000000000037462545.timeindex</span><br><span class="line">-rw-r--r-- 1 65534 65534   15 10月 13 19:03 leader-epoch-checkpoint</span><br><span class="line">-rwxr-xr-x 1 65534 65534   43 10月 13 17:33 partition.metadata</span><br></pre></td></tr></table></figure><h2 id="hostname-问题"><a href="#hostname-问题" class="headerlink" title="hostname 问题"></a>hostname 问题</h2><h3 id="两个带-hostname-目录的问题"><a href="#两个带-hostname-目录的问题" class="headerlink" title="两个带 hostname 目录的问题"></a>两个带 hostname 目录的问题</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">https://github.com/wurstmeister/kafka-docker/blob/master/start-kafka.sh#L43C1-L45C3</span></span><br><span class="line">if [[ -z &quot;$KAFKA_LOG_DIRS&quot; ]]; then</span><br><span class="line">    export KAFKA_LOG_DIRS=&quot;/kafka/kafka-logs-$HOSTNAME&quot;</span><br><span class="line">fi</span><br></pre></td></tr></table></figure><p>从启动脚本看到上面逻辑，没有设置就拼接 hostname，指定 <code>KAFKA_LOG_DIRS</code> 成固定，再修改启动脚本支持升级后把老目录 mv 成不带 hostname 的唯一目录：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">if [ -n &quot;$KAFKA_LOG_DIRS&quot; ] &amp;&amp; ! [ -d &quot;$KAFKA_LOG_DIRS&quot; ];then</span><br><span class="line">latest_log_dir=$(ls -1 -t -d /kafka/kafka-logs-* | head -n1)</span><br><span class="line">if [ -n &quot;$latest_log_dir&quot; ];then</span><br><span class="line">echo &quot;found old kafka-logs-: $&#123;latest_log_dir&#125;&quot;</span><br><span class="line">mv -v $latest_log_dir $KAFKA_LOG_DIRS</span><br><span class="line">fi</span><br><span class="line">fi</span><br></pre></td></tr></table></figure><h3 id="hostname-问题-1"><a href="#hostname-问题-1" class="headerlink" title="hostname 问题"></a>hostname 问题</h3><p>两个目录问题还是要查找原因的，出现的场景是在我们 kafka 之前是 staticPod + hostPath 部署的，后面切成 docker-compose 部署后发生的，相关配置为：</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="comment"># staticPod</span></span><br><span class="line">  <span class="attr">hostNetwork:</span> <span class="literal">true</span></span><br><span class="line">  <span class="attr">hostname:</span> <span class="string">kafka-&#123;&#123;</span> <span class="string">MY_ID</span> <span class="string">&#125;&#125;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># docker-compose</span></span><br><span class="line">    <span class="attr">network_mode:</span> <span class="string">host</span></span><br><span class="line">    <span class="attr">hostname:</span> <span class="string">kafka-&#123;&#123;</span> <span class="string">kafka.MY_ID</span> <span class="string">&#125;&#125;</span></span><br></pre></td></tr></table></figure><p>而仔细看 mtime：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">total 88</span><br><span class="line">drwxr-xr-x 568 nfsnobody nfsnobody 28672 Oct  1 17:42 kafka-logs-kafka-1</span><br><span class="line">drwxr-xr-x 535 nfsnobody nfsnobody 28672 Sep 26 19:51 kafka-logs-vm10-7-131-94</span><br></pre></td></tr></table></figure><p><code>-kafka-1</code> 是最新的，也就是说 docker-compose 没问题，而 k8s 的容器获取到的是宿主机的 hostname，验证了下确实：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run -ti --entrypoint hostname --net host --hostname test111 m.daocloud.io/docker.io/library/nginx:alpine</span> </span><br><span class="line">test111</span><br></pre></td></tr></table></figure><p>测试 <code>hostNetwork</code> 和 <code>hostname</code> 一起下，容器的 hostname 是宿主机的：</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">---</span></span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Pod</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">namespace:</span> <span class="string">default</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">test-hostname</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">containers:</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">test</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">m.daocloud.io/docker.io/library/nginx:alpine</span></span><br><span class="line">    <span class="attr">command:</span> [<span class="string">&quot;/bin/sh&quot;</span>]</span><br><span class="line">    <span class="attr">tty:</span> <span class="literal">true</span></span><br><span class="line">  <span class="attr">hostname:</span> <span class="string">test111</span></span><br><span class="line">  <span class="attr">hostNetwork:</span> <span class="literal">true</span></span><br></pre></td></tr></table></figure><p>一开始想着是 cri-dockerd 的 bug，想着去再提交一个 pr ，于是先让群友 containerd 环境 K8S 测下，好拿信息去提交 pr。结果发现 containerd K8S 下也一样是宿主机的 hostname，搜索了下后发现：</p><p><a 
href="https://github.com/kubernetes/kubernetes/issues/67019">https://github.com/kubernetes/kubernetes/issues/67019</a></p><p>是 K8S 代码逻辑导致的，看了下 issue，看到有 pr 关联 <a href="https://github.com/kubernetes/kubernetes/pull/132558/files">KEP-4762: Allows setting any FQDN as the pod’s hostname</a> 是新版本里加了个特性 <code>features.HostnameOverride</code> 门控和字段 <code>HostnameOverride</code>，但是看了下描述，发现压根不是一回事：</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">//</span> <span class="string">HostnameOverride</span> <span class="string">specifies</span> <span class="string">an</span> <span class="string">explicit</span> <span class="string">override</span> <span class="string">for</span> <span class="string">the</span> <span class="string">pod&#x27;s</span> <span class="string">hostname</span> <span class="string">as</span> <span class="string">perceived</span> <span class="string">by</span> <span class="string">the</span> <span class="string">pod.</span></span><br><span class="line"><span class="string">//</span> <span class="string">This</span> <span class="string">field</span> <span class="string">only</span> <span class="string">specifies</span> <span class="string">the</span> <span class="string">pod&#x27;s</span> <span class="string">hostname</span> <span class="string">and</span> <span class="string">does</span> <span class="string">not</span> <span class="string">affect</span> <span class="string">its</span> <span class="string">DNS</span> <span 
class="string">records.</span></span><br><span class="line"><span class="string">//</span> <span class="attr">When this field is set to a non-empty string:</span></span><br><span class="line"><span class="string">//</span> <span class="bullet">-</span> <span class="string">It</span> <span class="string">takes</span> <span class="string">precedence</span> <span class="string">over</span> <span class="string">the</span> <span class="string">values</span> <span class="string">set</span> <span class="string">in</span> <span class="string">`hostname`</span> <span class="string">and</span> <span class="string">`subdomain`.</span></span><br><span class="line"><span class="string">//</span> <span class="bullet">-</span> <span class="string">The</span> <span class="string">Pod&#x27;s</span> <span class="string">hostname</span> <span class="string">will</span> <span class="string">be</span> <span class="string">set</span> <span class="string">to</span> <span class="string">this</span> <span class="string">value.</span></span><br><span class="line"><span class="string">//</span> <span class="bullet">-</span> <span class="string">`setHostnameAsFQDN`</span> <span class="string">must</span> <span class="string">be</span> <span class="string">nil</span> <span class="string">or</span> <span class="string">set</span> <span class="string">to</span> <span class="string">false.</span></span><br><span class="line"><span class="string">//</span> <span class="bullet">-</span> <span class="string">`hostNetwork`</span> <span class="string">must</span> <span class="string">be</span> <span class="string">set</span> <span class="string">to</span> <span class="string">false.</span></span><br><span class="line"><span class="string">//</span></span><br><span class="line"><span class="string">//</span> <span class="string">This</span> <span class="string">field</span> <span class="string">must</span> <span class="string">be</span> <span class="string">a</span> <span class="string">valid</span> 
<span class="string">DNS</span> <span class="string">subdomain</span> <span class="string">as</span> <span class="string">defined</span> <span class="string">in</span> <span class="string">RFC</span> <span class="number">1123 </span><span class="string">and</span> <span class="string">contain</span> <span class="string">at</span> <span class="string">most</span> <span class="number">64</span> <span class="string">characters.</span></span><br><span class="line"><span class="string">//</span> <span class="string">Requires</span> <span class="string">the</span> <span class="string">HostnameOverride</span> <span class="string">feature</span> <span class="string">gate</span> <span class="string">to</span> <span class="string">be</span> <span class="string">enabled.</span></span><br><span class="line"><span class="string">//</span></span><br><span class="line"><span class="string">//</span> <span class="string">+featureGate=HostnameOverride</span></span><br><span class="line"><span class="string">//</span> <span class="string">+optional</span></span><br><span class="line"><span class="string">HostnameOverride</span> <span class="meta">*string</span></span><br></pre></td></tr></table></figure><p>Note the <code>hostNetwork must be set to false.</code> above: this feature in the new version does not solve the problem at all, so we simply have to live with the pitfall.</p><h2 id="结论"><a href="#结论" class="headerlink" title="结论"></a>Conclusion</h2><p>As of now, when a k8s Pod sets both <code>hostNetwork</code> and <code>hostname</code>, the hostname inside the container is the host's, whereas a container started directly by docker with the same settings gets the configured hostname. If your process depends on the hostname, watch out for this.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;A hostname pitfall under hostNetwork that we ran into recently during an upgrade and load testing.&lt;/p&gt;</summary>
    
    
    
    
    <category term="kubernetes" scheme="http://zhangguanzhang.github.io/tags/kubernetes/"/>
    
    <category term="hostname" scheme="http://zhangguanzhang.github.io/tags/hostname/"/>
    
  </entry>
  
  <entry>
    <title>The flannel.1 IP cannot be pinged on some nodes</title>
    <link href="http://zhangguanzhang.github.io/2025/09/23/flannel.1-cannot-ping/"/>
    <id>http://zhangguanzhang.github.io/2025/09/23/flannel.1-cannot-ping/</id>
    <published>2025-09-23T10:10:30.000Z</published>
    <updated>2025-09-23T10:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>Troubleshooting why the flannel.1 IP could not be pinged on a few nodes in a customer environment.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>The customer reported that the monitoring agent in their environment raised an alert: the local IP <code>10.187.12.0</code> could not be pinged.</p><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Troubleshooting</h2><h3 id="定位范围"><a href="#定位范围" class="headerlink" title="定位范围"></a>Narrowing the scope</h3><p>We had no remote access to the customer environment, so every check was a command sent over for them to run. The flannel containers had not restarted:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a | grep flanneld</span></span><br><span class="line">cb0e35b00899   1e0b2bff6efb               &quot;/opt/bin/flanneld -…&quot;   7 weeks ago   Up 7 weeks                                k8s_kube-flannel_kube-flannel-ds-qbps5_kube-system_bffc6d17-0835-468d-bb90-2367c190c94f_0</span><br></pre></td></tr></table></figure><p>We asked the customer to ping from the alerting machines. They said the cni0 address was reachable and only <code>10.187.12.0</code> and <code>10.187.11.0</code> were not. After some back and forth we realized that each of these two IPs could not be pinged from its own host, and ping failed immediately with an error:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping 10.187.12.0</span></span><br><span class="line">Do you want to ping broadcast? Then -b. 
If not, check your local firewall rules.</span><br></pre></td></tr></table></figure><p>The IP information of <code>flannel.1</code> and <code>cni0</code> also looked fine:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip a s flannel.1</span></span><br><span class="line">9: flannel.1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1450 qdisc noqueue state UNKNOWN group default </span><br><span class="line">    link/ether 4e:81:e2:84:ff:49 brd ff:ff:ff:ff:ff:ff</span><br><span class="line">    inet 10.187.12.0/32 scope global flannel.1</span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line">    inet6 fe80::4c81:e2ff:fe84:ff49/64 scope link </span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip a s cni0</span></span><br><span class="line">7: cni0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1450 qdisc noqueue state UP group default qlen 1000</span><br><span class="line">    link/ether 02:e6:56:fb:18:3d brd ff:ff:ff:ff:ff:ff</span><br><span class="line">    inet 10.187.12.1/24 brd 10.187.12.255 scope global cni0</span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line">    inet6 fe80::e6:56ff:fefb:183d/64 scope link </span><br><span class="line">       valid_lft forever preferred_lft 
forever</span><br></pre></td></tr></table></figure><p>The kernel parameter was normal too:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts</span></span><br><span class="line">1</span><br></pre></td></tr></table></figure><p>Time to look at the source code. First, find the package that provides ping:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">rpm -qf `<span class="built_in">which</span> ping`</span></span><br><span class="line">iputils-20190709-5.ky10.aarch64</span><br></pre></td></tr></table></figure><p>The iputils source has no special handling here:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/iputils/iputils/blob/master/ping/ping.c#L885-L895</span></span><br><span class="line"></span><br><span class="line">sock_setmark(rts, probe_fd);</span><br><span class="line"></span><br><span class="line">dst.sin_port = htons(<span class="number">1025</span>);</span><br><span class="line"><span class="keyword">if</span> (rts-&gt;nroute)</span><br><span class="line">dst.sin_addr.s_addr = rts-&gt;route[<span class="number">0</span>];</span><br><span class="line"><span class="keyword">if</span> 
(connect(probe_fd, (<span class="keyword">struct</span> sockaddr *)&amp;dst, <span class="keyword">sizeof</span>(dst)) == <span class="number">-1</span>) &#123;</span><br><span class="line"><span class="keyword">if</span> (errno == EACCES) &#123;</span><br><span class="line"><span class="keyword">if</span> (rts-&gt;broadcast_pings == <span class="number">0</span>)</span><br><span class="line">error(<span class="number">2</span>, <span class="number">0</span>,</span><br><span class="line">_(<span class="string">&quot;Do you want to ping broadcast? Then -b. If not, check your local firewall rules&quot;</span>));</span><br><span class="line"><span class="built_in">fprintf</span>(<span class="built_in">stderr</span>, _(<span class="string">&quot;WARNING: pinging broadcast address\n&quot;</span>));</span><br></pre></td></tr></table></figure><p>ping leaves this entirely to the kernel: the error is raised only when <code>connect()</code> fails with <code>EACCES</code>, that is, when the kernel resolves the destination as a broadcast address. Check the routes:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route show</span></span><br><span class="line">default via xxx dev enp4s0</span><br><span class="line">....</span><br><span class="line">10.187.12.0/24 dev cni0 proto kernel scope link src 10.187.12.1</span><br></pre></td></tr></table></figure><p>The route lookup, however, is where things go wrong:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route get 10.187.12.0</span></span><br><span class="line">broadcast 10.187.12.0 dev cni0 src 10.187.12.1 uid 58248</span><br><span class="line">    cache &lt;local,brd&gt;</span><br></pre></td></tr></table></figure><p>So the problem is in routing. <code>ip route show</code> is really 
<code>ip route show table main</code>, so look at the local routing table, which is generated from the interface addresses:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route show table <span class="built_in">local</span> | grep 10.187.12.0</span></span><br><span class="line">broadcast 10.187.12.0 dev cni0 proto kernel scope link src 10.187.12.1 </span><br><span class="line">local 10.187.12.0 dev flannel.1 proto kernel scope host src 10.187.12.0</span><br></pre></td></tr></table></figure><p>As suspected, ordering is the cause: entries in the local routing table are generated in interface order. A careful reader will have noticed above that the index in front of <code>flannel.1</code> is 9 while <code>cni0</code> is 7, which means <code>cni0</code> was created before <code>flannel.1</code>, or the <code>flannel.1</code> interface was deleted and then recreated when the flanneld container restarted.</p><h3 id="复现"><a href="#复现" class="headerlink" title="复现"></a>Reproduction</h3><p>I reproduced it on an internal k8s environment:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip a s flannel.1</span></span><br><span class="line">6: flannel.1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1450 qdisc noqueue state UNKNOWN group default </span><br><span class="line">    
link/ether 4a:53:ae:34:23:99 brd ff:ff:ff:ff:ff:ff</span><br><span class="line">    inet 10.187.2.0/32 scope global flannel.1</span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line">    inet6 fe80::4853:aeff:fe34:2399/64 scope link </span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping 10.187.2.0</span></span><br><span class="line">PING 10.187.2.0 (10.187.2.0) 56(84) bytes of data.</span><br><span class="line">64 bytes from 10.187.2.0: icmp_seq=1 ttl=64 time=0.042 ms</span><br><span class="line">^C</span><br><span class="line">--- 10.187.2.0 ping statistics ---</span><br><span class="line">1 packets transmitted, 1 received, 0% packet loss, time 0ms</span><br><span class="line">rtt min/avg/max/mdev = 0.042/0.042/0.042/0.000 ms</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip <span class="built_in">link</span> delete flannel.1</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a | grep flanneld</span></span><br><span class="line">46079b620076   reg.xxx.lan:5000/xxx/flannel                                                     &quot;/opt/bin/flanneld -…&quot;   8 days ago       Up 8 days </span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker restart 460</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping 10.187.2.0</span></span><br><span class="line">Do you want to ping broadcast? Then -b. 
If not, check your local firewall rules.</span><br></pre></td></tr></table></figure><h3 id="解决"><a href="#解决" class="headerlink" title="解决"></a>Fix</h3><p>Adding a /32 route does not help:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route add 10.187.2.0/32 dev flannel.1</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping 10.187.2.0</span></span><br><span class="line">Do you want to ping broadcast? Then -b. If not, check your local firewall rules.</span><br></pre></td></tr></table></figure><p>That is because the local table is matched first:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route show table <span class="built_in">local</span> | grep 10.187.2.0</span></span><br><span class="line">broadcast 10.187.2.0 dev cni0 proto kernel scope link src 10.187.2.1 </span><br><span class="line">local 10.187.2.0 dev flannel.1 proto kernel scope host src 10.187.2.0 </span><br></pre></td></tr></table></figure><p>After deleting the broadcast route, ping works:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span 
class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route delete broadcast 10.187.2.0 dev cni0 proto kernel scope <span class="built_in">link</span> src 10.187.2.1</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route show</span></span><br><span class="line">...</span><br><span class="line">10.185.0.0/16 dev docker0 proto kernel scope link src 10.185.0.1 </span><br><span class="line">10.187.0.0/24 via 10.187.0.0 dev flannel.1 onlink </span><br><span class="line">10.187.1.0/24 via 10.187.1.0 dev flannel.1 onlink </span><br><span class="line">10.187.2.0/24 dev cni0 proto kernel scope link src 10.187.2.1 </span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ip route show table <span class="built_in">local</span>  | grep 10.187.2.0</span></span><br><span class="line">local 10.187.2.0 dev flannel.1 proto kernel scope host src 10.187.2.0</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping 10.187.2.0</span></span><br><span class="line">PING 10.187.2.0 (10.187.2.0) 56(84) bytes of data.</span><br><span class="line">64 bytes from 10.187.2.0: icmp_seq=1 ttl=64 time=0.074 ms</span><br><span class="line">^C</span><br><span class="line">--- 10.187.2.0 ping statistics ---</span><br><span class="line">1 packets transmitted, 1 received, 0% packet loss, time 0ms</span><br><span class="line">rtt min/avg/max/mdev = 0.074/0.074/0.074/0.000 ms</span><br></pre></td></tr></table></figure><p>Cross-node traffic is fine as well:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping 10.187.0.1</span></span><br><span class="line">PING 10.187.0.1 (10.187.0.1) 56(84) bytes of data.</span><br><span class="line">64 bytes from 10.187.0.1: icmp_seq=1 ttl=64 time=0.365 ms</span><br><span class="line">64 bytes from 10.187.0.1: icmp_seq=2 ttl=64 time=0.325 ms</span><br><span class="line">^C</span><br><span class="line">--- 10.187.0.1 ping statistics ---</span><br><span class="line">2 packets transmitted, 2 received, 0% packet loss, time 999ms</span><br><span class="line">rtt min/avg/max/mdev = 0.325/0.345/0.365/0.020 ms</span><br></pre></td></tr></table></figure><h2 id="结论"><a href="#结论" class="headerlink" title="结论"></a>Conclusion</h2><p>This detail does not affect the k8s overlay network, but the customer's monitoring alert still needed a root cause.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;Troubleshooting why the flannel.1 IP could not be pinged on a few nodes in a customer environment&lt;/p&gt;</summary>
    
    
    
    
    <category term="linux" scheme="http://zhangguanzhang.github.io/tags/linux/"/>
    
    <category term="flannel" scheme="http://zhangguanzhang.github.io/tags/flannel/"/>
    
  </entry>
  
  <entry>
    <title>salt-run takes a long time to return</title>
    <link href="http://zhangguanzhang.github.io/2025/09/12/salt-run-hang/"/>
    <id>http://zhangguanzhang.github.io/2025/09/12/salt-run-hang/</id>
    <published>2025-09-12T20:10:30.000Z</published>
    <updated>2025-09-12T20:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>Troubleshooting why salt-run took a long time to return.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Every salt-run command took a long time:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">$ time salt-run jobs.active</span><br><span class="line">[INFO    ] Runner completed: 20250912111420062465_14113</span><br><span class="line"></span><br><span class="line">real    0m23.432s</span><br><span class="line">user    0m2.716s</span><br><span class="line">sys     0m0.320s</span><br></pre></td></tr></table></figure><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Troubleshooting</h2><p>salt-master runs in a container that has no ptrace permission, and I was afraid the fault would disappear if the container was restarted. So I ran the command inside the container and, while it hung, confirmed on the host that there was only one salt-run process. The host has strace, so I attached from there:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span 
class="language-bash">strace -p `ps -ef | grep -E <span class="string">&#x27;salt-ru[n]&#x27;</span> | awk <span class="string">&#x27;&#123;print $2&#125;&#x27;</span> `</span></span><br><span class="line">...</span><br><span class="line">...</span><br><span class="line"></span><br><span class="line">openat(AT_FDCWD, &quot;/etc/hosts&quot;, O_RDONLY|O_CLOEXEC) = 11</span><br><span class="line">fstat(11, &#123;st_mode=S_IFREG|0644, st_size=898, ...&#125;) = 0</span><br><span class="line">lseek(11, 0, SEEK_SET)                  = 0</span><br><span class="line">read(11, &quot;#\n# hosts         This file desc&quot;..., 4096) = 898</span><br><span class="line">read(11, &quot;&quot;, 4096)                      = 0</span><br><span class="line">close(11)                               = 0</span><br><span class="line">newfstatat(AT_FDCWD, &quot;/etc/nsswitch.conf&quot;, &#123;st_mode=S_IFREG|0644, st_size=1516, ...&#125;, 0) = 0</span><br><span class="line">newfstatat(AT_FDCWD, &quot;/etc/resolv.conf&quot;, &#123;st_mode=S_IFREG|0644, st_size=211, ...&#125;, 0) = 0</span><br><span class="line">openat(AT_FDCWD, &quot;/etc/hosts&quot;, O_RDONLY|O_CLOEXEC) = 11</span><br><span class="line">fstat(11, &#123;st_mode=S_IFREG|0644, st_size=898, ...&#125;) = 0</span><br><span class="line">lseek(11, 0, SEEK_SET)                  = 0</span><br><span class="line">read(11, &quot;#\n# hosts         This file desc&quot;..., 4096) = 898</span><br><span class="line">read(11, &quot;&quot;, 4096)                      = 0</span><br><span class="line">close(11)                               = 0</span><br><span class="line">socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 11</span><br><span class="line">setsockopt(11, SOL_IP, IP_RECVERR, [1], 4) = 0</span><br><span class="line">connect(11, &#123;sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr(&quot;10.xx.xx.1&quot;)&#125;, 16) = 0</span><br><span class="line">poll([&#123;fd=11, events=POLLOUT&#125;], 1, 0)   = 1 
([&#123;fd=11, revents=POLLOUT&#125;])</span><br><span class="line">sendto(11, &quot;e\366\1\0\0\1\0\0\0\0\0\0\7kubexxx\0\0\34\0\1&quot;, 25, MSG_NOSIGNAL, NULL, 0) = 25</span><br><span class="line">poll([&#123;fd=11, events=POLLIN&#125;], 1, 5000</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>Run the command in the left window and quickly attach strace in the right one. Once the output stalls, press Enter a few times to insert blank lines; when scrolling back afterwards, the lines just above those blank lines show where it hung. Here the last call before the stall is a <code>poll</code> with a 5000 ms timeout waiting for a reply. From the above, this is glibc's DNS resolution behavior:</p><ol><li>It first reads the <code>hosts</code> line in <code>/etc/nsswitch.conf</code> to determine the priority of hosts and dns.</li><li><code>connect(11, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr(&quot;10.xx.xx.1&quot;)}, 16) = 0</code> sets up the connection used for the DNS query.</li><li><code>sendto(11, &quot;e\366\1\0\0\1\0\0\0\0\0\0\7kubexxx\0\0\34\0\1&quot;, 25, MSG_NOSIGNAL, NULL, 0) = 25</code> is the DNS request itself.</li></ol><p>Running it again and checking connections in another window confirmed the DNS lookup:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ss -anuop | grep :53</span></span><br><span class="line">ESTAB      0      0      10.xx.xx.215:36250              10.xx.xx.1:53                  users:((&quot;salt-run&quot;,pid=13752,fd=4))</span><br></pre></td></tr></table></figure><p>Then I captured the packets to see the request:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">scroll right <span class="keyword">if</span> reading on a phone</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">tcpdump -nn -i any port 53 -vvv</span></span><br><span class="line">tcpdump: listening on any, link-type LINUX_SLL 
(Linux cooked), capture size 262144 bytes</span><br><span class="line">19:34:24.952563 IP (tos 0x0, ttl 64, id 50084, offset 0, flags [DF], proto UDP (17), length 53)</span><br><span class="line">    10.xx.xx.215.63760 &gt; 10.xx.xx.1.53: [bad udp cksum 0xc324 -&gt; 0xd5d2!] 64743+ AAAA? kubexxx. (25)</span><br><span class="line">19:34:26.613645 IP (tos 0x0, ttl 64, id 50437, offset 0, flags [DF], proto UDP (17), length 53)</span><br><span class="line">    10.xx.xx.215.35564 &gt; 10.xx.xx.1.53: [bad udp cksum 0xc324 -&gt; 0x3614!] 2763+ AAAA? kubexxx. (25)</span><br></pre></td></tr></table></figure><p>The <code>AAAA? kubexxx</code> above confirms that it is querying the IPv6 (AAAA) DNS record for the hostname kubexxx. Adding a hosts entry fixed it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"> <span class="built_in">echo</span> <span class="string">&quot;::1 <span class="variable">$HOSTNAME</span>&quot;</span> &gt;&gt; /etc/hosts</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="keyword">time</span> salt-run jobs.active</span></span><br><span class="line">[INFO    ] Runner completed: 20250912113512377461_16938</span><br><span class="line"></span><br><span class="line">real    0m3.265s</span><br><span class="line">user    0m2.714s</span><br><span class="line">sys     0m0.254s</span><br></pre></td></tr></table></figure><h2 id="解决"><a href="#解决" class="headerlink" title="解决"></a>Fix</h2><p>I searched for <code>salt-run dns ipv6</code> to see whether anyone else had hit this, and found similar issues:</p><ul><li><a href="https://github.com/saltstack/salt/issues/40912">https://github.com/saltstack/salt/issues/40912</a></li><li><a 
href="https://github.com/saltstack/salt/issues/32719#issuecomment-238114720">https://github.com/saltstack/salt/issues/32719#issuecomment-238114720</a></li></ul><p>The replies there say it mostly occurred on old versions, but the version here is quite recent:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">salt --versions-report</span></span><br><span class="line">Salt Version:</span><br><span class="line">          Salt: 3006.15</span><br><span class="line"> </span><br><span class="line">Python Version:</span><br><span 
class="line">        Python: 3.10.13 (main, Sep  8 2025, 06:32:11) [GCC 10.3.1]</span><br><span class="line"> </span><br><span class="line">Dependency Versions:</span><br><span class="line">          cffi: 1.14.5</span><br><span class="line">      cherrypy: Not Installed</span><br><span class="line">  cryptography: 44.0.1</span><br><span class="line">      dateutil: 2.8.2</span><br><span class="line">     docker-py: 6.1.3</span><br><span class="line">         gitdb: Not Installed</span><br><span class="line">     gitpython: Not Installed</span><br><span class="line">        Jinja2: 3.0.1</span><br><span class="line">       libgit2: Not Installed</span><br><span class="line">  looseversion: 1.3.0</span><br><span class="line">      M2Crypto: Not Installed</span><br><span class="line">          Mako: 1.2.2</span><br><span class="line">       msgpack: 1.0.5</span><br><span class="line">  msgpack-pure: Not Installed</span><br><span class="line">  mysql-python: Not Installed</span><br><span class="line">     packaging: 24.1</span><br><span class="line">     pycparser: 2.20</span><br><span class="line">      pycrypto: 3.19.1</span><br><span class="line">  pycryptodome: Not Installed</span><br><span class="line">        pygit2: Not Installed</span><br><span class="line">  python-gnupg: Not Installed</span><br><span class="line">        PyYAML: 6.0.1</span><br><span class="line">         PyZMQ: 27.0.2</span><br><span class="line">        relenv: Not Installed</span><br><span class="line">         smmap: Not Installed</span><br><span class="line">       timelib: Not Installed</span><br><span class="line">       Tornado: 4.5.3</span><br><span class="line">           ZMQ: 4.3.5</span><br><span class="line"> </span><br><span class="line">System Versions:</span><br><span class="line">          dist: openeuler 22.03 LTS-SP4</span><br><span class="line">        locale: utf-8</span><br><span class="line">       machine: x86_64</span><br><span class="line">       release: 
4.12.14-120-default</span><br><span class="line">        system: Linux</span><br><span class="line">       version: openEuler 22.03 LTS-SP4</span><br></pre></td></tr></table></figure><p>This is glibc behavior: salt resolves the hostname. With the code freeze approaching there was no time to change the startup script inside this docker image, so I fixed it at the code level first. I found nothing in Python's socket library that looks only at hosts entries, so I wrote the following logic by hand:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> fcntl</span><br><span class="line"><span class="keyword">import</span> socket</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">ensure_hosts_entry</span>(<span class="params">ip_address, hostname</span>):</span><br><span class="line">    <span class="keyword">with</span> <span class="built_in">open</span>(<span class="string">&quot;/etc/hosts&quot;</span>, <span class="string">&quot;r+&quot;</span>) <span class="keyword">as</span> f:</span><br><span class="line">        <span class="keyword">try</span>:</span><br><span class="line">            <span 
class="comment"># Take an exclusive lock (blocking) so concurrent writers are safe</span></span><br><span class="line">            fcntl.flock(f, fcntl.LOCK_EX)</span><br><span class="line">            original_content = f.readlines()</span><br><span class="line"></span><br><span class="line">            <span class="keyword">for</span> index, line <span class="keyword">in</span> <span class="built_in">enumerate</span>(original_content):</span><br><span class="line">                line_part = line.split()  <span class="comment"># split on any whitespace (tabs too) and drop the trailing newline</span></span><br><span class="line">                <span class="keyword">if</span> <span class="built_in">len</span>(line_part) &lt; <span class="number">2</span>:</span><br><span class="line">                    <span class="keyword">continue</span></span><br><span class="line">                <span class="keyword">if</span> hostname <span class="keyword">in</span> line_part:</span><br><span class="line">                    <span class="keyword">return</span></span><br><span class="line"></span><br><span class="line">            original_content.append(<span class="string">f&quot;<span class="subst">&#123;ip_address&#125;</span> <span class="subst">&#123;hostname&#125;</span>\n&quot;</span>)</span><br><span class="line">            f.seek(<span class="number">0</span>)</span><br><span class="line">            f.truncate()</span><br><span class="line">            f.write(<span class="string">&#x27;&#x27;</span>.join(original_content))</span><br><span class="line">            f.flush()</span><br><span class="line">        <span class="keyword">finally</span>:</span><br><span class="line">            fcntl.flock(f, fcntl.LOCK_UN)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">ensure_hosts_entry(<span class="string">&quot;::1&quot;</span>, socket.gethostname())</span><br><span class="line">...</span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;p&gt;Troubleshooting a salt-run call that took a long time to return&lt;/p&gt;</summary>
    
    
    
    
    <category term="salt" scheme="http://zhangguanzhang.github.io/tags/salt/"/>
    
    <category term="salt-run" scheme="http://zhangguanzhang.github.io/tags/salt-run/"/>
    
    <category term="ipv6" scheme="http://zhangguanzhang.github.io/tags/ipv6/"/>
    
  </entry>
  
  <entry>
    <title>flannel VXLAN cross-node connectivity broken by Kylin kernel 4.19.90-52.49</title>
    <link href="http://zhangguanzhang.github.io/2025/09/10/kylin-4.19.90-52.49-udp-drop/"/>
    <id>http://zhangguanzhang.github.io/2025/09/10/kylin-4.19.90-52.49-udp-drop/</id>
    <published>2025-09-10T18:10:30.000Z</published>
    <updated>2025-09-10T18:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>Troubleshooting a flannel cross-node connectivity failure caused by a Kylin kernel</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>A customer had a K8S environment A that needed to be scaled out. After the two machines they provided were added, a colleague found that flannel cross-node traffic did not work on the new nodes. Packet capture showed that <code>ip -s a s flannel.1</code> on the new nodes reported 0 received (Rx) packets.<br>The customer blamed k8s; we believed their network was not letting UDP through. Both sides agreed to deploy K8S on a clean environment B to check, and cross-node traffic still failed there.</p><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Troubleshooting</h2><h3 id="环境信息"><a href="#环境信息" class="headerlink" title="环境信息"></a>Environment</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span 
class="built_in">cat</span> /etc/os-release</span></span><br><span class="line">NAME=&quot;Kylin Linux Advanced Server&quot;</span><br><span class="line">VERSION=&quot;V10 (Lance)&quot;</span><br><span class="line">ID=&quot;kylin&quot;</span><br><span class="line">VERSION_ID=&quot;V10&quot;</span><br><span class="line">PRETTY_NAME=&quot;Kylin Linux Advanced Server V10 (Lance)&quot;</span><br><span class="line">ANSI_COLOR=&quot;0;31&quot;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">lscpu</span></span><br><span class="line">Architecture:                    x86_64</span><br><span class="line">CPU op-mode(s):                  32-bit, 64-bit</span><br><span class="line">Byte Order:                      Little Endian</span><br><span class="line">Address sizes:                   48 bits physical, 48 bits virtual</span><br><span class="line">CPU(s):                          14</span><br><span class="line">On-line CPU(s) list:             0-13</span><br><span class="line">Thread(s) per core:              1</span><br><span class="line">Core(s) per socket:              2</span><br><span class="line">Socket(s):                       7</span><br><span class="line">NUMA node(s):                    1</span><br><span class="line">Vendor ID:                       AuthenticAMD</span><br><span class="line">CPU family:                      15</span><br><span class="line">Model:                           6</span><br><span class="line">Model name:                      Hygon C86-3G 7390 32-core Processor</span><br><span class="line">Stepping:                        3</span><br><span class="line">CPU MHz:                         2699.998</span><br><span class="line">BogoMIPS:                        5399.99</span><br><span class="line">Hypervisor vendor:               KVM</span><br><span class="line">Virtualization type:             full</span><br><span class="line">L1d cache:   
                    896 KiB</span><br><span class="line">L1i cache:                       896 KiB</span><br><span class="line">L2 cache:                        7 MiB</span><br><span class="line">L3 cache:                        112 MiB</span><br><span class="line">NUMA node0 CPU(s):               0-13</span><br><span class="line">Vulnerability Itlb multihit:     Not affected</span><br><span class="line">Vulnerability Spec store bypass: Not affected</span><br><span class="line">Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization</span><br><span class="line">Vulnerability Spectre v2:        Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected</span><br><span class="line">Vulnerability Srbds:             Not affected</span><br><span class="line">Vulnerability Tsx async abort:   Not affected</span><br><span class="line">Flags:                           fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm rep_good </span><br><span class="line">                                 nopl cpuid extd_apicid tsc_known_freq pni cx16 x2apic aes hypervisor cmp_legacy 3dnowprefetch vmmcall</span><br></pre></td></tr></table></figure><h3 id="接手缘由"><a href="#接手缘由" class="headerlink" title="接手缘由"></a>接手缘由</h3><p>现场的同事卸载了 K8S 后测试UDP 还是一样，然后认为 udp 存在限制，客户用 nmap 扫描认为 UDP 没限制，我们用 nc 起 server 用 client 测不通，客户认为我们这种测试方式不准，我就写了个测试脚本给同事：</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># -*- coding: utf-8 -*-</span></span><br><span class="line"><span class="keyword">from</span> __future__ 
<span class="keyword">import</span> print_function</span><br><span class="line"><span class="keyword">import</span> socket</span><br><span class="line"><span class="keyword">import</span> sys</span><br><span class="line"><span class="keyword">import</span> logging</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> sys.version_info[<span class="number">0</span>] == <span class="number">2</span>:</span><br><span class="line">    <span class="built_in">input</span> = raw_input</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">usage</span>():</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">&quot;Usage:&quot;</span>)</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">&quot;  Server: python udp-test.py &lt;local-port&gt;&quot;</span>)</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">&quot;  Client: python udp-test.py &lt;host:port&gt;&quot;</span>)</span><br><span class="line">    sys.exit(<span class="number">1</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">parse_addr</span>(<span class="params">s</span>):</span><br><span class="line">    <span class="keyword">if</span> <span class="string">&quot;:&quot;</span> <span class="keyword">in</span> s:</span><br><span class="line">        host, port = s.rsplit(<span class="string">&quot;:&quot;</span>, <span class="number">1</span>)</span><br><span class="line">        <span class="keyword">return</span> host, <span class="built_in">int</span>(port)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">None</span>, <span class="built_in">int</span>(s)</span><br><span 
class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">server</span>(<span class="params">port</span>):</span><br><span class="line">    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)</span><br><span class="line">    sock.bind((<span class="string">&quot;0.0.0.0&quot;</span>, port))</span><br><span class="line">    logging.info(<span class="string">&quot;Server listening on UDP *:%d&quot;</span>, port)</span><br><span class="line">    <span class="keyword">while</span> <span class="literal">True</span>:</span><br><span class="line">        data, addr = sock.recvfrom(<span class="number">4096</span>)</span><br><span class="line">        msg = data.decode(<span class="string">&#x27;utf-8&#x27;</span>, <span class="string">&#x27;ignore&#x27;</span>)</span><br><span class="line">        logging.info(<span class="string">&quot;Received from %s: %r&quot;</span>, addr, msg)</span><br><span class="line">        sock.sendto(data, addr)          <span class="comment"># 回显</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">client</span>(<span class="params">host, port</span>):</span><br><span class="line">    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)</span><br><span class="line">    target = (host, port)</span><br><span class="line">    logging.info(<span class="string">&quot;Client target UDP %s:%d (type message and press Enter)&quot;</span>, host, port)</span><br><span class="line">    <span class="keyword">while</span> <span class="literal">True</span>:</span><br><span class="line">        <span class="keyword">try</span>:</span><br><span class="line">            text = <span class="built_in">input</span>(<span class="string">&quot;&gt;&gt;&gt; &quot;</span>)</span><br><span class="line">        <span class="keyword">except</span> (EOFError, KeyboardInterrupt):</span><br><span class="line">            <span 
class="built_in">print</span>(<span class="string">&quot;\nBye&quot;</span>)</span><br><span class="line">            <span class="keyword">break</span></span><br><span class="line">        <span class="keyword">if</span> <span class="keyword">not</span> text:</span><br><span class="line">            <span class="keyword">continue</span></span><br><span class="line">        sock.sendto(text.encode(<span class="string">&#x27;utf-8&#x27;</span>), target)</span><br><span class="line">        sock.settimeout(<span class="number">2</span>)</span><br><span class="line">        <span class="keyword">try</span>:</span><br><span class="line">            data, _ = sock.recvfrom(<span class="number">4096</span>)</span><br><span class="line">            logging.info(<span class="string">&quot;Server echoed: %r&quot;</span>, data.decode(<span class="string">&#x27;utf-8&#x27;</span>, <span class="string">&#x27;ignore&#x27;</span>))</span><br><span class="line">        <span class="keyword">except</span> socket.timeout:</span><br><span class="line">            logging.debug(<span class="string">&quot;No echo within 2s&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(sys.argv) != <span class="number">2</span>:</span><br><span class="line">        usage()</span><br><span class="line">    addr_str = sys.argv[<span class="number">1</span>]</span><br><span class="line">    host, port = parse_addr(addr_str)</span><br><span class="line"></span><br><span class="line">    LOG_FMT = <span class="string">&quot;%(asctime)s [%(levelname)s] %(message)s&quot;</span></span><br><span class="line">    logging.basicConfig(level=logging.INFO, <span class="built_in">format</span>=LOG_FMT)</span><br><span class="line">    <span class="keyword">if</span> host <span class="keyword">is</span> <span 
class="literal">None</span>:</span><br><span class="line">        <span class="comment"># bare numeric port -&gt; server</span></span><br><span class="line">        server(port)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="comment"># host:port -&gt; client</span></span><br><span class="line">        client(host, port)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&quot;__main__&quot;</span>:</span><br><span class="line">    main()</span><br></pre></td></tr></table></figure><p>The script simply sends whatever you type when you press Enter; the server sends the received message back to the client, and the client prints what it gets back. My colleague ran it in the customer environment and it did not work, yet the customer said their tcpdump capture showed the packets arriving. I checked how the customer captured, and the method was indeed sound:</p><ul><li>Start the capture on machine B first: <code>tcpdump -nn -i eth0 port 8472 -w xxx.pcap</code></li><li>Then on machine A run <code>echo &quot;123&quot; | nc -u &lt;machine_B_ip&gt; 8472</code></li></ul><p>I had assumed the customer was capturing on the sending machine. Once I understood that their approach was correct, I logged in remotely to take a look.</p><h3 id="抓包重现"><a href="#抓包重现" class="headerlink" title="抓包重现"></a>Reproducing with packet capture</h3><p>The target machine can indeed capture the packet:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">tcpdump  -nn -i any port 8472 -vvv</span></span><br><span class="line">dropped privs to tcpdump</span><br><span class="line">tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes</span><br><span class="line">14:43:39.267522 IP (tos 0x0, ttl 64, id 56033, offset 0, flags [DF], proto UDP (17), length 35)</span><br><span class="line">    10.xx.50.166.50492 &gt; 10.xx.50.169.8472: [udp sum ok] OTV,  [|OTV]</span><br><span class="line">^C</span><br><span 
class="line">1 packet captured</span><br><span class="line">3 packets received by filter</span><br><span class="line">0 packets dropped by kernel</span><br></pre></td></tr></table></figure><p>The capture above was taken while sending UDP with nc; the one below was taken while sending with the script:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">tcpdump -nn -i any port 8472 -vvv</span></span><br><span class="line">dropped privs to tcpdump</span><br><span class="line">tcpdump: listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes</span><br><span class="line">14:48:58.540224 IP (tos 0x0, ttl 64, id 48511, offset 0, flags [DF], proto UDP (17), length 31)</span><br><span class="line">    10.xx.50.166.46543 &gt; 10.xx.50.169.8472: [bad udp cksum 0x7a0d -&gt; 0x4acd!] 
OTV,  [|OTV]</span><br><span class="line">^C</span><br><span class="line">1 packet captured</span><br><span class="line">2 packets received by filter</span><br><span class="line">0 packets dropped by kernel</span><br></pre></td></tr></table></figure><h3 id="解决"><a href="#解决" class="headerlink" title="解决"></a>Resolution</h3><p>The capture comparison above shows <code>bad udp cksum</code>, so I tried turning off checksum offload on the NIC:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ethtool --show-offload eth0 | grep checksum</span></span><br><span class="line">rx-checksumming: on [fixed]</span><br><span class="line">tx-checksumming: on</span><br><span class="line">tx-checksum-ipv4: off [fixed]</span><br><span class="line">tx-checksum-ip-generic: on</span><br><span class="line">tx-checksum-ipv6: off [fixed]</span><br><span class="line">tx-checksum-fcoe-crc: off [fixed]</span><br><span class="line">tx-checksum-sctp: off [fixed]</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ethtool --offload eth0 tx-checksum-ip-generic off</span></span><br><span class="line">Actual changes:</span><br><span class="line">tx-checksumming: off</span><br><span class="line">tx-checksum-ip-generic: off</span><br><span class="line">tcp-segmentation-offload: off</span><br><span class="line">tx-tcp-segmentation: off [requested on]</span><br><span 
class="line">tx-tcp-ecn-segmentation: off [requested on]</span><br><span class="line">tx-tcp6-segmentation: off [requested on]</span><br></pre></td></tr></table></figure><p>Testing showed it still did not work, and turning off all of them did not help either:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ethtool --offload eth0 tx off rx off</span><br></pre></td></tr></table></figure><p>Compared with environment A, the offload settings were identical, and the driver was the same too:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">readlink</span> -f /sys/class/net/eth0/device/driver/module</span></span><br><span class="line">/sys/module/virtio_net</span><br></pre></td></tr></table></figure><p>iptables had no rules at all, so the only remaining difference was the kernel version:</p><ul><li>Working environment: <code>Linux 4.19.90-52.22.v2207.ky10.x86_64 #1 SMP Tue Mar 14 12:19:10 CST 2023 x86_64 x86_64 x86_64 GNU/Linux</code></li><li>Failing environment: <code>Linux 4.19.90-52.49.v2207.ky10.x86_64 #3 SMP Thu Jul 24 02:43:35 CST 2025 x86_64 x86_64 x86_64 GNU/Linux</code></li></ul><p>I asked the customer whether the machines in environment B could be replaced with VMs running the same kernel as the working environment. They replied that the machines are provisioned automatically by their platform, so the kernel cannot be kept consistent. With no other option, I checked whether this environment had had a kernel upgrade:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">grep kernel /var/log/dnf*</span></span><br><span class="line">...</span><br><span class="line">/var/log/dnf.rpm.log:2025-08-07T01:34:25Z SUBDEBUG Installed:kernel-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">/var/log/dnf.rpm.log:2025-08-07T01:34:26Z SUBDEBUG 
Upgrade:kernel-tools-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">/var/log/dnf.rpm.log:2025-08-07T01:34:29Z SUBDEBUG Upgraded:kernel-tools-4.19.90-52.45.v2207.ky10.x86_64</span><br><span class="line">/var/log/dnf.rpm.log:2025-08-07T01:34:30Z SUBDEBUG Upgraded:kernel-headers-4.19.90-52.45.v2207.ky10.x86_64</span><br><span class="line">/var/log/dnf.rpm.log:2025-08-07T01:34:34Z SUBDEBUG Upgraded:kernel-tools-libs-4.19.90-52.45.v2207.ky10.x86_64</span><br></pre></td></tr></table></figure><p>Sure enough, it had been upgraded. Check whether the old version is still installed:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">rpm -qa | grep kernel</span></span><br><span class="line">kernel-modules-extra-4.19.90-52.45.v2207.ky10.x86_64</span><br><span class="line">kernel-modules-extra-4.19.90-52.22.v2207.ky10.x86_64</span><br><span class="line">kernel-modules-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">kernel-core-4.19.90-52.22.v2207.ky10.x86_64</span><br><span class="line">kernel-tools-libs-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">kernel-modules-extra-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">kernel-4.19.90-52.22.v2207.ky10.x86_64</span><br><span class="line">kernel-tools-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">kernel-core-4.19.90-52.45.v2207.ky10.x86_64</span><br><span 
class="line">kernel-modules-4.19.90-52.45.v2207.ky10.x86_64</span><br><span class="line">kernel-4.19.90-52.45.v2207.ky10.x86_64</span><br><span class="line">kernel-modules-4.19.90-52.22.v2207.ky10.x86_64</span><br><span class="line">kernel-headers-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">kernel-4.19.90-52.49.v2207.ky10.x86_64</span><br><span class="line">kernel-4.19.90-52.49.v2207.ky10.x86_64</span><br></pre></td></tr></table></figure><p>Check the entry order in grub:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">awk -F\&#x27; <span class="string">&#x27;$1==&quot;menuentry &quot; &#123;print i++ &quot; : &quot; $2&#125;&#x27;</span> /etc/grub2.cfg</span></span><br><span class="line">0 : Kylin Linux Advanced Server (4.19.90-52.49.v2207.ky10.x86_64) V10 (Lance)</span><br><span class="line">1 : Kylin Linux Advanced Server (4.19.90-52.45.v2207.ky10.x86_64) V10 (Lance)</span><br><span class="line">2 : Kylin Linux Advanced Server (4.19.90-52.22.v2207.ky10.x86_64) V10 (Lance)</span><br><span class="line">3 : Kylin Linux Advanced Server (0-rescue-de06076a688a45bf9d1acd0bf45bb93e) V10 (Lance)</span><br></pre></td></tr></table></figure><p>Switch to 52.22:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">grub2-set-default 2</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">grub2-mkconfig -o /etc/grub2.cfg</span></span><br></pre></td></tr></table></figure><p>I asked the customer whether a reboot was acceptable; it was, and after rebooting the test passed:</p><figure class="highlight plaintext"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">$ uname -a</span><br><span class="line">Linux TKVMJT0240 4.19.90-52.22.v2207.ky10.x86_64 #1 SMP Tue Mar 14 12:19:10 CST </span><br><span class="line"></span><br><span class="line"># any machine (machine 1)</span><br><span class="line">$ python udp-test.py 8472</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"># machine 2</span><br><span class="line">$ python udp-test.py &lt;machine_1_ip&gt;:8472</span><br><span class="line">25-09-10 17:29:59,623 [INFO] Client target UDP xxx:8472 (type message and press Enter)</span><br><span class="line">&gt;&gt;&gt; 123</span><br><span class="line">25-09-10 17:30:00,753 [INFO] Server echoed: &#x27;123&#x27;</span><br><span class="line">&gt;&gt;&gt; ^C</span><br><span class="line">Bye</span><br></pre></td></tr></table></figure><p>The other machines were fine after the same treatment.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;Troubleshooting a flannel cross-node connectivity failure caused by a Kylin kernel&lt;/p&gt;</summary>
    
    
    
    
    <category term="kubernetes" scheme="http://zhangguanzhang.github.io/tags/kubernetes/"/>
    
    <category term="flannel" scheme="http://zhangguanzhang.github.io/tags/flannel/"/>
    
    <category term="vxlan" scheme="http://zhangguanzhang.github.io/tags/vxlan/"/>
    
    <category term="kylin" scheme="http://zhangguanzhang.github.io/tags/kylin/"/>
    
  </entry>
  
  <entry>
    <title>flannel routing mix-up</title>
    <link href="http://zhangguanzhang.github.io/2025/09/03/flannel-mode-chaos/"/>
    <id>http://zhangguanzhang.github.io/2025/09/03/flannel-mode-chaos/</id>
    <published>2025-09-03T10:10:30.000Z</published>
    <updated>2025-09-03T10:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>A cross-node connectivity failure caused by a flannel routing mix-up</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>In a customer test environment, the delivery team found problems after deploying the workloads; the application logs showed that domain names could not be resolved. Because host-gw requires layer-2 adjacency and some virtualization platforms bind IP&#x2F;MAC, we use vxlan mode by default.</p><h2 id="排查"><a href="#排查" class="headerlink" title="排查"></a>Troubleshooting</h2><h3 id="路由不对"><a href="#路由不对" class="headerlink" title="路由不对"></a>Wrong routes</h3><p>After logging in I found that the Pod subnet routes were wrong:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"> ip r s</span> </span><br><span class="line">default via 172.16.0.1 dev eth0 proto dhcp metric 100 </span><br><span class="line">10.185.0.0/16 dev docker0 proto kernel scope link src 10.185.0.1 linkdown </span><br><span class="line">10.187.0.0/24 via 172.16.0.250 dev eth0</span><br><span class="line">10.187.1.0/24 via 172.16.0.202 dev eth0 </span><br><span class="line">10.187.2.0/24 via 172.16.0.104 dev eth0 #&lt;--- these</span><br><span class="line">10.187.3.0/24 via 172.16.0.104 dev eth0 #&lt;--- these</span><br><span class="line">10.187.3.0/24 dev cni0 proto kernel scope link src 10.187.3.1 #&lt;--- these</span><br><span class="line">172.16.0.0/24 dev eth0 proto kernel scope link src 172.16.0.231 metric 100 </span><br></pre></td></tr></table></figure><p>This is supposed to be vxlan mode, but the next hop for both 2.0 and 3.0 is 172.16.0.104, and this host owns the <code>10.187.3.0/24</code> subnet, so the cni0 route can never be matched. I suspected the customer had added these routes and asked the team to agree with the customer on a Pod CIDR that does not overlap the customer's intranet. After the delivery team reinstalled, it was still the same, so I logged in and looked at the logs:</p><h3 id="模式错乱"><a href="#模式错乱" class="headerlink" title="模式错乱"></a>Mode mix-up</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">$ docker ps -a | grep flanneld</span><br><span class="line">$ docker logs xxxx # flannel容器ID</span><br><span class="line">W0903 01:37:31.229597       1 main.go:540] no subnet found for key: FLANNEL_IPV6_NETWORK in file: /run/flannel/subnet.env</span><br><span class="line">W0903 01:37:31.229636       1 main.go:540] no subnet found for key: FLANNEL_IPV6_SUBNET in file: /run/flannel/subnet.env</span><br><span class="line">I0903 01:37:31.229643       1 iptables.go:125] Setting up masking rules</span><br><span class="line">I0903 01:37:31.422425       1 iptables.go:226] Changing default FORWARD chain policy to ACCEPT</span><br><span class="line">I0903 01:37:31.523863       1 main.go:396] Wrote subnet file to /run/flannel/subnet.env</span><br><span class="line">I0903 01:37:31.523889       1 main.go:400] Running backend.</span><br><span class="line">I0903 01:37:31.524023       1 route_network.go:56] Watching for new subnet leases</span><br><span class="line">I0903 01:37:31.524247       1 subnet.go:152] Batch elem [0] is &#123; lease.Event&#123;Type:0, Lease:lease.Lease&#123;EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net&#123;IP:0xa610100, PrefixLen:0x18&#125;, IPv6Subnet:ip.IP6Net&#123;IP:(*ip.IP6)(nil), PrefixLen:0x0&#125;, Attrs:lease.LeaseAttrs&#123;PublicIP:0xac100068, PublicIPv6:(*ip.IP6)(nil), 
BackendType:&quot;host-gw&quot;, BackendData:json.RawMessage&#123;0x6e, 0x75, 0x6c, 0x6c&#125;, BackendV6Data:json.RawMessage(nil)&#125;, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0&#125;&#125; &#125;</span><br><span class="line">I0903 01:37:31.524425       1 subnet.go:152] Batch elem [0] is &#123; lease.Event&#123;Type:0, Lease:lease.Lease&#123;EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net&#123;IP:0xa610300, PrefixLen:0x18&#125;, IPv6Subnet:ip.IP6Net&#123;IP:(*ip.IP6)(nil), PrefixLen:0x0&#125;, Attrs:lease.LeaseAttrs&#123;PublicIP:0xac1000e7, PublicIPv6:(*ip.IP6)(nil), BackendType:&quot;host-gw&quot;, BackendData:json.RawMessage&#123;0x6e, 0x75, 0x6c, 0x6c&#125;, BackendV6Data:json.RawMessage(nil)&#125;, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0&#125;&#125; &#125;</span><br><span class="line">I0903 01:37:31.524449       1 route_network.go:93] Subnet added: 10.97.1.0/24 via 172.16.0.104</span><br><span class="line">I0903 01:37:31.524705       1 route_network.go:166] Route to &#123;Ifindex: 2 Dst: 10.97.1.0/24 Src: &lt;nil&gt; Gw: 172.16.0.104 Flags: [] Table: 0 Realm: 0&#125; already exists, skipping.</span><br><span class="line">I0903 01:37:31.524797       1 route_network.go:93] Subnet added: 10.97.3.0/24 via 172.16.0.231</span><br><span class="line">I0903 01:37:31.524876       1 route_network.go:166] Route to &#123;Ifindex: 2 Dst: 10.97.3.0/24 Src: &lt;nil&gt; Gw: 172.16.0.231 Flags: [] Table: 0 Realm: 0&#125; already exists, skipping.</span><br><span class="line">I0903 01:37:31.524903       1 subnet.go:152] Batch elem [0] is &#123; lease.Event&#123;Type:0, Lease:lease.Lease&#123;EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net&#123;IP:0xa610000, PrefixLen:0x18&#125;, IPv6Subnet:ip.IP6Net&#123;IP:(*ip.IP6)(nil), PrefixLen:0x0&#125;, Attrs:lease.LeaseAttrs&#123;PublicIP:0xac1000fa, PublicIPv6:(*ip.IP6)(nil), BackendType:&quot;host-gw&quot;, BackendData:json.RawMessage&#123;0x6e, 0x75, 0x6c, 
0x6c&#125;, BackendV6Data:json.RawMessage(nil)&#125;, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0&#125;&#125; &#125;</span><br><span class="line">I0903 01:37:31.524939       1 route_network.go:93] Subnet added: 10.97.0.0/24 via 172.16.0.250</span><br><span class="line">I0903 01:37:31.525190       1 route_network.go:166] Route to &#123;Ifindex: 2 Dst: 10.97.0.0/24 Src: &lt;nil&gt; Gw: 172.16.0.250 Flags: [] Table: 0 Realm: 0&#125; already exists, skipping.</span><br><span class="line">I0903 01:37:31.621427       1 main.go:421] Waiting for all goroutines to exit</span><br><span class="line">I0903 01:37:31.922267       1 iptables.go:372] bootstrap done</span><br><span class="line">I0903 01:37:32.129063       1 iptables.go:372] bootstrap done</span><br></pre></td></tr></table></figure><p>The log above looks odd. Note two key points:</p><ul><li><code>BackendType:&quot;host-gw&quot;</code></li><li><code>Subnet added: 10.97.0.0/24 via 172.16.0.250</code></li></ul><p>Why is the backend <code>host-gw</code>? Check the routes:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">[root@vm172-16-0-202 ~]# ip r s </span><br><span class="line">default via 172.16.0.1 dev eth0 proto dhcp metric 100 </span><br><span class="line">10.97.0.0/24 via 172.16.0.250 dev eth0 </span><br><span class="line">10.97.1.0/24 via 172.16.0.104 dev eth0 </span><br><span class="line">10.97.2.0/24 via 172.16.0.250 dev eth0 </span><br><span class="line">10.97.2.0/24 dev cni0 proto kernel scope link src 10.97.2.1 </span><br><span class="line">10.97.3.0/24 via 172.16.0.231 dev eth0 
</span><br><span class="line">10.185.0.0/16 dev docker0 proto kernel scope link src 10.185.0.1 linkdown </span><br><span class="line">10.187.0.0/24 via 172.16.0.250 dev eth0 </span><br><span class="line">10.187.2.0/24 via 172.16.0.104 dev eth0 </span><br><span class="line">10.187.3.0/24 via 172.16.0.231 dev eth0 </span><br><span class="line">172.16.0.0/24 dev eth0 proto kernel scope link src 172.16.0.202 metric 100</span><br></pre></td></tr></table></figure><p>Ignore the old routes; the new ones are indeed host-gw routes. Check which backend type the configmap configures:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line">[root@vm172-16-0-202 ~]# kubectl -n kube-system get cm kube-flannel-cfg -o yaml</span><br><span class="line">apiVersion: v1</span><br><span class="line">data:</span><br><span class="line">  cni-conf.json: |</span><br><span class="line">    &#123;</span><br><span class="line">      <span 
class="string">&quot;name&quot;</span>: <span class="string">&quot;cbr0&quot;</span>,</span><br><span class="line">      <span class="string">&quot;cniVersion&quot;</span>: <span class="string">&quot;0.3.1&quot;</span>,</span><br><span class="line">      <span class="string">&quot;plugins&quot;</span>: [</span><br><span class="line">        &#123;</span><br><span class="line">          <span class="string">&quot;type&quot;</span>: <span class="string">&quot;flannel&quot;</span>,</span><br><span class="line">          <span class="string">&quot;delegate&quot;</span>: &#123;</span><br><span class="line">            <span class="string">&quot;hairpinMode&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">            <span class="string">&quot;isDefaultGateway&quot;</span>: <span class="literal">true</span></span><br><span class="line">          &#125;</span><br><span class="line">        &#125;,</span><br><span class="line">        &#123;</span><br><span class="line">          <span class="string">&quot;type&quot;</span>: <span class="string">&quot;portmap&quot;</span>,</span><br><span class="line">          <span class="string">&quot;capabilities&quot;</span>: &#123;</span><br><span class="line">            <span class="string">&quot;portMappings&quot;</span>: <span class="literal">true</span></span><br><span class="line">          &#125;</span><br><span class="line">        &#125;</span><br><span class="line">      ]</span><br><span class="line">    &#125;</span><br><span class="line">  net-conf.json: |</span><br><span class="line">    &#123;</span><br><span class="line"></span><br><span class="line">      <span class="string">&quot;Network&quot;</span>: <span class="string">&quot;10.97.0.0/16&quot;</span>,</span><br><span class="line">            <span class="string">&quot;Backend&quot;</span>: &#123;</span><br><span class="line">        <span class="string">&quot;Type&quot;</span>: <span 
class="string">&quot;vxlan&quot;</span>,</span><br><span class="line">        <span class="string">&quot;Port&quot;</span>: 8475</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">kind: ConfigMap</span><br></pre></td></tr></table></figure><p>The configmap says vxlan, which is correct. Check the subnet file:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">[root@vm172-16-0-202 ~]# cat /run/flannel/subnet.env </span><br><span class="line">FLANNEL_NETWORK=10.97.0.0/16</span><br><span class="line">FLANNEL_SUBNET=10.97.2.1/24</span><br><span class="line">FLANNEL_MTU=1500</span><br><span class="line">FLANNEL_IPMASQ=true</span><br></pre></td></tr></table></figure><p>Strange: the MTU is 1500, which is the host-gw value. Delete the flannel container and check again:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line">[root@vm172-16-0-202 ~]# docker rm -f xxx # flanneld container ID</span><br><span class="line">[root@vm172-16-0-202 ~]# cat /run/flannel/subnet.env </span><br><span class="line">FLANNEL_NETWORK=10.97.0.0/16</span><br><span class="line">FLANNEL_SUBNET=10.97.2.1/24</span><br><span class="line">FLANNEL_MTU=1450</span><br><span class="line">FLANNEL_IPMASQ=true</span><br><span class="line">[root@vm172-16-0-202 ~]# docker ps -a | grep flanneld</span><br><span class="line">e288c4442ee2   reg.xxx.lan:5000/xxx/flannel                   &quot;/opt/bin/flanneld -…&quot;   41 seconds ago   Up 40 seconds                         k8s_kube-flannel_kube-flannel-ds-w7rk2_kube-system_581101b1-cfa5-4ccc-80be-e78a9c248b96_1</span><br><span class="line">[root@vm172-16-0-202 ~]# docker logs e288</span><br><span class="line">I0903 01:47:13.123249       1 main.go:211] CLI flags config: &#123;etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true 
netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true&#125;</span><br><span class="line">W0903 01:47:13.123387       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.</span><br><span class="line">I0903 01:47:13.223377       1 kube.go:139] Waiting 10m0s for node controller to sync</span><br><span class="line">I0903 01:47:13.223476       1 kube.go:469] Starting kube subnet manager</span><br><span class="line">I0903 01:47:13.230593       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.97.1.0/24]</span><br><span class="line">I0903 01:47:13.230628       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.97.2.0/24]</span><br><span class="line">I0903 01:47:13.230637       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.97.3.0/24]</span><br><span class="line">I0903 01:47:13.230644       1 kube.go:490] Creating the node lease for IPv4. 
This is the n.Spec.PodCIDRs: [10.97.0.0/24]</span><br><span class="line">I0903 01:47:14.223684       1 kube.go:146] Node controller sync successful</span><br><span class="line">I0903 01:47:14.223761       1 main.go:231] Created subnet manager: Kubernetes Subnet Manager - 172.16.0.202</span><br><span class="line">I0903 01:47:14.223767       1 main.go:234] Installing signal handlers</span><br><span class="line">I0903 01:47:14.224176       1 main.go:452] Found network config - Backend type: vxlan</span><br><span class="line">I0903 01:47:14.229346       1 kube.go:669] List of node(172.16.0.202) annotations: map[string]string&#123;&quot;flannel.alpha.coreos.com/backend-data&quot;:&quot;null&quot;, &quot;flannel.alpha.coreos.com/backend-type&quot;:&quot;host-gw&quot;, &quot;flannel.alpha.coreos.com/kube-subnet-manager&quot;:&quot;true&quot;, &quot;flannel.alpha.coreos.com/public-ip&quot;:&quot;172.16.0.202&quot;, &quot;node.alpha.kubernetes.io/ttl&quot;:&quot;0&quot;, &quot;volumes.kubernetes.io/controller-managed-attach-detach&quot;:&quot;true&quot;&#125;</span><br><span class="line">I0903 01:47:14.229398       1 match.go:210] Determining IP address of default interface</span><br><span class="line">I0903 01:47:14.229758       1 match.go:263] Using interface with name eth0 and address 172.16.0.202</span><br><span class="line">I0903 01:47:14.229788       1 match.go:285] Defaulting external address to interface address (172.16.0.202)</span><br><span class="line">I0903 01:47:14.229841       1 vxlan.go:141] VXLAN config: VNI=1 Port=8475 GBP=false Learning=false DirectRouting=false</span><br><span class="line">I0903 01:47:14.233427       1 kube.go:636] List of node(172.16.0.202) annotations: map[string]string&#123;&quot;flannel.alpha.coreos.com/backend-data&quot;:&quot;null&quot;, &quot;flannel.alpha.coreos.com/backend-type&quot;:&quot;host-gw&quot;, &quot;flannel.alpha.coreos.com/kube-subnet-manager&quot;:&quot;true&quot;, 
&quot;flannel.alpha.coreos.com/public-ip&quot;:&quot;172.16.0.202&quot;, &quot;node.alpha.kubernetes.io/ttl&quot;:&quot;0&quot;, &quot;volumes.kubernetes.io/controller-managed-attach-detach&quot;:&quot;true&quot;&#125;</span><br><span class="line">I0903 01:47:14.249292       1 iptables.go:51] Starting flannel in iptables mode...</span><br><span class="line">W0903 01:47:14.249452       1 main.go:540] no subnet found for key: FLANNEL_IPV6_NETWORK in file: /run/flannel/subnet.env</span><br><span class="line">W0903 01:47:14.249491       1 main.go:540] no subnet found for key: FLANNEL_IPV6_SUBNET in file: /run/flannel/subnet.env</span><br><span class="line">I0903 01:47:14.249498       1 iptables.go:125] Setting up masking rules</span><br><span class="line">I0903 01:47:14.250112       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.97.2.0/24]</span><br><span class="line">I0903 01:47:14.529553       1 iptables.go:226] Changing default FORWARD chain policy to ACCEPT</span><br><span class="line">I0903 01:47:14.622211       1 main.go:396] Wrote subnet file to /run/flannel/subnet.env</span><br><span class="line">I0903 01:47:14.622240       1 main.go:400] Running backend.</span><br><span class="line">I0903 01:47:14.622545       1 vxlan_network.go:65] watching for new subnet leases</span><br><span class="line">I0903 01:47:14.622590       1 subnet.go:152] Batch elem [0] is &#123; lease.Event&#123;Type:0, Lease:lease.Lease&#123;EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net&#123;IP:0xa610100, PrefixLen:0x18&#125;, IPv6Subnet:ip.IP6Net&#123;IP:(*ip.IP6)(nil), PrefixLen:0x0&#125;, Attrs:lease.LeaseAttrs&#123;PublicIP:0xac100068, PublicIPv6:(*ip.IP6)(nil), BackendType:&quot;host-gw&quot;, BackendData:json.RawMessage&#123;0x6e, 0x75, 0x6c, 0x6c&#125;, BackendV6Data:json.RawMessage(nil)&#125;, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0&#125;&#125; &#125;</span><br><span class="line">I0903 01:47:14.622651       1 
subnet.go:152] Batch elem [0] is &#123; lease.Event&#123;Type:0, Lease:lease.Lease&#123;EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net&#123;IP:0xa610300, PrefixLen:0x18&#125;, IPv6Subnet:ip.IP6Net&#123;IP:(*ip.IP6)(nil), PrefixLen:0x0&#125;, Attrs:lease.LeaseAttrs&#123;PublicIP:0xac1000e7, PublicIPv6:(*ip.IP6)(nil), BackendType:&quot;host-gw&quot;, BackendData:json.RawMessage&#123;0x6e, 0x75, 0x6c, 0x6c&#125;, BackendV6Data:json.RawMessage(nil)&#125;, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0&#125;&#125; &#125;</span><br><span class="line">I0903 01:47:14.622694       1 vxlan_network.go:100] Received Subnet Event with VxLan: BackendType: host-gw, PublicIP: 172.16.0.104, PublicIPv6: (nil), BackendData: null, BackendV6Data: (nil)</span><br><span class="line">W0903 01:47:14.622710       1 vxlan_network.go:102] ignoring non-vxlan v4Subnet(10.97.1.0/24) v6Subnet(::/0): type=host-gw</span><br><span class="line">I0903 01:47:14.622724       1 vxlan_network.go:100] Received Subnet Event with VxLan: BackendType: host-gw, PublicIP: 172.16.0.231, PublicIPv6: (nil), BackendData: null, BackendV6Data: (nil)</span><br><span class="line">W0903 01:47:14.622727       1 vxlan_network.go:102] ignoring non-vxlan v4Subnet(10.97.3.0/24) v6Subnet(::/0): type=host-gw</span><br><span class="line">I0903 01:47:14.622737       1 subnet.go:152] Batch elem [0] is &#123; lease.Event&#123;Type:0, Lease:lease.Lease&#123;EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net&#123;IP:0xa610000, PrefixLen:0x18&#125;, IPv6Subnet:ip.IP6Net&#123;IP:(*ip.IP6)(nil), PrefixLen:0x0&#125;, Attrs:lease.LeaseAttrs&#123;PublicIP:0xac1000fa, PublicIPv6:(*ip.IP6)(nil), BackendType:&quot;host-gw&quot;, BackendData:json.RawMessage&#123;0x6e, 0x75, 0x6c, 0x6c&#125;, BackendV6Data:json.RawMessage(nil)&#125;, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0&#125;&#125; &#125;</span><br><span class="line">I0903 01:47:14.622759       1 vxlan_network.go:100] Received 
Subnet Event with VxLan: BackendType: host-gw, PublicIP: 172.16.0.250, PublicIPv6: (nil), BackendData: null, BackendV6Data: (nil)</span><br><span class="line">W0903 01:47:14.622764       1 vxlan_network.go:102] ignoring non-vxlan v4Subnet(10.97.0.0/24) v6Subnet(::/0): type=host-gw</span><br><span class="line">I0903 01:47:14.722083       1 main.go:421] Waiting for all goroutines to exit</span><br><span class="line">I0903 01:47:15.123247       1 iptables.go:372] bootstrap done</span><br><span class="line">I0903 01:47:15.525861       1 iptables.go:372] bootstrap done</span><br></pre></td></tr></table></figure><p>The file content is correct now, but the node annotations printed in the log look wrong:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">kube.go:636] List of node(172.16.0.202) annotations: \</span><br><span class="line">  map[string]string&#123;&quot;flannel.alpha.coreos.com/backend-data&quot;:&quot;null&quot;, \</span><br><span class="line">  &quot;flannel.alpha.coreos.com/backend-type&quot;:&quot;host-gw&quot;, \</span><br><span class="line">  &quot;flannel.alpha.coreos.com/kube-subnet-manager&quot;:&quot;true&quot;, \</span><br><span class="line">  &quot;flannel.alpha.coreos.com/public-ip&quot;:&quot;172.16.0.202&quot;, \</span><br><span class="line">  &quot;node.alpha.kubernetes.io/ttl&quot;:&quot;0&quot;, &quot;volumes.kubernetes.io/controller-managed-attach-detach&quot;:&quot;true&quot;&#125;</span><br></pre></td></tr></table></figure><p>The log shows host-gw. Check the annotations on the nodes:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl get node -o yaml | grep flannel</span><br><span class="line">      flannel.alpha.coreos.com/backend-data: &quot;null&quot;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: host-gw</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.104</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel@sha256:13cddb14533a10394aa9436bd96a4c866a139b7ef01e71526aae013e724acca7</span><br><span class="line">      - flannel/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin@sha256:85aa4c338969e97b1ab751fdc2c167af228a241a224e2d0e5b81ca0f3e93e1fa</span><br><span class="line">      - flannel/flannel-cni-plugin:v1.4.1-flannel1</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin:v1.4.1</span><br><span class="line">      
flannel.alpha.coreos.com/backend-data: &#x27;&#123;&quot;VNI&quot;:1,&quot;VtepMAC&quot;:&quot;f6:12:96:a0:5a:14&quot;&#125;&#x27;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: vxlan</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.202</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel@sha256:13cddb14533a10394aa9436bd96a4c866a139b7ef01e71526aae013e724acca7</span><br><span class="line">      - flannel/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin@sha256:85aa4c338969e97b1ab751fdc2c167af228a241a224e2d0e5b81ca0f3e93e1fa</span><br><span class="line">      - flannel/flannel-cni-plugin:v1.4.1-flannel1</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin:v1.4.1</span><br><span class="line">      flannel.alpha.coreos.com/backend-data: &quot;null&quot;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: host-gw</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.231</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel@sha256:13cddb14533a10394aa9436bd96a4c866a139b7ef01e71526aae013e724acca7</span><br><span class="line">      - flannel/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin@sha256:85aa4c338969e97b1ab751fdc2c167af228a241a224e2d0e5b81ca0f3e93e1fa</span><br><span class="line">      - flannel/flannel-cni-plugin:v1.4.1-flannel1</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin:v1.4.1</span><br><span class="line">      flannel.alpha.coreos.com/backend-data: 
&quot;null&quot;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: host-gw</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.250</span><br></pre></td></tr></table></figure><p>Why are there nodes annotated as host-gw? Since deleting the pod alone does not fix it, I used <code>kubectl edit</code> to remove the <code>flannel.alpha.coreos.com/backend-type</code> annotation:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line">[root@vm172-16-0-202 ~]# kubectl edit node 172.16.0.202</span><br><span class="line">[root@vm172-16-0-202 ~]# kubectl -n kube-system delete pod kube-flannel-ds-2cxkf kube-flannel-ds-hml7r</span><br><span 
class="line">pod &quot;kube-flannel-ds-2cxkf&quot; deleted</span><br><span class="line">pod &quot;kube-flannel-ds-hml7r&quot; deleted</span><br><span class="line">[root@vm172-16-0-202 ~]# kubectl get no -o yaml | grep flannel</span><br><span class="line">      flannel.alpha.coreos.com/backend-data: &#x27;&#123;&quot;VNI&quot;:1,&quot;VtepMAC&quot;:&quot;de:87:3d:b7:74:fc&quot;&#125;&#x27;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: vxlan</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.104</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel@sha256:13cddb14533a10394aa9436bd96a4c866a139b7ef01e71526aae013e724acca7</span><br><span class="line">      - flannel/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin@sha256:85aa4c338969e97b1ab751fdc2c167af228a241a224e2d0e5b81ca0f3e93e1fa</span><br><span class="line">      - flannel/flannel-cni-plugin:v1.4.1-flannel1</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin:v1.4.1</span><br><span class="line">      flannel.alpha.coreos.com/backend-data: &#x27;&#123;&quot;VNI&quot;:1,&quot;VtepMAC&quot;:&quot;f6:12:96:a0:5a:14&quot;&#125;&#x27;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: vxlan</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.202</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel@sha256:13cddb14533a10394aa9436bd96a4c866a139b7ef01e71526aae013e724acca7</span><br><span class="line">      - flannel/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel:v0.25.4</span><br><span class="line">      - 
reg.xxx.lan:5000/xxx/flannel-cni-plugin@sha256:85aa4c338969e97b1ab751fdc2c167af228a241a224e2d0e5b81ca0f3e93e1fa</span><br><span class="line">      - flannel/flannel-cni-plugin:v1.4.1-flannel1</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin:v1.4.1</span><br><span class="line">      flannel.alpha.coreos.com/backend-data: &#x27;&#123;&quot;VNI&quot;:1,&quot;VtepMAC&quot;:&quot;1e:24:24:5e:b8:84&quot;&#125;&#x27;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: vxlan</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.231</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel@sha256:13cddb14533a10394aa9436bd96a4c866a139b7ef01e71526aae013e724acca7</span><br><span class="line">      - flannel/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel:v0.25.4</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin@sha256:85aa4c338969e97b1ab751fdc2c167af228a241a224e2d0e5b81ca0f3e93e1fa</span><br><span class="line">      - flannel/flannel-cni-plugin:v1.4.1-flannel1</span><br><span class="line">      - reg.xxx.lan:5000/xxx/flannel-cni-plugin:v1.4.1</span><br><span class="line">      flannel.alpha.coreos.com/backend-data: &#x27;&#123;&quot;VNI&quot;:1,&quot;VtepMAC&quot;:&quot;36:a5:28:4d:ec:e3&quot;&#125;&#x27;</span><br><span class="line">      flannel.alpha.coreos.com/backend-type: vxlan</span><br><span class="line">      flannel.alpha.coreos.com/kube-subnet-manager: &quot;true&quot;</span><br><span class="line">      flannel.alpha.coreos.com/public-ip: 172.16.0.250</span><br></pre></td></tr></table></figure><p>After handling the remaining nodes the same way:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">[root@vm172-16-0-202 ~]# ip r s</span><br><span class="line">default via 172.16.0.1 dev eth0 proto dhcp metric 100 </span><br><span class="line">10.97.0.0/24 via 10.97.0.0 dev flannel.1 onlink </span><br><span class="line">10.97.1.0/24 via 10.97.1.0 dev flannel.1 onlink </span><br><span class="line">10.97.2.0/24 via 172.16.0.250 dev eth0 </span><br><span class="line">10.97.2.0/24 dev cni0 proto kernel scope link src 10.97.2.1 </span><br><span class="line">10.97.3.0/24 via 10.97.3.0 dev flannel.1 onlink </span><br><span class="line">10.185.0.0/16 dev docker0 proto kernel scope link src 10.185.0.1 linkdown </span><br><span class="line">10.187.0.0/24 via 172.16.0.250 dev eth0 </span><br><span class="line">10.187.2.0/24 via 172.16.0.104 dev eth0 </span><br><span class="line">10.187.3.0/24 via 172.16.0.231 dev eth0 </span><br><span class="line">172.16.0.0/24 dev eth0 proto kernel scope link src 172.16.0.202 metric 100</span><br></pre></td></tr></table></figure><p>The routes are still correct after a reboot:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">[root@vm172-16-0-202 ~]# ip r</span><br><span class="line">default via 172.16.0.1 dev eth0 proto dhcp metric 100 </span><br><span class="line">10.97.0.0/24 via 10.97.0.0 dev flannel.1 onlink </span><br><span class="line">10.97.1.0/24 via 10.97.1.0 dev flannel.1 onlink </span><br><span class="line">10.97.2.0/24 dev cni0 proto kernel scope link src 
10.97.2.1 </span><br><span class="line">10.97.3.0/24 via 10.97.3.0 dev flannel.1 onlink </span><br><span class="line">10.185.0.0/16 dev docker0 proto kernel scope link src 10.185.0.1 linkdown </span><br><span class="line">172.16.0.0/24 dev eth0 proto kernel scope link src 172.16.0.202 metric 100</span><br></pre></td></tr></table></figure><p>I asked whether anyone had changed the backend mode during the initial deployment, and was told no. Strange.</p>]]></content>
    
    
    <summary type="html">&lt;p&gt;A cross-node connectivity failure caused by mixed-up flannel routes&lt;/p&gt;</summary>
    
    
    
    
    <category term="kubernetes" scheme="http://zhangguanzhang.github.io/tags/kubernetes/"/>
    
    <category term="flannel" scheme="http://zhangguanzhang.github.io/tags/flannel/"/>
    
    <category term="vxlan" scheme="http://zhangguanzhang.github.io/tags/vxlan/"/>
    
  </entry>
  
  <entry>
    <title>Evaluating Podman for On-Premises Deployments (CentOS 7)</title>
    <link href="http://zhangguanzhang.github.io/2025/08/06/centos7-podman/"/>
    <id>http://zhangguanzhang.github.io/2025/08/06/centos7-podman/</id>
    <published>2025-08-06T10:10:30.000Z</published>
    <updated>2025-08-06T10:10:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>Notes from evaluating Podman on CentOS 7.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>We need to evaluate replacing docker with podman. Our on-premises product has to support many operating systems (many customers mandate a specific OS internally, so we have to support it), so I started with the most common one, CentOS 7:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /etc/redhat-release</span> </span><br><span class="line">CentOS Linux release 7.8.2003 (Core)</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">uname</span> -a</span></span><br><span class="line">Linux xxx 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux</span><br></pre></td></tr></table></figure><h2 id="过程"><a href="#过程" class="headerlink" title="过程"></a>Process</h2><h3 id="离线安装"><a href="#离线安装" class="headerlink" title="离线安装"></a>Offline installation</h3><p>The <a href="https://podman.io/docs/installation">official installation docs</a> have no instructions for CentOS 7, because it is already EOL (since June 2024). Customer sites may have no network access, so we need something like docker-static. Searching Google and GitHub turned up <a href="https://github.com/mgoltzsche/podman-static">podman-static</a>. The latest podman major version is v5; download the v5.5.2 tarball and extract it.</p><h3 id="daemon-进程相关"><a href="#daemon-进程相关" class="headerlink" title="daemon 进程相关"></a>Daemon process</h3><p>We also need to call the API, so a daemon process has to listen on a unix socket and on TCP. The relevant command is:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">podman system service -h</span></span><br><span class="line">Run API service</span><br><span class="line"></span><br><span class="line">Description:</span><br><span class="line">  Run an API service</span><br><span class="line"></span><br><span class="line">Enable a listening service for API access to Podman commands.</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">Usage:</span><br><span class="line">  podman system service [options] [URI]</span><br><span class="line"></span><br><span class="line">Examples:</span><br><span class="line">  podman system service --time=0 unix:///tmp/podman.sock</span><br><span class="line">  podman system service --time=0 tcp://localhost:8888</span><br><span class="line"></span><br><span class="line">Options:</span><br><span class="line">      --cors string   Set CORS Headers</span><br><span class="line">  -t, --time uint     Time until the service session expires in seconds.  Use 0 to disable the timeout (default 5)</span><br></pre></td></tr></table></figure><p>The service cannot listen on TCP and a Unix socket at the same time, and the TCP listener has no TLS option. The issue <a href="https://github.com/containers/podman/issues/24583">Support (m)TLS API socket</a> shows this is not supported yet, so listening on a socket is the only option.</p><p>Next, I found that it would not listen on the socket path I specified:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">API service listening on \&quot;/run/podman/podman.sock\&quot;. 
URI: \&quot;unix:///var/run/docker.sock\&quot;&quot;</span><br></pre></td></tr></table></figure><p>Reading the Podman source, execution reaches the following logic:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containers/podman/blob/v5.5.2/cmd/podman/system/service_abi.go#L58-L77</span></span><br><span class="line">        <span class="keyword">switch</span> uri.Scheme &#123;</span><br><span class="line">        <span class="keyword">case</span> <span class="string">&quot;unix&quot;</span>:</span><br><span class="line">            path, err := filepath.Abs(uri.Path)</span><br><span class="line">            <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">                <span class="keyword">return</span> err</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">if</span> os.Getenv(<span class="string">&quot;LISTEN_FDS&quot;</span>) != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">                <span class="comment">// If it is activated by systemd, use the first LISTEN_FD (3)</span></span><br><span class="line">                <span class="comment">// instead of opening the socket 
file.</span></span><br><span class="line">                f := os.NewFile(<span class="type">uintptr</span>(<span class="number">3</span>), <span class="string">&quot;podman.sock&quot;</span>)</span><br><span class="line">                listener, err = net.FileListener(f)</span><br><span class="line">                <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">                    <span class="keyword">return</span> err</span><br><span class="line">                &#125;</span><br><span class="line">            &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">                listener, err = net.Listen(uri.Scheme, path)</span><br><span class="line">                <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">                    <span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;unable to create socket: %w&quot;</span>, err)</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br></pre></td></tr></table></figure><p>Checking the process environment confirms that <code>LISTEN_FDS</code> is set:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">xargs -n1 -0 &lt; /proc/$(pgrep podman)/environ</span></span><br><span class="line">LANG=zh_CN.UTF-8</span><br><span class="line">PATH=&quot;/data/kube/bin:/bin:/sbin:/usr/bin:/usr/sbin&quot;</span><br><span class="line">LISTEN_PID=9047</span><br><span class="line">LISTEN_FDS=1</span><br><span class="line">LOGGING=&quot;--log-level=info&quot;</span><br></pre></td></tr></table></figure><p>According to the comments in the code, this is support for systemd socket
activation, which we do not need, so I removed the following from the unit files shipped in the tarball:</p><ol><li><code>system/podman.socket</code></li><li>the <code>Requires=</code> and <code>After=</code> references to <code>podman.socket</code> in <code>system/podman.service</code></li></ol><p>After that, the service starts and listens on the expected path:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">API service listening on &quot;/var/run/docker.sock&quot;. URI: &quot;unix:///var/run/docker.sock&quot;</span><br></pre></td></tr></table></figure><h3 id="info-的-format-差异"><a href="#info-的-format-差异" class="headerlink" title="info 的 format 差异"></a>Differences in info format fields</h3><p>Some of the <code>info</code> format fields we rely on differ between Docker and Podman:</p><ol><li><code>&#39;&#123;&#123;.OSType&#125;&#125;&#39;</code> -&gt; <code>&#39;&#123;&#123;.Host.OS&#125;&#125;&#39;</code></li><li><code>&#39;&#123;&#123;.DockerRootDir&#125;&#125;&#39;</code> -&gt; <code>&#39;&#123;&#123;.Store.GraphRoot&#125;&#125;&#39;</code></li></ol><h3 id="非-host-网络容器"><a href="#非-host-网络容器" class="headerlink" title="非 host 网络容器"></a>Containers with non-host networking</h3><p>After deployment, containers that do not use host networking failed to start:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run --name registry_pass --entrypoint htpasswd registry:2.7.1</span></span><br><span class="line">Error: creating network namespace for container 3de0fd230fd7693a107de4b56e5ab1444a558ae1e835e78e292ed364915a6362: failed to create namespace: failed to bind mount ns at /run/netns/netns-b0f807fe-e630-3182-e16b-5b5837e2b1a3: no such file or directory</span><br></pre></td></tr></table></figure><p>Go errors are built by wrapping, with each layer prepending its own message, so searching the source for <code>creating network namespace for container</code> leads straight to the code that reports it:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/containers/podman/blob/v5.5.2/libpod/networking_linux.go#L77-L81</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(r *Runtime)</span></span> createNetNS(ctr *Container) (n <span class="type">string</span>, q <span class="keyword">map</span>[<span class="type">string</span>]types.StatusBlock, retErr <span class="type">error</span>) &#123;</span><br><span class="line">    ctrNS, err := netns.NewNS()</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;&quot;</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;creating network namespace for container %s: %w&quot;</span>, ctr.ID(), err)</span><br><span class="line">    &#125;</span><br></pre></td></tr></table></figure><p>Stepping into <code>netns.NewNS()</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewNS</span><span class="params">()</span></span> (ns.NetNS, <span class="type">error</span>) &#123;</span><br><span class="line">    nsRunDir, err := GetNSRunDir()</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Create the directory for mounting network namespaces</span></span><br><span class="line">    <span class="comment">// This needs to be a shared mountpoint in case it is mounted in to</span></span><br><span class="line">    <span class="comment">// other namespaces (containers)</span></span><br><span class="line">    err = makeNetnsDir(nsRunDir)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> <span class="keyword">range</span> <span class="number">10000</span> &#123;</span><br><span class="line">        nsName, err := getRandomNetnsName()</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">        
&#125;</span><br><span class="line">        nsPath := path.Join(nsRunDir, nsName)</span><br><span class="line">        ns, err := newNSPath(nsPath)</span><br><span class="line">        <span class="keyword">if</span> err == <span class="literal">nil</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> ns, <span class="literal">nil</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// retry when the name already exists</span></span><br><span class="line">        <span class="keyword">if</span> errors.Is(err, os.ErrExist) &#123;</span><br><span class="line">            <span class="keyword">continue</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span>, errNoFreeName</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The error appears to come from the <code>makeNetnsDir()</code> call; internally it only performs basic file and namespace mount operations:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">makeNetnsDir</span><span class="params">(nsRunDir <span class="type">string</span>)</span></span> <span class="type">error</span> &#123;</span><br><span class="line">    err := os.MkdirAll(nsRunDir, <span class="number">0o755</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> err</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Important, the bind mount setup is racy if two process try to set it up in parallel.</span></span><br><span class="line">    <span class="comment">// This can have very bad consequences because we end up with two duplicated mounts</span></span><br><span class="line">    <span class="comment">// for the netns file that then might have a different parent mounts.</span></span><br><span class="line">    <span class="comment">// Also because as root netns dir is also created by ip netns we should not race against 
them.</span></span><br><span class="line">    <span class="comment">// Use a lock on the netns dir like they do, compare the iproute2 ip netns add code.</span></span><br><span class="line">    <span class="comment">// https://github.com/iproute2/iproute2/blob/8b9d9ea42759c91d950356ca43930a975d0c352b/ip/ipnetns.c#L806-L815</span></span><br><span class="line"></span><br><span class="line">    dirFD, err := unix.Open(nsRunDir, unix.O_RDONLY|unix.O_DIRECTORY|unix.O_CLOEXEC, <span class="number">0</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> &amp;os.PathError&#123;Op: <span class="string">&quot;open&quot;</span>, Path: nsRunDir, Err: err&#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// closing the fd will also unlock so we do not have to call flock(fd,LOCK_UN)</span></span><br><span class="line">    <span class="keyword">defer</span> unix.Close(dirFD)</span><br><span class="line"></span><br><span class="line">    err = unix.Flock(dirFD, unix.LOCK_EX)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;failed to lock %s dir: %w&quot;</span>, nsRunDir, err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Remount the namespace directory shared. 
This will fail with EINVAL</span></span><br><span class="line">    <span class="comment">// if it is not already a mountpoint, so bind-mount it on to itself</span></span><br><span class="line">    <span class="comment">// to &quot;upgrade&quot; it to a mountpoint.</span></span><br><span class="line">    err = unix.Mount(<span class="string">&quot;&quot;</span>, nsRunDir, <span class="string">&quot;none&quot;</span>, unix.MS_SHARED|unix.MS_REC, <span class="string">&quot;&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> err == <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> err != unix.EINVAL &#123;</span><br><span class="line">        <span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;mount --make-rshared %s failed: %q&quot;</span>, nsRunDir, err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Recursively remount /run/netns on itself. 
The recursive flag is</span></span><br><span class="line">    <span class="comment">// so that any existing netns bindmounts are carried over.</span></span><br><span class="line">    err = unix.Mount(nsRunDir, nsRunDir, <span class="string">&quot;none&quot;</span>, unix.MS_BIND|unix.MS_REC, <span class="string">&quot;&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;mount --rbind %s %s failed: %q&quot;</span>, nsRunDir, nsRunDir, err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Now we can make it shared</span></span><br><span class="line">    err = unix.Mount(<span class="string">&quot;&quot;</span>, nsRunDir, <span class="string">&quot;none&quot;</span>, unix.MS_SHARED|unix.MS_REC, <span class="string">&quot;&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;mount --make-rshared %s failed: %q&quot;</span>, nsRunDir, err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>I cloned the source, built Podman, and tried to debug which step inside <code>makeNetnsDir()</code> fails, but execution never reached the breakpoint:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">$ dlv exec bin/podman --  run  -ti --entrypoint ls docker.io/library/registry:2.7.1</span><br><span class="line">Type &#x27;help&#x27; for list of commands.</span><br><span class="line">(dlv) b libpod/networking_linux.go:78</span><br><span class="line">Breakpoint 1 set at 0x1555494 for github.com/containers/podman/v5/libpod.(*Runtime).createNetNS() ./libpod/networking_linux.go:78</span><br><span class="line">(dlv) c</span><br><span class="line">WARN[0000] Using cgroups-v1 which is deprecated in favor of cgroups-v2 with Podman v5 and will be removed in a future version. Set environment variable `PODMAN_IGNORE_CGROUPSV1_WARNING` to hide this warning. </span><br><span class="line">WARN[0000] The input device is not a TTY. The --tty and --interactive flags might not work properly </span><br><span class="line">received SIGINT, stopping process (will not forward signal)</span><br><span class="line">&gt; runtime.futex() /usr/local/go/src/runtime/sys_linux_amd64.s:558 (PC: 0x492243)</span><br><span class="line">Warning: debugging optimized function</span><br><span class="line">   553:        MOVQ    ts+16(FP), R10</span><br><span class="line">   554:        MOVQ    addr2+24(FP), R8</span><br><span class="line">   555:        MOVL    val3+32(FP), R9</span><br><span class="line">   556:        MOVL    $SYS_futex, AX</span><br><span class="line">   557:        SYSCALL</span><br><span class="line">=&gt; 558:        MOVL    AX, ret+40(FP)</span><br></pre></td></tr></table></figure><p>Then I noticed that <code>podman info</code> hangs as well:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">bin/podman  info</span></span><br><span class="line">WARN[0000] Using cgroups-v1 which is deprecated in favor of cgroups-v2 with Podman v5 and will be removed in a future version. Set environment variable `PODMAN_IGNORE_CGROUPSV1_WARNING` to hide this warning. </span><br><span class="line">^C</span><br></pre></td></tr></table></figure><p>I traced the <code>podman info</code> call chain in the debugger and found:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">(dlv) n</span><br><span class="line">&gt; github.com/containers/podman/v5/libpod.(*Runtime).hostInfo() ./libpod/info.go:106 (PC: 0x21d942e)</span><br><span class="line">   101:        cpuUtil, err := getCPUUtilization()</span><br><span class="line">   102:        if err != nil &#123;</span><br><span class="line">   103:            return nil, err</span><br><span class="line">   104:        &#125;</span><br><span class="line">   105:    </span><br><span class="line">=&gt; 106:        locksFree, err := r.lockManager.AvailableLocks()</span><br><span class="line">   107:        if err != nil &#123;</span><br><span class="line">   108:            return nil, fmt.Errorf(&quot;getting free locks: %w&quot;, err)</span><br><span class="line">   109:        &#125;</span><br><span class="line">   110:    </span><br><span class="line">   111:        info := define.HostInfo&#123;</span><br><span class="line">(dlv) 
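n</span><br></pre></td></tr></table></figure><p>The code shows it hangs in the cgo SHM lock code, so I could not get far enough in the debugger to locate the network namespace creation problem.</p><h3 id="v4版本尝试"><a href="#v4版本尝试" class="headerlink" title="v4版本尝试"></a>Trying v4</h3><p>v4 is no longer maintained and upstream development is on v5, but I still tried the latest v4 release, v4.9.5. Starting a container fails with a <code>Not supported</code> error:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">podman run --entrypoint <span class="built_in">ls</span> alpine:latest</span></span><br><span class="line">Error: netavark: create veth pair: Netlink error: Not supported (os error 95)</span><br></pre></td></tr></table></figure><p>Similar reports exist for both the v5 and the v4 errors:</p><ul><li><a href="https://github.com/mgoltzsche/podman-static/issues/138">https://github.com/mgoltzsche/podman-static/issues/138</a></li><li><a href="https://github.com/containers/podman/discussions/12840">https://github.com/containers/podman/discussions/12840</a></li></ul><p>The 3.10 kernel on CentOS 7 is below the required minimum of 4.18.</p><h2 id="结论"><a href="#结论" class="headerlink" title="结论"></a>Conclusion</h2><ol><li>Podman currently has no TLS support for its API, and it cannot listen on a Unix socket and TCP at the same time. For security reasons that leaves only the socket file, which rules out remote management.</li><li>On-premises delivery has to support many operating systems, with kernels ranging from very old to recent, so Podman is not usable for us. For a single self-managed environment with internet access, rather than on-premises delivery, it is a workable choice.</li></ol>]]></content>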
    
    
    <summary type="html">&lt;p&gt;Notes from evaluating Podman on CentOS 7.&lt;/p&gt;</summary>
    
    
    
    
    <category term="podman" scheme="http://zhangguanzhang.github.io/tags/podman/"/>
    
    <category term="centos" scheme="http://zhangguanzhang.github.io/tags/centos/"/>
    
  </entry>
  
  <entry>
    <title>Python gRPC: slow first request when connecting by domain name</title>
    <link href="http://zhangguanzhang.github.io/2025/07/29/python-grpc-first-long/"/>
    <id>http://zhangguanzhang.github.io/2025/07/29/python-grpc-first-long/</id>
    <published>2025-07-29T17:40:30.000Z</published>
    <updated>2025-07-29T17:40:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>A slow Python gRPC request I recently debugged for a colleague.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Yesterday at work I was pulled into a chat group for a development task. The chat history mentioned something being slow, and after a few questions I worked out the request flow:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"> client    server1   model server</span><br><span class="line">┌──┐        ┌──┐        ┌──┐ </span><br><span class="line">└┬─┘        └┬─┘        └─┬┘ </span><br><span class="line"> ├───http───►│            │  </span><br><span class="line"> │           ├───grpc────►│  </span><br><span class="line"> │           │◄──grpc─────┤  </span><br><span class="line"> │◄───http───┤            │  </span><br><span class="line">                          </span><br></pre></td></tr></table></figure><p>The HTTP client calls server1 and passes the model service's <code>model_url</code>; server1 then makes a gRPC request to <code>model_url</code>. The response was very slow, but replacing <code>ai-xxx</code> with the actual external IP made it fast.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">curl --location &#x27;http://xxxx:8870/chat&#x27; \</span><br><span class="line">    --header &#x27;Content-Type: application/json&#x27; \</span><br><span class="line">    --data 
&#x27;&#123;</span><br><span class="line">        &quot;stream&quot;: true,</span><br><span class="line">        &quot;messages_type&quot;: 1,</span><br><span class="line">        &quot;title&quot;: &quot;人工智能&quot;,</span><br><span class="line">        &quot;inspiration&quot;: &quot;文风严谨，语言简洁凝练&quot;,</span><br><span class="line">        &quot;model_config&quot;: &#123;</span><br><span class="line">            &quot;default&quot;: true,</span><br><span class="line">            &quot;type&quot;: &quot;xxxx&quot;,</span><br><span class="line">            &quot;model_url&quot;: &quot;ai-xxx&quot;,</span><br><span class="line">            &quot;model_name&quot;: &quot;xxxxx-xxxx&quot;,</span><br><span class="line">            &quot;model&quot;: &quot;xxxxx&quot;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;&#x27;</span><br></pre></td></tr></table></figure><p>Testing in the environment confirmed it: server1's access log showed about 9s per request. The environment is Kubernetes, and <code>ai-xxx</code> is a manually created Service plus Endpoints object whose endpoint is the IP and port of an external model service. The <code>/etc/nsswitch.conf</code> inside server1 looks normal:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ grep hosts /etc/nsswitch.conf </span><br><span class="line">hosts:          files dns</span><br></pre></td></tr></table></figure><p>Adding a hosts entry did not help; the slowness still reproduced every time, so Kubernetes DNS can be ruled out. The well-known Kubernetes DNS 5s timeout is caused by a conntrack 5-tuple collision between the parallel A and AAAA queries, and it shows up intermittently rather than on every request, which does not match this case.</p><h2 id="解决过程"><a href="#解决过程" class="headerlink" title="解决过程"></a>Troubleshooting</h2><p>My first suspicion was the backend logic of server1, but the developers said some environments were fine and others were not. So I asked them for a minimal Python gRPC client demo that calls the model service directly. That bisects the problem; otherwise two hops are hard to investigate at once. Today QA pinged me again about progress, so I asked the developers to send the demo over.</p><h3 id="demo-复现"><a href="#demo-复现" class="headerlink" title="demo 复现"></a>Reproducing with the demo</h3><p>The demo was a little frustrating: it measured latency with plain <code>print</code> calls, which carry no timestamps. After I switched them to <code>logging</code> by hand, the problem reproduced. The rough logic is:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">server_urls = [</span><br><span class="line">    <span class="string">&quot;ai-xxx:80&quot;</span>,</span><br><span class="line">    <span class="string">&quot;10.xx.xx.xxx:10037&quot;</span></span><br><span class="line">]</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> server_url <span class="keyword">in</span> server_urls:</span><br><span class="line">    grpc_test()</span><br></pre></td></tr></table></figure><p>I copied it into the server1 container and ran it; the problem reproduced every time. The slow stretch is <code>35s - 43s</code> in the log below, between the start of the status check and the liveness result:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">2025-07-29 16:52:35 [INFO] 🔥 开始测试GRPC连接: ai-xxx:80</span><br><span class="line">2025-07-29 16:52:35 [INFO] 📡 创建Triton客户端...</span><br><span class="line">2025-07-29 16:52:35 [INFO] ✅ 客户端创建成功</span><br><span class="line">2025-07-29 16:52:35 [INFO] 🔍 检查服务器状态...</span><br><span class="line">2025-07-29 16:52:43 [INFO]    服务器存活: ✅</span><br><span class="line">2025-07-29 16:52:43 [INFO]    服务器就绪: ✅</span><br><span class="line">2025-07-29 16:52:43 [INFO] ✅ 服务器状态检查通过</span><br><span class="line">...</span><br><span class="line">2025-07-29 16:52:44 [INFO] 🔥 开始测试GRPC连接: 10.xx.xx.xxx:10037</span><br><span class="line">2025-07-29 16:52:44 [INFO] 📡 
创建Triton客户端...</span><br><span class="line">2025-07-29 16:52:44 [INFO] ✅ 客户端创建成功</span><br><span class="line">2025-07-29 16:52:44 [INFO] 🔍 检查服务器状态...</span><br><span class="line">2025-07-29 16:52:44 [INFO]    服务器存活: ✅</span><br><span class="line">2025-07-29 16:52:44 [INFO]    服务器就绪: ✅</span><br><span class="line">2025-07-29 16:52:44 [INFO] ✅ 服务器状态检查通过</span><br></pre></td></tr></table></figure><p>The corresponding place in the demo code:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">triton_client = grpcclient.InferenceServerClient(</span><br><span class="line">    url=server_url,</span><br><span class="line">    verbose=<span class="literal">False</span></span><br><span class="line">)</span><br><span class="line">logging.info(<span class="string">&quot;✅ 客户端创建成功&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 2. 
检查服务器状态</span></span><br><span class="line">logging.info(<span class="string">&quot;🔍 检查服务器状态is_server_live&quot;</span>)</span><br><span class="line">is_live = triton_client.is_server_live()</span><br><span class="line">logging.info(<span class="string">&quot;🔍 检查服务器状态is_server_ready&quot;</span>)</span><br><span class="line">is_ready = triton_client.is_server_ready()</span><br></pre></td></tr></table></figure><p>Stepping through with <code>python -m pdb grpc-test.py</code> showed that this method is just a plain grpc request:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">(Pdb) n</span><br><span class="line">&gt; /usr/local/lib/python3<span class="number">.10</span>/site-packages/tritonclient/grpc/_client.py(<span class="number">294</span>)is_server_live()</span><br><span class="line">-&gt; <span class="keyword">try</span>:</span><br><span class="line">(Pdb) <span class="built_in">list</span></span><br><span class="line"><span class="number">289</span>          InferenceServerException</span><br><span class="line"><span class="number">290</span>              If unable to get liveness <span class="keyword">or</span> has timed out.</span><br><span class="line"><span class="number">291</span>  </span><br><span class="line"><span class="number">292</span>          <span class="string">&quot;&quot;&quot;</span></span><br><span class="line"><span class="string">293          metadata = self._get_metadata(headers)</span></span><br><span class="line"><span class="string">294  -&gt;  
      try:</span></span><br><span class="line"><span class="string">295              request = service_pb2.ServerLiveRequest()</span></span><br><span class="line"><span class="string">296              if self._verbose:</span></span><br><span class="line"><span class="string">297                  print(&quot;is_server_live, metadata &#123;&#125;\n&#123;&#125;&quot;.format(metadata, request))</span></span><br><span class="line"><span class="string">298              response = self._client_stub.ServerLive(</span></span><br><span class="line"><span class="string">299                  request=request, metadata=metadata, timeout=client_timeout</span></span><br></pre></td></tr></table></figure><p>The grpc client is <code>tritonclient</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">pip list | grep client</span></span><br><span class="line">tritonclient           2.57.0</span><br></pre></td></tr></table></figure><p>I looked the library up on <a href="https://pypi.org/project/tritonclient/#history">pypi</a>, but the branches of the repository it points to, <a href="https://github.com/triton-inference-server/client">triton-inference-server&#x2F;client</a>, do not line up with the version number above, which is odd.</p><p>Googling <code>is_server_live long time</code> turned up the same symptom in <a href="https://github.com/triton-inference-server/server/issues/3800">is_server_live() python GRPC client got no response</a>, but that reporter was connecting by IP and later closed the issue, saying it was their own network problem.</p><p>I then asked the developers to have the grpc server-side team check the call logic of the <code>ServerLive</code> interface behind <code>is_server_live()</code>, and asked the server1 developer to try commenting out <code>is_server_live()</code>. He said going through CI would be a bit slow and asked whether I could change it directly on the environment, so we all worked in parallel.</p><h3 id="稍有眉目"><a href="#稍有眉目" class="headerlink" title="稍有眉目"></a>A first lead</h3><p>After the change on the environment the problem still reproduced, so I made the same change in the demo and found that the delay had simply moved to the next grpc call:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 2. 检查服务器状态</span></span><br><span class="line">logging.info(<span class="string">&quot;🔍 检查服务器状态is_server_live&quot;</span>)</span><br><span class="line"><span class="comment"># is_live = triton_client.is_server_live()</span></span><br><span class="line">logging.info(<span class="string">&quot;🔍 检查服务器状态is_server_ready&quot;</span>)</span><br><span class="line"><span class="comment"># is_ready = triton_client.is_server_ready()</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># logging.info(f&quot;   服务器存活: &#123;&#x27;✅&#x27; if is_live else &#x27;❌&#x27;&#125;&quot;)</span></span><br><span class="line"><span class="comment"># logging.info(f&quot;   服务器就绪: &#123;&#x27;✅&#x27; if is_ready else &#x27;❌&#x27;&#125;&quot;)</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># if not is_live:</span></span><br><span class="line"><span class="comment">#     raise Exception(&quot;服务器未存活&quot;)</span></span><br><span class="line"><span class="comment"># if not is_ready:</span></span><br><span class="line"><span class="comment">#     raise Exception(&quot;服务器未就绪&quot;)</span></span><br><span class="line"></span><br><span class="line">logging.info(<span class="string">&quot;✅ 服务器状态检查通过&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span 
class="comment"># 3. 获取模型信息</span></span><br><span class="line">logging.info(<span class="string">&quot;📋 获取模型信息...&quot;</span>)</span><br><span class="line"><span class="keyword">try</span>:</span><br><span class="line">    model_metadata = triton_client.get_model_metadata(<span class="string">&quot;xxxxx&quot;</span>)</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">2025-07-29 17:32:32 [INFO] ✅ 客户端创建成功</span><br><span class="line">2025-07-29 17:32:32 [INFO] 🔍 检查服务器状态is_server_live</span><br><span class="line">2025-07-29 17:32:32 [INFO] 🔍 检查服务器状态is_server_ready</span><br><span class="line">2025-07-29 17:32:32 [INFO] ✅ 服务器状态检查通过</span><br><span class="line">2025-07-29 17:32:32 [INFO] 📋 获取模型信息...</span><br><span class="line">2025-07-29 17:32:40 [INFO] ✅ 成功获取模型元数据</span><br><span class="line">2025-07-29 17:32:40 [INFO]    模型名称: xxxx</span><br><span class="line">2025-07-29 17:32:40 [INFO]    模型版本: [&#x27;1&#x27;]</span><br><span class="line">2025-07-29 17:32:40 [INFO] 🧠 执行简单推理测试...</span><br><span class="line">2025-07-29 17:32:40 [INFO] ✅ 流启动成功</span><br><span class="line">2025-07-29 17:32:40 [INFO] ✅ 推理请求发送成功</span><br><span class="line">2025-07-29 17:32:40 [INFO] ⏳ 等待响应...</span><br><span class="line">2025-07-29 17:32:40 [INFO] ✅ 收到成功响应</span><br><span class="line">2025-07-29 17:32:40 [INFO]    输出tokens: [9]</span><br><span class="line">2025-07-29 17:32:40 
[INFO] 🎉 GRPC连接测试完全成功!</span><br><span class="line">2025-07-29 17:32:40 [INFO] 🎉 务器 ai-xxx:80 测试成功!</span><br></pre></td></tr></table></figure><p>So <code>is_server_live()</code> itself was not the problem. After reading the code more carefully, I concluded that it is the first grpc request made by <code>grpcclient</code> that is slow. A search found a similar issue with even longer delays:</p><ul><li><a href="https://github.com/triton-inference-server/server/issues/1821">https://github.com/triton-inference-server/server/issues/1821</a></li></ul><p>In that issue, however, the triton maintainers say it is a <code>grpc/grpc</code> problem. I then found <a href="https://github.com/grpc/grpc/issues/22260">Communication from c++ server to python client is too slow</a>, which reports that python is slow while c++ is not: the first grpc connection from python takes a long time, and later requests all finish within milliseconds. Further down in that issue, someone traced most of the time to libc-related calls.</p><p>grpc keeps a long-lived connection and streams over it, so I asked the developers whether they could change the code to use a connection pool. They said there was no time.</p><p>Since it looked unrelated to triton, I searched for <code>python grpc dns first time long</code> and found a similar report, <a href="https://github.com/grpc/grpc/issues/24018">Python client hangs on first connection</a>, where the investigation pointed at DNS and someone suggested trying <code>GRPC_DNS_RESOLVER=native</code>. The official <code>grpc/grpc</code> documentation says:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">* GRPC_DNS_RESOLVER</span><br><span class="line">  Declares which DNS resolver to use. The default is ares if gRPC is built with</span><br><span class="line">  c-ares support. 
Otherwise, the value of this environment variable is ignored.</span><br><span class="line">  Available DNS resolver include:</span><br><span class="line">  - ares (default on most platforms except iOS, Android or Node)- a DNS</span><br><span class="line">    resolver based around the c-ares library</span><br><span class="line">  - native - a DNS resolver based around getaddrinfo(), creates a new thread to</span><br><span class="line">    perform name resolution</span><br></pre></td></tr></table></figure><p>By default, name resolution goes through the C library <code>c-ares</code>. Setting the variable to <code>native</code> makes it use the system libc's <code>getaddrinfo()</code> instead. I tested that and it works:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">GRPC_DNS_RESOLVER=native python3  /root/test2.py</span> </span><br><span class="line">2025-07-29 17:59:43 [INFO] ✅ tritonclient导入成功</span><br><span class="line">2025-07-29 17:59:43 [INFO] ============================================================</span><br><span class="line">2025-07-29 17:59:43 [INFO] 🚀 GRPC连接测试工具</span><br><span class="line">2025-07-29 17:59:43 [INFO] 
============================================================</span><br><span class="line">2025-07-29 17:59:43 [INFO] </span><br><span class="line">📍 测试服务器: ai-xxx:80</span><br><span class="line">2025-07-29 17:59:43 [INFO] ----------------------------------------</span><br><span class="line">2025-07-29 17:59:43 [INFO] 🔍 测试DNS解析: ai-xxx:80</span><br><span class="line">2025-07-29 17:59:43 [INFO] ✅ DNS解析成功: ai-xxx -&gt; 10.186.44.250:80</span><br><span class="line">2025-07-29 17:59:43 [INFO] 🔥 开始测试GRPC连接: ai-xxx:80</span><br><span class="line">2025-07-29 17:59:43 [INFO] 📡 创建Triton客户端...</span><br><span class="line">2025-07-29 17:59:43 [INFO] ✅ 客户端创建成功</span><br><span class="line">2025-07-29 17:59:43 [INFO] 🔍 检查服务器状态is_server_live</span><br><span class="line">2025-07-29 17:59:44 [INFO] 🔍 检查服务器状态is_server_ready</span><br><span class="line">2025-07-29 17:59:44 [INFO]    服务器存活: ✅</span><br><span class="line">2025-07-29 17:59:44 [INFO]    服务器就绪: ✅</span><br><span class="line">2025-07-29 17:59:44 [INFO] ✅ 服务器状态检查通过</span><br><span class="line">2025-07-29 17:59:44 [INFO] 📋 获取模型信息...</span><br><span class="line">2025-07-29 17:59:44 [INFO] ✅ 成功获取模型元数据</span><br><span class="line">2025-07-29 17:59:44 [INFO]    模型名称: xxxx</span><br><span class="line">2025-07-29 17:59:44 [INFO]    模型版本: [&#x27;1&#x27;]</span><br><span class="line">2025-07-29 17:59:44 [INFO] 🧠 执行简单推理测试...</span><br><span class="line">2025-07-29 17:59:44 [INFO] ✅ 流启动成功</span><br><span class="line">2025-07-29 17:59:44 [INFO] ✅ 推理请求发送成功</span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;p&gt;A python grpc high-latency problem I recently solved for a colleague.&lt;/p&gt;</summary>
    
    
    
    
    <category term="python" scheme="http://zhangguanzhang.github.io/tags/python/"/>
    
    <category term="grpc" scheme="http://zhangguanzhang.github.io/tags/grpc/"/>
    
  </entry>
  
  <entry>
    <title>golang build problem with gitlab subgroups</title>
    <link href="http://zhangguanzhang.github.io/2025/06/17/golang-gitlab-subgroup/"/>
    <id>http://zhangguanzhang.github.io/2025/06/17/golang-gitlab-subgroup/</id>
    <published>2025-06-17T14:40:30.000Z</published>
    <updated>2025-06-17T14:40:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>A gitlab subgroup problem I recently solved for a colleague.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>A developer's Docker image build failed with a permission error when pulling a dependency:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">xxxapi/middlewares imports</span><br><span class="line">xxx.xxxgitlab.net/x/xx/xxx/econtext: xxx.xxxgitlab.net/x/xx/xxx@v1.6.5-rc.7.0.20250609085545-4855369c0f1e: invalid version: git ls-remote -q origin in /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c: exit status 128:</span><br><span class="line">remote: </span><br><span class="line">remote: ========================================================================</span><br><span class="line">remote: </span><br><span class="line">remote: The project you were looking for could not be found or you don&#x27;t have permission to view it.</span><br><span class="line">remote: </span><br><span class="line">remote: ========================================================================</span><br><span class="line">remote: </span><br><span class="line">fatal: Could not read from remote repository.</span><br><span class="line"></span><br><span class="line">Please make sure you have the correct access rights</span><br><span class="line">and the repository exists.</span><br></pre></td></tr></table></figure><h2 id="解决过程"><a href="#解决过程" class="headerlink" title="解决过程"></a>Troubleshooting</h2><p>The image contains gitlab deploy keys, so I first suspected missing permission on the dependency repository. I had the developer contact the owner of that repository to turn on <code>Enabled deploy keys</code>,
but the error stayed the same.</p><h3 id="手动构建测试"><a href="#手动构建测试" class="headerlink" title="手动构建测试"></a>Manual build test</h3><p>I logged in to the build machine. docker build essentially combines docker run and docker commit for each step of the Dockerfile, so I looked up the image ID of the failed run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a</span></span><br><span class="line">b7c4be99d86d   ba54b82c5b1c                                                        &quot;/bin/sh -c &#x27;IN_DOCK…&quot;   4 hours ago    Exited (1) 1 hours ago              nervous_chandrasekhar</span><br></pre></td></tr></table></figure><p>Then I tested by hand with that image ID and command:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run --<span class="built_in">rm</span> -ti --entrypoint bash ba54b82c5b1c</span></span><br><span class="line">root@a9114c46d496:/go/src/xxx.xxxgitlab.net/xxxxx/# IN_DOCKER=1 bash ./build.sh</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>The error was unchanged. <code>GOPRIVATE</code>, <code>GONOPROXY</code> and the like were all configured, and the ssh rewrite <code>git config --global url.&quot;git@xxx.xxxgitlab.net:&quot;.insteadof &quot;https://xxx.xxxgitlab.net/&quot;</code> was already in place and correct:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> ~/.gitconfig</span></span><br><span class="line">[url &quot;git@xxx.xxxgitlab.net:&quot;]</span><br><span class="line">insteadof = 
https://xxx.xxxgitlab.net/</span><br></pre></td></tr></table></figure><p>Testing with git directly also worked:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ssh -T git@xxx.xxxgitlab.net</span></span><br><span class="line">Welcome to GitLab, @xxxx_reporter!</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">git <span class="built_in">clone</span> xxx.xxxgitlab.net/x/xx/xxx</span></span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>Looking through the go help, I did not find a debug-level command-line option for <code>go mod</code>, but <code>go get</code> has one:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">go get -x  xxx.xxxgitlab.net/x/xx/xxx</span></span><br><span class="line"><span class="meta 
prompt_"># </span><span class="language-bash">get https://xxx.xxxgitlab.net/x/xx/xxx?go-get=1</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">get https://xxx.xxxgitlab.net/x/xx/xxx?go-get=1: 200 OK (0.063s)</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">get https://xxx.xxxgitlab.net/docmxini/xx?go-get=1</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">get https://xxx.xxxgitlab.net/x/xx?go-get=1: 200 OK (0.016s)</span></span><br><span class="line">mkdir -p /go/pkg/mod/cache/vcs # git3 https://xxx.xxxgitlab.net/x/xx.git</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">lock /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c.lock</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">/go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c <span class="keyword">for</span> git3 https://xxx.xxxgitlab.net/x/fx.git</span></span><br><span class="line">cd /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c; git -c log.showsignature=false log --no-decorate -n1 &#x27;--format=format:%H %ct %D&#x27; 4855369c0f1e --</span><br><span class="line">0.002s # cd /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c; git -c log.showsignature=false log --no-decorate -n1 &#x27;--format=format:%H %ct %D&#x27; 4855369c0f1e --</span><br><span class="line">cd /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c; git ls-remote -q origin</span><br><span class="line">0.232s # cd /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c; git ls-remote -q origin</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">get 
https://xxx.xxxgitlab.net/x/xx.git</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">get https://xxx.xxxgitlab.net/x/xx.git: 200 OK (0.076s)</span></span><br><span class="line">cd /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c; git tag -l</span><br><span class="line">0.002s # cd /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c; git tag -l</span><br><span class="line">go: xxx.xxxgitlab.net/x/xx/xxx@v1.6.5-rc.7.0.20250609085545-4855369c0f1e: invalid version: git ls-remote -q origin in /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c: exit status 128:</span><br><span class="line">remote: </span><br><span class="line">remote: ========================================================================</span><br><span class="line">remote: </span><br><span class="line">remote: The project you were looking for could not be found or you don&#x27;t have permission to view it.</span><br><span class="line">remote: </span><br><span class="line">remote: ========================================================================</span><br><span class="line">remote: </span><br><span class="line">fatal: Could not read from remote repository.</span><br><span class="line"></span><br><span class="line">Please make sure you have the correct access rights</span><br><span class="line">and the repository exists.</span><br></pre></td></tr></table></figure><p>The last command above is the one that fails, so I cd'd into that directory and ran it by hand:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cd</span> /go/pkg/mod/cache/vcs/672638ca2205d3f2cdc0288db28840b769e58250b95fbbd31e553c6c8076fc3c</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">git ls-remote</span></span><br><span class="line">remote: </span><br><span class="line">remote: ========================================================================</span><br><span class="line">remote: </span><br><span class="line">remote: The project you were looking for could not be found or you don&#x27;t have permission to view it.</span><br><span class="line">remote: </span><br><span class="line">remote: ========================================================================</span><br><span class="line">remote: </span><br><span class="line">fatal: Could not read from remote repository.</span><br><span class="line"></span><br><span class="line">Please make sure you have the correct access rights</span><br><span class="line">and the repository exists.</span><br></pre></td></tr></table></figure><p>The earlier <code>go get -x</code> output already contained the details of the problem, but I actually spotted it by looking at the files in this directory:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">ls</span> -al</span></span><br><span class="line">total 28</span><br><span class="line">drwxr-xr-x  7 root root  119 Jun 17 03:53 .</span><br><span class="line">drwxr-xr-x 20 root root 8192 Jun 17 03:53 ..</span><br><span class="line">-rw-r--r--  1 root root   23 Jun 17 03:53 HEAD</span><br><span class="line">drwxr-xr-x  2 root root    6 Jun 17 03:53 branches</span><br><span class="line">-rw-r--r--  1 root root  179 Jun 17 03:53 config</span><br><span class="line">-rw-r--r--  1 root root   73 Jun 17 03:53 description</span><br><span class="line">drwxr-xr-x  2 root root 4096 Jun 17 03:53 hooks</span><br><span class="line">drwxr-xr-x  2 root root   21 Jun 17 03:53 info</span><br><span class="line">drwxr-xr-x  4 root root   30 Jun 17 03:53 objects</span><br><span class="line">drwxr-xr-x  4 root root   31 Jun 17 03:53 refs</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> config</span></span><br><span class="line">[core]</span><br><span class="line">repositoryformatversion = 0</span><br><span class="line">filemode = true</span><br><span class="line">bare = true</span><br><span class="line">[remote &quot;origin&quot;]</span><br><span class="line">url = https://xxx.xxxgitlab.net/x/xx.git</span><br><span class="line">fetch = +refs/heads/*:refs/remotes/origin/*</span><br></pre></td></tr></table></figure><p>The repository URL is wrong: it is missing the last path segment. After changing it to <code>url = https://xxx.xxxgitlab.net/x/xx/xxx.git</code>, it works:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">git ls-remote | <span class="built_in">head</span></span></span><br><span class="line">From 
git@xxx.xxxgitlab.net:x/xx/xxx.git</span><br><span class="line">74f35a28ecc0f9721a49c41fb5d7bffa071d1502HEAD</span><br><span class="line">295c3a79dfc0d27022459d0cec21411031edd5e8refs/heads/0xxx</span><br><span class="line">...</span><br></pre></td></tr></table></figure><h3 id="gitlab-subgroup"><a href="#gitlab-subgroup" class="headerlink" title="gitlab subgroup"></a>gitlab subgroup</h3><p>A search showed that gitlab's authentication conflicts with the lookup logic of golang's get. Each side considers the other non-compliant and neither will give way. Details:</p><ul><li><a href="https://mp.weixin.qq.com/s/D_AsV9QpOZ_5v8f1eKoxjQ">Go 模块使用 GitLab subgroups 的问题</a></li><li><a href="https://docs.gitlab.com/ee/user/project/use_project_as_go_package.html#authenticate-go-requests-to-private-projects">use_project_as_go_package</a></li><li><a href="https://go.dev/ref/mod#private-module-proxy-auth">private-module-proxy-auth</a></li><li><a href="https://gitlab.com/gitlab-org/gitlab/-/issues/437005">Allow to set a go-modules folder for private Go projects</a></li></ul><p>There are only two solutions. One is <code>~/.netrc</code>, but that stores the password in plain text and is not suitable for CI&#x2F;CD builds.<br>The other is a replace directive, but it has to specify the same version as the requirement, and <code>go get</code> does not touch replace when it upgrades a dependency.<br>Writing the replace from a shell script before <code>go build</code> risks regex boundary mistakes and fuzzy matches on modules that share a prefix. <code>go mod edit -json</code> can output the module information as json, and luckily the golang build image is ubuntu and ships with python.</p><h3 id="脚本解决"><a href="#脚本解决" class="headerlink" title="脚本解决"></a>Solving it with a script</h3><p>I spent some time writing the python script below. It is run like this:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">go mod edit -json | python3 scripts/golang-subgroup.py \</span><br><span class="line">  --repo xxx.xxxgitlab.net/x/xx/xxx \</span><br><span class="line">  --repo xxx.xxxgitlab.net/x/xx/xxx2</span><br><span class="line"><span class="meta 
prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">或者</span></span><br><span class="line">go mod edit -json &gt; go.mod.json</span><br><span class="line">python3 scripts/golang-subgroup.py \</span><br><span class="line">  --repo xxx.xxxgitlab.net/x/xx/xxx \</span><br><span class="line">  --repo xxx.xxxgitlab.net/x/xx/xxx2</span><br></pre></td></tr></table></figure><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span 
class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># -*- coding: utf-8 -*-</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> argparse</span><br><span class="line"><span class="keyword">import</span> sys</span><br><span class="line"><span class="keyword">import</span> os</span><br><span class="line"><span class="keyword">import</span> json</span><br><span class="line"><span class="keyword">import</span> logging</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">parse_args</span>():</span><br><span class="line">    parser = argparse.ArgumentParser(description=<span class="string">&#x27;解决 go.mod 里的 gitlab subgroups 问题&#x27;</span>)</span><br><span class="line">    parser.add_argument(<span class="string">&#x27;--repo&#x27;</span>, action=<span class="string">&#x27;append&#x27;</span>, <span class="built_in">help</span>=<span class="string">&#x27;要处理的仓库名字，例如: xx.gitlab.cn/a/b/c&#x27;</span>)</span><br><span class="line">    parser.add_argument(<span class="string">&#x27;--mod&#x27;</span>, default=<span class="string">&quot;./go.mod&quot;</span>, <span class="built_in">help</span>=<span class="string">&#x27;go.mod path&#x27;</span>)</span><br><span class="line">    <span class="keyword">return</span> parser.parse_args()</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">replace_repo</span>(<span class="params">repo_list, mod_json, mod_path</span>):</span><br><span class="line">    <span class="keyword">if</span> mod_json[<span 
class="string">&#x27;Replace&#x27;</span>]:</span><br><span class="line">        mod_replace_list = [ i[<span class="string">&#x27;Old&#x27;</span>][<span class="string">&#x27;Path&#x27;</span>] <span class="keyword">for</span> i <span class="keyword">in</span> mod_json[<span class="string">&#x27;Replace&#x27;</span>]]</span><br><span class="line">        repo_list = [repo <span class="keyword">for</span> repo <span class="keyword">in</span> repo_list <span class="keyword">if</span> repo <span class="keyword">not</span> <span class="keyword">in</span> mod_replace_list]</span><br><span class="line">        <span class="keyword">if</span> <span class="built_in">len</span>(repo_list) == <span class="number">0</span>:</span><br><span class="line">            logging.info(<span class="string">&quot;已经全部替换&quot;</span>)</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line"></span><br><span class="line">    replace_list = [ i <span class="keyword">for</span> i <span class="keyword">in</span> mod_json[<span class="string">&#x27;Require&#x27;</span>] <span class="keyword">if</span> i[<span class="string">&#x27;Path&#x27;</span>] <span class="keyword">in</span> repo_list <span class="keyword">and</span> (<span class="keyword">not</span> i.get(<span class="string">&#x27;Indirect&#x27;</span>, <span class="literal">False</span>))]</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(replace_list) == <span class="number">0</span>:</span><br><span class="line">        logging.error(<span class="string">&quot;未找到需要替换的仓库: %s&quot;</span>, <span class="string">&#x27;,&#x27;</span>.join(repo_list))</span><br><span class="line">        <span class="keyword">return</span></span><br><span class="line">    <span class="keyword">with</span> <span class="built_in">open</span>(mod_path, <span class="string">&#x27;a&#x27;</span>, encoding=<span class="string">&#x27;utf-8&#x27;</span>) <span 
class="keyword">as</span> f:</span><br><span class="line">        <span class="keyword">for</span> item <span class="keyword">in</span> replace_list:</span><br><span class="line">            replace_str = <span class="string">&quot;replace &#123;0&#125; =&gt; &#123;0&#125;.git &#123;1&#125;&quot;</span>.<span class="built_in">format</span>(item[<span class="string">&#x27;Path&#x27;</span>], item[<span class="string">&#x27;Version&#x27;</span>])</span><br><span class="line">            logging.info(<span class="string">&quot;添加 %s&quot;</span>, replace_str)</span><br><span class="line">            f.writelines(replace_str+<span class="string">&#x27;\n&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    logging.basicConfig(level=logging.INFO, <span class="built_in">format</span>=<span class="string">&#x27;%(asctime)s - %(levelname)s - %(message)s&#x27;</span>)</span><br><span class="line">    args = parse_args()</span><br><span class="line">    <span class="keyword">if</span> args.repo <span class="keyword">is</span> <span class="literal">None</span> <span class="keyword">or</span> <span class="built_in">len</span>(args.repo) == <span class="number">0</span>:</span><br><span class="line">        <span class="keyword">return</span></span><br><span class="line"></span><br><span class="line">    mod_json = &#123;&#125;</span><br><span class="line">    mod_json_path = <span class="string">&quot;go.mod.json&quot;</span></span><br><span class="line">    <span class="keyword">if</span> os.isatty(<span class="number">0</span>):</span><br><span class="line">        <span class="keyword">if</span> <span class="keyword">not</span> os.path.isfile(mod_json_path):</span><br><span class="line">            logging.error(<span class="string">&quot;请以 go mod edit -json | python3 %s 运行&quot;</span>, __file__)</span><br><span class="line">            
os._exit(<span class="number">2</span>)</span><br><span class="line">        <span class="keyword">with</span> <span class="built_in">open</span>(mod_json_path, <span class="string">&#x27;r&#x27;</span>) <span class="keyword">as</span> file:</span><br><span class="line">            mod_json = json.load(file)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        mod_json = json.loads(sys.stdin.read())</span><br><span class="line"></span><br><span class="line">    </span><br><span class="line">    <span class="keyword">if</span> (<span class="keyword">not</span> <span class="built_in">isinstance</span>(mod_json, <span class="built_in">dict</span>)) <span class="keyword">or</span> mod_json.get(<span class="string">&quot;Module&quot;</span>, <span class="string">&quot;&quot;</span>) == <span class="string">&quot;&quot;</span>:</span><br><span class="line">        logging.error(<span class="string">&quot;请以 go mod edit -json | python3 %s 运行&quot;</span>, __file__)</span><br><span class="line">        os._exit(<span class="number">2</span>)</span><br><span class="line">    replace_repo(args.repo, mod_json, args.mod)</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&quot;__main__&quot;</span>:</span><br><span class="line">    main()</span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;p&gt;A gitlab subgroup problem I recently solved for a colleague.&lt;/p&gt;</summary>
    
    
    
    
    <category term="golang" scheme="http://zhangguanzhang.github.io/tags/golang/"/>
    
    <category term="gitlab" scheme="http://zhangguanzhang.github.io/tags/gitlab/"/>
    
    <category term="subgroup" scheme="http://zhangguanzhang.github.io/tags/subgroup/"/>
    
  </entry>
  
  <entry>
    <title>A Beginner-Friendly Guide to kubernetes Certificates</title>
    <link href="http://zhangguanzhang.github.io/2025/06/02/kubernetes-cert/"/>
    <id>http://zhangguanzhang.github.io/2025/06/02/kubernetes-cert/</id>
    <published>2025-06-02T20:40:30.000Z</published>
    <updated>2025-06-02T20:40:30.000Z</updated>
    
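    <content type="html"><![CDATA[<p>Several colleagues recently ran into certificate problems, so here is a write-up on k8s certificates from a beginner point of view.</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>Many people are intimidated by k8s certificates and kubeconfig, and have no idea where to start when a certificate expires or a related error shows up. There are articles and blog posts about certificates, but long stretches of theory lose many readers. What people actually need are concrete steps and a direction to follow when solving a problem.</p><h2 id="理论部分"><a href="#理论部分" class="headerlink" title="理论部分"></a>Theory</h2><p>A brief explanation of the theory behind certificates.</p><h3 id="双向-SSL"><a href="#双向-SSL" class="headerlink" title="双向 SSL"></a>Mutual SSL</h3><p>To visit an https website, the target web server must be configured with an ssl certificate. Such a certificate comes from one of two sources:</p><ol><li>A CA (certificate authority) uses its private key to sign its root CA certificate (which carries the public key). Browsers and operating systems ship with these root CA certificates, and only certificates signed by such a CA get the green padlock.</li><li>A certificate signed by a CA private key that you generated yourself with a certificate tool or a library that follows the certificate standards. The browser shows a red warning (connection not secure&#x2F;invalid certificate).</li></ol><p>k8s uses mutual SSL based on the X.509 V3 standard: when a client and a server communicate, each side verifies the certificate of the other by checking whether both were signed by the same CA. The CA private key is generated by yourself, and you can see that the cmdline of every k8s component specifies parameters such as <code>ca-file|cert-file</code>.</p><h3 id="证书建议相关"><a href="#证书建议相关" class="headerlink" title="证书建议相关"></a>Certificate recommendations</h3><h4 id="时间"><a href="#时间" class="headerlink" title="时间"></a>Validity period</h4><p>Whether you use openssl or cfssl, it is recommended to set a long validity period:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ openssl req -x509 ... 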
-days 10000</span><br><span class="line">$ cat ca-config.json</span><br><span class="line">...</span><br><span class="line">  &quot;expiry&quot;: &quot;876000h&quot;</span><br></pre></td></tr></table></figure><p>而对于 kubeadm，网上有修改编译的，或者 go build 的时候注入覆盖默认的时间的，自行搜索。</p><h4 id="certSAN"><a href="#certSAN" class="headerlink" title="certSAN"></a>certSAN</h4><p>k8s 里 <code>kube-apiserver</code> 和 <code>etcd</code> 都是部署在多个机器上实现高可用的，在 <code>openssl/cfssl/kubeadm</code> 里推荐加 IP 以外还要加域名以防后续换 IP 相关：</p><ul><li>openssl 配置文件参考 <a href="https://github.com/kubernetes-sigs/kubespray/blob/master/roles/etcd/templates/openssl.conf.j2">kubernetes-sigs&#x2F;kubespray 的 openssl.conf</a></li><li>cfssl 使用的 json 文件里的 <code>hosts</code> 字段</li><li>kubeadm 的 <code>certSANs</code> 字段</li></ul><p>这里以 ipv4 下 kubeadm 的指定 yml 创建集群来举例一般写那些：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span 
class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line">$ cat initconfig.yaml</span><br><span class="line">apiServer:</span><br><span class="line">  certSANs:</span><br><span class="line">  - 10.96.0.1 # service cidr的第一个ip</span><br><span class="line">  - 127.0.0.1 # 多个master的时候负载均衡出问题了能够快速使用localhost调试</span><br><span class="line">  - localhost</span><br><span class="line">  - apiserver.k8s.local # 负载均衡的域名或者vip</span><br><span class="line">  - 172.19.0.2 # 三台 kube-apiserver 的 IP</span><br><span class="line">  - 172.19.0.3</span><br><span class="line">  - 172.19.0.4</span><br><span class="line">  - apiserver01.k8s.local </span><br><span class="line">  - apiserver02.k8s.local</span><br><span class="line">  - apiserver03.k8s.local</span><br><span class="line">  - apiserver04.k8s.local # 预留域名</span><br><span class="line">  - apiserver05.k8s.local</span><br><span class="line">  - master</span><br><span class="line">  - kubernetes</span><br><span class="line">  - kubernetes.default</span><br><span class="line">  - kubernetes.default.svc</span><br><span class="line">  - kubernetes.default.svc.cluster.local # 集群内 dns search 和 clusterDomain</span><br><span class="line">...</span><br><span class="line">etcd: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd</span><br><span class="line">  local:</span><br><span class="line">    serverCertSANs:</span><br><span class="line">    - localhost</span><br><span class="line">    - 127.0.0.1</span><br><span class="line">    - 172.19.0.2</span><br><span class="line">    - 172.19.0.3</span><br><span class="line">    - 172.19.0.4</span><br><span class="line">    - 
etcd01.k8s.local</span><br><span class="line">    - etcd02.k8s.local</span><br><span class="line">    - etcd03.k8s.local</span><br><span class="line">    - etcd04.k8s.local # 预留域名</span><br><span class="line">    - etcd05.k8s.local</span><br><span class="line">    peerCertSANs:</span><br><span class="line">    - localhost</span><br><span class="line">    - 127.0.0.1</span><br><span class="line">    - 172.19.0.2</span><br><span class="line">    - 172.19.0.3</span><br><span class="line">    - 172.19.0.4</span><br><span class="line">    - etcd01.k8s.local</span><br><span class="line">    - etcd02.k8s.local</span><br><span class="line">    - etcd03.k8s.local</span><br><span class="line">    - etcd04.k8s.local # 预留域名</span><br><span class="line">    - etcd05.k8s.local</span><br></pre></td></tr></table></figure><p>上面只列举 IPv4 的，如果后续有双栈啥的可以预先写上，如果写漏了域名和 IP，管理组件或者 pod 内通过 SDK 访问 kube-apiserver 的时候会报错：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Unable to connect to the server: tls: failed to verify certificate: x509: certificate is valid for 10.96.0.1, yyyy, not xxxx</span><br></pre></td></tr></table></figure><p>也就是证书的 certSANs 只有 <code>10.96.0.1, yyyy</code> 而没有 <code>xxxx</code>，可以使用原有 CA 证书签署下。命令行查看证书的 certSAN 可以使用 openssl ，编程语言的话推荐去使用库：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">一般会把证书放在 /etc/kubernetes/pki 找不到的找 apiserver cmdline 参数路径</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -noout -text -<span class="keyword">in</span> apiserver.crt</span></span><br><span class="line">X509v3 Subject Alternative Name:</span><br><span 
class="line">                DNS:localhost, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP Address:10.96.0.1, IP Address: 172.19.0.2</span><br></pre></td></tr></table></figure><p>openssl 上面的输出里很多信息，还包含证书过期时间，而且 openssl x509 下很多选项的：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">利用 -certopt 和 -text 配合只打印 certSANs</span></span><br><span class="line">openssl x509 -noout  -in apiserver.crt  -certopt no_subject,no_header,no_version,no_serial,no_signame,no_validity,no_issuer,no_pubkey,no_sigdump,no_aux -text</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">只查看证书时间</span></span><br><span class="line">openssl x509 -noout  -in apiserver.crt -dates</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">只展示 subject</span></span><br><span class="line">openssl x509 -noout  -in apiserver.crt  -subject</span><br></pre></td></tr></table></figure><p>只有同一套 ca 签署证书才符合要求，可以使用 openssl 命令检查：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">检查 apiserver.crt 是否是由 ca.key 签署</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl verify -CAfile ca.crt 
apiserver.crt</span></span><br><span class="line">apiserver.crt: OK</span><br></pre></td></tr></table></figure><h3 id="k8s-role-和-RBAC"><a href="#k8s-role-和-RBAC" class="headerlink" title="k8s role 和 RBAC"></a>k8s role 和 RBAC</h3><p>双向 TLS 过去了，但是具体权限控制 k8s 怎么做的呢，就是 X.509 证书签署（Subject 字段内）的 <code>CN(Common Name)</code> 与 <code>O(Organization)</code> 字段，对应 <code>User Name</code> 和 <code>Group</code>，也就是 k8s 的 RBAC：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl get clusterrolebinding</span><br><span class="line">NAME                                                   ROLE                                                                               AGE</span><br><span class="line">cluster-admin                                          ClusterRole/cluster-admin                                                          81m</span><br><span class="line">kubeadm:get-nodes                                      ClusterRole/kubeadm:get-nodes                                                      81m</span><br><span class="line">kubeadm:kubelet-bootstrap                              ClusterRole/system:node-bootstrapper                                               81m</span><br><span class="line">kubeadm:node-autoapprove-bootstrap                     
ClusterRole/system:certificates.k8s.io:certificatesigningrequests:nodeclient       81m</span><br><span class="line">kubeadm:node-autoapprove-certificate-rotation          ClusterRole/system:certificates.k8s.io:certificatesigningrequests:selfnodeclient   81m</span><br><span class="line">kubeadm:node-proxier                                   ClusterRole/system:node-proxier                                                    81m</span><br><span class="line">...</span><br><span class="line">system:coredns                                         ClusterRole/system:coredns                                                         81m</span><br><span class="line">system:discovery                                       ClusterRole/system:discovery                                                       81m</span><br><span class="line">system:kube-controller-manager                         ClusterRole/system:kube-controller-manager                                         81m</span><br><span class="line">system:kube-dns                                        ClusterRole/system:kube-dns                                                        81m</span><br><span class="line">system:kube-scheduler                                  ClusterRole/system:kube-scheduler                                                  81m</span><br><span class="line">system:monitoring                                      ClusterRole/system:monitoring                                                      81m</span><br><span class="line">system:node                                            ClusterRole/system:node                                                            81m</span><br><span class="line">system:node-proxier                                    ClusterRole/system:node-proxier                                                    81m</span><br><span class="line">system:public-info-viewer                              ClusterRole/system:public-info-viewer                                              
81m</span><br><span class="line">system:service-account-issuer-discovery                ClusterRole/system:service-account-issuer-discovery                                81m</span><br><span class="line">system:volume-scheduler                                ClusterRole/system:volume-scheduler                                                81m</span><br></pre></td></tr></table></figure><p>After kube-apiserver starts, it creates the <code>clusterrolebinding</code> objects listed above. kubectl is essentially just a client that uses a kubeconfig to access <code>kube-apiserver</code>. To find out which <code>kubeconfig</code> kubectl is currently using, pass the <code>-v</code> option that every k8s binary accepts on its cmdline and read it from the verbose output:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl config view -v=6</span></span><br><span class="line">I0604 10:40:46.836786   14513 loader.go:395] Config loaded from file:  /root/.kube/config</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>The output above shows it is <code>/root/.kube/config</code>. Taking kubeadm as an example, this file contains the certificate data:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ cat /root/.kube/config</span><br><span class="line">certificate-authority-data  base64-encoded content of /etc/kubernetes/pki/ca.crt</span><br><span class="line">client-certificate-data     base64-encoded content of /etc/kubernetes/pki/admin.crt</span><br><span class="line">client-key-data             base64-encoded content of /etc/kubernetes/pki/admin.key</span><br></pre></td></tr></table></figure><p>The last two are already embedded, so the admin.crt and admin.key files do not exist on disk. In some self-built clusters these entries are file paths instead, because the kubeconfig was generated with kubectl config without the <code>--embed-certs</code> option. The steps to embed them are:</p><figure class="highlight shell"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">kubectl --kubeconfig /etc/kubernetes/admin.conf config set-credentials admin \</span><br><span class="line">        --client-certificate=/etc/kubernetes/pki/admin.crt \</span><br><span class="line">        --embed-certs=true \</span><br><span class="line">        --client-key=/etc/kubernetes/pki/admin.key</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash"><span class="built_in">rm</span> -f /etc/kubernetes/pki/admin.???</span></span><br></pre></td></tr></table></figure><p>The golang code of kubeadm never writes these files to disk and generates the yml content directly. We can extract the certificate from the kubeconfig and inspect it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -<span class="keyword">in</span> &lt;(kubectl config view --raw -o jsonpath=<span class="string">&quot;&#123;.users[0][&#x27;user&#x27;][&#x27;client-certificate-data&#x27;]&#125;&quot;</span> | <span class="built_in">base64</span> -d ) -noout -text</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -<span class="keyword">in</span> &lt;(kubectl config view --raw -o jsonpath=<span class="string">&quot;&#123;.users[0][&#x27;user&#x27;][&#x27;client-certificate-data&#x27;]&#125;&quot;</span> | <span class="built_in">base64</span> -d ) -noout  -subject</span></span><br><span class="line">subject= /O=system:masters/CN=kubernetes-admin</span><br></pre></td></tr></table></figure><p>We can see <code>O=system:masters</code>, which matches the subject of 
<code>clusterrolebinding cluster-admin</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl get clusterrolebinding cluster-admin -o yaml</span></span><br><span class="line">apiVersion: rbac.authorization.k8s.io/v1</span><br><span class="line">kind: ClusterRoleBinding</span><br><span class="line">metadata:</span><br><span class="line">  annotations:</span><br><span class="line">    rbac.authorization.kubernetes.io/autoupdate: &quot;true&quot;</span><br><span class="line">  creationTimestamp: &quot;2025-06-03T07:36:44Z&quot;</span><br><span class="line">  labels:</span><br><span class="line">    kubernetes.io/bootstrapping: rbac-defaults</span><br><span class="line">  name: cluster-admin</span><br><span class="line">  resourceVersion: &quot;160&quot;</span><br><span class="line">  uid: d5638680-38de-4010-a2c9-084645a8ad21</span><br><span class="line">roleRef:</span><br><span class="line">  apiGroup: rbac.authorization.k8s.io</span><br><span class="line">  kind: ClusterRole</span><br><span class="line">  name: cluster-admin</span><br><span class="line">subjects:</span><br><span class="line">- apiGroup: rbac.authorization.k8s.io</span><br><span class="line">  kind: Group</span><br><span class="line"> 
 name: system:masters</span><br></pre></td></tr></table></figure><p>In other words, the group <code>system:masters</code> has the permissions of the clusterrole <code>cluster-admin</code>.</p><p>References for this subsection:</p><ul><li><a href="https://kubernetes.io/zh-cn/docs/reference/access-authn-authz/rbac/">Using RBAC Authorization</a></li><li><a href="https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/implementation-details/">kubeadm implementation details</a></li></ul><h2 id="实战"><a href="#实战" class="headerlink" title="实战"></a>Hands-on</h2><p>With the theory covered, it is time for some practice: use the <code>CN(Common Name)</code> and <code>O(Organization)</code> fields of a certificate to create two certificates with different permissions and test them:</p><ul><li>User test1 can list pods in the default ns</li><li>Group test2 can list pods in all ns</li></ul><p>To avoid differences in paths, certificate names, file suffixes and personal habits, the hands-on part uses cfssl inside the directory layout left by kubeadm after initialization.</p><p>Make a habit of taking a backup before touching certificates, whether they are corrupted or expired:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">cd /etc/kubernetes</span><br><span class="line">cp -a pki pki.bak</span><br></pre></td></tr></table></figure><h3 id="创建证书"><a href="#创建证书" class="headerlink" title="创建证书"></a>Create the certificate</h3><p>Create the config files:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line">cd /etc/kubernetes/pki/</span><br><span class="line"></span><br><span class="line"># 创建 ca 签署证书签名配置文件，因为该证书是只 client 使用，不需要在 usages 里带 &quot;server auth&quot;</span><br><span class="line"># 如果所有证书手动生成时候用同一个 ca-config.json 可以偷懒带上 &quot;server auth&quot;</span><br><span class="line"></span><br><span class="line">cat &gt; ca-config.json &lt;&lt; EOF</span><br><span class="line">&#123;</span><br><span class="line">  &quot;signing&quot;: &#123;</span><br><span class="line">    &quot;default&quot;: &#123;</span><br><span class="line">      &quot;expiry&quot;: &quot;876000h&quot;</span><br><span class="line">    &#125;,</span><br><span class="line">    &quot;profiles&quot;: &#123;</span><br><span class="line">      &quot;kubernetes&quot;: &#123;</span><br><span class="line">        &quot;usages&quot;: [</span><br><span class="line">            &quot;signing&quot;,</span><br><span class="line">            &quot;key encipherment&quot;,</span><br><span class="line">            &quot;client auth&quot;</span><br><span class="line">        ],</span><br><span class="line">        &quot;expiry&quot;: &quot;876000h&quot;</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line">EOF</span><br><span class="line"></span><br><span class="line"># cn 
对应 user o 对应 group</span><br><span class="line">cat &gt; test1-csr.json &lt;&lt; EOF</span><br><span class="line">&#123;</span><br><span class="line">  &quot;CN&quot;: &quot;test1&quot;,</span><br><span class="line">  &quot;hosts&quot;: [],</span><br><span class="line">  &quot;key&quot;: &#123;</span><br><span class="line">    &quot;algo&quot;: &quot;rsa&quot;,</span><br><span class="line">    &quot;size&quot;: 2048</span><br><span class="line">  &#125;,</span><br><span class="line">  &quot;names&quot;: [</span><br><span class="line">    &#123;</span><br><span class="line">      &quot;O&quot;: &quot;test1&quot;,</span><br><span class="line">      &quot;OU&quot;: &quot;System&quot;</span><br><span class="line">    &#125;</span><br><span class="line">  ]</span><br><span class="line">&#125;</span><br><span class="line">EOF</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>签署证书：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">cfssl gencert \</span><br><span class="line">        -ca=ca.crt \</span><br><span class="line">        -ca-key=ca.key \</span><br><span class="line">        -config=ca-config.json \</span><br><span class="line">        -profile=kubernetes test1-csr.json | cfssljson -bare test1</span><br></pre></td></tr></table></figure><p>测试证书权限：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">避免 kubeconfig 干扰，改名下家目录文件</span></span><br><span class="line"><span 
class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">mv</span> ~/.kube/config ~/.kube/config.bak</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">KUBECONFIG= kubectl --kubeconfig /dev/null --server=https://xxx:6443 \</span></span><br><span class="line"><span class="language-bash">  --certificate-authority=/etc/kubernetes/pki/ca.crt \</span></span><br><span class="line"><span class="language-bash">  --client-certificate=/etc/kubernetes/pki/test1.pem \</span></span><br><span class="line"><span class="language-bash">  --client-key=/etc/kubernetes/pki/test1-key.pem get pod</span></span><br><span class="line">Error from server (Forbidden): pods is forbidden: User &quot;test1&quot; cannot list resource &quot;pods&quot; in API group &quot;&quot; in the namespace &quot;default&quot;</span><br></pre></td></tr></table></figure><p>kube-apiserver is essentially an HTTP&#x2F;gRPC server, so we can also test it with curl:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">curl -X GET \</span></span><br><span class="line"><span class="language-bash">  --cacert /etc/kubernetes/pki/ca.crt \</span></span><br><span class="line"><span class="language-bash">  --cert /etc/kubernetes/pki/test1.pem \</span></span><br><span class="line"><span 
class="language-bash">  --key /etc/kubernetes/pki/test1-key.pem \</span></span><br><span class="line"><span class="language-bash">  -H <span class="string">&quot;Accept: application/json&quot;</span> \</span></span><br><span class="line"><span class="language-bash">  <span class="string">&quot;https://xxx:6443/api/v1/namespaces/default/pods?limit=500&quot;</span></span></span><br><span class="line">&#123;</span><br><span class="line">  &quot;kind&quot;: &quot;Status&quot;,</span><br><span class="line">  &quot;apiVersion&quot;: &quot;v1&quot;,</span><br><span class="line">  &quot;metadata&quot;: &#123;&#125;,</span><br><span class="line">  &quot;status&quot;: &quot;Failure&quot;,</span><br><span class="line">  &quot;message&quot;: &quot;pods is forbidden: User \&quot;test1\&quot; cannot list resource \&quot;pods\&quot; in API group \&quot;\&quot; in the namespace \&quot;default\&quot;&quot;,</span><br><span class="line">  &quot;reason&quot;: &quot;Forbidden&quot;,</span><br><span class="line">  &quot;details&quot;: &#123;</span><br><span class="line">    &quot;kind&quot;: &quot;pods&quot;</span><br><span class="line">  &#125;,</span><br><span class="line">  &quot;code&quot;: 403</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This is because we have not created any <code>RBAC</code> rules for <code>test1</code>, that is, a <code>rolebinding</code>. Create one and try again:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">mv</span> ~/.kube/config.bak ~/.kube/config</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> &gt; test1-rbac.yml &lt;&lt; <span class="string">EOF</span></span></span><br><span class="line">apiVersion: rbac.authorization.k8s.io/v1</span><br><span class="line">kind: Role</span><br><span class="line">metadata:</span><br><span class="line">  name: test1-role</span><br><span class="line">  namespace: default</span><br><span class="line">rules:</span><br><span class="line">- apiGroups:</span><br><span class="line">  - &quot;&quot;</span><br><span class="line">  resources:</span><br><span class="line">  - pods</span><br><span class="line">  verbs:</span><br><span class="line">  - get</span><br><span class="line">  - list</span><br><span class="line">---</span><br><span class="line">apiVersion: rbac.authorization.k8s.io/v1</span><br><span class="line">kind: RoleBinding</span><br><span class="line">metadata:</span><br><span class="line">  name: test1-rolebinding</span><br><span class="line">  namespace: default</span><br><span class="line">roleRef:</span><br><span class="line">  apiGroup: rbac.authorization.k8s.io</span><br><span class="line">  kind: Role</span><br><span class="line">  name: test1-role</span><br><span class="line">subjects:</span><br><span class="line">- apiGroup: 
rbac.authorization.k8s.io</span><br><span class="line">  kind: User</span><br><span class="line">  name: test1</span><br><span class="line">EOF</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="string">kubectl apply -f test1-rbac.yml</span></span></span><br></pre></td></tr></table></figure><p>Test again: pods in the default namespace can now be listed, but svc cannot:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">mv</span> ~/.kube/config ~/.kube/config.bak</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">KUBECONFIG= kubectl --kubeconfig /dev/null --server=https://xxx:6443 \</span></span><br><span class="line"><span class="language-bash">  --certificate-authority=/etc/kubernetes/pki/ca.crt \</span></span><br><span class="line"><span class="language-bash">  --client-certificate=/etc/kubernetes/pki/test1.pem \</span></span><br><span class="line"><span class="language-bash">  --client-key=/etc/kubernetes/pki/test1-key.pem get pod</span></span><br><span class="line">No resources found in default namespace.</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">KUBECONFIG= kubectl --server=https://xxx:6443   --certificate-authority=/etc/kubernetes/pki/ca.crt   --client-certificate=/etc/kubernetes/pki/test1.pem   --client-key=/etc/kubernetes/pki/test1-key.pem get svc</span></span><br><span class="line">Error 
from server (Forbidden): services is forbidden: User &quot;test1&quot; cannot list resource &quot;services&quot; in API group &quot;&quot; in the namespace &quot;default&quot;</span><br></pre></td></tr></table></figure><p>Run the same test with curl:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">curl -X GET  \</span></span><br><span class="line"><span class="language-bash">  --cacert /etc/kubernetes/pki/ca.crt  \</span></span><br><span class="line"><span class="language-bash">  --cert /etc/kubernetes/pki/test1.pem  \</span></span><br><span class="line"><span class="language-bash">  --key /etc/kubernetes/pki/test1-key.pem  \</span></span><br><span class="line"><span class="language-bash">   -H <span class="string">&quot;Accept: application/json&quot;</span>   <span 
class="string">&quot;https://10.xxx.xx.xxx:6443/api/v1/namespaces/default/pods?limit=500&quot;</span></span></span><br><span class="line">&#123;</span><br><span class="line">  &quot;kind&quot;: &quot;PodList&quot;,</span><br><span class="line">  &quot;apiVersion&quot;: &quot;v1&quot;,</span><br><span class="line">  &quot;metadata&quot;: &#123;</span><br><span class="line">    &quot;resourceVersion&quot;: &quot;107116&quot;</span><br><span class="line">  &#125;,</span><br><span class="line">  &quot;items&quot;: []</span><br><span class="line">&#125;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">curl -X GET  \</span></span><br><span class="line"><span class="language-bash">  --cacert /etc/kubernetes/pki/ca.crt  \</span></span><br><span class="line"><span class="language-bash">  --cert /etc/kubernetes/pki/test1.pem  \</span></span><br><span class="line"><span class="language-bash">  --key /etc/kubernetes/pki/test1-key.pem  \</span></span><br><span class="line"><span class="language-bash">   -H <span class="string">&quot;Accept: application/json&quot;</span>   <span class="string">&quot;https://10.xxx.xx.xxx:6443/api/v1/namespaces/default/services?limit=500&quot;</span></span></span><br><span class="line">&#123;</span><br><span class="line">  &quot;kind&quot;: &quot;Status&quot;,</span><br><span class="line">  &quot;apiVersion&quot;: &quot;v1&quot;,</span><br><span class="line">  &quot;metadata&quot;: &#123;&#125;,</span><br><span class="line">  &quot;status&quot;: &quot;Failure&quot;,</span><br><span class="line">  &quot;message&quot;: &quot;services is forbidden: User \&quot;test1\&quot; cannot list resource \&quot;services\&quot; in API group \&quot;\&quot; in the namespace \&quot;default\&quot;&quot;,</span><br><span class="line">  &quot;reason&quot;: &quot;Forbidden&quot;,</span><br><span class="line">  &quot;details&quot;: &#123;</span><br><span 
class="line">    &quot;kind&quot;: &quot;services&quot;</span><br><span class="line">  &#125;,</span><br><span class="line">  &quot;code&quot;: 403</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>可以看到证书权限符合预期，kubeconfig 里可以包含多个配置段的，前面的证书生成 kubeconfig 可以使用 kubectl ，按照下面步骤生成对应配置段：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">设置集群参数，指定CA证书和apiserver地址</span></span><br><span class="line">kubectl --kubeconfig=test1.kubeconfig config set-cluster kubernetes \</span><br><span class="line">    --certificate-authority=/etc/kubernetes/pki/ca.crt \</span><br><span class="line">    --embed-certs=true \</span><br><span class="line">    --server=https://xxx:6443</span><br><span class="line">        </span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">设置客户端认证参数，指定使用证书和私钥</span></span><br><span class="line">kubectl --kubeconfig=test1.kubeconfig config set-credentials test1 \</span><br><span class="line">    --client-certificate=test1.pem \</span><br><span class="line">    --embed-certs=true \</span><br><span class="line">    --client-key=test1-key.pem</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span 
class="language-bash">Add a context named kubernetes that uses the cluster kubernetes added above and the credentials named test1</span></span><br><span class="line">kubectl --kubeconfig=test1.kubeconfig config set-context kubernetes \</span><br><span class="line">    --cluster=kubernetes --user=test1</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Select the default context</span></span><br><span class="line">kubectl --kubeconfig=test1.kubeconfig config use-context kubernetes</span><br></pre></td></tr></table></figure><p>Then test with this kubeconfig:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">[root@zgz pki]# kubectl --kubeconfig=test1.kubeconfig get pod</span><br><span class="line">No resources found in default namespace.</span><br><span class="line">[root@zgz pki]# kubectl --kubeconfig=test1.kubeconfig get svc</span><br><span class="line">Error from server (Forbidden): services is forbidden: User &quot;test1&quot; cannot list resource &quot;services&quot; in API group &quot;&quot; in the namespace &quot;default&quot;</span><br></pre></td></tr></table></figure><p>The group test2 works the same way; just pay attention to the <code>O</code> field. After that come <code>clusterrole</code> and <code>clusterrolebinding</code>, which are left as an exercise.</p><h3 id="kube-apiserver-的-certSAN"><a href="#kube-apiserver-的-certSAN" class="headerlink" title="kube-apiserver 的 certSAN"></a>certSAN of kube-apiserver</h3><p>This part deals with the kube-apiserver certificate (the same steps also work when it has expired). For example, many people hit this error after kubeadm init because the certSAN is missing a host:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td 
class="code"><pre><span class="line"></span><br><span class="line">$ echo &#x27;127.0.0.1 santest&#x27; &gt;&gt; /etc/hosts</span><br><span class="line">$ kubectl --server https://santest:6443 get pod</span><br><span class="line">...</span><br><span class="line">Unable to connect to the server: tls: </span><br><span class="line">    failed to verify certificate: x509: </span><br><span class="line">    certificate is valid for kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, node, not santest</span><br></pre></td></tr></table></figure><p>In this case, use the CA to sign a new certificate that includes santest:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span 
class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br></pre></td><td class="code"><pre><span class="line">cd /etc/kubernetes/pki/</span><br><span class="line"></span><br><span class="line"># Create the CA signing config. This certificate is used only as a server certificate, so usages does not need &quot;client auth&quot;</span><br><span class="line"># If every manually generated certificate shares one ca-config.json, you can take the shortcut of also including &quot;client auth&quot;</span><br><span class="line"></span><br><span class="line">cat &gt; ca-config.json &lt;&lt; EOF</span><br><span class="line">&#123;</span><br><span class="line">  &quot;signing&quot;: &#123;</span><br><span class="line">    &quot;default&quot;: &#123;</span><br><span class="line">      &quot;expiry&quot;: &quot;876000h&quot;</span><br><span class="line">    &#125;,</span><br><span class="line">    &quot;profiles&quot;: &#123;</span><br><span class="line">      &quot;kubernetes&quot;: &#123;</span><br><span class="line">        &quot;usages&quot;: [</span><br><span class="line">            &quot;signing&quot;,</span><br><span class="line">            &quot;key encipherment&quot;,</span><br><span class="line">            &quot;server auth&quot;</span><br><span class="line">        ],</span><br><span class="line">        &quot;expiry&quot;: 
&quot;876000h&quot;</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line">EOF</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"># Inspect the existing certSAN</span><br><span class="line">openssl x509 -noout  -in apiserver.crt  -certopt no_subject,no_header,no_version,no_serial,no_signame,no_validity,no_issuer,no_pubkey,no_sigdump,no_aux -text</span><br><span class="line"></span><br><span class="line"># Write the existing entries from above plus the ones to add into the file</span><br><span class="line">cat &gt; kubernetes-csr.json &lt;&lt; EOF</span><br><span class="line">&#123;</span><br><span class="line">  &quot;CN&quot;: &quot;kube-apiserver&quot;,</span><br><span class="line">  &quot;hosts&quot;: [</span><br><span class="line">    &quot;127.0.0.1&quot;,</span><br><span class="line">    &quot;::1&quot;,</span><br><span class="line">    &quot;localhost&quot;,</span><br><span class="line">    &quot;santest&quot;,</span><br><span class="line">    &quot;10.xx&quot;,</span><br><span class="line">    &quot;10.96.0.1&quot;,</span><br><span class="line">    &quot;kubernetes&quot;,</span><br><span class="line">    &quot;kubernetes.default&quot;,</span><br><span class="line">    &quot;kubernetes.default.svc&quot;,</span><br><span class="line">    &quot;kubernetes.default.svc.cluster&quot;,</span><br><span class="line">    &quot;kubernetes.default.svc.cluster.local&quot;</span><br><span class="line">  ],</span><br><span class="line">  &quot;key&quot;: &#123;</span><br><span class="line">    &quot;algo&quot;: &quot;rsa&quot;,</span><br><span class="line">    &quot;size&quot;: 2048</span><br><span class="line">  &#125;,</span><br><span class="line">  &quot;names&quot;: [</span><br><span class="line">    &#123;</span><br><span class="line">      &quot;O&quot;: &quot;k8s&quot;,</span><br><span class="line">      &quot;OU&quot;: &quot;Kubernetes&quot;</span><br><span class="line"> 
   &#125;</span><br><span class="line">  ]</span><br><span class="line">&#125;</span><br><span class="line">EOF</span><br><span class="line"></span><br><span class="line"># Sign the certificate</span><br><span class="line"></span><br><span class="line">cfssl gencert \</span><br><span class="line">        -ca=ca.crt \</span><br><span class="line">        -ca-key=ca.key \</span><br><span class="line">        -config=ca-config.json \</span><br><span class="line">        -profile=kubernetes kubernetes-csr.json | cfssljson -bare apiserver2</span><br></pre></td></tr></table></figure><p>Change the kube-apiserver command line to use <code>apiserver2.pem</code> and <code>apiserver2-key.pem</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cd</span> /etc/kubernetes/manifests/</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">grep -E <span class="string">&#x27;apiserver.(crt|key)&#x27;</span> /etc/kubernetes/manifests/kube-apiserver.yaml</span></span><br><span class="line">    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt</span><br><span class="line">    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">sed -ri -e <span class="string">&#x27;s#/apiserver.crt#/apiserver2.pem#&#x27;</span> -e <span class="string">&#x27;s#/apiserver.key#/apiserver2-key.pem#&#x27;</span> /etc/kubernetes/manifests/kube-apiserver.yaml</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">grep -E -- <span 
class="string">&#x27;--tls-(cert|private)&#x27;</span> /etc/kubernetes/manifests/kube-apiserver.yaml</span></span><br><span class="line">    - --tls-cert-file=/etc/kubernetes/pki/apiserver2.pem</span><br><span class="line">    - --tls-private-key-file=/etc/kubernetes/pki/apiserver2-key.pem</span><br></pre></td></tr></table></figure><p>然后用上面的 santest 域名测试：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl --server https://santest:6443 get pod</span><br><span class="line">No resources found in default namespace.</span><br><span class="line">$ kubectl --server https://santest:6443 get svc</span><br><span class="line">NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE</span><br><span class="line">kubernetes   ClusterIP   10.96.0.1    &lt;none&gt;        443/TCP   1h</span><br></pre></td></tr></table></figure><h2 id="故障案例"><a href="#故障案例" class="headerlink" title="故障案例"></a>故障案例</h2><h3 id="kubectl-证书过期"><a href="#kubectl-证书过期" class="headerlink" title="kubectl 证书过期"></a>kubectl 证书过期</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl apply -f /tmp/test-svc.yml</span></span><br><span class="line">... 
x509: certificate has expired or is not yet valid: current time 2025-05-20T23:25:51+08:00 is after 2025-01-16T02:16:34Z</span><br></pre></td></tr></table></figure><p>Check the expiry time of the certificate embedded in the kubeconfig:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -<span class="keyword">in</span> &lt;(kubectl config view --raw -o jsonpath=<span class="string">&quot;&#123;.users[0][&#x27;user&#x27;][&#x27;client-certificate-data&#x27;]&#125;&quot;</span> | <span class="built_in">base64</span> -d ) -noout -enddate</span></span><br><span class="line">notAfter=Jan 16 02:16:34 2025 GMT</span><br></pre></td></tr></table></figure><p>Meanwhile, <code>admin.pem</code> had already been re-signed on site and has not expired:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -<span class="keyword">in</span> admin.pem -noout -dates</span></span><br><span class="line">notBefore=Jan 17 03:05:00 2024 GMT</span><br><span class="line">notAfter=Dec 24 03:05:00 2123 GMT</span><br></pre></td></tr></table></figure><p>So generating a new kubeconfig with this certificate embedded is enough:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span 
class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Our certificate suffixes and file paths differ from yours, so do not copy this verbatim; specifying the kubeconfig via <span class="built_in">env</span> or on the command line generates the same result</span></span><br><span class="line">cd /etc/kubernetes/cluster1/ssl</span><br><span class="line"></span><br><span class="line">KUBECONFIG=/etc/kubernetes/cluster1/.kube/config2 \</span><br><span class="line">        kubectl config set-cluster kubernetes \</span><br><span class="line">        --certificate-authority=ca.pem \</span><br><span class="line">        --embed-certs=true \</span><br><span class="line">        --server=https://127.0.0.1:8443</span><br><span class="line">        </span><br><span class="line">KUBECONFIG=/etc/kubernetes/cluster1/.kube/config2 \</span><br><span class="line">        kubectl config set-credentials admin \</span><br><span class="line">        --client-certificate=admin.pem \</span><br><span class="line">        --embed-certs=true \</span><br><span class="line">        --client-key=admin-key.pem</span><br><span class="line"></span><br><span class="line">KUBECONFIG=/etc/kubernetes/cluster1/.kube/config2 \</span><br><span class="line">        kubectl config set-context kubernetes \</span><br><span class="line">        --cluster=kubernetes --user=admin</span><br><span class="line"></span><br><span class="line">KUBECONFIG=/etc/kubernetes/cluster1/.kube/config2 \</span><br><span class="line">         kubectl config use-context kubernetes</span><br><span class="line"></span><br><span class="line">KUBECONFIG=/etc/kubernetes/cluster1/.kube/config2 kubectl get node</span><br></pre></td></tr></table></figure><h3 id="deploy-的-rs-创建报错过期"><a href="#deploy-的-rs-创建报错过期" class="headerlink" title="deploy 的 rs 创建报错过期"></a>Creating the rs of a deploy fails with an expired 
certificate error</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl -n default describe deploy deployment-example</span><br><span class="line">Name:                   deployment-example</span><br><span class="line">Namespace:              default</span><br><span class="line">CreationTimestamp:      Fri, 30 May 2025 15:45:03 +0800</span><br><span class="line">Labels:                 &lt;none&gt;</span><br><span class="line">Annotations:            &lt;none&gt;</span><br><span class="line">Selector:               app=nginx</span><br><span class="line">Replicas:               2 desired | 0 updated | 0 total | 0 available | 0 unavailable</span><br><span class="line">StrategyType:           RollingUpdate</span><br><span class="line">MinReadySeconds:        0</span><br><span 
class="line">RollingUpdateStrategy:  25% max unavailable, 25% max surge</span><br><span class="line">Pod Template:</span><br><span class="line">  Labels:  app=nginx</span><br><span class="line">  Containers:</span><br><span class="line">   nginx:</span><br><span class="line">    Image:        nginx:1.19-alpine</span><br><span class="line">    Port:         12343/TCP</span><br><span class="line">    Host Port:    0/TCP</span><br><span class="line">    Environment:  &lt;none&gt;</span><br><span class="line">    Mounts:       &lt;none&gt;</span><br><span class="line">  Volumes:        &lt;none&gt;</span><br><span class="line">Conditions:</span><br><span class="line">  Type           Status  Reason</span><br><span class="line">  ----           ------  ------</span><br><span class="line">  Progressing    False   ReplicaSetCreateError</span><br><span class="line">OldReplicaSets:  &lt;none&gt;</span><br><span class="line">NewReplicaSet:   &lt;none&gt;</span><br><span class="line">Events:</span><br><span class="line">  Type     Reason                 Age                From                   Message</span><br><span class="line">  ----     ------                 ----               ----                   -------</span><br><span class="line">  Warning  ReplicaSetCreateError  21s (x7 over 21s)  deployment-controller  Failed to create new replica set &quot;deployment-example-b4f6c7989&quot;: Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:06:13+08:00 is after 2024-08-28T14:45:36Z</span><br><span class="line">  Warning  ReplicaSetCreateError  20s (x2 over 20s)  deployment-controller  Failed to create new replica set &quot;deployment-example-b4f6c7989&quot;: Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:06:14+08:00 is after 2024-08-28T14:45:36Z</span><br><span 
class="line">  Warning  ReplicaSetCreateError  18s                deployment-controller  Failed to create new replica set &quot;deployment-example-b4f6c7989&quot;: Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:06:16+08:00 is after 2024-08-28T14:45:36Z</span><br><span class="line">  Warning  ReplicaSetCreateError  16s                deployment-controller  Failed to create new replica set &quot;deployment-example-b4f6c7989&quot;: Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:06:18+08:00 is after 2024-08-28T14:45:36Z</span><br><span class="line">  Warning  ReplicaSetCreateError  11s                deployment-controller  Failed to create new replica set &quot;deployment-example-b4f6c7989&quot;: Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:06:23+08:00 is after 2024-08-28T14:45:36Z</span><br><span class="line">  Warning  ReplicaSetCreateError  0s                 deployment-controller  Failed to create new replica set &quot;deployment-example-b4f6c7989&quot;: Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:06:34+08:00 is after 2024-08-28T14:45:36Z</span><br></pre></td></tr></table></figure><p>This environment is a binary deployment, and ReplicaSets are created by <code>kube-controller-manager</code>, so this error has to be traced in the kube-controller-manager logs. The k8s control-plane components use a lease object to guarantee that only one instance is actually doing the work at a time:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl -n 
kube-system get lease</span></span><br><span class="line">NAME                      HOLDER                                                                     AGE</span><br><span class="line">kube-controller-manager   ubuntu-Standard-PC-i440FX-PIIX-1996_296a57fb-a219-4301-a0a6-62c3cd09e0f2   639d</span><br><span class="line">kube-scheduler            ubuntu-Standard-PC-i440FX-PIIX-1996_edd2caff-d647-4633-8bd5-2d9788986e1f   639d</span><br></pre></td></tr></table></figure><p>The holder name is generated as follows:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/kubernetes/kubernetes/blob/v1.29.5/cmd/kube-controller-manager/app/controllermanager.go#L256-L286</span></span><br><span class="line">id, err := os.Hostname()</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// add a uniquifier so that two processes on the same host don&#x27;t accidentally both become active</span></span><br><span class="line">id = id + <span class="string">&quot;_&quot;</span> + <span class="type">string</span>(uuid.NewUUID())</span><br></pre></td></tr></table></figure><p>Every machine turned out to have the same hostname, so there was no way to tell which machine's <code>kube-controller-manager</code> currently holds the lease. In the end I just checked the logs on each one:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">journalctl -xe --no-pager -u kube-controller-manager.service</span> </span><br><span class="line">-- Logs begin at Fri 2025-05-09 14:24:19 CST, end at Fri 2025-05-30 22:21:03 CST. --</span><br><span class="line">May 22 00:01:48 ubuntu-Standard-PC-i440FX-PIIX-1996 kube-controller-manager[53418]: E0522 00:01:48.204891   53418 leaderelection.go:325] error retrieving resource lock kube-system/kube-controller-manager: etcdserver: leader changed</span><br><span class="line"></span><br><span class="line">May 30 22:08:57 ubuntu-Standard-PC-i440FX-PIIX-1996 kube-controller-manager[22314]: E0530 22:08:57.593721   22314 deployment_controller.go:495] Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:08:57+08:00 is after 2024-08-28T14:45:36Z</span><br><span class="line">May 30 22:08:57 ubuntu-Standard-PC-i440FX-PIIX-1996 kube-controller-manager[22314]: I0530 22:08:57.593752   22314 deployment_controller.go:496] Dropping deployment &quot;default/deployment-example&quot; out of the queue: Get &quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas&quot;: x509: certificate has expired or is not yet valid: current time 2025-05-30T22:08:57+08:00 is after 2024-08-28T14:45:36Z</span><br><span class="line">May 30 22:08:57 ubuntu-Standard-PC-i440FX-PIIX-1996 kube-controller-manager[22314]: I0530 22:08:57.593824   22314 event.go:291] &quot;Event occurred&quot; object=&quot;default/deployment-example&quot; kind=&quot;Deployment&quot; apiVersion=&quot;apps/v1&quot; type=&quot;Warning&quot; reason=&quot;ReplicaSetCreateError&quot; message=&quot;Failed to create new replica set \&quot;deployment-example-b4f6c7989\&quot;: Get \&quot;https://[::1]:6443/api/v1/namespaces/default/resourcequotas\&quot;: x509: 
certificate has expired or is not yet valid: current time 2025-05-30T22:08:57+08:00 is after 2024-08-28T14:45:36Z&quot;</span><br></pre></td></tr></table></figure><p>Get the kubeconfig path from the service's command line:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">systemctl <span class="built_in">cat</span> kube-controller-manager.service</span> </span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">/etc/systemd/system/kube-controller-manager.service</span></span><br><span class="line">[Unit]</span><br><span class="line">Description=Kubernetes Controller Manager</span><br><span class="line">Documentation=https://github.com/GoogleCloudPlatform/kubernetes</span><br><span class="line"></span><br><span class="line">[Service]</span><br><span class="line">ExecStart=/data/kube/bin/kube-controller-manager \</span><br><span class="line">  --address=127.0.0.1 \</span><br><span class="line">  --kubeconfig=/etc/kubernetes/cluster1/ssl/kube-controller-manager.kubeconfig \</span><br><span class="line">...</span><br><span class="line">Restart=always</span><br><span class="line">RestartSec=5</span><br><span class="line"></span><br><span class="line">[Install]</span><br><span class="line">WantedBy=multi-user.target</span><br></pre></td></tr></table></figure><p>Check the expiry time of the client certificate:</p><figure class="highlight shell"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl --kubeconfig  /etc/kubernetes/cluster1/ssl/kube-controller-manager.kubeconfig \</span></span><br><span class="line"><span class="language-bash">config view --raw -o jsonpath=<span class="string">&quot;&#123;.users[0][&#x27;user&#x27;][&#x27;client-certificate-data&#x27;]&#125;&quot;</span> | <span class="built_in">base64</span> -d &gt; test.pem</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">openssl x509 -<span class="keyword">in</span> test.pem -noout -enddate</span></span><br><span class="line">notAfter=Aug  5 15:37:00 2123 GMT</span><br></pre></td></tr></table></figure><p>The expiry time was fine, and the certificate on every kube-controller-manager checked out as well. Restarting each <code>kube-controller-manager</code> in turn, with a pause between them, changed nothing; only restarting kube-apiserver fixed it, so this looks like a caching bug in kube-apiserver.</p><h3 id="kubelet-轮转证书"><a href="#kubelet-轮转证书" class="headerlink" title="kubelet 轮转证书"></a>kubelet certificate rotation</h3><p>The certificates live under <code>/var/lib/kubelet</code>. Sometimes kubelet does not rotate them automatically; in that case, back up the certificates in that directory and then restart kubelet. It is also worth configuring kube-controller-manager to give rotated certificates a longer validity period.</p><h2 id="一些其他的"><a href="#一些其他的" class="headerlink" title="一些其他的"></a>Other notes</h2><p>This applies to etcd certificates as well as k8s ones, and the certificates k8s uses to access etcd work in much the same way. If the hands-on section above is hard to follow, you can take a shortcut with the usages in <code>ca-config.json</code>: list both client and server auth as below, so that a single CA config file serves every certificate:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">&quot;usages&quot;: [</span><br><span class="line">    &quot;signing&quot;,</span><br><span class="line">    &quot;key encipherment&quot;,</span><br><span class="line">    &quot;server auth&quot;,</span><br><span class="line">    &quot;client auth&quot;</span><br><span class="line">],</span><br></pre></td></tr></table></figure><p>Read every certificate-related error message and log carefully: an expired certificate, a certSAN mismatch, and a verification failure caused by certificates issued by different CAs are all different problems. Do not blindly follow whatever certificate article or blog post you happen to find. Back up the existing certificates before touching them, and when generating a kubeconfig file, have kubectl write it to a new path instead of modifying the old one.</p>]]></content>
    
    
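    <summary type="html">&lt;p&gt;Colleagues recently ran into several certificate-related problems, so here is a write-up on k8s certificates from a beginner's perspective...&lt;/p&gt;</summary>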
    
    
    
    
    <category term="ssl" scheme="http://zhangguanzhang.github.io/tags/ssl/"/>
    
    <category term="kubernetes" scheme="http://zhangguanzhang.github.io/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>Building an Android app that routes selected apps through a proxy</title>
    <link href="http://zhangguanzhang.github.io/2025/04/27/android-tun2sock/"/>
    <id>http://zhangguanzhang.github.io/2025/04/27/android-tun2sock/</id>
    <published>2025-04-27T20:40:30.000Z</published>
    <updated>2025-04-27T20:40:30.000Z</updated>
    
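    <content type="html"><![CDATA[<p>Notes on recently developing an app that routes selected apps through an http proxy...</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>I wanted to run some network verification. The initial idea was to take a golang socks5 proxy server, deserialize the byte stream of each conn on it, and act on the result. I then realized that with many devices and no proxy pool, practically all traffic would share a single source IP, so I started looking for proxy software on the phone that can route selected apps through a proxy without root.</p><p>I then remembered that <code>v2rxxNG</code> can route selected apps through a proxy. After studying and tinkering with it for a while, I ended up learning kotlin and Android compose to develop an Android app of my own.</p><h2 id="底层探讨经过"><a href="#底层探讨经过" class="headerlink" title="底层探讨经过"></a>Exploring the underlying mechanism</h2><p>I first skimmed the <code>v2rxxNG</code> code and found that it uses the <code>tun2socks</code> technique, which is built mainly on the tun interface, that is, Linux tun.</p><h3 id="Tun"><a href="#Tun" class="headerlink" title="Tun"></a>Tun</h3><p>On Linux everything is a file. Tap&#x2F;Tun is the interface Linux provides for handling packets in user space: Tap works at layer 2 (data link) and Tun at layer 3 (network, IP). One end of a Tun device is attached to the kernel protocol stack and the other end to a user-space process. The usage flow is:</p><ol><li>The program uses an existing virtual NIC (Tap&#x2F;Tun) or creates one</li><li>The program has to process the packets received and sent (that is, read and write the &#x2F;dev&#x2F;net&#x2F;tun character device); its logic plays the role that hardware plays in a physical NIC</li></ol><p>For the detailed flow, see <a href="https://www.junmajinlong.com/virtual/network/all_about_tun_tap/">理解Linux虚拟网卡设备tun&#x2F;tap的一切</a>. Unless you are building something like a virtual switch, ordinary software proxies and tunnels generally use Tun.</p><p>For example, on a linux machine using a socks5 proxy in tun mode:</p><ul><li>Traffic is matched by routing rules or an iptables fwmark and sent to the tun NIC</li><li>The Linux protocol stack delivers the TCP&#x2F;IP packets to the user-space socks5 client program</li><li>The client program parses the IP-layer data, wraps it in the socks5 protocol, and sends it out</li><li>The protocol stack receives those packets and passes them to the physical NIC, which sends them out</li><li>The socks5 server receives the packets, parses them, and sends the traffic to the target address from its own machine</li></ul><p>The target address therefore sees the request as coming from the socks5 server's IP. None of this is specific to socks5; proxies and tunnels basically all work this way.</p><p>For more detailed illustrated explanations, see:</p><ul><li><a href="https://www.junmajinlong.com/virtual/network/data_flow_about_openvpn/">https://www.junmajinlong.com/virtual/network/data_flow_about_openvpn/</a></li><li><a href="https://www.zhaohuabing.com/post/2020-02-24-linux-taptun/">https://www.zhaohuabing.com/post/2020-02-24-linux-taptun/</a></li></ul><h3 id="安卓和-Tun"><a href="#安卓和-Tun" class="headerlink" title="安卓和 Tun"></a>Android and Tun</h3><p><a href="https://github.com/2dust/v2rayNG/blob/master/V2rayNG/app/src/main/java/com/v2ray/ang/service/V2RayVpnService.kt">v2rxxNG</a> compiles <code>badvxn</code> into a binary and places it under libs with the name <code>libtun2socks.so</code>. When Android installs the app, it automatically gives files with the <code>.so</code> suffix <code>+rx</code> permission. Without root, an application cannot add the x permission to a file by itself, so this is quite a clever trick.</p><p>For app development, Android provides an interface for this: extend <a href="https://developer.android.com/reference/kotlin/android/net/VpnService">VxnService</a> and call <code>Builder()</code> inside that class to create the tun. The difference is that on Android you no longer work with <code>/dev/net/tun</code> but with a file descriptor:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">ParcelFileDescriptor tunDevice = new Builder()</span><br><span class="line">    .addAddress(VXN_ADDRESS, 32)</span><br><span class="line">    .addRoute(VXN_ROUTE, 0)</span><br><span class="line">    .addDnsServer(VXN_DNS)</span><br><span class="line">    .addAllowedApplication(&quot;com.google.android.tethering&quot;)</span><br><span class="line">    .addAllowedApplication(&quot;com.google.android.tethering2&quot;)</span><br><span class="line">    .establish();</span><br><span class="line"></span><br><span class="line">int fd = tunDevice.getFd();</span><br></pre></td></tr></table></figure><p><code>libtun2socks</code> then runs with this fd. <code>badvxn</code> is an implementation modified from <code>LwIP</code>, and since I have forgotten most of my c, I looked for a golang implementation and found <code>https://github.com/xjasonlyu/tun2socks</code>. I chose it because it has several built-in modes, such as direct, so it should be fairly easy to modify (I am not chasing peak performance here; the choice is driven by my requirements).</p><h2 id="安卓开发"><a href="#安卓开发" class="headerlink" title="安卓开发"></a>Android development</h2><p>I searched but found no ready-made solution. The only project close to what I needed, <a href="https://github.com/ys1231/appproxy/tree/iyue">appproxy</a>, is written in flutter and supports only http and socks5 as proxy types, so I had no choice but to learn kotlin and Android development.</p><h3 id="相关资源"><a href="#相关资源" class="headerlink" title="相关资源"></a>Resources</h3><p>In 2025, kotlin is the obvious choice for Android development that involves this kind of slightly low-level integration. The <a href="https://kotlin.liying-cn.net/home.html">Chinese translation of the kotlin documentation</a> I had come across earlier felt too fragmented as a learning path, while the official Android developer documentation says very little about kotlin itself and only weaves the basics into its courses. For a complete, systematic treatment you still have to look around online. The gist below collects some resources, most of which I have read:</p><p><a href="https://gist.github.com/zhangguanzhang/cd2f3eb20de5a1314e5d3802401aa192">https://gist.github.com/zhangguanzhang/cd2f3eb20de5a1314e5d3802401aa192</a></p><p>Learn kotlin first, then go through the official Android tutorials. The official Android documentation introduces Android development through hands-on practice, but compared with the full Android development picture it leaves out a lot; for example, Service, Intent, and Flow are covered only briefly or hardly at all.</p><h3 id="需求分析"><a href="#需求分析" class="headerlink" title="需求分析"></a>Requirements analysis</h3><p>The main requirements are:</p><ul><li>Add, edit, and delete proxy configurations</li><li>A foreground-service notification in the notification bar</li><li>A floating toggle button in the bottom-right corner</li><li>Integrate with and start tun2socks</li><li>An app selection screen for choosing which apps go through the proxy</li></ul><p>After quickly reading the compose part of the official documentation, I started by copying and modifying <a href="https://github.com/google-developer-training/basic-android-kotlin-compose-training-inventory-app">inventory-app</a>, using navigate to move between screens. I also found that <a href="https://github.com/tailscale/tailscale-android/tree/main/android">tailscale-android</a> is written entirely in kotlin + jetpack compose, so its code is a good one to learn from.</p><h3 id="tun2socks-aar"><a href="#tun2socks-aar" class="headerlink" title="tun2socks aar"></a>tun2socks aar</h3><p>According to the tun2socks issue <a href="https://github.com/xjasonlyu/tun2socks/issues/123">How to use file descriptor in Android</a>, integrating it into Android means building an Android aar with gomobile and loading it from code (rather than running a binary the way <code>v2rxxNG</code> does). The plan was to build it first and then write a minimal demo. The build needs the NDK, SDK_TOOLS, and golang; after some searching I made a docker image with the environment set up, so the steps below build it quickly:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td 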
class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/xjasonlyu/tun2socks</span><br><span class="line"><span class="built_in">cd</span> tun2socks</span><br><span class="line"><span class="comment"># Do not build on a low-spec machine; the build can freeze it</span></span><br><span class="line">docker run --<span class="built_in">rm</span> -ti \</span><br><span class="line"> -e GOPROXY=<span class="string">&#x27;https://goproxy.cn,https://mirrors.aliyun.com/goproxy/,https://goproxy.io,https://proxy.golang.com.cn,direct&#x27;</span> \</span><br><span class="line">  --entrypoint bash -v <span class="variable">$PWD</span>:/w -w /w registry.aliyuncs.com/zhangguanzhang/gomobile</span><br><span class="line">go get golang.org/x/mobile/bind</span><br><span class="line"><span class="built_in">mkdir</span> -p build</span><br><span class="line"><span class="comment"># go build flags can also be passed. Here -androidapi 24 makes Android 7 the minimum supported version; the default is 16, but that does not build with newer NDKs (23.1.7779620 can build it)</span></span><br><span class="line">gomobile <span class="built_in">bind</span> -ldflags=<span class="string">&quot;-s -w&quot;</span> -trimpath -o build/tun2socks.aar \</span><br><span class="line">-target android \</span><br><span class="line">-androidapi 24 \</span><br><span class="line">github.com/xjasonlyu/tun2socks/v2/engine</span><br><span class="line">$ <span class="built_in">ls</span> -l build/</span><br><span class="line">total 30580</span><br><span class="line">-rw-r--r-- 1 root root 31304576 Apr 27 11:50 tun2socks.aar</span><br><span class="line">-rw-r--r-- 1 root root     6927 Apr 27 11:50 tun2socks-sources.jar</span><br></pre></td></tr></table></figure><p>The examples in the issue are mostly for java, whereas I am using kotlin and the latest <code>Android Studio</code>, where gradle uses the kotlin DSL. Adding the dependency the way the issue describes kept showing errors in <code>Android Studio</code>; after some searching, the following is what worked:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">// add under dependencies in the app module's build.gradle.kts</span><br><span class="line">implementation(files(&quot;libs/tun2socks.aar&quot;))</span><br></pre></td></tr></table></figure><p>Put the aar file built above under <code>app/libs/</code>; usage is the same as in the issue.</p><p>Some practical gomobile references:</p><ul><li><a href="https://mbox.dev/dev/go/go-mobile/03/">GoMobile 3: 在 iOS &amp; Android 上的集成</a></li><li><a href="https://apkdv.com/creating-android-ios-cross-platform-libraries-with-gomobile.html">使用 GoMobile 创建 Android、iOS 跨平台 WebSocket Library</a></li></ul><h3 id="前台"><a href="#前台" class="headerlink" title="前台"></a>Foreground service</h3><p>The official demo <a href="https://android.googlesource.com/platform/development/+/master/samples/ToyVpn">Toy</a> is very old and written in java. Here are the official docs and some reference code I found:</p><ul><li><a href="https://developer.android.com/develop/background-work/services/fgs?hl=zh-cn">Foreground services</a></li><li><a href="https://github.com/satishnada/android-vpn-implementation-guide/blob/master/app/src/main/java/com/satish/vpnguide/service/LocalVpnService.kt">android-vxn-implementation-guide</a></li><li><a href="https://github.com/microsoft/HydraLab/blob/main/android_client/app/src/main/java/com/microsoft/hydralab/android/client/vpn/HydraLabVpnService.kt">https://github.com/microsoft/HydraLab/blob/main/android_client/app/src/main/java/com/microsoft/hydralab/android/client/vpn/HydraLabVpnService.kt</a></li></ul><p>A foreground service needs a <code>Notification</code> to run. After following other people's code, the required foreground notification did not appear in the status bar of the <code>Android Studio</code> emulator, while the same build worked on a real Android 11 device. It turned out the emulator's Android version was too new; after some googling, the permission has to be requested with the code below:</p><figure class="highlight kotlin"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@RequiresApi(Build.VERSION_CODES.TIRAMISU)</span></span><br><span class="line"><span class="function"><span class="keyword">fun</span> <span class="title">checkNotificationPermission</span><span class="params">()</span></span> &#123;</span><br><span class="line">    <span class="keyword">val</span> permission = android.Manifest.permission.POST_NOTIFICATIONS</span><br><span class="line">    <span class="keyword">when</span> &#123;</span><br><span class="line">        ContextCompat.checkSelfPermission(</span><br><span class="line">            <span class="keyword">this</span>,</span><br><span class="line">            permission</span><br><span class="line">        ) == PackageManager.PERMISSION_GRANTED -&gt; &#123;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        shouldShowRequestPermissionRationale(permission) -&gt; &#123;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">else</span> -&gt; &#123;</span><br><span class="line">            requestNotificationPermission.launch(permission)</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">private</span> <span class="keyword">val</span> requestNotificationPermission =</span><br><span class="line">    registerForActivityResult(ActivityResultContracts.RequestPermission()) &#123; isGranted 
-&gt;</span><br><span class="line">        <span class="keyword">if</span> (isGranted) Toast.makeText(<span class="keyword">this</span>, <span class="string">&quot;Notification permission granted&quot;</span>, Toast.LENGTH_SHORT)</span><br><span class="line">            .show()</span><br><span class="line">        <span class="keyword">else</span> Toast.makeText(<span class="keyword">this</span>, <span class="string">&quot;Notification permission denied&quot;</span>, Toast.LENGTH_SHORT)</span><br><span class="line">            .show()</span><br><span class="line">    &#125;</span><br></pre></td></tr></table></figure><p>Next, starting the proxy crashed the app. I noticed it never showed the dialog and key icon that other apps I had used show, the dialog saying that the app is requesting the <code>vpn</code> interface and so on. Getting that dialog requires calling:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">VpnService.prepare(this)</span><br></pre></td></tr></table></figure><h3 id="一些其他细节问题"><a href="#一些其他细节问题" class="headerlink" title="一些其他细节问题"></a>Other details</h3><p>After starting the service, switching away, and coming back through the notification bar, the floating button's state no longer matched the actual state. In the end this only worked by communicating with the Service through a Binder, plus a <code>Broadcast</code> to keep the button state in sync with the actual state.<br>The complete code is at <a href="https://github.com/zhangguanzhang/appproxy">appproxy</a></p><p>When I packaged the app and sent it to others, it would not install. The cause turned out to be signing: an unsigned package cannot be installed. As a shortcut for now, use <code>Build -&gt; Generate App Bundles or APKs -&gt; Generate APKs</code> and click locate in the popup to find the APK. You can also build separate per-architecture packages instead of a universal one:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">splits &#123;</span><br><span class="line">    abi &#123;</span><br><span class="line">        isEnable = true</span><br><span class="line">        reset()</span><br><span class="line">        include(&quot;armeabi-v7a&quot;, &quot;arm64-v8a&quot;, &quot;x86_64&quot;)</span><br><span class="line">        isUniversalApk = 
false</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="后续"><a href="#后续" class="headerlink" title="后续"></a>Follow-up</h2><p>With the app finished, the next step is to fork tun2socks, copy the direct mode, and parse and modify the stream on the conn before sending it on. There is also no need to limit yourself to proxying: for example, forwarding the fd's stream directly in user-space kotlin while only recording the IPs accessed gives you a simple network access statistics tool.</p><p>tun2socks relies heavily on global variables and on work done in init functions, so here is the startup flow as I traced it:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">1. engine.Start() in main.go</span><br><span class="line">2. start() in engine/engine.go</span><br><span class="line">3. `tunnel.T().SetDialer(_defaultProxy)` inside netstack() in engine/engine.go</span><br><span class="line">4. Importing tunnel above runs T().ProcessAsync() from init() in tunnel/global.go</span><br><span class="line">5. 
go t.process(ctx)</span><br><span class="line">func (t *Tunnel) process(ctx context.Context) &#123;</span><br><span class="line">for &#123;</span><br><span class="line">select &#123;</span><br><span class="line">case conn := &lt;-t.tcpQueue:</span><br><span class="line">go t.handleTCPConn(conn)</span><br><span class="line">case conn := &lt;-t.udpQueue:</span><br><span class="line">go t.handleUDPConn(conn)</span><br><span class="line">case &lt;-ctx.Done():</span><br><span class="line">return</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Because the library is consumed as an aar, the entry point is the <code>engine.Start()</code> method in <code>engine/engine.go</code>. If you need logs written to a file, this is the file to modify. The default <code>cwd</code> is <code>/</code>, which fails with a read-only error, so I recommend adding a <code>Dir</code> field to Key in <code>engine/engine.go</code>:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">general</span><span class="params">(k *Key)</span></span> <span class="type">error</span> &#123;</span><br><span class="line">level, err := log.ParseLevel(k.LogLevel)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">filename := path.Join(_defaultKey.Dir, <span 
class="string">&quot;tun2socks.log&quot;</span>)</span><br><span class="line">cfg := zap.NewProductionConfig()</span><br><span class="line">cfg.OutputPaths = []<span class="type">string</span>&#123;filename&#125;</span><br><span class="line">cfg.Level.SetLevel(level)</span><br><span class="line">log.SetLogger(zap.Must(cfg.Build()))</span><br><span class="line"></span><br><span class="line"><span class="comment">// log.SetLogger(log.Must(log.NewLeveled(level)))</span></span><br></pre></td></tr></table></figure><p>Then, in the service, pass <code>getExternalFilesDir(null)?.absolutePath</code> as <code>key.dir</code>, which is the following path:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/storage/emulated/0/Android/data/&lt;package_name&gt;/files/</span><br></pre></td></tr></table></figure><p>The engine logs are then written to that directory. As for tcp connections, the core is this piece of code:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> 
<span class="params">(t *Tunnel)</span></span> handleTCPConn(originConn adapter.TCPConn) &#123;</span><br><span class="line"><span class="keyword">defer</span> originConn.Close()</span><br><span class="line"></span><br><span class="line">id := originConn.ID()</span><br><span class="line">metadata := &amp;M.Metadata&#123;</span><br><span class="line">Network: M.TCP,</span><br><span class="line">SrcIP:   parseTCPIPAddress(id.RemoteAddress),</span><br><span class="line">SrcPort: id.RemotePort,</span><br><span class="line">DstIP:   parseTCPIPAddress(id.LocalAddress),</span><br><span class="line">DstPort: id.LocalPort,</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">ctx, cancel := context.WithTimeout(context.Background(), tcpConnectTimeout)</span><br><span class="line"><span class="keyword">defer</span> cancel()</span><br><span class="line"></span><br><span class="line">remoteConn, err := t.Dialer().DialContext(ctx, metadata)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.Warnf(<span class="string">&quot;[TCP] dial %s: %v&quot;</span>, metadata.DestinationAddress(), err)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line">metadata.MidIP, metadata.MidPort = parseNetAddr(remoteConn.LocalAddr())</span><br><span class="line"></span><br><span class="line">remoteConn = statistic.NewTCPTracker(remoteConn, metadata, t.manager)</span><br><span class="line"><span class="keyword">defer</span> remoteConn.Close()</span><br><span class="line"></span><br><span class="line">log.Infof(<span class="string">&quot;[TCP] %s &lt;-&gt; %s&quot;</span>, metadata.SourceAddress(), metadata.DestinationAddress())</span><br><span class="line">pipe(originConn, remoteConn)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The workflow is as follows: the client opens a connection to tun2socks, tun2socks then calls <code>t.Dialer().DialContext(ctx, metadata)</code> to connect to the target address, and finally the two conns are piped together:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"># see also core/tcp.go</span><br><span class="line">client ----originConn----&gt; tun2socks -----remoteConn----&gt; server</span><br></pre></td></tr></table></figure><p>If tun2socks needs to send packets back to the client on its own initiative, remoteConn has to be passed along. I also found that the <code>[]byte</code> passed to <code>Read</code> on the <code>net.Conn</code> created in <code>t.Dialer().DialContext</code> is rather small, so I maintain a buf of my own for processing:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(m *mirrorStream)</span></span> Read(p []<span class="type">byte</span>) (<span class="type">int</span>, <span class="type">error</span>) &#123;</span><br><span class="line">tmp := <span class="built_in">make</span>([]<span class="type">byte</span>, <span class="built_in">len</span>(p))</span><br><span class="line">n, err := m.rawConn.Read(tmp)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="number">0</span>, err</span><br><span class="line">&#125;</span><br><span class="line"><span class="built_in">copy</span>(p, tmp[:n])</span><br><span class="line">m.buf = <span class="built_in">append</span>(m.buf, tmp[:n]...)</span><br><span class="line">m.mirrorMessages()</span><br><span 
class="line"><span class="keyword">return</span> n, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="参考"><a href="#参考" class="headerlink" title="参考"></a>References</h2><ul><li><a href="https://blog.csdn.net/qq_45649553/article/details/136543933">Android basics: the Application class explained</a></li><li><a href="https://github.com/xjasonlyu/tun2socks/discussions/139">tun2socks custom proxy</a></li><li><a href="https://ivanfan.site/2022/01/05/Android/VPN%E4%BB%A3%E7%90%86%E5%8A%A0%E9%80%9F/">Proxy acceleration</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;p&gt;Notes on recently building a way to route specified apps through an HTTP proxy...&lt;/p&gt;</summary>
    
    
    
    
    <category term="android" scheme="http://zhangguanzhang.github.io/tags/android/"/>
    
    <category term="tun2sock2" scheme="http://zhangguanzhang.github.io/tags/tun2sock2/"/>
    
  </entry>
  
  <entry>
    <title>Using kube-log-runner and patching it to work with logrotate</title>
    <link href="http://zhangguanzhang.github.io/2025/04/21/kube-log-runner/"/>
    <id>http://zhangguanzhang.github.io/2025/04/21/kube-log-runner/</id>
    <published>2025-04-21T10:40:30.000Z</published>
    <updated>2025-04-21T10:40:30.000Z</updated>
    
    <content type="html"><![CDATA[<p>Notes on my recent experience using kube-log-runner...</p><span id="more"></span><h2 id="由来"><a href="#由来" class="headerlink" title="由来"></a>Background</h2><p>When kubelet and other kube components run as binaries managed by systemd services, their logs end up in <code>/var/log/messages</code>. Our customer monitors that file for keywords (Error, Failed …) and asked us to write these components' logs to separate files.</p><h2 id="经过"><a href="#经过" class="headerlink" title="经过"></a>Process</h2><h3 id="选型"><a href="#选型" class="headerlink" title="选型"></a>Choosing a tool</h3><p>According to the official documentation on <a href="https://kubernetes.io/zh-cn/docs/concepts/cluster-administration/system-logs/">System Logs</a>, the old flags for log files, log directories and rotation were removed starting with <code>v1.26</code>, and <code>kube-log-runner</code> is the suggested replacement. The binary is already included in the binary release tarballs. If you use images, the official container images ship it as well, just under the name <code>/go-runner</code>.</p><p>The <a href="https://github.com/kubernetes/component-base/tree/master/logs/kube-log-runner">kube-log-runner directory on GitHub</a> shows the usage and the source. It is a simple tool written in Go: it starts a command, captures the command's stdout and stderr into a file, and forwards signals to the process.</p><h3 id="尝试"><a href="#尝试" class="headerlink" title="尝试"></a>First attempt</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">$ systemctl cat --no-pager kube-apiserver</span><br><span class="line">...</span><br><span class="line">ExecStart=/usr/local/bin/kube-log-runner \</span><br><span class="line">  --log-file=/var/logs/kube-apiserver.log \</span><br><span class="line">  /usr/local/bin/kube-apiserver </span><br></pre></td></tr></table></figure><p>A quick test showed the following error on startup:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">4月 18 10:30:25 xxx systemd[1]: Got notification message from PID 9885, but reception only permitted for main PID 9880</span><br></pre></td></tr></table></figure><p>Anyone familiar with systemd will spot the problem right away. We use <code>Type=notify</code>; with this setting the service sends an <code>sd_notify</code> message to 
systemd once it has started, which is how systemd confirms that the service came up properly. The error means:</p><ol><li>The main process started by systemd has PID 9880 (the kube-log-runner wrapper named in ExecStart)</li><li>A notification message arrived from PID 9885, which is not the main process</li></ol><p>The fix is simple: systemd provides an option to accept notify messages from every process in the service:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">NotifyAccess=all</span><br></pre></td></tr></table></figure><h3 id="logrotate"><a href="#logrotate" class="headerlink" title="logrotate"></a>logrotate</h3><p>Since the logs are now written to a file, we should follow Linux conventions and configure logrotate so the logs cannot fill up the partition. After writing the config, I tested logrotate with a small size parameter and found that it did not work:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">logrotate -v /etc/logrotate.d/kube-apiserver</span><br></pre></td></tr></table></figure><p>logrotate rotates logs in one of two modes, create and copytruncate, with create being the default. Roughly, they work like this:</p><ul><li>create: rename the log file, then create a new file at the original path, equivalent to mv + touch</li><li>copytruncate: <code>cp xx.log xx.log.1 &amp;&amp; truncate -s 0 xx.log</code></li></ul><p>On Linux, opening a file by name really means operating on its inode. Once the inode is open, renaming the file or deleting its path has no effect on the process, which keeps writing to the old inode. The create mode therefore requires the process to support reopening the log file path so that it picks up the new inode. nginx's logrotate config, for example, looks like this:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">/var/log/nginx/*.log /var/log/nginx/*/*.log&#123;</span><br><span class="line">daily</span><br><span class="line">missingok</span><br><span class="line">rotate 14</span><br><span class="line">compress</span><br><span class="line">delaycompress</span><br><span class="line">notifempty</span><br><span 
class="line">create 640 root adm</span><br><span class="line">sharedscripts</span><br><span class="line">postrotate</span><br><span class="line">[ ! -f /var/run/nginx.pid ] || kill -USR1 `cat /var/run/nginx.pid`</span><br><span class="line">endscript</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>kube-apiserver does not support this behavior, so I tried <code>copytruncate</code> and found that it did not work either. I then left a comment on <a href="https://github.com/kubernetes/enhancements/issues/2845">the kube-log-runner KEP issue</a> asking for logrotate support.</p><h3 id="修改"><a href="#修改" class="headerlink" title="修改"></a>Modification</h3><p>Waiting for upstream will probably take a long time, so for now I patched the source myself to add support:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span 
class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span 
class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span 
class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span class="line">180</span><br><span class="line">181</span><br><span class="line">182</span><br><span class="line">183</span><br><span class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span class="line">192</span><br><span class="line">193</span><br><span class="line">194</span><br><span class="line">195</span><br><span class="line">196</span><br><span class="line">197</span><br><span class="line">198</span><br><span class="line">199</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;flag&quot;</span></span><br><span class="line"><span class="string">&quot;fmt&quot;</span></span><br><span class="line"><span class="string">&quot;io&quot;</span></span><br><span class="line"><span class="string">&quot;log&quot;</span></span><br><span class="line"><span class="string">&quot;os&quot;</span></span><br><span class="line"><span class="string">&quot;os/exec&quot;</span></span><br><span class="line"><span class="string">&quot;os/signal&quot;</span></span><br><span class="line"><span class="string">&quot;strings&quot;</span></span><br><span 
class="line"><span class="string">&quot;sync&quot;</span></span><br><span class="line"><span class="string">&quot;syscall&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> (</span><br><span class="line">logFilePath    = flag.String(<span class="string">&quot;log-file&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;If non-empty, save stdout to this file&quot;</span>)</span><br><span class="line">alsoToStdOut   = flag.Bool(<span class="string">&quot;also-stdout&quot;</span>, <span class="literal">false</span>, <span class="string">&quot;useful with log-file, log to standard output as well as the log file&quot;</span>)</span><br><span class="line">redirectStderr = flag.Bool(<span class="string">&quot;redirect-stderr&quot;</span>, <span class="literal">true</span>, <span class="string">&quot;treat stderr same as stdout&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Newly added global variables</span></span><br><span class="line">logFile         *os.File</span><br><span class="line">logFileMu       sync.Mutex</span><br><span class="line">globalLogFile   <span class="type">string</span></span><br><span class="line">globalAlsoStdOut <span class="type">bool</span></span><br><span class="line">globalRedirectStderr <span class="type">bool</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment">// SyncWriter supports swapping the underlying Writer at runtime</span></span><br><span class="line"><span class="keyword">type</span> SyncWriter <span class="keyword">struct</span> &#123;</span><br><span class="line">mu sync.Mutex</span><br><span class="line">w  io.Writer</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(sw *SyncWriter)</span></span> Write(p []<span 
class="type">byte</span>) (n <span class="type">int</span>, err <span class="type">error</span>) &#123;</span><br><span class="line">sw.mu.Lock()</span><br><span class="line"><span class="keyword">defer</span> sw.mu.Unlock()</span><br><span class="line"><span class="keyword">return</span> sw.w.Write(p)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(sw *SyncWriter)</span></span> SetWriter(w io.Writer) &#123;</span><br><span class="line">sw.mu.Lock()</span><br><span class="line"><span class="keyword">defer</span> sw.mu.Unlock()</span><br><span class="line">sw.w = w</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> (</span><br><span class="line">outputSync = &amp;SyncWriter&#123;w: os.Stdout&#125;</span><br><span class="line">errSync    = &amp;SyncWriter&#123;w: os.Stderr&#125;</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">flag.Parse()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> err := configureAndRun(); err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.Fatal(err)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">configureAndRun</span><span class="params">()</span></span> <span class="type">error</span> &#123;</span><br><span class="line"><span class="comment">// Save the flags into global variables</span></span><br><span class="line">globalLogFile = *logFilePath</span><br><span class="line">globalAlsoStdOut = *alsoToStdOut</span><br><span 
class="line">globalRedirectStderr = *redirectStderr</span><br><span class="line"></span><br><span class="line"><span class="comment">// Initialize the log file</span></span><br><span class="line"><span class="keyword">if</span> globalLogFile != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line"><span class="keyword">var</span> err <span class="type">error</span></span><br><span class="line">logFile, err = os.OpenFile(globalLogFile, os.O_APPEND|os.O_CREATE|os.O_WRONLY, <span class="number">0644</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;failed to open log file: %w&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> globalAlsoStdOut &#123;</span><br><span class="line">outputSync.SetWriter(io.MultiWriter(os.Stdout, logFile))</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">outputSync.SetWriter(logFile)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> globalRedirectStderr &#123;</span><br><span class="line">errSync.SetWriter(outputSync)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">args := flag.Args()</span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(args) == <span class="number">0</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;not enough arguments to run&quot;</span>)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">exe := args[<span class="number">0</span>]</span><br><span class="line"><span class="keyword">var</span> 
exeArgs []<span class="type">string</span></span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(args) &gt; <span class="number">1</span> &#123;</span><br><span class="line">exeArgs = args[<span class="number">1</span>:]</span><br><span class="line">&#125;</span><br><span class="line">cmd := exec.Command(exe, exeArgs...)</span><br><span class="line">cmd.Stdout = outputSync</span><br><span class="line">cmd.Stderr = errStream()</span><br><span class="line"></span><br><span class="line">log.Printf(<span class="string">&quot;Running command:\n%v&quot;</span>, cmdInfo(cmd))</span><br><span class="line">err := cmd.Start()</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;starting command: %w&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Signal handling</span></span><br><span class="line"><span class="keyword">go</span> setupSigHandler(cmd.Process)</span><br><span class="line"><span class="keyword">if</span> err := cmd.Wait(); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;running command: %w&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">errStream</span><span class="params">()</span></span> io.Writer &#123;</span><br><span class="line"><span class="keyword">if</span> *redirectStderr &#123;</span><br><span class="line"><span class="keyword">return</span> outputSync</span><br><span class="line">&#125;</span><br><span 
class="line"><span class="keyword">return</span> os.Stderr</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">cmdInfo</span><span class="params">(cmd *exec.Cmd)</span></span> <span class="type">string</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Sprintf(</span><br><span class="line"><span class="string">`Command env: (log-file=%v, also-stdout=%v, redirect-stderr=%v)</span></span><br><span class="line"><span class="string">Run from directory: %v</span></span><br><span class="line"><span class="string">Executable path: %v</span></span><br><span class="line"><span class="string">Args (comma-delimited): %v`</span>, *logFilePath, *alsoToStdOut, *redirectStderr,</span><br><span class="line">cmd.Dir, cmd.Path, strings.Join(cmd.Args, <span class="string">&quot;,&quot;</span>),</span><br><span class="line">)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Modified signal handler</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">setupSigHandler</span><span class="params">(process *os.Process)</span></span> &#123;</span><br><span class="line">signals := []os.Signal&#123;</span><br><span class="line">syscall.SIGHUP, syscall.SIGINT,</span><br><span class="line">syscall.SIGTERM, syscall.SIGQUIT, syscall.SIGUSR1,</span><br><span class="line">&#125;</span><br><span class="line">c := <span class="built_in">make</span>(<span class="keyword">chan</span> os.Signal, <span class="number">1</span>)</span><br><span class="line">signal.Notify(c, signals...)</span><br><span class="line"></span><br><span class="line">log.Println(<span class="string">&quot;Now listening for signals&quot;</span>)</span><br><span class="line"><span class="keyword">for</span> s := <span class="keyword">range</span> c 
&#123;</span><br><span class="line"><span class="keyword">if</span> s == syscall.SIGUSR1 &#123;</span><br><span class="line">handleLogRotate()</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">forwardSignal(process, s)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Handle log rotation</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">handleLogRotate</span><span class="params">()</span></span> &#123;</span><br><span class="line">logFileMu.Lock()</span><br><span class="line"><span class="keyword">defer</span> logFileMu.Unlock()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> globalLogFile == <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">log.Println(<span class="string">&quot;No log file configured, ignoring SIGUSR1&quot;</span>)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Close the old file</span></span><br><span class="line"><span class="keyword">if</span> logFile != <span class="literal">nil</span> &#123;</span><br><span class="line">logFile.Sync()</span><br><span class="line"><span class="keyword">if</span> err := logFile.Close(); err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.Printf(<span class="string">&quot;Error closing log file: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Open the new file</span></span><br><span class="line">newFile, err := os.OpenFile(globalLogFile, os.O_APPEND|os.O_CREATE|os.O_WRONLY, <span class="number">0644</span>)</span><br><span 
class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.Printf(<span class="string">&quot;ERROR: Failed to reopen log file: %v (logging to stdout)&quot;</span>, err)</span><br><span class="line">outputSync.SetWriter(os.Stdout)</span><br><span class="line"><span class="keyword">if</span> globalRedirectStderr &#123;</span><br><span class="line">errSync.SetWriter(os.Stdout)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">logFile = newFile</span><br><span class="line">log.SetOutput(logFile)</span><br><span class="line">log.Println(<span class="string">&quot;Successfully reopened log file&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Update the output streams</span></span><br><span class="line"><span class="keyword">if</span> globalAlsoStdOut &#123;</span><br><span class="line">outputSync.SetWriter(io.MultiWriter(os.Stdout, logFile))</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">outputSync.SetWriter(logFile)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> globalRedirectStderr &#123;</span><br><span class="line">errSync.SetWriter(outputSync)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">forwardSignal</span><span class="params">(process *os.Process, s os.Signal)</span></span> &#123;</span><br><span class="line">log.Printf(<span class="string">&quot;Forwarding signal %v to PID %v&quot;</span>, s, process.Pid)</span><br><span class="line"><span class="keyword">if</span> err := process.Signal(s); err != <span class="literal">nil</span> 
&#123;</span><br><span class="line">log.Printf(<span class="string">&quot;Error forwarding signal %v: %v&quot;</span>, s, err)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The code above no longer forwards the USR1 signal to the wrapped process; the runner handles USR1 itself by reopening the log file. logrotate needs a pid file to deliver that signal (the main PID is kube-log-runner itself, which is the process that must receive USR1), so add the following to the systemd unit:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ExecStartPost=/bin/sh -c &quot;echo $MAINPID &gt; /var/run/kube-apiserver.pid&quot;</span><br></pre></td></tr></table></figure><p>The logrotate config file:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">/var/logs/kube-apiserver.log &#123;</span><br><span class="line">    daily</span><br><span class="line">    rotate 5</span><br><span class="line">    size 400M</span><br><span class="line">    missingok</span><br><span class="line">    compress</span><br><span class="line">    nomail</span><br><span class="line">    delaycompress</span><br><span class="line">    create</span><br><span class="line">    postrotate</span><br><span class="line">        [ ! 
-f /var/run/kube-apiserver.pid ] || kill -USR1 `cat /var/run/kube-apiserver.pid`</span><br><span class="line">    endscript</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="参考"><a href="#参考" class="headerlink" title="参考"></a>References</h2><ul><li><a href="https://wsgzao.github.io/post/logrotate/">https://wsgzao.github.io/post/logrotate/</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;p&gt;Notes on my recent experience using kube-log-runner...&lt;/p&gt;</summary>
    
    
    
    
    <category term="kube-log-runner" scheme="http://zhangguanzhang.github.io/tags/kube-log-runner/"/>
    
    <category term="logrotate" scheme="http://zhangguanzhang.github.io/tags/logrotate/"/>
    
  </entry>
  
</feed>
