zhangguanzhang's Blog

Manually fixing Kubernetes issue #76956 on v1.14–v1.15

2019/10/21

Background

The issue is tracked at https://github.com/kubernetes/kubernetes/issues/76956: every time the apiserver and controller-manager start, they log the lines below. It does not affect operation, but if it bothers you, you can fix it by hand following this post. Along the way, this post also serves as a tutorial on compiling Kubernetes yourself.

E0423 17:35:58.491576       1 prometheus.go:138] failed to register depth metric admission_quota_controller: duplicate metrics collector registration attempted
E0423 17:35:58.491625       1 prometheus.go:150] failed to register adds metric admission_quota_controller: duplicate metrics collector registration attempted
E0423 17:35:58.491901       1 prometheus.go:162] failed to register latency metric admission_quota_controller: duplicate metrics collector registration attempted
E0423 17:35:58.492191       1 prometheus.go:174] failed to register work_duration metric admission_quota_controller: duplicate metrics collector registration attempted
E0423 17:35:58.492367       1 prometheus.go:189] failed to register unfinished_work_seconds metric admission_quota_controller: duplicate metrics collector registration attempted
E0423 17:35:58.492507       1 prometheus.go:202] failed to register longest_running_processor_microseconds metric admission_quota_controller: duplicate metrics collector registration attempted

From the issue we can find the corresponding fix, PR https://github.com/kubernetes/kubernetes/pull/77553; the output comes from the metrics being registered twice. However, the PR was only merged into v1.16 and later (https://github.com/kubernetes/kubernetes/commit/f101466d2e4d96854c80f58203de2cc6b5aaeb6a#diff-60187c78fbca5a995fd7d2ac913eb6c8), which means v1.14 and v1.15 are both affected.

master (#77553)  v1.17.0-alpha.2  v1.17.0-alpha.1 v1.17.0-alpha.0 
v1.16.3-beta.0 v1.16.2 v1.16.2-beta.0 v1.16.1
v1.16.1-beta.0 v1.16.0 v1.16.0-rc.2
v1.16.0-rc.1 v1.16.0-beta.2 v1.16.0-beta.1
v1.16.0-beta.0 v1.16.0-alpha.3 v1.16.0-alpha.2

Hands-on

Download the source

Download the source and compile it by hand; I'll use v1.15.5 as the example. We build inside a container; my machine is a 4-core / 8 GB CentOS 7.6 host running docker-ce 18.09. Pull the source down however you can:

git clone https://github.com/kubernetes/kubernetes -b v1.15.5
cd kubernetes
git checkout -b v1.15.5

Pre-build preparation

Because we modify the source after pulling it, the compiled binaries will report a version carrying a -dirty suffix:

$ kube-controller-manager --version
Kubernetes v1.15.5-dirty
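The suffix comes from hack/lib/version.sh: when git reports uncommitted changes, the tree state is recorded as dirty and appended to the version string. A minimal sketch of that logic (simplified; the variable names follow the real script):

```shell
# Simplified sketch of the -dirty logic in hack/lib/version.sh.
# KUBE_GIT_TREE_STATE becomes "dirty" whenever git sees local changes.
KUBE_GIT_VERSION="v1.15.5"
KUBE_GIT_TREE_STATE="dirty"
if [ "${KUBE_GIT_TREE_STATE}" = "dirty" ]; then
  KUBE_GIT_VERSION="${KUBE_GIT_VERSION}-dirty"
fi
echo "${KUBE_GIT_VERSION}"   # prints v1.15.5-dirty
```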

Run the command below to drop the -dirty, or skip it and simply git add and git commit after applying the patch later:

sed -ri 's#KUBE_GIT_TREE_STATE="dirty"#KUBE_GIT_TREE_STATE="clean"#g' hack/lib/version.sh

If you also need to build the Docker images, prepare the base images locally in advance and modify build/lib/release.sh, e.g. with sed -ri 's#(build)\s+--pull#\1#' build/lib/release.sh:

"${DOCKER[@]}" build --pull -q -t "${docker_image_tag}" "${docker_build_path}" >/dev/null
change it to:
"${DOCKER[@]}" build -q -t "${docker_image_tag}" "${docker_build_path}" >/dev/null

Since we build inside a Docker container, pull the build image ahead of time. Check the pinned version first; each Kubernetes release uses a different Go cross-build image, so don't guess:

$ cat build/build-image/cross/VERSION
v1.12.10-1

Pull that image:

curl -s https://zhangguanzhang.github.io/bash/pull.sh | bash -s k8s.gcr.io/kube-cross:v1.12.10-1
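If you script this step, the image reference can be derived from the VERSION file. A sketch with the value hardcoded to what we read above (inside the source tree you would use ver=$(cat build/build-image/cross/VERSION) instead):

```shell
# Build the kube-cross image reference from the pinned version.
# Hardcoded here; in the repo: ver=$(cat build/build-image/cross/VERSION)
ver="v1.12.10-1"
image="k8s.gcr.io/kube-cross:${ver}"
echo "${image}"   # prints k8s.gcr.io/kube-cross:v1.12.10-1
```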

Apply the merge commit as a patch

The merge commit URL is https://github.com/kubernetes/kubernetes/commit/f101466d2e4d96854c80f58203de2cc6b5aaeb6a#diff-60187c78fbca5a995fd7d2ac913eb6c8
Replace the # and everything after it with .patch, then run the commands below to apply the patch:

[root@k8s-m1 kubernetes]# wget https://github.com/kubernetes/kubernetes/commit/f101466d2e4d96854c80f58203de2cc6b5aaeb6a.patch
[root@k8s-m1 kubernetes]# patch -p1 < f101466d2e4d96854c80f58203de2cc6b5aaeb6a.patch
patching file pkg/util/workqueue/prometheus/prometheus.go
patching file pkg/util/workqueue/prometheus/prometheus_test.go
patching file pkg/util/workqueue/prometheus/BUILD
patching file pkg/util/workqueue/prometheus/prometheus.go
patching file pkg/util/workqueue/prometheus/prometheus_test.go
patching file staging/src/k8s.io/client-go/util/workqueue/delaying_queue.go
Hunk #1 succeeded at 44 (offset 1 line).
Hunk #2 succeeded at 76 (offset 3 lines).
Hunk #3 succeeded at 152 (offset 6 lines).
patching file staging/src/k8s.io/client-go/util/workqueue/metrics.go
patching file staging/src/k8s.io/client-go/util/workqueue/metrics_test.go
patching file staging/src/k8s.io/client-go/util/workqueue/rate_limiting_queue_test.go
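The URL rewrite ("drop the #fragment, append .patch") is easy to script with shell parameter expansion; a small sketch:

```shell
# Turn a GitHub commit URL (possibly carrying a #diff-... fragment)
# into the corresponding .patch URL.
commit_url='https://github.com/kubernetes/kubernetes/commit/f101466d2e4d96854c80f58203de2cc6b5aaeb6a#diff-60187c78fbca5a995fd7d2ac913eb6c8'
patch_url="${commit_url%%#*}.patch"   # strip from the first '#', append .patch
echo "${patch_url}"
# prints https://github.com/kubernetes/kubernetes/commit/f101466d2e4d96854c80f58203de2cc6b5aaeb6a.patch
```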

Build

To stay as close to the official build as possible, we compile inside a Docker container. A few environment variables first:
KUBE_BUILD_CONFORMANCE=n and KUBE_BUILD_HYPERKUBE=n decide whether the conformance-amd64 and hyperkube-amd64 images are built; the default y builds them, and n skips them.
Give the build machine at least two CPUs (rather than a single CPU with multiple cores), otherwise the load spikes and the build easily gets killed by the system. To build only one component, append something like WHAT=cmd/kubectl to make.

[root@k8s-m1 kubernetes]# KUBE_BUILD_PLATFORMS=linux/amd64 KUBE_BUILD_CONFORMANCE=n KUBE_BUILD_HYPERKUBE=n make quick-release
+++ [1021 21:30:50] Verifying Prerequisites....
+++ [1021 21:30:51] Building Docker image kube-build:build-7db96ab759-5-v1.12.10-1
+++ [1021 21:30:54] Syncing sources to container
+++ [1021 21:30:58] Running build command...
+++ [1021 21:31:43] Building go targets for linux/amd64:
cmd/kube-proxy
cmd/kube-apiserver
cmd/kube-controller-manager
cmd/cloud-controller-manager
cmd/kubelet
cmd/kubeadm
cmd/hyperkube
cmd/kube-scheduler
vendor/k8s.io/apiextensions-apiserver
cluster/gce/gci/mounter
+++ [1021 21:34:05] Building go targets for linux/amd64:
cmd/kube-proxy
cmd/kubeadm
cmd/kubelet
+++ [1021 21:34:49] Building go targets for linux/amd64:
cmd/kubectl
+++ [1021 21:35:11] Building go targets for linux/amd64:
cmd/gendocs
cmd/genkubedocs
cmd/genman
cmd/genyaml
cmd/genswaggertypedocs
cmd/linkcheck
vendor/github.com/onsi/ginkgo/ginkgo
test/e2e/e2e.test
+++ [1021 21:37:01] Building go targets for linux/amd64:
cmd/kubemark
vendor/github.com/onsi/ginkgo/ginkgo
test/e2e_node/e2e_node.test
+++ [1021 21:38:03] Syncing out of container
+++ [1021 21:38:33] Building tarball: manifests
+++ [1021 21:38:33] Building tarball: src
+++ [1021 21:38:33] Starting tarball: client linux-amd64
+++ [1021 21:38:33] Waiting on tarballs
+++ [1021 21:38:49] Building images: linux-amd64
+++ [1021 21:38:49] Building tarball: node linux-amd64
+++ [1021 21:38:49] Starting docker build for image: cloud-controller-manager-amd64
+++ [1021 21:38:49] Starting docker build for image: kube-apiserver-amd64
+++ [1021 21:38:49] Starting docker build for image: kube-controller-manager-amd64
+++ [1021 21:38:49] Starting docker build for image: kube-scheduler-amd64
+++ [1021 21:38:49] Starting docker build for image: kube-proxy-amd64
+++ [1021 21:38:49] Building conformance image for arch: amd64
+++ [1021 21:38:49] Building hyperkube image for arch: amd64
Sending build context to Docker daemon 52.65MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [1021 21:39:07] Call tree:
!!! [1021 21:39:07] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:07] 2: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:07] 3: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:07] 4: build/release.sh:45 kube::release::package_tarballs(...)
Sending build context to Docker daemon 50.07MB
Step 1/2 : FROM k8s.gcr.io/debian-iptables-amd64:v11.0.2
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [1021 21:39:08] Call tree:
!!! [1021 21:39:08] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:08] 2: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:08] 3: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:08] 4: build/release.sh:45 kube::release::package_tarballs(...)
Sending build context to Docker daemon 156MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [1021 21:39:12] Call tree:
!!! [1021 21:39:12] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:12] 2: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:12] 3: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:12] 4: build/release.sh:45 kube::release::package_tarballs(...)
Sending build context to Docker daemon 134.3MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [1021 21:39:13] Call tree:
!!! [1021 21:39:13] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:13] 2: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:13] 3: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:13] 4: build/release.sh:45 kube::release::package_tarballs(...)
Sending build context to Docker daemon 205.2MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [1021 21:39:13] Call tree:
!!! [1021 21:39:13] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:13] 2: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:13] 3: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:13] 4: build/release.sh:45 kube::release::package_tarballs(...)
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
make[1]: *** [build] Error 1
!!! [1021 21:39:14] Call tree:
!!! [1021 21:39:14] 1: /root/kubernetes/build/lib/release.sh:420 kube::release::build_hyperkube_image(...)
!!! [1021 21:39:14] 2: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:14] 3: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:14] 4: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:14] 5: build/release.sh:45 kube::release::package_tarballs(...)
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
make[1]: *** [build] Error 1
!!! [1021 21:39:14] Call tree:
!!! [1021 21:39:14] 1: /root/kubernetes/build/lib/release.sh:424 kube::release::build_conformance_image(...)
!!! [1021 21:39:14] 2: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:14] 3: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:14] 4: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:14] 5: build/release.sh:45 kube::release::package_tarballs(...)
!!! [1021 21:39:14] previous Docker build failed
!!! [1021 21:39:14] Call tree:
!!! [1021 21:39:14] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [1021 21:39:14] 2: /root/kubernetes/build/lib/release.sh:237 kube::release::build_server_images(...)
!!! [1021 21:39:14] 3: /root/kubernetes/build/lib/release.sh:93 kube::release::package_server_tarballs(...)
!!! [1021 21:39:14] 4: build/release.sh:45 kube::release::package_tarballs(...)
!!! [1021 21:39:14] previous tarball phase failed
make: *** [quick-release] Error 1

Check the result

The build errors out, but that's fine: the failures are in building the Kubernetes Docker images, and the binaries have already been produced. Look under the _output/release-stage directory:

[root@k8s-m1 kubernetes]# ll _output/release-stage/server/linux-amd64/kubernetes/server/bin/
total 828572
-rwxr-xr-x 2 root root 134324933 Oct 21 21:38 cloud-controller-manager
drwxr-xr-x 2 root root 4096 Oct 21 21:38 cloud-controller-manager.dockerbuild
-rwxr-xr-x 1 root root 250209280 Oct 21 21:38 hyperkube
-rwxr-xr-x 2 root root 205195189 Oct 21 21:38 kube-apiserver
drwxr-xr-x 2 root root 4096 Oct 21 21:38 kube-apiserver.dockerbuild
-rwxr-xr-x 2 root root 155984780 Oct 21 21:38 kube-controller-manager
drwxr-xr-x 2 root root 4096 Oct 21 21:38 kube-controller-manager.dockerbuild
-rwxr-xr-x 2 root root 50064955 Oct 21 21:38 kube-proxy
drwxr-xr-x 2 root root 4096 Oct 21 21:38 kube-proxy.dockerbuild
-rwxr-xr-x 2 root root 52643510 Oct 21 21:38 kube-scheduler
drwxr-xr-x 2 root root 4096 Oct 21 21:38 kube-scheduler.dockerbuild

Replace the binary and check the output

[root@k8s-m1 kubernetes]# cp _output/release-stage/server/linux-amd64/kubernetes/server/bin/kube-controller-manager /usr/local/bin/
cp: overwrite ‘/usr/local/bin/kube-controller-manager’? y
[root@k8s-m1 kubernetes]# systemctl status kube-controller-manager.service
● kube-controller-manager.service - Kubernetes Controller Manager
Loaded: loaded (/usr/lib/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Mon 2019-10-21 20:47:21 CST; 55min ago
Docs: https://github.com/kubernetes/kubernetes
Process: 114302 ExecStart=/usr/local/bin/kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.kubeconfig --authorization-kubeconfig=/etc/kubernetes/controller-manager.kubeconfig --bind-address=0.0.0.0 --client-ca-file=/etc/kubernetes/pki/ca.crt --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --kubeconfig=/etc/kubernetes/controller-manager.kubeconfig --leader-elect=true --cluster-cidr=10.244.0.0/16 --service-cluster-ip-range=10.96.0.0/12 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key --root-ca-file=/etc/kubernetes/pki/ca.crt --use-service-account-credentials=true --controllers=*,bootstrapsigner,tokencleaner --experimental-cluster-signing-duration=86700h --feature-gates=RotateKubeletClientCertificate=true --node-monitor-period=5s --node-monitor-grace-period=2m --pod-eviction-timeout=1m --logtostderr=false --log-dir=/var/log/kubernetes/kube-controller-manager --v=2 (code=killed, signal=TERM)
Main PID: 114302 (code=killed, signal=TERM)

Oct 21 20:45:10 k8s-m1 systemd[1]: Started Kubernetes Controller Manager.
Oct 21 20:47:21 k8s-m1 systemd[1]: Stopping Kubernetes Controller Manager...
Oct 21 20:47:21 k8s-m1 systemd[1]: Stopped Kubernetes Controller Manager.
[root@k8s-m1 kubernetes]# systemctl start kube-controller-manager.service
[root@k8s-m1 kubernetes]# systemctl status kube-controller-manager.service
● kube-controller-manager.service - Kubernetes Controller Manager
Loaded: loaded (/usr/lib/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2019-10-21 21:43:25 CST; 1s ago
Docs: https://github.com/kubernetes/kubernetes
Main PID: 11996 (kube-controller)
Tasks: 8
Memory: 12.2M
CGroup: /system.slice/kube-controller-manager.service
└─11996 /usr/local/bin/kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.kubeconfig --authorization-kubeconfig=/etc/kubernetes/controller-manager.kubeconfig -...

Oct 21 21:43:25 k8s-m1 systemd[1]: Started Kubernetes Controller Manager.

The errors are gone.
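A quick way to confirm is to grep the service log for the old message. The sketch below runs the grep against a sample log line instead of live output; on a real node you would pipe journalctl -u kube-controller-manager into the same grep:

```shell
# Scan log text for the duplicate-registration error; prints a
# confirmation only when the pattern is absent.
sample_log='I1021 21:43:25.123456       1 controllermanager.go:160] Version: v1.15.5'
if ! echo "${sample_log}" | grep -q 'duplicate metrics collector registration'; then
  echo "no duplicate-registration errors"
fi
```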

