zhangguanzhang's Blog

Node shutdown makes cluster DNS intermittently unavailable before pod eviction

Word count: 3.7k · Reading time: 19 min
2021/02/02

Background

Over the past few days we have been running disaster-recovery tests for a new internal project; all of the workloads run on K8S. The test is simply to pick a node and run shutdown -h now. After a shutdown, a colleague noticed errors on the pages, and the root cause turned out to be that in-cluster DNS resolution would intermittently fail.

Per the SVC flow: after a node powers off, its kubelet can no longer update its own status, so for a while the node and its pods still look normal when fetched from the apiserver. Only after kube-controller-manager's --node-monitor-grace-period has elapsed, plus a further --pod-eviction-timeout, does pod eviction start. That is the rough sequence.

Before pod eviction there is a window of roughly 5m by default. Throughout that window, every Pod IP on the dead node is still present in the SVC endpoints, and the node my colleague shut down happened to host a coredns pod. So for those ~5m, each lookup failed with odds of about one in the number of coredns replicas.
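The arithmetic above can be sketched quickly. The 40s and 5m values below are the upstream kube-controller-manager defaults for the two flags; the replica count of 3 is this cluster's coredns deployment:

```go
package main

import "fmt"

// staleWindow returns the worst-case number of seconds during which a
// dead node's pod IPs remain in the SVC endpoints, derived from the two
// kube-controller-manager flags (upstream defaults shown in main).
func staleWindow(nodeMonitorGracePeriod, podEvictionTimeout int) int {
	return nodeMonitorGracePeriod + podEvictionTimeout
}

// failChance is the per-query chance of hitting the dead replica when
// one of n coredns replicas was on the powered-off node.
func failChance(n int) float64 {
	return 1.0 / float64(n)
}

func main() {
	// --node-monitor-grace-period=40s, --pod-eviction-timeout=5m0s
	fmt.Printf("stale-endpoint window: ~%ds\n", staleWindow(40, 300))
	fmt.Printf("per-query failure odds: %.1f%%\n", 100*failChance(3))
}
```

With the defaults this gives a window of roughly 340s during which about one lookup in three fails.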

Environment

This actually has nothing to do with the K8S version, since the SVC and eviction behavior is the same everywhere. I did tune every parameter related to how fast a node reports its own status, down to the point where pods were evicted within 20s, yet within those 20s lookups could still fail. (With the intervals tuned that aggressively, a new bug appeared: a pod selected by an svc was still running, but the kubelet failed to update its own status in time, so kube-controller-manager patched the pod's status to not-true and the svc endpoint disappeared; reverting the intervals made that bug go away.) I also asked around in community groups, and it seems almost nobody has ever run this kind of shutdown test, presumably because everyone is on public cloud these days...

$ kubectl version -o json
{
  "clientVersion": {
    "major": "1",
    "minor": "15",
    "gitVersion": "v1.15.5",
    "gitCommit": "20c265fef0741dd71a66480e35bd69f18351daea",
    "gitTreeState": "clean",
    "buildDate": "2019-10-15T19:16:51Z",
    "goVersion": "go1.12.10",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "serverVersion": {
    "major": "1",
    "minor": "15",
    "gitVersion": "v1.15.5",
    "gitCommit": "20c265fef0741dd71a66480e35bd69f18351daea",
    "gitTreeState": "clean",
    "buildDate": "2019-10-15T19:07:57Z",
    "goVersion": "go1.12.10",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}

Troubleshooting

Does node-local-dns really work?

The obvious first choice is the node-local-dns approach; search for it and you will find plenty of writeups. In short, you run a hostNetwork node-cache process on every node as a caching proxy, and use a dummy interface plus NAT rules to intercept the DNS requests headed for the kube-dns SVC IP and serve them from cache.

In the official yaml, __PILLAR__LOCAL__DNS__ and __PILLAR__DNS__SERVER__ need to be replaced with the dummy interface IP and the kube-dns SVC IP, and __PILLAR__DNS__DOMAIN__ should be changed per the docs. The remaining variables are substituted at startup, which you can verify in the logs.

In practice it still failed, so I walked through the flow again. The yaml contains this SVC plus the node-cache startup args:

apiVersion: v1
kind: Service
metadata:
  name: kube-dns-upstream
  namespace: kube-system
...
spec:
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns
...
args: [ ..., "-upstreamsvc", "kube-dns-upstream" ]

The startup logs show the rendered config file:

cluster1.local:53 {
    errors
    reload
    bind 169.254.20.10 172.26.0.2
    forward . 172.26.189.136 {
        force_tcp
    }
    prometheus :9253
    health 169.254.20.10:8080
}
in-addr.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.26.0.2
    forward . 172.26.189.136 {
        force_tcp
    }
    prometheus :9253
}
ip6.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.26.0.2
    forward . 172.26.189.136 {
        force_tcp
    }
    prometheus :9253
}
.:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 172.26.0.2
    forward . /etc/resolv.conf
    prometheus :9253
}

Because NAT is used to hook requests to the kube-dns SVC IP (172.26.0.2) while node-cache itself still needs to reach kube-dns, the yaml creates a second SVC with the same selector as kube-dns and passes its name in the startup args; as shown above, node-cache forwards to that SVC's IP. Since enableServiceLinks is enabled by default, the pod gets environment variables like these:

$ docker exec dfa env | grep KUBE_DNS_UPSTREAM_SERVICE_HOST
KUBE_DNS_UPSTREAM_SERVICE_HOST=172.26.189.136

In the code you can see it simply converts the - in the SVC name to _, upper-cases it, and reads that env var to obtain the SVC IP when rendering the config file:

func toSvcEnv(svcName string) string {
	envName := strings.Replace(svcName, "-", "_", -1)
	return "$" + strings.ToUpper(envName) + "_SERVICE_HOST"
}
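As a quick sanity check, dropping that function into a standalone program shows the mapping from the flag value to the env var name (the main wrapper is mine; the function body is verbatim from node-cache):

```go
package main

import (
	"fmt"
	"strings"
)

// Verbatim from node-cache: turn an svc name into the
// $<NAME>_SERVICE_HOST env var reference that gets rendered
// into the Corefile.
func toSvcEnv(svcName string) string {
	envName := strings.Replace(svcName, "-", "_", -1)
	return "$" + strings.ToUpper(envName) + "_SERVICE_HOST"
}

func main() {
	fmt.Println(toSvcEnv("kube-dns-upstream"))
	// → $KUBE_DNS_UPSTREAM_SERVICE_HOST, which enableServiceLinks
	// populates with the SVC's ClusterIP (172.26.189.136 here).
}
```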

So under the default config, the cluster1.local:53 zone still forwards to an SVC, and the problem remains.

The only fundamental fix is to bypass the SVC entirely. I switched coredns to port 153 with hostNetwork: true and pinned it to the three masters with a nodeSelector. The config file then became:

cluster1.local:53 {
    errors
    reload
    bind 169.254.20.10 172.26.0.2
    forward . 10.11.86.107:153 10.11.86.108:153 10.11.86.109:153 {
        force_tcp
    }
    prometheus :9253
    health 169.254.20.10:8080
}
...

Testing again, lookups could still occasionally fail. I remembered that 米开朗基杨 had shared dnsredir, a coredns plugin with failover support, so I tried compiling it in.

After building it per the docs, the binary could not parse the config file, because node-cache is not plain coredns with extra plugins bolted on; it is its own codebase that imports coredns's built-in plugins.

The details are in this issue: include coredns plugin at node-cache don't work expect

The bind plugin in the official node-cache is what provides the dummy interface and the iptables NAT part. That feature appealed to me, so I decided to keep trying to get it working.

An unexpected find

While I was testing the dnsredir plugin, 米开朗基杨 asked me to try a minimal config section to check for interference, so I switched back and forth between these two configs:


Corefile: |
  cluster1.local:53 {
      errors
      reload
      dnsredir . {
          to 10.11.86.107:153 10.11.86.108:153 10.11.86.109:153
          max_fails 1
          health_check 1s
          spray
      }
      #forward . 10.11.86.107:153 10.11.86.108:153 10.11.86.109:153 {
      #    max_fails 1
      #    policy round_robin
      #    health_check 0.4s
      #}
      prometheus :9253
      health 169.254.20.10:8080
  }
#----------
Corefile: |
  cluster1.local:53 {
      errors
      reload
      #dnsredir . {
      #    to 10.11.86.107:153 10.11.86.108:153 10.11.86.109:153
      #    max_fails 1
      #    health_check 1s
      #    spray
      #}
      forward . 10.11.86.107:153 10.11.86.108:153 10.11.86.109:153 {
          max_fails 1
          policy round_robin
          health_check 0.4s
      }
      prometheus :9253
      health 169.254.20.10:8080
  }

And then, surprisingly, lookups no longer failed at all:

$ function d(){ while :;do sleep 0.2; date;dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short; done; }
$ d
Tue Feb  2 12:54:43 CST 2021
172.26.158.130
Tue Feb  2 12:54:44 CST 2021
172.26.158.130
Tue Feb  2 12:54:44 CST 2021
172.26.158.130
Tue Feb  2 12:54:44 CST 2021 <--- a master was shut down at this moment
172.26.158.130
Tue Feb  2 12:54:45 CST 2021
172.26.158.130
Tue Feb  2 12:54:47 CST 2021
172.26.158.130
Tue Feb  2 12:54:48 CST 2021
172.26.158.130
Tue Feb  2 12:54:48 CST 2021
172.26.158.130
Tue Feb  2 12:54:48 CST 2021
172.26.158.130
Tue Feb  2 12:54:51 CST 2021
172.26.158.130
Tue Feb  2 12:54:51 CST 2021
172.26.158.130
Tue Feb  2 12:54:52 CST 2021
172.26.158.130

At that point I dropped the dnsredir experiments and had a colleague test it: no problems. He then asked me to apply the change to another environment and retest, and there the failures came back:

$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @172.26.0.2 account-gateway +short
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short

After many more rounds with minimal zone configs, the comparison pointed to reverse resolution: with reverse lookups disabled there was no problem at all. Comment out the following:

#in-addr.arpa:53 {
#    errors
#    cache 30
#    reload
#    loop
#    bind 169.254.20.10 172.26.0.2
#    forward . __PILLAR__CLUSTER__DNS__ {
#        force_tcp
#    }
#    prometheus :9253
#}
#ip6.arpa:53 {
#    errors
#    cache 30
#    reload
#    loop
#    bind 169.254.20.10 172.26.0.2
#    forward . __PILLAR__CLUSTER__DNS__ {
#        force_tcp
#    }
#    prometheus :9253
#}

With that change, shutting down any node hosting coredns during the resolution loop caused no failures:

$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
$ dig @172.26.0.2 account-gateway.default.svc.cluster1.local +short
172.26.158.124
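For context on how those reverse zones get involved at all: a PTR lookup for an IPv4 address is sent as a name under in-addr.arpa, which is exactly what the commented-out zone blocks used to match and forward. A minimal sketch of that name construction (my own helper, not code from node-cache):

```go
package main

import (
	"fmt"
	"strings"
)

// reverseName builds the in-addr.arpa query name for an IPv4
// address: the octets are reversed and suffixed with the
// reverse-lookup zone.
func reverseName(ip string) string {
	o := strings.Split(ip, ".")
	return fmt.Sprintf("%s.%s.%s.%s.in-addr.arpa.", o[3], o[2], o[1], o[0])
}

func main() {
	// A PTR query for this pod IP would land in the in-addr.arpa:53
	// zone block and be forwarded upstream.
	fmt.Println(reverseName("172.26.158.130"))
	// → 130.158.26.172.in-addr.arpa.
}
```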

The yaml, roughly

apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns-upstream
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNSUpstream"
spec:
  clusterIP: 172.26.0.3 # <---- pin it; you can also query this IP directly, bypassing node-cache, for testing
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 153
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 153
  selector:
    k8s-app: kube-dns
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  Corefile: |
    cluster1.local:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 169.254.20.10 172.26.0.2
        forward . 10.11.86.107:153 10.11.86.108:153 10.11.86.109:153 {
            force_tcp
            max_fails 1
            policy round_robin
            health_check 0.5s
        }
        prometheus :9253
        health 169.254.20.10:8070
    }
    #in-addr.arpa:53 {
    #    errors
    #    cache 30
    #    reload
    #    loop
    #    bind 169.254.20.10 172.26.0.2
    #    forward . __PILLAR__CLUSTER__DNS__ {
    #        force_tcp
    #    }
    #    prometheus :9253
    #}
    #ip6.arpa:53 {
    #    errors
    #    cache 30
    #    reload
    #    loop
    #    bind 169.254.20.10 172.26.0.2
    #    forward . __PILLAR__CLUSTER__DNS__ {
    #        force_tcp
    #    }
    #    prometheus :9253
    #}
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10 172.26.0.2
        forward . __PILLAR__UPSTREAM__SERVERS__
        prometheus :9253
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
      annotations:
        prometheus.io/port: "9253"
        prometheus.io/scrape: "true"
    spec:
      imagePullSecrets:
      - name: regcred
      priorityClassName: system-node-critical
      serviceAccountName: node-local-dns
      hostNetwork: true
      dnsPolicy: Default # Don't use cluster DNS.
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: node-cache
        image: xxx.lan:5000/k8s-dns-node-cache:1.16.0
        resources:
          requests:
            cpu: 25m
            memory: 10Mi
        args: [ "-localip", "169.254.20.10,172.26.0.2", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns-upstream", "-health-port","8070" ]
        securityContext:
          privileged: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9253
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            host: 169.254.20.10
            path: /health
            port: 8070
          initialDelaySeconds: 40
          timeoutSeconds: 3
        volumeMounts:
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - name: config-volume
          mountPath: /etc/coredns
        - name: kube-dns-config
          mountPath: /etc/kube-dns
      volumes:
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      - name: config-volume
        configMap:
          name: node-local-dns
          items:
          - key: Corefile
            path: Corefile.base
---
# A headless service is a service with a service IP but instead of load-balancing it will return the IPs of our associated Pods.
# We use this to expose metrics to Prometheus.
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "9253"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: node-local-dns
  name: node-local-dns
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - name: metrics
    port: 9253
    targetPort: 9253
  selector:
    k8s-app: node-local-dns
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: Reconcile
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: EnsureExists
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  Corefile: |
    .:153 {
        errors
        health :8180
        kubernetes cluster1.local. in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: k8s-app
                  operator: In
                  values:
                  - kube-dns
              topologyKey: kubernetes.io/hostname
      hostNetwork: true
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      nodeSelector:
        node-role.kubernetes.io/master: "true"
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      imagePullSecrets:
      - name: regcred
      containers:
      - name: coredns
        image: xxxx.lan:5000/coredns:1.7.1
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 270Mi
          requests:
            cpu: 100m
            memory: 150Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 153
          name: dns
          protocol: UDP
        - containerPort: 153
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8180
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
      dnsPolicy: Default
      volumes:
      - name: config-volume
        configMap:
          name: coredns
          items:
          - key: Corefile
            path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9153"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 172.26.0.2
  ports:
  - name: dns
    port: 53
    targetPort: 153
    protocol: UDP
  - name: dns-tcp
    port: 53
    targetPort: 153
    protocol: TCP

My own approach

Later, though, the CPU usage proved too high, so I decided to build my own solution. After many attempts, I ended up extracting the dummy-interface part of the source into a standalone tool (so the SVC IP does not need to change) and handling high availability by other means. The key part is replacing the nodelocaldns piece:

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
    spec:
      imagePullSecrets:
      - name: regcred
      priorityClassName: system-node-critical
      serviceAccountName: node-local-dns
      hostNetwork: true
      dnsPolicy: Default # Don't use cluster DNS.
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: dummy-tool
        #image: registry.aliyuncs.com/zhangguanzhang/dummy-tool:v0.1
        image: {{ docker_repo_url }}/dummy-tool:v0.1
        args:
        - -local-ip=169.254.20.10,172.26.0.2
        - -health-port=8070
        - -interface-name=nodelocaldns
        securityContext:
          privileged: true
        livenessProbe:
          httpGet:
            host: 169.254.20.10
            path: /health
            port: 8070
          initialDelaySeconds: 40
          timeoutSeconds: 3
      - name: dnsmasq
        #image: registry.aliyuncs.com/zhangguanzhang/dnsmasq:2.83
        image: {{ docker_repo_url }}/dnsmasq:2.83
        command:
        - dnsmasq
        - -d
        - --conf-file=/etc/dnsmasq/dnsmasq.conf
        resources:
          requests:
            cpu: 25m
            memory: 10Mi
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /etc/localtime
          name: host-localtime
        - name: config-volume
          mountPath: /etc/dnsmasq
      volumes:
      - name: config-volume
        configMap:
          name: node-local-dns
      - hostPath:
          path: /etc/localtime
        name: host-localtime
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  dnsmasq.conf: |
    no-resolv
    all-servers
    server=10.11.86.107#153
    server=10.11.86.108#153
    server=10.11.86.109#153
    #log-queries
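The all-servers line is what gives dnsmasq its resilience here: it forwards each query to every upstream in parallel and returns the first answer, so one powered-off master costs only a lost packet, not the whole lookup. A toy sketch of that race, with channels standing in for UDP sockets and made-up timings:

```go
package main

import (
	"fmt"
	"time"
)

// query simulates asking one upstream; a dead server never answers.
func query(name string, delay time.Duration, alive bool, out chan<- string) {
	if !alive {
		return // dead upstream: no reply, like a powered-off master
	}
	time.Sleep(delay)
	out <- name
}

// allServers mimics dnsmasq's all-servers mode: fan the query out to
// every upstream at once and return whichever replies first.
func allServers() string {
	out := make(chan string, 3)
	go query("10.11.86.107", 30*time.Millisecond, false, out) // shut down
	go query("10.11.86.108", 10*time.Millisecond, true, out)
	go query("10.11.86.109", 20*time.Millisecond, true, out)
	return <-out
}

func main() {
	fmt.Println("first answer from:", allServers())
}
```

With one upstream dead, the fastest live server still answers, which is exactly the failover behavior the forward/dnsredir experiments above were chasing.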

References
