
Consul used purely for service discovery, deployed statelessly on k8s

2019/10/24

Background

I had already written up deploying the consul servers directly on machines, but the DBAs still wanted them on k8s. Yesterday I searched for related articles, pulled out the relevant steps and verified them myself.
First, the requirements: three physical machines each run MySQL master/slave with MHA. A consul client on each machine uses service registration + TTL checks to register the databases as domain names with the consul servers, and the in-cluster CoreDNS is configured to forward *.service.consul to the consul servers. The architecture is shown below:

Physical server
+---------------------+ K8S Cluster
| | +--------------------------------+
|mysql + consul client+-----------+ | +---------------------+ |
| | | | | +-------------+ | |
+---------------------+ | | | |consul server| | |
Physical server | | | D +------+------+ | |
+---------------------+ | | | e | | |
| | | | | p +------+------+ | |
|mysql + consul client+----------->------>+ | l |consul server| | |
| | | | | o +------+------+ | |
+---------------------+ | | | y | | |
Physical server | | | +------+------+ | |
+---------------------+ | | | |consul server| | |
| | | | | +-------------+ | |
|mysql + consul client+-----------+ | +---------------------+ |
| | +--------------------------------+
+---------------------+

MaxScale then runs inside the k8s cluster; in-cluster services connect to MaxScale, and MaxScale connects to MySQL through the two domain names.
Consul ships with Kubernetes client code, so once it is granted the right RBAC it can use a label selector to automatically join the other members:

...
"retry_join": [
"provider=k8s label_selector=\"app=consul,component=server\""
],

Of course there are other solutions out there as well; I just chose not to use them here.

The environment is as follows:

IP role nodeName
172.19.0.5 k8s+server 172.19.0.5
172.19.0.6 k8s+server 172.19.0.6
172.19.0.7 k8s+server 172.19.0.7
172.19.0.11 client 172.19.0.11
172.19.0.12 client 172.19.0.12
172.19.0.13 client 172.19.0.13

Image preparation

Since the consul servers run on k8s and the consul clients run on the physical machines, I want their communication over TLS. The TLS material is imported as a Secret and the configuration file lives in a ConfigMap. The official Docker image's entrypoint.sh runs chown on the configuration directory, but these mounts are read-only inside the pod, so when the pod starts the chown fails and the container exits. I therefore reworked the official entrypoint script:

FROM consul:1.6.1
LABEL maintainer="zhangguanzhang <zhangguanzhang@qq.com>"

ARG TZ=Asia/Shanghai

RUN set -eux && \
ln -sf /usr/share/zoneinfo/${TZ} /etc/localtime && \
echo ${TZ} > /etc/timezone
COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh

The docker-entrypoint.sh is on my GitHub: https://raw.githubusercontent.com/zhangguanzhang/Dockerfile/master/consul-k8s-dns/docker-entrypoint.sh
For the Dockerfile I originally intended to modify the official one rather than build FROM the official image, but I ran into gpg key problems, so I built the image as shown above instead. See the issue https://github.com/hashicorp/docker-consul/issues/137
If you build the image yourself, remember to make docker-entrypoint.sh executable. The image I built is pushed to Docker Hub and can be pulled directly.
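
The gist of the change, as a simplified sketch rather than the exact script from the repo: only chown a directory when it is actually writable, so the read-only ConfigMap/Secret mounts no longer abort startup.

# Simplified sketch of the adjusted section of docker-entrypoint.sh.
# In k8s the config dir is a read-only configMap/secret mount, so skip chown there.
for dir in /consul/data /consul/config; do
  if [ -w "$dir" ]; then
    chown -R consul:consul "$dir"
  fi
done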

tls

All certificate operations are done inside a container:

mkdir -p consul/ssl
cd consul
docker run --rm -ti --entrypoint sh \
--workdir /root -v $PWD/ssl:/root/ zhangguanzhang/consul-k8s-dns:1.6.1

The command above drops us into the container.

step 1: create the CA

For simplicity I use the built-in TLS functionality of the Consul CLI to create a basic CA. You only need one CA per datacenter, and you should generate all certificates on the same server that was used to create the CA.
The CA is valid for five years by default and the other certificates for one year, so pass -days= here to set a longer validity:

consul tls ca create -days=36500
==> Saved consul-agent-ca.pem
==> Saved consul-agent-ca-key.pem

step2: create the server-role certificate

The default datacenter name here is dc1; set the other options as needed. Repeat this process on the same server that created the CA until every server has its own certificate. The command can be called repeatedly and will automatically increment the certificate and key numbers; you then distribute the certificates to the servers.
Strictly speaking each server should use its own certificate, so with three servers the command below should be run three times. But since we treat consul purely as a stateless service, per-server certificates would not work with a single Secret (the file names differ and cannot be mapped onto arbitrary pods), so here we generate only one:

consul tls cert create -server -dc=dc1 -days=36500
==> WARNING: Server Certificates grants authority to become a
server and access all state in the cluster including root keys
and all ACL tokens. Do not distribute them to production hosts
that are not server nodes. Store them as securely as CA keys.
==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
==> Saved dc1-server-consul-0.pem
==> Saved dc1-server-consul-0-key.pem

step3: create the client-role certificates

Starting with Consul 1.5.2 there is an alternative process that automatically distributes certificates to clients. To enable this new feature, configure auto_encrypt.

You can also keep generating certificates with consul tls cert create -client and distributing them manually; for datacenters that need stronger protection the existing workflow is still required.

If you are running Consul 1.5.1 or earlier, you need to create a separate certificate for every client with consul tls cert create -client. Client certificates are also signed by your CA, but they carry no special Subject Alternative Name, which means that with verify_server_hostname enabled they cannot be used to start an agent in the server role.

I am on a version newer than 1.5.2, so in theory each client does not need its own certificate: a client that only has consul-agent-ca.pem should automatically fetch a certificate from the server and keep it in memory without persisting it. In my tests this did not work, however, so I generated client certificates anyway.
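
For reference, enabling the auto_encrypt path only needs two extra settings: "auto_encrypt": {"allow_tls": true} on the servers and "auto_encrypt": {"tls": true} on the clients, which then only need consul-agent-ca.pem. A minimal client-side sketch with illustrative paths, not used further in this post:

# Hypothetical client-side fragment enabling auto_encrypt (Consul >= 1.5.2);
# the paths are illustrative and should match your client's actual layout.
cat > /etc/consul.d/client/auto-encrypt.json <<'EOF'
{
  "auto_encrypt": {
    "tls": true
  },
  "ca_file": "/etc/consul.d/ssl/consul-agent-ca.pem",
  "verify_outgoing": true
}
EOF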
Since there are actually three consul clients, the command below is run three times:

$ consul tls cert create -client -dc=dc1 -days=36500
==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
==> Saved dc1-client-consul-0.pem
==> Saved dc1-client-consul-0-key.pem
$ consul tls cert create -client -dc=dc1 -days=36500
==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
==> Saved dc1-client-consul-1.pem
==> Saved dc1-client-consul-1-key.pem
$ consul tls cert create -client -dc=dc1 -days=36500
==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
==> Saved dc1-client-consul-2.pem
==> Saved dc1-client-consul-2-key.pem

The clients run as real services on the physical machines, so I generate a separate certificate for each instead of sharing one (in theory a single shared certificate would also work; feel free to try it).

step4: create the cli certificate

$ consul tls cert create -cli -dc=dc1 -days=36500
==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
==> Saved dc1-cli-consul-0.pem
==> Saved dc1-cli-consul-0-key.pem

Once the certificates are created, press ctrl+d to exit the container.

step5: check the certificates

$ ls -l ssl
total 40
-rw-r--r-- 1 root root 227 Mar 8 13:47 consul-agent-ca-key.pem
-rw-r--r-- 1 root root 1253 Mar 8 13:47 consul-agent-ca.pem
-rw-r--r-- 1 root root 227 Mar 8 13:50 dc1-cli-consul-0-key.pem
-rw-r--r-- 1 root root 1082 Mar 8 13:50 dc1-cli-consul-0.pem
-rw-r--r-- 1 root root 227 Mar 8 13:48 dc1-client-consul-0-key.pem
-rw-r--r-- 1 root root 1143 Mar 8 13:48 dc1-client-consul-0.pem
-rw-r--r-- 1 root root 227 Mar 8 13:49 dc1-client-consul-1-key.pem
-rw-r--r-- 1 root root 1143 Mar 8 13:49 dc1-client-consul-1.pem
-rw-r--r-- 1 root root 227 Mar 8 13:49 dc1-client-consul-2-key.pem
-rw-r--r-- 1 root root 1143 Mar 8 13:49 dc1-client-consul-2.pem
-rw-r--r-- 1 root root 227 Mar 8 13:47 dc1-server-consul-0-key.pem
-rw-r--r-- 1 root root 1143 Mar 8 13:47 dc1-server-consul-0.pem

consul deploy

Consul supports multiple datacenters, but my attempt at the WAN-related configuration (simulated with hostPort) did not succeed, and for high availability we basically want one server per machine anyway. So I use a hard pod anti-affinity plus hostNetwork, and disable the serf WAN port.

Before applying, import the certificates as a Secret:

cd ssl
kubectl create secret generic consul \
--from-file=consul-agent-ca.pem \
--from-file=dc1-server-consul-0.pem \
--from-file=dc1-server-consul-0-key.pem \
--from-file=dc1-cli-consul-0.pem \
--from-file=dc1-cli-consul-0-key.pem

server part – on k8s

A few things to note:

  • Every pod restart effectively means a new node-id, so leave_on_terminate should be false, see https://github.com/hashicorp/consul/issues/6672 and https://github.com/hashicorp/consul/issues/3938 . That also makes the preStop hook pretty much useless; just ignore it.
  • CoreDNS forwarding has to point at IPs (or a resolv file), not a domain name, so we need to pin the clusterIP of the consul-server svc (pick an unused IP inside the Service CIDR; see the sketch after this list), or skip the svc entirely and use the hostIPs directly. The example here goes with the svc.
  • disable_host_node_id should be set to false. Under hostNetwork the pod's hostname is the host's hostname, and since this option defaults to true the node-id is randomized on every start, which floods the log with errors like 2020/04/21 04:07:16 [WARN] consul.fsm: EnsureRegistration failed: failed inserting node: Error while renaming Node ID: "6e68f174-1e65-b660-b407-51b7d9a4e22d": Node name 100.64.24.3 is reserved by node 7a9ab787-3593-52c2-81c1-87cf65fabbcf with name 100.64.24.3 (100.64.24.3)
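
A quick sanity check before pinning the clusterIP; a minimal sketch assuming kubectl access and the 10.96.0.11 address used later in this post:

# Verify that the IP picked for the consul-server svc is not already allocated to another Service.
kubectl get svc -A -o jsonpath='{.items[*].spec.clusterIP}' | tr ' ' '\n' \
  | grep -wq 10.96.0.11 \
  && echo "10.96.0.11 is already in use, pick another IP" \
  || echo "10.96.0.11 is free"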

The masters are also k8s nodes. If you have more than three nodes in total, label the target nodes yourself and pin the pods with the nodeSelector; if you only have three machines (master and node combined), remove the nodeSelector from the yaml below.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: consul-server
spec:
  selector:
    matchLabels:
      app: consul
      component: server
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: consul
        component: server
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - consul
              topologyKey: kubernetes.io/hostname
      serviceAccountName: consul-server
      nodeSelector:
        master: "true" # drop the nodeSelector if you only have three combined master+node machines
      terminationGracePeriodSeconds: 7
      hostNetwork: true
      securityContext:
        fsGroup: 1000
      # imagePullSecrets:
      #   - name: harbor-local
      containers:
        - name: consul
          image: zhangguanzhang/consul-k8s-dns:1.6.1
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: NODE
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: CONSUL_HTTP_ADDR
              value: https://localhost:8501
            - name: CONSUL_CACERT
              value: /consul/config/ssl/consul-agent-ca.pem
            - name: CONSUL_CLIENT_CERT
              value: /consul/config/ssl/dc1-cli-consul-0.pem
            - name: CONSUL_CLIENT_KEY
              value: /consul/config/ssl/dc1-cli-consul-0-key.pem
            # - name: RETRY_JOIN
            #   value: 172.19.0.5,172.19.0.6,172.19.0.7
          args:
            - agent
            - -advertise=$(POD_IP)
            - -node=$(NODE)
          volumeMounts:
            - name: localtime
              mountPath: /etc/localtime
            - name: data
              mountPath: /consul/data
            - name: config
              mountPath: /consul/config/
            - name: tls
              mountPath: /consul/config/ssl/
          lifecycle:
            preStop:
              exec:
                command:
                  - consul leave
          readinessProbe:
            exec:
              command:
                - consul
                - members
            failureThreshold: 2
            initialDelaySeconds: 10
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 3
          ports:
            - containerPort: 8300
              name: server
            - containerPort: 8301
              name: serflan
            # - containerPort: 8302
            #   name: serfwan
            - containerPort: 8400
              name: alt-port
            - containerPort: 8501
              name: https
            - containerPort: 8600
              name: dns-udp
              protocol: UDP
            - containerPort: 8600
              name: dns-tcp
              protocol: TCP
      volumes:
        - name: config
          configMap:
            name: consul-server
        - name: tls
          secret:
            secretName: consul
        - name: localtime
          hostPath:
            path: /etc/localtime
        - name: data
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: consul-server
  namespace: default
  labels:
    app: consul
spec:
  clusterIP: 10.96.0.11
  ports:
    - name: https
      port: 8501
      targetPort: https
    - name: serflan-tcp
      protocol: "TCP"
      port: 8301
      targetPort: 8301
    - name: serflan-udp
      protocol: "UDP"
      port: 8301
      targetPort: 8301
    # - name: serfwan-tcp
    #   protocol: "TCP"
    #   port: 8302
    #   targetPort: 8302
    # - name: serfwan-udp
    #   protocol: "UDP"
    #   port: 8302
    #   targetPort: 8302
    - name: server
      port: 8300
      targetPort: 8300
    - name: dns-tcp
      protocol: "TCP"
      port: 8600
      targetPort: dns-tcp
    - name: dns-udp
      protocol: "UDP"
      port: 8600
      targetPort: dns-udp
  selector:
    app: consul
    component: server
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: consul-server
  namespace: default
  labels:
    app: consul
    role: server
data:
  server.json: |-
    {
      "client_addr": "0.0.0.0",
      "datacenter": "dc1",
      "bootstrap_expect": 3,
      "domain": "consul",
      "skip_leave_on_interrupt": true,
      "leave_on_terminate" : false,
      "log_level": "INFO",
      "retry_join": [
        "provider=k8s label_selector=\"app=consul,component=server\""
      ],
      "retry_interval": "2s",
      "verify_incoming": true,
      "verify_outgoing": true,
      "verify_server_hostname": true,
      "ca_file": "/consul/config/ssl/consul-agent-ca.pem",
      "cert_file": "/consul/config/ssl/dc1-server-consul-0.pem",
      "key_file": "/consul/config/ssl/dc1-server-consul-0-key.pem",
      "disable_host_node_id": false,
      "ports": {
        "http": -1,
        "serf_wan": -1,
        "https": 8501
      },
      "server": true,
      "ui": false
    }
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: consul-server
  labels:
    app: consul
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: consul
  labels:
    app: consul
rules:
  - apiGroups: [""]
    resources:
      - pods
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: consul
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: consul
subjects:
  - kind: ServiceAccount
    name: consul-server
    namespace: default

Make sure the nodeSelector matches your environment, then apply:

$ kubectl get pod -o wide -l app=consul
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
consul-server-7879498d78-8qjbs 1/1 Running 0 2m34s 172.19.0.6 172.19.0.6 <none> <none>
consul-server-7879498d78-drm7n 1/1 Running 0 2m34s 172.19.0.7 172.19.0.7 <none> <none>
consul-server-7879498d78-zjc6z 1/1 Running 0 2m34s 172.19.0.5 172.19.0.5 <none> <none>

client part

The clients are real services on the physical machines, deployed with my ansible playbook.

Download the ansible deployment files

If you don't have ansible installed, follow the README of my repo below to install it, then clone the repo under the consul directory:

cd ..
git clone https://github.com/zhangguanzhang/consul-tls-ansible
mv consul-tls-ansible/* ssl/
rm -rf consul-tls-ansible
cd ssl
docker run --rm -tid --name tempconsul zhangguanzhang/consul-k8s-dns:1.6.1 sleep 20
docker cp tempconsul:/bin/consul .

Configure inventory/hosts yourself. The other roles in my playbook are not used here; the server section just needs the IPs and is mainly used to render which addresses the clients join. The hostnames and base parameters on the MySQL machines had already been set up by a colleague, so the hostname=xxx entries in the client section have to be removed. The final file looks like this:

$ cat inventory/hosts
[server]
172.19.0.5
172.19.0.6
172.19.0.7
[client]
172.19.0.11 clusterName=172.19.0.11
172.19.0.12 clusterName=172.19.0.12
172.19.0.13 clusterName=172.19.0.13

Also change the ansible ssh password in group_vars/all.yml.
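
The exact variable name depends on the playbook, but with stock Ansible the connection password would be set roughly like this (illustrative sketch only):

# Illustrative: ansible_ssh_pass is the stock Ansible connection-password variable;
# the playbook's own variable name may differ, so adjust or edit the existing entry in place.
cat >> group_vars/all.yml <<'EOF'
ansible_ssh_pass: 'your-ssh-password'
EOF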
Once that is configured, test connectivity:

ansible client -m ping

Deploy the clients

ansible-playbook 04-client.yml
# remove the consul binary from the playbook directory
rm -f consul

Check the members on one of the client machines:

$ consul members
Node Address Status Type Build Protocol DC Segment
172.19.0.5 172.19.0.5:8301 alive server 1.6.1 2 dc1 <all>
172.19.0.6 172.19.0.6:8301 alive server 1.6.1 2 dc1 <all>
172.19.0.7 172.19.0.7:8301 alive server 1.6.1 2 dc1 <all>
172.19.0.11 172.19.0.11:8301 alive client 1.6.1 2 dc1 <default>
172.19.0.12 172.19.0.12:8301 alive client 1.6.1 2 dc1 <default>
172.19.0.13 172.19.0.13:8301 alive client 1.6.1 2 dc1 <default>

Likewise, we can check from inside a pod:

$ kubectl exec consul-server-7879498d78-8qjbs consul members
Node Address Status Type Build Protocol DC Segment
172.19.0.5 172.19.0.5:8301 alive server 1.6.1 2 dc1 <all>
172.19.0.6 172.19.0.6:8301 alive server 1.6.1 2 dc1 <all>
172.19.0.7 172.19.0.7:8301 alive server 1.6.1 2 dc1 <all>
172.19.0.11 172.19.0.11:8301 alive client 1.6.1 2 dc1 <default>
172.19.0.12 172.19.0.12:8301 alive client 1.6.1 2 dc1 <default>
172.19.0.13 172.19.0.13:8301 alive client 1.6.1 2 dc1 <default>

Configure service discovery

Service registration is configured on the clients. My playbook uses a standardized directory layout with per-service config files, so for service discovery you just drop a json file under /etc/consul.d/client/. Here is an example on 172.19.0.11:

{
  "services": [
    {
      "name": "r-3306-mysql",
      "tags": [
        "slave-3306"
      ],
      "address": "172.19.0.11",
      "port": 3306,
      "checks": [
        {
          "args": ["echo"], // this should point at an actual health-check script; remove this comment in the real file
          "interval": "5s"
        }
      ]
    }
  ]
}
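
For illustration only, a health-check script that could replace the "echo" placeholder above. The path and credentials are hypothetical, and script checks like this also require enable_local_script_checks (or enable_script_checks) to be turned on in the agent configuration:

#!/bin/sh
# Hypothetical /etc/consul.d/check_mysql_3306.sh, referenced from the service file as
#   "args": ["/etc/consul.d/check_mysql_3306.sh"]
# Consul's convention: exit 0 = passing, 1 = warning, anything else = critical.
mysql -h 127.0.0.1 -P 3306 -u monitor -p'******' -N -e 'SELECT 1' >/dev/null 2>&1 || exit 2
exit 0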

Then restart consul:

systemctl restart consul

You will see the service get registered:

2019/10/24 13:32:12 [INFO] agent: (LAN) joined: 3
2019/10/24 13:32:12 [INFO] agent: Join LAN completed. Synced with 3 initial agents
2019/10/24 13:32:12 [INFO] agent: Synced service "r-3306-mysql"
2019/10/24 13:32:21 [INFO] agent: Synced check "service:r-3306-mysql"

Test resolution (remember to install bind-utils):
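
If dig is missing, install it first; a one-liner assuming a CentOS/RHEL host (on Debian/Ubuntu the package is dnsutils):

yum install -y bind-utils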

$ dig -p 8600 @172.19.0.11 r-3306-mysql.service.consul +short
172.19.0.11
$ dig -p 8600 @172.19.0.5 r-3306-mysql.service.consul +short
172.19.0.11

Configure k8s DNS forwarding

$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
consul-server ClusterIP 10.96.0.11 <none> 8501/TCP,8301/TCP,8301/UDP,8300/TCP,8600/TCP,8600/UDP 2m48s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d22h

On a k8s node, test DNS resolution directly against the svc:

$ dig -p 8600 @10.96.0.11 r-3306-mysql.service.consul +short
172.19.0.11

Configure CoreDNS forwarding by adding the content below to the coredns ConfigMap. You can also add the client IPs as upstreams. Do not copy my IPs verbatim; fill in your own.
kubectl -n kube-system edit cm coredns

...
    } # copy starting from the next line
    service.consul:53 {
        errors
        cache 1
        forward . 10.96.0.11:8600 172.19.0.11:8600 172.19.0.12:8600 172.19.0.13:8600 {
            max_fails 1
        }
    }

Run a tool pod pinned to a version whose DNS lookups work properly, or just use dig @10.96.0.10 +short w-3306-mysql.service.consul:

$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: library/busybox:1.28
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF

Once the pod above is ready, test resolution; the cluster's internal DNS is not affected:

$ kubectl exec -ti busybox -- nslookup r-3306-mysql.service.consul
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name: r-3306-mysql.service.consul
Address 1: 172.19.0.11
$ kubectl exec -ti busybox -- nslookup kubernetes
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name: kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

maxscale

MaxScale runs inside the cluster. The DBAs want the logs mounted directly on the host (hence -l stdout is removed from the cmd). MaxScale refuses to run as root, and the maxscale user inside the container has uid 999, so on each of the three nodes /var/log/maxscale/ is created and chowned to 999:999, as shown below.
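
A minimal sketch of that host-side preparation (run on each of the three nodes, or push it out with your configuration management of choice):

# Create the host log directory and hand it to uid/gid 999 (the maxscale user in the image).
mkdir -p /var/log/maxscale
chown 999:999 /var/log/maxscale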
The yaml follows, with the passwords masked. I tweaked the official Dockerfile to add curl and jq, since the liveness check may later be switched to curl against the REST API. If the svc below needs to be exposed externally, add a NodePort yourself.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: maxscale
spec:
  selector:
    matchLabels:
      app: maxscale
      component: production
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: maxscale
        component: production
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - maxscale
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 5
      securityContext:
        fsGroup: 999
      # imagePullSecrets:
      #   - name: harbor-local
      containers:
        - name: maxscale
          image: zhangguanzhang/maxscale:2.4.7
          args:
            - maxscale
            - -d
            - -U
            - maxscale
            - -f
            - /etc/maxscale/maxscale.cnf
          volumeMounts:
            - name: localtime
              mountPath: /etc/localtime
            - name: log
              mountPath: /var/log/maxscale
            - name: maxscale-cnf
              mountPath: /etc/maxscale
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 3
            periodSeconds: 3
            successThreshold: 1
            tcpSocket:
              port: mysql
            timeoutSeconds: 1
          readinessProbe:
            failureThreshold: 3
            initialDelaySeconds: 3
            periodSeconds: 3
            successThreshold: 1
            tcpSocket:
              port: mysql
            timeoutSeconds: 1
          ports:
            - name: mysql
              containerPort: 3306
            # - name: rest-api
            #   containerPort: 8989
          resources:
            limits:
              cpu: "8"
              memory: 2Gi
            requests:
              cpu: "2"
              memory: 500Mi
      volumes:
        - name: localtime
          hostPath:
            path: /etc/localtime
        - name: log
          hostPath:
            path: /var/log/maxscale
        - name: maxscale-cnf
          configMap:
            name: maxscale-config
            defaultMode: 420
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
  namespace: default
  labels:
    app: mysql-service
spec:
  ports:
    - name: mysql
      port: 3306
      targetPort: mysql
    # - name: rest-api
    #   port: 8989
    #   targetPort: rest-api
  selector:
    app: maxscale
    component: production
---
# apiVersion: v1
# kind: Service
# metadata:
#   name: uca-mysql
#   namespace: default
#   labels:
#     app: mysql-service
# spec:
#   type: NodePort
#   ports:
#     - name: mysql
#       port: 3306
#       nodePort: 3306
#       targetPort: mysql
#     # - name: rest-api
#     #   port: 8989
#     #   targetPort: rest-api
#   selector:
#     app: maxscale
#     component: production
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: maxscale-config
  namespace: default
  labels:
    app: maxscale
    role: production
data:
  maxscale.cnf: |-
    # MaxScale documentation on GitHub:
    # https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Documentation-Contents.md
    # Global parameters
    #
    # Complete list of configuration options:
    # https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Getting-Started/Configuration-Guide.md
    [maxscale]
    threads=16
    ms_timestamp=1
    syslog=0
    maxlog=1
    log_warning=1
    log_notice=1
    log_info=0
    log_debug=0
    log_augmentation=1

    users_refresh_time=1

    #logdir=/data/maxscale/log/
    #datadir=/data/maxscale/data/
    #cachedir=/data/maxscale/cache/
    #piddir=/data/maxscale/pid/


    [server1]
    type=server
    address=w-3306-mysql.service.consul
    port=3306
    protocol=MySQLBackend

    [server2]
    type=server
    address=r-3306-mysql.service.consul
    port=3306
    protocol=MySQLBackend


    [MySQL-Monitor]
    type=monitor
    module=mysqlmon
    servers=server1,server2
    user=maxscale
    password=******
    monitor_interval=10000

    [Read-Write-Service]
    type=service
    router=readwritesplit
    servers=server1,server2
    user=maxscale
    password=******
    max_slave_connections=100%

    [MaxAdmin-Service]
    type=service
    router=cli

    [Read-Write-Listener]
    type=listener
    service=Read-Write-Service
    protocol=MySQLClient
    port=3306

    [MaxAdmin-Listener]
    type=listener
    service=MaxAdmin-Service
    protocol=maxscaled
    socket=default
---

The passwords have been masked.

$ kubectl get pod -o wide -l app=maxscale
maxscale-645864d9cb-dcx77 1/1 Running 0 67s 10.244.0.38 172.19.0.2 <none> <none>
$ kubectl exec maxscale-645864d9cb-dcx77 maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
server1 | w-3306-mysql.service.consul | 3306 | 0 | Master, Running
server2 | r-3306-mysql.service.consul | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
$ kubectl get svc -l app=mysql-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
mysql-service ClusterIP 10.105.107.173 <none> 3306/TCP 10m

Test connectivity; below I use a mysql client installed on the host:

$ mysql -u moove -p -h 10.105.107.173
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 77
Server version: 5.7.26-log MySQL Community Server (GPL)

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Cleanup and backup

Archive the consul directory and send copies to the server or client machines; three copies won't all be lost at the same time:

$ kubectl delete pod busybox
$ tar zcf consul-deploy.tar.gz consul/

Copy the backup wherever suits you; here I copy it to the client machines:

cd consul/ssl
ansible client -m file -a 'name=/opt/consul/ state=directory'
ansible client -m copy -a 'src=../../consul-deploy.tar.gz dest=/opt/consul/'

References

https://www.consul.io/docs/platform/k8s/run.html
https://github.com/kelseyhightower/consul-on-kubernetes
https://hub.helm.sh/charts/appuio/maxscale
