CRD
|
|
| defines/creates
|
|
↓
Custom resource type (the kind) ----------------> watched and controlled by a custom controller
|
|
| defines/creates
|
|
↓
One concrete resource (a CR instance)
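Features implemented:
1. One master with multiple slaves, with automatic backup using GTID
2. Automatic master/slave election and failover
3. Online scale-out; when replicas are missing they are brought back up automatically
4. Readiness probe checks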
5. .........
I. Prepare the Go environment
wget https://golang.google.cn/dl/go1.22.5.linux-amd64.tar.gz
tar zxvf go1.22.5.linux-amd64.tar.gz
mv go /usr/local/
cat >> /etc/profile << 'EOF'
export GOROOT=/usr/local/go
export PATH=$PATH:$GOROOT/bin
EOF
source /etc/profile
go version # confirm the installation took effect
# Set the Go module proxy
# 1. A CDN-accelerated proxy can be used
export GOPROXY=https://goproxy.cn,direct
go env -w GOPROXY=https://goproxy.cn,direct
II. Install the kubebuilder framework
# 1. Download kubebuilder (v4.1.1 here; if the download is slow, fetch it manually and upload it to the host)
wget https://github.com/kubernetes-sigs/kubebuilder/releases/download/v4.1.1/kubebuilder_linux_amd64
mv kubebuilder_linux_amd64 kubebuilder && chmod +x kubebuilder && mv kubebuilder /usr/local/bin/
$ kubebuilder version
III. Initialize the project
# Create the project
mkdir -p /src/application-operator
cd /src/application-operator
go mod init application-operator
kubebuilder init --domain=egonlin.com --owner egonlin
# Create the API
$ kubebuilder create api --group apps --version v1 --kind Application # the kind must start with an uppercase letter
Create Resource [y/n]
y
Create Controller [y/n]
y
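# --kind Application sets the name of the resource type you are creating; note that the first letter must be uppercase
# Or pull the project directly from:
https://gitee.com/axzys/mysqlcluster-operator/tree/slave/
IV. Test locally first
# Step 1: edit utils.go
# 1. At the top of the file, add the import "k8s.io/client-go/tools/clientcmd"
and remove the import "k8s.io/client-go/rest"
# 2. Modify the execCommandOnPod method
config, err := clientcmd.BuildConfigFromFlags("", KubeConfigPath) // uncomment this line
// config, err := rest.InClusterConfig() // comment this line out
# Step 2: edit mysqlcluster_controller.go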
const (
......
KubeConfigPath = "/root/.kube/config" // uncomment this line
......
)
# Also make sure /root/.kube/config exists on the host
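Because the controller reads the kubeconfig path at startup when run locally, a quick pre-flight check before `make run` can save a confusing failure. A minimal sketch; `check_kubeconfig` is a hypothetical helper name introduced here, and the default path is simply the one used for KubeConfigPath above.

```shell
# Hypothetical helper: verify that a kubeconfig file exists and is readable.
# Defaults to the path that KubeConfigPath points at in this article.
check_kubeconfig() {
  local path="${1:-/root/.kube/config}"
  if [ ! -r "$path" ]; then
    echo "kubeconfig not readable: $path" >&2
    return 1
  fi
  echo "kubeconfig found: $path"
}

# Usage: check_kubeconfig && make run
```

If the file is missing, copying the cluster's admin kubeconfig to that path (or pointing KubeConfigPath elsewhere) is the usual remedy.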
# Test YAML
apiVersion: apps.egonlin.com/v1
kind: MysqlCluster
metadata:
  name: mysqlcluster-sample
  labels:
    app.kubernetes.io/name: mysql-operator
    app.kubernetes.io/managed-by: kustomize
spec:
  image: registry.cn-shanghai.aliyuncs.com/egon-k8s-test/mysql:5.7
  replicas: 4
  masterService: master-service
  slaveService: slave-service
  storage:
    storageClassName: "local-path"
    size: 1Gi
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1"
      memory: "2Gi"
  livenessProbe:
    initialDelaySeconds: 30
    timeoutSeconds: 5
    tcpSocket:
      port: 3306
First run make install
Then run make run
Then create the test resource from the YAML above
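Between `make install` and `make run` it can help to confirm that the CRD was actually registered. A minimal sketch, assuming the CRD name `mysqlclusters.apps.egonlin.com` (the name that appears in the kubectl output later in this article); `crd_installed` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the named CRD is registered in the cluster.
# The default name is taken from the kubectl output shown later in this article.
crd_installed() {
  local crd="${1:-mysqlclusters.apps.egonlin.com}"
  if kubectl get crd "$crd" >/dev/null 2>&1; then
    echo "CRD present: $crd"
  else
    echo "CRD missing: $crd (did make install succeed?)" >&2
    return 1
  fi
}

# Usage: make install && crd_installed && make run
```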
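Once the test confirms everything works, the controller can be moved into k8s.
V. Deploy the controller as a container
To deploy inside k8s, first revert the local-testing changes made above.
# The FROM images in the Dockerfile cannot be pulled; replace them with mirrors you can reach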
$ vi Dockerfile
# FROM golang:1.22 AS builder
FROM registry.cn-hangzhou.aliyuncs.com/egon-k8s-test/golang:1.22 AS builder
#FROM gcr.io/distroless/static:nonroot
FROM registry.cn-shanghai.aliyuncs.com/egon-k8s-test/static:nonroot
# The build also runs go mod download, which is very slow from the default overseas source, so set the proxy environment variable before that command
ENV GOPROXY=https://mirrors.aliyun.com/goproxy/,direct
RUN go mod download
Then build the Docker image
make docker-build IMG=mysql-operator-master:v0.01
# Then push the image to the Aliyun registry
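The push step itself is not shown in the source. A minimal sketch, assuming `docker login` against the registry has already been done; `push_image` and `DRY_RUN` are hypothetical names introduced here, and the image names in the usage comment are the ones that appear in this article (note that the build and deploy commands use different tags, so retagging is needed). Makefiles scaffolded by kubebuilder usually also provide a `docker-push` target, which may be simpler if the image was built under its final name.

```shell
# Hypothetical helper: retag a local image and push it to a remote registry.
# With DRY_RUN set to a non-empty value it only prints the docker commands.
push_image() {
  local src="$1" dst="$2"
  if [ -z "$src" ] || [ -z "$dst" ]; then
    echo "usage: push_image <local-image> <remote-image>" >&2
    return 2
  fi
  if [ -n "${DRY_RUN:-}" ]; then
    echo "docker tag $src $dst"
    echo "docker push $dst"
    return 0
  fi
  docker tag "$src" "$dst" && docker push "$dst"
}

# Usage (image names from this article; adjust to your own registry):
# push_image mysql-operator-master:v0.01 registry.cn-guangzhou.aliyuncs.com/xingcangku/bendi:v0.8
```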
# Use the Docker image to deploy the controller to the k8s cluster; it is deployed as a Deployment
make deploy IMG=registry.cn-guangzhou.aliyuncs.com/xingcangku/bendi:v0.8
# Verify: by default it is in the system namespace
[root@master01 mysql-operator-master]# kubectl get namespace
NAME STATUS AGE
application-operator-system Active 3d
default Active 23d
kube-flannel Active 23d
kube-node-lease Active 23d
kube-public Active 23d
kube-system Active 23d
monitor Active 22d
system Active 36s
[root@master01 mysql-operator-master]# kubectl -n system get deployments.apps
NAME READY UP-TO-DATE AVAILABLE AGE
controller-manager 1/1 1 1 52s
[root@master01 mysql-operator-master]# kubectl -n controller-manager get pods
No resources found in controller-manager namespace.
[root@master01 mysql-operator-master]# kubectl delete -f ./config/samples/apps_v1_mysqlcluster.yaml
Error from server (NotFound): error when deleting "./config/samples/apps_v1_mysqlcluster.yaml": mysqlclusters.apps.egonlin.com "mysqlcluster-sample" not found
[root@master01 mysql-operator-master]# kubectl apply -f ./config/samples/apps_v1_mysqlcluster.yaml
mysqlcluster.apps.egonlin.com/mysqlcluster-sample created
[root@master01 mysql-operator-master]# kubectl -n controller-manager get pods
No resources found in controller-manager namespace.
[root@master01 mysql-operator-master]# kubectl get pods -n system
NAME READY STATUS RESTARTS AGE
controller-manager-5699b5b476-4ngwd 1/1 Running 0 3m3s
# If the pods do not come up, storage may be the cause. The project contains a local-path-provisioner-0.0.29 directory; enter it, then enter its deploy directory
[root@master01 deploy]# kubectl apply -f local-path-storage.yaml
namespace/local-path-storage created
serviceaccount/local-path-provisioner-service-account created
role.rbac.authorization.k8s.io/local-path-provisioner-role created
clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created
rolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created
clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created
deployment.apps/local-path-provisioner created
storageclass.storage.k8s.io/local-path created
configmap/local-path-config created
[root@master01 deploy]# kubectl get pods
NAME READY STATUS RESTARTS AGE
axing-zzz-7d5cb7df74-4lbqn 1/1 Running 6 (31m ago) 16d
mysql-01 1/1 Running 0 7m50s
mysql-02 1/1 Running 0 40s
mysql-03 0/1 ContainerCreating 0 30s
[root@master01 deploy]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
mysql-01 Bound pvc-c4ffa04d-78bc-44e5-9948-8dd23e8197d4 1Gi RWO local-path <unset> 8m4s
mysql-02 Bound pvc-9870b7dc-274f-48d9-ab9c-12fdad4ab267 1Gi RWO local-path <unset> 8m4s
mysql-03 Bound pvc-517035dc-ec28-4733-8d8d-244cce025604 1Gi RWO local-path <unset> 8m4s
[root@master01 mysql-operator-master]# kubectl get pod -n system
NAME READY STATUS RESTARTS AGE
controller-manager-5699b5b476-4ngwd 1/1 Running 0 103m
[root@master01 mysql-operator-master]#
[root@master01 mysql-operator-master]# kubectl -n system get deployments.apps
NAME READY UP-TO-DATE AVAILABLE AGE
controller-manager 1/1 1 1 103m
# Check the controller logs
[root@master01 mysql-operator-master]# kubectl -n system logs -f controller-manager-5699b5b476-4ngwd
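When everything is healthy, the log keeps updating.
Troubleshooting summary
# When the operator started, the third pod could not be brought up and stayed Pending; inspect it: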
[root@k8s-node-01 ~]# kubectl describe pod mysql-03
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 11m (x3 over 17m) default-scheduler 0/3 nodes
are available: 1 Insufficient cpu, 1 node(s) had untolerated taint
{node.kubernetes.io/disk-pressure: }, 2 Insufficient memory. preemption: 0/3
nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption
victims found for incoming pod.
Warning FailedScheduling 89s (x2 over 6m30s) default-scheduler 0/3 nodes
are available: 3 node(s) had untolerated taint {node.kubernetes.io/unreachable:
}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for
scheduling.
[root@k8s-node-01 ~]#
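# The events report insufficient disk (the disk-pressure taint). Because the volumes come from local-path-storage, each volume has node affinity, so mysql-03 is pinned to the node that holds its volume, which is k8s-node-01. The pod placement confirms this: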
[root@k8s-node-01 ~]# kubectl get pods -o wide
NAME       READY   STATUS    RESTARTS   AGE   IP             NODE            NOMINATED NODE   READINESS GATES
mysql-01   1/1     Running   0          18m   10.244.0.103   k8s-master-01   <none>           <none>
mysql-02   1/1     Running   0          18m   10.244.2.184   k8s-node-02     <none>           <none>
mysql-03   0/1     Pending   0          18m   <none>         <none>          <none>           <none>
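The node a volume is pinned to can also be read directly from the PV rather than inferred by elimination. A minimal sketch, assuming the PVC lives in the `default` namespace unless told otherwise and that the provisioner records node affinity on the PV (local-path does); `pvc_node` is a hypothetical helper name.

```shell
# Hypothetical helper: print the node(s) that a PVC's backing PV is pinned to.
pvc_node() {
  local pvc="$1" ns="${2:-default}" pv
  # Resolve the PVC to the name of the PV it is bound to.
  pv=$(kubectl -n "$ns" get pvc "$pvc" -o jsonpath='{.spec.volumeName}') || return 1
  if [ -z "$pv" ]; then
    echo "PVC $pvc is not bound to a PV yet" >&2
    return 1
  fi
  # Read the hostname values from the PV's required node affinity.
  kubectl get pv "$pv" \
    -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[*].matchExpressions[*].values[*]}'
}

# Usage: pvc_node mysql-03
```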
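# Checking k8s-node-01 confirmed that its disk was indeed full
First try cleaning up installation packages, the /tmp directory, the yum cache, and /var/log on that node
kubelet log rotation was also configured: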
[root@k8s-node-01 ~]# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9 --container-log-max-files=2 --container-log-max-size='1Ki'"
# Note: --container-log-max-files must be greater than 1; with a value of 1 or less the kubelet fails to start
# Better not to clear the Go build cache (/root/.cache), otherwise make run will take a long time
# Also remove unused images
docker system prune -a
nerdctl system prune -a
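# What these do:
system prune: cleans up the Docker system by removing containers, images, networks, and other resources that are no longer in use.
-a (--all): removes all unused images, not only dangling (untagged) ones.
After running docker/nerdctl system prune -a, you are asked to confirm the deletion. Once confirmed, stopped containers and unused images and networks are removed, freeing disk space.
This freed a certain amount of space.
Find files that were deleted but are still held open
After a file is deleted, its space is not released right away if a process still has it open. The filesystem still counts the space as used, but du cannot see the deleted file.
Detecting deleted files that are still held open
lsof can list all files that have been deleted but are still held open by a process.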
lsof | grep deleted
If files have been deleted but are still held open by processes, restarting those processes releases the space they occupy.
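As an alternative when lsof is not installed, the same information can be read from /proc. A minimal sketch, assuming a Linux host and enough privileges to inspect other processes' file descriptors (run as root to see everything); `list_deleted_open_files` is a hypothetical helper name, not part of any tool mentioned here.

```shell
# Hypothetical helper: print "PID<TAB>path (deleted)" for every deleted file
# that a process still holds open, by reading the symlinks in /proc/<pid>/fd.
list_deleted_open_files() {
  local fd target pid
  for fd in /proc/[0-9]*/fd/*; do
    # readlink fails for descriptors that vanished or that we may not inspect.
    target=$(readlink "$fd" 2>/dev/null) || continue
    case "$target" in
      *" (deleted)")
        pid=${fd#/proc/}   # strip the leading /proc/
        pid=${pid%%/*}     # keep only the PID component
        printf '%s\t%s\n' "$pid" "$target"
        ;;
    esac
  done
}

# Usage: list_deleted_open_files | sort -n
```

Each PID in the output can then be looked up with `ps -p <PID> -o pid,args` to decide whether the owning process should be restarted or stopped.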
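A large number of such files turned up.
Tracing the owning process showed it was a MySQL process started directly on the host; it was no longer needed, so it can be killed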
kill -9 1100