首页
导航
统计
留言
更多
壁纸
直播
关于
推荐
星的魔法
星的导航页
谷歌一下
镜像国内下载站
大模型国内下载站
docker镜像国内下载站
腾讯视频
Search
1
Ubuntu安装 kubeadm 部署k8s 1.30
252 阅读
2
rockylinux 9.3详细安装drbd
170 阅读
3
kubeadm 部署k8s 1.30
163 阅读
4
rockylinux 9.3详细安装drbd+keepalived
138 阅读
5
ceshi
85 阅读
默认分类
日记
linux
docker
k8s
ELK
Jenkins
Grafana
Harbor
Prometheus
Cepf
k8s安装
Gitlab
traefik
sonarqube
OpenTelemetry
MinIOn
Containerd进阶使用
ArgoCD
golang
Git
Python
Web开发
HTML和CSS
JavaScript
对象模型
公司
zabbix
zookeeper
hadoop
登录
/
注册
Search
标签搜索
k8s
linux
docker
drbd+keepalivde
ansible
dcoker
webhook
星
累计撰写
127
篇文章
累计收到
965
条评论
首页
栏目
默认分类
日记
linux
docker
k8s
ELK
Jenkins
Grafana
Harbor
Prometheus
Cepf
k8s安装
Gitlab
traefik
sonarqube
OpenTelemetry
MinIOn
Containerd进阶使用
ArgoCD
golang
Git
Python
Web开发
HTML和CSS
JavaScript
对象模型
公司
zabbix
zookeeper
hadoop
页面
导航
统计
留言
壁纸
直播
关于
推荐
星的魔法
星的导航页
谷歌一下
镜像国内下载站
大模型国内下载站
docker镜像国内下载站
腾讯视频
搜索到
125
篇与
的结果
2025-06-15
RadosGW部署
一、创建 radosgw服务root@ubuntu01:~# ceph orch apply rgw default es --placement="count:3" Scheduled rgw.default update...<realm-name>:RadosGW 的域名,可以自定义,例如 default. <zone-name>:RadosGW 的区域名,可以自定义,例如 es. --placement="count:1":表示 RadosGW 实例的数量。如果你想在多个节点上部署,可以增加 count 的值,或者指定节点列表二、验证radosgw服务状态root@ubuntu01:~# ceph orch ps | grep rgw rgw.default.ubuntu01.lusoxe ubuntu01 *:80 starting - - - - <unknown> <unknown> known> rgw.default.ubuntu02.nvamia ubuntu02 *:80 starting - - - - <unknown> <unknown> known> rgw.default.ubuntu03.rkgoya ubuntu03 *:80 starting - - - - <unknown> <unknown> known> root@ubuntu01:~# ceph -s cluster: id: 5b0e9b94-e6bb-11ef-a18c-274714e73e14 health: HEALTH_WARN clock skew detected on mon.ubuntu03 1 pool(s) do not have an application enabled services: mon: 3 daemons, quorum ubuntu01,ubuntu02,ubuntu03 (age 8h) mgr: ubuntu02.jmaxlt(active, since 8h), standbys: ubuntu01.exrhij mds: 1/1 daemons up, 1 standby osd: 3 osds: 3 up (since 8h), 3 in (since 2w) rgw: 3 daemons active (3 hosts, 1 zones) data: volumes: 1/1 healthy pools: 10 pools, 181 pgs objects: 9.37k objects, 2.5 GiB usage: 8.5 GiB used, 111 GiB / 120 GiB avail pgs: 181 active+clean io: client: 214 KiB/s rd, 5.1 KiB/s wr, 247 op/s rd, 141 op/s wr三、验证 radosgw 存储池资源#查看存储池列表 root@ubuntu01:~# ceph osd pool ls .mgr kubernetes k8s-rbd k8s cephfs_data cephfs_metadata .rgw.root default.rgw.log default.rgw.control default.rgw.meta#查看默认radosgw存储池信息 root@ubuntu01:~# radosgw-admin zone get --rgw-zone=default --rgw-zonegroup=default { "id": "7f883cbd-8ada-48aa-9358-f2c09aee0ca7",# 区域的唯一标识符 "name": "default",# 默认区域的名称 "domain_root": "default.rgw.meta:root",# 区域的根域名 "control_pool": "default.rgw.control",# 系统控制池,在有数据更新是,通知其他RGW更新缓存 "gc_pool": "default.rgw.log:gc", # 用于垃圾回收的存储池 "lc_pool": "default.rgw.log:lc",# 用于存储日志的存储池 "log_pool": "default.rgw.log",# 存储日志信息,用于记录各种log信息 "intent_log_pool": "default.rgw.log:intent", "usage_log_pool": "default.rgw.log:usage", "roles_pool": "default.rgw.meta:roles",# default.rgw.meta:元数据存储池,通过不同的名称空间分别存储不同的rados对象 "reshard_pool": "default.rgw.log:reshard", "user_keys_pool": "default.rgw.meta:users.keys",# 用户的密钥名称空间users.keys "user_email_pool": "default.rgw.meta:users.email",# 用户的email名称空间users.email "user_swift_pool": "default.rgw.meta:users.swift",# 用户的subuser的名称空间users.swift "user_uid_pool": "default.rgw.meta:users.uid", # 用户UID "otp_pool": "default.rgw.otp", "system_key": { "access_key": "", "secret_key": "" }, "placement_pools": [ { "key": "default-placement", "val": { "index_pool": "default.rgw.buckets.index",# 存放bucket到object的索引信息 "storage_classes": { "STANDARD": { "data_pool": "default.rgw.buckets.data"# 存放对象的数据 } }, "data_extra_pool": "default.rgw.buckets.non-ec",# 数据的额外信息存储池 "index_type": 0, "inline_data": true } } ], "realm_id": "", "notif_pool": "default.rgw.log:notif" }#查看默认配置信息 root@ubuntu01:~# ceph osd pool get default.rgw.meta crush_rule crush_rule: replicated_rule root@ubuntu01:~# ceph osd pool get default.rgw.meta size size: 3 root@ubuntu01:~# ceph osd pool get default.rgw.meta pgp_num pgp_num: 16 root@ubuntu01:~# ceph osd pool get default.rgw.meta pg_num pg_num: 16四、访问radosgw服务访问radosgw服务所在的节点 IP+默认 80 端口既可。五、RadosGW https 5.1生成自签证书root@ubuntu01:/# mkdir -p /etc/ceph/rgw/cert root@ubuntu01:/# openssl req -newkey rsa:2048 -x509 -days 3650 -nodes -out /etc/ceph/rgw/cert/rgw-cert.pem -keyout /etc/ceph/rgw/cert/rgw-key.pem -subj "/CN=ceph-rgw.local.com" Generating a RSA private key ...............................+++++ ..................................+++++ writing new private key to '/etc/ceph/rgw/cert/rgw-key.pem' ----- root@ubuntu01:/# tree /etc/ceph/rgw/cert/ /etc/ceph/rgw/cert/ ├── rgw-cert.pem └── rgw-key.pem 0 directories, 2 files5.2创建配置文件#将公钥和私钥内容添加到配置文件中 rgw.yaml service_type: rgw service_id: default placement: hosts: - ubuntu01 - ubuntu02 - ubuntu03 spec: rgw_frontend_ssl_certificate: | -----BEGIN PRIVATE KEY----- MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC/q3zdOnRPSsRz OlP1Z0rBYJUn9VzZF+Rlnt7IN3q+6Ir/jCb6jtMYP1A+KiTcv8OaTdre4OxOjIFG iK1bZpEl16q+phKGbnfbphrptCVwpw7vV3n7Pjz57awXM728S7ql62JutIfrDWqN YY0EnMeO7xFrXdy4ow8d3GobhsfiTcU38nyWXZ5Qy7vd43kJElMxbbsXJrhPO43x 6gJCWdvIztdjewtFW2rqy8cDPaKxcXubYAOlJUzN53eAdGxC6Br6HWsyVn8VnsPI UQigwSzOx6DAa9QKjVIIi454pb3C+se1TD3el1JhZ5ZDav5g0Fp9zB1DetkR3vh7 3ixqwddRAgMBAAECggEADohJElkBQpXPqVDt1rh7MYhKJtpyrL8kARR3ncSfGOR2 zYNp3SuBE+CRC/WUD+y2PvfSNX3mTNpYXum0Ay8WqEDe4E+lLe4oRk4k0j1nbVAt ULZYOFVyuBxuJOA2bZVsVHIxZ2VmvMqqnoeb8pKUiuDTeEmIl7M9TS1OGkIw25aa VUU+kFLNwhVXQPAYu17dApl5GWI32gNcfZD7QF3fwNMz+u4h6dLCBiI+IQmXjtzX OCPs2exwfj9NaZ6jxZPQU1i2YoMRh2AuephP7hW+zeTUpgd4A6VHkX5ASKvs89U2 qel2639RLqdEg1bu7f5kHOlqYVmwuj6rw+EAyfxpQQKBgQDl81/euRc7NKj3uLoh yJqe/vQkvLKgKwVDAM6aKdI8/FhG1LNYbm5QJ83EuSmXSPm20bJjZWOjmQuIXSV2 V2UvmygHcQOeZsipJPgc2Yza6b3MDCKbBq/Kb4YrTtVLogjgKbbJXg99sSPKgJFG B4g4OGs8H2inkE3tKOn4CXUF2QKBgQDVYfXnMuGcZ8LfBiD1zfX1WGt8jKsoDI2/ LGGwQa/0Lo7xgqrN47hOFKMDGWglJPT06G2JR3gsPIvJlZNo5m6tYoLtZ0vEERlK +YzTVmTA/ThGAK7rePQv7GOGWubF6WYA8Bdyfqnm0yv8QtptcXq4bzyLEGMTKh4a mNOBw0eaOQKBgQDh4IczyKzBQ6EWdSahOORpehdoXtM38gphHtHTLXGO5nHwYc8p 9o/1vKOrshbgvJZOtv18FYQil5t0f8rkzERg1xAi/yiG2IATZnUyIaDzCNefL+xu S8syFwq8T9vFR41gZJlghDzDQtXdfR02pP9x+jmb/V5X+rARXjc1plSFQQKBgAvG 0UyGSV9Zdq7aZr1KNbXpwFzqYpPeRYB0kZuptG1UmH0JyiV82PIuP2TvZQkPxhky LsFx2VcPrGNexvj2JsuY8ULq/Yp/qxaxOS18yijAkPeEGCNU1J1EfaWvpKbtn7yT g6fFB9l+dCIDCo0Zwz0knoHKUL2BCJJNNvclcPE5AoGAXCuMEjSk54OUXoKj355l NbbpnrGofy7Pmi9H7dKxgKiN7wonjhV9Rztrbsms9lD4Ab/B1CrIRz6Hp5si+bw/ mDDboXmgEbNGMccxknye2p4S34R6RoERbNlXvor3Z46aWMP9Suqwa11WdklXQiaW wLC0RkGadfimQSxSGuFaJDE= -----END PRIVATE KEY----- -----BEGIN CERTIFICATE----- MIIDGzCCAgOgAwIBAgIUYkIzcxcGB4FBt1qTLXE9tOKNeMUwDQYJKoZIhvcNAQEL BQAwHTEbMBkGA1UEAwwSY2VwaC1yZ3cubG9jYWwuY29tMB4XDTI1MDYxNDE5MTEy MloXDTM1MDYxMjE5MTEyMlowHTEbMBkGA1UEAwwSY2VwaC1yZ3cubG9jYWwuY29t MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAv6t83Tp0T0rEczpT9WdK wWCVJ/Vc2RfkZZ7eyDd6vuiK/4wm+o7TGD9QPiok3L/Dmk3a3uDsToyBRoitW2aR JdeqvqYShm5326Ya6bQlcKcO71d5+z48+e2sFzO9vEu6petibrSH6w1qjWGNBJzH ju8Ra13cuKMPHdxqG4bH4k3FN/J8ll2eUMu73eN5CRJTMW27Fya4TzuN8eoCQlnb yM7XY3sLRVtq6svHAz2isXF7m2ADpSVMzed3gHRsQuga+h1rMlZ/FZ7DyFEIoMEs zsegwGvUCo1SCIuOeKW9wvrHtUw93pdSYWeWQ2r+YNBafcwdQ3rZEd74e94sasHX UQIDAQABo1MwUTAdBgNVHQ4EFgQUbL6TyHoVR1jUBE+mT0ETcvJNwq0wHwYDVR0j BBgwFoAUbL6TyHoVR1jUBE+mT0ETcvJNwq0wDwYDVR0TAQH/BAUwAwEB/zANBgkq hkiG9w0BAQsFAAOCAQEAZ/8toy7FIUG4uq+SxP4dxcC/P/njpCklzoA5BGc8aEQ+ M+g/0eTR60ib6HWXp2PtezQ5fK1mZLImSeuHCcdAXddEq0opXaS3wEMs8N27fDLU jMjilBhzDlp7YnxZ64YzF3HzP2qHbDwoJjz/MqFSovkFEb4m8RYZNl6t/5U8XSmx vxWdypbmmd+Zr07BQ1l1ldeGi0CD9gxSYK3exF6Gdr7G/J7vC8Up0xHKnaZSqKOH vP8e/WL7T+p0s0ypjAIR29M1E9XfULt8xNQc3KtiEcvAZbxE3HWG7vnp93S/42vA errp5uKlZdaOaA1OD0/nmrP36hz6RnuUqIF88p7NnQ== -----END CERTIFICATE----- ssl: true rgw_frontend_port: 4435.3更新RGW配置root@ubuntu01:~# ceph orch apply -i rgw.yaml Scheduled rgw.default update... root@ubuntu01:~# ceph orch ps | grep rgw rgw.default.ubuntu01.lusoxe ubuntu01 *:80 running (31m) 12s ago 31m 101M - 18.2.4 2bc0b0f4375d 510eb15ca011 rgw.default.ubuntu02.abhedc ubuntu02 *:443 starting - - - - <unknown> <unknown> <unknown> rgw.default.ubuntu02.nvamia ubuntu02 *:80 running (31m) 13s ago 31m 106M - 18.2.4 2bc0b0f4375d 4a733c6c4488 rgw.default.ubuntu03.pvvalj ubuntu03 *:443 starting - - - - <unknown> <unknown> <unknown> rgw.default.ubuntu03.rkgoya ubuntu03 *:80 running (31m) 13s ago 31m 101M - 18.2.4 2bc0b0f4375d d341d13d6bac 5.4验证访问在本地host添加域名解析六、RadosGW 高可用 6.1自定义https 端口与副本数#修改默认端口并指定运行在 ceph1 和 ceph2 节点 root@ceph-1:/etc/ceph/rgw# cat > rgw.yaml << EOF service_type: rgw service_id: default placement: hosts: - ceph-1 - ceph-2 - ceph-3 spec: rgw_frontend_ssl_certificate: | -----BEGIN PRIVATE KEY----- MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC8EYtfCItzKsoy zh9AeiYMuJAebFaBkEpF00AYGvnviwOBUx8f4CjStnXAUtNq7nFrhTldai1B+pdL eZomzgfFTO6oN0tz6KBXEDJG+yoMqrce0RlGJH6i5ML1znWmWsVynU1xrqGC2D1J jIERsoyx+1vIuYHR8EBpKf7ldLYDqZ79bAEEj1ayzzPOIS4X3g9AAWo3EMCLRzCU ZY1GIfqXuPi2xMJKYfovXM1agU178qCqNlGFrjtAHOSYpmMssGM638Jh8ExaJxBU 4UvKcxc54Efypancj6ZffgEx07A6tVFk4zQrFp+2vbYTkcs3tF5KrRBvN1ExDqsb fofCTI+7AgMBAAECggEAaiYE4giynLgkE/TfEsdevoNVZLaFRO+p3CtV28UuGJP/ 0HiX8qfUosm1QG3/QjV+8s7pB96r2LeVuVXTOd/D5wp7EZrUDYHZLgrINeQBYdDh NpWSjFKA33P6zj5PjStikkRSt713D6D5Ro/1MYXzf2l97pc1vMa7tB+t7Nio+vtE PCzTsRZynbNCY3UoIKfQlbA/fMqyayU05GAJyJT0kHl8M1H5PD5czJjUqztKQEv0 aJGmXN77drBC/qznGfpTPaAi1l6Gh7eBBj8/7yWXDSyI3n7SSAz0e4eDjZRhBZrH hi2f6+xwpuTuOehUw0xo2rtJxl3o9qzNjwN05A8yoQKBgQDdOzuiAZtqBd7ERXZp sYev17tLuLpiCHQ5a5ljh4jVrP1/bh3mNiTnFMk3TwJLIwBPlo2Ugbd/vsl+Db3+ EKDFTC2md5CaH04/1QSmFFjsHfN1ZoLdmedVntlR4Hzah44jNdWvHl0WYRjzfxcM I782oawPYm73J7oGboiEfgtW8wKBgQDZoBNxcvjFZ+0LO5ms+ymWKwTnPpwGhW3M D0DcPcF08GSnmyHvZoB51FG4GEIjMkWRpOX+pg2fxNjO0Y5QxS5vBg0A5KztDg4r Kdy+McCZIWjCeF9O0mvhIXOKLFdUMrfp+s6GqJCIoht8QkxUuMn0TM3eq3p/WWts xVCs/tMmGQKBgBSyPOLsCZECmZN8+BXtMMdnhDMSRgVzywOwKDpibI+ozlJEh/GI cS1ZCXXuI0XKMXZAnGAfPn5p58muGW8SOSgb901SdCmm8hgQoo2y65qzNppuC6IV ism8wZHiUWvUMJzkpWfrjEPSs5Xb9tkA4xuGRmVuDPl8Mu/1GTpj3EW3AoGBAMJI 0pLZ3ZX+7fS1RMDViY7y4PHBR3Ha9ObUR0dYKrnHU1T+fhFIJTKehkYgAguB+fHI kTwB6u/TwOsC0lbxcj7T3BAMFwWbIrMOMG/r4tHSrb/PzuaDnKPkRU35wAz/KonM y0wUeNRCRN9uIM8SGdnsJ26/ECFZJzp3/Uo0RTUhAoGAG8X00lkMTiVHuAZmP7PO 4lYfUQA8PZ6i/7A/SnHuwWI0MyWKLw3T/4mCdHyw9YwPshdVWCddY59L1GKxdzI5 V87lNmdkH7l6jDm7IwY5KX0voZ8uLB1zQ9lIakQxPTj5ydO2lPsJGE8784suwAhY Y7UxYWWOAl7Pu0TfGXZjg+I= -----END PRIVATE KEY----- -----BEGIN CERTIFICATE----- MIIDGzCCAgOgAwIBAgIUN4U/CuGb78PO+vzw370d0hz1aBUwDQYJKoZIhvcNAQEL BQAwHTEbMBkGA1UEAwwSY2VwaC1yZ3cubG9jYWwuY29tMB4XDTI0MTIxNzA3MDAz M1oXDTM0MTIxNTA3MDAzM1owHTEbMBkGA1UEAwwSY2VwaC1yZ3cubG9jYWwuY29t MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAvBGLXwiLcyrKMs4fQHom DLiQHmxWgZBKRdNAGBr574sDgVMfH+Ao0rZ1wFLTau5xa4U5XWotQfqXS3maJs4H xUzuqDdLc+igVxAyRvsqDKq3HtEZRiR+ouTC9c51plrFcp1Nca6hgtg9SYyBEbKM sftbyLmB0fBAaSn+5XS2A6me/WwBBI9Wss8zziEuF94PQAFqNxDAi0cwlGWNRiH6 l7j4tsTCSmH6L1zNWoFNe/KgqjZRha47QBzkmKZjLLBjOt/CYfBMWicQVOFLynMX OeBH8qWp3I+mX34BMdOwOrVRZOM0Kxaftr22E5HLN7ReSq0QbzdRMQ6rG36HwkyP uwIDAQABo1MwUTAdBgNVHQ4EFgQUUxgqlKlO+Pmvr+1QYv0bzf8BY4wwHwYDVR0j BBgwFoAUUxgqlKlO+Pmvr+1QYv0bzf8BY4wwDwYDVR0TAQH/BAUwAwEB/zANBgkq hkiG9w0BAQsFAAOCAQEAJf0D5Wy3BS9fUWqqgxLgvUSuK9EzfVHyyBeAzW+AYzus Iqv3KscFnJFkl8U7tfy0E/03z6LzA91Ok/6IvlsQA9w5agJF++nqNSatVbEin4Fr h4hu1HFMDFLkaQeGLcaHBgmMWOgK0DonitYEJZMbHBBYY5W7IzoZfduaOsJaXVoG rYCsoYlH2JHwIu3hXelzCLPfZhdBpvcgWIsQCnVy2n4y2WLRif0R+zPPZ4ZIz0qT en7C+vmvtP9SrpI9eIPUC3VAcTKxftvzyOHqBIB0+BzDa8lk0b4MMmaJkzt7Uq1k EJmfIFBAfER1wHb2vVPKd5/zi3h55T3D366M8yLx9Q== -----END CERTIFICATE----- ssl: true rgw_frontend_port: 8443 EOF root@ceph-1:/etc/ceph/rgw# ceph orch apply -i rgw.yaml Scheduled rgw.default update... root@ceph-1:/etc/ceph/rgw# ceph orch ps | grep rgw rgw.default.ceph-1.rsdmtv ceph-1 *:9443 running (29s) 23s ago 29s 87.6M - 18.2.4 2bc0b0f4375d c4d059ef4eeb rgw.default.ceph-2.hyhuzv ceph-2 *:9443 running (29s) 13s ago 29s 88.8M - 18.2.4 2bc0b0f4375d ff7f86383ba1 rgw.default.ceph-3.wnnkpd ceph-3 *:9443 running (30s) 24s ago 30s 88.4M - 18.2.4 2bc0b0f4375d 3d198beac816 6.2HaProxy部署以下操作在 ceph1、2、3 机器执行安装haproxy root@ceph-1:~# apt install haproxy -y#修改配置文件 root@ceph-1:~# cat > /etc/haproxy/haproxy.cfg << EOF # 开启管理员监控页面 listen admin_stats bind *:8888 # 监听的IP和端口号 mode http # 开启HTTP模式,stats功能需要 stats enable stats refresh 30s # 统计页面自动刷新时间 stats uri /admin # 访问的uri ip:8888/admin stats realm haproxy stats auth admin:admin # 认证用户名和密码 stats hide-version # 隐藏HAProxy的版本号 stats admin if TRUE # 管理界面,如果认证成功了,可通过webui管理节点 timeout client 5s # 客户端超时 timeout connect 3s # 连接超时 timeout server 5s # 后端服务器超时 # 配置前端监听 frontend main # 监听地址 bind *:443 # 匹配后端服务 default_backend rgw # 客户端超时 timeout client 5s # 配置后端代理 backend rgw # 连接超时 timeout connect 3s # 后端服务器超时 timeout server 5s server rgw1 192.168.10.91:9443 check server rgw2 192.168.10.92:9443 check server rgw3 192.168.10.93:9443 check EOF6.3启动服务root@ceph-1:~# systemctl start haproxy root@ceph-1:~# systemctl enable haproxy Synchronizing state of haproxy.service with SysV service script with /lib/systemd/systemd-sysv-install. Executing: /lib/systemd/systemd-sysv-install enable haproxy root@ceph-1:~# ss -tunlp | grep haproxy tcp LISTEN 0 4096 0.0.0.0:443 0.0.0.0:* users:(("haproxy",pid=61839,fd=7)) tcp LISTEN 0 4096 0.0.0.0:8888 0.0.0.0:* users:(("haproxy",pid=61839,fd=6))确认无误后 ceph-2 和 ceph-3 服务器同样的步骤配置。七、KeepAlived部署 以下操作在ceph-1、2、3 机器执行,设备网卡名称为ens33,VIP为192.168.10.90。 安装软件包root@ceph-1:~# apt install keepalived -y新增haproxy检测脚本root@ceph-1:~# vim /etc/keepalived/check_port.sh #!/bin/bash # 检查指定端口是否正常 PORT=443 if netstat -tuln | grep -q ":${PORT}\b"; then echo "${PORT}端口正常: 服务正在监听" exit 0 else echo "${PORT}端口异常: 未发现监听服务" exit 1 fi root@ceph-1:~# chmod u+x /etc/keepalived/check_port.sh修改配置文件root@ceph-1:/etc/keepalived# cat > /etc/keepalived/keepalived.conf << EOF global_defs { script_user root enable_script_security } vrrp_script chk_port { script "/etc/keepalived/check_port.sh" # 自定义检测脚本路径 interval 1 # 检测间隔,单位为秒 weight -2 # 如果检测失败,权重降低2 } vrrp_instance VI_1 { state MASTER # 设置为master节点 interface ens33 # 物理网卡名称 virtual_router_id 51 # 虚拟路由ID,主备保持一致 priority 100 # 优先级,主大于备 advert_int 1 # 关播间隔 authentication { # 认证信息,主备一致 auth_type PASS auth_pass 1111 } virtual_ipaddress { 192.168.10.90/24 # 虚拟IP信息 } track_script { chk_port # 引用上面定义的脚本 } } EOF启动服务root@ceph-1:/etc/keepalived# systemctl start keepalived.service root@ceph-1:/etc/keepalived# systemctl enable keepalived.service Synchronizing state of keepalived.service with SysV service script with /lib/systemd/systemd-sysv-install. Executing: /lib/systemd/systemd-sysv-install enable keepalived root@ceph-1:/etc/keepalived# ip a | grep 192.168.10.90 inet 192.168.10.90/24 scope global secondary ens33此时可以看到vip 192.168.10.90绑定到了 ceph-1服务器ens33 网卡上。同样的操作配置 ceph-2服务器,配置文件如下:global_defs { script_user root enable_script_security } vrrp_script chk_port { script "/etc/keepalived/check.sh" interval 2 weight -2 } vrrp_instance VI_1 { state BACKUP # 主备类型 interface ens33 virtual_router_id 51 priority 99 # 优先级低于主 advert_int 1 authentication { auth_type PASS auth_pass 1111 } virtual_ipaddress { 192.168.10.90/24 } track_script { chk_port # 引用上面定义的脚本 } }ceph-3 配置如下global_defs { script_user root enable_script_security } vrrp_script chk_port { script "/etc/keepalived/check.sh" interval 2 weight -2 } vrrp_instance VI_1 { state BACKUP # 主备类型 interface ens33 virtual_router_id 51 priority 98 # 优先级低于主 advert_int 1 authentication { auth_type PASS auth_pass 1111 } virtual_ipaddress { 192.168.10.90/24 } track_script { chk_port # 引用上面定义的脚本 } }八、高可用测试 接下来停止 ceph-1服务,模拟异常故障,查看 ceph-2服务器,vip已经成功飘移过来root@ceph-1:~# systemctl stop haproxy.service root@ceph-2:~# ip a | grep 192.168.10.90 inet 192.168.10.90/24 scope global secondary ens33访问vip的 443 端口,可正常提供服务
2025年06月15日
14 阅读
0 评论
0 点赞
2025-06-15
OpenTelemetry数据收集
一、收集器配置详解OpenTelemetry 的 Collector 组件是实现观测数据(Trace、Metrics、Logs)收集、处理和导出的一站式服务。它的配置主要分为以下 四大核心模块: receivers(接收数据) processors(数据处理) exporters(导出数据) service(工作流程)1、配置格式#具体配置项可参考文档https://opentelemetry.io/docs/collector/configuration/ apiVersion: opentelemetry.io/v1beta1 kind: OpenTelemetryCollector # 定义资源类型为 OpenTelemetryCollector metadata: name: sidecar # Collector 的名称 spec: mode: sidecar # 以 sidecar 模式运行(与应用容器同 Pod) config: # Collector 配置部分(结构化 YAML) receivers: # 数据接收器(如 otlp、prometheus) processors: # 数据处理器(如 batch、resource、attributes) exporters: # 数据导出器(如 otlp、logging、jaeger、prometheus) service: # 服务配置(定义哪些 pipeline 生效) pipelines: traces: # trace 数据的处理流程 metrics: # metric 数据的处理流程 logs: # log 数据的处理流程2、Receivers(接收器)用于接收数据。支持的类型有很多, otlp:接收 otlp 协议的数据内容 receivers: otlp: protocols: grpc: # 高性能、推荐使用 endpoint: 0.0.0.0:4317 http: # 浏览器或无 gRPC 支持的环境 endpoint: 0.0.0.0:4318prometheus: 用于采集 /metrics 接口的数据。 receivers: prometheus: config: scrape_configs: - job_name: my-service static_configs: - targets: ['my-app:8080']filelog: 从文件读取日志 receivers: filelog: include: [ /var/log/myapp/*.log ] start_at: beginning operators: - type: json_parser parse_from: body timestamp: parse_from: attributes.time3、Processors(处理器)用于在导出前对数据进行修改、增强或过滤。常用的包括: batch : 将数据批处理后导出,提高吞吐量。 processors: batch: timeout: 10s send_batch_size: 1024resource : 为 trace/metric/log 添加统一标签。 processors: resource: attributes: - key: service.namespace value: demo action: insertattributes : 添加、修改或删除属性 processors: attributes: actions: - key: http.method value: GET action: insert处理器配置可参考文档:https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor4、Exporters(导出器)用于将数据导出到后端系统 otlp: 用于将数据发送到另一个 OTEL Collector、Jaeger、Tempo、Datadog 等。 exporters: otlp: endpoint: tempo-collector:4317 tls: insecure: truePrometheus: 用于暴露一个 /metrics HTTP 端口给 Prometheus 拉取。 exporters: prometheus: endpoint: "0.0.0.0:8889"logging : 调试用,打印数据到控制台。 exporters: debug: loglevel: debug5、Service(工作流程)service.pipelines 是一个“调度图”,告诉 OpenTelemetry Collector,对于某种类型的数据,比如 trace,请用哪个 receiver 来接收,用哪些 processor 来处理,最终送到哪些 exporter 去导出。service: pipelines: traces: receivers: [otlp] processors: [batch, resource] exporters: [otlp, logging] metrics: receivers: [prometheus] processors: [batch] exporters: [prometheus] logs: receivers: [filelog] processors: [batch] exporters: [otlp]二、Collector 发行版本区别opentelemetry-collector 和 opentelemetry-collector-contrib 是两个 OpenTelemetry Collector 的发行版本,它们的区别主要在于 内置组件的丰富程度 和 维护主体。
2025年06月15日
7 阅读
0 评论
0 点赞
2025-06-14
OpenTelemetry 应用埋点
一、部署示例应用 1、部署java应用apiVersion: apps/v1 kind: Deployment metadata: name: java-demo spec: selector: matchLabels: app: java-demo template: metadata: labels: app: java-demo spec: containers: - name: java-demo image: registry.cn-guangzhou.aliyuncs.com/xingcangku/spring-petclinic:1.5.1 imagePullPolicy: IfNotPresent resources: limits: memory: "1Gi" # 增加内存 cpu: "500m" ports: - containerPort: 8080 --- apiVersion: v1 kind: Service metadata: name: java-demo spec: type: ClusterIP # 改为 ClusterIP,Traefik 使用服务发现 selector: app: java-demo ports: - port: 80 targetPort: 8080 --- apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: java-demo spec: entryPoints: - web # 使用 WEB 入口点 (端口 8000) routes: - match: Host(`java-demo.local.cn`) # 可以修改为您需要的域名 kind: Rule services: - name: java-demo port: 80 2、部署python应用apiVersion: apps/v1 kind: Deployment metadata: name: python-demo spec: selector: matchLabels: app: python-demo template: metadata: labels: app: python-demo spec: containers: - name: python-demo image: registry.cn-guangzhou.aliyuncs.com/xingcangku/python-demoapp:latest imagePullPolicy: IfNotPresent resources: limits: memory: "500Mi" cpu: "200m" ports: - containerPort: 5000 --- apiVersion: v1 kind: Service metadata: name: python-demo spec: selector: app: python-demo ports: - port: 5000 targetPort: 5000 --- apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: python-demo spec: entryPoints: - web routes: - match: Host(`python-demo.local.com`) kind: Rule services: - name: python-demo port: 5000二、应用埋点 1、java应用自动埋点apiVersion: opentelemetry.io/v1alpha1 kind: Instrumentation # 声明资源类型为 Instrumentation(用于语言自动注入) metadata: name: java-instrumentation # Instrumentation 资源的名称(可以被 Deployment 等引用) namespace: opentelemetry spec: propagators: # 指定用于 trace 上下文传播的方式,支持多种格式 - tracecontext # W3C Trace Context(最通用的跨服务追踪格式) - baggage # 传播用户定义的上下文键值对 - b3 # Zipkin 的 B3 header(用于兼容 Zipkin 环境) sampler: # 定义采样策略(决定是否收集 trace) type: always_on # 始终采样所有请求(适合测试或调试环境) java: # image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest # 使用的 Java 自动注入 agent 镜像地址 image: harbor.cuiliangblog.cn/otel/autoinstrumentation-java:latest env: - name: OTEL_EXPORTER_OTLP_ENDPOINT value: http://center-collector.opentelemetry.svc:4318#为了启用自动检测,我们需要更新部署文件并向其添加注解。这样我们可以告诉 OpenTelemetry Operator 将 sidecar 和 java-instrumentation 注入到我们的应用程序中。修改 Deployment 配置如下: apiVersion: apps/v1 kind: Deployment metadata: name: java-demo spec: selector: matchLabels: app: java-demo template: metadata: labels: app: java-demo annotations: instrumentation.opentelemetry.io/inject-java: "opentelemetry/java-instrumentation" # 填写 Instrumentation 资源的名称 sidecar.opentelemetry.io/inject: "opentelemetry/sidecar" # 注入一个 sidecar 模式的 OpenTelemetry Collector spec: containers: - name: java-demo image: registry.cn-guangzhou.aliyuncs.com/xingcangku/spring-petclinic:1.5.1 imagePullPolicy: IfNotPresent resources: limits: memory: "500Mi" cpu: "200m" ports: - containerPort: 8080#接下来更新 deployment,然后查看资源信息,java-demo 容器已经变为两个。 root@k8s01:~/helm/opentelemetry# kubectl get pods NAME READY STATUS RESTARTS AGE java-demo-5cdd74d47-vmqqx 0/2 Init:0/1 0 6s java-demo-5f4d989b88-xrzg7 1/1 Running 0 42m my-sonarqube-postgresql-0 1/1 Running 8 (2d21h ago) 9d my-sonarqube-sonarqube-0 0/1 Pending 0 6d6h python-demo-69c56c549c-jcgmj 1/1 Running 0 16m redis-5ff4857944-v2vz5 1/1 Running 5 (2d21h ago) 6d2h root@k8s01:~/helm/opentelemetry# kubectl get pods -w NAME READY STATUS RESTARTS AGE java-demo-5cdd74d47-vmqqx 0/2 PodInitializing 0 9s java-demo-5f4d989b88-xrzg7 1/1 Running 0 42m my-sonarqube-postgresql-0 1/1 Running 8 (2d21h ago) 9d my-sonarqube-sonarqube-0 0/1 Pending 0 6d6h python-demo-69c56c549c-jcgmj 1/1 Running 0 17m redis-5ff4857944-v2vz5 1/1 Running 5 (2d21h ago) 6d2h java-demo-5cdd74d47-vmqqx 2/2 Running 0 23s java-demo-5f4d989b88-xrzg7 1/1 Terminating 0 43m java-demo-5f4d989b88-xrzg7 0/1 Terminating 0 43m java-demo-5f4d989b88-xrzg7 0/1 Terminating 0 43m java-demo-5f4d989b88-xrzg7 0/1 Terminating 0 43m java-demo-5f4d989b88-xrzg7 0/1 Terminating 0 43m root@k8s01:~/helm/opentelemetry# kubectl get pods -w NAME READY STATUS RESTARTS AGE java-demo-5cdd74d47-vmqqx 2/2 Running 0 28s my-sonarqube-postgresql-0 1/1 Running 8 (2d21h ago) 9d my-sonarqube-sonarqube-0 0/1 Pending 0 6d6h python-demo-69c56c549c-jcgmj 1/1 Running 0 17m redis-5ff4857944-v2vz5 1/1 Running 5 (2d21h ago) 6d2h ^Croot@k8s01:~/helm/opentelemetry# kubectl get opentelemetrycollectors -A NAMESPACE NAME MODE VERSION READY AGE IMAGE MANAGEMENT opentelemetry center deployment 0.127.0 1/1 3h22m registry.cn-guangzhou.aliyuncs.com/xingcangku/opentelemetry-collector-0.127.0:0.127.0 managed opentelemetry sidecar sidecar 0.127.0 3h19m managed root@k8s01:~/helm/opentelemetry# kubectl get instrumentations -A NAMESPACE NAME AGE ENDPOINT SAMPLER SAMPLER ARG opentelemetry java-instrumentation 2m26s always_on #查看 sidecar日志,已正常启动并发送 spans 数据 root@k8s01:~/helm/opentelemetry# kubectl logs java-demo-5cdd74d47-vmqqx -c otc-container 2025-06-14T15:31:35.013Z info service@v0.127.0/service.go:199 Setting up own telemetry... {"resource": {}} 2025-06-14T15:31:35.014Z debug builders/builders.go:24 Stable component. {"resource": {}, "otelcol.component.id": "otlp", "otelcol.component.kind": "exporter", "otelcol.signal": "traces"} 2025-06-14T15:31:35.014Z info builders/builders.go:26 Development component. May change in the future. {"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "exporter", "otelcol.signal": "traces"} 2025-06-14T15:31:35.014Z debug builders/builders.go:24 Beta component. May change in the future. {"resource": {}, "otelcol.component.id": "batch", "otelcol.component.kind": "processor", "otelcol.pipeline.id": "traces", "otelcol.signal": "traces"} 2025-06-14T15:31:35.014Z debug builders/builders.go:24 Stable component. {"resource": {}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "otelcol.signal": "traces"} 2025-06-14T15:31:35.014Z debug otlpreceiver@v0.127.0/otlp.go:58 created signal-agnostic logger {"resource": {}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver"} 2025-06-14T15:31:35.021Z info service@v0.127.0/service.go:266 Starting otelcol... {"resource": {}, "Version": "0.127.0", "NumCPU": 8} 2025-06-14T15:31:35.021Z info extensions/extensions.go:41 Starting extensions... {"resource": {}} 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:176 [core] original dial target is: "center-collector.opentelemetry.svc:4317" {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:459 [core] [Channel #1]Channel created {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:207 [core] [Channel #1]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"passthrough", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/center-collector.opentelemetry.svc:4317", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}} {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:208 [core] [Channel #1]Channel authority set to "center-collector.opentelemetry.svc:4317" {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.022Z info grpc@v1.72.1/resolver_wrapper.go:210 [core] [Channel #1]Resolver state updated: { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": [ { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Attributes": null } ], "ServiceConfig": null, "Attributes": null } (resolver returned new addresses) {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.022Z info grpc@v1.72.1/balancer_wrapper.go:122 [core] [Channel #1]Channel switches to new LB policy "pick_first" {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.023Z info gracefulswitch/gracefulswitch.go:194 [pick-first-leaf-lb] [pick-first-leaf-lb 0xc000bc6090] Received new config { "shuffleAddressList": false }, resolver state { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": [ { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Attributes": null } ], "ServiceConfig": null, "Attributes": null } {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.023Z info grpc@v1.72.1/clientconn.go:563 [core] [Channel #1]Channel Connectivity change to CONNECTING{"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.023Z info grpc@v1.72.1/balancer_wrapper.go:195 [core] [Channel #1 SubChannel #2]Subchannel created {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.023Z info grpc@v1.72.1/clientconn.go:364 [core] [Channel #1]Channel exiting idle mode {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.023Z info grpc@v1.72.1/clientconn.go:1224 [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to CONNECTING {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.024Z info grpc@v1.72.1/clientconn.go:1343 [core] [Channel #1 SubChannel #2]Subchannel picks a new address "center-collector.opentelemetry.svc:4317" to connect {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.024Z info grpc@v1.72.1/server.go:690 [core] [Server #3]Server created {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.024Z info otlpreceiver@v0.127.0/otlp.go:116 Starting GRPC server {"resource": {}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "0.0.0.0:4317"} 2025-06-14T15:31:35.025Z info grpc@v1.72.1/server.go:886 [core] [Server #3 ListenSocket #4]ListenSocket created {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.025Z info otlpreceiver@v0.127.0/otlp.go:173 Starting HTTP server {"resource": {}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "0.0.0.0:4318"} 2025-06-14T15:31:35.026Z info service@v0.127.0/service.go:289 Everything is ready. Begin running and processing data. {"resource": {}} 2025-06-14T15:31:35.034Z info grpc@v1.72.1/clientconn.go:1224 [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to READY {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.034Z info pickfirstleaf/pickfirstleaf.go:197 [pick-first-leaf-lb] [pick-first-leaf-lb 0xc000bc6090] SubConn 0xc0008e1db0 reported connectivity state READY and the health listener is disabled. Transitioning SubConn to READY. {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.034Z info grpc@v1.72.1/clientconn.go:563 [core] [Channel #1]Channel Connectivity change to READY {"resource": {}, "grpc_log": true} root@k8s01:~/helm/opentelemetry# kubectl logs java-demo-5cdd74d47-vmqqx -c otc-container 2025-06-14T15:31:35.013Z info service@v0.127.0/service.go:199 Setting up own telemetry... {"resource": {}} 2025-06-14T15:31:35.014Z debug builders/builders.go:24 Stable component. {"resource": {}, "otelcol.component.id": "otlp 2025-06-14T15:31:35.014Z info builders/builders.go:26 Development component. May change in the future. {"resource": {aces"} 2025-06-14T15:31:35.014Z debug builders/builders.go:24 Beta component. May change in the future. {"resource": {}, "oteles", "otelcol.signal": "traces"} 2025-06-14T15:31:35.014Z debug builders/builders.go:24 Stable component. {"resource": {}, "otelcol.component.id": "otlp 2025-06-14T15:31:35.014Z debug otlpreceiver@v0.127.0/otlp.go:58 created signal-agnostic logger {"resource": {}, "otel 2025-06-14T15:31:35.021Z info service@v0.127.0/service.go:266 Starting otelcol... {"resource": {}, "Version": "0.127.0", 2025-06-14T15:31:35.021Z info extensions/extensions.go:41 Starting extensions... {"resource": {}} 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:176 [core] original dial target is: "center-collector.opentelemetr 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:459 [core] [Channel #1]Channel created {"resource": {}, "grpc 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:207 [core] [Channel #1]parsed dial target is: resolver.Target{URL:ector.opentelemetry.svc:4317", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}} {"resource": { 2025-06-14T15:31:35.021Z info grpc@v1.72.1/clientconn.go:208 [core] [Channel #1]Channel authority set to "center-collector. 2025-06-14T15:31:35.022Z info grpc@v1.72.1/resolver_wrapper.go:210 [core] [Channel #1]Resolver state updated: { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": [ { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Attributes": null } ], "ServiceConfig": null, "Attributes": null } (resolver returned new addresses) {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.022Z info grpc@v1.72.1/balancer_wrapper.go:122 [core] [Channel #1]Channel switches to new LB policy " 2025-06-14T15:31:35.023Z info gracefulswitch/gracefulswitch.go:194 [pick-first-leaf-lb] [pick-first-leaf-lb 0xc000bc6090] "shuffleAddressList": false }, resolver state { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": [ { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Attributes": null } ], "ServiceConfig": null, "Attributes": null } {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.023Z info grpc@v1.72.1/clientconn.go:563 [core] [Channel #1]Channel Connectivity change to CONNECTING 2025-06-14T15:31:35.023Z info grpc@v1.72.1/balancer_wrapper.go:195 [core] [Channel #1 SubChannel #2]Subchannel created 2025-06-14T15:31:35.023Z info grpc@v1.72.1/clientconn.go:364 [core] [Channel #1]Channel exiting idle mode {"resource": { 2025-06-14T15:31:35.023Z info grpc@v1.72.1/clientconn.go:1224 [core] [Channel #1 SubChannel #2]Subchannel Connectivity chang 2025-06-14T15:31:35.024Z info grpc@v1.72.1/clientconn.go:1343 [core] [Channel #1 SubChannel #2]Subchannel picks a new addres 2025-06-14T15:31:35.024Z info grpc@v1.72.1/server.go:690 [core] [Server #3]Server created {"resource": {}, "grpc 2025-06-14T15:31:35.024Z info otlpreceiver@v0.127.0/otlp.go:116 Starting GRPC server {"resource": {}, "otelcol.comp 2025-06-14T15:31:35.025Z info grpc@v1.72.1/server.go:886 [core] [Server #3 ListenSocket #4]ListenSocket created {"reso 2025-06-14T15:31:35.025Z info otlpreceiver@v0.127.0/otlp.go:173 Starting HTTP server {"resource": {}, "otelcol.comp 2025-06-14T15:31:35.026Z info service@v0.127.0/service.go:289 Everything is ready. Begin running and processing data. {"reso 2025-06-14T15:31:35.034Z info grpc@v1.72.1/clientconn.go:1224 [core] [Channel #1 SubChannel #2]Subchannel Connectivity chang 2025-06-14T15:31:35.034Z info pickfirstleaf/pickfirstleaf.go:197 [pick-first-leaf-lb] [pick-first-leaf-lb 0xc000bc6090]ansitioning SubConn to READY. {"resource": {}, "grpc_log": true} 2025-06-14T15:31:35.034Z info grpc@v1.72.1/clientconn.go:563 [core] [Channel #1]Channel Connectivity change to READY {"reso #查看collector 日志,已经收到 traces 数据 root@k8s01:~/helm/opentelemetry# kubectl get pod -n opentelemetry NAME READY STATUS RESTARTS AGE center-collector-78f7bbdf45-j798s 1/1 Running 0 3h24m root@k8s01:~/helm/opentelemetry# kubectl get -n opentelemetry pods NAME READY STATUS RESTARTS AGE center-collector-78f7bbdf45-j798s 1/1 Running 0 3h25m root@k8s01:~/helm/opentelemetry# kubectl logs -n opentelemetry center-collector-78f7bbdf45-j798s 2025-06-14T12:09:21.290Z info service@v0.127.0/service.go:199 Setting up own telemetry... {"resource": {}} 2025-06-14T12:09:21.291Z info builders/builders.go:26 Development component. May change in the future. {"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "exporter", "otelcol.signal": "traces"} 2025-06-14T12:09:21.294Z info service@v0.127.0/service.go:266 Starting otelcol... {"resource": {}, "Version": "0.127.0", "NumCPU": 8} 2025-06-14T12:09:21.294Z info extensions/extensions.go:41 Starting extensions... {"resource": {}} 2025-06-14T12:09:21.294Z info otlpreceiver@v0.127.0/otlp.go:116 Starting GRPC server {"resource": {}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "0.0.0.0:4317"} 2025-06-14T12:09:21.295Z info otlpreceiver@v0.127.0/otlp.go:173 Starting HTTP server {"resource": {}, "otelcol.component.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "0.0.0.0:4318"} 2025-06-14T12:09:21.295Z info service@v0.127.0/service.go:289 Everything is ready. Begin running and processing data. {"resource": {}} root@k8s01:~/helm/opentelemetry# 2、python应用自动埋点与 java 应用类似,python 应用同样也支持自动埋点, OpenTelemetry 提供了 opentelemetry-instrument CLI 工具,在启动 Python 应用时通过 sitecustomize 或环境变量注入自动 instrumentation。 我们先创建一个java-instrumentation 资源apiVersion: opentelemetry.io/v1alpha1 kind: Instrumentation # 声明资源类型为 Instrumentation(用于语言自动注入) metadata: name: python-instrumentation # Instrumentation 资源的名称(可以被 Deployment 等引用) namespace: opentelemetry spec: propagators: # 指定用于 trace 上下文传播的方式,支持多种格式 - tracecontext # W3C Trace Context(最通用的跨服务追踪格式) - baggage # 传播用户定义的上下文键值对 - b3 # Zipkin 的 B3 header(用于兼容 Zipkin 环境) sampler: # 定义采样策略(决定是否收集 trace) type: always_on # 始终采样所有请求(适合测试或调试环境) python: image: registry.cn-guangzhou.aliyuncs.com/xingcangku/autoinstrumentation-python:latest env: - name: OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED # 启用日志的自动检测 value: "true" - name: OTEL_PYTHON_LOG_CORRELATION # 在日志中启用跟踪上下文注入 value: "true" - name: OTEL_EXPORTER_OTLP_ENDPOINT value: http://center-collector.opentelemetry.svc:4318^Croot@k8s01:~/helm/opentelemetry# cat new-python-demo.yaml apiVersion: apps/v1 kind: Deployment metadata: name: python-demo spec: selector: matchLabels: app: python-demo template: metadata: labels: app: python-demo annotations: instrumentation.opentelemetry.io/inject-python: "opentelemetry/python-instrumentation" # 填写 Instrumentation 资源的名称 sidecar.opentelemetry.io/inject: "opentelemetry/sidecar" # 注入一个 sidecar 模式的 OpenTelemetry Collector spec: containers: - name: pyhton-demo image: registry.cn-guangzhou.aliyuncs.com/xingcangku/python-demoapp:latest imagePullPolicy: IfNotPresent resources: limits: memory: "500Mi" cpu: "200m" ports: - containerPort: 5000 oot@k8s03:~# kubectl get pods NAME READY STATUS RESTARTS AGE java-demo-5559f949b9-74p68 2/2 Running 0 2m14s java-demo-5559f949b9-kwgpc 0/2 Terminating 0 14m my-sonarqube-postgresql-0 1/1 Running 8 (2d22h ago) 9d my-sonarqube-sonarqube-0 0/1 Pending 0 6d7h python-demo-599fc7f8d6-lbhnr 2/2 Running 0 20m redis-5ff4857944-v2vz5 1/1 Running 5 (2d22h ago) 6d3h root@k8s03:~# kubectl logs python-demo-599fc7f8d6-lbhnr -c otc-container 2025-06-14T15:57:12.951Z info service@v0.127.0/service.go:199 Setting up own telemetry... {"resource": {}} 2025-06-14T15:57:12.952Z info builders/builders.go:26 Development component. May change in the future. {"resource{}, "otelcol.component.id": "debug", "otelcol.component.kind": "exporter", "otelcol.signal": "traces"} 2025-06-14T15:57:12.952Z debug builders/builders.go:24 Stable component. {"resource": {}, "otelcol.component.id": "p", "otelcol.component.kind": "exporter", "otelcol.signal": "traces"} 2025-06-14T15:57:12.952Z debug builders/builders.go:24 Beta component. May change in the future. {"resource": {}, "lcol.component.id": "batch", "otelcol.component.kind": "processor", "otelcol.pipeline.id": "traces", "otelcol.signal": "traces"} 2025-06-14T15:57:12.952Z debug builders/builders.go:24 Stable component. {"resource": {}, "otelcol.component.id": "p", "otelcol.component.kind": "receiver", "otelcol.signal": "traces"} 2025-06-14T15:57:12.952Z debug otlpreceiver@v0.127.0/otlp.go:58 created signal-agnostic logger {"resource": {}, "lcol.component.id": "otlp", "otelcol.component.kind": "receiver"} 2025-06-14T15:57:12.953Z info service@v0.127.0/service.go:266 Starting otelcol... {"resource": {}, "Version": "0.127, "NumCPU": 8} 2025-06-14T15:57:12.953Z info extensions/extensions.go:41 Starting extensions... {"resource": {}} 2025-06-14T15:57:12.953Z info grpc@v1.72.1/clientconn.go:176 [core] original dial target is: "center-collector.opentelery.svc:4317" {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/clientconn.go:459 [core] [Channel #1]Channel created {"resource": {}, "c_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/clientconn.go:207 [core] [Channel #1]parsed dial target is: resolver.Target{:url.URL{Scheme:"passthrough", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/center-collector.opentelemetry.svc:4317", Rawh:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}} {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/clientconn.go:208 [core] [Channel #1]Channel authority set to "center-collec.opentelemetry.svc:4317" {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/resolver_wrapper.go:210 [core] [Channel #1]Resolver state updated: { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": [ { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Attributes": null } ], "ServiceConfig": null, "Attributes": null } (resolver returned new addresses) {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/balancer_wrapper.go:122 [core] [Channel #1]Channel switches to new LB poli"pick_first" {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info gracefulswitch/gracefulswitch.go:194 [pick-first-leaf-lb] [pick-first-leaf-lb 0xc00046e] Received new config { "shuffleAddressList": false }, resolver state { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": [ { "Addresses": [ { "Addr": "center-collector.opentelemetry.svc:4317", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Attributes": null } ], "ServiceConfig": null, "Attributes": null } {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/clientconn.go:563 [core] [Channel #1]Channel Connectivity change to CONNECTI"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/balancer_wrapper.go:195 [core] [Channel #1 SubChannel #2]Subchannel create"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/clientconn.go:364 [core] [Channel #1]Channel exiting idle mode {"resource{}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/clientconn.go:1224 [core] [Channel #1 SubChannel #2]Subchannel Connectivity cge to CONNECTING {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/clientconn.go:1343 [core] [Channel #1 SubChannel #2]Subchannel picks a new adss "center-collector.opentelemetry.svc:4317" to connect {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.954Z info grpc@v1.72.1/server.go:690 [core] [Server #3]Server created {"resource": {}, "c_log": true} 2025-06-14T15:57:12.954Z info otlpreceiver@v0.127.0/otlp.go:116 Starting GRPC server {"resource": {}, "otelcol.ponent.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "0.0.0.0:4317"} 2025-06-14T15:57:12.954Z info otlpreceiver@v0.127.0/otlp.go:173 Starting HTTP server {"resource": {}, "otelcol.ponent.id": "otlp", "otelcol.component.kind": "receiver", "endpoint": "0.0.0.0:4318"} 2025-06-14T15:57:12.954Z info service@v0.127.0/service.go:289 Everything is ready. Begin running and processing data. {"ource": {}} 2025-06-14T15:57:12.955Z info grpc@v1.72.1/server.go:886 [core] [Server #3 ListenSocket #4]ListenSocket created {"ource": {}, "grpc_log": true} 2025-06-14T15:57:12.962Z info grpc@v1.72.1/clientconn.go:1224 [core] [Channel #1 SubChannel #2]Subchannel Connectivity cge to READY {"resource": {}, "grpc_log": true} 2025-06-14T15:57:12.962Z info pickfirstleaf/pickfirstleaf.go:197 [pick-first-leaf-lb] [pick-first-leaf-lb 0xc00046e] SubConn 0xc0005fccd0 reported connectivity state READY and the health listener is disabled. Transitioning SubConn to READY. {"ource": {}, "grpc_log": true} 2025-06-14T15:57:12.962Z info grpc@v1.72.1/clientconn.go:563 [core] [Channel #1]Channel Connectivity change to READY {"ource": {}, "grpc_log": true} root@k8s03:~# root@k8s03:~# kubectl logs -n opentelemetry center-collector-78f7bbdf45-j798s 2025-06-14T12:09:21.290Z info service@v0.127.0/service.go:199 Setting up own telemetry... {"resource": {}} 2025-06-14T12:09:21.291Z info builders/builders.go:26 Development component. May change in the future. {"resourceaces"} 2025-06-14T12:09:21.294Z info service@v0.127.0/service.go:266 Starting otelcol... {"resource": {}, "Version": "0.127 2025-06-14T12:09:21.294Z info extensions/extensions.go:41 Starting extensions... {"resource": {}} 2025-06-14T12:09:21.294Z info otlpreceiver@v0.127.0/otlp.go:116 Starting GRPC server {"resource": {}, "otelcol. 2025-06-14T12:09:21.295Z info otlpreceiver@v0.127.0/otlp.go:173 Starting HTTP server {"resource": {}, "otelcol. 2025-06-14T12:09:21.295Z info service@v0.127.0/service.go:289 Everything is ready. Begin running and processing data. {" 2025-06-14T16:05:11.811Z info Traces {"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "expor 2025-06-14T16:05:16.636Z info Traces {"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "expor 2025-06-14T16:05:26.894Z info Traces {"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "expor 2025-06-14T16:18:11.294Z info Traces {"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "expor 2025-06-14T16:18:21.350Z info Traces {"resource": {}, "otelcol.component.id": "debug", "otelcol.component.kind": "expor root@k8s03:~#
2025年06月14日
10 阅读
0 评论
0 点赞
2025-06-14
OpenTelemetry部署
建议使用 OpenTelemetry Operator 来部署,因为它可以帮助我们轻松部署和管理 OpenTelemetry 收集器,还可以自动检测应用程序。具体可参考文档https://opentelemetry.io/docs/platforms/kubernetes/operator/一、部署cert-manager因为 Operator 使用了 Admission Webhook 通过 HTTP 回调机制对资源进行校验/修改。Kubernetes 要求 Webhook 服务必须使用 TLS,因此 Operator 需要为其 webhook server 签发证书,所以需要先安装cert-manager。# wget https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml # kubectl apply -f cert-manager.yaml root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get -n cert-manager pod NAME READY STATUS RESTARTS AGE cert-manager-7bd494778-gs44k 1/1 Running 0 37s cert-manager-cainjector-76474c8c48-w9r5p 1/1 Running 0 37s cert-manager-webhook-6797c49f67-thvcz 1/1 Running 0 37s root@k8s01:~/helm/opentelemetry/cert-manager# 二、部署Operator在 Kubernetes 上使用 OpenTelemetry,主要就是部署 OpenTelemetry 收集器。# wget https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml # kubectl apply -f opentelemetry-operator.yaml # kubectl get pod -n opentelemetry-operator-system NAME READY STATUS RESTARTS AGE opentelemetry-operator-controller-manager-6d94c5db75-cz957 2/2 Running 0 74s # kubectl get crd |grep opentelemetry instrumentations.opentelemetry.io 2025-04-21T09:48:53Z opampbridges.opentelemetry.io 2025-04-21T09:48:54Z opentelemetrycollectors.opentelemetry.io 2025-04-21T09:48:54Z targetallocators.opentelemetry.io 2025-04-21T09:48:54Zroot@k8s01:~/helm/opentelemetry/cert-manager# kubectl apply -f opentelemetry-operator.yaml namespace/opentelemetry-operator-system created customresourcedefinition.apiextensions.k8s.io/instrumentations.opentelemetry.io created customresourcedefinition.apiextensions.k8s.io/opampbridges.opentelemetry.io created customresourcedefinition.apiextensions.k8s.io/opentelemetrycollectors.opentelemetry.io created customresourcedefinition.apiextensions.k8s.io/targetallocators.opentelemetry.io created serviceaccount/opentelemetry-operator-controller-manager created role.rbac.authorization.k8s.io/opentelemetry-operator-leader-election-role created clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-manager-role created clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-metrics-reader created clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-proxy-role created rolebinding.rbac.authorization.k8s.io/opentelemetry-operator-leader-election-rolebinding created clusterrolebinding.rbac.authorization.k8s.io/opentelemetry-operator-manager-rolebinding created clusterrolebinding.rbac.authorization.k8s.io/opentelemetry-operator-proxy-rolebinding created service/opentelemetry-operator-controller-manager-metrics-service created service/opentelemetry-operator-webhook-service created deployment.apps/opentelemetry-operator-controller-manager created Warning: spec.privateKey.rotationPolicy: In cert-manager >= v1.18.0, the default value changed from `Never` to `Always`. certificate.cert-manager.io/opentelemetry-operator-serving-cert created issuer.cert-manager.io/opentelemetry-operator-selfsigned-issuer created mutatingwebhookconfiguration.admissionregistration.k8s.io/opentelemetry-operator-mutating-webhook-configuration created validatingwebhookconfiguration.admissionregistration.k8s.io/opentelemetry-operator-validating-webhook-configuration created root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get pods -n opentelemetry-operator-system NAME READY STATUS RESTARTS AGE opentelemetry-operator-controller-manager-f78fc55f7-xtjk2 2/2 Running 0 107s root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get crd |grep opentelemetry instrumentations.opentelemetry.io 2025-06-14T11:30:01Z opampbridges.opentelemetry.io 2025-06-14T11:30:01Z opentelemetrycollectors.opentelemetry.io 2025-06-14T11:30:02Z targetallocators.opentelemetry.io 2025-06-14T11:30:02Z三、部署Collector(中心)接下来我们部署一个精简版的 OpenTelemetry Collector,用于接收 OTLP 格式的 trace 数据,通过 gRPC 或 HTTP 协议接入,经过内存控制与批处理后,打印到日志中以供调试使用。 root@k8s01:~/helm/opentelemetry/cert-manager# cat center-collector.yaml apiVersion: opentelemetry.io/v1beta1 kind: OpenTelemetryCollector # 元数据定义部分 metadata: name: center # Collector 的名称为 center namespace: opentelemetry # 具体的配置内容 spec: image: registry.cn-guangzhou.aliyuncs.com/xingcangku/opentelemetry-collector-0.127.0:0.127.0 replicas: 1 # 设置副本数量为1 config: # 定义 Collector 配置 receivers: # 接收器,用于接收遥测数据(如 trace、metrics、logs) otlp: # 配置 OTLP(OpenTelemetry Protocol)接收器 protocols: # 启用哪些协议来接收数据 grpc: endpoint: 0.0.0.0:4317 # 启用 gRPC 协议 http: endpoint: 0.0.0.0:4318 # 启用 HTTP 协议 processors: # 处理器,用于处理收集到的数据 batch: {} # 批处理器,用于将数据分批发送,提高效率 exporters: # 导出器,用于将处理后的数据发送到后端系统 debug: {} # 使用 debug 导出器,将数据打印到终端(通常用于测试或调试) service: # 服务配置部分 pipelines: # 定义处理管道 traces: # 定义 trace 类型的管道 receivers: [otlp] # 接收器为 OTLP processors: [batch] # 使用批处理器 exporters: [debug] # 将数据打印到终端 root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get pod -n opentelemetry NAME READY STATUS RESTARTS AGE center-collector-78f7bbdf45-j798s 1/1 Running 0 43s center-collector-7b7b8b9b97-qwhdr 0/1 Terminating 0 12m root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get svc -n opentelemetry NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE center-collector ClusterIP 10.105.241.233 <none> 4317/TCP,4318/TCP 49s center-collector-headless ClusterIP None <none> 4317/TCP,4318/TCP 49s center-collector-monitoring ClusterIP 10.96.61.65 <none> 8888/TCP 49s root@k8s01:~/helm/opentelemetry/cert-manager# 四、部署Collector(代理)我们使用 Sidecar 模式部署 OpenTelemetry 代理。该代理会将应用程序的追踪发送到我们刚刚部署的中心OpenTelemetry 收集器。root@k8s01:~/helm/opentelemetry/cert-manager# cat center-collector.yaml apiVersion: opentelemetry.io/v1beta1 kind: OpenTelemetryCollector # 元数据定义部分 metadata: name: center # Collector 的名称为 center namespace: opentelemetry # 具体的配置内容 spec: image: registry.cn-guangzhou.aliyuncs.com/xingcangku/opentelemetry-collector-0.127.0:0.127.0 replicas: 1 # 设置副本数量为1 config: # 定义 Collector 配置 receivers: # 接收器,用于接收遥测数据(如 trace、metrics、logs) otlp: # 配置 OTLP(OpenTelemetry Protocol)接收器 protocols: # 启用哪些协议来接收数据 grpc: endpoint: 0.0.0.0:4317 # 启用 gRPC 协议 http: endpoint: 0.0.0.0:4318 # 启用 HTTP 协议 processors: # 处理器,用于处理收集到的数据 batch: {} # 批处理器,用于将数据分批发送,提高效率 exporters: # 导出器,用于将处理后的数据发送到后端系统 debug: {} # 使用 debug 导出器,将数据打印到终端(通常用于测试或调试) service: # 服务配置部分 pipelines: # 定义处理管道 traces: # 定义 trace 类型的管道 receivers: [otlp] # 接收器为 OTLP processors: [batch] # 使用批处理器 exporters: [debug] # 将数据打印到终端 root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get pod -n opentelemetry NAME READY STATUS RESTARTS AGE center-collector-78f7bbdf45-j798s 1/1 Running 0 43s center-collector-7b7b8b9b97-qwhdr 0/1 Terminating 0 12m root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get svc -n opentelemetry NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE center-collector ClusterIP 10.105.241.233 <none> 4317/TCP,4318/TCP 49s center-collector-headless ClusterIP None <none> 4317/TCP,4318/TCP 49s center-collector-monitoring ClusterIP 10.96.61.65 <none> 8888/TCP 49s root@k8s01:~/helm/opentelemetry/cert-manager# vi sidecar-collector.yaml root@k8s01:~/helm/opentelemetry/cert-manager# kubectl apply -f sidecar-collector.yaml opentelemetrycollector.opentelemetry.io/sidecar created root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get opentelemetrycollectors -n opentelemetry NAME MODE VERSION READY AGE IMAGE MANAGEMENT center deployment 0.127.0 1/1 3m3s registry.cn-guangzhou.aliyuncs.com/xingcangku/opentelemetry-collector-0.127.0:0.127.0 managed sidecar sidecar 0.127.0 7s managed root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get opentelemetrycollectors -n opentelemetry NAME MODE VERSION READY AGE IMAGE center deployment 0.127.0 1/1 3m8s registry.cn-guangzhou.aliyuncs.com/xingcangku/opentelemetry-collector-0.127.0:0.127.0 sidecar sidecar 0.127.0 12s root@k8s01:~/helm/opentelemetry/cert-manager# kubectl get pod -n opentelemetry NAME READY STATUS RESTARTS AGE center-collector-78f7bbdf45-j798s 1/1 Running 0 3m31s center-collector-7b7b8b9b97-qwhdr 0/1 Terminating 0 15m sidecar 代理依赖于应用程序启动,因此现在创建后并不会立即启动,需要我们创建一个应用程序并使用这个 sidecar 模式的 collector。
2025年06月14日
6 阅读
0 评论
0 点赞
2025-06-08
traefik添加路由
一、路由配置(IngressRouteTCP) 1、TCP路由(不带TLS证书)#在yaml文件中添加端口 ports: - name: metrics containerPort: 9100 protocol: TCP - name: traefik containerPort: 9000 protocol: TCP - name: web containerPort: 8000 protocol: TCP - name: websecure containerPort: 8443 protocol: TCP securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL - NET_BIND_SERVICE readOnlyRootFilesystem: false volumeMounts: - name: data mountPath: /data readOnly: false # 允许写入 args: - "--global.checknewversion" - "--global.sendanonymoususage" - "--entryPoints.metrics.address=:9100/tcp" - "--entryPoints.traefik.address=:9000/tcp" - "--entryPoints.web.address=:8000/tcp" - "--entryPoints.websecure.address=:8443/tcp" - "--api.dashboard=true" - "--entryPoints.redistcp.address=:6379/tcp" #添加端口 - "--ping=true" - "--metrics.prometheus=true" - "--metrics.prometheus.entrypoint=metrics" - "--providers.kubernetescrd" - "--providers.kubernetescrd.allowEmptyServices=true" - "--providers.kubernetesingress" - "--providers.kubernetesingress.allowEmptyServices=true" - "--providers.kubernetesingress.ingressendpoint.publishedservice=traefik/traefik-release" - "--entryPoints.websecure.http.tls=true" - "--log.level=INFO"#实例 root@k8s01:~/helm/traefik/test# cat redis.yaml # redis.yaml apiVersion: apps/v1 kind: Deployment metadata: name: redis spec: selector: matchLabels: app: redis template: metadata: labels: app: redis spec: containers: - name: redis image: registry.cn-guangzhou.aliyuncs.com/xingcangku/redis:1.0 resources: limits: memory: "128Mi" cpu: "500m" ports: - containerPort: 6379 protocol: TCP --- apiVersion: v1 kind: Service metadata: name: redis spec: selector: app: redis ports: - port: 6379 targetPort: 6379 root@k8s01:~/helm/traefik/test# cat redis-IngressRoute.yaml apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP metadata: name: redis namespace: default # 确保在 Redis 的命名空间 spec: entryPoints: - redistcp # 使用新定义的入口点 routes: - match: HostSNI(`*`) # 正确语法 services: - name: redis port: 6379 #redis_Ingress.yaml apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP metadata: name: redis namespace: default # 确保在 Redis 的命名空间 spec: entryPoints: - redistcp # 使用新定义的入口点 routes: - match: HostSNI(`*`) # 正确语法 services: - name: redis port: 6379#可以查询6379映射的端口 kubectl -n traefik get endpoints #集群外部客户端配置hosts解析192.168.93.128 redis.test.com(域名可以随意填写,只要能解析到traefik所在节点即可),然后通过redis-cli工具访问redis,记得指定tcpep的端口。 # redis-cli -h redis.test.com -p 上面查询映射的端口 redis.test.com:9200> set key_a value_a OK redis.test.com:9200> get key_a "value_a" redis.test.com:9200> 2、TCP路由(带TLS证书)有时候为了安全要求,tcp传输也需要使用TLS证书加密,redis从6.0开始支持了tls证书通信。root@k8s01:~/helm/traefik/test/redis-ssl# openssl genrsa -out ca.key 4096 Generating RSA private key, 4096 bit long modulus (2 primes) ..........................................................................................................................................................................................................................................................++++ ................++++ e is 65537 (0x010001) root@k8s01:~/helm/traefik/test/redis-ssl# ls ca.key root@k8s01:~/helm/traefik/test/redis-ssl# openssl req -x509 -new -nodes -sha256 -key ca.key -days 3650 -subj '/O=Redis Test/CN=Certificate Authority' -out ca.crt root@k8s01:~/helm/traefik/test/redis-ssl# ls ca.crt ca.key root@k8s01:~/helm/traefik/test/redis-ssl# openssl genrsa -out redis.key 2048 Generating RSA private key, 2048 bit long modulus (2 primes) ..........................................................+++++ ...................................................................+++++ e is 65537 (0x010001) root@k8s01:~/helm/traefik/test/redis-ssl# openssl req -new -sha256 -key redis.key -subj '/O=Redis Test/CN=Server' | openssl x509 -req -sha256 -CA ca.crt -CAkey ca.key -CAserial ca.txt -CAcreateserial -days 365 -out redis.crt Signature ok subject=O = Redis Test, CN = Server Getting CA Private Key root@k8s01:~/helm/traefik/test/redis-ssl# openssl dhparam -out redis.dh 2048 Generating DH parameters, 2048 bit long safe prime, generator 2 This is going to take a long time ...............................................+..................................................................................................................................................................................................+................+..........................................+...........+.............+..............................................................+................................................................+.+....................................................................................................................................................................................................................................................................................................+..............+................................................................................................................................................................................................+...................................................................................................................................................................................................................................................................................+...........................................................................................................................................................................+........................................+.+.............................+....................................................................................................................................................+.............................................................................+.++*++*++*++* root@k8s01:~/helm/traefik/test/redis-ssl# ll total 24 drwxr-xr-x 2 root root 98 Jun 8 06:57 ./ drwxr-xr-x 3 root root 249 Jun 8 06:55 ../ -rw-r--r-- 1 root root 1895 Jun 8 06:56 ca.crt -rw------- 1 root root 3243 Jun 8 06:55 ca.key -rw-r--r-- 1 root root 41 Jun 8 06:57 ca.txt -rw-r--r-- 1 root root 1407 Jun 8 06:57 redis.crt -rw-r--r-- 1 root root 424 Jun 8 06:57 redis.dh -rw------- 1 root root 1679 Jun 8 06:56 redis.key root@k8s01:~/helm/traefik/test/redis-ssl# ca.crt # 根证书 (公开) ca.key # CA 私钥 (绝密) ca.txt # 证书序列号记录 redis.key # Redis 服务器私钥 (绝密) redis.crt # Redis 服务器证书 (公开) redis.dh # Diffie-Hellman 参数 (公开)#生成 CA 私钥 openssl genrsa -out ca.key 4096 作用:创建根证书颁发机构(CA)的私钥 genrsa:生成 RSA 私钥 -out ca.key:输出到文件 ca.key 4096:密钥长度(4096 位高安全级别) 输出文件:ca.key(CA 私钥,需严格保密)#生成 CA 根证书 openssl req -x509 -new -nodes -sha256 -key ca.key -days 3650 -subj '/O=Redis Test/CN=Certificate Authority' -out ca.crt 下面是对您生成 Redis SSL/TLS 证书的每一步命令的解析: 1. 生成 CA 私钥 openssl genrsa -out ca.key 4096 作用:创建根证书颁发机构(CA)的私钥 genrsa:生成 RSA 私钥 -out ca.key:输出到文件 ca.key 4096:密钥长度(4096 位高安全级别) 输出文件:ca.key(CA 私钥,需严格保密) 2. 生成 CA 根证书 openssl req -x509 -new -nodes -sha256 -key ca.key -days 3650 -subj '/O=Redis Test/CN=Certificate Authority' -out ca.crt 作用:创建自签名的根证书 req -x509:创建自签名证书(而不是证书请求) -new:生成新证书 -nodes:不加密私钥(No DES,明文存储) -sha256:使用 SHA-256 哈希算法 -key ca.key:指定 CA 私钥 -days 3650:有效期 10 年 -subj:证书主题信息 /O=Redis Test:组织名称 /CN=Certificate Authority:通用名称(标识为 CA) -out ca.crt:输出到 ca.crt 输出文件:ca.crt(受信任的根证书)#生成 Redis 服务器私钥 openssl genrsa -out redis.key 2048 作用:创建 Redis 服务器的私钥 与步骤 1 类似,但密钥长度 2048 位(更短,性能更好) 输出文件:redis.key(服务器私钥,需保密)#生成并签署 Redis 服务器证书 openssl req -new -sha256 -key redis.key -subj '/O=Redis Test/CN=Server' | openssl x509 -req -sha256 -CA ca.crt -CAkey ca.key -CAserial ca.txt -CAcreateserial -days 365 -out redis.crt #前半部分(生成证书请求) openssl req -new -sha256 -key redis.key -subj '/O=Redis Test/CN=Server' req -new:创建证书签名请求(CSR) -subj '/O=Redis Test/CN=Server':主题信息 CN=Server:标识为 Redis 服务器 #后半部分(签署证书) openssl x509 -req ... -out redis.crt x509 -req:签署证书请求 -CA ca.crt:指定 CA 证书 -CAkey ca.key:指定 CA 私钥 -CAserial ca.txt:指定序列号记录文件 -CAcreateserial:如果序列号文件不存在则创建 -days 365:有效期 1 年 输出文件: redis.crt:Redis 服务器证书 ca.txt:证书序列号记录文件(首次生成) #生成 Diffie-Hellman 参数 openssl dhparam -out redis.dh 2048 作用:创建安全密钥交换参数 dhparam:生成 Diffie-Hellman 参数 2048:密钥长度 输出文件:redis.dh(用于前向保密 PFS)创建secret资源,使用tls类型,包含redis.crt和redis.keyroot@k8s01:~/helm/traefik/test/redis-ssl# kubectl create secret tls redis-tls --key=redis.key --cert=redis.crt secret/redis-tls created root@k8s01:~/helm/traefik/test/redis-ssl# kubectl describe secrets redis-tls Name: redis-tls Namespace: default Labels: <none> Annotations: <none> Type: kubernetes.io/tls Data ==== tls.crt: 1407 bytes tls.key: 1679 bytes root@k8s01:~/helm/traefik/test/redis-ssl# 创建secret资源,使用generic类型,包含ca.crtroot@k8s01:~/helm/traefik/test/redis-ssl# kubectl create secret generic redis-ca --from-file=ca.crt=ca.crt secret/redis-ca created root@k8s01:~/helm/traefik/test/redis-ssl# kubectl describe secrets redis-ca Name: redis-ca Namespace: default Labels: <none> Annotations: <none> Type: Opaque Data ==== ca.crt: 1895 bytes root@k8s01:~/helm/traefik/test/redis-ssl# 修改redis配置,启用tls证书,并挂载证书文件apiVersion: v1 kind: ConfigMap metadata: name: redis labels: app: redis data: redis.conf : |- port 0 tls-port 6379 tls-cert-file /etc/tls/tls.crt tls-key-file /etc/tls/tls.key tls-ca-cert-file /etc/ca/ca.crt --- apiVersion: apps/v1 kind: Deployment metadata: name: redis spec: selector: matchLabels: app: redis template: metadata: labels: app: redis spec: containers: - name: redis image: registry.cn-guangzhou.aliyuncs.com/xingcangku/redis:1.0 resources: limits: memory: "128Mi" cpu: "500m" ports: - containerPort: 6379 protocol: TCP volumeMounts: - name: config mountPath: /etc/redis - name: tls mountPath: /etc/tls - name: ca mountPath: /etc/ca args: - /etc/redis/redis.conf volumes: - name: config configMap: name: redis - name: tls secret: secretName: redis-tls - name: ca secret: secretName: redis-ca --- apiVersion: v1 kind: Service metadata: name: redis spec: selector: app: redis ports: - port: 6379 targetPort: 6379root@k8s01:~/helm/traefik/test# kubectl apply -f redis.yaml configmap/redis created deployment.apps/redis configured service/redis unchangedroot@k8s01:~/helm/traefik/test/redis-ssl# # 生成客户端证书 root@k8s01:~/helm/traefik/test/redis-ssl# openssl req -newkey rsa:4096 -nodes \ > -keyout client.key \ > -subj "/CN=redis-client" \ > -out client.csr rt \ -days 365Generating a RSA private key ...........................................++++ .........................................................................................................++++ writing new private key to 'client.key' ----- root@k8s01:~/helm/traefik/test/redis-ssl# root@k8s01:~/helm/traefik/test/redis-ssl# # 用同一CA签发 root@k8s01:~/helm/traefik/test/redis-ssl# openssl x509 -req -in client.csr \ > -CA ca.crt \ > -CAkey ca.key \ > -CAcreateserial \ > -out client.crt \ > -days 365 Signature ok subject=CN = redis-client Getting CA Private Key root@k8s01:~/helm/traefik/test/redis-ssl# scp {client.crt,client.key,ca.crt} root@192.168.3.131:/tmp/redis-ssl/ root@192.168.3.131's password: client.crt 100% 1732 308.5KB/s 00:00 client.key 100% 3272 972.0KB/s 00:00 ca.crt 100% 1895 1.0MB/s 00:00 root@k8s01:~/helm/traefik/test# cat redis-IngressRoute-tls.yaml # redis-IngressRoute-tls.yaml apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP metadata: name: redis spec: entryPoints: - tcpep routes: - match: HostSNI(`redis.test.com`) services: - name: redis port: 6379 tls: passthrough: true # 重要:这里是启用透传的正确位置 root@k8s01:~/helm/traefik/test# #到集群意外的机器测试redis连接 低版本不支持TLS,需要编译安装6.0以上版本,并在编译时开启TLS root@ubuntu02:~/redis-stable# ./src/redis-cli -h redis.test.com -p 31757 --tls --sni redis.test.com --cert /tmp/redis-ssl/client.crt --key /tmp/redis-ssl/client.key --cacert /tmp/redis-ssl/ca.crt redis.test.com:31757> get key "1" redis.test.com:31757> root@ubuntu02:~/redis-stable# ./src/redis-cli \ > -h redis.test.com \ > -p 31757 \ > --tls \ > --sni redis.test.com \ > --cert /tmp/redis-ssl/client.crt \ > --key /tmp/redis-ssl/client.key \ > --cacert /tmp/redis-ssl/ca.crt redis.test.com:31757> set v1=1 (error) ERR wrong number of arguments for 'set' command redis.test.com:31757> set key 1 OK #配置对比 # redis-IngressRoute-tls.yaml apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP metadata: name: redis spec: entryPoints: ◦ tcpep routes: ◦ match: HostSNI(`redis.test.com`) services: ▪ name: redis port: 6379 tls: passthrough: true #上面的配置可以用 下面的配置用不了 # redis-IngressRoute-tls.yaml apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP metadata: name: redis spec: entryPoints: ◦ tcpep routes: ◦ match: HostSNI(`redis.test.com`) services: ▪ name: redis port: 6379 tls: secretName: redis-tls工作配置: 客户端 → [TLS握手] → Traefik (透传) → Redis服务器 │ └─ 直接处理TLS 失败配置: 客户端 → [TLS握手] → Traefik (终止TLS) → Redis服务器 │ │ ├─验证Traefik证书 └─ 接收明文连接但期望加密 └─客户端CA不信任Traefik证书 因为第一个配置是不经过Traefik直接到redis认证证书 工作原理:Traefik 不处理TLS流量,直接透传加密数据到Redis服务器 证书使用:客户端直接与Redis服务器进行TLS握手,使用Redis服务器自身的证书 验证过程:客户端用自有的ca.crt验证Redis服务器证书(CN=redis.test.com) 第二个是需要先和Traefik认证 工作原理:Traefik 在入口处终止TLS连接,然后以明文转发到Redis服务器 证书使用: 客户端验证 Traefik 的证书(来自 redis-tls secret) Redis 服务器接收非加密流量 验证过程:客户端验证的是 Traefik 的证书,而不是Redis服务器证书 失败的核心原因 证书不匹配问题: 当使用 secretName: redis-tls 时,Traefik 使用该 Secret 中的证书 但是客户端配置的CA证书是用于验证Redis服务器证书(CN=redis.test.com)的 Traefik 的证书(CN=TRAEFIK DEFAULT CERT)无法通过客户端CA验证
2025年06月08日
10 阅读
0 评论
0 点赞
1
...
11
12
13
...
25