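K8S Troubleshooting: Pods Restart Repeatedly Under High Business Concurrency (Continued)

Posted on 2021-07-16, updated on 2024-12-15, category: kubernetes, reading time ≈ 2 minutes

Background

Following up on the previous issue: some time later, harbor and calico in the environment again began restarting repeatedly because their health checks failed, and entering a Pod with kubectl was also very slow or timed out.

[root@node01 ~]# kubectl exec -it -n system node1-59c9475bc6-zkhq5 bash
^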

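Cause Analysis

The cause of the repeated restarts was identified last time. A quick look at the environment this time shows that health checks are timing out again, with the same symptom: the TCP connection is stuck in SYN_SENT, the first step of the handshake.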

[root@node01 ~]# netstat -anp|grep 23380
tcp        0      0 127.0.0.1:23380         0.0.0.0:*               LISTEN      38914/kubelet
tcp        0      0 127.0.0.1:38983         127.0.0.1:23380         SYN_SENT    -
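
When many connections are involved, it can help to summarize socket states for the port instead of reading the netstat output by eye. A minimal sketch, assuming the Linux `netstat -an` column layout shown above; the function name `count_tcp_states` is made up for this illustration:

```shell
# count_tcp_states PORT
# Reads `netstat -an` output on stdin and prints "<STATE> <count>" for every
# TCP socket whose local or peer address ends in :PORT.
# (Hypothetical helper, not part of netstat or kubectl.)
count_tcp_states() {
  local port="$1"
  awk -v port="$port" '
    $1 ~ /^tcp/ && ($4 ~ (":" port "$") || $5 ~ (":" port "$")) { count[$6]++ }
    END { for (state in count) print state, count[state] }
  ' | sort
}
```

On the node this could be run as `netstat -an | count_tcp_states 23380`; a SYN_SENT count that keeps growing would match the symptom described here.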

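In other words, besides the TCP connection queue problem, something else can cause this symptom. First, check whether the parameters set last time are still in place: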

[root@node01 ~]# cat /etc/sysctl.conf
net.ipv4.tcp_max_syn_backlog = 32768
net.core.somaxconn = 32768

Then check whether the modified parameters have taken effect:

[root@node01 ~]# ss -lnt
State      Recv-Q   Send-Q     Local Address:Port      Peer Address:Port              
LISTEN     0        32768      127.0.0.1:23380         *:*
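
For a listening socket, ss reports the current accept-queue length in Recv-Q and the configured backlog in Send-Q, so this check can be scripted. A rough sketch that parses `ss -lnt` output in the layout shown above; `listen_queue` is a hypothetical helper name:

```shell
# listen_queue PORT
# Reads `ss -lnt` output on stdin and prints "<Recv-Q> <Send-Q>" for the
# listening socket on PORT, i.e. the current accept-queue length and the
# configured backlog. (Hypothetical helper for illustration.)
listen_queue() {
  local port="$1"
  awk -v port="$port" '
    $1 == "LISTEN" && $4 ~ (":" port "$") { print $2, $3 }
  '
}
```

For example, `ss -lnt | listen_queue 23380` on the node; a Recv-Q that stays close to Send-Q would point at a full accept queue, which is not the case here.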

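The parameter changes have taken effect, so why are connections still stuck in SYN_SENT? Nothing in the current state points to another cause, so the only option is to explore:

Capture packets on a faulty node and on a healthy node and compare the exchanges for anomalies;
Check whether this is the same problem as in reference [1];
Check whether this is the same problem as in reference [2];
…

The exploration turned up nothing abnormal. Stepping back: the issue is triggered by the business pushing a large volume of configuration, and the impact is global (other components are affected, not just the business Pod itself), so the most likely cause is still a system-level performance bottleneck. Besides CPU, heavy business load typically also stresses memory, disk, connection counts, and so on. The developers confirmed that their connections are long-lived, so which kernel parameters matter when the number of connections is large? One of them is the familiar limit on file handles.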

[root@node01 ~]# lsof -p 45775 | wc -l
17974

[root@node01 ~]# lsof -p 45775|grep "sock"| wc -l
12051

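Indeed, the process has more than 10,000 open file handles, and almost all of them are sockets, while on our operating system the default per-process limit on open files is 1024. Confirm it:

[root@node01 ~]# ulimit -n
1024

With usage this far over the limit, the business Pod surprisingly logs no "too many open files" error: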

[root@node01 ~]# kubectl logs -n system node1-59c9475bc6-zkhq5
start config
...
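
Note that `ulimit -n` reports the limit of the current shell, which may differ from the limit of a process that is already running. As a cross-check, the kernel exposes both the open descriptors and the effective limits of any process under /proc. A minimal sketch, assuming a Linux node; `fd_usage` is a hypothetical helper:

```shell
# fd_usage PID
# Prints "<open descriptors> <soft limit>" for the given process, read from
# /proc (Linux only; inspecting another user's process needs root).
# (Hypothetical helper for illustration.)
fd_usage() {
  local pid="$1"
  local open soft
  # Each entry under /proc/PID/fd is one open descriptor.
  open=$(ls "/proc/${pid}/fd" | wc -l)
  # The "Max open files" row lists the soft limit in the 4th column.
  soft=$(awk '/^Max open files/ { print $4 }' "/proc/${pid}/limits")
  echo "${open} ${soft}"
}
```

For the business process above this would be `fd_usage 45775`; comparing the two numbers shows how close that specific process is to its own limit.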

Change it temporarily:

[root@node01 ~]# ulimit -n 65535
[root@node01 ~]# ulimit  -n
65535

Entering the business Pod with kubectl again, the response is back to normal, and no connection is stuck in SYN_SENT any more:

[root@node01 ~]# kubectl exec -it -n system node1-59c9475bc6-zkhq5 bash
[root@node1-59c9475bc6-zkhq5]# exit
[root@node01 ~]# kubectl exec -it -n system node1-59c9475bc6-zkhq5 bash
[root@node1-59c9475bc6-zkhq5]# exit
[root@node01 ~]# kubectl exec -it -n system node1-59c9475bc6-zkhq5 bash
[root@node1-59c9475bc6-zkhq5]# exit

[root@node01 ~]# netstat -anp|grep 23380
tcp        0      0 127.0.0.1:23380         0.0.0.0:*               LISTEN      38914/kubelet
tcp        0      0 127.0.0.1:56369         127.0.0.1:23380         TIME_WAIT   -
tcp        0      0 127.0.0.1:23380         127.0.0.1:57601         TIME_WAIT   -
tcp        0      0 127.0.0.1:23380         127.0.0.1:57479         TIME_WAIT   -

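Solution

Have the business adjust its file handle limit according to actual needs.
For environments with heavy business load, it is strongly recommended to tune the operating system as a whole; otherwise any single system parameter can turn into the bottleneck. A tuning example found online [3] may be a useful reference.

References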

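K8S Troubleshooting: InfluxDB Monitoring Data Cannot Be Retrieved

Posted on 2021-07-10, updated on 2024-12-15, category: kubernetes, reading time ≈ 5 minutes

Background

In a K8S cluster, retrieval of monitoring data from InfluxDB became abnormal, and eventually CPU, memory, and disk usage could not be retrieved at all.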

Metric             Usage
CPU (cores)        3%
Memory (GB)        18%
Disk space (GB)    0%

Metric             Usage
CPU (cores)        7%
Memory (GB)        18%
Disk space (GB)    1%

Metric             Usage
CPU (cores)        0%
Memory (GB)        0%
Disk space (GB)    0%

...

The InfluxDB monitoring architecture is shown below (see [1]); the Load Balancer is implemented with nginx:

        ┌─────────────────┐                 
        │writes & queries │                 
        └─────────────────┘                 
                 │                          
                 ▼                          
         ┌───────────────┐                  
         │               │                  
┌────────│ Load Balancer │─────────┐        
│        │               │         │        
│        └──────┬─┬──────┘         │        
│               │ │                │        
│               │ │                │        
│        ┌──────┘ └────────┐       │        
│        │ ┌─────────────┐ │       │┌──────┐
│        │ │/write or UDP│ │       ││/query│
│        ▼ └─────────────┘ ▼       │└──────┘
│  ┌──────────┐      ┌──────────┐  │        
│  │ InfluxDB │      │ InfluxDB │  │        
│  │ Relay    │      │ Relay    │  │        
│  └──┬────┬──┘      └────┬──┬──┘  │        
│     │    |              |  │     │        
│     |  ┌─┼──────────────┘  |     │        
│     │  │ └──────────────┐  │     │        
│     ▼  ▼                ▼  ▼     │        
│  ┌──────────┐      ┌──────────┐  │        
│  │          │      │          │  │        
└─▶│ InfluxDB │      │ InfluxDB │◀─┘        
   │          │      │          │           
   └──────────┘      └──────────┘

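Cause Analysis

Because the data comes from the influxdb database, first determine whether the failure lies on the request path or whether the influxdb database itself has no data:

# Find the influxdb-nginx service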
kubectl get svc  -n kube-system -owide
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE   SELECTOR
grafana-service          ClusterIP   10.96.177.245   <none>        3000/TCP                 21d   app=grafana
heapster                 ClusterIP   10.96.239.225   <none>        80/TCP                   21d   app=heapster
influxdb-nginx-service   ClusterIP   10.96.170.72    <none>        7076/TCP                 21d   app=influxdb-nginx
influxdb-relay-service   ClusterIP   10.96.196.45    <none>        9096/TCP                 21d   app=influxdb-relay
influxdb-service         ClusterIP   10.96.127.45    <none>        8086/TCP                 21d   app=influxdb

# From a cluster node, check whether the influxdb-nginx service is reachable
curl -i 10.96.170.72:7076/query
HTTP/1.1 401 Unauthorized
Server: nginx/1.17.2

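The influxdb-nginx service answers normally (with a 401 for this unauthenticated request), so requests can get through to the influxdb backend. Next, confirm whether the influxdb database itself is missing data:

# Find the influxdb database pods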
kubectl get pod -n kube-system -owide |grep influxdb
influxdb-nginx-4x8pr                       1/1     Running   3          21d   177.177.52.201    node3
influxdb-nginx-tpngh                       1/1     Running   6          21d   177.177.41.214    node1
influxdb-nginx-wh6kc                       1/1     Running   5          21d   177.177.250.180   node2
influxdb-relay-rs-65c94bbf5f-dp7s4         1/1     Running   2          21d   177.177.250.148   node2
influxdb1-6ff9466d46-q6w5r                 1/1     Running   3          21d   177.177.41.230    node1
influxdb2-d6d6697f5-zzcnk                  1/1     Running   3          21d   177.177.250.161   node2
influxdb3-65ddfc7476-hxhr8                 1/1     Running   4          21d   177.177.52.217    node3

# Log in to any influxdb container and start the interactive shell
kubectl exec -it -n kube-system influxdb-rs3-65ddfc7476-hxhr8 bash
root@influxdb-rs3-65ddfc7476-hxhr8:/# influx
Connected to http://localhost:8086 version 1.7.7
InfluxDB shell version: 1.7.7
> auth
username: admin
password: xxx
> use xxx;
Using database xxx

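Using the query issued by the business layer, verify manually in the influxdb shell:

> select sum(value) from "cpu/node_capacity" where "type" = 'node' and "nodename" = 'node1' and time > now() - 2m
>

Indeed, no data is returned. Since there is nothing within the last 2 minutes, what about a longer time range?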

# Query without a time range
> select sum(value) from "cpu/node_capacity";
name: cpu/node_capacity
time sum
---- ---
0    5301432000

# Query data from the last 72 minutes
> select sum(value) from "cpu/node_capacity" where "type" = 'node' and "nodename" = 'node1' and time > now() - 72m
name: cpu/node_capacity
time                sum
----                ---
1624348319900503945 72000

# Wait 1 minute, then query the last 72 minutes again
> select sum(value) from "cpu/node_capacity" where "type" = 'node' and "nodename" = 'node1' and time > now() - 72m
name: cpu/node_capacity
> 

# Query data from the last 73 minutes
> select sum(value) from "cpu/node_capacity" where "type" = 'node' and "nodename" = 'node1' and time > now() - 73m
name: cpu/node_capacity
time                sum
----                ---
1624348319900503945 72000
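
The time column in these results is a Unix timestamp in nanoseconds, so the last point that was written can be turned into a readable time to see when writes stopped. A small sketch, assuming GNU date is available; `ns_to_utc` is a hypothetical helper:

```shell
# ns_to_utc NANOSECONDS
# Converts a nanosecond Unix timestamp (as printed by the influx shell) to an
# ISO-8601 UTC time. Relies on GNU date's "-d @<seconds>" syntax.
# (Hypothetical helper for illustration.)
ns_to_utc() {
  local ns="$1"
  # Integer division drops the sub-second part.
  date -u -d "@$((ns / 1000000000))" '+%Y-%m-%dT%H:%M:%SZ'
}
```

For the point above, `ns_to_utc 1624348319900503945` gives 2021-06-22T07:51:59Z, which can then be compared with the timestamps in the influxdb logs.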

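The results show that the query without a time range does return records, and the repeated checks show that data cannot be retrieved because writes stopped at a certain point in time. Check the influxdb logs for anything related:

kubectl logs -n kube-system influxdb-rs3-65ddfc7476-hxhr8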
ts=2021-06-22T09:56:49.658621Z lvl=warn msg="max-values-per-tag limit may be exceeded soon" log_id=0UYIcREl000 service=store perc=100% n=100000 max=100000 db_instance=xxx measurement=network/rx tag=pod_name
ts=2021-06-22T09:56:49.658702Z lvl=warn msg="max-values-per-tag limit may be exceeded soon" log_id=0UYIcREl000 service=store perc=100% n=100000 max=100000 db_instance=xxx measurement=network/rx_errors tag=pod_name
ts=2021-06-22T09:56:49.658815Z lvl=warn msg="max-values-per-tag limit may be exceeded soon" log_id=0UYIcREl000 service=store perc=100% n=100000 max=100000 db_instance=xxx measurement=network/tx tag=pod_name
ts=2021-06-22T09:56:49.658893Z lvl=warn msg="max-values-per-tag limit may be exceeded soon" log_id=0UYIcREl000 service=store perc=100% n=100000 max=100000 db_instance=xxx measurement=network/tx_errors tag=pod_name
ts=2021-06-22T09:56:49.659062Z lvl=warn msg="max-values-per-tag limit may be exceeded soon" log_id=0UYIcREl000 service=store perc=100% n=100003 max=100000 db_instance=xxx measurement=uptime tag=pod_name

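Sure enough, there are many warn logs saying "max-values-per-tag limit may be exceeded soon", and they show that the limit in effect is 100000, the default for this parameter. A search led to the issue [2] that introduced the parameter; the rationale was roughly:

If a large amount of high-cardinality data is loaded by accident, InfluxDB can easily run out of memory (OOM) when that data is later deleted.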
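
Each warning carries the current value count (n) and the limit (max) for one measurement and tag, so the log can be scanned for the tags that have actually reached the limit. A rough sketch that parses log lines in the format shown in the log above; `tags_at_limit` is a hypothetical helper:

```shell
# tags_at_limit
# Reads influxdb log lines on stdin and prints
# "<measurement> <tag> <n>/<max>" for every max-values-per-tag warning
# whose value count has reached the limit.
# (Hypothetical helper; assumes the key=value log format shown above.)
tags_at_limit() {
  awk '
    /max-values-per-tag/ {
      n = 0; max = 0; measurement = ""; tag = ""
      for (i = 1; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == "n") n = kv[2] + 0
        else if (kv[1] == "max") max = kv[2] + 0
        else if (kv[1] == "measurement") measurement = kv[2]
        else if (kv[1] == "tag") tag = kv[2]
      }
      if (max > 0 && n >= max) print measurement, tag, n "/" max
    }'
}
```

It could be fed with `kubectl logs -n kube-system <influxdb pod> | tags_at_limit | sort -u` to list the affected measurements once each.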

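Temporarily set max-values-per-tag to 0 (which disables the limit), then delete the influxdb pods so that they restart with the new configuration, and verify whether the problem is resolved: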

cat influxdb.conf
[meta]
  dir = "/var/lib/influxdb/meta"
[data]
  dir = "/var/lib/influxdb/data"
  engine = "tsm1"
  wal-dir = "/var/lib/influxdb/wal"
  max-series-per-database = 0
  max-values-per-tag = 0
[http]
  auth-enabled = true

kubectl delete pod -n kube-system influxdb-rs1-6ff9466d46-q6w5r
pod "influxdb-rs1-6ff9466d46-q6w5r" deleted

kubectl delete pod -n kube-system influxdb-rs2-d6d6697f5-zzcnk
pod "influxdb-rs2-d6d6697f5-zzcnk" deleted

kubectl delete pod -n kube-system influxdb-rs3-65ddfc7476-hxhr8
pod "influxdb-rs3-65ddfc7476-hxhr8" deleted

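Observing the monitoring data that the business layer retrieves from InfluxDB again, CPU, memory, and disk usage are now retrieved normally.

Metric             Usage
CPU (cores)        19%
Memory (GB)        22%
Disk space (GB)    2%

Solution

Set influxdb's max-values-per-tag parameter to a value appropriate for the business.

References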

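K8S Troubleshooting: Access Between Pods by Service Name Fails

Posted on 2021-06-26, updated on 2024-12-15, category: kubernetes, reading time ≈ 3 minutes

Background

In a K8S cluster, PodA accesses PodB by service name and the requests fail. PodA runs on node1 and PodB on node2.

Cause Analysis

Start with tcpdump to see whether the requests look abnormal: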

[root@node1 ~]# tcpdump -n -i ens192 port 50300
...
13:48:17.630335 IP 177.177.176.150.distinct -> 10.96.22.136.50300:  UDP, length 214
13:48:17.630407 IP 192.168.7.21.distinct  ->  10.96.22.136.50300:   UDP, length 214
...

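The capture shows that the request's source address and port are 177.177.176.150:50901 and the destination is 10.96.22.136:50300. 10.96.22.136 is the destination address PodA obtained for the service name server-svc, that is, supposedly the service IP of server-svc. Check whether this address is correct: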

[root@node1 ~]# kubectl get pod -A -owide|grep server
ss  server-xxx-xxx  1/1  Running 0 20h  177.177.176.150  node1
ss  server-xxx-xxx  1/1  Running 0 20h  177.177.254.245  node2
ss  server-xxx-xxx  1/1  Running 0 20h  177.177.18.152   node3

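[root@node1 ~]# kubectl get svc -A -owide|grep server
ss server-svc ClusterIP 10.96.182.195 <none> 50300/UDP

The source address is fine, but the destination does not match expectations: the address actually registered for the service name server-svc is 10.96.182.195. What is going on? Since v1.13, K8S uses CoreDNS as its default service discovery, so when PodA sends a request to the service name server-svc, CoreDNS resolves that name to the service IP. Log in to PodA and check whether name resolution is at fault: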

[root@node1 ~]# kubectl exec -it -n ss server-xxx-xxx -- cat /etc/resolv.conf
nameserver 10.96.0.10
search ss.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

[root@node1 ~]# kubectl exec -it -n ss server-xxx-xxx -- nslookup server-svc
Server:    10.96.0.10

Name:    ss
Address: 10.96.182.195

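The results show that name resolution is fine: inside PodA, server-svc also resolves correctly to the service IP 10.96.182.195. Then what is the 10.96.22.136 that tcpdump captured at the start? Could it belong to another service, be a leftover iptables rule, or have a related route? Check each: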

[root@node1 ~]# kubectl get svc -A -owide|grep 10.96.22.136

[root@node1 ~]# iptables-save|grep 10.96.22.136

[root@node1 ~]# ip route|grep 10.96.22.136

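It turns out that 10.96.22.136 does not exist anywhere in the cluster, so why is PodA sending to it? The capture on the host already shows 10.96.22.136 as the destination, so next confirm what the destination is when the packet leaves PodA: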

[root@node1 ~]# ip route|grep 177.177.176.150
177.177.176.150 dev cali9afa4438787 scope link

[root@node1 ~]# tcpdump -n -i cali9afa4438787 port 50300
...
14:16:40.821511 IP 177.177.176.150.50902 ->  10.96.22.136.50300:  UDP, length 214
...

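So the destination is already the wrong service IP when the packet leaves PodA. Combined with the name-resolution check above, resolution inside PodA should not be the problem. Putting the findings together, the problem most likely lies with the sender.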
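
The check performed so far boils down to comparing the destination seen on the wire with the ClusterIP that the service name should resolve to. A sketch of that comparison over tcpdump text output; `unexpected_dsts` is a hypothetical helper, and it assumes IPv4 lines in the form shown in the captures above:

```shell
# unexpected_dsts EXPECTED_IP
# Reads tcpdump text output on stdin and prints every distinct destination
# IP that differs from EXPECTED_IP.
# (Hypothetical helper; handles "src > dst:" as well as "src -> dst:".)
unexpected_dsts() {
  local expected="$1"
  awk -v expected="$expected" '
    $2 == "IP" {
      for (i = 3; i < NF; i++) {
        if ($i == ">" || $i == "->") {
          dst = $(i + 1)
          sub(/:$/, "", dst)        # drop the trailing colon
          sub(/\.[^.]+$/, "", dst)  # drop the port
          if (dst != expected) print dst
          break
        }
      }
    }' | sort -u
}
```

For example, `tcpdump -n -i cali9afa4438787 port 50300 | unexpected_dsts 10.96.182.195` would print 10.96.22.136 for the faulty traffic and stay silent once all packets go to the right service IP.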

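To further distinguish whether all requests sent from PodA are affected or only those sent by the business application, use nc inside PodA to send a test UDP packet and capture it on the host (PodA happens to have nc; if it did not, /dev/{tcp|udp} could be used instead [1]):

[root@node1 ~]# kubectl exec -it -n ss server-xxx-xxx -- sh -c 'echo "test" | nc -u server-svc 50300 -p 9999'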

[root@node1 ~]# tcpdump -n -i cali9afa4438787 port 50300
...
15:46:45.871580 IP 177.177.176.150.50902 ->  10.96.182.195.50300:  UDP, length 54
...
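
If nc is not available in the Pod, the /dev/udp approach mentioned above can be sketched as follows. This assumes the image ships bash built with network redirections; `send_udp` is a hypothetical helper, and unlike `nc -p` the source port is picked by the kernel:

```shell
# send_udp HOST PORT PAYLOAD
# Sends one UDP datagram through bash's /dev/udp redirection. The path is
# emulated by bash itself, so bash is invoked explicitly in case the calling
# shell is a different one (for example sh in a minimal image).
# (Hypothetical helper for illustration.)
send_udp() {
  bash -c 'printf "%s" "$3" > "/dev/udp/$1/$2"' _ "$1" "$2" "$3"
}
```

Inside the Pod this would be `send_udp server-svc 50300 test`, with the same tcpdump capture on the host to see which destination address the packet carries.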

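The test request sent from inside PodA resolves to the correct destination, which narrows the problem down to the requests sent by the business application itself. Because the symptom is a service name that is not resolved to the correct IP address, the suspicion is that the application uses some kind of cache. If that guess is right, restarting PodA should in theory fix it. Since the business runs multiple replicas, restarting one of them still leaves the faulty state preserved on the others. After agreeing this with the developers, restart it and check the business requests: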

[root@node1 ~]# docker ps |grep server-xxx-xxx | grep -v POD |awk '{print $1}' |xargs docker restart

[root@node1 ~]# tcpdump -n -i ens192 port 50300
...
15:58:17.150535 IP 177.177.176.150.distinct -> 10.96.182.195.50300:  UDP, length 214
15:58:17.150607 IP 192.168.7.21.distinct  ->  10.96.182.195.50300:   UDP, length 214
...

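The result matches expectations, further suggesting that the application uses some kind of cache. According to the developers, the application sends UDP data through Java's native API. Could Java cache name lookups by default when a socket is created from a host name?

After some searching, a relevant blog post [2] turned up. The key content:

Looking up a domain name through DNS may involve several intermediate DNS servers before the name is found, so a DNS lookup is a very expensive operation. To mitigate this, Java provides a DNS cache. The first time the InetAddress class creates an InetAddress object from a domain name, the JVM stores the name and the information obtained from DNS (such as the IP address) in the DNS cache. The next time InetAddress uses the same name, the information is taken directly from the cache without contacting the DNS server.

That is indeed it. On to the fix:

By default the DNS cache keeps the information of every domain name it has resolved forever, but this default can be changed. There are generally two ways to change it:

In the program, call java.security.Security.setProperty to set the security property networkaddress.cache.ttl (unit: seconds)

Set the networkaddress.cache.ttl property in the java.security file. Assuming the JDK is installed in C:/jdk1.6, the java.security file is in the c:/jdk1.6/jre/lib/security directory. Open the file, find the networkaddress.cache.ttl property, and set it to the desired cache timeout (unit: seconds)

Note: if networkaddress.cache.ttl is set to -1, the cached DNS data is never released.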
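
The second method (editing java.security) can be scripted when building the image. A minimal sketch, assuming GNU sed and that the path of the java.security file is known for the JDK in use; `set_dns_cache_ttl` is a hypothetical helper. As a side note, the Java documentation describes the cache-forever default as applying when a security manager is installed, with an implementation-specific period otherwise, so the effective default is worth verifying for the JDK at hand:

```shell
# set_dns_cache_ttl FILE TTL_SECONDS
# Sets networkaddress.cache.ttl in a java.security file: rewrites an active
# entry if one exists, otherwise appends one.
# (Hypothetical helper for illustration; GNU sed assumed for -i.)
set_dns_cache_ttl() {
  local file="$1" ttl="$2"
  if grep -Eq '^[[:space:]]*networkaddress\.cache\.ttl[[:space:]]*=' "$file"; then
    # An active entry exists: rewrite it in place.
    sed -i -E "s|^[[:space:]]*networkaddress\.cache\.ttl[[:space:]]*=.*|networkaddress.cache.ttl=${ttl}|" "$file"
  else
    # No active entry (it is typically shipped commented out): append one.
    printf 'networkaddress.cache.ttl=%s\n' "$ttl" >> "$file"
  fi
}
```

The match is anchored on the exact property name, so networkaddress.cache.negative.ttl is left untouched. A short TTL such as 30 seconds is only an example value; the right number depends on how often the service IP can change.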

At this point, the investigation is complete.

Solution

The business side adjusts its DNS cache settings to suit its scenario.

References

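Tool Sharing: Quickly Deploying a K8S Cluster with Alibaba's Open-Source Sealer

Posted on 2021-06-20, updated on 2024-12-15, category: tools, reading time ≈ 32 minutes

What Is Sealer

Quoting the introduction from the official documentation [1]:

sealer [ˈsiːlər] is a solution for packaging, delivering, and running distributed applications. It packages a distributed application together with its dependencies, such as databases and middleware, to solve the delivery problem of complex applications.

The artifact that sealer builds is called a "cluster image". A cluster image has a kubernetes embedded in it, which solves the delivery consistency problem of distributed applications.

Cluster images can be pushed to a registry to share with other users, and widely used distributed software can be found in the official repository and used directly.

Docker can build an operating system's rootfs plus an application into a container image. sealer treats kubernetes as the operating system, and the image built at this higher level of abstraction is the cluster image. This enables Build, Share, Run for the whole cluster!

Quickly Deploying a K8S Cluster

Prepare a node, then download and install Sealer: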

[root@node1]# wget https://github.com/alibaba/sealer/releases/download/v0.1.5/sealer-v0.1.5-linux-amd64.tar.gz && tar zxvf sealer-v0.1.5-linux-amd64.tar.gz && mv sealer /usr/bin

[root@node1]# sealer version
{"gitVersion":"v0.1.5","gitCommit":"9143e60","buildDate":"2021-06-04 07:41:03","goVersion":"go1.14.15","compiler":"gc","platform":"linux/amd64"}

According to the official documentation, to deploy kubernetes on an existing machine, simply run the following command:

[root@node1]# sealer run kubernetes:v1.19.9 --masters xx.xx.xx.xx --passwd xxxx
2021-06-19 17:22:14 [WARN] [registry_client.go:37] failed to get auth info for registry.cn-qingdao.aliyuncs.com, err: auth for registry.cn-qingdao.aliyuncs.com doesn't exist
2021-06-19 17:22:15 [INFO] [current_cluster.go:39] current cluster not found, will create a new cluster new kube build config failed: stat /root/.kube/config: no such file or directory
2021-06-19 17:22:15 [WARN] [default_image.go:89] failed to get auth info, err: auth for registry.cn-qingdao.aliyuncs.com doesn't exist
Start to Pull Image kubernetes:v1.19.9
191908a896ce: pull completed 
2021-06-19 17:22:49 [INFO] [filesystem.go:88] image name is registry.cn-qingdao.aliyuncs.com/sealer-io/kubernetes:v1.19.9.alpha.1
2021-06-19 17:22:49 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /var/lib/sealer/data/my-cluster || true
copying files to 10.10.11.49: 198/198 
2021-06-19 17:25:22 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : cd /var/lib/sealer/data/my-cluster/rootfs  && chmod +x scripts/* && cd scripts && sh init.sh
+ storage=/var/lib/docker
+ mkdir -p /var/lib/docker
+ command_exists docker
+ command -v docker
+ systemctl daemon-reload
+ systemctl restart docker.service
++ docker info
++ grep Cg
+ cgroupDriver=' Cgroup Driver: cgroupfs'
+ driver=cgroupfs
+ echo 'driver is cgroupfs'
driver is cgroupfs
+ export criDriver=cgroupfs
+ criDriver=cgroupfs
* Applying /usr/lib/sysctl.d/00-system.conf ...
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
kernel.yama.ptrace_scope = 0
* Applying /usr/lib/sysctl.d/50-default.conf ...
kernel.sysrq = 16
kernel.core_uses_pid = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
* Applying /usr/lib/sysctl.d/60-libvirtd.conf ...
fs.aio-max-nr = 1048576
* Applying /etc/sysctl.d/99-sysctl.conf ...
net.ipv4.ip_forward = 1
net.ipv4.conf.all.rp_filter = 1
* Applying /etc/sysctl.d/k8s.conf ...
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.conf.all.rp_filter = 0
* Applying /etc/sysctl.conf ...
net.ipv4.ip_forward = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.ip_forward = 1
2021-06-19 17:25:26 [INFO] [runtime.go:107] metadata version v1.19.9
2021-06-19 17:25:26 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : cd /var/lib/sealer/data/my-cluster/rootfs && echo "
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.19.9
controlPlaneEndpoint: "apiserver.cluster.local:6443"
imageRepository: sea.hub:5000/library
networking:
  # dnsDomain: cluster.local
  podSubnet: 100.64.0.0/10
  serviceSubnet: 10.96.0.0/22
apiServer:
  certSANs:
  - 127.0.0.1
  - apiserver.cluster.local
  - 10.10.11.49
  - aliyun-inc.com
  - 10.0.0.2
  - 127.0.0.1
  - apiserver.cluster.local
  - 10.103.97.2
  - 10.10.11.49
  - 10.103.97.2
  extraArgs:
    etcd-servers: https://10.10.11.49:2379
    feature-gates: TTLAfterFinished=true,EphemeralContainers=true
    audit-policy-file: "/etc/kubernetes/audit-policy.yml"
    audit-log-path: "/var/log/kubernetes/audit.log"
    audit-log-format: json
    audit-log-maxbackup: '"10"'
    audit-log-maxsize: '"100"'
    audit-log-maxage: '"7"'
    enable-aggregator-routing: '"true"'
  extraVolumes:
  - name: "audit"
    hostPath: "/etc/kubernetes"
    mountPath: "/etc/kubernetes"
    pathType: DirectoryOrCreate
  - name: "audit-log"
    hostPath: "/var/log/kubernetes"
    mountPath: "/var/log/kubernetes"
    pathType: DirectoryOrCreate
  - name: localtime
    hostPath: /etc/localtime
    mountPath: /etc/localtime
    readOnly: true
    pathType: File
controllerManager:
  extraArgs:
    feature-gates: TTLAfterFinished=true,EphemeralContainers=true
    experimental-cluster-signing-duration: 876000h
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
    pathType: File
scheduler:
  extraArgs:
    feature-gates: TTLAfterFinished=true,EphemeralContainers=true
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
    pathType: File
etcd:
  local:
    extraArgs:
      listen-metrics-urls: http://0.0.0.0:2381
" > kubeadm-config.yaml
2021-06-19 17:25:27 [INFO] [kube_certs.go:234] APIserver altNames :  {map[aliyun-inc.com:aliyun-inc.com apiserver.cluster.local:apiserver.cluster.local kubernetes:kubernetes kubernetes.default:kubernetes.default kubernetes.default.svc:kubernetes.default.svc kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local localhost:localhost node1:node1] map[10.0.0.2:10.0.0.2 10.103.97.2:10.103.97.2 10.96.0.1:10.96.0.1 127.0.0.1:127.0.0.1 10.10.11.49:10.10.11.49]}
2021-06-19 17:25:27 [INFO] [kube_certs.go:254] Etcd altnames : {map[localhost:localhost node1:node1] map[127.0.0.1:127.0.0.1 10.10.11.49:10.10.11.49 ::1:::1]}, commonName : node1
2021-06-19 17:25:30 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 22/22 
2021-06-19 17:25:43 [INFO] [kubeconfig.go:267] [kubeconfig] Writing "admin.conf" kubeconfig file
2021-06-19 17:25:43 [INFO] [kubeconfig.go:267] [kubeconfig] Writing "controller-manager.conf" kubeconfig file
2021-06-19 17:25:43 [INFO] [kubeconfig.go:267] [kubeconfig] Writing "scheduler.conf" kubeconfig file
2021-06-19 17:25:43 [INFO] [kubeconfig.go:267] [kubeconfig] Writing "kubelet.conf" kubeconfig file
2021-06-19 17:25:44 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes && cp -f /var/lib/sealer/data/my-cluster/rootfs/statics/audit-policy.yml /etc/kubernetes/audit-policy.yml
2021-06-19 17:25:44 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : cd /var/lib/sealer/data/my-cluster/rootfs/scripts && sh init-registry.sh 5000 /var/lib/sealer/data/my-cluster/rootfs/registry
++ dirname init-registry.sh
+ cd .
+ REGISTRY_PORT=5000
+ VOLUME=/var/lib/sealer/data/my-cluster/rootfs/registry
+ container=sealer-registry
+ mkdir -p /var/lib/sealer/data/my-cluster/rootfs/registry
+ docker load -q -i ../images/registry.tar
Loaded image: registry:2.7.1
+ docker run -d --restart=always --name sealer-registry -p 5000:5000 -v /var/lib/sealer/data/my-cluster/rootfs/registry:/var/lib/registry registry:2.7.1
e35aeefcfb415290764773f28dd843fc53dab8d1210373ca2c0f1f4773391686
2021-06-19 17:25:45 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:25:46 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:25:47 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:25:48 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:25:49 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : echo 10.10.11.49 apiserver.cluster.local >> /etc/hosts
2021-06-19 17:25:50 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : echo 10.10.11.49 sea.hub >> /etc/hosts
2021-06-19 17:25:50 [INFO] [init.go:211] start to init master0...
[ssh][10.10.11.49]failed to run command [kubeadm init --config=/var/lib/sealer/data/my-cluster/rootfs/kubeadm-config.yaml --upload-certs -v 0 --ignore-preflight-errors=SystemVerification],output is: W0619 17:25:50.649054  122163 common.go:77] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta1". Please use 'kubeadm config migrate --old-config old.yaml --new-config new.yaml', which will write the new, similar spec using a newer API version.

W0619 17:25:50.702549  122163 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.9
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING FileExisting-socat]: socat not found in system path
[WARNING Hostname]: hostname "node1" could not be reached
[WARNING Hostname]: hostname "node1": lookup node1 on 10.72.66.37:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:

[ERROR ImagePull]: failed to pull image sea.hub:5000/library/kube-apiserver:v1.19.9: output: Error response from daemon: Get https://sea.hub:5000/v2/: http: server gave HTTP response to HTTPS client, error: exit status 1
[ERROR ImagePull]: failed to pull image sea.hub:5000/library/kube-controller-manager:v1.19.9: output: Error response from daemon: Get https://sea.hub:5000/v2/: http: server gave HTTP response to HTTPS client, error: exit status 1
[ERROR ImagePull]: failed to pull image sea.hub:5000/library/kube-scheduler:v1.19.9: output: Error response from daemon: Get https://sea.hub:5000/v2/: http: server gave HTTP response to HTTPS client, error: exit status 1
[ERROR ImagePull]: failed to pull image sea.hub:5000/library/kube-proxy:v1.19.9: output: Error response from daemon: Get https://sea.hub:5000/v2/: http: server gave HTTP response to HTTPS client, error: exit status 1
[ERROR ImagePull]: failed to pull image sea.hub:5000/library/pause:3.2: output: Error response from daemon: Get https://sea.hub:5000/v2/: http: server gave HTTP response to HTTPS client, error: exit status 1
[ERROR ImagePull]: failed to pull image sea.hub:5000/library/etcd:3.4.13-0: output: Error response from daemon: Get https://sea.hub:5000/v2/: http: server gave HTTP response to HTTPS client, error: exit status 1
[ERROR ImagePull]: failed to pull image sea.hub:5000/library/coredns:1.7.0: output: Error response from daemon: Get https://sea.hub:5000/v2/: http: server gave HTTP response to HTTPS client, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
2021-06-19 17:25:52 [EROR] [run.go:55] init master0 failed, error: [ssh][10.10.11.49]run command failed [kubeadm init --config=/var/lib/sealer/data/my-cluster/rootfs/kubeadm-config.yaml --upload-certs -v 0 --ignore-preflight-errors=SystemVerification]. Please clean and reinstall

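The deployment fails. The error log shows that access to the private registry set up by Sealer itself failed. The message "server gave HTTP response to HTTPS client" indicates that the insecure-registries field in the docker configuration is probably not set for this registry. Check the docker configuration file to confirm: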

[root@node1]# cat /etc/docker/daemon.json 
{
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "insecure-registries":["127.0.0.1"],
  "data-root":"/var/lib/docker"
}

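The insecure-registries field is indeed wrong. Docker was already installed on this node before the deployment, so it is unclear whether this setting was there beforehand or whether Sealer configured it incorrectly. Fix it by hand: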

[root@node1]# cat /etc/docker/daemon.json 
{
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "insecure-registries":["sea.hub:5000"],
  "data-root":"/var/lib/docker"
}
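
Before rerunning a deployment, the same check can be done with a small script. The sketch below is a plain text-level test rather than a real JSON parser, and `has_insecure_registry` is a hypothetical helper; it assumes the field is written as a simple array, as in the file above:

```shell
# has_insecure_registry FILE REGISTRY
# Returns 0 if REGISTRY appears in the "insecure-registries" array of a
# docker daemon.json, 1 otherwise. Text matching only, no JSON parsing.
# (Hypothetical helper for illustration.)
has_insecure_registry() {
  local file="$1" registry="$2"
  # Strip whitespace so the array sits on one line, cut out the array,
  # then look for the quoted registry name inside it.
  tr -d ' \t\n' < "$file" \
    | grep -Eo '"insecure-registries":\[[^]]*]' \
    | grep -Fq "\"${registry}\""
}
```

For example, `has_insecure_registry /etc/docker/daemon.json sea.hub:5000 || echo "sea.hub:5000 is not listed"`. After editing daemon.json, the docker daemon has to reload or restart before the new value is used.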

Run the deployment command again:

sealer run kubernetes:v1.19.9 --masters xx.xx.xx.xx --passwd xxxx
...
2021-06-19 17:43:56 [INFO] [kubeconfig.go:277] [kubeconfig] Using existing kubeconfig file: "/var/lib/sealer/data/my-cluster/admin.conf"
2021-06-19 17:43:57 [INFO] [kubeconfig.go:277] [kubeconfig] Using existing kubeconfig file: "/var/lib/sealer/data/my-cluster/controller-manager.conf"
2021-06-19 17:43:57 [INFO] [kubeconfig.go:277] [kubeconfig] Using existing kubeconfig file: "/var/lib/sealer/data/my-cluster/scheduler.conf"
2021-06-19 17:43:57 [INFO] [kubeconfig.go:277] [kubeconfig] Using existing kubeconfig file: "/var/lib/sealer/data/my-cluster/kubelet.conf"
2021-06-19 17:43:57 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes && cp -f /var/lib/sealer/data/my-cluster/rootfs/statics/audit-policy.yml /etc/kubernetes/audit-policy.yml
2021-06-19 17:43:57 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : cd /var/lib/sealer/data/my-cluster/rootfs/scripts && sh init-registry.sh 5000 /var/lib/sealer/data/my-cluster/rootfs/registry
++ dirname init-registry.sh
+ cd .
+ REGISTRY_PORT=5000
+ VOLUME=/var/lib/sealer/data/my-cluster/rootfs/registry
+ container=sealer-registry
+ mkdir -p /var/lib/sealer/data/my-cluster/rootfs/registry
+ docker load -q -i ../images/registry.tar
Loaded image: registry:2.7.1
+ docker run -d --restart=always --name sealer-registry -p 5000:5000 -v /var/lib/sealer/data/my-cluster/rootfs/registry:/var/lib/registry registry:2.7.1
docker: Error response from daemon: Conflict. The container name "/sealer-registry" is already in use by container "e35aeefcfb415290764773f28dd843fc53dab8d1210373ca2c0f1f4773391686". You have to remove (or rename) that container to be able to reuse that name.
See 'docker run --help'.
+ true
2021-06-19 17:43:58 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:43:59 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:44:00 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:44:01 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : mkdir -p /etc/kubernetes || true
copying files to 10.10.11.49: 1/1 
2021-06-19 17:44:02 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : echo 10.10.11.49 apiserver.cluster.local >> /etc/hosts
2021-06-19 17:44:02 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : echo 10.10.11.49 sea.hub >> /etc/hosts
2021-06-19 17:44:03 [INFO] [init.go:211] start to init master0...
2021-06-19 17:46:53 [INFO] [init.go:286] [globals]join command is:  apiserver.cluster.local:6443 --token comygj.c0kj18d7fh2h4xta \
    --discovery-token-ca-cert-hash sha256:cd8988f9a061765914dddb24d4e578ad446d8d31b0e30dba96a89e0c4f1e7240 \
    --control-plane --certificate-key b27f10340d2f89790f7e980af72cf9d54d790b53bfd4da823947d914359d6e81

2021-06-19 17:46:53 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : rm -rf .kube/config && mkdir -p /root/.kube && cp /etc/kubernetes/admin.conf /root/.kube/config
2021-06-19 17:46:53 [INFO] [init.go:230] start to install CNI
2021-06-19 17:46:53 [INFO] [init.go:250] render cni yaml success
2021-06-19 17:46:54 [INFO] [sshcmd.go:48] [ssh][10.10.11.49] : echo '
---
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Typha is disabled.
  typha_service_name: "none"
  # Configure the backend to use.
  calico_backend: "bird"

  # Configure the MTU to use
  veth_mtu: "1550"

  # The CNI network configuration to install on each node.  The special
  # values in this config will be automatically populated.
  cni_network_config: |-
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "log_level": "info",
          "datastore_type": "kubernetes",
          "nodename": "__KUBERNETES_NODE_NAME__",
          "mtu": __CNI_MTU__,
          "ipam": {
              "type": "calico-ipam"
          },
          "policy": {
              "type": "k8s"
          },
          "kubernetes": {
              "kubeconfig": "__KUBECONFIG_FILEPATH__"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        }
      ]
    }
---
# Source: calico/templates/kdd-crds.yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
   name: felixconfigurations.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: FelixConfiguration
    plural: felixconfigurations
    singular: felixconfiguration
---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ipamblocks.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPAMBlock
    plural: ipamblocks
    singular: ipamblock

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: blockaffinities.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: BlockAffinity
    plural: blockaffinities
    singular: blockaffinity

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ipamhandles.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPAMHandle
    plural: ipamhandles
    singular: ipamhandle

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ipamconfigs.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPAMConfig
    plural: ipamconfigs
    singular: ipamconfig

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: bgppeers.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: BGPPeer
    plural: bgppeers
    singular: bgppeer

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: bgpconfigurations.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: BGPConfiguration
    plural: bgpconfigurations
    singular: bgpconfiguration

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ippools.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPPool
    plural: ippools
    singular: ippool

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: hostendpoints.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: HostEndpoint
    plural: hostendpoints
    singular: hostendpoint

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: clusterinformations.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: ClusterInformation
    plural: clusterinformations
    singular: clusterinformation

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: globalnetworkpolicies.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: GlobalNetworkPolicy
    plural: globalnetworkpolicies
    singular: globalnetworkpolicy

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: globalnetworksets.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: GlobalNetworkSet
    plural: globalnetworksets
    singular: globalnetworkset

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: networkpolicies.crd.projectcalico.org
spec:
  scope: Namespaced
  group: crd.projectcalico.org
  version: v1
  names:
    kind: NetworkPolicy
    plural: networkpolicies
    singular: networkpolicy

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: networksets.crd.projectcalico.org
spec:
  scope: Namespaced
  group: crd.projectcalico.org
  version: v1
  names:
    kind: NetworkSet
    plural: networksets
    singular: networkset
---
# Source: calico/templates/rbac.yaml

# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-kube-controllers
rules:
  # Nodes are watched to monitor for deletions.
  - apiGroups: [""]
    resources:
      - nodes
    verbs:
      - watch
      - list
      - get
  # Pods are queried to check for existence.
  - apiGroups: [""]
    resources:
      - pods
    verbs:
      - get
  # IPAM resources are manipulated when nodes are deleted.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - ippools
    verbs:
      - list
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - blockaffinities
      - ipamblocks
      - ipamhandles
    verbs:
      - get
      - list
      - create
      - update
      - delete
  # Needs access to update clusterinformations.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - clusterinformations
    verbs:
      - get
      - create
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-kube-controllers
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-kube-controllers
subjects:
- kind: ServiceAccount
  name: calico-kube-controllers
  namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-node
rules:
  # The CNI plugin needs to get pods, nodes, and namespaces.
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - namespaces
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - endpoints
      - services
    verbs:
      # Used to discover service IPs for advertisement.
      - watch
      - list
      # Used to discover Typhas.
      - get
  - apiGroups: [""]
    resources:
      - nodes/status
    verbs:
      # Needed for clearing NodeNetworkUnavailable flag.
      - patch
      # Calico stores some configuration information in node annotations.
      - update
  # Watch for changes to Kubernetes NetworkPolicies.
  - apiGroups: ["networking.k8s.io"]
    resources:
      - networkpolicies
    verbs:
      - watch
      - list
  # Used by Calico for policy information.
  - apiGroups: [""]
    resources:
      - pods
      - namespaces
      - serviceaccounts
    verbs:
      - list
      - watch
  # The CNI plugin patches pods/status.
  - apiGroups: [""]
    resources:
      - pods/status
    verbs:
      - patch
  # Calico monitors various CRDs for config.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - globalfelixconfigs
      - felixconfigurations
      - bgppeers
      - globalbgpconfigs
      - bgpconfigurations
      - ippools
      - ipamblocks
      - globalnetworkpolicies
      - globalnetworksets
      - networkpolicies
      - networksets
      - clusterinformations
      - hostendpoints
    verbs:
      - get
      - list
      - watch
  # Calico must create and update some CRDs on startup.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - ippools
      - felixconfigurations
      - clusterinformations
    verbs:
      - create
      - update
  # Calico stores some configuration information on the node.
  - apiGroups: [""]
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
  # These permissions are only required for upgrade from v2.6, and can
  # be removed after upgrade or on fresh installations.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - bgpconfigurations
      - bgppeers
    verbs:
      - create
      - update
  # These permissions are required for Calico CNI to perform IPAM allocations.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - blockaffinities
      - ipamblocks
      - ipamhandles
    verbs:
      - get
      - list
      - create
      - update
      - delete
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - ipamconfigs
    verbs:
      - get
  # Block affinities must also be watchable by confd for route aggregation.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - blockaffinities
    verbs:
      - watch
  # The Calico IPAM migration needs to get daemonsets. These permissions can be
  # removed if not upgrading from an installation using host-local IPAM.
  - apiGroups: ["apps"]
    resources:
      - daemonsets
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: calico-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-node
subjects:
- kind: ServiceAccount
  name: calico-node
  namespace: kube-system

---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        # This, along with the CriticalAddonsOnly toleration below,
        # marks the pod as a critical add-on, ensuring it gets
        # priority scheduling and that its resources are reserved
        # if it ever gets evicted.
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux
      hostNetwork: true
      tolerations:
        # Make sure calico-node gets scheduled on all nodes.
        - effect: NoSchedule
          operator: Exists
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
      serviceAccountName: calico-node
      # Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
      # deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
      terminationGracePeriodSeconds: 0
      priorityClassName: system-node-critical
      initContainers:
        # This container performs upgrade from host-local IPAM to calico-ipam.
        # It can be deleted if this is a fresh installation, or if you have already
        # upgraded to use calico-ipam.
        - name: upgrade-ipam
          image: sea.hub:5000/calico/cni:v3.8.2
          command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
          env:
            - name: KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
          volumeMounts:
            - mountPath: /var/lib/cni/networks
              name: host-local-net-dir
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
        # This container installs the CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: sea.hub:5000/calico/cni:v3.8.2
          command: ["/install-cni.sh"]
          env:
            # Name of the CNI config file to create.
            - name: CNI_CONF_NAME
              value: "10-calico.conflist"
            # The CNI network config to install on each node.
            - name: CNI_NETWORK_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: cni_network_config
            # Set the hostname based on the k8s node name.
            - name: KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # CNI MTU Config variable
            - name: CNI_MTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Prevents the container from sleeping forever.
            - name: SLEEP
              value: "false"
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
        # Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
        # to communicate with Felix over the Policy Sync API.
        - name: flexvol-driver
          image: sea.hub:5000/calico/pod2daemon-flexvol:v3.8.2
          volumeMounts:
          - name: flexvol-driver-host
            mountPath: /host/driver
      containers:
        # Runs calico-node container on each Kubernetes node.  This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: sea.hub:5000/calico/node:v3.8.2
          env:
            # Use Kubernetes API as the backing datastore.
            - name: DATASTORE_TYPE
              value: "kubernetes"
            # Wait for the datastore.
            - name: WAIT_FOR_DATASTORE
              value: "true"
            # Set based on the k8s node name.
            - name: NODENAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Choose the backend to use.
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            - name: IP_AUTODETECTION_METHOD
              value: "interface=eth0"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Off"
            # Set MTU for tunnel device used if ipip is enabled
            - name: FELIX_IPINIPMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect.
            - name: CALICO_IPV4POOL_CIDR
              value: "100.64.0.0/10"
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            # Set Felix endpoint to host default action to ACCEPT.
            - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
              value: "ACCEPT"
            # Disable IPv6 on Kubernetes.
            - name: FELIX_IPV6SUPPORT
              value: "false"
            # Set Felix logging to "info"
            - name: FELIX_LOGSEVERITYSCREEN
              value: "info"
            - name: FELIX_HEALTHENABLED
              value: "true"
          securityContext:
            privileged: true
          resources:
            requests:
              cpu: 250m
          livenessProbe:
            httpGet:
              path: /liveness
              port: 9099
              host: localhost
            periodSeconds: 10
            initialDelaySeconds: 10
            failureThreshold: 6
          readinessProbe:
            exec:
              command:
              - /bin/calico-node
              - -bird-ready
              - -felix-ready
            periodSeconds: 10
          volumeMounts:
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /run/xtables.lock
              name: xtables-lock
              readOnly: false
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /var/lib/calico
              name: var-lib-calico
              readOnly: false
            - name: policysync
              mountPath: /var/run/nodeagent
      volumes:
        # Used by calico-node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
        - name: var-lib-calico
          hostPath:
            path: /var/lib/calico
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d
        # Mount in the directory for host-local IPAM allocations. This is
        # used when upgrading from host-local to calico-ipam, and can be removed
        # if not using the upgrade-ipam init container.
        - name: host-local-net-dir
          hostPath:
            path: /var/lib/cni/networks
        # Used to create per-pod Unix Domain Sockets
        - name: policysync
          hostPath:
            type: DirectoryOrCreate
            path: /var/run/nodeagent
        # Used to install Flex Volume Driver
        - name: flexvol-driver-host
          hostPath:
            type: DirectoryOrCreate
            path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-node
  namespace: kube-system

---
# Source: calico/templates/calico-kube-controllers.yaml

# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
  name: calico-kube-controllers
  namespace: kube-system
  labels:
    k8s-app: calico-kube-controllers
spec:
  # The controllers can only have a single active instance.
  replicas: 1
  selector:
    matchLabels:
      k8s-app: calico-kube-controllers
  strategy:
    type: Recreate
  template:
    metadata:
      name: calico-kube-controllers
      namespace: kube-system
      labels:
        k8s-app: calico-kube-controllers
      annotations:
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux
      tolerations:
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: calico-kube-controllers
      priorityClassName: system-cluster-critical
      containers:
        - name: calico-kube-controllers
          image: sea.hub:5000/calico/kube-controllers:v3.8.2
          env:
            # Choose which controllers to run.
            - name: ENABLED_CONTROLLERS
              value: node
            - name: DATASTORE_TYPE
              value: kubernetes
          readinessProbe:
            exec:
              command:
              - /usr/bin/check-status
              - -r

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-kube-controllers
  namespace: kube-system
' | kubectl apply -f -
configmap/calico-config created
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
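
The warning in the output above says that `apiextensions.k8s.io/v1beta1` CRDs stop being served in v1.22+. As a rough sketch of what moving one of these simple definitions to `apiextensions.k8s.io/v1` involves, the helper below rewrites a CRD held as a Python dict. The function name `crd_v1beta1_to_v1` is hypothetical, and the permissive placeholder schema is an assumption made only to satisfy v1's requirement that every version carries a schema. In practice it is probably simpler to take the CRDs from a Calico release that already ships v1 definitions.

```python
import copy


def crd_v1beta1_to_v1(crd: dict) -> dict:
    """Return an apiextensions.k8s.io/v1 copy of a simple v1beta1 CRD dict."""
    out = copy.deepcopy(crd)
    out["apiVersion"] = "apiextensions.k8s.io/v1"
    spec = out["spec"]
    # v1 has no singular `spec.version`; it requires a `spec.versions` list.
    version = spec.pop("version")
    spec["versions"] = [{
        "name": version,
        "served": True,
        "storage": True,
        # v1 requires a schema per version. This permissive placeholder is an
        # assumption for illustration, not Calico's real schema.
        "schema": {"openAPIV3Schema": {
            "type": "object",
            "x-kubernetes-preserve-unknown-fields": True,
        }},
    }]
    return out


# The IPPool definition from the manifest above, written as a dict.
old = {
    "apiVersion": "apiextensions.k8s.io/v1beta1",
    "kind": "CustomResourceDefinition",
    "metadata": {"name": "ippools.crd.projectcalico.org"},
    "spec": {
        "scope": "Cluster",
        "group": "crd.projectcalico.org",
        "version": "v1",
        "names": {"kind": "IPPool", "plural": "ippools", "singular": "ippool"},
    },
}
new = crd_v1beta1_to_v1(old)
print(new["apiVersion"], new["spec"]["versions"][0]["name"])
```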

At this point the Kubernetes cluster is deployed. Check the cluster status:

[root@node1]# kubectl get node -owide
NAME       STATUS   ROLES    AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
node1   Ready    master   2m50s   v1.19.9   10.10.11.49   <none>        CentOS Linux 7 (Core)   3.10.0-862.11.6.el7.x86_64   docker://19.3.0

[root@node1]# kubectl get pod -A  -owide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE     IP              NODE       NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-5565b777b6-w9mhw   1/1     Running   0          2m32s   100.76.153.65   node1
kube-system   calico-node-mwkg2                          1/1     Running   0          2m32s   10.10.11.49    node1
kube-system   coredns-597c5579bc-dpqbx                   1/1     Running   0          2m32s   100.76.153.64   node1
kube-system   coredns-597c5579bc-fjnmq                   1/1     Running   0          2m32s   100.76.153.66   node1
kube-system   etcd-node1                                 1/1     Running   0          2m51s   10.10.11.49    node1
kube-system   kube-apiserver-node1                       1/1     Running   0          2m51s   10.10.11.49    node1
kube-system   kube-controller-manager-node1              1/1     Running   0          2m51s   10.10.11.49    node1
kube-system   kube-proxy-qgt9w                           1/1     Running   0          2m32s   10.10.11.49    node1
kube-system   kube-scheduler-node1                       1/1     Running   0          2m51s   10.10.11.49    node1

References

https://github.com/alibaba/sealer/blob/main/docs/README_zh.md

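K8S Troubleshooting - Pods Restart Repeatedly Under High Business Concurrency

Posted on 2021-06-19, updated on 2024-12-15, category: kubernetes, reading time ≈ 5 minutes

Background

In a K8S cluster, one business service was pushing a large volume of configuration (for several hours or longer). During that time, the calico Pod was found restarting repeatedly.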

[root@node02 ~]# kubectl get pod -n kube-system -owide|grep node01
calico-kube-controllers-6f59b8cdd8-8v2qw   1/1     Running            0          4h45m   10.10.119.238    node01   <none>           <none>
calico-node-b8w2b                          1/1     CrashLoopBackOff   43         3d19h   10.10.119.238    node01   <none>           <none>
coredns-795cc9c45c-k7qpb                   1/1     Running            0          4h45m   177.177.237.42    node01   <none>           <none>
...

Analysis

A Pod in CrashLoopBackOff usually points to a problem in the service running inside it, so start with kubectl describe:

[root@node02 ~]# kubectl describe pod -n kube-system calico-node-b8w2b
...
Events:
  Type     Reason     Age                      From               Message
  ----     ------     ----                     ----               -------
  Warning  Unhealthy  58m (x111 over 3h12m)    kubelet, node01  (combined from similar events): Liveness probe failed: Get http://localhost:9099/liveness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Normal   Pulled     43m (x36 over 3d19h)     kubelet, node01  Container image "calico/node:v3.15.1" already present on machine
  Warning  Unhealthy  8m16s (x499 over 3h43m)  kubelet, node01  Liveness probe failed: Get http://localhost:9099/liveness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  BackOff    3m31s (x437 over 3h3m)   kubelet, node01  Back-off restarting failed container

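The events show that the restarts are caused by calico failing its health check, and the error is fairly explicit: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers). This error means the connection could not be established before the timeout [1]. Running the health check by hand on the node confirms that responses are slow (in a healthy environment they take milliseconds):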

[root@node01 ~]# time curl -i http://localhost:9099/liveness
HTTP/1.1 204 No Content
Date: Tue, 15 Jun 2021 06:24:35 GMT
real    0m1.012s
user    0m0.003s
sys     0m0.005s
[root@node01 ~]# time curl -i http://localhost:9099/liveness
HTTP/1.1 204 No Content
Date: Tue, 15 Jun 2021 06:24:39 GMT
real    0m3.014s
user    0m0.002s
sys     0m0.005s
[root@node01 ~]# time curl -i http://localhost:9099/liveness
real    1m52.510s
user    0m0.002s
sys     0m0.013s
[root@node01 ~]# time curl -i http://localhost:9099/liveness
^C
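
The manual check above can also be scripted so that each attempt is timed and judged the way an HTTP liveness probe would judge it. The following is a minimal sketch, not kubelet's actual implementation: the function name `probe` is hypothetical, the 1.0 s default mirrors Kubernetes' default `timeoutSeconds` of 1, and proxies are disabled on the assumption that the target is a local endpoint.

```python
import time
import urllib.error
import urllib.request

# Ignore any http_proxy settings; the health endpoint is assumed to be local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url: str, timeout: float = 1.0):
    """Return (healthy, elapsed_seconds) for a single HTTP GET."""
    start = time.monotonic()
    try:
        with _opener.open(url, timeout=timeout) as resp:
            # An HTTP probe counts any status from 200 to 399 as success.
            healthy = 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        healthy = False
    return healthy, time.monotonic() - start
```

On the affected node, calling `probe("http://localhost:9099/liveness")` in a loop would be expected to return `False` with an elapsed time close to the timeout whenever the connection attempt stalls.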

Start with the calico logs: the bird, confd and felix logs were checked in turn and showed no obvious errors. Next, check whether the port is listening normally:

[root@node02 ~]# netstat -anp|grep 9099
tcp        0      0 127.0.0.1:9099          0.0.0.0:*               LISTEN      1202/calico-node    
tcp        0      0 127.0.0.1:9099          127.0.0.1:56728         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:56546         127.0.0.1:9099          TIME_WAIT   -

Since the error is a connection timeout and the business load is heavy, look at the distribution of TCP connection states first:

[root@node01 ~]# netstat -na | awk '/^tcp/{s[$6]++}END{for(key in s) print key,s[key]}'
LISTEN 49
ESTABLISHED 284
SYN_SENT 4
TIME_WAIT 176
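
For readers less familiar with awk, the one-liner above groups every line starting with `tcp` by its sixth column (the connection state) and counts each group. A minimal Python equivalent is sketched below. The function name `count_tcp_states` and the sample text are illustrative only; the sample reuses a few connection lines that appear elsewhere in this post.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from `netstat -na` style output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Same filter as awk '/^tcp/': protocol column starts with "tcp",
        # and the sixth column holds the state.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


sample = """\
tcp        0      0 127.0.0.1:9099          0.0.0.0:*               LISTEN
tcp        0      1 127.0.0.1:47496         127.0.0.1:9099          SYN_SENT
tcp        0      1 127.0.0.1:39142         127.0.0.1:37807         SYN_SENT
tcp        0      0 127.0.0.1:44360         127.0.0.1:9099          TIME_WAIT
"""
for state, n in count_tcp_states(sample).items():
    print(state, n)
```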

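Nothing stands out in the connection states. Next, use top to look at the CPU load: the business java process is at 700% CPU, and watching it for a while shows peaks above 2000%. The business developers said they were running a stress test and that production could see this level of concurrency too. So the next question is whether the CPU is actually under heavy load in this state: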

[root@node01 ~]# top
top - 14:28:57 up 13 days, 27 min,  2 users,  load average: 9.55, 9.93, 9.91
Tasks: 1149 total,   1 running, 1146 sleeping,   0 stopped,   2 zombie
%Cpu(s): 16.0 us,  2.9 sy,  0.0 ni, 80.9 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 15249982+total, 21419184 free, 55542588 used, 75538048 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 94226176 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                        
 6754 root      20   0   66.8g  25.1g 290100 S 700.0 17.3   2971:49 java                                                                                           
25214 root      20   0 6309076 179992  37016 S  36.8  0.1 439:06.29 kubelet                                                                                        
20331 root      20   0 3196660 172364  24908 S  21.1  0.1 349:56.64 dockerd

Check the total number of CPU cores. Combined with the load average and CPU utilization above, the load does not look unreasonably high:

[root@node01 ~]# cat /proc/cpuinfo| grep "physical id"| sort| uniq| wc -l
48
[root@node01 ~]# cat /proc/cpuinfo| grep "cpu cores"| uniq
cpu cores: 1
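
The "not unreasonably high" judgment can be made concrete with a little arithmetic: divide the load average and the process CPU percentage by the number of cores (48 physical ids with 1 core each, as reported above). The helper names below are hypothetical; the inputs are the figures from the top output.

```python
def load_per_core(load_average: float, cores: int) -> float:
    """Average number of runnable tasks per core."""
    return load_average / cores


def cpu_share_percent(process_cpu_percent: float, cores: int) -> float:
    """Share of total CPU capacity used by one process, in percent."""
    return process_cpu_percent / cores


cores = 48 * 1  # 48 physical ids x 1 core each
print(round(load_per_core(9.55, cores), 2))       # 0.2
print(round(cpu_share_percent(700.0, cores), 1))  # 14.6
```

A load of roughly 0.2 per core and a java process using about 15% of total capacity agree with the 80.9% idle figure in the top output, so CPU saturation is an unlikely cause.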

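This is odd. Intuitively the problem is most likely caused by high concurrency, but since nothing shows up here, go back to the connection timeout itself. A connection timeout brings to mind the stages of TCP connection establishment (see the figure below). In which stage does the timeout occur?

From material found via Google [2], quoted here:

During the TCP three-way handshake that creates a connection, a timeout can occur in two situations:

After sending the SYN, the client enters the SYN_SENT state and waits for the server's SYN+ACK.

After receiving the SYN and replying with SYN+ACK, the server enters the SYN_RECV state and waits for the client's ACK.

So in which stage does our problem occur? The check below shows that requests are stuck in SYN_SENT, and not only the calico health check: requests from other components such as kubelet and kube-controller get stuck as well: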

[root@node01 ~]# curl http://localhost:9099/liveness
^C
[root@node01 ~]# netstat -anp|grep 9099
tcp        0      0 127.0.0.1:44360         127.0.0.1:9099          TIME_WAIT   -                   
tcp        0      1 127.0.0.1:47496         127.0.0.1:9099          SYN_SENT    16242/curl

[root@node01 ~]# netstat -anp|grep SYN_SENT
tcp        0      1 127.0.0.1:47496         127.0.0.1:9099          SYN_SENT    16242/curl
tcp        0      1 127.0.0.1:39142         127.0.0.1:37807         SYN_SENT    25214/kubelet       
tcp        0      1 127.0.0.1:38808         127.0.0.1:10251         SYN_SENT    25214/kubelet       
tcp        0      1 127.0.0.1:53726         127.0.0.1:10252         SYN_SENT    25214/kubelet
...
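
A client stuck in SYN_SENT keeps retransmitting its SYN with exponential backoff, which gives a way to read the earlier curl timings. The sketch below assumes the usual Linux behaviour of a 1 s initial retransmission timeout that doubles on every retry, and a `tcp_syn_retries` value of 6. Both values are assumptions, not measurements from this node, and the function names are hypothetical.

```python
def syn_retransmit_times(retries: int, initial_rto: float = 1.0):
    """Elapsed seconds at which each SYN retransmission is sent."""
    times, elapsed, rto = [], 0.0, initial_rto
    for _ in range(retries):
        elapsed += rto       # wait one timeout, then retransmit
        times.append(elapsed)
        rto *= 2             # exponential backoff
    return times


def connect_give_up_after(retries: int, initial_rto: float = 1.0) -> float:
    """Seconds until connect() fails if no SYN is ever answered."""
    # The original SYN plus `retries` retransmissions each get one timeout.
    return sum(initial_rto * 2 ** i for i in range(retries + 1))


print(syn_retransmit_times(6))   # [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
print(connect_give_up_after(6))  # 127.0
```

Under these assumptions, the curl runs that took 1.012 s and 3.014 s would be consistent with the first SYN being dropped and the connection succeeding on the first or second retransmission.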

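So far, two conclusions can be drawn:

The calico health check fails because TCP requests are stuck in the SYN_SENT stage;
The problem is not specific to one Pod and is most likely a general, system-level issue.

Putting the two together, the suspicion falls on the TCP-related kernel parameters, especially those related to the SYN_SENT state [3]: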

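net.ipv4.tcp_max_syn_backlog: defaults to 1024; the length of the SYN queue, which holds half-open connections
net.core.somaxconn: defaults to 128; the upper limit on a listening socket's backlog, that is, the queue of established connections waiting to be accepted. Under highly concurrent requests the default can cause connection timeouts or retransmissions, so it should be tuned to the expected number of concurrent requests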
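
To illustrate why `net.core.somaxconn` matters even when an application asks for a large backlog: the kernel silently caps the value passed to `listen()` at `somaxconn`. The sketch below models that cap and reads a sysctl value from `/proc/sys`. The helper names `effective_backlog` and `read_sysctl` are hypothetical, and the example numbers are the default and tuned values discussed in this post.

```python
def effective_backlog(requested: int, somaxconn: int) -> int:
    """Backlog a listening socket actually gets: listen() is capped at somaxconn."""
    return min(requested, somaxconn)


def read_sysctl(name: str, root: str = "/proc/sys") -> int:
    """Read an integer sysctl such as 'net.core.somaxconn' (Linux only)."""
    path = f"{root}/{name.replace('.', '/')}"
    with open(path) as f:
        return int(f.read().split()[0])


# A server asking for 32768 still gets 128 while somaxconn is at its default.
print(effective_backlog(32768, 128))    # 128
print(effective_backlog(32768, 32768))  # 32768
```

On a Linux node, `read_sysctl("net.core.somaxconn")` returns the value currently in effect.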

The settings on the system were essentially all defaults, so raise the two parameters above and apply the change:

[root@node01 ~]# cat /etc/sysctl.conf 
...
net.ipv4.tcp_max_syn_backlog = 32768
net.core.somaxconn = 32768

[root@node01 ~]# sysctl -p
...
net.ipv4.tcp_max_syn_backlog = 32768
net.core.somaxconn = 32768

Run the calico health check again: requests no longer hang, the problem is gone, and the abnormal Pod has recovered:

[root@node01 ~]# time curl -i http://localhost:9099/liveness
HTTP/1.1 204 No Content
Date: Tue, 15 Jun 2021 14:48:38 GMT
real    0m0.011s
user    0m0.004s
sys     0m0.004s
[root@node01 ~]# time curl -i http://localhost:9099/liveness
HTTP/1.1 204 No Content
Date: Tue, 15 Jun 2021 14:48:39 GMT
real    0m0.010s
user    0m0.001s
sys     0m0.005s
[root@node01 ~]# time curl -i http://localhost:9099/liveness
HTTP/1.1 204 No Content
Date: Tue, 15 Jun 2021 14:48:40 GMT
real    0m0.011s
user    0m0.002s

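Admittedly, this fix was reached half by guesswork and half by verification. Reasoning forward instead, once TCP requests were found stuck in SYN_SENT, the proper next step would have been to confirm that the relevant kernel parameters were actually too small.

Solution

Tune the server's kernel parameters for high-concurrency workloads.

References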

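K8S Troubleshooting - Device Backup Fails Because UDP Requests Do Not Get Through

Posted on 2021-06-14, updated on 2024-12-15, category: kubernetes, reading time ≈ 5 minutes

Background

In a K8S dual-stack environment, the business Pod manages both IPv4 and IPv6 devices (the Pod communicates with the devices over UDP). Backing up the configuration of an IPv4 device succeeds, but backing up the configuration of an IPv6 device fails.

Analysis

Check the IP information on node3, the master node of the K8S cluster: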

[root@node3 ~]# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 0c:da:41:1d:d2:9d brd ff:ff:ff:ff:ff:ff
    inet 192.168.65.13/16 brd 192.168.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.65.21/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2000::65:21/128 scope global deprecated
       valid_lft forever preferred_lft 0sec
    inet6 2000::65:13/64 scope global
       valid_lft forever preferred_lft forever

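The roles of the IPs are:

192.168.65.13: IPv4 node IP
192.168.65.21: IPv4 virtual IP (VIP)
2000::65:13: IPv6 node IP
2000::65:21: IPv6 virtual IP (VIP)

Find the business Pod on the master node that fails to receive UDP packets: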

[root@node1 ~]# kubectl get pod -A -owide|grep tftpserver-dm
ss    tftpserver-dm-798nv                      1/1     Running     2          13d     177.177.166.147   node1   <none>           <none>
ss    tftpserver-dm-drrsn                      1/1     Running     4          13d     177.177.104.10    node2   <none>           <none>
ss    tftpserver-dm-vmgtf                      1/1     Running     6          13d     177.177.135.16    node3   <none>           <none>

Find the Pod's network interface:

[root@node3 ~]# ip route |grep 177.177.135.16
177.177.135.16 dev cali928cc4cd898 scope link

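Trigger a configuration backup of an IPv4 device from the page provided by the business service. The capture shows both requests and responses: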

[root@node3 ~]# tcpdump -n -i cali928cc4cd898 -p udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cali928cc4cd898, link-type EN10MB (Ethernet), capture size 262144 bytes
07:29:48.654684 IP 192.168.101.254.58625 > 177.177.135.16.tftp:  64 WRQ "running_3346183882.cfg" octet tsize 7304 blksize 512 timeout 5
07:29:48.686337 IP 177.177.135.16.39873 > 192.168.101.254.58625: UDP, length 35
07:29:48.707187 IP 192.168.101.254.58625 > 177.177.135.16.39873: UDP, length 516
07:29:48.707332 IP 177.177.135.16.39873 > 192.168.101.254.58625: UDP, length 4
07:29:48.708377 IP 192.168.101.254.58625 > 177.177.135.16.39873: UDP, length 516
07:29:48.708622 IP 177.177.135.16.39873 > 192.168.101.254.58625: UDP, length 4
07:29:48.710532 IP 192.168.101.254.58625 > 177.177.135.16.39873: UDP, length 516
...

Capturing on the host NIC likewise shows both requests and responses:

12:00:02.333324 IP 192.168.101.254.58631 > 192.168.65.21.tftp:  64 WRQ "running_3346346022.cfg" octet tsize 7304 blksize 512 timeout 5
12:00:02.349104 ARP, Request who-has 192.168.101.254 tell 192.168.65.13, length 28
12:00:02.350492 ARP, Reply 192.168.101.254 is-at 58:6a:b1:df:e3:d1, length 46
12:00:02.350499 IP 192.168.65.13.56284 > 192.168.101.254.58631: UDP, length 35
12:00:02.373403 IP 192.168.101.254.58631 > 192.168.65.13.56284: UDP, length 516
12:00:02.373603 IP 192.168.65.13.56284 > 192.168.101.254.58631: UDP, length 4
12:00:02.374613 IP 192.168.101.254.58631 > 192.168.65.13.56284: UDP, length 516
12:00:02.374724 IP 192.168.65.13.56284 > 192.168.101.254.58631: UDP, length 4
12:00:02.375775 IP 192.168.101.254.58631 > 192.168.65.13.56284: UDP, length 516
...

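Trigger a configuration backup of an IPv6 device from the same page. The capture shows that after the device sends its initial request, the Pod's reply is retransmitted every 3 seconds and no data packets ever arrive: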

[root@node3 ~]# tcpdump -n -i cali928cc4cd898 -p udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cali928cc4cd898, link-type EN10MB (Ethernet), capture size 262144 bytes
08:14:31.913637 IP6 2000::65:119.41217 > fd00:177:177:0:7bf3:bb28:910a:873c.tftp:  64 WRQ "running_3346210712.cfg" octet tsize 8757 blksize 512 timeout 5
08:14:31.925400 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.38680 > 2000::65:119.41217: UDP, length 35
08:14:34.928820 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.38680 > 2000::65:119.41217: UDP, length 35
08:14:37.931610 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.38680 > 2000::65:119.41217: UDP, length 35
08:14:40.933541 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.38680 > 2000::65:119.41217: UDP, length 35
08:19:25.395306 IP6 2000::65:119.41218 > fd00:177:177:0:7bf3:bb28:910a:873c.tftp:  64 WRQ "startup_3346213742.cfg" octet tsize 8757 blksize 512 timeout 5
08:19:25.410374 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.48233 > 2000::65:119.41218: UDP, length 35
08:19:28.413797 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.48233 > 2000::65:119.41218: UDP, length 35
08:19:31.415977 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.48233 > 2000::65:119.41218: UDP, length 35
08:19:34.418414 IP6 fd00:177:177:0:7bf3:bb28:910a:873c.48233 > 2000::65:119.41218: UDP, length 35
...

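Capturing on the host NIC shows both requests and responses, which means the device's responses reach the host but never reach the Pod's interface: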

11:55:29.393598 IP6 2000::65:119.41226 > 2000::65:21.tftp:  64 WRQ "startup_3346343382.cfg" octet tsize 8757 blksize 512 timeout 5
11:55:29.401115 IP6 2000::65:13.32991 > 2000::65:119.41226: UDP, length 35
11:55:29.405709 IP6 2000::65:119.41226 > 2000::65:21.32991: UDP, length 516
11:55:29.405745 IP6 2000::65:21 > 2000::65:119: ICMP6, destination unreachable, unreachable port, 2000::65:21 udp port 32991, length 572
11:55:32.404514 IP6 2000::65:13.32991 > 2000::65:119.41226: UDP, length 35
11:55:32.406399 IP6 2000::65:119.41226 > 2000::65:21.32991: UDP, length 516
11:55:32.406432 IP6 2000::65:21 > 2000::65:119: ICMP6, destination unreachable, unreachable port, 2000::65:21 udp port 32991, length 572
11:55:35.407644 IP6 2000::65:13.32991 > 2000::65:119.41226: UDP, length 35
11:55:35.409423 IP6 2000::65:119.41226 > 2000::65:21.32991: UDP, length 516
11:55:35.409463 IP6 2000::65:21 > 2000::65:119: ICMP6, destination unreachable, unreachable port, 2000::65:21 udp port 32991, length 572
...

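So how do the exchanges differ between the IPv6 device and the IPv4 device? Comparing the host NIC captures for the two scenarios shows the following.

Host capture when the IPv4 device sends the request:
1. In the first exchange, the device (192.168.101.254) sends the request to the VIP (192.168.65.21).
2. In the second exchange, the business Pod sends its packet to the device with the node IP (192.168.65.13) as the source.
3. In the third exchange, the device sends its packet to the business Pod with the node IP (192.168.65.13) as the destination.

Host capture when the IPv6 device sends the request:
1. In the first exchange, the device (2000::65:119) sends the request to the VIP (2000::65:21).
2. In the second exchange, the business Pod sends its packet to the device with the node IP (2000::65:13) as the source.
3. In the third exchange, the device sends its packet to the business Pod with the VIP (2000::65:21) as the destination.

The exchanges show that the IPv6 device does not reply to the address the Pod's packet came from: the Pod sends from the node IP, but the device replies to the VIP. It was confirmed that the device had been explicitly configured to always use the VIP as the destination address, whereas normally the source IP of the request should be used as the destination of the response.

A temporary change that made the third exchange use the node IP instead of the VIP as the destination confirmed that this resolves the problem.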
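
The mismatch described above can be spotted mechanically in a capture. The following is a minimal sketch, not part of the original investigation: it takes two tcpdump endpoints (the source of the Pod's packet and the destination of the device's next packet) and reports whether they refer to the same address. The helper names `addr_of` and `check_reply_target` are made up for illustration; the sample endpoints are copied from the captures in this post.

```shell
# Strip tcpdump's trailing ":" and the ".port" suffix from an endpoint,
# e.g. "2000::65:119.41226:" -> "2000::65:119".
addr_of() {
  local ep="${1%:}"
  printf '%s\n' "${ep%.*}"
}

# $1: endpoint the Pod's packet was sent from
# $2: endpoint the device sent its next packet to
check_reply_target() {
  local sent_from target
  sent_from="$(addr_of "$1")"
  target="$(addr_of "$2")"
  if [ "$sent_from" = "$target" ]; then
    echo "match: device replies to $target"
  else
    echo "mismatch: Pod sent from $sent_from but device replies to $target"
  fi
}

# IPv4 capture: the device answers the node IP it received the packet from.
check_reply_target 192.168.65.13.56284 192.168.65.13.56284:
# IPv6 capture: the device answers the VIP instead of the node IP.
check_reply_target 2000::65:13.32991 2000::65:21.32991:
```

The second call reports a mismatch, which is the condition that ends in the ICMP6 "unreachable port" replies seen on the host.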

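Solution

Fix the packet-sending configuration on the business side, so that the device no longer forces the VIP as the destination address.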

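K8S Troubleshooting: Frequent UDP Traffic Leaves a Restarted Pod Unable to Receive Data

Posted 2021-06-14, updated 2024-12-15, category kubernetes, reading time ≈ 6 min

Background

In a K8S environment, a device outside the cluster frequently sends UDP requests through a NodePort to a Pod inside the cluster. When the Pod restarts, because of an upgrade or a failure, the traffic is interrupted.

Analysis

Set up a K8s cluster: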

[root@node]# kubectl get node -owide
NAME    STATUS   ROLES   VERSION    INTERNAL-IP             
node    Ready     master   v1.15.12    10.10.212.164

Deploy a UDP service exposed through a NodePort:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dao
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dao
  template:
    metadata:
      labels:
        app: dao
    spec:
      containers:
      - image: samwelkey24/dao-2048:1.0
        name: dao
---
apiVersion: v1
kind: Service
metadata:
  name: dao
  labels:
    app: dao
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    name: tcp
  - port: 8080
    targetPort: 8080
    nodePort: 30030
    name: udp
    protocol: UDP
  selector:
    app: dao

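Use nc to simulate a client that frequently sends UDP packets to the NodePort:

[root@node]# while true; do echo "test" | nc -4u 10.10.212.164 30030 -p 9999; done

Captures on the Pod interface and on the host NIC both show the requests arriving normally: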

[root@node]# tcpdump -n -i cali1bd5e5bd67b port 8080
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
17:39:50.543529 IP 10.10.212.164.7156 > 177.177.241.159.webcache: UDP, length 5
17:39:50.553849 IP 10.10.212.164.7156  > 177.177.241.159.webcache: UDP, length 5
17:39:50.565139 IP 10.10.212.164.7156 > 177.177.241.159.webcache: UDP, length 5
17:39:50.576749 IP 10.10.212.164.7156 > 177.177.241.159.webcache: UDP, length 5
17:39:50.587671 IP 10.10.212.164.7156 > 177.177.241.159.webcache: UDP, length 5

[root@node]# tcpdump -n -i eth0  port 30030
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:43:10.470136 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5
17:43:10.481007 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5
17:43:10.491607 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5
17:43:10.502879 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5

Trigger a restart by deleting the Pod:

[root@node]#  kubectl get pod -n allkinds -owide
NAME                   READY   STATUS    RESTARTS   AGE   IP                NODE       
dao-5f7669bc69-kkfk5   1/1     Running   0          18m   177.177.241.159   node

[root@node]# kubectl delete pod dao-5f7669bc69-kkfk5

After the Pod restarts, the capture shows that it no longer receives any UDP packets:

[root@node]# tcpdump -n -i cali1bd5e5bd67b port 8080
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
^

Packets can still be captured on the NIC of the node that hosts the Pod, so the requests do reach the node:

[root@node]# tcpdump -n -i eth0 port 30030
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:55:08.173773 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5
17:55:08.187789 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5
17:55:08.201551 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5
17:55:08.212789 IP 10.10.212.167.distinct > 10.10.212.164.30030: UDP, length 5

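Tracing the request path with iptables TRACE shows that the traffic does not pass through the PREROUTING chain of the nat table, and afterwards does not go to the FORWARD chain as expected but to the INPUT chain and on up the protocol stack. This suggests that DNAT is where things go wrong.

According to the netfilter packet-flow diagram, DNAT is tied to the conntrack table.

Checking the conntrack entries for the NodePort confirms that the entry is the problem:

Normal entry:
[root@node]# cat /proc/net/nf_conntrack | grep 30030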
ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=177.177.241.159 dst=10.10.212.164 sport=8080 dport=9999 mark=0 zone=0 use=2

Abnormal entry:
[root@node]# cat /proc/net/nf_conntrack | grep 30030
ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=10.10.212.164 dst=10.10.212.167 sport=8080 dport=9999 mark=0 zone=0 use=2

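The conntrack entries show that when the business Pod restarts, the entry records the node IP instead of the Pod IP as the reply-direction source. The default timeout of an unreplied UDP conntrack entry is 30s, so when the device sends requests frequently the entry never ages out, and all subsequent requests keep being delivered to the node IP rather than the Pod IP.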
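
The comparison between the two entries can be sketched as a small shell check. This is a minimal illustration, not the tooling used in the investigation: the function names `reply_src` and `check_entry` are made up, and the two sample lines are the normal and abnormal entries shown above. It extracts the second `src=` field (the reply tuple) and compares it with the expected Pod IP.

```shell
# Print the source address of the reply tuple (the second src= field)
# of a single conntrack line read from stdin.
reply_src() {
  awk '{
    n = 0
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^src=/) {
        n++
        if (n == 2) { sub(/^src=/, "", $i); print $i; exit }
      }
    }
  }'
}

# $1: expected Pod IP; the conntrack line is read from stdin.
check_entry() {
  local pod_ip="$1" actual
  actual="$(reply_src)"
  if [ "$actual" = "$pod_ip" ]; then
    echo "ok: replies expected from $actual"
  else
    echo "stale: replies expected from $actual, not $pod_ip"
  fi
}

good='ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=177.177.241.159 dst=10.10.212.164 sport=8080 dport=9999 mark=0 zone=0 use=2'
bad='ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=10.10.212.164 dst=10.10.212.167 sport=8080 dport=9999 mark=0 zone=0 use=2'

echo "$good" | check_entry 177.177.241.159
echo "$bad"  | check_entry 177.177.241.159
```

On a live node the same check could be fed from `grep 30030 /proc/net/nf_conntrack`, with the Pod IP taken from `kubectl get pod -owide`.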

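Why does the reply-direction src in the UDP entry become the node IP when the Pod restarts? The suspicion is that while the Pod restarts, its IP changes and the corresponding iptables rules are deleted and re-added. If the device keeps sending requests to the Pod through the NodePort during that window, there is a short period in which the requests cannot be forwarded to the Pod; the node itself receives them, and that is what gets recorded in the conntrack entry.

To test this idea, use nc again to send frequent UDP requests to the node IP, this time on port 30031:

[root@node]# while true; do echo "test" | nc -4u 10.10.212.164 30031 -p 9999; done

The conntrack entry for port 30031 confirms that for a UDP request addressed to the node IP itself, the reply-direction src is the node IP. This supports the guess that the same thing can happen while the Pod restarts:

[root@node]# cat /proc/net/nf_conntrack | grep 30031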
ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30031 [UNREPLIED] src=10.10.212.164 dst=10.10.212.167 sport=30031 dport=9999 mark=0 zone=0 use=2

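A Pod restart generally consists of a Kill followed by a Create, so in which phase is the abnormal conntrack entry created? Deleting the Pod while recording the creation time of the abnormal entry in real time shows that the old entry is removed during the Kill phase and the abnormal entry is created during the Create phase.

The kube-proxy code also shows where the related iptables rules are cleaned up:

Code location: https://github.com/kubernetes/kubernetes/blob/v1.15.12/pkg/proxy/iptables/proxier.go

Why does the problem show up only occasionally during Pod creation? Reading proxier.go and verifying on the environment shows that between the deletion of the old Pod and the creation of the new one, the following rule is temporarily installed in the KUBE-EXTERNAL-SERVICES chain (which is evaluated after the DNAT chains) to REJECT traffic destined for the unavailable Pod: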

-A KUBE-EXTERNAL-SERVICES -p udp -m comment --comment "allkinds/allkinds-deployment:udp has no endpoints" -m addrtype --dst-type LOCAL -m udp --dport 30030 -j REJECT --reject-with icmp-port-unreachable

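This rule is installed temporarily while the Pod is unavailable, so at some point during Pod creation it must be removed and the corresponding DNAT rules installed. The order of these two operations is critical. If the DNAT rules are installed first, requests go from being rejected to being DNATed, and the conntrack entry should be recorded correctly. If the REJECT rule is removed first, there is a transient state before the DNAT rules are installed in which neither the REJECT rule nor the DNAT rules exist, and that produces exactly the behavior we see.

To check this guess, keep reading proxier.go. The rules are actually written out in the lines below: the filter table is written first and the nat table second. The REJECT rule lives in the filter table and the DNAT rules in the nat table, so the write order is very likely the cause of the anomaly.

Code location: https://github.com/kubernetes/kubernetes/blob/v1.15.12/pkg/proxy/iptables/proxier.go#L667-L1446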
  // Sync rules. 
  // NOTE: NoFlushTables is used so we don't flush non-kubernetes chains in the table 
  proxier.iptablesData.Reset()
  proxier.iptablesData.Write(proxier.filterChains.Bytes()) 
  proxier.iptablesData.Write(proxier.filterRules.Bytes()) 
  proxier.iptablesData.Write(proxier.natChains.Bytes()) 
  proxier.iptablesData.Write(proxier.natRules.Bytes())

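Finally, verify with a change: swap the order in which the filter and nat tables are written, rebuild the kube-proxy image, and replace it in the environment. With this change the problem no longer occurs.

However, this change was only a temporary one made to pin down the cause. Changing the write order of the two tables has a large impact and should not be done lightly, so an issue was filed with the community (https://github.com/kubernetes/kubernetes/issues/102618). The community replied quickly that the PR https://github.com/kubernetes/kubernetes/pull/98305 had already fixed it. The community fix moves the clean-up of conntrack entries to after the filter and nat tables are written. Analysis and testing confirm that this resolves the problem. The only minor flaw is that a few abnormal conntrack entries still appear occasionally; they are then cleaned up and the entries return to normal, so there is no real impact: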

ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=10.10.212.164 dst=10.10.212.167 sport=8080 dport=9999 mark=0 zone=0 use=2
ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=177.177.241.159 dst=10.10.212.164 sport=8080 dport=9999 mark=0 zone=0 use=2
ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=177.177.241.159 dst=10.10.212.164 sport=8080 dport=9999 mark=0 zone=0 use=2
ipv4     2 udp      17 29 src=10.10.212.167 dst=10.10.212.164 sport=9999 dport=30030 [UNREPLIED] src=177.177.241.159 dst=10.10.212.164 sport=8080 dport=9999 mark=0 zone=0 use=2

Solution

Upgrade K8S to v1.21 or later;
If K8S cannot be upgraded, backport the community fix to the older version.
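
Until one of these is in place, a stale entry can also be removed by hand so that the next packet creates a fresh one. This is not part of the original analysis; it is a minimal sketch that assumes the `conntrack` tool from conntrack-tools is installed on the node, and the helper name `conntrack_delete_cmd` is made up for illustration. The sketch only prints the command instead of running it:

```shell
# Build (but do not run) the conntrack command that deletes UDP entries
# whose original destination port is the given NodePort.
conntrack_delete_cmd() {
  local node_port="$1"
  printf 'conntrack -D -p udp --dport %s\n' "$node_port"
}

cmd="$(conntrack_delete_cmd 30030)"
echo "would run: $cmd"
# To really delete the entries, run the printed command as root on the node.
```

Because the device keeps sending, this only helps once the DNAT rules for the new Pod are already in place; otherwise the abnormal entry is simply recreated.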

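How to Use the fsck Command to Check and Repair a File System

Posted 2021-04-17, updated 2024-12-15, category linux, reading time ≈ 3 min

Introduction

fsck (File System Consistency Check) is a Linux utility that checks file systems for errors or outstanding issues. It can repair potential errors and generate a report.

Linux distributions ship with this tool by default. No specific steps or installation are needed to use fsck; once a terminal is open, the tool is ready to use.

Follow this guide to learn how to check and repair file systems on Linux with fsck. The tutorial lists examples of how to use the tool and for which use cases.

Prerequisites

A Linux or UNIX-like system
Access to a terminal or command line
A user with root privileges to run the tool

When to Use fsck in Linux

fsck can be used in several situations:

Run a file system check with fsck as preventive maintenance or when the system has a problem.
A common problem fsck can diagnose is a system that fails to boot.
Another is an input/output error caused by corrupted files on the system.
fsck can also check the health of external drives such as SD cards or USB flash drives.

Basic fsck Syntax

The basic syntax of the fsck utility follows this pattern:

fsck <options> <filesystem>

In the example above, filesystem can be a device, a partition, a mount point, and so on. File-system-specific options can also be added at the end of the command.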

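How to Check and Repair a File System

A few steps are needed before checking and repairing a file system.

View Mounted Disks and Partitions

To view all mounted devices on the system and check disk locations, use one of the tools available in Linux. For example, use the df command to list file system disks: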

df -h

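The tool prints the usage of the file systems on the system. Note the disk to be checked with fsck.

For example, to view the partitions of the first disk, use the following command:

sudo parted /dev/sda 'print'

sda is how Linux refers to the first SCSI disk. If there are two, the second is sdb, and so on.

In our example there is a single result, because this virtual machine has only one partition. With more partitions there would be more results.

The disk name here is /dev/sda, and the "Number" column shows the partition number. In our case that is sda1.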

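Unmount the Disk

A disk or partition must be unmounted before it can be checked with fsck. Running fsck on a mounted disk or partition produces a warning.

Make sure to run the umount command first:

sudo umount /dev/sdb

Replace /dev/sdb with the device to unmount.

Note: the root file system cannot be unmounted, so fsck cannot be run on the root file system of a running machine.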
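
Whether a device is still mounted can be checked in a script before fsck is called. The following is a minimal sketch under stated assumptions: the helper name `is_mounted` is made up, it only compares the first column of a mounts file (by default `/proc/mounts` on Linux), and the demo runs against a sample file so that it does not depend on the disks of the machine it runs on.

```shell
# Return 0 if the device is listed as a mount source in the mounts file.
# $1: device path, $2: mounts file (defaults to /proc/mounts on Linux).
is_mounted() {
  local dev="$1" mounts="${2:-/proc/mounts}"
  awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Demo against a sample file; on a real system call
# is_mounted /dev/sdb without the second argument.
sample="$(mktemp)"
printf '/dev/sda1 / ext4 rw 0 0\n/dev/sdb /data ext4 rw 0 0\n' > "$sample"

for dev in /dev/sdb /dev/sdc; do
  if is_mounted "$dev" "$sample"; then
    echo "$dev is mounted: run 'sudo umount $dev' before fsck"
  else
    echo "$dev is not listed as mounted"
  fi
done
rm -f "$sample"
```

The check matches the exact device name only, so a mounted partition such as /dev/sdb1 would have to be tested separately.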

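Run fsck to Check for Errors

With the disk unmounted, fsck can be run. To check the second disk, enter:

sudo fsck /dev/sdb

On a healthy disk the check finishes without prompts. If the disk has multiple problems, a prompt appears for each error and the action has to be confirmed manually.

The fsck utility returns the following exit codes; when several conditions apply, the exit code is the sum of their values:

Code	Meaning
0	No errors
1	File system errors corrected
2	System should be rebooted
4	File system errors left uncorrected
8	Operational error
16	Usage or syntax error
32	Checking canceled by user request
128	Shared-library error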
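
Because the exit status of fsck is a bit mask, a script that calls fsck usually has to decode it. A minimal sketch, with the values taken from the fsck(8) manual page and a helper name (`decode_fsck_status`) chosen purely for illustration:

```shell
# Decode an fsck exit status into its individual conditions.
# The status is a bitwise OR of the values documented in fsck(8).
decode_fsck_status() {
  local status="$1"
  if [ "$status" -eq 0 ]; then
    echo "no errors"
    return 0
  fi
  [ $((status & 1)) -ne 0 ]   && echo "filesystem errors corrected"
  [ $((status & 2)) -ne 0 ]   && echo "system should be rebooted"
  [ $((status & 4)) -ne 0 ]   && echo "filesystem errors left uncorrected"
  [ $((status & 8)) -ne 0 ]   && echo "operational error"
  [ $((status & 16)) -ne 0 ]  && echo "usage or syntax error"
  [ $((status & 32)) -ne 0 ]  && echo "checking canceled by user request"
  [ $((status & 128)) -ne 0 ] && echo "shared-library error"
  return 0
}

# Example: status 5 = 1 (errors corrected) + 4 (errors left uncorrected).
decode_fsck_status 5
```

In a real script the argument would be `$?` captured immediately after the fsck call.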

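Mount the Disk

After checking and repairing the device, mount the disk so that it can be used again.

In this example we remount the sdb disk:

mount /dev/sdb

Do a Dry Run with fsck

Before performing a real check, a test run can be done with fsck. Pass the -N option to fsck to perform the test:

sudo fsck -N /dev/sdb

The output shows what would happen, but nothing is actually executed.

Fix Detected Errors Automatically with fsck

To try to fix potential problems without any prompts, pass the -y option to fsck:

sudo fsck -y /dev/sdb

Skip Repair but Show fsck Errors in the Output

To check a file system for potential errors without repairing them, use the -n option:

sudo fsck -n /dev/sdb

Force fsck to Check the File System

When fsck is run on a clean device, the tool skips the file system check. To force a check, use the -f option:

sudo fsck -f /dev/sdb

The scan then searches for corruption even if the file system appears to be fine.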

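Run fsck on All File Systems at Once

To check all file systems with fsck in one go, pass the -A flag. This option walks through all the file systems listed in /etc/fstab and checks them.

Because the root file system cannot be unmounted on a running machine, add the -R option to skip it:

fsck -AR

Skip fsck on a Specific File System Type

To make fsck skip a file system type, pass -t and put "no" in front of the type.

For example, to skip ext3 file systems, run the following command:

sudo fsck -AR -t noext3 -y

We added -y to skip the prompts.

Skip fsck on Mounted File Systems

To make sure fsck is not run on a mounted file system, add the -M option. This flag tells fsck to skip any mounted file system.

To show the difference, we run fsck on sdb once while it is mounted and once after it has been unmounted:

sudo fsck -M /dev/sdb

While sdb is mounted, the tool exits without running a check. After sdb is unmounted and the same command is run again, fsck checks the disk and reports it as clean or as having errors.

Note: to suppress the title on the first line ("fsck from util-linux 2.31.1"), use the -T option.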

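Run fsck on the Linux Root Partition

As mentioned, fsck cannot check the root partition of a running machine, because it is mounted and in use. The root partition can, however, be checked by booting into recovery mode and running fsck there.

1. Power on or restart the machine, through the GUI or from a terminal:

sudo reboot

2. Hold the Shift key during boot. The GNU GRUB menu appears.

3. Select "Advanced options for Ubuntu".

4. Select the entry that ends with "(recovery mode)" and let the system load into the recovery menu.

5. Select fsck from the menu.

6. Confirm by selecting <Yes> at the prompt.

7. When it finishes, select "resume" in the recovery menu to boot the machine.

What If fsck Is Interrupted

An fsck check in progress should normally not be interrupted. If it is interrupted anyway, fsck finishes the check that is currently running and then stops.

If the utility had found errors during the check, it will not try to repair anything after being interrupted. The check can be rerun later.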

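fsck Command Options

Finally, here is a list of the options that can be used with the fsck utility.

Option	Description
-a	Try to repair file system errors automatically. No prompts appear, so use it with care.
-A	Check all file systems listed in /etc/fstab.
-C	Show progress when checking ext2 and ext3 file systems.
-f	Force fsck to check the file system, even when it appears to be clean.
-l	Lock the device so that other programs cannot use the partition during the scan and repair.
-M	Do not check mounted file systems. The tool returns exit code 0 when the file system is mounted.
-N	Do a dry run. The output shows what fsck would do without executing any action. Warnings and error messages are printed as well.
-P	Run the scan on multiple file systems in parallel. Use with care.
-R	When used with -A, do not check the root file system.
-r	Print device statistics.
-t	Specify the file system types to check. See the manual page for details.
-T	Hide the title when the tool starts.
-y	Try to repair file system errors automatically during the check.
-V	Verbose output.

Conclusion

We now know how to use the fsck command to check and repair file systems. This guide covered the tool's features along with examples.

Make sure to have root privileges before running the commands listed here. For a detailed description of all options, consult the tool's manual page (man fsck).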

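Getting Started with Helm

Posted 2021-01-11, updated 2024-12-15, category kubernetes, reading time ≈ 15 min

Introduction to Helm

Helm is a tool that simplifies installing and managing Kubernetes applications. Think of it as apt/yum/homebrew for Kubernetes.

This document uses Helm v3. If you are using Helm v2, go to the helm-v2 branch. See "Helm Status" for more details on the different Helm versions.

Helm Status

Helm v3 was released in November 2019. The interfaces of the old and new versions are very similar, but Helm's architecture and internals changed significantly. For more details, see the documentation on what changed in Helm 3.

Helm v2 was given a one-year "maintenance mode" support plan, which states the following:

6 months of bug fixes, until May 13, 2020
6 months of security fixes, until November 13, 2020
From November 13, 2020, support for Helm v2 ends

Why Use Helm

Helm is often described as the package manager for Kubernetes applications. So what is the benefit of using Helm instead of using kubectl directly?

Goals

These labs give insight into the advantages of using Helm over working with Kubernetes directly through kubectl. Each of the following labs is split into two scenarios: the first shows how to perform the task with kubectl, the second shows how to do it with Helm. After completing all the labs, we will:

Understand the core concepts of Helm
Understand the advantages of deploying with Helm rather than directly with Kubernetes:
- Application management
- Updates
- Configuration
- Revision management
- Repositories and chart sharing

Prerequisites

A running Kubernetes cluster. For details on creating a cluster, see the IBM Cloud Kubernetes Service or Kubernetes getting started guides.
Helm installed and initialized for the Kubernetes cluster. To get started with Helm, see Installing Helm on IBM Cloud Kubernetes Service or the Helm Quickstart Guide.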

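Helm Overview

Helm is a tool that simplifies installing and managing Kubernetes applications. It uses a packaging format called a "chart", which is a collection of files that describe Kubernetes resources. It can run anywhere (laptop, CI/CD, and so on) and is available for various operating systems such as OSX, Linux, and Windows.

Helm 3 moved from the client-server architecture of Helm 2 to a client-only architecture. The client is still called helm, and there is an improved Go library that encapsulates the Helm logic so that it can be used by different clients. The client is a CLI that users interact with to perform operations such as install, upgrade, and delete. The client talks to the Kubernetes API server and to chart repositories. It renders Helm template files into Kubernetes manifest files, which it uses to perform operations on the Kubernetes cluster through the Kubernetes API. For more details, see the Helm architecture documentation.

A chart is organized as a collection of files inside a directory, where the directory name is the name of the chart. It contains template YAML files, which make it possible to supply configuration values at run time without modifying the YAML files themselves. The templates get their programming logic from the Go template language, functions from the Sprig library, and other specialized functions.

A chart repository is a location where packaged charts can be stored and shared, similar to an image registry in Docker. For more details, see the Chart Repository Guide.

Helm Concepts

Helm terminology:

Chart - Contains all the resource definitions needed to run an application, tool, or service inside a Kubernetes cluster. A chart is essentially a package of pre-configured Kubernetes resources.
Config - Contains configuration information that can be merged into a chart to create a releasable object.
helm - The Helm client. It renders charts into manifest files and talks directly to the Kubernetes API server to install, upgrade, query, and remove Kubernetes resources.
Release - An instance of a chart running in a Kubernetes cluster.
Repository - A place where charts are stored and can be shared with others.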

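Lab0 Installing Helm

The Helm client (helm) can be installed from source or from a pre-built binary release. In this lab we use the pre-built binary release (Linux amd64) from the Helm community. For more details, see the Helm installation documentation.

Prerequisites

A Kubernetes cluster

Installing the Helm Client

Download the latest release of Helm v3 for your environment. The steps below are for Linux amd64; adjust the examples for other environments.
Unpack it: $ tar -zxvf helm-v3.<x>.<y>-linux-amd64.tgz
Find the helm binary in the unpacked directory and move it to the desired location: mv linux-amd64/helm /usr/local/bin/helm. Ideally the target location is on the PATH environment variable, so that the helm command can be run without specifying its path.
The Helm client is now installed and can be tested with the helm help command.

Conclusion

Helm is now ready to use.

Lab1 Deploying an Application with Helm

Let's look at how Helm uses charts to simplify deployment. We first deploy an application to the Kubernetes cluster with kubectl, and then show how to deploy the same application with Helm.

The application is the Guestbook App, a multi-tier web application.

Scenario 1: Deploying the Application with kubectl

In this part of the lab we deploy the application with the Kubernetes client, kubectl, using version 1 of the application.

If the guestbook application is already installed from kube101, skip this section and go to the Helm example in Scenario 2.

Clone the Guestbook App repository to get the files:

git clone https://github.com/IBM/guestbook.git

Use the configuration files from the cloned Git repository to deploy the containers and create services for them with the following commands: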

$ cd guestbook/v1

$ kubectl create -f redis-master-deployment.yaml
deployment.apps/redis-master created

$ kubectl create -f redis-master-service.yaml
service/redis-master created

$ kubectl create -f redis-slave-deployment.yaml
deployment.apps/redis-slave created

$ kubectl create -f redis-slave-service.yaml
service/redis-slave created

$ kubectl create -f guestbook-deployment.yaml
deployment.apps/guestbook-v1 created

$ kubectl create -f guestbook-service.yaml
service/guestbook created

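For more details, see the README.

View the guestbook:

We can now try out the guestbook we just created by opening it in a browser (it may take a while to show up).

Local host: if Kubernetes is running locally, navigate to http://localhost:3000 in a browser to view the guestbook.

Remote host:

To view the guestbook on a remote host, find the external IP and port of the load balancer in the EXTERNAL-IP and PORT(S) columns of the $ kubectl get services output.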

$ kubectl get services
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)         
guestbook      LoadBalancer   172.21.252.107   50.23.5.136   3000:31367/TCP 
redis-master   ClusterIP      172.21.97.222    <none>        6379/TCP       
redis-slave    ClusterIP      172.21.43.70     <none>        6379/TCP       
.........

In this case the URL is http://50.23.5.136:31367.
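
The URL above is assembled by hand from two columns of the kubectl get services output. A minimal sketch of that step, assuming the PORT(S) value has the form port:nodePort/protocol as in the output shown; the function names `node_port` and `service_url` are illustrative only:

```shell
# Extract the node port from a PORT(S) value such as "3000:31367/TCP".
node_port() {
  local spec="$1"
  spec="${spec#*:}"             # drop the service port and the colon
  printf '%s\n' "${spec%%/*}"   # drop the protocol suffix
}

# Build the URL from an external IP and a PORT(S) value.
service_url() {
  printf 'http://%s:%s\n' "$1" "$(node_port "$2")"
}

service_url 50.23.5.136 3000:31367/TCP
```

With the values from the output above this prints http://50.23.5.136:31367.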

Note: if no external IP has been assigned, the external IP of a node can be obtained with the following command:

$ kubectl get nodes -o wide
NAME           STATUS    ROLES     AGE       VERSION        EXTERNAL-IP      OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME  
10.47.122.98   Ready     <none>    1h        v1.10.11+IKS   173.193.92.112   Ubuntu 16.04.5 LTS   4.4.0-141-generic   docker://18.6.1

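In this case the URL is http://173.193.92.112:31367. Navigate to the resulting URL in a browser (for example http://50.23.5.136:31367); the browser should show the guestbook page.

Scenario 2: Deploying the Application with Helm

In this part of the lab we deploy the application with Helm. We set the release name to guestbook-demo to distinguish it from the previous deployment. The Helm chart is available in the Helm 101 repository; clone it to get the files:

git clone https://github.com/IBM/helm101

A chart is defined as a collection of files that describe a related set of Kubernetes resources. Let's look at the files before installing. The files of the guestbook chart are as follows: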

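.
├── Chart.yaml    # YAML file with information about the chart
├── LICENSE       # License
├── README.md     # Help text with information on chart usage, configuration, installation, and so on
├── templates     # Template directory; combined with values.yaml it generates valid Kubernetes manifest files
│   ├── _helpers.tpl                   # Template helpers/definitions reused throughout the chart
│   ├── guestbook-deployment.yaml      # Guestbook application container resource
│   ├── guestbook-service.yaml         # Guestbook application service resource
│   ├── NOTES.txt                      # Plain-text file with short usage notes on how to access the application after installation
│   ├── redis-master-deployment.yaml   # Redis master container resource
│   ├── redis-master-service.yaml      # Redis master service resource
│   ├── redis-slave-deployment.yaml    # Redis slave container resource
│   └── redis-slave-service.yaml       # Redis slave service resource
└── values.yaml   # Default configuration values of the chart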

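Note: the template files shown above are rendered into Kubernetes manifest files before being passed to the Kubernetes API server. They therefore map to the manifest files we deployed when using kubectl (README and NOTES excluded).

Let's go ahead and install the chart now. If the helm-demo namespace does not exist, create it with the following command:

kubectl create namespace helm-demo

Install the application as a Helm chart: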

$ cd helm101/charts

$ helm install guestbook-demo ./guestbook/ --namespace helm-demo
NAME: guestbook-demo
...

We should see output similar to the following:

NAME: guestbook-demo
LAST DEPLOYED: Mon Feb 24 18:08:02 2020
NAMESPACE: helm-demo
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the application URL by running these commands:
  NOTE: It may take a few minutes for the LoadBalancer IP to be available.
      You can watch the status of by running 'kubectl get svc -w guestbook-demo --namespace helm-demo'
  export SERVICE_IP=$(kubectl get svc --namespace helm-demo guestbook-demo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  echo http://$SERVICE_IP:3000

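Installing the chart creates the Kubernetes Deployments and Services for the Redis master and slaves and for the guestbook application. This works because the chart is a collection of files describing a related set of Kubernetes resources, and Helm manages the creation of those resources through the Kubernetes API.

Check the deployment status:

$ kubectl get deployment guestbook-demo --namespace helm-demo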
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
guestbook-demo   2/2     2            2           51m

Check the Pod status:

$ kubectl get pods --namespace helm-demo
NAME                            READY     STATUS    RESTARTS   AGE
guestbook-demo-6c9cf8b9-jwbs9   1/1       Running   0          52m
guestbook-demo-6c9cf8b9-qk4fb   1/1       Running   0          52m
redis-master-5d8b66464f-j72jf   1/1       Running   0          52m
redis-slave-586b4c847c-2xt99    1/1       Running   0          52m
redis-slave-586b4c847c-q7rq5    1/1       Running   0          52m

Check the Service status:

$ kubectl get services --namespace helm-demo
NAME             TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
guestbook-demo   LoadBalancer   172.21.43.244    <pending>     3000:31367/TCP   52m
redis-master     ClusterIP      172.21.12.43     <none>        6379/TCP         52m
redis-slave      ClusterIP      172.21.176.148   <none>        6379/TCP         52m

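View the guestbook:

We can now try out the guestbook we just created by opening it in a browser (it may take a while to show up).
- Local host: if Kubernetes is running locally, navigate to http://localhost:3000 in a browser to view the guestbook.
- Remote host:
  - To view the guestbook on a remote host, find the external IP and port of the load balancer in the EXTERNAL-IP and PORT(S) columns of the $ kubectl get services output.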
    $ export SERVICE_IP=$(kubectl get svc --namespace helm-demo guestbook-demo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ echo http://$SERVICE_IP
    http://50.23.5.136

In this case the URL is http://50.23.5.136:31367.

Note: if no external IP has been assigned, the external IP of a node can be obtained with the following command:

$ kubectl get nodes -o wide
NAME           STATUS    ROLES     AGE       VERSION        EXTERNAL-IP      OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME  
10.47.122.98   Ready     <none>    1h        v1.10.11+IKS   173.193.92.112   Ubuntu 16.04.5 LTS   4.4.0-141-generic   docker://18.6.1

 - In this case the URL is `http://173.193.92.112:31367`. Navigate to the resulting URL in a browser (for example `http://50.23.5.136:31367`). The browser should show the following:

   ![guestbook-page](https://gitee.com/lyyao09/cdn/raw/master/k8s/Helm101/guestbook-page.png)

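Conclusion

Congratulations, we have now deployed the application to Kubernetes in two different ways. This lab shows that deploying with Helm takes fewer commands and less thought than deploying with kubectl, because we point Helm at a chart path instead of at individual files. Helm's application management is what gives users this simplicity.

Lab2 Updating an Application with Helm

In Lab1 we installed the guestbook sample application with Helm and saw the advantages over kubectl. We might think we know enough about Helm now. But what about updating or modifying a chart? How do we update and modify a running application?

In this lab we look at how to update a running application after its chart changes. To illustrate this, we change the original guestbook chart as follows:

Remove the Redis slaves and use only the in-memory database
Change the Service type from LoadBalancer to NodePort

The changes themselves are contrived; the purpose of this lab is to show how to update an application with Kubernetes and with Helm. So how easy is it? Let's take a look.

Scenario 1: Updating the Application with kubectl

In this part of the lab we update the previously deployed Guestbook application using Kubernetes directly.

The first step is optional; technically it is not needed to update the running application. The reason for doing it is housekeeping: we want the files on disk to match the configuration that is currently deployed, which avoids mistakes in later updates or rollbacks. In the updated configuration we remove the Redis slaves. To make the directory match the configuration, move or archive the Redis slave files, or simply delete them from the folder: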

cd guestbook/v1
rm redis-slave-service.yaml
rm redis-slave-deployment.yaml

Note: if needed, these files can be restored later with the git checkout command.

Delete the Redis slave Service and Pods:

$ kubectl delete svc redis-slave --namespace default
service "redis-slave" deleted
$ kubectl delete deployment redis-slave --namespace default
deployment.extensions "redis-slave" deleted

Update the Guestbook Service YAML from type LoadBalancer to NodePort:

sed -i.bak 's/LoadBalancer/NodePort/g' guestbook-service.yaml

Delete the running Guestbook Service:

kubectl delete svc guestbook --namespace default

Recreate the Service with type NodePort:

kubectl create -f guestbook-service.yaml

Check the update with the following command:

$ kubectl get all --namespace default
NAME                                READY     STATUS    RESTARTS   AGE
pod/guestbook-v1-7fc76dc46-9r4s7    1/1       Running   0          1h
pod/guestbook-v1-7fc76dc46-hspnk    1/1       Running   0          1h
pod/guestbook-v1-7fc76dc46-sxzkt    1/1       Running   0          1h
pod/redis-master-5d8b66464f-pvbl9   1/1       Running   0          1h

NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/guestbook      NodePort    172.21.45.29    <none>        3000:31989/TCP   31s
service/kubernetes     ClusterIP   172.21.0.1      <none>        443/TCP          9d
service/redis-master   ClusterIP   172.21.232.61   <none>        6379/TCP         1h

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/guestbook-v1     3/3     3            3           1h
deployment.apps/redis-master     1/1     1            1           1h

NAME                                      DESIRED   CURRENT   READY     AGE
replicaset.apps/guestbook-v1-7fc76dc46    3         3         3         1h
replicaset.apps/redis-master-5d8b66464f   1         1         1         1h

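Note: the Service type has changed (to NodePort) and a new port has been assigned to the guestbook Service (31989 in this output). All redis-slave resources have been removed.

Get the public IP of a node and access the application again:

kubectl get nodes -o wide

Scenario 2: Updating the Application with Helm

In this section we update the previously deployed guestbook-demo application with Helm.

Before starting, let's take a few minutes to see how Helm simplifies the process compared with using Kubernetes directly. Helm's template language gives charts great flexibility and power, and takes the complexity away from chart users. In the guestbook example we use the following template features:

Values: an object that provides access to the values passed into the chart. For example, guestbook-service references .Values.service.type for the Service type, which makes it possible to set the Service type during an upgrade or install.
Control structures: also called "actions" in templates, control structures let the template author control the flow of what is generated. An example is redis-slave-service, which contains the line {{- if .Values.redis.slaveEnabled -}}. This line lets us enable or disable the Redis slaves during an upgrade or install.

The complete redis-slave-service.yaml below shows how the file becomes redundant when the slaveEnabled flag is disabled, and how the port value is set. The other chart files contain more examples of template features.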

{{- if .Values.redis.slaveEnabled -}}
apiVersion: v1
kind: Service
metadata:
  name: redis-slave
  labels:
    app: redis
    role: slave
spec:
  ports:
  - port: {{ .Values.redis.port }}
    targetPort: redis-server	
  selector:
    app: redis
    role: slave
{{- end }}

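List the current releases:

helm list -n helm-demo

Note that we specified the namespace. If it is not specified, the current namespace context is used. We should see output similar to the following: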

$ helm list -n helm-demo
NAME           NAMESPACE REVISION  UPDATED                                 STATUS    CHART            APP VERSION
guestbook-demo helm-demo 1         2020-02-24 18:08:02.017401264 +0000 UTC deployed  guestbook-0.2.0

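The list command shows the deployed charts (releases) along with information such as the chart version, the namespace, and the number of updates (revisions).

Now update the application: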

$ cd helm101/charts

$ helm upgrade guestbook-demo ./guestbook --set redis.slaveEnabled=false,service.type=NodePort --namespace helm-demo
Release "guestbook-demo" has been upgraded. Happy Helming!
...

helm upgrade takes the existing release and upgrades it according to the information provided. We should see output similar to the following:

$ helm upgrade guestbook-demo ./guestbook --set redis.slaveEnabled=false,service.type=NodePort --namespace helm-demo
Release "guestbook-demo" has been upgraded. Happy Helming!
NAME: guestbook-demo
LAST DEPLOYED: Tue Feb 25 14:23:27 2020
NAMESPACE: helm-demo
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
1. Get the application URL by running these commands:
  export NODE_PORT=$(kubectl get --namespace helm-demo -o jsonpath="{.spec.ports[0].nodePort}" services guestbook-demo)
  export NODE_IP=$(kubectl get nodes --namespace helm-demo -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT

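The upgrade command upgrades the application to the specified version of the chart, removes the redis-slave resources, and updates the application's service.type to NodePort.

Check the result with kubectl get all --namespace helm-demo: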

$ kubectl get all --namespace helm-demo
NAME                                  READY   STATUS    RESTARTS   AGE
pod/guestbook-demo-6c9cf8b9-dhqk9     1/1     Running   0          20h
pod/guestbook-demo-6c9cf8b9-zddn2     1/1     Running   0          20h
pod/redis-master-5d8b66464f-g7sh6     1/1     Running   0          20h

NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
service/guestbook-demo   NodePort    172.21.43.244    <none>        3000:31202/TCP   20h
service/redis-master     ClusterIP   172.21.12.43     <none>        6379/TCP         20h

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/guestbook-demo   2/2     2            2           20h
deployment.apps/redis-master     1/1     1            1           20h

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/guestbook-demo-6c9cf8b9     2         2         2       20h
replicaset.apps/redis-master-5d8b66464f     1         1         1       20h

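Note: the Service type has changed (to NodePort) and a new port has been assigned to the guestbook Service (31202 in this output). All redis-slave resources have been removed.

When we check the Helm release with helm list -n helm-demo, the revision and the date have been updated: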

$ helm list -n helm-demo
NAME            NAMESPACE REVISION  UPDATED                                 STATUS    CHART            APP VERSION
guestbook-demo  helm-demo 2         2020-02-25 14:23:27.06732381 +0000 UTC  deployed  guestbook-0.2.0

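Get the public IP of a node and access the application again:

kubectl get nodes -o wide

Conclusion

Congratulations, the application has been updated! Helm needs no manual changes to resources, which makes upgrading very easy. All configuration can be set on the fly on the command line or through an override file. This is possible because logic was added to the template files that enables or disables a feature depending on a flag.

Lab 3. Keeping Track of the Deployed Application

Suppose we have deployed different releases of the application (that is, upgraded a running application). How do we keep track of the versions, and how do we roll back?

Scenario 1: Revision Management with Kubernetes

In this part of the lab we would like to illustrate revision management of the guestbook using Kubernetes directly, but we cannot. Kubernetes has no built-in revision management for an application as a whole; keeping track of the system and of every update or change made to it is our responsibility. With Helm, however, revision management is available.

Scenario 2: Revision Management with Helm

In this part of the lab we use Helm to illustrate revision management of the deployed application guestbook-demo.

With Helm, every install, upgrade, or rollback increments the revision number by 1. The first revision number is always 1. Helm keeps release metadata in Secrets (the default) or ConfigMaps stored in the Kubernetes cluster. Every time the release changes, the new revision is appended to the existing data. This is what allows Helm to roll back to a previous release.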
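
To see where this revision history lives, it helps to know how Helm 3 names the Secrets it stores releases in. The sketch below only builds such a name; the helper `release_secret_name` is made up for illustration, while the `sh.helm.release.v1.<release>.v<revision>` pattern and the `owner=helm` label are what Helm 3 uses with its default Secret storage backend:

```shell
# Name of the Secret in which Helm 3 stores one revision of a release.
# $1: release name, $2: revision number.
release_secret_name() {
  local release="$1" revision="$2"
  printf 'sh.helm.release.v1.%s.v%s\n' "$release" "$revision"
}

release_secret_name guestbook-demo 2

# With cluster access, the stored revisions of the release can be listed with:
#   kubectl get secrets -n helm-demo -l owner=helm,name=guestbook-demo
```

Each install, upgrade, or rollback adds one more Secret of this form, which is the data helm history reads.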

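Let's see how this works in practice.

Check the number of deployments:

We should see output similar to the following, because after the initial install in Lab 1 we performed an upgrade in Lab 2.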

$ helm history guestbook-demo -n helm-demo
REVISION    UPDATED                     STATUS      CHART           APP VERSION DESCRIPTION
1           Mon Feb 24 18:08:02 2020    superseded  guestbook-0.2.0             Install complete
2           Tue Feb 25 14:23:27 2020    deployed    guestbook-0.2.0             Upgrade complete

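Roll back to the previous release:

In this rollback, Helm looks at the changes made when upgrading from revision 1 to revision 2. This information lets it call the Kubernetes API server to update the deployed application according to the initial deployment, in other words with the Redis slaves and a load balancer.

$ helm rollback guestbook-demo 1 -n helm-demo
Rollback was a success! Happy Helming!

Check the history again:

We should see output similar to the following: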

$ helm history guestbook-demo -n helm-demo
REVISION    UPDATED                     STATUS      CHART           APP VERSION DESCRIPTION
1           Mon Feb 24 18:08:02 2020    superseded  guestbook-0.2.0             Install complete
2           Tue Feb 25 14:23:27 2020    superseded  guestbook-0.2.0             Upgrade complete
3           Tue Feb 25 14:53:45 2020    deployed    guestbook-0.2.0             Rollback to 1

Check the result of the rollback:

$ kubectl get all --namespace helm-demo
NAME                                  READY   STATUS    RESTARTS   AGE
pod/guestbook-demo-6c9cf8b9-dhqk9     1/1     Running   0          20h
pod/guestbook-demo-6c9cf8b9-zddn      1/1     Running   0          20h
pod/redis-master-5d8b66464f-g7sh6     1/1     Running   0          20h
pod/redis-slave-586b4c847c-tkfj5      1/1     Running   0          5m15s
pod/redis-slave-586b4c847c-xxrdn      1/1     Running   0          5m15s

NAME                     TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
service/guestbook-demo   LoadBalancer   172.21.43.244    <pending>     3000:31367/TCP   20h
service/redis-master     ClusterIP      172.21.12.43     <none>        6379/TCP         20h
service/redis-slave      ClusterIP      172.21.232.16    <none>        6379/TCP         5m15s

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/guestbook-demo   2/2     2            2           20h
deployment.apps/redis-master     1/1     1            1           20h
deployment.apps/redis-slave      2/2     2            2           5m15s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/guestbook-demo-26c9cf8b9    2         2         2       20h
replicaset.apps/redis-master-5d8b66464f     1         1         1       20h
replicaset.apps/redis-slave-586b4c847c      2         2         2       5m15s

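The output shows that the application Service is of type LoadBalancer again and that the Redis master/slave deployments are back. This is a complete rollback of the upgrade from Lab 2.

Conclusion

This lab shows that Helm handles revision management well, while Kubernetes has no such capability built in. We might wonder why helm rollback is needed, since running helm upgrade again could also take us back to the old version. That is a good question. Technically we should end up with the same resources (with the same parameters) deployed. The benefit of helm rollback, however, is that Helm manages (that is, remembers) all the variations and parameters of the previous helm install/upgrade. Rolling back through helm upgrade requires us to keep track manually of how the previous commands were run, which is tedious and error-prone. Letting Helm manage all of this is easier, safer, and more reliable: all we need to do is tell it which previous revision to go back to, and it does the rest.

Lab 4. Sharing Helm Charts

A key aspect of providing an application is sharing it with others. Sharing can be direct (by users or in a CI/CD pipeline) or as a dependency of other charts. If people cannot find your application, they cannot use it.

One way to share is through a chart repository, a location where packaged charts can be stored and shared. Because chart repositories are specific to Helm, we only look at how Helm charts are used and stored.

Using Charts from a Public Repository

Helm charts can be used from remote repositories or from a local environment/repository. Remote repositories can be public, such as Bitnami Charts or IBM Helm Charts, or hosted, for example on Google Cloud Storage or GitHub. For more details, see the Helm Chart Repository Guide. We can learn more about the structure of a chart repository by inspecting the chart index file in this lab.

In this part of the lab we show how to install the guestbook chart from the Helm101 repository.

Check the repositories configured on the system:

$ helm repo list
Error: no repositories to show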

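Note: Helm v3 ships without any chart repository configured by default; we are expected to add the repositories for the charts we want to use. Helm Hub provides a central place to search publicly available, distributed charts. With Helm Hub we can find the chart we need and then add its repository to the local repository list. The Helm chart repository, like Helm v2, is in "maintenance mode" and will be deprecated on November 13, 2020. For more details, see the project status.

Add the helm101 repository:

$ helm repo add helm101 https://ibm.github.io/helm101/
"helm101" has been added to your repositories

The repository can also be searched for charts with the following command: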

$ helm search repo helm101
NAME              CHART VERSION  APP VERSION DESCRIPTION
helm101/guestbook 0.2.1                      A Helm chart to deploy Guestbook three tier web...

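Install the chart:

As mentioned, we install the guestbook chart from the Helm101 repository. Once the repository has been added to our local repository list, we can refer to the chart as repo name/chart name (that is, helm101/guestbook). To see this in action, install the application into a new namespace called repo-demo:

$ kubectl create namespace repo-demo
$ helm install guestbook-demo helm101/guestbook --namespace repo-demo

We should see output similar to the following:

$ helm install guestbook-demo helm101/guestbook --namespace repo-demo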
NAME: guestbook-demo
LAST DEPLOYED: Tue Feb 25 15:40:17 2020
NAMESPACE: repo-demo
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the application URL by running these commands:
  NOTE: It may take a few minutes for the LoadBalancer IP to be available.
        You can watch the status of by running 'kubectl get svc -w guestbook-demo --namespace repo-demo'
  export SERVICE_IP=$(kubectl get svc --namespace repo-demo guestbook-demo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  echo http://$SERVICE_IP:3000

Check that the release was deployed as expected:

$ helm list -n repo-demo
NAME           NAMESPACE   REVISION UPDATED                                   STATUS   CHART            APP VERSION
guestbook-demo repo-demo   1        2020-02-25 15:40:17.627745329 +0000 UTC   deployed guestbook-0.2.1

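Conclusion

This lab gave a brief introduction to Helm repositories and showed how to install a chart from one. Being able to share charts makes them easier to use.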

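How to Debug Tests of a Maven Project in IntelliJ IDEA

Posted 2021-01-05, updated 2024-12-15, category tools, reading time ≈ 1 min

What Is Debugging

Debugging is done to find and fix errors in code. It is an important step toward writing bug-free code, and bug-free code makes for reliable software.

This post therefore explains, in simple steps, how to debug the tests of a Maven project in IntelliJ IDEA.

Debugging Tests

Step 1 :

Debugging tests relies on the Maven Surefire plugin. The commands used below were run on Ubuntu.

First, set a breakpoint on the line of code to be debugged. To do so, click in the gutter to the left of the line in the code editor; a red dot appears, and the test will pause there during debugging.

Step 2 :

Go to the directory that contains the integration tests of the Maven project and type the following commands on the command line:

cd <path-to-the-directory-containing-your-maven-project's-integrationtests>
mvn clean install -Dmaven.surefire.debug

The tests pause automatically and wait for a remote debugger on port 5005 (the default port). A message on the command line says that it is listening on port 5005:

Listening for transport dt_socket at address: 5005

To use a different port, pass a more detailed value to the command above:

mvn clean install -Dmaven.surefire.debug="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000 -Xnoagent -Djava.compiler=NONE"

This command listens on port 8000 instead of 5005.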
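
On current JVMs the -Xdebug -Xrunjdwp form is a legacy spelling; the same debug agent is usually configured with -agentlib:jdwp. A minimal sketch that builds the option string for an arbitrary port (the helper name `surefire_debug_opts` is an assumption made for illustration, and the Maven command is only printed, not run):

```shell
# Build the value for -Dmaven.surefire.debug for a given port,
# using the -agentlib:jdwp syntax. $1: port (defaults to 5005).
surefire_debug_opts() {
  local port="${1:-5005}"
  printf -- '-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=%s\n' "$port"
}

# Print the Maven command line for port 8000.
echo "mvn clean install -Dmaven.surefire.debug=\"$(surefire_debug_opts 8000)\""
```

The port passed here has to match the port entered in the IDE's remote debug configuration.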

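Step 3 :

When running the debugger for the first time, the Debug configuration has to be edited in IntelliJ IDEA. If a configuration with the remote debugger port set to 5005 already exists, it does not need to be edited again.

The Debug configuration can be edited as follows:

In the IDE, go to "Run –> Edit Configurations…"
In the dialog that appears, click the "+" sign in the top-left corner
Find the "Remote" option in the drop-down list
In the next window, enter the port in the port field
Click "Apply" and then "OK".

Step 4 :

The IDE can now attach to the running tests.

Go to Run –> Debug…
Select the configuration specified earlier

The tests are now attached to the remote debugger. That is all that needs to be done.

The tests pause at the breakpoints set earlier. While the tests run, the details of incoming and outgoing requests can be inspected in the IDE. Breakpoints can also be removed one by one by clicking them, and the program can be resumed from the IDE after each pause.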