0%

K8S问题排查-没有Endpoint的Service请求Reject失效问题

问题背景

客户的防火墙抓到了没有EndpointService请求,从K8S角度来说,正常情况下不应该存在这种现象的,因为没有EndpointService请求会被iptables规则reject掉才对。

分析过程

先本地环境复现,创建一个没有后端的服务,例如grafana-service111

1
2
3
4
5
6
7
8
9
10
11
 [root@node01 ~]# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d
kube-system grafana-service ClusterIP 10.96.78.163 <none> 3000/TCP 2d
kube-system grafana-service111 ClusterIP 10.96.52.101 <none> 3000/TCP 13s

[root@node01 ~]# kubectl get ep -A
NAMESPACE NAME ENDPOINTS AGE
default kubernetes 10.10.72.15:6443 2d
kube-system grafana-service 10.78.104.6:3000,10.78.135.5:3000 2d
kube-system grafana-service111 <none> 18s

进入一个业务Pod,并请求grafana-service111,结果请求卡住并超时终止:

1
2
3
4
5
6
7
[root@node01 ~]# kubectl exec -it -n kube-system   influxdb-rs1-5bdc67f4cb-lnfgt bash
root@influxdb-rs1-5bdc67f4cb-lnfgt:/# time curl http://10.96.52.101:3000
curl: (7) Failed to connect to 10.96.52.101 port 3000: Connection timed out

real 2m7.307s
user 0m0.006s
sys 0m0.008s

查看grafana-service111iptables规则,发现有reject规则,但从上面的实测现象看,应该是没有生效:

1
2
[root@node01 ~]# iptables-save |grep 10.96.52.101
-A KUBE-SERVICES -d 10.96.52.101/32 -p tcp -m comment --comment "kube-system/grafana-service111: has no endpoints" -m tcp --dport 3000 -j REJECT --reject-with icmp-port-unreachable

在业务Pod容器网卡上抓包,没有发现响应报文(不符合预期):

1
2
3
4
[root@node01 ~]# tcpdump -n -i calie2568ca85e4 host 10.96.52.101
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on calie2568ca85e4, link-type EN10MB (Ethernet), capture size 262144 bytes
20:31:34.647286 IP 10.78.166.136.39230 > 10.96.52.101.hbci: Flags [S], seq 1890821953, win 29200, options [mss 1460,sackOK,TS val 792301056 ecr 0,nop,wscale 7], length 0

在节点网卡上抓包,存在服务请求包(不符合预期):

1
2
3
4
5
6
[root@node01 ~]# tcpdump -n -i eth0 host 10.96.52.101
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:33:36.994881 IP 10.10.72.10.41234 > 10.96.52.101.hbci: Flags [S], seq 3530065013, win 29200, options [mss 1460,sackOK,TS val 792423403 ecr 0,nop,wscale 7], length 0
20:33:37.995298 IP 10.10.72.10.41234 > 10.96.52.101.hbci: Flags [S], seq 3530065013, win 29200, options [mss 1460,sackOK,TS val 792424404 ecr 0,nop,wscale 7], length 0
20:33:39.999285 IP 10.10.72.10.41234 > 10.96.52.101.hbci: Flags [S], seq 3530065013, win 29200, options [mss 1460,sackOK,TS val 792426408 ecr 0,nop,wscale 7], length 0

既然reject规则存在,初步怀疑可能影响该规则的组件有两个:

  1. kube-proxy
  2. calico-node

基于上一篇《使用Kubeasz一键部署K8S集群》,在最新的K8S集群上做相同的测试,发现不存在该问题,说明该问题在新版本已经修复了。分别在K8SCalicoissue上查询相关问题,最后发现是Calicobug,相关issue见参考资料[1, 2],修复记录见参考资料[3]。

下面是新老版本的Calico处理cali-FORWARD链的差异点:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
有问题的环境:
[root@node4 ~]# iptables -t filter -S cali-FORWARD
-N cali-FORWARD
-A cali-FORWARD -m comment --comment "cali:vjrMJCRpqwy5oRoX" -j MARK --set-xmark 0x0/0xe0000
-A cali-FORWARD -m comment --comment "cali:A_sPAO0mcxbT9mOV" -m mark --mark 0x0/0x10000 -j cali-from-hep-forward
-A cali-FORWARD -i cali+ -m comment --comment "cali:8ZoYfO5HKXWbB3pk" -j cali-from-wl-dispatch
-A cali-FORWARD -o cali+ -m comment --comment "cali:jdEuaPBe14V2hutn" -j cali-to-wl-dispatch
-A cali-FORWARD -m comment --comment "cali:12bc6HljsMKsmfr-" -j cali-to-hep-forward
-A cali-FORWARD -m comment --comment "cali:MH9kMp5aNICL-Olv" -m comment --comment "Policy explicitly accepted packet." -m mark --mark 0x10000/0x10000 -j ACCEPT
//问题在这最后这一条规则,新版本的calico把这条规则移到了FORWARD链

正常的环境:
[root@node01 ~]# iptables -t filter -S cali-FORWARD
-N cali-FORWARD
-A cali-FORWARD -m comment --comment "cali:vjrMJCRpqwy5oRoX" -j MARK --set-xmark 0x0/0xe0000
-A cali-FORWARD -m comment --comment "cali:A_sPAO0mcxbT9mOV" -m mark --mark 0x0/0x10000 -j cali-from-hep-forward
-A cali-FORWARD -i cali+ -m comment --comment "cali:8ZoYfO5HKXWbB3pk" -j cali-from-wl-dispatch
-A cali-FORWARD -o cali+ -m comment --comment "cali:jdEuaPBe14V2hutn" -j cali-to-wl-dispatch
-A cali-FORWARD -m comment --comment "cali:12bc6HljsMKsmfr-" -j cali-to-hep-forward
-A cali-FORWARD -m comment --comment "cali:NOSxoaGx8OIstr1z" -j cali-cidr-block

下面是在最新的K8S集群上做相同的测试记录,可以跟异常环境做对比。

模拟一个业务请求pod

1
2
3
4
5
[root@node01 home]# kubectl run busybox --image=busybox-curl:v1.0 --image-pull-policy=IfNotPresent -- sleep 300000
pod/busybox created

[root@node01 home]# kubectl get pod -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE default busybox 1/1 Running 0 14h 10.78.153.73 10.10.11.49

模拟一个业务响应服务metrics-server111,并且该服务无后端endpoint

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[root@node01 home]# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.68.0.1 <none> 443/TCP 18h
kube-system dashboard-metrics-scraper ClusterIP 10.68.174.38 <none> 8000/TCP 17h
kube-system kube-dns ClusterIP 10.68.0.2 <none> 53/UDP,53/TCP,9153/TCP 17h
kube-system kube-dns-upstream ClusterIP 10.68.41.41 <none> 53/UDP,53/TCP 17h
kube-system kubernetes-dashboard NodePort 10.68.160.45 <none> 443:30861/TCP 17h
kube-system metrics-server ClusterIP 10.68.65.249 <none> 443/TCP 17h
kube-system metrics-server111 ClusterIP 10.68.224.53 <none> 443/TCP 14h
kube-system node-local-dns ClusterIP None <none> 9253/TCP 17h

[root@node01 ~]# kubectl get ep -A
NAMESPACE NAME ENDPOINTS AGE
default kubernetes 172.28.11.49:6443 18h
kube-system dashboard-metrics-scraper 10.78.153.68:8000 18h
kube-system kube-dns 10.78.153.67:53,10.78.153.67:53,10.78.153.67:9153 18h
kube-system kube-dns-upstream 10.78.153.67:53,10.78.153.67:53 18h
kube-system kubernetes-dashboard 10.78.153.66:8443 18h
kube-system metrics-server 10.78.153.65:4443 18h
kube-system metrics-server111 <none> 15h
kube-system node-local-dns 172.28.11.49:9253 18h

进入业务请求pod,做curl测试,请求立刻被拒绝(符合预期):

1
2
3
[root@node01 02-k8s]# kubectl exec -it busybox bash
/ # curl -i -k https://10.68.224.53:443
curl: (7) Failed to connect to 10.68.224.53 port 443 after 2 ms: Connection refused

tcpdump抓取容器网卡报文,出现tcp port https unreachable符合预期):

1
2
3
tcpdump -n -i cali12d4a061371
21:54:42.697437 IP 10.78.153.73.41606 > 10.68.224.53.https: Flags [S], seq 3510100476, win 29200, options [mss 1460,sackOK,TS val 2134372616 ecr 0,nop,wscale 7], length 0
21:54:42.698804 IP 10.10.11.49> 10.78.153.73: ICMP 10.68.224.53 tcp port https unreachable, length 68

tcpdump抓取节点网卡报文,无请求从测试容器内发出集群(符合预期);

1
2
3
4
5
6
7
[root@node01 bin]# tcpdump -n -i eth0 host 10.68.224.53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
2 packets received by filter
0 packets dropped by kernel

解决方案

升级Calico,要求版本>=v3.16.0

参考资料

https://github.com/projectcalico/calico/issues/1055
https://github.com/projectcalico/calico/issues/3901
https://github.com/projectcalico/felix/pull/2424