misterli's Blog.

Notes on Resolving a Kyverno Restart Loop

2020/11/06

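An alert reported that the kyverno pod was restarting continuously. A quick check showed it had restarted more than 10,000 times, which is rather extreme.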

# lishuai @ MacBook-Pro in ~/.kube [10:22:22]
$ kubectl --kubeconfig config-test -n kyverno get pod
NAME                       READY   STATUS   RESTARTS   AGE
kyverno-6d75c9bcbc-9wrrp   0/1     Error    10739      38d

The logs were as follows:

$ kubectl --kubeconfig config-test  -n kyverno logs -f kyverno-6d75c9bcbc-9cwmr
I1106 02:30:05.596951 1 version.go:17] "msg"="Kyverno" "Version"="v1.1.5"
I1106 02:30:05.597059 1 version.go:18] "msg"="Kyverno" "BuildHash"="(HEAD/7d76a4566784d366e93a8e0023cec7a9724ad465"
I1106 02:30:05.597072 1 version.go:19] "msg"="Kyverno" "BuildTime"="2020-04-28_11:05:16AM"
I1106 02:30:05.597394 1 config.go:79] CreateClientConfig "msg"="Using in-cluster configuration"
I1106 02:30:05.599167 1 client.go:246] Client/Poll "msg"="starting registered resources sync" "period"=10000000000
I1106 02:30:05.674374 1 util.go:69] CRDInstalled "msg"="CRD found" "kind"="ClusterPolicy"
I1106 02:30:05.674959 1 util.go:69] CRDInstalled "msg"="CRD found" "kind"="ClusterPolicyViolation"
I1106 02:30:05.675425 1 util.go:69] CRDInstalled "msg"="CRD found" "kind"="PolicyViolation"
I1106 02:30:05.677486 1 dynamicconfig.go:68] ConfigData "msg"="init configuration from commandline arguments"
I1106 02:30:05.677675 1 dynamicconfig.go:170] ConfigData "msg"="Init resource filters" "filters"=[{"Kind":"Event","Namespace":"*","Name":"*"},{"Kind":"*","Namespace":"kube-system","Name":"*"},{"Kind":"*","Namespace":"kube-public","Name":"*"},{"Kind":"*","Namespace":"kube-node-lease","Name":"*"},{"Kind":"Node","Namespace":"*","Name":"*"},{"Kind":"APIService","Namespace":"*","Name":"*"},{"Kind":"TokenReview","Namespace":"*","Name":"*"},{"Kind":"SubjectAccessReview","Namespace":"*","Name":"*"},{"Kind":"*","Namespace":"kyverno","Name":"*"}]
I1106 02:30:05.684098 1 certificates.go:28] Client "msg"="Generating new key/certificate pair for TLS"
I1106 02:30:06.136691 1 certificates.go:89] Client/submitAndApproveCertificateRequest "msg"="Old certificate request is deleted"
I1106 02:30:06.144440 1 certificates.go:98] Client/submitAndApproveCertificateRequest "msg"="Certificate request created" "name"="kyverno-svc.kyverno.cert-request"
I1106 02:30:06.158198 1 certificates.go:113] Client/submitAndApproveCertificateRequest "msg"="Certificate request is approved" "name"="kyverno-svc.kyverno.cert-request"
E1106 02:30:06.179647 1 main.go:239] setup "msg"="Failed to initialize TLS key/certificate pair" "error"="Unable to save TLS pair to the cluster: name is required"

According to the log, the certificate could not be saved to the cluster.
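With a long startup log it can help to narrow the output down to the failing line. The following is a minimal sketch, assuming the klog line format seen above (severity letter followed by the MMDD date). The function names klog_errors and klog_error_field are hypothetical helpers, not part of kubectl or kyverno.

```shell
# Keep only error-level klog lines: they start with "E" followed by a
# four-digit MMDD date and a space. Reads a log stream on stdin.
klog_errors() {
  grep -E '^E[0-9]{4} '
}

# Print just the "error"="..." field of each line on stdin.
# The pattern stops at the first unescaped double quote, so escaped
# quotes (\") inside the message are kept.
klog_error_field() {
  grep -oE '"error"="([^"\\]|\\.)*"'
}

# Usage (pod name taken from the log command above):
#   kubectl --kubeconfig config-test -n kyverno logs kyverno-6d75c9bcbc-9cwmr \
#     | klog_errors | klog_error_field
```

Run against the log above, this should leave only the "Unable to save TLS pair to the cluster: name is required" error field.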

Checking the cluster showed that a certificate secret already existed:

$ kubectl --kubeconfig config-test -n kyverno get secrets
NAME                                       TYPE                                  DATA   AGE
default-token-pdbwt                        kubernetes.io/service-account-token   3      223d
kyverno-service-account-token-6c2r4        kubernetes.io/service-account-token   3      223d
kyverno-svc.kyverno.svc.kyverno-tls-pair   kubernetes.io/tls                     2      223d

The contents of the secret also looked fine:

$ kubectl --kubeconfig config-test -n kyverno get secrets kyverno-svc.kyverno.svc.kyverno-tls-pair -o yaml
apiVersion: v1
data:
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURZakNDQWtxZ0F3SUJBZ0lVYmgzSk4zeExPQ1lGWXI2aU1jY2lXS1RhNkJ3d0RRWUpLb1pJaHZjTkFRRUwKQlFBd0VqRVFNQTRHQTFVRUF4TUhhM1ZpWlMxallUQWVGdzB5TURBek1qY3dOelEzTURCYUZ3MHlNVEF6TWpjdwpOelEzTURCYU1CWXhGREFTQmdOVkJBTVRDMnQ1ZG1WeWJtOHRjM1pqTUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGCkFBT0NBUThBTUlJQkNnS0NBUUVBMXRVNnBaV2dsQzZwYnhVTG42NzhvYnJuVzlTdWdjb1BseTNPZk5ha09BRkgKdlpYWE91elRGTVA5OTVxK3J4eGNNTlpiMEU4ZWhmNVhpVDJ3YnhvcHRqSVNIRk0weEdIeStEL0dXa1daVElQeAo0K1VQRGxzKzVWZjZtekRPR1pFMzk0cTlmbHp2ZGtrYUtQaWhhS1FVNkdlVjRucnhpRy9qdFI5a1hqNW5OTGZpCjFySmVEdVg0bTRuWWFtODVQY056cWl3TTVmL2I3K3grNU80ZWZpWVUxK2VtUDZsVlFOejZFbkhVVkd3akFMR1UKOHNxdjBWL2NJbHFuR0tGa3pYRnMrOUovTzJzUXREdjd1ZWZ3YXgzZEI3RjBFWEM3Y1A2OUZQK2ZTMWxGYkZZcApPYkFpM3ZmYy9tZm5OaVJJV3JXQ1RrT3YrSkRFV0tNejArbS8xaWQ2QVFJREFRQUJvNEdyTUlHb01BNEdBMVVkCkR3RUIvd1FFQXdJRm9EQWRCZ05WSFNVRUZqQVVCZ2dyQmdFRkJRY0RBUVlJS3dZQkJRVUhBd0l3REFZRFZSMFQKQVFIL0JBSXdBREFkQmdOVkhRNEVGZ1FVUXNBL3JmK0hPRHFGZExYNEs4TWI4bGEzYUdJd1NnWURWUjBSQkVNdwpRWUlMYTNsMlpYSnVieTF6ZG1PQ0UydDVkbVZ5Ym04dGMzWmpMbXQ1ZG1WeWJtK0NGMnQ1ZG1WeWJtOHRjM1pqCkxtdDVkbVZ5Ym04dWMzWmpod1FLS3dBQk1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQWN1R1Arc2N4Wk5TSUQKRCs5UWE1bnViVEp5SS9xSVhxWUM2S3cyakRDMk1XQzJXTDl1ZXNiN1lCejd4SEVQVFFnR2lNUHpqZm9lNi9iSQpjeXBhckZVMGdRSFZQZFhVZFNSaU5qckplUm54UlhiT0pRY24rQlJpb1VaNkw0SUlZQmlIcjV4VEhHSnJ3eHJnCjNYalBSZ1pDSHdTVkJoWnd4YlArN0VIeEdLVjlyRmRacXRWZ3dObDdGNUdrRjlnc0hqRmw5eFcyMUNaZTlEa04KM1dkVW0vOW9mZ2VHSTRzalQ0cHFnb1F0cXVOM2tRVnFkc1VSQ0p3OXlwK1lrMjJ3RWZJcW93d2VSYzMybHRLZQozRVNjclhEL3hIbVJ3czY3bkYxcXJvZWFCMk51eW1jdEhZMUMrYkovdFBUWnB2eDIzWllvSExLaHl5RGsxYktBCk9yeWd1QU00Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
  tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUVwQUlCQUFLQ0FRRUExdFU2cFpXZ2xDNnBieFVMbjY3OG9icm5XOVN1Z2NvUGx5M09mTmFrT0FGSHZaWFgKT3V6VEZNUDk5NXErcnh4Y01OWmIwRThlaGY1WGlUMndieG9wdGpJU0hGTTB4R0h5K0QvR1drV1pUSVB4NCtVUApEbHMrNVZmNm16RE9HWkUzOTRxOWZsenZka2thS1BpaGFLUVU2R2VWNG5yeGlHL2p0UjlrWGo1bk5MZmkxckplCkR1WDRtNG5ZYW04NVBjTnpxaXdNNWYvYjcreCs1TzRlZmlZVTErZW1QNmxWUU56NkVuSFVWR3dqQUxHVThzcXYKMFYvY0lscW5HS0ZrelhGcys5Si9PMnNRdER2N3VlZndheDNkQjdGMEVYQzdjUDY5RlArZlMxbEZiRllwT2JBaQozdmZjL21mbk5pUklXcldDVGtPditKREVXS016MCttLzFpZDZBUUlEQVFBQkFvSUJBQ2tYOEhmci95TlpLWi9OCjdzTkV2WjVTR2g4K3Q0S3NHLzlYQzhCbGJsUW9Lb2poT0tKVTJxdUdNZlpDNjJhampoN3BZZmFlcThBRnZzakoKdkE0RWV5WVd2ZEFkT21LMk9idXl0MFpkT2MyaEQ0d0FMTGthU3hXamxwUkk2YU9LVzZKR0w2a1VMZG42Y2I2VQprSXRybDNROUhEYU9QZFZUVWNNN2xmOVJBSHpjdGhhczNaSXFmNSt0b05iRUhvQnpxTTYrN2NWVlNlVW5EN1VqCmlQNXlhY2F1bkcycDRDZkJLZFNqUnZZUG5uZmFnNGNIWkV2b3BleGo1a1pVTWlra0VLMG8zYlY5cHRqUktXVHkKUjMyTWVYMGl2M0tmdnpLVUUrK2d0TmFQalB5UGkwZEdmbkk2bUlFMHVhNks0U3RaMlNxN25oVENvRU16MDFJTQoxS2gzZ0FFQ2dZRUE3SUNoUGZrQTRrNkp5amNtQ0FWZ0ZlbjJVbFN1bUNnT2N0UWpMcmtFZ1pydHJ4MWpBV0s5CjczOE00WUpCTElyQ2ppVHZIQjM4d3pKb0FaSGxRV3lNTTE0a2JsK3FBL2gzZkJHdmFmTEZPVGNNbHpWcWV6WkMKcXhGRWdBWno3R0xmdExuOURBa0JBSW5aTG9XN21ZZ25sZmI3V1ZSVjI0N3p4VDhkNThWbmVVRUNnWUVBNkl0RAoyTFo4VktuWnB6QVVwS0VRV1VwdEJOQlF5bGpyZ2U2KzRxajJSeFhzVjVrVU43d2RER3J3RU5xSFZsNk5RemdtCjBvZlVjRHl4dkFhYVQ0cnE1K2Y3Ui9LT3BMTElOZ1FmY0dzeUdyY1NZOXM1V1MxdlVTbXpoQlZFQjVoUmlXOHAKdmZLeU1LMGowWjR1aDN5UGR2SzJicy8vcXNCei91RkFCcTVSRU1FQ2dZRUFxbGdyeTF0aWk1NU9HTnlJQkJiNwpFazJtSWI3azBxdG5YTVgzWVZ2YUp3L1VTdUU3d20vQXBwUTRUdVZtMUJKTjk5d2FiWUliNE95WmhTZjBuSjcyCmpMa3VQR0dqTDZEelR1WGVGczNKeUdBaUxYZEg3dDh5UGN6K0xjaDREcmRZc2UrVWwrcVVVakwzdjA2THhSWVEKalMrTDh0ZVB6OGl6UkVzbDJ4NlFYUUVDZ1lCVVBXYjFrWjNXbWJVRUVMSFp0Wk1UbFplS24rQTBmU1BMYk81dgpjNS9MdnBCZ1owN2dwZCtzQ08wd1hjbWJLeU5uVDJjWTZ5VzFCdmVuMG9pQitpUUFvSlB4eTFlTEtFekk3Sk5yCkNSb2NmV2RIRHpwbUtNUmpsWVMzZTNDcWc2NDk2Q3dwNkVwT3dkbnc3S21VWVRZamMrZE1tMExWMjJQcDJEVjIKZGgxZHdRS0JnUURPOHBIN1V3WVk4UDVxR1pJL3NncGYwRlBFdndLWUE1d2NiTS9HL2V6SDlnSE1XMWdPUkoxRQpLNzYxWGFJYzIrRnFLV1d2c3oyMjhKZFFlbDk2ZVZqM1BPZk1ncFZGNmVyL2NBQWxXVUlRVWQ3Qm1rK3k0dTdiCk91SU9wK1h4bDdRaEJBblIxZ0hJRDc2SmhydmFXK3UyRUNCQzRpdklsRkh2M0ZXaXR1U3NyQT09Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K
kind: Secret
metadata:
  creationTimestamp: "2020-03-27T07:51:31Z"
  name: kyverno-svc.kyverno.svc.kyverno-tls-pair
  namespace: kyverno
  resourceVersion: "234527968"
  selfLink: /api/v1/namespaces/kyverno/secrets/kyverno-svc.kyverno.svc.kyverno-tls-pair
  uid: c5b6cbee-6fff-11ea-8fe8-fa163e10b76c
type: kubernetes.io/tls

Decoding the certificate showed nothing unusual either:

[Screenshot of the decoded certificate (image-20201106104827712)]
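The decoding step can be reproduced on the command line. This is a rough sketch that assumes each data value sits on a single line, as in the kubectl output above. The helper name secret_field is hypothetical, and the original post does not say which tool was used to decode the certificate, so openssl here is only one option.

```shell
# Read a Secret manifest (YAML) on stdin and print the base64-decoded
# value of the given data key, for example tls.crt.
# Assumes "key: value" is on one line, as `kubectl get secret -o yaml` prints it.
secret_field() {
  # awk splits on whitespace, so $1 is "tls.crt:" and $2 is the base64 value.
  # Note: some older macOS versions spell the decode flag -D instead of -d.
  awk -v key="$1:" '$1 == key { print $2; exit }' | base64 -d
}

# Usage: show subject, issuer and validity dates of the stored certificate.
#   kubectl --kubeconfig config-test -n kyverno get secrets \
#     kyverno-svc.kyverno.svc.kyverno-tls-pair -o yaml \
#     | secret_field tls.crt \
#     | openssl x509 -noout -subject -issuer -dates
```

The validity dates are worth a look in particular, since an expired or not-yet-valid certificate would be an easy explanation to rule out.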

I searched the kyverno issues on GitHub and found no report of the same problem, but the following two issues pointed me in the right direction.

https://github.com/kyverno/kyverno/issues/105

https://github.com/kyverno/kyverno/issues/98

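The log says "Unable to save TLS pair to the cluster: name is required", meaning the certificate could not be saved to the cluster. My guess was that kyverno had generated a new TLS certificate, but the certificate secret created earlier still existed in the cluster, so the newly generated one could not be saved.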

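Solution: first back up the existing secret (in case the guess is wrong and it has to be restored), then delete the secret holding the current certificate and delete the pod. The replacement pod then requests and generates a new TLS certificate and saves it to the cluster.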

# lishuai @ MacBook-Pro in ~/.kube [10:40:49]
$ kubectl --kubeconfig config-test -n kyverno get secrets kyverno-svc.kyverno.svc.kyverno-tls-pair -o yaml > tls.yaml

# lishuai @ MacBook-Pro in ~/.kube [10:41:03]
$ kubectl --kubeconfig config-test -n kyverno delete secrets kyverno-svc.kyverno.svc.kyverno-tls-pair
secret "kyverno-svc.kyverno.svc.kyverno-tls-pair" deleted

# lishuai @ MacBook-Pro in ~/.kube [10:41:18]
$ kubectl --kubeconfig config-test -n kyverno delete pod kyverno-6d75c9bcbc-9cwmr
pod "kyverno-6d75c9bcbc-9cwmr" deleted

# lishuai @ MacBook-Pro in ~/.kube [10:41:35]
$ kubectl --kubeconfig config-test -n kyverno get pod
NAME                       READY   STATUS    RESTARTS   AGE
kyverno-6d75c9bcbc-sfd6r   1/1     Running   0          17s

# lishuai @ MacBook-Pro in ~/.kube [10:41:47]
$ kubectl --kubeconfig config-test -n kyverno get pod
NAME                       READY   STATUS    RESTARTS   AGE
kyverno-6d75c9bcbc-sfd6r   1/1     Running   0          19s

# lishuai @ MacBook-Pro in ~/.kube [10:41:49]
$ kubectl --kubeconfig config-test -n kyverno get secrets
NAME                                       TYPE                                  DATA   AGE
default-token-pdbwt                        kubernetes.io/service-account-token   3      223d
kyverno-service-account-token-6c2r4        kubernetes.io/service-account-token   3      223d
kyverno-svc.kyverno.svc.kyverno-tls-pair   kubernetes.io/tls                     2      10s
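Had the guess been wrong, the backup in tls.yaml would have been needed. A saved manifest still carries server-populated metadata (resourceVersion, uid, selfLink, creationTimestamp), and the API server generally rejects a create request that sets resourceVersion. The sketch below strips those lines first. The helper name strip_server_fields is hypothetical, it assumes no data key in the secret has one of those names, and this rollback was not part of the original procedure.

```shell
# Drop metadata lines that the API server fills in itself, so that a saved
# manifest can be submitted again as a new object.
# Reads YAML on stdin and writes the filtered YAML to stdout.
strip_server_fields() {
  grep -Ev '^[[:space:]]*(creationTimestamp|resourceVersion|selfLink|uid):'
}

# Usage (hypothetical rollback): remove the current secret, then recreate
# the old one from the backup taken earlier.
#   kubectl --kubeconfig config-test -n kyverno delete secrets kyverno-svc.kyverno.svc.kyverno-tls-pair
#   strip_server_fields < tls.yaml | kubectl --kubeconfig config-test -n kyverno create -f -
```

A plain line filter is enough here because each of these fields occupies exactly one line in the saved output.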

The logs also confirm that the TLS certificate was saved successfully:

I1106 02:41:43.182347 1 version.go:17] "msg"="Kyverno" "Version"="v1.1.5"
I1106 02:41:43.182439 1 version.go:18] "msg"="Kyverno" "BuildHash"="(HEAD/7d76a4566784d366e93a8e0023cec7a9724ad465"
I1106 02:41:43.182458 1 version.go:19] "msg"="Kyverno" "BuildTime"="2020-04-28_11:05:16AM"
I1106 02:41:43.182940 1 config.go:79] CreateClientConfig "msg"="Using in-cluster configuration"
I1106 02:41:43.184401 1 client.go:246] Client/Poll "msg"="starting registered resources sync" "period"=10000000000
I1106 02:41:43.258751 1 util.go:69] CRDInstalled "msg"="CRD found" "kind"="ClusterPolicy"
I1106 02:41:43.259026 1 util.go:69] CRDInstalled "msg"="CRD found" "kind"="ClusterPolicyViolation"
I1106 02:41:43.259389 1 util.go:69] CRDInstalled "msg"="CRD found" "kind"="PolicyViolation"
I1106 02:41:43.260724 1 dynamicconfig.go:68] ConfigData "msg"="init configuration from commandline arguments"
I1106 02:41:43.260889 1 dynamicconfig.go:170] ConfigData "msg"="Init resource filters" "filters"=[{"Kind":"Event","Namespace":"*","Name":"*"},{"Kind":"*","Namespace":"kube-system","Name":"*"},{"Kind":"*","Namespace":"kube-public","Name":"*"},{"Kind":"*","Namespace":"kube-node-lease","Name":"*"},{"Kind":"Node","Namespace":"*","Name":"*"},{"Kind":"APIService","Namespace":"*","Name":"*"},{"Kind":"TokenReview","Namespace":"*","Name":"*"},{"Kind":"SubjectAccessReview","Namespace":"*","Name":"*"},{"Kind":"*","Namespace":"kyverno","Name":"*"}]
E1106 02:41:43.267910 1 certificates.go:182] Client/ReadTlsPair "msg"="Failed to get secret" "error"="secrets \"kyverno-svc.kyverno.svc.kyverno-tls-pair\" not found" "name"="kyverno-svc.kyverno.svc.kyverno-tls-pair" "namespace"="kyverno"
I1106 02:41:43.267928 1 certificates.go:28] Client "msg"="Generating new key/certificate pair for TLS"
I1106 02:41:45.130681 1 certificates.go:89] Client/submitAndApproveCertificateRequest "msg"="Old certificate request is deleted"
I1106 02:41:45.137781 1 certificates.go:98] Client/submitAndApproveCertificateRequest "msg"="Certificate request created" "name"="kyverno-svc.kyverno.cert-request"
I1106 02:41:45.157524 1 certificates.go:113] Client/submitAndApproveCertificateRequest "msg"="Certificate request is approved" "name"="kyverno-svc.kyverno.cert-request"
I1106 02:41:45.181091 1 certificates.go:241] Client/WriteTlsPair "msg"="secret created" "name"="kyverno-svc.kyverno.svc.kyverno-tls-pair" "namespace"="kyverno"
I1106 02:41:45.181197 1 registration.go:252] WebhookRegistrationClient "msg"="Started cleaning up webhookconfigurations"
E1106 02:41:45.186968 1 resource.go:75] WebhookRegistrationClient "msg"="resource does not exit" "error"="mutatingwebhookconfigurations.admissionregistration.k8s.io \"kyverno-resource-mutating-webhook-cfg\" not found" "kind"="MutatingWebhookConfiguration" "name"="kyverno-resource-mutating-webhook-cfg"