misterli's Blog.

Monitoring Kubernetes events with Sloop

2022/05/09

Introduction

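Sloop monitors Kubernetes events, records the history of events and resource state changes, and provides visualizations to help debug past events.

Key features:

  1. Lets you find and inspect resources that no longer exist (for example, discover the pods of a previous deployment).
  2. Provides timeline displays that show the rollouts of related resources during updates to Deployments, ReplicaSets, and StatefulSets.
  3. Helps debug transient and intermittent errors.
  4. Lets you see how a Kubernetes application changes over time.
  5. Runs as a self-contained service with no dependency on distributed storage.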

Architecture

image-20220509180133422

Installation and usage

Docker installation

We can install Sloop from the official image. Sloop stores its data files in the container's /data directory.

docker run  -it -p 8080:8080 -v ~/.kube/config:/kube/config  -v /data:/data -e KUBECONFIG=/kube/config sloopimage/sloop

Open http://localhost:8080 to reach the web UI.

image-20220509181035369
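
The container can take a moment before the UI answers. As a minimal sketch, assuming curl is installed, a small polling helper can wait for the port published by the docker run command above. The function name wait_for_url and its retry defaults are hypothetical and not part of Sloop.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each probe.
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (not executed here):
#   wait_for_url http://localhost:8080 && echo "Sloop UI is up"
```

The helper returns a non-zero status when the UI never responds, so it can gate later steps in a script.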

In the sidebar we can choose the time range, namespace, and resource kind to view, and filter by keyword.

image-20220509181609251

The detail page shows the details of an event.

image-20220509181811970

We can also click details on the page to see the details of the resource object.

image-20220509190133986

We can also click the debug menu at the top of the page to open the debug page and view metrics.

image-20220509190207768

image-20220509190217838

We can also configure the default view shown when the UI opens. Sloop provides the following options:

[root@dev-tools sloop]# docker run --rm -it -p 8080:8080 -v ~/.kube/config:/kube/config  -e KUBECONFIG=/kube/config sloop  sloop -h
Usage of configFileOnly:
-alsologtostderr
log to standard error as well as files
-apiserver-host string
Kubernetes API server endpoint
-badger-detail-log-enabled
Turns on detailed logging of BadgerDB
-badger-discard-ratio float
Badger value log GC uses this value to decide if it wants to compact a vlog file. The lower the value of discardRatio the higher the number of !badger!move keys. And thus more the number of !badger!move keys, the size on disk keeps on increasing over time.
-badger-enable-event-logging
Turns on badger event logging
-badger-keep-l0-in-memory
Keeps all level 0 tables in memory for faster writes and compactions
-badger-level-one-size int
The maximum total size for Level 1. 0 = use badger default
-badger-level-size-multiplier int
The ratio between the maximum sizes of contiguous levels in the LSM. 0 = use badger default
-badger-max-table-size int
Max LSM table size in bytes. 0 = use badger default
-badger-number-of-compactors int
Number of compactors for badger
-badger-number-of-level-zero-tables int
Number of level zero tables for badger
-badger-number-of-zero-tables-stall int
Number of Level 0 tables that once reached causes the DB to stall until compaction succeeds
-badger-sync-writes
Sync Writes ensures writes are synced to disk if set to true
-badger-use-lsm-only-options
Sets a higher valueThreshold so values would be collocated with LSM tree reducing vlog disk usage
-badger-vlog-file-size int
Max size in bytes per value log file. 0 = use badger default
-badger-vlog-fileIO-mapping
Indicates which file loading mode should be used for the value log data, in memory constrained environments the value is recommended to be true
-badger-vlog-gc-freq duration
Frequency of running badger's ValueLogGC
-badger-vlog-max-entries uint
Max number of entries per value log files. 0 = use badger default
-badger-vlog-truncate
Truncate value log if badger db offset is different from badger db size
-bind-address string
Web server bind ip address.
-cleanup-frequency duration
Frequency between subsequent runs for the database cleanup
-config string
Path to a yaml or json config file
-context string
Use a specific kubernetes context
-crd-refresh-interval duration
Frequency between CRD Informer refresh
-default-kind string
Default UX filter kind
-default-lookback string
Default UX filter lookback
-default-namespace string
Default UX filter namespace
-deletion-batch-size int
Size of batch for deletion
-disable-kube-watch
Turn off kubernetes watch
-disable-store-manager
Turn off store manager which is to clean up database
-display-context string
Use this to override the display context. When running in k8s the context is empty string. This lets you override that (mainly useful if you are running many copies of sloop on different clusters)
-enable-delete-keys
Use delete prefixes instead of dropPrefix for GC
-gc-threshold float
Threshold for GC to start garbage collecting
-keep-minor-node-updates
Keep all node updates even if change is only condition timestamps
-kube-watch-resync-interval duration
OPTIONAL: Kubernetes watch resync interval
-log_backtrace_at string
when logging hits line file:N, emit a stack trace
-logtostderr
log to standard error instead of files
-max-disk-mb int
Max disk storage in MB
-max-look-back duration
Max history data to keep
-playback-file string
Read watch data from a playback file
-port int
Web server port
-record-file string
Record watch data to a playback file
-restore-database-file string
Restore database from backup file into current context.
-stderrthreshold int
logs at or above this threshold go to stderr
-store-root string
Path to store history data
-use-mock-badger
Use a fake in-memory mock of badger
-v int
log level for V logs
-vmodule string
comma-separated list of pattern=N settings for file-filtered logging
-watch-crds
Watch for activity for CRDs
-web-files-path string
Path to web files
Failed to pre-parse flags looking for config file: flag: help requested
ERROR: logging before flag.Parse: I0509 10:51:23.862730 1 config.go:256] Default config set
Usage of sloop:
-alsologtostderr
log to standard error as well as files
-apiserver-host string
Kubernetes API server endpoint
-badger-detail-log-enabled
Turns on detailed logging of BadgerDB
-badger-discard-ratio float
Badger value log GC uses this value to decide if it wants to compact a vlog file. The lower the value of discardRatio the higher the number of !badger!move keys. And thus more the number of !badger!move keys, the size on disk keeps on increasing over time. (default 0.99)
-badger-enable-event-logging
Turns on badger event logging
-badger-keep-l0-in-memory
Keeps all level 0 tables in memory for faster writes and compactions (default true)
-badger-level-one-size int
The maximum total size for Level 1. 0 = use badger default
-badger-level-size-multiplier int
The ratio between the maximum sizes of contiguous levels in the LSM. 0 = use badger default
-badger-max-table-size int
Max LSM table size in bytes. 0 = use badger default
-badger-number-of-compactors int
Number of compactors for badger
-badger-number-of-level-zero-tables int
Number of level zero tables for badger
-badger-number-of-zero-tables-stall int
Number of Level 0 tables that once reached causes the DB to stall until compaction succeeds
-badger-sync-writes
Sync Writes ensures writes are synced to disk if set to true (default true)
-badger-use-lsm-only-options
Sets a higher valueThreshold so values would be collocated with LSM tree reducing vlog disk usage (default true)
-badger-vlog-file-size int
Max size in bytes per value log file. 0 = use badger default
-badger-vlog-fileIO-mapping
Indicates which file loading mode should be used for the value log data, in memory constrained environments the value is recommended to be true
-badger-vlog-gc-freq duration
Frequency of running badger's ValueLogGC (default 1m0s)
-badger-vlog-max-entries uint
Max number of entries per value log files. 0 = use badger default (default 200000)
-badger-vlog-truncate
Truncate value log if badger db offset is different from badger db size (default true)
-bind-address string
Web server bind ip address.
-cleanup-frequency duration
Frequency between subsequent runs for the database cleanup (default 30m0s)
-config string
Path to a yaml or json config file
-context string
Use a specific kubernetes context
-cpuprofile string
write profile to file
-crd-refresh-interval duration
Frequency between CRD Informer refresh (default 5m0s)
-default-kind string
Default UX filter kind (default "_all")
-default-lookback string
Default UX filter lookback (default "1h")
-default-namespace string
Default UX filter namespace (default "default")
-deletion-batch-size int
Size of batch for deletion (default 1000)
-disable-kube-watch
Turn off kubernetes watch
-disable-store-manager
Turn off store manager which is to clean up database
-display-context string
Use this to override the display context. When running in k8s the context is empty string. This lets you override that (mainly useful if you are running many copies of sloop on different clusters)
-enable-delete-keys
Use delete prefixes instead of dropPrefix for GC
-gc-threshold float
Threshold for GC to start garbage collecting (default 0.8)
-keep-minor-node-updates
Keep all node updates even if change is only condition timestamps
-kube-watch-resync-interval duration
OPTIONAL: Kubernetes watch resync interval (default 30m0s)
-log_backtrace_at value
when logging hits line file:N, emit a stack trace
-log_dir string
If non-empty, write log files in this directory
-logtostderr
log to standard error instead of files
-max-disk-mb int
Max disk storage in MB (default 32768)
-max-look-back duration
Max history data to keep (default 336h0m0s)
-playback-file string
Read watch data from a playback file
-port int
Web server port (default 8080)
-record-file string
Record watch data to a playback file
-restore-database-file string
Restore database from backup file into current context.
-stderrthreshold value
logs at or above this threshold go to stderr
-store-root string
Path to store history data (default "./data")
-use-mock-badger
Use a fake in-memory mock of badger
-v value
log level for V logs
-vmodule value
comma-separated list of pattern=N settings for file-filtered logging
-watch-crds
Watch for activity for CRDs (default true)
-web-files-path string
Path to web files (default "./pkg/sloop/webserver/webfiles")

Change the default namespace, resource kind, and lookback window:

docker run --rm -it -p 8080:8080 -v ~/.kube/config:/kube/config  -e KUBECONFIG=/kube/config sloop  sloop -default-namespace=kube-system -default-kind=pod  -default-lookback=2h

image-20220509185647596
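
The three flags used above come from the help output, where their defaults are listed as "default", "_all", and "1h". As a rough sketch, they can be assembled by a small helper so the same values are reused across runs. The function name sloop_ui_flags is hypothetical; only the flag names and defaults are taken from the help text.

```shell
# Hypothetical helper: print the UI-default flags for Sloop.
# Arguments: namespace, kind, lookback. Each falls back to the default
# shown in the `sloop -h` output above when omitted.
sloop_ui_flags() {
  ns=${1:-default}
  kind=${2:-_all}
  lookback=${3:-1h}
  printf '%s %s %s\n' \
    "-default-namespace=$ns" \
    "-default-kind=$kind" \
    "-default-lookback=$lookback"
}

# Example (not executed here), mirroring the command above:
#   docker run --rm -it -p 8080:8080 -v ~/.kube/config:/kube/config \
#     -e KUBECONFIG=/kube/config sloop sloop $(sloop_ui_flags kube-system pod 2h)
```

Leaving the command substitution unquoted is deliberate here, so the shell splits the output into three separate arguments.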

Install from source

mkdir -p $GOPATH/src/github.com/salesforce
cd $GOPATH/src/github.com/salesforce
git clone https://github.com/salesforce/sloop.git
cd sloop
go env -w GO111MODULE=auto
make
$GOPATH/bin/sloop
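
The steps above assume that git, go, and make are on the PATH and that $GOPATH is set. If $GOPATH is empty, the mkdir line would create /src/github.com/salesforce at the filesystem root. A minimal pre-flight sketch follows; the helper name missing_tools is hypothetical and not part of Sloop.

```shell
# Hypothetical helper: print every named tool that is not on the PATH.
# Prints nothing when all tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example (not executed here):
#   missing_tools git go make
#   export GOPATH="$(go env GOPATH)"   # use Go's effective GOPATH if the variable is unset
```

An empty result means the build can start; otherwise the output names what still needs to be installed.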

Helm installation

git clone https://github.com/salesforce/sloop.git
cd sloop
cd helm/sloop
kubectl create namespace sloop
helm template . --namespace sloop > sloop-test.yaml
kubectl -n sloop apply -f sloop-test.yaml
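
Because helm template only renders the chart to a file, it can be worth reviewing sloop-test.yaml before applying it. As a rough sketch, the helper below counts the resource kinds in a rendered manifest. The name manifest_kinds is hypothetical, and it assumes each resource declares kind: at the start of a line, which is how helm template output is normally laid out.

```shell
# Hypothetical helper: list each top-level resource kind in a manifest
# together with how many times it occurs, one "Kind=count" pair per line.
manifest_kinds() {
  awk '/^kind:/ {print $2}' "$1" | sort | uniq -c | awk '{print $2 "=" $1}'
}

# Example (not executed here):
#   manifest_kinds sloop-test.yaml
```

Running it between the helm template and kubectl apply steps gives a quick summary of what will be created in the sloop namespace.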

Reference: https://github.com/salesforce/sloop.git
