Problems from setting up a k8s test environment
While preparing a test environment at work, I am recording some of the problems I ran into along the way.
Dashboard authentication
The dashboard has about one clearly useful feature that I have found so far: it is convenient for running commands inside containers : )
To install the dashboard, first fetch kubernetes-dashboard.yaml from GitHub, then add a NodePort mapping to the Service section at the end.
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 9000
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
Open port 9000 on a kube-proxy node and the page asks for token authentication. Get the token with the commands below and paste the entire long string into the page to log in to the dashboard.
# kubectl -n kube-system get secret
NAME                                TYPE                                  DATA      AGE
an00-token-brpds                    kubernetes.io/service-account-token   3         14d
default-token-jct5w                 kubernetes.io/service-account-token   3         22d
elasticsearch-logging-token-zwr2b   kubernetes.io/service-account-token   3         14d
fluentd-es-token-rxrxc              kubernetes.io/service-account-token   3         44m
heapster-token-qk4m9                kubernetes.io/service-account-token   3         14d
kube-dns-autoscaler-token-nmgjs     kubernetes.io/service-account-token   3         16d
kube-dns-token-s5hkb                kubernetes.io/service-account-token   3         16d
kubernetes-dashboard-certs          Opaque                                0         14d
kubernetes-dashboard-key-holder     Opaque                                2         21d
kubernetes-dashboard-token-7nt98    kubernetes.io/service-account-token   3         14d

# Find the dashboard secret whose name contains "token", then take the token field at the end and paste it into the login page.
# kubectl -n kube-system describe secret kubernetes-dashboard-token-7nt98
Name:         kubernetes-dashboard-token-7nt98
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name=kubernetes-dashboard
              kubernetes.io/service-account.uid=54c71522-6f80-11e8-bc0b-525400eac085

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1107 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi03bnQ5OCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjU0YzcxNTIyLTZmODAtMTFlOC1iYzBiLTUyNTQwMGVhYzA4NSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.leD2gC5FkkN1_0mt5_AwveStC6vh5H8-UL1LqwF7N07xQ2ZKSh1matYyWyv-buMflrks1-my88MKwaYNmMaNRk2-WrlybNLJKrf-QLpmGLdCB3IHBuSViuHHQwPS4g7CD5GNAsuPZF3GAszuBamBD3HJT1okrrH8J3KlstqMpYsEbwullLfgQaznfd02YjrR6izC3sneJpj0vTKSrY8LxweI2xcYVNshZHRacEgdNzwBTe48dU_9pCqyUWOSS2J2Y4EimAMyPQlwDbazgGuHn027neIosxO0ooSbEeiqaEnu9-ATpyJCCWW4ukOxt_PG8VJsNzmZuG18LIA_KImd6A
kube-dns
kube-dns gives pods in k8s service name resolution and automatic service discovery. It watches service changes through the API and resolves service names to the corresponding VIPs (cluster IPs).
kube-dns consists of three images (hosted in Google's gcr registry; if the servers have no external network access, prepare them in advance):
kube-dns: watches services, IPs, and their mappings through the k8s API and keeps them in memory, which effectively becomes the DNS records.
dnsmasq-nanny: pulls DNS rules from the kube-dns container and acts as the cluster's DNS server, taking load off kube-dns and improving stability and query performance.
sidecar: health-checks the kube-dns and dnsmasq containers by periodically issuing DNS probes, and exports metrics.
Create it from the official YAML files under the "kubernetes/cluster/addons/" directory of the source tree. For kube-dns there are two main files:
The DNS service definition file, which contains the Service, ServiceAccount, Deployment, and so on. In it, change __PILLAR__DNS__SERVER__ to a cluster IP and __PILLAR__DNS__DOMAIN__ to cluster.local (note: keep the trailing dot that follows the placeholder). A sed sketch of this substitution follows below.
DNS plays a big role in service communication and discovery for orchestrated workloads, so it must not fail. The project provides a configuration that automatically scales kube-dns; the file can be created directly:
dns-horizontal-autoscaler/dns-horizontal-autoscaler.yaml
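For the placeholder substitution in the first file, here is a sketch using sed. It assumes the base template is named kube-dns.yaml.base in the addons dns directory (check your source tree) and that 10.66.77.2 is the chosen cluster IP, which must match the kubelet's --cluster-dns value configured below:

# sed replaces only the placeholders, so the surrounding text (including any trailing dot) is preserved
sed -e "s/__PILLAR__DNS__SERVER__/10.66.77.2/g" \
    -e "s/__PILLAR__DNS__DOMAIN__/cluster.local/g" \
    kube-dns.yaml.base > kube-dns.yaml
kubectl create -f kube-dns.yaml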
After creating the kube-dns service, the default DNS server configuration for pods has to be changed. Pods are created by the kubelet, so find the kubelet's configuration file and add the DNS settings. Note that the kubelet.service unit file must also reference the newly added option names; reload the configuration and restart the kubelet.
KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=2"
NODE_ADDRESS="--address=0.0.0.0"
NODE_HOSTNAME="--hostname-override=debian-70"
KUBE_ALLOW_PRIV="--allow-privileged=false"
KUBE_POD_INFRA_CONTAINER_IMAGE="--pod-infra-container-image=k8s.gcr.io/pause-amd64:3.1"
KUBE_RUNTIME_CGROUPS="--runtime-cgroups=/systemd/system.slice"
KUBE_CGROUPS="--kubelet-cgroups=/systemd/system.slice"
KUBE_FAIL_SWAP_ON="--fail-swap-on=false"
KUBE_CONFIG="--kubeconfig=/etc/kubernetes/kubeconfig.yml"
KUBELET_DNS_IP="--cluster-dns=10.66.77.2"
KUBELET_DNS_DOMAIN="--cluster-domain=cluster.local"
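For reference, a sketch of how the kubelet.service unit might wire in the two new variables. The binary path, the environment-file path, and the exact variable list are assumptions based on the file above, so adapt them to the actual unit:

[Service]
EnvironmentFile=/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet \
          $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL \
          $NODE_ADDRESS $NODE_HOSTNAME \
          $KUBE_ALLOW_PRIV $KUBE_POD_INFRA_CONTAINER_IMAGE \
          $KUBE_RUNTIME_CGROUPS $KUBE_CGROUPS \
          $KUBE_FAIL_SWAP_ON $KUBE_CONFIG \
          $KUBELET_DNS_IP $KUBELET_DNS_DOMAIN

After editing, run systemctl daemon-reload and systemctl restart kubelet so the new options take effect.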
You can create a simple busybox pod and check its /etc/resolv.conf to confirm the cluster DNS is in place. By default the kubernetes.default name points to the apiserver. What the lookup returns is the service's cluster IP, which cannot be pinged.
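A minimal busybox pod for this test might look like the sketch below; the pod name is arbitrary, and the busybox:1.28 tag is an assumption picked because the nslookup in newer busybox images prints different, less useful output:

apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
    - name: busybox
      image: busybox:1.28
      # keep the container alive so we can exec into it
      command:
        - sleep
        - "3600"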
# kubectl exec -it busybox sh
/ # nslookup kubernetes.default
Server:    10.66.77.2
Address 1: 10.66.77.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 10.66.77.1 kubernetes.default.svc.cluster.local
/ #
/ #
/ # nslookup nginx-service
Server:    10.66.77.2
Address 1: 10.66.77.2 kube-dns.kube-system.svc.cluster.local

Name:      nginx-service
Address 1: 10.66.77.225 nginx-service.default.svc.cluster.local
/ # exit

# The current services; the IPs match what was resolved above
# kubectl get svc
NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
kubernetes      ClusterIP   10.66.77.1     <none>        443/TCP   6d
nginx-service   ClusterIP   10.66.77.225   <none>        80/TCP    5h
elasticsearch fluentd kibana
The project ships an official log collection and monitoring solution; the resource definition files live in the kubernetes/cluster/addons/fluentd-elasticsearch directory of the source tree. Fetch the YAML files to create and run EFK.
Required images (elasticsearch and kibana are very large…):
k8s.gcr.io/elasticsearch:v5.6.4
alpine:3.6
k8s.gcr.io/fluentd-elasticsearch:v2.0.4
docker.elastic.co/kibana/kibana:5.6.4
elasticsearch is a search engine and a database for storing logs. fluentd collects the log streams that other pods write on each node and ships them to elasticsearch. kibana displays the logs, providing a friendly search interface, charts, and so on; if you need to access it, edit kibana-service.yaml to map a NodePort.
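The kibana-service.yaml change follows the same pattern as the dashboard Service above. A sketch, where the port and targetPort assume the stock addon file and the nodePort value is only an example that must fall inside your configured NodePort range:

spec:
  type: NodePort
  ports:
    - port: 5601
      protocol: TCP
      targetPort: ui
      nodePort: 9001
  selector:
    k8s-app: kibana-logging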
Fetch the resource definition YAML files from the source tree. Before creating them, comment out the SERVER_BASEPATH env variable and its value in kibana-deployment.yaml, save, and start creating. kibana takes a while to start; if the pod logs show nothing abnormal, be patient and wait a few minutes.
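For reference, a sketch of the relevant env section in kibana-deployment.yaml after commenting the variable out; the ELASTICSEARCH_URL entry and the proxy path are assumptions based on the stock addon file, so check your copy:

        env:
          - name: ELASTICSEARCH_URL
            value: http://elasticsearch-logging:9200
          # - name: SERVER_BASEPATH
          #   value: /api/v1/namespaces/kube-system/services/kibana-logging/proxy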
# ls -l
total 36
-rw-r--r-- 1 root root   382 Jun 13 11:34 es-service.yaml
-rw-r--r-- 1 root root  2820 Jun 13 11:34 es-statefulset.yaml
-rw-r--r-- 1 root root 15648 Jun 13 11:34 fluentd-es-configmap.yaml
-rw-r--r-- 1 root root  2774 Jun 13 11:34 fluentd-es-ds.yaml
-rw-r--r-- 1 root root  1186 Jun 13 11:34 kibana-deployment.yaml
-rw-r--r-- 1 root root   354 Jun 13 11:34 kibana-service.yaml

# kubectl create -f .

# Check on some of the resources
# kubectl get statefulset,pod,daemonset -n kube-system
NAME                                     DESIRED   CURRENT   AGE
statefulset.apps/elasticsearch-logging   2         2         11m

NAME                                        READY     STATUS              RESTARTS   AGE
pod/elasticsearch-logging-0                 1/1       Running             0          6m
pod/elasticsearch-logging-1                 1/1       Unknown             0          6m
pod/kibana-logging-bc776986-7vtf7           1/1       Unknown             0          11m
pod/kibana-logging-bc776986-cm69s           0/1       ContainerCreating   0          34s
pod/kube-dns-659bc9899c-ghj2n               0/3       ContainerCreating   0          20s
pod/kube-dns-659bc9899c-lm4pd               3/3       Running             0          1d
pod/kube-dns-659bc9899c-r655f               3/3       Unknown             0          1d
pod/kube-dns-autoscaler-79b4b844b9-6v856    1/1       Running             0          1d
pod/kubernetes-dashboard-5c469b58b8-pf7cg   0/1       ContainerCreating   0          16s
pod/kubernetes-dashboard-5c469b58b8-pkttx   1/1       Unknown             2          5d

NAME                                      DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                              AGE
daemonset.extensions/fluentd-es-v2.0.4    0         0         0         0            0           beta.kubernetes.io/fluentd-ds-ready=true   11m
1. Possible errors when starting and deleting the elasticsearch statefulset
If elasticsearch-logging hits the error below at startup, add the --allow-privileged startup flag to the apiserver and to every kubelet (it defaults to false, so set it to true), reload the configuration files, and restart:
# kubectl describe statefulset -n kube-system
...
  Type     Reason        Age               From                    Message
  ----     ------        ----              ----                    -------
  Warning  FailedCreate  1m (x24 over 6m)  statefulset-controller  create Pod elasticsearch-logging-0 in StatefulSet elasticsearch-logging failed error: Pod "elasticsearch-logging-0" is invalid: spec.initContainers[0].securityContext.privileged: Forbidden: disallowed by cluster policy

# After restarting, it starts successfully
# kubectl describe statefulset -n kube-system
  Type     Reason            Age               From                    Message
  ----     ------            ----              ----                    -------
  Warning  FailedCreate      1m (x24 over 6m)  statefulset-controller  create Pod elasticsearch-logging-0 in StatefulSet elasticsearch-logging failed error: Pod "elasticsearch-logging-0" is invalid: spec.initContainers[0].securityContext.privileged: Forbidden: disallowed by cluster policy
  Normal   SuccessfulCreate  49s               statefulset-controller  create Pod elasticsearch-logging-0 in StatefulSet elasticsearch-logging successful
  Normal   SuccessfulCreate  42s               statefulset-controller  create Pod elasticsearch-logging-1 in StatefulSet elasticsearch-logging successful
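On a kubelet configured through the environment file shown earlier, the fix amounts to flipping that flag and restarting; the apiserver needs the same --allow-privileged=true in its own startup options. A sketch, assuming that file layout:

# kubelet environment file (excerpt) -- was --allow-privileged=false
KUBE_ALLOW_PRIV="--allow-privileged=true"

# reload unit files and restart the kubelet
systemctl daemon-reload
systemctl restart kubelet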
I also ran into a problem when deleting the resources: everything else was removed, but elasticsearch-logging would never delete successfully, even with --force; it just told me the operation timed out.
# kubectl delete -f .

# The elasticsearch-logging statefulset is still there
# kubectl get -f .
NAME                    DESIRED   CURRENT   AGE
elasticsearch-logging   0         2         2h
Error from server (NotFound): services "elasticsearch-logging" not found
Error from server (NotFound): serviceaccounts "elasticsearch-logging" not found
Error from server (NotFound): clusterroles.rbac.authorization.k8s.io "elasticsearch-logging" not found
Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "elasticsearch-logging" not found
Error from server (NotFound): configmaps "fluentd-es-config-v0.1.4" not found
Error from server (NotFound): serviceaccounts "fluentd-es" not found
Error from server (NotFound): clusterroles.rbac.authorization.k8s.io "fluentd-es" not found
Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "fluentd-es" not found
Error from server (NotFound): daemonsets.apps "fluentd-es-v2.0.4" not found
Error from server (NotFound): deployments.apps "kibana-logging" not found
Error from server (NotFound): services "kibana-logging" not found

# kubectl delete statefulset elasticsearch-logging -n kube-system --force
timed out waiting for "elasticsearch-logging" to be synced
Later I found an approach on Google: delete the failing resource with --cascade=false, which solved my problem. If the pods defined by the deleted resource are still visible afterwards, delete them with --grace-period=0.
# kubectl delete -f es-statefulset.yaml --cascade=false

# If the statefulset defining elasticsearch is gone but "get pod" still shows its pods, force-delete them with the commands below
# kubectl --namespace=kube-system delete pods elasticsearch-logging-0 --grace-period=0 --force
# kubectl --namespace=kube-system delete pods elasticsearch-logging-1 --grace-period=0 --force
2. fluentd daemonset startup problems
Another fluentd issue: get showed the daemonset fluentd-es-v2.0.4 running 0 pods. A label has to be added to every node, as follows:
# kubectl get daemonset fluentd-es-v2.0.4 -n kube-system
NAME                DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                              AGE
fluentd-es-v2.0.4   0         0         0         0            0           beta.kubernetes.io/fluentd-ds-ready=true   32s

# Add a label to every node; only nodes carrying this label will run fluentd
# kubectl label node debian-70 beta.kubernetes.io/fluentd-ds-ready=true
# kubectl label node sl-80 beta.kubernetes.io/fluentd-ds-ready=true

# get again: pods are running now
# kubectl get daemonset fluentd-es-v2.0.4 -n kube-system
NAME                DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                              AGE
fluentd-es-v2.0.4   2         2         2         2            2           beta.kubernetes.io/fluentd-ds-ready=true   4m
Then one day the company's test machine room lost power (I suspect that is related), and when the cluster was brought back up the fluentd on one node would not start. It looked like this:
# The pod stays in CrashLoopBackOff, describe shows only limited information, and restarting does not help
# kubectl get -n kube-system pods
NAME                                    READY     STATUS             RESTARTS   AGE
elasticsearch-logging-0                 1/1       Running            3          14d
elasticsearch-logging-1                 1/1       Running            3          14d
fluentd-es-v2.0.4-4lthw                 1/1       Running            0          16h
fluentd-es-v2.0.4-lvmj7                 0/1       CrashLoopBackOff   4          16h
heapster-69b5d4974d-4dzm8               1/1       Running            3          14d
kibana-logging-799d8b46db-rn6fq         1/1       Running            3          14d
kube-dns-659bc9899c-ghj2n               3/3       Running            9          15d
kube-dns-659bc9899c-lm4pd               3/3       Running            12         16d
kube-dns-autoscaler-79b4b844b9-6v856    1/1       Running            4          16d
kubernetes-dashboard-7d5dcdb6d9-f4j6n   1/1       Running            3          14d
monitoring-grafana-69df66f668-gg9kl     1/1       Running            3          14d
monitoring-influxdb-78d4c6f5b6-2phht    1/1       Running            3          14d

# Also check the pod logs -- this helps a lot
# kubectl logs -n kube-system fluentd-es-v2.0.4-lvmj7
2018-06-29 01:21:19 +0000 [warn]: parameter 'time_format' in <source>
  @id fluentd-containers.log
  @type tail
  path "/var/log/containers/*.log"
  pos_file "/var/log/es-containers.log.pos"
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag "raw.kubernetes.*"
  read_from_head true
  <parse>
    @type "multi_format"
    <pattern>
      format json
      time_key "time"
      time_format "%Y-%m-%dT%H:%M:%S.%NZ"
      time_type string
    </pattern>
    <pattern>
      format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
      time_format "%Y-%m-%dT%H:%M:%S.%N%:z"
      expression "^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$"
      ignorecase false
      multiline false
    </pattern>
  </parse>
</source> is not used.
# The errors start here; on the healthy pod this is where the info logs about shipping to elastic appear instead
2018-06-29 01:21:19 +0000 [error]: unexpected error error_class=TypeError error="no implicit conversion of Symbol into Integer"
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buffer/file_chunk.rb:219:in `[]'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buffer/file_chunk.rb:219:in `restore_metadata'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buffer/file_chunk.rb:322:in `load_existing_staged_chunk'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buffer/file_chunk.rb:51:in `initialize'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buf_file.rb:144:in `new'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buf_file.rb:144:in `block in resume'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buf_file.rb:133:in `glob'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buf_file.rb:133:in `resume'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buffer.rb:171:in `start'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buf_file.rb:120:in `start'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/output.rb:415:in `start'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/root_agent.rb:165:in `block in start'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/root_agent.rb:154:in `block (2 levels) in lifecycle'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/root_agent.rb:153:in `each'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/root_agent.rb:153:in `block in lifecycle'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/root_agent.rb:140:in `each'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/root_agent.rb:140:in `lifecycle'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/root_agent.rb:164:in `start'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/engine.rb:274:in `start'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/engine.rb:219:in `run'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/supervisor.rb:774:in `run_engine'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/supervisor.rb:523:in `block in run_worker'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/supervisor.rb:699:in `main_process'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/supervisor.rb:518:in `run_worker'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/command/fluentd.rb:316:in `<top (required)>'
  2018-06-29 01:21:19 +0000 [error]: /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2018-06-29 01:21:19 +0000 [error]: /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2018-06-29 01:21:19 +0000 [error]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/bin/fluentd:8:in `<top (required)>'
  2018-06-29 01:21:19 +0000 [error]: /usr/local/bin/fluentd:22:in `load'
  2018-06-29 01:21:19 +0000 [error]: /usr/local/bin/fluentd:22:in `<main>'
2018-06-29 01:21:19 +0000 [error]: unexpected error error_class=TypeError error="no implicit conversion of Symbol into Integer"
2018-06-29 01:21:19 +0000 [error]: suppressed same stacktrace
A Google search turned up fluentd/issues/#1760 on GitHub, which suggests the cause is corrupted buffer metadata. The difference here is that the 2.0.4 version stores the buffer somewhere else: it is mapped to the /var/log/fluentd-buffers/kubernetes.system.buffer directory on the host. After I deleted the *.meta files, fluentd on that node started normally (you can delete the pod and it will be recreated automatically, or just wait for it to restart).
# cd /var/log/fluentd-buffers/kubernetes.system.buffer/
# ls
buffer.b56efd1a03768b2f7eabddf200cf50b79.log        buffer.b56efd1a03ad20d927856cabfb9e0b1d7.log.meta
buffer.b56efd1a03768b2f7eabddf200cf50b79.log.meta   buffer.b56efd1a1f72114c77ad018dec6591873.log
buffer.b56efd1a03ad20d927856cabfb9e0b1d7.log        buffer.b56efd1a1f72114c77ad018dec6591873.log.meta

# rm -rf *.meta
About monitoring: heapster influxdb grafana
The project provides heapster as the official monitoring solution. The YAML files are under the project's deploy/kube-config/influxdb path and can be created directly once fetched. Get the images first: it is safer to grep the image fields out of the files and pull them before creating anything.
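A sketch of pre-pulling them, assuming docker is the container runtime and the YAML files sit in the current directory:

# extract every image referenced in the manifests, dedupe, and pull each one
sed -n 's/.*image: *//p' *.yaml | sort -u | xargs -n 1 docker pull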
Once these resources are up, if the heapster logs show nothing abnormal, the dashboard will start showing CPU, memory, and other monitoring stats for the pods after a short while. You can change the grafana service to map a NodePort and configure the graph dashboards (admin/admin).
About horizontal pod autoscaling (HPA)
See the official walkthrough: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
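Depending on the controller-manager configuration, HPA in this era reads CPU metrics from heapster or from metrics-server, so the monitoring stack above is effectively a prerequisite. The basic flow from the walkthrough looks roughly like this (php-apache is the walkthrough's example deployment, not something running in this cluster):

# scale php-apache between 1 and 10 replicas, targeting 50% average CPU utilization
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
# watch the autoscaler's current/target metrics and replica count
kubectl get hpa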