Studying the CKA simulator questions is fairly hard, but every one of them feels worth learning from.
After finishing one simulator run, I use this article to review the questions.
Pre-exam tips
alias k=kubectl # will already be pre-configured
export do="--dry-run=client -o yaml" # k create deploy nginx --image=nginx $do
export now="--force --grace-period 0" # k delete pod x $now
Edit the vimrc:
set tabstop=2
set expandtab
set shiftwidth=2
Common resource abbreviations
- deploy
- ds
- sts
- sa
The questions
Q1
List which kube contexts the environment has. This is basically a free point. For the second command, note the sed usage.
kubectl config get-contexts -o name > /opt/course/1/contexts
cat ~/.kube/config | grep current | sed -e "s/current-context: //"
Q2
Create a single Pod of image httpd:2.4.41-alpine in Namespace default. The Pod should be named pod1 and the container should be named pod1-container. This Pod should only be scheduled on a controlplane node, do not add new labels to any nodes.
This uses the node-selection knowledge: a nodeSelector to pick the node and a toleration for the taint on it. Use the commands below to inspect the node, get its labels for the selector and its taints for the toleration.
k get node # find controlplane node
k describe node cluster1-controlplane1 | grep Taint -A1 # get controlplane node taints
k get node cluster1-controlplane1 --show-labels # get controlplane node labels
Dry-run the Pod with the command below to get a base YAML file to trim, then add the toleration and nodeSelector shown underneath.
k run pod1 --image=httpd:2.4.41-alpine $do > 2.yaml
# 2.yaml (additions)
tolerations: # add
- effect: NoSchedule # add
key: node-role.kubernetes.io/control-plane # add
nodeSelector: # add
node-role.kubernetes.io/control-plane: "" # add
Extra notes
Add taints to node1:
kubectl taint nodes node1 key1=value1:NoSchedule
kubectl taint nodes node1 key1=value1:NoExecute
kubectl taint nodes node1 key2=value2:NoSchedule
Remove the taints above:
kubectl taint nodes node1 key1:NoSchedule-
kubectl taint nodes node1 key1:NoExecute-
kubectl taint nodes node1 key2:NoSchedule-
View the taints on node1:
kubectl describe nodes node1
The effect field controls the behavior; possible values are:
- NoExecute
- NoSchedule
- PreferNoSchedule
If a Pod was already running on the node before such a taint was added, it can continue running on that node.
If a taint with effect NoExecute is added to a node, any Pod that does not tolerate it is evicted immediately, while Pods that do tolerate it are not evicted. However, if a toleration with effect NoExecute also specifies the optional tolerationSeconds field, that value is how long the Pod may keep running on the node after the taint was added, before it gets evicted.
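A minimal sketch of such a toleration (the key, value and the 3600-second timeout are only illustrative):
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
  tolerationSeconds: 3600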
Q3
Use context:
kubectl config use-context k8s-c1-H
There are two Pods named o3db-* in Namespace project-c13. C13 management asked you to scale the Pods down to one replica to save resources.
Looking at the Pods, the ordinal naming at the end suggests they are managed by a StatefulSet.
➜ k -n project-c13 get pod | grep o3db
o3db-0 1/1 Running 0 52s
o3db-1 1/1 Running 0 42s
Then list all workload types; o3db turns out to be a StatefulSet.
➜ k -n project-c13 get deploy,ds,sts | grep o3db
statefulset.apps/o3db 2/2 2m56s
Then simply scale it down:
➜ k -n project-c13 scale sts o3db --replicas 1
statefulset.apps/o3db scaled
➜ k -n project-c13 get sts o3db
NAME READY AGE
o3db 1/1 4m39s
Q4
Do the following in Namespace default. Create a single Pod named ready-if-service-ready of image nginx:1.16.1-alpine. Configure a LivenessProbe which simply executes command true. Also configure a ReadinessProbe which does check if the url http://service-am-i-ready:80 is reachable, you can use wget -T2 -O- http://service-am-i-ready:80 for this. Start the Pod and confirm it isn't ready because of the ReadinessProbe.
Create a second Pod named am-i-ready of image nginx:1.16.1-alpine with label id: cross-server-ready. The already existing Service service-am-i-ready should now have that second Pod as endpoint.
Now the first Pod should be in ready state, confirm that.
This tests the Pod probes. Configure the liveness and readiness probes as the question requires:
livenessProbe: # add from here
exec:
command:
- 'true'
readinessProbe:
exec:
command:
- sh
- -c
- 'wget -T2 -O- http://service-am-i-ready:80' # to here
Since the Service already exists, simply run a second Pod manually with the required label:
k run am-i-ready --image=nginx:1.16.1-alpine --labels="id=cross-server-ready"
Extra notes
A note on the difference between the two probes:
- LivenessProbe (liveness probe): uses the configured method to check whether the application inside the container is still running. If the check fails, the container is considered unhealthy and the kubelet decides, based on the Pod's restartPolicy, whether to restart it. If no livenessProbe is configured, the kubelet treats the liveness check as always successful.
- ReadinessProbe (readiness probe): checks whether the application inside the container has finished starting. Only after the check succeeds is the container's Ready state set to true and the Pod allowed to serve traffic; on failure Ready is set to false. For Pods managed by a Service, the Service/Endpoints association follows this Ready state: a Pod whose Ready turns false is automatically removed from the Service's Endpoints list and is added back once it becomes Ready again. This mechanism prevents traffic from being forwarded to unavailable Pods.
Probes support the following probe mechanisms:
- ExecAction: runs a command; the probe succeeds if the command exits successfully
- HttpGet: sends an HTTP request; the probe succeeds if the request succeeds
- TcpSocketAction: opens a TCP connection; the probe succeeds if the connection is established
ReadinessProbe and LivenessProbe use the same probe mechanisms; they only differ in what happens to the Pod on failure:
- ReadinessProbe: on failure, the Pod's IP:Port is removed from the Endpoints list of the associated Service.
- LivenessProbe: on failure, the container is killed and the Pod's restart policy decides what happens next.
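For reference, minimal sketches of the httpGet and tcpSocket variants (path, port and timing values are only illustrative):
readinessProbe:
  httpGet:
    path: /healthz
    port: 80
  initialDelaySeconds: 3
  periodSeconds: 5
livenessProbe:
  tcpSocket:
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 10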
Question 5 | Kubectl sorting
There are various Pods in all namespaces. Write a command into /opt/course/5/find_pods.sh which lists all Pods sorted by their AGE (metadata.creationTimestamp).
Write a second command into /opt/course/5/find_pods_uid.sh which lists all Pods sorted by field metadata.uid. Use kubectl sorting for both commands.
A simple test of command usage; just use the right field in --sort-by.
kubectl get pod -A --sort-by=.metadata.creationTimestamp
kubectl get pod -A --sort-by=.metadata.uid
Question 6 | Storage, PV, PVC, Pod volume
Create a new PersistentVolume named safari-pv. It should have a capacity of 2Gi, accessMode ReadWriteOnce, hostPath /Volumes/Data and no storageClassName defined.
Next create a new PersistentVolumeClaim in Namespace project-tiger named safari-pvc. It should request 2Gi storage, accessMode ReadWriteOnce and should not define a storageClassName. The PVC should be bound to the PV correctly.
Finally create a new Deployment safari in Namespace project-tiger which mounts that volume at /tmp/safari-data. The Pods of that Deployment should be of image httpd:2.4.41-alpine.
Tests basic PV and PVC creation. I know how to build these, but the YAML templates have to be fetched from the docs, and the exam browser's copy-and-paste experience is poor and costs time.
# 6_pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
name: safari-pv
spec:
capacity:
storage: 2Gi
accessModes:
- ReadWriteOnce
hostPath:
path: "/Volumes/Data"
# 6_pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: safari-pvc
namespace: project-tiger
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 2Gi
Generate the Deployment template from the command line:
k -n project-tiger create deploy safari \
--image=httpd:2.4.41-alpine $do > 6_dep.yaml
In the spec section, add the volume and its mount:
spec:
volumes: # add
- name: data # add
persistentVolumeClaim: # add
claimName: safari-pvc # add
containers:
- image: httpd:2.4.41-alpine
name: container
volumeMounts: # add
- name: data # add
mountPath: /tmp/safari-data # add
Afterwards, describe a Pod to check the volume mounts:
➜ k -n project-tiger describe pod safari-5cbf46d6d-mjhsb | grep -A2 Mounts:
Mounts:
/tmp/safari-data from data (rw) # there it is
/var/run/secrets/kubernetes.io/serviceaccount from default-token-n2sjj (ro)
Question 7 | Node and Pod Resource Usage
The metrics-server has been installed in the cluster. Your colleague would like to know the kubectl commands to:
show Nodes resource usage
show Pods and their containers resource usage
Please write the commands into /opt/course/7/node.sh and /opt/course/7/pod.sh.
This is a free point; at most a --sort-by might be added.
kubectl top node
# the following shows each container inside the Pods; it needs an extra flag
kubectl top pod --containers=true
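If sorting is also wanted, kubectl top supports --sort-by as well (worth verifying with kubectl top pod -h in the exam environment):
kubectl top node --sort-by=cpu
kubectl top pod -A --sort-by=memory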
Question 8 | Get Controlplane Information
Ssh into the controlplane node with ssh cluster1-controlplane1. Check how the controlplane components kubelet, kube-apiserver, kube-scheduler, kube-controller-manager and etcd are started/installed on the controlplane node. Also find out the name of the DNS application and how it's started/installed on the controlplane node.
Write your findings into file /opt/course/8/controlplane-components.txt. The file should be structured like:
# /opt/course/8/controlplane-components.txt
kubelet: [TYPE]
kube-apiserver: [TYPE]
kube-scheduler: [TYPE]
kube-controller-manager: [TYPE]
etcd: [TYPE]
dns: [TYPE] [NAME]
Choices of [TYPE] are: not-installed, process, static-pod, pod
Check how each component is installed. I didn't get this one, I didn't know what to look at. The approach from the solution is as follows:
First look at the running processes directly with ps aux plus grep, to see which components run as plain processes.
Then use find /etc/systemd/system to check whether they are registered as systemd services:
➜ root@cluster1-controlplane1:~# find /etc/systemd/system/ | grep kube
➜ root@cluster1-controlplane1:~# find /etc/systemd/system/ | grep etcd
Then check the kube manifest directory (the default location for static Pod manifests); everything found here runs as a static Pod on that node:
find /etc/kubernetes/manifests/
From the extra notes below we know that static Pods follow a fixed naming scheme, so they can also be identified by the Pod names:
kubectl -n kube-system get pod -o wide | grep controlplane1
Extra notes
Static Pods are managed directly by the kubelet daemon on a specific node, without the API server watching them. Unlike Pods managed by the control plane (for example via a Deployment), the kubelet itself watches each static Pod and restarts it when it fails.
They cannot be controlled through kubectl, but the mirror resources are visible via the API. The Pod name gets the node hostname, with a leading hyphen, as a suffix.
Static Pods are used to run a specific service on a specific node. If something needs to run on every node, a DaemonSet is the right tool instead.
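Which directory the kubelet watches for these manifests is set by staticPodPath in its config; on kubeadm clusters it usually looks like this (the path may differ per setup):
# /var/lib/kubelet/config.yaml (excerpt)
staticPodPath: /etc/kubernetes/manifests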
Question 9 | Kill Scheduler, Manual Scheduling
Ssh into the controlplane node with ssh cluster2-controlplane1. Temporarily stop the kube-scheduler, this means in a way that you can start it again afterwards.
Create a single Pod named manual-schedule of image httpd:2.4-alpine, confirm it's created but not scheduled on any node.
Now you're the scheduler and have all its power, manually schedule that Pod on node cluster2-controlplane1. Make sure it's running.
Start the kube-scheduler again and confirm it's running correctly by creating a second Pod named manual-schedule2 of image httpd:2.4-alpine and check if it's running on cluster2-node1.
This one is harder. The first step is to work out how the scheduler is deployed: using the static Pod approach above, find the scheduler and move its manifest file out of the manifests directory. With that, the scheduler is gone, i.e. temporarily killed.
Manual scheduling is the advanced part; here is the approach from the official solution.
k run manual-schedule --image=httpd:2.4-alpine
# the Pod is Pending, it has not been scheduled
k get pod manual-schedule -o wide
NAME READY STATUS ... NODE NOMINATED NODE
manual-schedule 0/1 Pending ... <none> <none>
Without a scheduler, Pods are no longer scheduled. Manual scheduling is simple: first get the Pod's YAML.
k get pod manual-schedule -o yaml > 9.yaml
Manually set its nodeName: forcing the node it runs on is all that is needed to complete the manual scheduling.
Quoting the official explanation: the only thing the scheduler ultimately does is set the nodeName field on the Pod declaration; how it finds the right node to schedule onto is a very complex problem that takes many variables into account.
# 9.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: "2020-09-04T15:51:02Z"
labels:
run: manual-schedule
managedFields:
...
manager: kubectl-run
operation: Update
time: "2020-09-04T15:51:02Z"
name: manual-schedule
namespace: default
resourceVersion: "3515"
selfLink: /api/v1/namespaces/default/pods/manual-schedule
uid: 8e9d2532-4779-4e63-b5af-feb82c74a935
spec:
nodeName: cluster2-controlplane1 # add the controlplane node name
containers:
- image: httpd:2.4-alpine
imagePullPolicy: IfNotPresent
name: manual-schedule
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-nxnc7
readOnly: true
dnsPolicy: ClusterFirst
...
Then simply replace the resource to complete the manual scheduling:
k -f 9.yaml replace --force
Afterwards, move the manifest file back to bring the scheduler up again.
Question 10 | RBAC ServiceAccount Role RoleBinding
Create a new ServiceAccount processor in Namespace project-hamster. Create a Role and RoleBinding, both named processor as well. These should allow the new SA to only create Secrets and ConfigMaps in that Namespace.
This one is fairly simple, an RBAC question; understanding the relationship between Role, RoleBinding and the ServiceAccount is enough.
A ClusterRole or Role defines a set of permissions and where they are available, either cluster-wide or in a single Namespace. A ClusterRoleBinding or RoleBinding connects a set of permissions to an account and defines where they apply, either cluster-wide or in a single Namespace.
This gives 4 possible RBAC combinations, 3 of which are valid:
- Role + RoleBinding (available in a single Namespace, applied in that single Namespace)
- ClusterRole + ClusterRoleBinding (available cluster-wide, applied cluster-wide)
- ClusterRole + RoleBinding (available cluster-wide, applied in a single Namespace)
- Role + ClusterRoleBinding (not possible: available in a single Namespace, applied cluster-wide)
The creation steps:
➜ k -n project-hamster create sa processor
# check the help for creating a Role
k -n project-hamster create role -h
# grant the verbs and resources accordingly
k -n project-hamster create role processor \
--verb=create \
--resource=secret \
--resource=configmap
Written as a YAML file it would look like this:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: processor
namespace: project-hamster
rules:
- apiGroups:
- ""
resources:
- secrets
- configmaps
verbs:
- create
Then create the RoleBinding to bind the SA to the Role:
k -n project-hamster create rolebinding processor \
--role processor \
--serviceaccount project-hamster:processor
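For reference, the RoleBinding generated by the command above would look roughly like this (a sketch written from memory, not copied from the exam environment):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: processor
  namespace: project-hamster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: processor
subjects:
- kind: ServiceAccount
  name: processor
  namespace: project-hamster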
A trick learned here: kubectl auth can-i can be used to test the ServiceAccount's permissions.
k auth can-i -h # examples
➜ k -n project-hamster auth can-i create secret \
--as system:serviceaccount:project-hamster:processor
yes
➜ k -n project-hamster auth can-i create configmap \
--as system:serviceaccount:project-hamster:processor
yes
➜ k -n project-hamster auth can-i create pod \
--as system:serviceaccount:project-hamster:processor
no
➜ k -n project-hamster auth can-i delete secret \
--as system:serviceaccount:project-hamster:processor
no
➜ k -n project-hamster auth can-i get configmap \
--as system:serviceaccount:project-hamster:processor
no
Question 11 | DaemonSet on all Nodes
Use Namespace project-tiger for the following. Create a DaemonSet named ds-important with image httpd:2.4-alpine and labels id=ds-important and uuid=18426a0b-5f59-4e10-923f-c0e078e82462. The Pods it creates should request 10 millicore cpu and 10 mebibyte memory. The Pods of that DaemonSet should run on all nodes, also controlplanes.
Create a DaemonSet carrying the required labels; a dry run of a Deployment can be used to generate the base config:
k -n project-tiger create deployment --image=httpd:2.4-alpine ds-important $do > 11.yaml
Here is the full YAML; some of the commonly used fields are worth memorizing.
# 11.yaml
apiVersion: apps/v1
kind: DaemonSet # change from Deployment to Daemonset
metadata:
creationTimestamp: null
labels: # add
id: ds-important # add
uuid: 18426a0b-5f59-4e10-923f-c0e078e82462 # add
name: ds-important
namespace: project-tiger # important
spec:
#replicas: 1 # remove
selector:
matchLabels:
id: ds-important # add
uuid: 18426a0b-5f59-4e10-923f-c0e078e82462 # add
#strategy: {} # remove
template:
metadata:
creationTimestamp: null
labels:
id: ds-important # add
uuid: 18426a0b-5f59-4e10-923f-c0e078e82462 # add
spec:
containers:
- image: httpd:2.4-alpine
name: ds-important
resources:
requests: # add
cpu: 10m # add
memory: 10Mi # add
tolerations: # add
- effect: NoSchedule # add
key: node-role.kubernetes.io/control-plane # add
#status: {} # remove
Since the Pods must run on all nodes, the tolerations need to be configured to tolerate the control-plane taint; the tolerations block is the important part here.
Extra notes
The difference between kubectl apply and kubectl create
kubectl apply and kubectl create are two commands for creating Kubernetes objects. The main differences:
Imperative vs. declarative: kubectl create is an imperative command; it tells kubectl directly which resource or object to create. kubectl apply is declarative; it does not tell kubectl exactly what to do, but compares the YAML passed via -f with the corresponding object in the cluster and merges them, reconciling the desired state with the actual state.
Existence check: kubectl create creates new resources and returns an error if a resource with the given name already exists, while kubectl apply does not fail in that case but updates the existing resource.
Partial updates: kubectl create can only create fully specified resources; to change one field you first have to get the resource, modify it and update it. kubectl apply can update just part of a resource, you only need to define the fields you want to change in the file.
In short, kubectl apply suits creating and updating deployed objects and makes partial updates convenient, while kubectl create suits a brand-new creation when no resource exists yet.
My take: because apply updates based on the existing definition and only changes the diff between the current and desired definitions, it suits continuous and automated deployment, while create fits a first-time deployment or rarely created resources.
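A small illustration of the difference (the deployment name web and the file web.yaml are made up for this example):
k create deployment web --image=nginx $do > web.yaml
k create -f web.yaml   # first time: created; second time: error, already exists
k apply -f web.yaml    # first time: created; afterwards only the diff is applied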
Question 13 | Multi Containers and Pod shared Volume
Create a Pod named multi-container-playground in Namespace default with three containers, named c1, c2 and c3. There should be a volume attached to that Pod and mounted into every container, but the volume shouldn't be persisted or shared with other Pods.
Container c1 should be of image nginx:1.17.6-alpine and have the name of the node where its Pod is running available as environment variable MY_NODE_NAME.
Container c2 should be of image busybox:1.31.1 and write the output of the date command every second in the shared volume into file date.log. You can use while true; do date >> /your/vol/path/date.log; sleep 1; done for this.
Container c3 should be of image busybox:1.31.1 and constantly send the content of file date.log from the shared volume to stdout. You can use tail -f /your/vol/path/date.log for this.
Check the logs of container c3 to confirm correct setup.
Create a Pod with three containers, each doing something different:
- an environment variable filled from a Pod field (the node name)
- a shared emptyDir volume
- custom container commands
The key parts of the YAML:
- Import the node name into the container as an environment variable:
env:
- name: MY_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- An emptyDir volume mounted into the containers:
# pod
volumes:
- name: vol
emptyDir: {}
# container
volumeMounts:
- name: vol
mountPath: /vol
- Define the container commands to implement the required behavior:
# container c2
command: ["sh", "-c", "while true; do date >> /vol/date.log; sleep 1; done"]
# container c3
command: ["sh", "-c", "tail -f /vol/date.log"]
Question 14 | Find out Cluster Information
You're asked to find out the following information about the cluster k8s-c1-H:
- How many controlplane nodes are available?
- How many worker nodes are available?
- What is the Service CIDR?
- Which Networking (or CNI Plugin) is configured and where is its config file?
- Which suffix will static pods have that run on cluster1-node1?
Tests the ability to gather basic cluster information. Not hard, but some of the terms were new to me, so a short note here.
- Controlplane nodes: k get node and check the ROLES column.
- Worker nodes: same command, count the nodes without the control-plane role.
- Service CIDR: look at the kube-apiserver configuration. The apiserver runs as a static Pod, so ssh to the controlplane node and check its manifest:
➜ ssh cluster1-controlplane1
➜ root@cluster1-controlplane1:~# cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep range
- --service-cluster-ip-range=10.96.0.0/12
- The CNI plugin installed on the host:
$find /etc/cni/net.d/
/etc/cni/net.d/
/etc/cni/net.d/10-weave.conflist
$cat /etc/cni/net.d/10-weave.conflist
{
"cniVersion": "0.3.0",
"name": "weave",
...
- The static Pod suffix is the node hostname with a leading hyphen, i.e. -cluster1-node1.
You can see it directly by listing the Pods in kube-system.
Question 15 | Cluster Event Logging
Write a command into /opt/course/15/cluster_events.sh which shows the latest events in the whole cluster, ordered by time (metadata.creationTimestamp). Use kubectl for it.
Now kill the kube-proxy Pod running on node cluster2-node1 and write the events this caused into /opt/course/15/pod_kill.log.
Finally kill the containerd container of the kube-proxy Pod on node cluster2-node1 and write the events into /opt/course/15/container_kill.log.
Do you notice differences in the events both actions caused?
A basic exercise in viewing events, useful for locating cluster problems.
kubectl get events -A --sort-by=.metadata.creationTimestamp
k -n kube-system delete pod kube-proxy-z64cg
# crictl works much like the docker CLI
crictl ps | grep kube-proxy
crictl rm 1e020b43c4423
Question 16 | Namespaces and Api Resources
Write the names of all namespaced Kubernetes resources (like Pod, Secret, ConfigMap…) into /opt/course/16/resources.txt.
Find the project-* Namespace with the highest number of Roles defined in it and write its name and amount of Roles into /opt/course/16/crowded-namespace.txt.
A free point: list all namespaced resources, then find the Namespace with the most Roles.
k api-resources # shows all
k api-resources -h # help always good
k api-resources --namespaced -o name > /opt/course/16/resources.txt
Count the Roles in a Namespace:
k -n project-c13 get role --no-headers | wc -l
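To check every project-* Namespace in one go, a small loop works (a sketch; the namespace names come from the cluster itself):
for ns in $(k get ns -o name | grep project- | cut -d/ -f2); do
  echo -n "$ns: "
  k -n "$ns" get role --no-headers | wc -l
done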
Question 17 | Find Container of Pod and check info
In Namespace project-tiger create a Pod named tigers-reunite of image httpd:2.4.41-alpine with labels pod=container and container=pod. Find out on which node the Pod is scheduled. Ssh into that node and find the containerd container belonging to that Pod.
Using command crictl:
- Write the ID of the container and the info.runtimeType into /opt/course/17/pod-container.txt
- Write the logs of the container into /opt/course/17/pod-container.log
As the question says, first create the Pod, then find the node it is scheduled on.
k -n project-tiger run tigers-reunite \
--image=httpd:2.4.41-alpine \
--labels "pod=container,container=pod"
k -n project-tiger get pod -o wide
Log in to that node to get the container information; crictl here works basically like the docker CLI:
➜ ssh cluster1-node2
➜ root@cluster1-node2:~# crictl ps | grep tigers-reunite
b01edbe6f89ed 54b0995a63052 5 seconds ago Running tigers-reunite ...
➜ root@cluster1-node2:~# crictl inspect b01edbe6f89ed | grep runtimeType
"runtimeType": "io.containerd.runc.v2",
# fetch the logs and write them into the answer file
ssh cluster1-node2 'crictl logs b01edbe6f89ed' &> /opt/course/17/pod-container.log
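The first answer file then simply contains the container ID and the runtimeType found above:
# /opt/course/17/pod-container.txt
b01edbe6f89ed io.containerd.runc.v2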
Question 18 | Fix Kubelet
There seems to be an issue with the kubelet not running on cluster3-node1. Fix it and confirm that cluster has node cluster3-node1 available in Ready state afterwards. You should be able to schedule a Pod on cluster3-node1 afterwards.
Write the reason of the issue into /opt/course/18/reason.txt.
Find out why the kubelet is not running on the node:
service kubelet status
➜ root@cluster3-node1:~# /usr/local/bin/kubelet
-bash: /usr/local/bin/kubelet: No such file or directory
➜ root@cluster3-node1:~# whereis kubelet
kubelet: /usr/bin/kubelet
# /opt/course/18/reason.txt
wrong path to kubelet binary specified in service config
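To actually fix it, point the systemd unit at the correct binary and restart the kubelet (on kubeadm setups the drop-in is usually /etc/systemd/system/kubelet.service.d/10-kubeadm.conf; the exact file may differ):
vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf   # change /usr/local/bin/kubelet to /usr/bin/kubelet
systemctl daemon-reload
systemctl restart kubelet
systemctl status kubelet   # should be active (running)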
Question 19 | Create Secret and mount into Pod
Do the following in a new Namespace secret. Create a Pod named secret-pod of image busybox:1.31.1 which should keep running for some time.
There is an existing Secret located at /opt/course/19/secret1.yaml, create it in the Namespace secret and mount it readonly into the Pod at /tmp/secret1.
Create a new Secret in Namespace secret called secret2 which should contain user=user1 and pass=1234. These entries should be available inside the Pod's container as environment variables APP_USER and APP_PASS.
Confirm everything is working.
A quick look at the question: it is about Secrets, one mounted read-only into the Pod and one injected as environment variables.
The Secret file:
# 19_secret1.yaml
apiVersion: v1
data:
halt: IyEgL2Jpbi9zaAo...
kind: Secret
metadata:
creationTimestamp: null
name: secret1
namespace: secret # change
Then create both Secrets:
k -f 19_secret1.yaml create
k -n secret create secret generic secret2 --from-literal=user=user1 --from-literal=pass=1234
Use a dry run to create the YAML template:
k -n secret run secret-pod --image=busybox:1.31.1 $do -- sh -c "sleep 5d" > 19.yaml
Modify the YAML as below; there are two parts, one importing Secret keys into ENV variables, the other mounting the Secret as a read-only file.
# 19.yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
run: secret-pod
name: secret-pod
namespace: secret # add
spec:
containers:
- args:
- sh
- -c
- sleep 1d
image: busybox:1.31.1
name: secret-pod
resources: {}
env: # add
- name: APP_USER # add
valueFrom: # add
secretKeyRef: # add
name: secret2 # add
key: user # add
- name: APP_PASS # add
valueFrom: # add
secretKeyRef: # add
name: secret2 # add
key: pass # add
volumeMounts: # add
- name: secret1 # add
mountPath: /tmp/secret1 # add
readOnly: true # add
dnsPolicy: ClusterFirst
restartPolicy: Always
volumes: # add
- name: secret1 # add
secret: # add
secretName: secret1 # add
status: {}
Question 20 | Update Kubernetes Version and join cluster
Your coworker said node cluster3-node2 is running an older Kubernetes version and is not even part of the cluster. Update Kubernetes on that node to the exact version that's running on cluster3-controlplane1. Then add this node to the cluster. Use kubeadm for this.
A node upgrade: first try kubeadm upgrade node.
But it errors out; by the look of it, the Kubernetes components are not installed on the node at all, so they need to be installed and the node joined to the cluster manually.
couldn't create a Kubernetes client from file "/etc/kubernetes/kubelet.conf": failed to load admin kubeconfig: open /etc/kubernetes/kubelet.conf: no such file or directory
To see the stack trace of this error execute with --v=5 or higher
The install commands are:
root@cluster3-node2:~# apt show kubectl -a | grep 1.26
apt install kubectl=1.26.0-00 kubelet=1.26.0-00
kubelet --version
service kubelet restart
service kubelet status
# the status output shows errors because the node has not joined the cluster yet
Create a new join token on the controlplane:
➜ ssh cluster3-controlplane1
➜ root@cluster3-controlplane1:~# kubeadm token create --print-join-command
kubeadm join 192.168.100.31:6443 --token rbhrjh.4o93r31o18an6dll --discovery-token-ca-cert-hash sha256:d94524f9ab1eed84417414c7def5c1608f84dbf04437d9f5f73eb6255dafdb18
➜ root@cluster3-controlplane1:~# kubeadm token list
Then run the join command on the node to join it to the cluster:
kubeadm join 192.168.100.31:6443 --token rbhrjh.4o93r31o18an6dll --discovery-token-ca-cert-hash
Question 21 | Create a Static Pod and Service
Create a Static Pod named my-static-pod in Namespace default on cluster3-controlplane1. It should be of image nginx:1.16-alpine and have resource requests for 10m CPU and 20Mi memory.
Then create a NodePort Service named static-pod-service which exposes that static Pod on port 80 and check if it has Endpoints and if it's reachable through the cluster3-controlplane1 internal IP address. You can connect to the internal node IPs from your main terminal.
Creating a static Pod; the background was covered earlier. A static Pod is not controlled through the kube API but directly by the local kubelet. Creating one is simple: put the manifest into the manifests directory and it gets created.
➜ ssh cluster3-controlplane1
➜ root@cluster3-controlplane1:~# cd /etc/kubernetes/manifests/
➜ root@cluster3-controlplane1:~# kubectl run my-static-pod \
--image=nginx:1.16-alpine \
-o yaml --dry-run=client > my-static-pod.yaml
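The generated file still needs the resource requests added under the container, and once the kubelet has started the mirror Pod, the Service can be created with kubectl expose (a sketch; the mirror Pod name carries the node hostname as suffix, and run=my-static-pod is the label kubectl run sets):
# add in my-static-pod.yaml under the container:
#   resources:
#     requests:
#       cpu: 10m
#       memory: 20Mi
kubectl expose pod my-static-pod-cluster3-controlplane1 \
  --name static-pod-service \
  --type=NodePort \
  --port 80
kubectl get svc,ep -l run=my-static-pod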
Question 22 | Check how long certificates are valid
Check how long the kube-apiserver server certificate is valid on cluster2-controlplane1. Do this with openssl or cfssl. Write the expiration date into /opt/course/22/expiration.
Also run the correct kubeadm command to list the expiration dates and confirm both methods show the same date.
Write the correct kubeadm command that would renew the apiserver server certificate into /opt/course/22/kubeadm-renew-certs.sh.
➜ ssh cluster2-controlplane1
➜ root@cluster2-controlplane1:~# find /etc/kubernetes/pki | grep apiserver
# check it with openssl
openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt | grep Validity -A2
Use kubeadm to check the certificate expiration and to renew it:
➜ root@cluster2-controlplane1:~# kubeadm certs check-expiration | grep apiserver
kubeadm certs renew apiserver
Question 23 | Kubelet client/server cert info
Node cluster2-node1 has been added to the cluster using kubeadm and TLS bootstrapping.
Find the "Issuer" and "Extended Key Usage" values of the cluster2-node1:
- kubelet client certificate, the one used for outgoing connections to the kube-apiserver.
- kubelet server certificate, the one used for incoming connections from the kube-apiserver.
Write the information into file /opt/course/23/certificate-info.txt.
Compare the "Issuer" and "Extended Key Usage" fields of both certificates and make sense of these.
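I didn't write up a solution for this one; a sketch of how it can be checked with openssl, assuming the default kubeadm kubelet certificate locations on the node:
ssh cluster2-node1
# kubelet client certificate (outgoing connections to the kube-apiserver)
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep Issuer
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep "Extended Key Usage" -A1
# kubelet server certificate (incoming connections from the kube-apiserver)
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep Issuer
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep "Extended Key Usage" -A1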
Question 24 | NetworkPolicy
There was a security incident where an intruder was able to access the whole cluster from a single hacked backend Pod.
To prevent this create a NetworkPolicy called np-backend in Namespace project-snake. It should allow the backend-* Pods only to:
- connect to db1-* Pods on port 1111
- connect to db2-* Pods on port 2222
Use the app label of Pods in your policy.
After implementation, connections from backend-* Pods to vault-* Pods on port 3333 should for example no longer work.
Configure a NetworkPolicy that restricts the backend Pods' egress to the specified destinations.
# first look at the current Pods and their app labels
➜ k -n project-snake get pod -L app
# get the current Pod IPs
➜ k -n project-snake get pod -o wide
# test access with exec; everything is currently reachable
➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.25:1111
database one
➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.23:2222
database two
➜ k -n project-snake exec backend-0 -- curl -s 10.44.0.22:3333
vault secret storage
Write the NetworkPolicy; the config is listed below. Since it targets the backend Pods, it is their egress that gets restricted.
# 24_np.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-backend
namespace: project-snake
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Egress # policy is only about Egress
egress:
- # first rule
to: # first condition "to"
- podSelector:
matchLabels:
app: db1
ports: # second condition "port"
- protocol: TCP
port: 1111
- # second rule
to: # first condition "to"
- podSelector:
matchLabels:
app: db2
ports: # second condition "port"
- protocol: TCP
port: 2222
Question 25 | Etcd Snapshot Save and Restore
Make a backup of etcd running on cluster3-controlplane1 and save it on the controlplane node at /tmp/etcd-backup.db.
Then create a Pod of your kind in the cluster.
Finally restore the backup, confirm the cluster is still working and that the created Pod is no longer with us.
Etcd backup and restore. Etcd may be installed as a system service or inside the cluster as a static Pod, so check both the systemd directory and the manifests directory to find its configuration and the key details such as the certificate paths:
➜ root@cluster3-controlplane1:~# cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep etcd
- --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
- --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
- --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
- --etcd-servers=https://127.0.0.1:2379
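With those paths, the backup and restore look roughly like this (a hedged sketch; verify the data dir and the cert flags against the etcd manifest in the exam environment):
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /tmp/etcd-backup.db
# restore into a fresh data dir, then point the etcd static Pod manifest at it
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db --data-dir /var/lib/etcd-backup
# edit /etc/kubernetes/manifests/etcd.yaml: change the etcd-data hostPath to /var/lib/etcd-backup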
TIPS
Components
- Understanding Kubernetes components and being able to fix and investigate clusters: https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster
- Know advanced scheduling: https://kubernetes.io/docs/concepts/scheduling/kube-scheduler
- When you have to fix a component (like kubelet) in one cluster, just check how it's set up on another node in the same or even another cluster. You can copy config files over, etc.
- If you like you can look at Kubernetes The Hard Way once. But it’s NOT necessary to do, the CKA is not that complex. But KTHW helps understanding the concepts
- You should install your own cluster using kubeadm (one controlplane, one worker) in a VM or using a cloud provider and investigate the components
- Know how to use kubeadm to, for example, add nodes to a cluster
- Know how to create Ingress resources
- Know how to snapshot/restore ETCD from another machine
CKA Preparation
Read the Curriculum
https://github.com/cncf/curriculum
Read the Handbook
https://docs.linuxfoundation.org/tc-docs/certification/lf-candidate-handbook
Read the important tips
https://docs.linuxfoundation.org/tc-docs/certification/tips-cka-and-ckad
Read the FAQ
https://docs.linuxfoundation.org/tc-docs/certification/faq-cka-ckad