Node Scheduling
Taint
A taint is nothing more than a property we add to a cluster node to keep pods that should not run there away from it. For example, every master node in the cluster is marked so that it does not receive pods outside of cluster management: the master carries the NoSchedule taint, so the Kubernetes scheduler will not place pods on it and will instead look for other nodes in the cluster that do not carry that mark.
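For reference, the general form of the command for adding a taint is shown below (a sketch; the angle-bracket placeholders are ours, and the effect is one of NoSchedule, PreferNoSchedule or NoExecute):
$ kubectl taint nodes <node-name> <key>=<value>:<effect>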
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
elliot-01 Ready master 7d14h v1.18.2
elliot-02 Ready <none> 7d14h v1.18.2
elliot-03 Ready <none> 7d14h v1.18.2
$ kubectl describe node elliot-01 | grep -i taint
Taints: node-role.kubernetes.io/master:NoSchedule
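If a pod really does need to land on a tainted node, the taint does not have to be removed: the pod can declare a matching toleration instead. A minimal sketch of the fragment that would go under the pod's spec to tolerate the default master taint shown above (the rest of the manifest is omitted):
  tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule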
We are going to test a few things and eventually allow the master node to run other pods as well. First, let's run 3 replicas of nginx.
$ kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
$ kubectl get deployments.apps
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 1/1 1 1 5s
$ kubectl scale deployment nginx --replicas=3
deployment.apps/nginx scaled
$ kubectl get deployments.apps
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 3/3 3 3 1m5s
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
limit-pod 1/1 Running 0 3m44s 10.32.0.4 elliot-02 <none> <none>
nginx 1/1 Running 0 25m 10.46.0.1 elliot-03 <none> <none>
nginx-85f7fb6b45-9bzwc 1/1 Running 0 6m7s 10.32.0.3 elliot-02 <none> <none>
nginx-85f7fb6b45-cbmtr 1/1 Running 0 6m7s 10.46.0.2 elliot-03 <none> <none>
nginx-85f7fb6b45-rprz5 1/1 Running 0 6m7s 10.32.0.2 elliot-02 <none> <none>
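If you only care about which node each pod landed on, a custom-columns query gives a more compact view (a sketch):
$ kubectl get pods -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName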
Let's add a NoSchedule taint to the worker nodes as well and see how they behave.
$ kubectl taint node elliot-02 key1=value1:NoSchedule
node/elliot-02 tainted
$ kubectl describe node elliot-02 | grep -i taint
Taints: key1=value1:NoSchedule
$ kubectl taint node elliot-03 key1=value1:NoSchedule
node/elliot-03 tainted
$ kubectl describe node elliot-03 | grep -i taint
Taints: key1=value1:NoSchedule
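Instead of describing each node separately, the taints on every node can be listed in one go (a sketch using kubectl's jsonpath output):
$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'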
Now let's increase the number of replicas:
$ kubectl scale deployment nginx --replicas=5
deployment.apps/nginx scaled
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
limit-pod 1/1 Running 0 5m23s 10.32.0.4 elliot-02 <none> <none>
nginx 1/1 Running 0 27m 10.46.0.1 elliot-03 <none> <none>
nginx-85f7fb6b45-9bzwc 1/1 Running 0 7m46s 10.32.0.3 elliot-02 <none> <none>
nginx-85f7fb6b45-cbmtr 1/1 Running 0 7m46s 10.46.0.2 elliot-03 <none> <none>
nginx-85f7fb6b45-qnhtl 0/1 Pending 0 18s <none> <none> <none> <none>
nginx-85f7fb6b45-qsvpp 0/1 Pending 0 18s <none> <none> <none> <none>
nginx-85f7fb6b45-rprz5 1/1 Running 0 7m46s 10.32.0.2 elliot-02 <none> <none>
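To confirm why the two new replicas cannot be scheduled, we can describe one of them; the Events section should report that the available nodes carry a taint the pod does not tolerate (output omitted here):
$ kubectl describe pod nginx-85f7fb6b45-qnhtl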
As we can see, the new replicas are stuck in Pending, waiting for a node the scheduler is allowed to place them on. Let's remove the taint from our worker nodes.
$ kubectl taint node elliot-02 key1:NoSchedule-
node/elliot-02 untainted
$ kubectl taint node elliot-03 key1:NoSchedule-
node/elliot-03 untainted
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
limit-pod 1/1 Running 0 6m17s 10.32.0.4 elliot-02 <none> <none>
nginx 1/1 Running 0 27m 10.46.0.1 elliot-03 <none> <none>
nginx-85f7fb6b45-9bzwc 1/1 Running 0 8m40s 10.32.0.3 elliot-02 <none> <none>
nginx-85f7fb6b45-cbmtr 1/1 Running 0 8m40s 10.46.0.2 elliot-03 <none> <none>
nginx-85f7fb6b45-qnhtl 1/1 Running 0 72s 10.46.0.5 elliot-03 <none> <none>
nginx-85f7fb6b45-qsvpp 1/1 Running 0 72s 10.46.0.4 elliot-03 <none> <none>
nginx-85f7fb6b45-rprz5 1/1 Running 0 8m40s 10.32.0.2 elliot-02 <none> <none>
There are several taint effects we can use to classify our nodes. Let's test another one, NoExecute, which not only prevents the scheduler from placing new pods on these nodes but also evicts the pods that are already running on them.
$ kubectl taint node elliot-02 key1=value1:NoExecute
node/elliot-02 tainted
$ kubectl taint node elliot-03 key1=value1:NoExecute
node/elliot-03 tainted
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-85f7fb6b45-87sq5 0/1 Pending 0 20s
nginx-85f7fb6b45-8q99g 0/1 Pending 0 20s
nginx-85f7fb6b45-drmzz 0/1 Pending 0 20s
nginx-85f7fb6b45-hb4dp 0/1 Pending 0 20s
nginx-85f7fb6b45-l6zln 0/1 Pending 0 20s
As we can see, all of the pods are now Pending: the master node still carries Kubernetes' default NoSchedule taint, and the worker nodes now carry the NoExecute taint. Let's reduce the number of replicas and see what happens.
$ kubectl scale deployment nginx --replicas=1
deployment.apps/nginx scaled
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-85f7fb6b45-drmzz 0/1 Pending 0 43s
$ kubectl taint node elliot-02 key1:NoExecute-
node/elliot-02 untainted
$ kubectl taint node elliot-03 key1:NoExecute-
node/elliot-03 untainted
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-85f7fb6b45-drmzz 1/1 Running 0 76s
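As an aside, instead of removing the NoExecute taint we could have taught the Deployment to tolerate it; with tolerationSeconds the pods are only evicted after a grace period once the taint appears. A minimal sketch of the fragment that would go into the pod template (the values here are illustrative):
      tolerations:
      - key: key1
        operator: Equal
        value: value1
        effect: NoExecute
        tolerationSeconds: 60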
We are back to a single pod running normally. But what if our worker nodes become unavailable, can we run pods on the master node? Of course we can. Let's configure our master node so the scheduler is able to place pods on it.
$ kubectl taint nodes --all node-role.kubernetes.io/master-
node/elliot-01 untainted
$ kubectl describe node elliot-01 | grep -i taint
Taints: <none>
$ kubectl scale deployment nginx --replicas=4
deployment.apps/nginx scaled
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-85f7fb6b45-2c6dm 1/1 Running 0 9s 10.32.0.2 elliot-02 <none> <none>
nginx-85f7fb6b45-4jzcn 1/1 Running 0 9s 10.32.0.3 elliot-02 <none> <none>
nginx-85f7fb6b45-drmzz 1/1 Running 0 114s 10.46.0.1 elliot-03 <none> <none>
nginx-85f7fb6b45-rstvq 1/1 Running 0 9s 10.46.0.2 elliot-03 <none> <none>
Now let's add the NoExecute taint to the worker nodes and see what happens.
$ kubectl taint node elliot-02 key1=value1:NoExecute
node/elliot-02 tainted
$ kubectl taint node elliot-03 key1=value1:NoExecute
node/elliot-03 tainted
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-85f7fb6b45-49knz 1/1 Running 0 14s 10.40.0.5 elliot-01 <none> <none>
nginx-85f7fb6b45-4cm9x 1/1 Running 0 14s 10.40.0.4 elliot-01 <none> <none>
nginx-85f7fb6b45-kppnd 1/1 Running 0 14s 10.40.0.6 elliot-01 <none> <none>
nginx-85f7fb6b45-rjlmj 1/1 Running 0 14s 10.40.0.3 elliot-01 <none> <none>
The scheduler assigned everything to the master node. As we can see, taints can be used to fine-tune which pods are allowed to land on which nodes. Now let's allow the scheduler to place and run pods on all nodes again by removing the key1 taints from every node in the cluster.
$ kubectl taint node --all key1:NoSchedule-
node/elliot-01 untainted
node/elliot-02 untainted
node/elliot-03 untainted
$ kubectl taint node --all key1:NoExecute-
node/elliot-02 untainted
node/elliot-03 untainted
error: taint "key1:NoExecute" not found
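The error simply tells us that one of the nodes, the master in our case, did not have the key1:NoExecute taint in the first place, which is harmless. And should you ever want to restore the default behaviour of the master node, the taint we removed earlier can be added back (a sketch; note the empty value before the colon):
$ kubectl taint node elliot-01 node-role.kubernetes.io/master=:NoSchedule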
Putting a Node into Maintenance Mode
To put a node into maintenance mode, we use cordon.
$ kubectl cordon elliot-02
node/elliot-02 cordoned
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
elliot-01 Ready master 7d14h v1.18.2
elliot-02 Ready,SchedulingDisabled <none> 7d14h v1.18.2
elliot-03 Ready <none> 7d14h v1.18.2
Notice that node elliot-02 now has the status Ready,SchedulingDisabled; you can now perform maintenance on the node without new pods being scheduled onto it. To take a node out of maintenance mode, we use uncordon.
$ kubectl uncordon elliot-02
node/elliot-02 uncordoned
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
elliot-01 Ready master 7d14h v1.18.2
elliot-02 Ready <none> 7d14h v1.18.2
elliot-03 Ready <none> 7d14h v1.18.2
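Keep in mind that cordon only blocks new pods from being scheduled; pods already running on the node stay where they are. For a real maintenance window you would normally also drain the node, which cordons it and then evicts its pods (a sketch; DaemonSet pods have to be explicitly ignored, and extra flags may be needed depending on your workloads):
$ kubectl drain elliot-02 --ignore-daemonsets
$ kubectl uncordon elliot-02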
Selecting Nodes by Label
The node selector is a way of classifying our nodes. For example, our node elliot-02 has an SSD disk and sits in the UK data center, while node elliot-03 has an HDD disk and sits in the Netherlands data center. With this information in hand, let's create these labels on the nodes so we can use them with nodeSelector.
$ kubectl label node elliot-02 disk=SSD
$ kubectl label node elliot-02 dc=UK
$ kubectl label node elliot-03 dc=Netherlands
$ kubectl label nodes elliot-03 disk=hdd
$ kubectl label nodes elliot-03 disk=HDD --overwrite
To see which labels are configured on each node, just run the following commands.
$ kubectl label nodes elliot-02 --list
dc=UK
disk=SSD
kubernetes.io/hostname=elliot-02
beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
$ kubectl label nodes elliot-03 --list
beta.kubernetes.io/os=linux
dc=Netherlands
disk=HDD
kubernetes.io/hostname=elliot-03
beta.kubernetes.io/arch=amd64
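These labels can also be used directly as selectors when listing nodes, which is a quick way to check which machines match a given classification (a sketch):
$ kubectl get nodes -l disk=SSD
$ kubectl get nodes -l dc=Netherlands,disk=HDD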
Now we just run the deployment again, but first we'll add two new options to the YAML and watch the magic happen: our pod will be created on node elliot-02, which has the disk=SSD label.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: nginx
  name: terceiro-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      run: nginx
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: nginx
        dc: Netherlands
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: nginx2
        ports:
        - containerPort: 80
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      nodeSelector:
        disk: SSD
Create the deployment from the manifest:
$ kubectl create -f terceiro-deployment.yaml
deployment.apps/terceiro-deployment created
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
primeiro-deployment-56d9... 1/1 Running 0 14m 172.17.0.4 elliot-03
segundo-deployment-869f... 1/1 Running 0 14m 172.17.0.5 elliot-03
terceiro-deployment-59cd... 1/1 Running 0 22s 172.17.0.6 elliot-02
We can remove the labels as follows:
$ kubectl label nodes elliot-02 dc-
$ kubectl label nodes --all dc-
Now imagine the possibilities this gives you: whether a node is meant for production, whether the workload consumes a lot of CPU or a lot of memory, whether it needs to sit in a particular rack, and so on.
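For instance, purely hypothetical labels such as environment=production or rack=rack-42 (the keys and values here are only illustrative) could be applied in exactly the same way:
$ kubectl label node elliot-02 environment=production rack=rack-42
and then referenced from the Deployment's nodeSelector, just as we did with disk: SSD above:
  nodeSelector:
    environment: production
    rack: rack-42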