---
title: "kube-scheduler-simulator: Making the Scheduler Observable, Debuggable, and Extensible"
summary: "This post introduces kube-scheduler-simulator, a simulator for the Kubernetes scheduler that exposes the scheduler's internal decisions. It can be used to test scheduling constraints, scheduler configurations, and custom plugins, tackling the hard problem of scheduler testing. It requires only Docker to run, is developed by Kubernetes SIG Scheduling, and welcomes community participation."
authors: ["Kensei Nakada"]
translators: ["云原生社区"]
categories: ["Kubernetes"]
tags: ["Kubernetes"]
draft: false
date: 2025-04-08T14:35:56+08:00
links:
  - icon: language
    icon_pack: fa
    name: Read the original English post
    url: https://kubernetes.io/blog/2025/04/07/introducing-kube-scheduler-simulator/
---

The Kubernetes Scheduler is a core control plane component that decides which node each Pod runs on. In other words, everyone using Kubernetes has the fate of their Pods in the scheduler's hands.

[kube-scheduler-simulator](https://github.com/kubernetes-sigs/kube-scheduler-simulator) is a "simulator" for the Kubernetes scheduler that I (Kensei Nakada) started as a [Google Summer of Code 2021](https://summerofcode.withgoogle.com/) project, and that has since been supported by many contributors. Its goal is to help users look deeply into the scheduler's behavior and decision-making.

Whether you are an ordinary user relying on scheduling constraints such as [Pod affinity](https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity), or a scheduler expert developing custom scheduler plugins, this simulator is a valuable aid for understanding and testing scheduling behavior.

## Why do we need a scheduler simulator?

The Kubernetes Scheduler is essentially a plugin-driven "black box": each plugin contributes to scheduling decisions from a different angle, which makes the scheduler's behavior hard to understand.

Even if you see Pods scheduling normally in a test cluster, there is no guarantee they were scheduled by the logic you intended. This gap between "looks fine" and "actually deviates" can produce unexpected scheduling outcomes in production.

Scheduler testing is also genuinely challenging. Real clusters present scheduling scenarios far too varied to be covered by a finite set of test cases; even the upstream Kubernetes scheduler frequently has issues discovered by users only after a release.

Development or sandbox clusters are the usual testing approach, but such environments are typically small and run limited workloads, so their behavior differs enormously from a real production cluster. With traditional methods alone, it is hard to predict how the scheduler will behave in the real world.

kube-scheduler-simulator was created precisely to fill this gap:

- Users can verify the behavior of scheduling constraints, scheduler configurations, and custom plugins;
- Scheduling can be tested in a simulated environment without affecting real workloads;
- Every decision made during scheduling can be observed, truly "turning the scheduler into a white box".

## What can the simulator do?

The core capability of kube-scheduler-simulator is to **reveal the scheduler's internal decision-making**. The Kubernetes Scheduler is built on the [Scheduling Framework](https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/scheduling-framework/): the scheduling flow is divided into extension points such as Filter, Score, and Bind, and at each stage the corresponding plugins are invoked to filter and score nodes.

The simulator provides a web UI in which you can create Kubernetes resources such as Pods, Nodes, and Deployments, and clearly see each scheduler plugin's results and scores.



In the simulator we run a **debuggable version of the scheduler**, which writes each plugin's results at every stage into the Pod's annotations; the frontend then visualizes this information.

For example, you can see how the scheduler decided in the Filter phase whether a node was feasible, what score each plugin gave in the Score phase, and which node was ultimately selected. All of this can be read from the Pod's annotations:

```yaml
kind: Pod
apiVersion: v1
metadata:
  # The JSONs within these annotations are manually formatted for clarity in the blog post.
  annotations:
    kube-scheduler-simulator.sigs.k8s.io/bind-result: '{"DefaultBinder":"success"}'
    kube-scheduler-simulator.sigs.k8s.io/filter-result: >-
      {
        "node-jjfg5":{
          "NodeName":"passed",
          "NodeResourcesFit":"passed",
          "NodeUnschedulable":"passed",
          "TaintToleration":"passed"
        },
        "node-mtb5x":{
          "NodeName":"passed",
          "NodeResourcesFit":"passed",
          "NodeUnschedulable":"passed",
          "TaintToleration":"passed"
        }
      }
    kube-scheduler-simulator.sigs.k8s.io/finalscore-result: >-
      {
        "node-jjfg5":{
          "ImageLocality":"0",
          "NodeAffinity":"0",
          "NodeResourcesBalancedAllocation":"52",
          "NodeResourcesFit":"47",
          "TaintToleration":"300",
          "VolumeBinding":"0"
        },
        "node-mtb5x":{
          "ImageLocality":"0",
          "NodeAffinity":"0",
          "NodeResourcesBalancedAllocation":"76",
          "NodeResourcesFit":"73",
          "TaintToleration":"300",
          "VolumeBinding":"0"
        }
      }
    kube-scheduler-simulator.sigs.k8s.io/permit-result: '{}'
    kube-scheduler-simulator.sigs.k8s.io/permit-result-timeout: '{}'
    kube-scheduler-simulator.sigs.k8s.io/postfilter-result: '{}'
    kube-scheduler-simulator.sigs.k8s.io/prebind-result: '{"VolumeBinding":"success"}'
    kube-scheduler-simulator.sigs.k8s.io/prefilter-result: '{}'
    kube-scheduler-simulator.sigs.k8s.io/prefilter-result-status: >-
      {
        "AzureDiskLimits":"",
        "EBSLimits":"",
        "GCEPDLimits":"",
        "InterPodAffinity":"",
        "NodeAffinity":"",
        "NodePorts":"",
        "NodeResourcesFit":"success",
        "NodeVolumeLimits":"",
        "PodTopologySpread":"",
        "VolumeBinding":"",
        "VolumeRestrictions":"",
        "VolumeZone":""
      }
    kube-scheduler-simulator.sigs.k8s.io/prescore-result: >-
      {
        "InterPodAffinity":"",
        "NodeAffinity":"success",
        "NodeResourcesBalancedAllocation":"success",
        "NodeResourcesFit":"success",
        "PodTopologySpread":"",
        "TaintToleration":"success"
      }
    kube-scheduler-simulator.sigs.k8s.io/reserve-result: '{"VolumeBinding":"success"}'
    kube-scheduler-simulator.sigs.k8s.io/result-history: >-
      [
        {
          "kube-scheduler-simulator.sigs.k8s.io/bind-result":"{\"DefaultBinder\":\"success\"}",
          "kube-scheduler-simulator.sigs.k8s.io/filter-result":"{\"node-jjfg5\":{\"NodeName\":\"passed\",\"NodeResourcesFit\":\"passed\",\"NodeUnschedulable\":\"passed\",\"TaintToleration\":\"passed\"},\"node-mtb5x\":{\"NodeName\":\"passed\",\"NodeResourcesFit\":\"passed\",\"NodeUnschedulable\":\"passed\",\"TaintToleration\":\"passed\"}}",
          "kube-scheduler-simulator.sigs.k8s.io/finalscore-result":"{\"node-jjfg5\":{\"ImageLocality\":\"0\",\"NodeAffinity\":\"0\",\"NodeResourcesBalancedAllocation\":\"52\",\"NodeResourcesFit\":\"47\",\"TaintToleration\":\"300\",\"VolumeBinding\":\"0\"},\"node-mtb5x\":{\"ImageLocality\":\"0\",\"NodeAffinity\":\"0\",\"NodeResourcesBalancedAllocation\":\"76\",\"NodeResourcesFit\":\"73\",\"TaintToleration\":\"300\",\"VolumeBinding\":\"0\"}}",
          "kube-scheduler-simulator.sigs.k8s.io/permit-result":"{}",
          "kube-scheduler-simulator.sigs.k8s.io/permit-result-timeout":"{}",
          "kube-scheduler-simulator.sigs.k8s.io/postfilter-result":"{}",
          "kube-scheduler-simulator.sigs.k8s.io/prebind-result":"{\"VolumeBinding\":\"success\"}",
          "kube-scheduler-simulator.sigs.k8s.io/prefilter-result":"{}",
          "kube-scheduler-simulator.sigs.k8s.io/prefilter-result-status":"{\"AzureDiskLimits\":\"\",\"EBSLimits\":\"\",\"GCEPDLimits\":\"\",\"InterPodAffinity\":\"\",\"NodeAffinity\":\"\",\"NodePorts\":\"\",\"NodeResourcesFit\":\"success\",\"NodeVolumeLimits\":\"\",\"PodTopologySpread\":\"\",\"VolumeBinding\":\"\",\"VolumeRestrictions\":\"\",\"VolumeZone\":\"\"}",
          "kube-scheduler-simulator.sigs.k8s.io/prescore-result":"{\"InterPodAffinity\":\"\",\"NodeAffinity\":\"success\",\"NodeResourcesBalancedAllocation\":\"success\",\"NodeResourcesFit\":\"success\",\"PodTopologySpread\":\"\",\"TaintToleration\":\"success\"}",
          "kube-scheduler-simulator.sigs.k8s.io/reserve-result":"{\"VolumeBinding\":\"success\"}",
          "kube-scheduler-simulator.sigs.k8s.io/score-result":"{\"node-jjfg5\":{\"ImageLocality\":\"0\",\"NodeAffinity\":\"0\",\"NodeResourcesBalancedAllocation\":\"52\",\"NodeResourcesFit\":\"47\",\"TaintToleration\":\"0\",\"VolumeBinding\":\"0\"},\"node-mtb5x\":{\"ImageLocality\":\"0\",\"NodeAffinity\":\"0\",\"NodeResourcesBalancedAllocation\":\"76\",\"NodeResourcesFit\":\"73\",\"TaintToleration\":\"0\",\"VolumeBinding\":\"0\"}}",
          "kube-scheduler-simulator.sigs.k8s.io/selected-node":"node-mtb5x"
        }
      ]
    kube-scheduler-simulator.sigs.k8s.io/score-result: >-
      {
        "node-jjfg5":{
          "ImageLocality":"0",
          "NodeAffinity":"0",
          "NodeResourcesBalancedAllocation":"52",
          "NodeResourcesFit":"47",
          "TaintToleration":"0",
          "VolumeBinding":"0"
        },
        "node-mtb5x":{
          "ImageLocality":"0",
          "NodeAffinity":"0",
          "NodeResourcesBalancedAllocation":"76",
          "NodeResourcesFit":"73",
          "TaintToleration":"0",
          "VolumeBinding":"0"
        }
      }
    kube-scheduler-simulator.sigs.k8s.io/selected-node: node-mtb5x
```
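
Since these results are plain annotations, you can also read them without the web UI. A minimal sketch using kubectl (the pod name `sample-pod` is a placeholder; dots inside the annotation key must be escaped in JSONPath):

```bash
# Print the Score-phase results that the debuggable scheduler
# recorded on the pod (key dots escaped for JSONPath).
kubectl get pod sample-pod \
  -o jsonpath='{.metadata.annotations.kube-scheduler-simulator\.sigs\.k8s\.io/score-result}'
```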

Beyond that, you can integrate your own [scheduler plugins](https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/scheduling-framework/) or [extenders](https://github.com/kubernetes/design-proposals-archive/blob/main/scheduling/scheduler_extender.md) and observe their behavior in the simulator, which makes debugging them far easier.
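
To make the plugin model concrete, here is a minimal sketch of a custom Filter plugin written against the scheduling framework interfaces. The plugin name `AlwaysPass` and the package layout are illustrative, the import path may differ across Kubernetes versions, and wiring it into the simulator follows the project's custom-plugin docs:

```go
package main

import (
	"context"

	v1 "k8s.io/api/core/v1"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

// alwaysPass is a toy Filter plugin that approves every node.
// A real plugin would inspect nodeInfo and return
// framework.NewStatus(framework.Unschedulable, "reason") to reject one.
type alwaysPass struct{}

// Compile-time check that alwaysPass satisfies the Filter extension point.
var _ framework.FilterPlugin = &alwaysPass{}

// Name identifies the plugin in the scheduler configuration and in the
// simulator's per-plugin result annotations shown above.
func (p *alwaysPass) Name() string { return "AlwaysPass" }

// Filter runs once per candidate node during the Filter phase.
func (p *alwaysPass) Filter(ctx context.Context, state *framework.CycleState,
	pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
	return nil // a nil status means the node passes this filter
}
```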

The debuggable scheduler can also run standalone, outside the simulator, for integration testing, custom scheduler development, or debugging in a real cluster.

## The simulator = a better development environment?

As mentioned above, real scheduling scenarios are complex and varied, and a development cluster alone cannot cover every possibility. The simulator offers a more powerful option:

With the [cluster resource importing feature](https://github.com/kubernetes-sigs/kube-scheduler-simulator/blob/master/simulator/docs/import-cluster-resources.md), you can sync your production cluster's resources into the simulator and **test a new scheduler version without affecting real workloads**.
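
As a rough sketch, importing is enabled in the simulator's configuration file. The field names below are assumptions based on my reading of the linked doc, so verify them against your simulator version:

```yaml
# Simulator configuration (field names assumed from the
# import-cluster-resources doc; confirm before relying on them).
externalImportEnabled: true                # import resources from an existing cluster at startup
kubeConfig: /path/to/prod-kubeconfig.yaml  # kubeconfig for the cluster to import from
```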

You can verify in the simulator that Pods schedule as expected before rolling the scheduler out to the production cluster, which greatly reduces the risk of scheduling-related changes.

## Use cases at a glance

1. **Cluster users**: verify that scheduling constraints (such as PodAffinity or TopologySpread) work as expected;
2. **Cluster administrators**: evaluate how scheduler configuration changes affect scheduling outcomes;
3. **Scheduler plugin developers**: test custom plugins in the simulator, and use the sync feature for more realistic validation.

## How to get started

The project needs no Kubernetes cluster; a local Docker installation is all that's required:

```bash
git clone git@github.com:kubernetes-sigs/kube-scheduler-simulator.git
cd kube-scheduler-simulator
make docker_up
```

By default the web UI runs at `http://localhost:3000`, and you can start your scheduling experiments there!

👉 Project repository: https://github.com/kubernetes-sigs/kube-scheduler-simulator

## How to contribute

The project is maintained by [Kubernetes SIG Scheduling](https://github.com/kubernetes/community/blob/master/sig-scheduling/README.md#kube-scheduler-simulator). Issues and PRs are welcome, as is joining the community for discussion.

Slack channel: [#sig-scheduling](https://kubernetes.slack.com/messages/sig-scheduling)

## Acknowledgements

This simulator would not be where it is without the persistence and contributions of many volunteer engineers. Thanks to all the [contributors](https://github.com/kubernetes-sigs/kube-scheduler-simulator/graphs/contributors) who have put their hearts into it!