While Kubernetes is easy to get started with, it is quite challenging to master in every detail. In this post I provide a checklist of important manifest stanzas that apply to most applications that run in production and are expected to have no downtime during cluster maintenance or application updates.
Deploying to Kubernetes is easy: create a manifest with your Deployment and then kubectl apply it.
The most basic deployment manifest looks like this:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app.kubernetes.io/name: test-app
      name: test-app
      namespace: default
    spec:
      selector:
        matchLabels:
          app.kubernetes.io/name: test-app
      template:
        metadata:
          labels:
            app.kubernetes.io/name: test-app
        spec:
          containers:
          - name: controller
            image: nginx
It works as is, but you can improve the reliability of this deployment by configuring additional fields in the manifest.
Use the metadata field efficiently. You can add labels that say who owns the deployment, whether it is part of a bigger project, and so on.
This lets you discover all deployments that are owned by a specific department:

    kubectl get deploy -n production -l department=marketing
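As a sketch, ownership labels could be attached to the Deployment metadata as shown below. The department label matches the selector used in the command above. The part-of value is a hypothetical project name chosen only for illustration.

```yaml
# Fragment of the Deployment manifest: metadata
metadata:
  name: test-app
  namespace: production
  labels:
    app.kubernetes.io/name: test-app
    # Groups this app into a bigger project; "marketing-site" is a hypothetical value.
    app.kubernetes.io/part-of: marketing-site
    # Ownership label matched by: kubectl get deploy -l department=marketing
    department: marketing
```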
Correct configuration of resource requests and limits is important. Requests are used for scheduling: the kubelet reports each node's capacity, and the scheduler uses this information when it decides which node a pod is assigned to. Limits are enforced at runtime.
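A minimal sketch of requests and limits on the container from the basic manifest. The numbers are illustrative placeholders, not recommendations; real values should come from measuring the application.

```yaml
# Fragment of the Deployment manifest: spec.template.spec
containers:
- name: controller
  image: nginx
  resources:
    requests:          # what the scheduler reserves when it picks a node
      cpu: 100m
      memory: 128Mi
    limits:            # upper bound enforced at runtime
      cpu: 500m
      memory: 256Mi
```

A container that exceeds its CPU limit is throttled, while one that exceeds its memory limit is killed, so the memory limit usually deserves the more careful choice.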
Avoid the latest tag in images
I spoke about it previously. Use an exact version, e.g. nginx:1.19.2. This is better than latest, but it is even better to reference the image by its sha256 digest:
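A sketch of a digest-pinned image reference. The digest is deliberately left as a placeholder, because it must be copied from the exact image you tested (for example from the output of docker images --digests).

```yaml
# Fragment of the Deployment manifest: spec.template.spec
containers:
- name: controller
  # <digest> is a placeholder for the 64-character hex digest of the tested image
  image: nginx@sha256:<digest>
```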
The default strategy is RollingUpdate. If you run multiple replicas of the application, consider tuning maxSurge based on your requirements.
If your app has multiple replicas and each replica requires a lot of CPU or memory, a large maxSurge might require the autoscaler to provision extra nodes during a rollout.
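A sketch of an explicit rolling update configuration. The replica count and both values are assumptions chosen for illustration. maxUnavailable is the sibling field of maxSurge, and both default to 25% when omitted.

```yaml
# Fragment of the Deployment manifest: spec
replicas: 3
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1          # at most one extra pod above the desired replica count
    maxUnavailable: 0    # never go below the desired replica count during a rollout
```

With these values a rollout needs spare capacity for only one additional pod at a time, at the cost of a slower update.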
By default, each deployment uses the default service account. If the app requires access to the Kubernetes API, consider creating a separate service account for the app. This improves isolation and security:
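A minimal sketch of a dedicated service account and how the pod template refers to it. The account name test-app is an assumption that simply reuses the application name.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-app
  namespace: default
---
# Fragment of the Deployment manifest: spec.template.spec
serviceAccountName: test-app
```

The service account alone grants nothing. Access to the API still has to be given through RBAC, for example with a Role and a RoleBinding that reference this account.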
securityContext controls the security settings of the pod and its containers. It is recommended to enforce runAsNonRoot. See the documentation.
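A sketch of a pod-level security context. The UID is an illustrative assumption, and the image has to be built to work as a non-root user; many stock images, nginx included, start as root by default and would be rejected with this setting.

```yaml
# Fragment of the Deployment manifest: spec.template.spec
securityContext:
  runAsNonRoot: true    # containers that would run as UID 0 are not started
  runAsUser: 1000       # illustrative UID; must be usable by the image
```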
If you have multiple replicas of the application, create a PodDisruptionBudget for the Deployment. See the documentation for more details.
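A minimal sketch of a PodDisruptionBudget for the example Deployment. It assumes a cluster where the policy/v1 API is available (Kubernetes 1.21 or newer); older clusters use policy/v1beta1. The maxUnavailable value is an illustrative choice.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: test-app
  namespace: default
spec:
  maxUnavailable: 1     # voluntary disruptions may take down one pod at a time
  selector:
    matchLabels:
      app.kubernetes.io/name: test-app
```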
I cannot stress the importance of correctly configured probes enough. I have seen applications being taken down when they should not be, and vice versa. Your biggest nightmare is a rollout in which the new pods are crashlooping and the old pods are not terminated when they should be.
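Assuming this paragraph refers to liveness and readiness probes, a minimal sketch for the nginx container could look as follows. The path, port, and timings are assumptions for illustration and need to match what the application really serves.

```yaml
# Fragment of the Deployment manifest: spec.template.spec
containers:
- name: controller
  image: nginx
  ports:
  - containerPort: 80
  readinessProbe:         # pod receives traffic only while this succeeds
    httpGet:
      path: /
      port: 80
    periodSeconds: 5
    failureThreshold: 3
  livenessProbe:          # container is restarted when this keeps failing
    httpGet:
      path: /
      port: 80
    initialDelaySeconds: 10
    periodSeconds: 10
    failureThreshold: 3
```

During a rolling update, new pods that never become ready hold the rollout back instead of replacing healthy old pods.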
Lifecycle hooks allow you to terminate the application gracefully, i.e. finish the current request, save state and then terminate. See the documentation.
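A sketch of a preStop hook for the nginx container. The sleep duration and grace period are assumptions; the idea is to give load balancers a moment to stop sending traffic and then ask nginx to finish in-flight requests before exiting.

```yaml
# Fragment of the Deployment manifest: spec.template.spec
terminationGracePeriodSeconds: 30   # total time allowed for the hook plus shutdown
containers:
- name: controller
  image: nginx
  lifecycle:
    preStop:
      exec:
        # wait briefly, then request a graceful nginx shutdown
        command: ["/bin/sh", "-c", "sleep 5 && nginx -s quit"]
```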
Not all apps are created equal; some are more important than others. Consider defining priority classes and assigning the appropriate priority class to each application. Read more in the official documentation.
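A minimal sketch of a priority class and its use in the pod template. The class name, value, and description are hypothetical and only illustrate the shape of the objects.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical    # hypothetical class name
value: 1000000               # higher value means higher priority
globalDefault: false
description: "For applications that must keep running when the cluster is short on resources."
---
# Fragment of the Deployment manifest: spec.template.spec
priorityClassName: business-critical
```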
Sometimes there are specific requirements for where an application should or should not run. Taints let you mark nodes so that regular workloads are not scheduled on them. To allow a workload to be scheduled on tainted nodes, use tolerations. See the documentation.
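A sketch of a toleration that matches a taint. The key dedicated and the value marketing are hypothetical, and the node name in the comment is a placeholder.

```yaml
# Assumes a node was tainted with:
#   kubectl taint nodes <node-name> dedicated=marketing:NoSchedule
# Fragment of the Deployment manifest: spec.template.spec
tolerations:
- key: dedicated
  operator: Equal
  value: marketing
  effect: NoSchedule
```

A toleration only permits scheduling on the tainted nodes. It does not force the pods onto them, so it is usually combined with a node selector or node affinity.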
Affinities and anti-affinities give you more control over where the workload is scheduled. For example, you might use pod anti-affinity to distribute replicas of the application across different nodes or availability zones. See the documentation for details. Another great feature that you might need is Topology Manager for better allocation of the workload.
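A sketch of a soft pod anti-affinity rule that asks the scheduler to spread replicas of the example app across nodes. The weight is an illustrative choice.

```yaml
# Fragment of the Deployment manifest: spec.template.spec
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        # one replica per node where possible
        topologyKey: kubernetes.io/hostname
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: test-app
```

Changing topologyKey to topology.kubernetes.io/zone spreads the replicas across availability zones instead. The preferred form is a hint, so pods are still scheduled when the rule cannot be satisfied.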
The following documentation is a must-read for learning or refreshing your knowledge: