Kubernetes StatefulSet
StatefulSet¶
- A StatefulSet is the Kubernetes (k8s) component used specifically for stateful applications.
- Databases are a typical example of stateful applications.
Stateless applications are deployed using a Deployment, whereas stateful applications are deployed using a StatefulSet.
Pod Identity¶
- In a StatefulSet, the replica pods are not identical.
- Every pod has its own identifier.
  - Unlike a Deployment, where pod names end in random hashes, StatefulSet pods get fixed, ordered names (statefulsetname-ordinal).
  - So if you create a StatefulSet called mysql with 3 replicas, you will have pods named mysql-0, mysql-1 and mysql-2.
  - In this setup the first pod is the master and the following pods are the slaves, in order of startup.
  - The next pod is created only once the previous one is up and running.
  - Deletion starts from the last pod.
- Each pod has a persistent identifier, which it keeps when it is rescheduled.
  - This means that if the pod with ID 2 dies, a new pod with the same ID replaces it.
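The naming and ordering rules above can be sketched in a few lines of Python. This is only an illustration of the rules, not how Kubernetes implements them; the function names are made up for this note, and the ordering shown assumes the default ordered pod management.

```python
from typing import List


def pod_names(statefulset_name: str, replicas: int) -> List[str]:
    """Pod names in creation order: <statefulset name>-<ordinal>, ordinals start at 0."""
    return [f"{statefulset_name}-{ordinal}" for ordinal in range(replicas)]


def scale_down_order(statefulset_name: str, current: int, target: int) -> List[str]:
    """Pods removed when scaling from `current` to `target` replicas, highest ordinal first."""
    names = pod_names(statefulset_name, current)
    return list(reversed(names[target:]))


def replacement_for(pod_name: str) -> str:
    """A replacement pod reuses the name (identity) of the pod it replaces."""
    return pod_name


if __name__ == "__main__":
    print(pod_names("mysql", 3))            # creation order
    print(scale_down_order("mysql", 3, 1))  # deletion order when scaling 3 -> 1
    print(replacement_for("mysql-2"))       # same identity after a restart
```

With a Deployment there would be no such function to write: the suffix is a random hash, so a pod's name cannot be predicted from its position.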
Why is it difficult to scale database applications?¶
How are database applications scaled?
There can be only one master database that accepts writes.
- The pods don't share the same physical storage, even though they work with the same data.
- Each pod has its own replica of the storage, which it alone accesses.
- This means that, for a pod to have the same data as the other pods, the data must be synchronised.
- The worker pods (slaves) must learn about every change made on the master so that they stay up to date.
- When a new pod joins, it must take care of replicating the data.
  - It first clones the data from the previous pod and then starts synchronising.
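A toy in-memory model of this scheme, as a rough sketch: one master takes writes, each pod keeps its own copy of the data, and a new pod clones from the previous pod before catching up from a change log. The class names, the change log and the `sync` method are assumptions made for illustration; a real database uses its own replication protocol.

```python
class Pod:
    def __init__(self, name, is_master=False):
        self.name = name
        self.is_master = is_master
        self.data = {}     # this pod's own copy of the data (not shared storage)
        self.applied = 0   # how many change-log entries this pod has applied


class ReplicationDemo:
    def __init__(self, statefulset_name="mysql"):
        self.statefulset_name = statefulset_name
        self.log = []      # ordered writes accepted by the master
        self.pods = []

    def add_pod(self):
        """Add the next pod; ordinal 0 is the master, the rest are slaves."""
        ordinal = len(self.pods)
        pod = Pod(f"{self.statefulset_name}-{ordinal}", is_master=(ordinal == 0))
        if self.pods:
            previous = self.pods[-1]
            pod.data = dict(previous.data)   # step 1: clone from the previous pod
            pod.applied = previous.applied   # remember where the clone left off
        self.pods.append(pod)
        return pod

    def write(self, pod, key, value):
        """Only the master accepts writes."""
        if not pod.is_master:
            raise PermissionError(f"{pod.name} is not the master")
        self.log.append((key, value))
        pod.data[key] = value
        pod.applied = len(self.log)

    def sync(self):
        """Step 2: every pod applies the changes it has not seen yet."""
        for pod in self.pods:
            for key, value in self.log[pod.applied:]:
                pod.data[key] = value
            pod.applied = len(self.log)


if __name__ == "__main__":
    demo = ReplicationDemo()
    master = demo.add_pod()
    demo.write(master, "a", 1)
    slave = demo.add_pod()        # starts with a clone of the previous pod's data
    demo.write(master, "b", 2)    # slave is now behind
    demo.sync()                   # slave catches up
    print(slave.name, slave.data)
```

Until `sync` runs, the slave only holds what it cloned, which is why continuous synchronisation is needed on top of the initial clone.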
Since every pod has a copy of the data, you can get away with using only temporary (ephemeral) storage.
- But this means that all data is lost once all the pods die.
- With persistent storage, the data survives even if all the pods die, because a Persistent Volume's lifecycle isn't tied to the lifecycle of other components.
Pod State¶
- Each pod has its own persistent volume, which stores its data plus the pod state.
  - The pod state records whether the pod is a master or a slave.
- When a pod dies and gets replaced, its persistent identifier makes sure that the same storage volume is reattached to the replacement pod.
  - Since the storage holds the state of the pod, the replacement picks up the same state.
- When pods are deleted, only the pods go away; their persistent volumes are kept.
For the reattachment to work, it is important to use remote storage.
- If the pod gets rescheduled from one node to another, its previous storage must be available on the other node as well.
- Local volume storage is not suitable, as it is attached to a particular node.
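A small sketch of why the fixed identity matters for storage. In Kubernetes, a claim created from a StatefulSet's volume claim template is named after the template and the pod (template name, then pod name), so the same pod name always resolves to the same claim. Everything else here is a toy model: the `ToyStatefulSet` class, the claim template name `data` and the node names are hypothetical, and the volumes are assumed to be remote storage reachable from any node.

```python
def pvc_name(claim_template: str, statefulset: str, ordinal: int) -> str:
    """Claim name for a pod: <claim template name>-<statefulset name>-<ordinal>."""
    return f"{claim_template}-{statefulset}-{ordinal}"


class ToyStatefulSet:
    def __init__(self, name, claim_template="data"):
        self.name = name
        self.claim_template = claim_template
        self.volumes = {}   # claim name -> contents; outlives the pods
        self.pods = {}      # pod name -> {"node": ..., "pvc": ...}

    def start_pod(self, ordinal, node):
        """Start (or restart) the pod with this ordinal on the given node."""
        pod_name = f"{self.name}-{ordinal}"
        claim = pvc_name(self.claim_template, self.name, ordinal)
        self.volumes.setdefault(claim, {})   # created once, reused afterwards
        self.pods[pod_name] = {"node": node, "pvc": claim}
        return pod_name

    def delete_pod(self, pod_name):
        """Deleting a pod removes the pod only; its volume is kept."""
        del self.pods[pod_name]

    def volume_of(self, pod_name):
        return self.volumes[self.pods[pod_name]["pvc"]]


if __name__ == "__main__":
    sts = ToyStatefulSet("mysql")
    pod = sts.start_pod(1, node="node-a")
    sts.volume_of(pod)["role"] = "slave"    # pod state lives on the volume
    sts.delete_pod(pod)                     # pod dies, volume stays
    pod = sts.start_pod(1, node="node-b")   # replacement lands on another node
    print(pod, sts.pods[pod]["pvc"], sts.volume_of(pod))
```

The replacement on `node-b` only finds its old state because the volume is not tied to `node-a`, which is the point of the remote-storage requirement.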
Pod Endpoints¶
- Each pod has 2 endpoints.
  - The load-balanced Service name, shared by all the pods
  - An individual service name (DNS name) for each pod
- A StatefulSet needs a headless Service to provide the individual service names.
Having a fixed endpoint and a fixed pod name gives each pod a sticky identity, so it retains its state and its role.
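To make the headless Service and the per-pod names concrete, here is a hedged sketch that builds the two manifests as Python dicts and derives each pod's DNS name. The names `mysql` and `mysql-headless`, the image `mysql:8.0`, the port and the `default` namespace are assumptions for illustration, and `cluster.local` is the usual default cluster domain, which a cluster may override.

```python
import json


def headless_service(name, app_label, port):
    """A Service with clusterIP set to "None" is headless: no load-balanced virtual IP."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "clusterIP": "None",
            "selector": {"app": app_label},
            "ports": [{"port": port}],
        },
    }


def statefulset(name, service_name, replicas, image, port):
    """Minimal StatefulSet; spec.serviceName points at the headless Service."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": service_name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {"name": name, "image": image,
                         "ports": [{"containerPort": port}]}
                    ]
                },
            },
        },
    }


def pod_dns_name(statefulset_name, ordinal, service,
                 namespace="default", cluster_domain="cluster.local"):
    """Stable per-pod name: <pod name>.<service>.<namespace>.svc.<cluster domain>."""
    return f"{statefulset_name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    svc = headless_service("mysql-headless", "mysql", 3306)
    sts = statefulset("mysql", "mysql-headless", 3, "mysql:8.0", 3306)
    print(json.dumps(svc, indent=2))   # JSON manifests are accepted by kubectl
    print(json.dumps(sts, indent=2))
    for ordinal in range(sts["spec"]["replicas"]):
        print(pod_dns_name("mysql", ordinal, "mysql-headless"))
```

A slave can then be pointed at the master's individual name rather than at the shared Service name, which could route to any pod.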
Stateful applications are not a perfect fit for containerised environments.
Last updated: 2022-09-06