Since its first release in June 2014, Kubernetes, a CNCF project, has in roughly six years become the de facto standard for container orchestration, with almost all major technology vendors, including AWS, Azure, GCP, IBM, Red Hat, and VMware (Project Pacific), now supporting it. It would not be wrong to say that Kubernetes is one of the fastest-growing projects in the history of open source software.
Initially, Kubernetes was considered primarily a platform for running stateless applications, where the application is not required to hold any data: the server processes each request based only on the information relayed with that request and does not rely on information from earlier requests. Stateful services such as databases and analytics, where the server processes requests based on both the information relayed with each request and the information stored from earlier requests, would instead run in virtual machines or as managed services from a cloud provider.
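To make the distinction concrete, here is a minimal sketch in Python. The names stateless_handler and StatefulCounter are hypothetical and only serve to illustrate the idea; they are not part of Kubernetes or any library.

```python
def stateless_handler(request: dict) -> int:
    """The answer depends only on the data carried by this request."""
    return request["a"] + request["b"]


class StatefulCounter:
    """The answer depends on this request plus what earlier requests stored."""

    def __init__(self) -> None:
        # State that has to survive between requests. If the process
        # restarts without persistent storage, this value is lost.
        self.total = 0

    def handle(self, request: dict) -> int:
        self.total += request["amount"]
        return self.total


if __name__ == "__main__":
    print(stateless_handler({"a": 2, "b": 3}))

    counter = StatefulCounter()
    counter.handle({"amount": 5})
    # The second call returns a different result for the same input,
    # because it builds on what the first call stored.
    print(counter.handle({"amount": 5}))
```

Any replica can serve the stateless handler, while the counter only gives correct answers if its stored total is kept somewhere durable.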
In this article, I will focus on the key points to consider before deploying a stateful application. As we have seen, stateful applications require information to be stored, and a Kubernetes cluster offers multiple approaches to storing that data.
Let's discuss two of these approaches.
Stateful applications using a shared filesystem
By design, Docker containers are ephemeral and need persistent disk storage, i.e. persistent volumes, to store data. A persistent volume can be created either manually or dynamically. With manual (static) provisioning, the persistent volume is created before the application is provisioned. With dynamic provisioning, the cluster automatically provisions storage in response to the persistent volume claims it receives and then binds the resulting persistent volume to the requesting claim, which the pod in turn mounts. In Kubernetes, dynamic provisioning is done using a StorageClass.
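As a rough sketch of dynamic provisioning, the Python snippet below builds a StorageClass and a PersistentVolumeClaim that refers to it, then prints both as JSON (kubectl accepts JSON as well as YAML). The names shared-fs and app-data, the 10Gi size, and the provisioner example.com/shared-fs are assumptions made up for illustration; a real cluster needs the provisioner name of the storage backend it actually runs.

```python
import json


def storage_class(name: str, provisioner: str) -> dict:
    """Build a StorageClass manifest; the provisioner creates volumes on demand."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
    }


def volume_claim(name: str, storage_class_name: str, size: str) -> dict:
    """Build a PersistentVolumeClaim that asks the StorageClass for storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets pods on several nodes mount the same
            # volume, which is the point of a shared filesystem.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical names and provisioner, for illustration only.
    manifests = [
        storage_class("shared-fs", "example.com/shared-fs"),
        volume_claim("app-data", "shared-fs", "10Gi"),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

The claim never names a specific volume. It names only the StorageClass, and the cluster is left to create and bind a matching persistent volume.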
You can create a persistent volume either by
Stateful applications using the Kubernetes StatefulSet controller
With a shared filesystem, the durability and persistence of data are provided by the underlying storage, as the workload is completely decoupled from it. This gives the flexibility to schedule pods on any node of the Kubernetes cluster. However, because the workload is completely decoupled from the underlying storage, this approach is not the right fit for applications such as NoSQL or relational databases, which require high I/O throughput.
For stateful applications requiring high I/O throughput, Kubernetes StatefulSets are the recommended method. By combining StatefulSets with Persistent Volume Claims, you can run applications that scale up with a unique Persistent Volume Claim created automatically for each replica pod. StatefulSets are suitable for deploying Kafka, MySQL, Redis, ZooKeeper, and other applications that need unique, persistent identities and stable hostnames.
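The sketch below shows, under some assumptions, how a StatefulSet ties each replica to its own claim. It builds a minimal StatefulSet manifest with a volumeClaimTemplates entry and then derives the pod and claim names Kubernetes would generate: pods are named <statefulset>-<ordinal>, and claims are named <template>-<pod>. The names db and data, the mysql:8.0 image, the mount path, and the 10Gi size are hypothetical values chosen for illustration.

```python
from typing import List


def statefulset(name: str, image: str, replicas: int, template: str, size: str) -> dict:
    """Build a minimal StatefulSet manifest with one volume claim template."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # headless Service that gives pods stable DNS names
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [{"name": template, "mountPath": "/data"}],
                    }],
                },
            },
            # One PersistentVolumeClaim is created from this template per replica.
            "volumeClaimTemplates": [{
                "metadata": {"name": template},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def pod_names(manifest: dict) -> List[str]:
    """Pods get a stable identity: <statefulset name>-<ordinal>."""
    name = manifest["metadata"]["name"]
    return [f"{name}-{i}" for i in range(manifest["spec"]["replicas"])]


def claim_names(manifest: dict) -> List[str]:
    """Each pod gets its own claim: <template name>-<pod name>."""
    template = manifest["spec"]["volumeClaimTemplates"][0]["metadata"]["name"]
    return [f"{template}-{pod}" for pod in pod_names(manifest)]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    db = statefulset("db", "mysql:8.0", replicas=3, template="data", size="10Gi")
    for pod, claim in zip(pod_names(db), claim_names(db)):
        print(f"{pod} -> {claim}")
```

Because the names are derived from the ordinal, a rescheduled pod such as db-1 comes back with the same name and reattaches to the same claim.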
There are three major components underlying a StatefulSet:
Conclusion
In this article, we discussed two approaches to deploying a stateful application in a Kubernetes cluster. Deploying with a shared filesystem is the best fit for applications that don't require high I/O throughput, and you can choose from a wide set of storage options such as GlusterFS, Samba, NFS, Amazon EFS, Azure Files, and Google Cloud Filestore. Deploying with Kubernetes StatefulSets, on the other hand, is the right fit for applications requiring high I/O throughput.
I hope you found this informative. Please share it if you think it is worth sharing.