Persistent Volumes: why use them, and how to protect them

I’m sure I am not the only one who still has that recurring dream: I am halfway through an essay or thesis for school when, without warning, the screen goes black. My paper is gone, and I had not saved my changes. I wake up in a cold sweat. Video game fans know a similar feeling. The first thing you do after beating a tough level or completing a difficult task is save the game, so that if disaster strikes, your progress is safe.

Now, if you are that concerned about saving your progress in a video game, I’m sure you have spent considerable time thinking about how to protect your organization’s persistent data. Persistent data is non-transient, business-critical information. More than any other type of data, it needs robust, durable storage so that it is protected from loss and stays available to users and applications. In a Kubernetes environment, this can be accomplished using persistent volumes.

Kubernetes offers two storage abstractions: volumes and persistent volumes. A typical Kubernetes volume exists only as long as the pod that contains it, so once the pod is deleted, the volume is deleted with it. A persistent volume, on the other hand, exists outside of the pod lifecycle. If the pod using it is deleted, the persistent volume and its data remain, and another pod can then claim that storage if required. Long story short: if the data in a pod is temporary and does not need to outlive the pod, a regular Kubernetes volume may be fine, but if the data must be retained after the pod has been deleted, a persistent volume is the better choice.
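To make the difference concrete, here is a minimal sketch in Python that builds two pod manifests as plain dictionaries: one with an ephemeral `emptyDir` volume and one that references a persistent volume claim. The pod names, the `busybox` image, the `/data` mount path, and the claim name `app-data` are all placeholders I made up for illustration.

```python
import json


def pod_manifest(name, volume_source):
    """Build a minimal Pod manifest that mounts a single volume at /data."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": "busybox",  # placeholder image
                    "command": ["sleep", "3600"],
                    "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                }
            ],
            # The volume source decides whether the data is tied to the pod.
            "volumes": [{"name": "data", **volume_source}],
        },
    }


# Ephemeral: an emptyDir volume is created with the pod and removed with it.
scratch_pod = pod_manifest("scratch-pod", {"emptyDir": {}})

# Persistent: the pod only references a claim, so the storage behind it
# outlives the pod. "app-data" is a hypothetical claim name.
durable_pod = pod_manifest(
    "durable-pod", {"persistentVolumeClaim": {"claimName": "app-data"}}
)

if __name__ == "__main__":
    print(json.dumps(scratch_pod, indent=2))
    print(json.dumps(durable_pod, indent=2))
```

The only difference between the two manifests is the volume source, which is the point: the application container does not need to know which kind of storage sits behind `/data`.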

So, you may be wondering, how are persistent volumes used in Kubernetes? A persistent volume is created either by an administrator who provisions dedicated storage ahead of time, or dynamically through a storage class. Once the persistent volume exists, it can be claimed with a persistent volume claim, which is essentially a request for storage made by a user or developer to hold data from pods, applications, etc. The claim describes the amount and characteristics of the storage required, and Kubernetes binds it to a persistent volume that matches.
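As a rough illustration of that request-and-match idea, the sketch below defines one persistent volume and one claim as Python dictionaries and checks whether the volume could satisfy the claim. The names (`pv-demo`, `app-data`), the `manual` storage class, and the `hostPath` location are hypothetical, and `can_bind` is a deliberately simplified stand-in for the real binding logic, which also considers things like selectors and volume mode.

```python
GI = 1024 ** 3


def parse_gi(quantity):
    """Convert a quantity such as '10Gi' to bytes (only 'Gi' is handled here)."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported quantity in this sketch: {quantity}")
    return int(quantity[:-2]) * GI


# A statically provisioned volume, as an administrator might define it.
persistent_volume = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "pv-demo"},
    "spec": {
        "capacity": {"storage": "10Gi"},
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "manual",
        "persistentVolumeReclaimPolicy": "Retain",
        "hostPath": {"path": "/mnt/data"},  # placeholder backing storage
    },
}

# The request for storage, as a developer might write it.
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "manual",
        "resources": {"requests": {"storage": "5Gi"}},
    },
}


def can_bind(pv, pvc):
    """Simplified check: capacity, access modes, and storage class must fit."""
    pv_spec, pvc_spec = pv["spec"], pvc["spec"]
    requested = parse_gi(pvc_spec["resources"]["requests"]["storage"])
    big_enough = parse_gi(pv_spec["capacity"]["storage"]) >= requested
    modes_ok = set(pvc_spec["accessModes"]) <= set(pv_spec["accessModes"])
    same_class = pv_spec.get("storageClassName") == pvc_spec.get("storageClassName")
    return big_enough and modes_ok and same_class


if __name__ == "__main__":
    print(can_bind(persistent_volume, claim))
```

Here the 5Gi claim fits the 10Gi volume, so the check passes. A claim asking for more capacity than the volume offers, or for a different storage class, would not.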

The fact that persistent data is retained even if the pod using it is destroyed is great. However, persistent volumes alone do not give you an entirely robust data protection scenario. To ensure that the data living on those persistent volumes is protected and recoverable, you should also be backing it up. For example, what would you do if your cluster went into an unrecoverable state, a natural disaster made your cluster unreachable, or a newly introduced bug wiped a persistent volume clean? That is where vProtect can help.

Developed by Storware LLC, vProtect provides agentless, crash-consistent backup of deployments that run in Kubernetes and OpenShift environments and are stored on persistent volumes. Backups can be stored in a wide range of backup destinations, including mounted file systems, enterprise-level backup solutions like IBM Spectrum Protect, DellEMC NetWorker, and Veritas NetBackup, cloud storage like Amazon S3, Google Cloud, and Microsoft Azure, and many others. During a backup job, vProtect automatically pauses the running deployments to ensure the backups are crash-consistent, collects the information stored on the persistent volumes as well as the configuration, and exports that metadata to the backup destination. We do that by creating a small temporary pod that attaches the persistent volumes from the target pods. The data is read from that pod to the vProtect node and later moved to the backup destination. Then, during a restore, vProtect recreates the application from the backed-up metadata stored in the backup destination. This gives the end user the ability to restore that data and recreate applications, even if the original pod no longer exists.
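To picture the temporary-pod idea, here is a hypothetical sketch of what a helper pod manifest of that general shape could look like, built as a Python dictionary. This is not vProtect’s actual pod definition: the function name, pod name, `busybox` image, and mount path are my own assumptions, and the only point is that a short-lived pod can mount an existing claim read-only so its contents can be read out.

```python
import json


def helper_pod_manifest(claim_name, namespace):
    """Hypothetical helper pod that mounts an existing claim read-only."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"backup-helper-{claim_name}",  # made-up naming scheme
            "namespace": namespace,
        },
        "spec": {
            "restartPolicy": "Never",  # the pod is meant to be short-lived
            "containers": [
                {
                    "name": "reader",
                    "image": "busybox",  # placeholder image
                    "command": ["sleep", "3600"],
                    "volumeMounts": [
                        {
                            "name": "source",
                            "mountPath": "/backup-source",
                            "readOnly": True,
                        }
                    ],
                }
            ],
            "volumes": [
                {
                    "name": "source",
                    "persistentVolumeClaim": {
                        "claimName": claim_name,
                        "readOnly": True,
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(helper_pod_manifest("app-data", "default"), indent=2))
```

Mounting the claim read-only is a design choice in this sketch: a pod that only needs to copy data out has no reason to be able to change it.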

Now, just like you would while playing a difficult video game, and just like I did several times while writing this article, make sure you are saving your progress when it comes to your persistent data in Kubernetes. Take advantage of persistent volumes, and make sure that the data on those persistent volumes is protected by using vProtect.

If you would like to learn more about vProtect, you can request a live demo or even get a 30-day trial copy to try it for yourself. We’ll be happy to help you set things up.

We also have a pre-recorded demo available on our website if you would like to see the product in action on your own time.


Solution Architect at Catalogic Software

Brian Sietsma

Professional Services / Pre-sales field engineer for Catalogic Software, covering multiple operating systems, databases, virtualization and storage technologies in the US north-central territory.