# general
s
Hi all. I'm a "power hobby user" of k8s and not really a full-fledged platform engineer. My question is about database storage. I know about the warnings "don't put databases inside k8s", but I've done it anyway with a lot of success (for a simple low-usage cluster, admittedly). One of the things I've learned about running databases is that most of them, to be at their best, need the fastest storage possible. My cluster is on bare metal, but via generated VMs (3 servers and 5 agents - k3s), and I put the database storage on the drives local to the nodes themselves. This all works fine and the databases do well (Mongo, Redis and Postgres). However, the storage locality is where k8s sort of runs into trouble. If a node needs to be taken down and rebuilt, for instance when upgrading hardware or replacing the node, then the data is always "lost" from the node being upgraded/replaced with this setup. Of course, a backup is made through the database tools, but it's sort of bass-ackwards to have to rebuild the databases "internally" through the databases themselves, and it's majorly time consuming. My question is then: what would you consider to be a good tool to basically off-load a PV to external storage, to then "rebuild" the PV on the new node? I'm eyeing Velero. Or is there something better, or even a better process (other than running the DBs on their own hardware), to handle PV rebuilding on a new node?
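To make the locality problem concrete, a node-local PV of the kind I'm describing looks roughly like this (names and paths below are placeholders, not my actual config):
```yaml
# Illustrative only: a "local" PersistentVolume is pinned to one node via
# nodeAffinity, so the data can't follow the workload when that node is rebuilt.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-data-node1        # hypothetical name
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/postgres      # example path on the node's local drive
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - agent-1          # the PV only ever schedules onto this node
```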
s
I’ve noticed that more companies are now confident running databases on Kubernetes compared to 2020. However, a key question to consider is where Kubernetes is being deployed: on-premises, bare-metal, or in the cloud. Understanding the deployment environment lets us optimize for cost efficiency, scalability, flexibility, and compliance. With the right ecosystem of tools, we can ensure scalability and stability during traffic spikes. Industries like healthcare and finance, which benefit from keeping their data on-premises due to strict regulations, can leverage Kubernetes to meet data compliance and data locality requirements, or to support multi-region architectures. Every Kubernetes release over the past two years has brought solid improvements to how volumes are created and accessed; one example is PersistentVolume lastPhaseTransitionTime moving to GA in Kubernetes 1.31 (https://kubernetes.io/blog/2024/08/14/last-phase-transition-time-ga/). These are good signs of an active community closing the gap and building momentum for broader adoption of running databases in k8s.
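If you want to see that new field in practice, something along these lines should show it on a 1.31+ cluster:
```sh
# Sketch: inspect when each PV last changed phase (Bound, Released, ...).
# status.lastPhaseTransitionTime is GA as of Kubernetes 1.31.
kubectl get pv -o custom-columns=\
NAME:.metadata.name,\
PHASE:.status.phase,\
LAST_TRANSITION:.status.lastPhaseTransitionTime
```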
what would you consider to be a good tool to basically off-load a PV to external storage, to then "rebuild" the PV on the new node?
StatefulSets in Kubernetes provide an excellent foundation for making this design choice. Velero is a powerful tool for backing up Persistent Volumes (PVs) to external storage, enabling seamless restoration on a new node. Similarly, with Rook-Ceph, you can efficiently migrate data by detaching and reattaching PVs to different nodes within the cluster. KubeBlocks uses a unified set of APIs (CRDs) and code to manage various databases; my initial observation is that it was built with day-2 operations in mind, collects monitoring metrics from rich data sources, and is compatible with AWS, GCP, and Azure. But above all, I believe it's less about the feature set you look for in a tool and more about the design choices for deploying databases that can scale and integrate with different apps and permission models for the team using it.
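As a rough sketch of the StatefulSet point: with volumeClaimTemplates each replica gets its own PVC, and those PVCs are the units a tool like Velero or Rook-Ceph then backs up or moves (names, image, and sizes below are just placeholders):
```yaml
# Sketch: a StatefulSet whose volumeClaimTemplates give every replica its own
# PVC; those PVCs are what a backup/migration tool operates on.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres              # hypothetical
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-storage   # or a Ceph/OpenEBS class
        resources:
          requests:
            storage: 50Gi
```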
r
In case you are interested, there’s a whole community dedicated to dispelling the myth that you can’t run DBs in Kubernetes -> https://dok.community/
We use Velero for this task in our SaaS product, and it works pretty well. We've done whole-cluster migrations in a very short time with Velero. That being said, this is the DIY way. There are commercial products that could make this faster and more reliable.
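The basic flow is roughly this; bucket, namespace, and plugin version are placeholders, and the exact flags depend on your Velero version and storage backend:
```sh
# Sketch, not copy-paste: back up a namespace's PV data to object storage,
# then restore it after the node (or cluster) has been rebuilt.

# One-time setup: point Velero at an S3-compatible bucket and enable
# file-system backup of pod volumes (handy for local-path PVs without
# CSI snapshot support). Adjust the backup-location config for your store.
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.10.0 \
  --bucket my-velero-backups \
  --backup-location-config region=us-east-1 \
  --secret-file ./credentials-velero \
  --use-node-agent \
  --default-volumes-to-fs-backup

# Back up the database namespace, including volume contents.
velero backup create db-backup --include-namespaces databases

# After rebuilding the node/cluster, restore from that backup.
velero restore create --from-backup db-backup
```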
s
@Saim Safdar - Thanks for your reply. That new feature in k8s sounds interesting. Not sure how I can use it, but I bet it will be useful overall. @Ramiro Berrelleza - Thanks for your reply too. And, we are currently fully "hands-on" and DIY, and we are attempting to avoid any SaaS/external solutions outside of our cluster "looking in". I'm amazed personally that people would buy into such services, but alas, I guess I think somewhat differently than most. I appreciate your insights into Velero. It makes me think I was on the right track to use it.
m
@scott molinari Have you looked at https://github.com/openebs/zfs-localpv ? It should work with Velero for backup and restore, and there was a proposal for migration using ZFS natively, though I'm not sure if that went anywhere: https://github.com/pawanpraka1/zfs-localpv/blob/bd7e5547d4251b4ab1c5b22ac4e3ef0ed34e116d/design/pv-migration.md
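For reference, a zfs-localpv StorageClass is along these lines (pool name is a placeholder; check the project docs for the parameters your version supports):
```yaml
# Sketch: a StorageClass backed by the OpenEBS ZFS CSI driver. Volumes are
# still node-local ZFS datasets, but you get snapshots/clones and it pairs
# with Velero for backup and restore.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfs
provisioner: zfs.csi.openebs.io
parameters:
  poolname: zfspv-pool        # existing ZFS pool on each node (example name)
  fstype: zfs
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
```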
s
Thanks all. Looks like this problem of PVs getting unwantedly zapped and lost is a common one for many. 🙂 I'm going to look deeper into Velero. OpenEBS sounds interesting too. I'll keep digging for sure. Thanks again!