Cloud native databases on Kubernetes: Where are we today?

HPE_Experts · ‎04-25-2024

A few years ago, the message was “leave the databases outside of Kubernetes. They are not cloud native in nature and should stay where they belong—in a big, resilient mission-critical server.”

Kubernetes is an ideal platform for cloud native applications that are stateless and can be easily re-deployed upon failure on separate machines that form part of the same cluster. This is hardly the description of an environment that is meant to be hosting mission-critical databases that have stringent consistency requirements.

Once persistent volumes became generally available in Kubernetes in 2018, Kubernetes started supporting stateful workloads. Databases are obviously a stateful workload, so some early advocates of adopting this technology in Kubernetes started pushing boundaries. They came up with the slogan “Kubernetes supports database workloads since they are stateful.” I usually countered with “Kubernetes supports stateful workloads, I don’t.”

In HPE Services, we have the responsibility to make sure that the services that we develop are reliable and will run smoothly in your infrastructure, including Kubernetes. Databases have intrinsic complexity and requirements that call for careful design and planning when deploying them in Kubernetes. Even though Kubernetes is capable of running persistent workloads, we still need to understand how databases work and adapt the underlying infrastructure to make the most of the database engine's features.

Relational databases are ACID-compliant, which means that they meet the principles of atomicity, consistency, isolation, and durability that are required to process transactions reliably. Kubernetes needs to work in tandem with the ACID principles and orchestrate database workloads without introducing unexpected behavior in the database.
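To make atomicity concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for PostgreSQL purely for illustration, and the accounts table, the names, and the transfer function are all hypothetical. The point is that a transaction that fails halfway leaves no partial changes behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single transaction."""
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        # Credit first, so that a failure in the debit has something to undo.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: no negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

try:
    transfer(conn, "alice", "bob", 150)  # alice cannot cover 150
    overdraft_rejected = False
except sqlite3.IntegrityError:
    overdraft_rejected = True

after_failed_transfer = balances(conn)

transfer(conn, "alice", "bob", 40)  # a valid transfer commits both updates
after_valid_transfer = balances(conn)
```

The credit to bob succeeds inside the failed transaction, but the rollback undoes it when the debit violates the constraint. An orchestrator that restarts or moves a database pod must not weaken this guarantee.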

With the introduction of Kubernetes operators in the second half of 2016, the landscape changed drastically. We were now able to create custom resource definitions (CRDs), which let us define our own object kinds and allow the Kubernetes API to manage the entire lifecycle of those objects. CRDs extend the standard upstream Kubernetes API, and an operator customizes the cluster so that it can automate the lifecycle of the new objects that we have defined in the CRDs.
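The core idea of an operator is a reconcile loop: compare the desired state declared in a custom resource with the state observed in the cluster, and act on the difference. The sketch below is a simplified illustration of that pattern in plain Python, not the logic of any real operator. The ClusterSpec and ClusterStatus types and the action names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state, as a user would declare it in a custom resource."""
    instances: int


@dataclass(frozen=True)
class ClusterStatus:
    """Observed state, as an operator would read it from the cluster."""
    ready_instances: int
    primary_healthy: bool


def reconcile(spec: ClusterSpec, status: ClusterStatus) -> List[str]:
    """Return the actions needed to move the observed state toward the spec."""
    actions: List[str] = []
    if not status.primary_healthy:
        # Database-specific knowledge lives here: fail over before scaling.
        actions.append("promote-standby")
    missing = spec.instances - status.ready_instances
    if missing > 0:
        actions.extend(["create-instance"] * missing)
    elif missing < 0:
        actions.extend(["delete-instance"] * -missing)
    return actions


# A healthy cluster that matches its spec needs no action.
steady = reconcile(ClusterSpec(instances=3), ClusterStatus(3, True))
# A degraded cluster gets a failover plus two replacement instances.
degraded = reconcile(ClusterSpec(instances=3), ClusterStatus(1, False))
```

A real operator runs this comparison continuously against the Kubernetes API, which is how it encodes database know-how that generic controllers do not have.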

Open source relational databases

Let’s put Kubernetes aside for a minute and delve into open source relational database options. PostgreSQL is the main choice, as it is fully community-operated (not Oracle-influenced) and provides more features than MySQL. PostgreSQL has its own replication mechanisms and is capable of archiving transactions as soon as they occur, storing them as write-ahead log (WAL) files on S3-compatible storage.

This raises an important point when it comes to disaster recovery (DR) of applications in Kubernetes. You can’t think of Kubernetes as a panacea that will resolve all your application challenges, including DR. DR is best handled at the application level, especially for stateful applications that have well-defined DR procedures.

All of this leads us to the conclusion that running Postgres within Kubernetes requires three fundamental things:

Kubernetes knowledge within your organization
PostgreSQL knowledge within the organization (ideally cross-trained consultants with skills on both Kubernetes and Postgres)
A reliable operator to manage the entire lifecycle of a PostgreSQL database cluster in high availability

CloudNativePG

Enter CloudNativePG, an open source project governed by a vendor-neutral community that provides a Kubernetes operator covering the full lifecycle of a highly available PostgreSQL database cluster with a primary/standby architecture using native streaming replication. This is the perfect merger of two worlds: databases requiring full control and Kubernetes requiring full automation.

When very large and critical databases are being onboarded into Kubernetes, we need to pay special attention to the design and sizing. A full, dedicated storage node can be assigned with node selectors and taints for the primary database. Other workers can take care of synchronous standby replicas.

Following a shared-nothing approach, you would have local storage in the node with very high IOPS, and let the database handle replication to the standby databases with its native capabilities. The standbys would sit on separate nodes with their own dedicated storage.
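As a rough sketch of how this design could be expressed, the following Python builds a CloudNativePG Cluster manifest as a plain dictionary. The node label, taint key, storage class name, and size are hypothetical placeholders. The field names follow the CloudNativePG Cluster API as I understand it, so verify them against the documentation for the operator version you run. Note that the node selector and tolerations here apply to every instance in the cluster, so this places all instances on the dedicated database nodes rather than pinning only the primary.

```python
import json


def build_cluster_manifest(
    name,
    instances=3,
    storage_class="local-nvme",  # hypothetical local-storage class
    size="100Gi",                # hypothetical volume size
    node_label=("workload", "postgres"),  # hypothetical node label
    taint_key="dedicated-postgres",       # hypothetical taint key
):
    """Return a Cluster manifest for a shared-nothing PostgreSQL deployment."""
    label_key, label_value = node_label
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            "instances": instances,  # one primary plus standbys
            # Ask for at least one synchronous standby (assumed field names).
            "minSyncReplicas": 1,
            "maxSyncReplicas": 1,
            # Each instance gets its own volume: no shared storage.
            "storage": {"storageClass": storage_class, "size": size},
            "affinity": {
                # Only schedule on nodes labeled for database workloads.
                "nodeSelector": {label_key: label_value},
                # Tolerate the taint that keeps other workloads off them.
                "tolerations": [
                    {"key": taint_key, "operator": "Exists", "effect": "NoSchedule"}
                ],
                # Force every instance onto a different node.
                "enablePodAntiAffinity": True,
                "topologyKey": "kubernetes.io/hostname",
                "podAntiAffinityType": "required",
            },
        },
    }


manifest = build_cluster_manifest("pg-prod")
manifest_json = json.dumps(manifest, indent=2)
```

The JSON output can be saved to a file and applied with kubectl, which accepts JSON as well as YAML.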

This architecture can expand to multiple clusters, including DR sites that run on a separate Kubernetes cluster. This cluster hosts a main database that can receive transactions from the production database through WAL files. It is prepared to be promoted to the production environment when needed.
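The sketch below shows one way such a DR cluster might be declared with CloudNativePG, again as a Python dictionary. It assumes the production cluster archives its WAL files to an object store and that the DR cluster replays them from there. The cluster names and bucket path are hypothetical, credentials are omitted, and the field names (replica, externalClusters, barmanObjectStore) reflect my reading of the CloudNativePG replica cluster feature, so check them against the documentation for your operator version.

```python
import copy


def build_replica_cluster_manifest(name, source_name, wal_archive_path):
    """Return a manifest for a DR cluster that follows `source_name` via WAL files."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            "instances": 3,
            "storage": {"size": "100Gi"},  # hypothetical size
            # Initialize from the production cluster's backups and WAL archive.
            "bootstrap": {"recovery": {"source": source_name}},
            # Keep replaying WAL instead of opening for writes.
            "replica": {"enabled": True, "source": source_name},
            "externalClusters": [
                {
                    "name": source_name,
                    # Object store credentials are intentionally left out.
                    "barmanObjectStore": {"destinationPath": wal_archive_path},
                }
            ],
        },
    }


def promote(manifest):
    """Return a copy of the manifest with replica mode switched off."""
    promoted = copy.deepcopy(manifest)
    promoted["spec"]["replica"]["enabled"] = False
    return promoted


dr_manifest = build_replica_cluster_manifest(
    name="pg-dr",
    source_name="pg-prod",
    wal_archive_path="s3://example-bucket/pg-prod",  # hypothetical bucket
)
promoted_manifest = promote(dr_manifest)
```

Under these assumptions, promoting the DR site is a declarative change: applying the promoted manifest asks the operator to stop following production and open the cluster for writes.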

HPE can help you get the most out of your cloud native database strategy on Kubernetes. We understand that to maximize both platforms’ capabilities, databases must be deployed properly in Kubernetes.

From the beginning, it is crucial to carefully design the right infrastructure for the Kubernetes cluster. A shared-nothing approach is recommended for hosting a PostgreSQL database to maximize its capabilities, including disaster recovery with the most aggressive recovery time and recovery point objectives (RTO and RPO).

To learn more, see our HPE Container Adoption Solution Brief.

Learn more about advisory and professional services from HPE Services.

Meet Alex Tesch, Senior Consultant, Cloud Native Computing Practice, HPE Advisory & Professional Services

Alex has worked with open source enterprise technologies for most of his 21-year IT career. He currently leads HPE's Cloud Platform-Hyperscaler and Cloud Native teams. Alex designs and evangelizes cloud native solutions that help companies modernize their infrastructure and adopt new best practices to leverage next-generation IT.

HPE Experts
Hewlett Packard Enterprise

twitter.com/hpe
linkedin.com/company/hewlett-packard-enterprise
hpe.com
