AndreaFabrizi1

Protecting your AI containers environment

HPE and Commvault provide end-to-end, enterprise-grade solutions to protect your containers wherever they live, on-premises or in the cloud, and to improve business continuity within your analytics data pipeline.

I’ve discussed why protecting AI containers is important in a previous blog, where I explored the reasons and the benefits of protecting container environments in the analytics data pipeline. In this blog, I talk about how to protect containerized applications within these pipelines.

By nature, containers are not tied to physical servers or virtual machines and can span locations: on-premises, in the cloud, or even in multi-cloud environments. This flexibility leads to a number of scenarios and deployment options for backing up and restoring your containerized applications and their data.

Kubernetes overview

Before discussing these scenarios, it’s important to understand how Kubernetes manages containers and storage resources.

Unlike other container orchestration systems, Kubernetes doesn’t run containers directly. Instead, it wraps them in a higher-level structure called a Pod. A Pod contains one (best practice) or more containers, along with shared storage, network resources, and the specification for how to run the containers. Pods are the unit of replication in Kubernetes, which means that multiple copies of the same Pod can run at any time in a production system. Pods can be tagged with key/value pairs called labels, which simplify the management of groups of Pods.
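
To make the role of labels concrete, here is a minimal Python sketch of selecting Pods by label. It is a toy model and not the Kubernetes API: the Pod class, the select_pods function, and all Pod, container, and label names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Toy stand-in for a Kubernetes Pod: a name, its containers, and its labels."""
    name: str
    containers: list[str]
    labels: dict[str, str] = field(default_factory=dict)


def select_pods(pods: list[Pod], selector: dict[str, str]) -> list[Pod]:
    """Return the Pods whose labels contain every key/value pair in the selector."""
    return [
        pod for pod in pods
        if all(pod.labels.get(key) == value for key, value in selector.items())
    ]


# Two replicas of the same application plus an unrelated Pod (hypothetical names).
pods = [
    Pod("scoring-0", ["model-server"], {"app": "scoring", "tier": "inference"}),
    Pod("scoring-1", ["model-server"], {"app": "scoring", "tier": "inference"}),
    Pod("etl-0", ["spark-worker"], {"app": "etl"}),
]

selected = select_pods(pods, {"app": "scoring"})
print([pod.name for pod in selected])
```

Because both replicas carry the same labels, a single selector addresses them as a group, which is what makes label-based management (and, later, label-based backup) practical.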

Pods belong to Namespaces. Kubernetes Namespaces are logical entities that group and isolate resources dedicated to a set of users. In other words, Namespaces are a way to divide cluster resources among multiple users, which is why they are also referred to as virtual clusters.

Another important concept is the Persistent Volume (PV). Kubernetes Persistent Volumes are administrator-provisioned storage volumes. PVs are created with a specific filesystem, size, and identifying characteristics such as volume IDs and names. The key aspect of PVs is that they “survive” the Pod lifecycle: the volume and the data it contains remain after the Pod is deleted and are available to other Pods, if required.

To use a PV, a Pod needs to claim it via a Persistent Volume Claim (PVC). A PVC describes the amount and characteristics of the storage required by the Pod; Kubernetes then finds a matching Persistent Volume and binds it to the claim. Here’s a simplified representation of all these objects and relationships inside a Kubernetes cluster.

Kubernetes cluster-HPE AI.png
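
The relationship between a claim and a volume can be sketched in a few lines of Python. This is a simplified illustration, not the actual Kubernetes binding logic: it matches only on capacity and access mode and assumes the smallest sufficient volume is chosen, while the real controller takes more attributes into account. All class, function, and volume names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersistentVolume:
    name: str
    capacity_gib: int
    access_mode: str                   # e.g. "ReadWriteOnce"
    claimed_by: Optional[str] = None   # name of the bound claim, if any


@dataclass
class PersistentVolumeClaim:
    name: str
    request_gib: int
    access_mode: str


def bind(claim: PersistentVolumeClaim,
         volumes: list[PersistentVolume]) -> Optional[PersistentVolume]:
    """Bind the claim to the smallest free volume that satisfies it, if one exists."""
    candidates = [
        pv for pv in volumes
        if pv.claimed_by is None
        and pv.access_mode == claim.access_mode
        and pv.capacity_gib >= claim.request_gib
    ]
    if not candidates:
        return None  # in this toy model the claim simply stays unbound
    chosen = min(candidates, key=lambda pv: pv.capacity_gib)
    chosen.claimed_by = claim.name
    return chosen


volumes = [
    PersistentVolume("pv-large", 500, "ReadWriteOnce"),
    PersistentVolume("pv-small", 50, "ReadWriteOnce"),
    PersistentVolume("pv-shared", 100, "ReadWriteMany"),
]

claim = PersistentVolumeClaim("model-output", 40, "ReadWriteOnce")
bound = bind(claim, volumes)
print(bound.name if bound else "unbound")
```

Note that the binding is recorded on the volume and the claim, not on the Pod. That is why, in this model as in Kubernetes, deleting a Pod leaves the volume and its data in place.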

Container protection scenarios

The diagram above makes it easier to understand why there are multiple ways to protect a Kubernetes cluster. First, in a container environment it is possible to back up different objects:

  • Application configuration (Pod): A backup of the application configuration is used only to restore the application from scratch. It is useful for migrating the application from one environment to another (e.g., from the cloud to on-premises).
  • Application data only (PV): A backup of the application data is used only to preserve that data. For example, when you run a containerized AI model, you want to save the data output but not the model itself, because the AI model container image is already saved in a software versioning system (e.g., GitHub).
  • Application and data (Pod and PV): Backing up both the application and its data is the typical way to protect against application failure.
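
The three options above can be expressed as a small Python sketch that decides which objects go into a backup. The BackupContent enum, the Application class, and the "pod-spec/" and "volume/" prefixes are hypothetical names used for illustration only; they do not correspond to any Commvault or Kubernetes API.

```python
from dataclasses import dataclass, field
from enum import Enum


class BackupContent(Enum):
    CONFIG_ONLY = "application configuration (Pod)"
    DATA_ONLY = "application data only (PV)"
    CONFIG_AND_DATA = "application and data (Pod and PV)"


@dataclass
class Application:
    name: str
    pod_specs: list[str]
    volumes: list[str] = field(default_factory=list)


def items_to_back_up(app: Application, content: BackupContent) -> list[str]:
    """List the objects that a backup of the given content type would include."""
    items: list[str] = []
    if content in (BackupContent.CONFIG_ONLY, BackupContent.CONFIG_AND_DATA):
        items += [f"pod-spec/{spec}" for spec in app.pod_specs]
    if content in (BackupContent.DATA_ONLY, BackupContent.CONFIG_AND_DATA):
        items += [f"volume/{volume}" for volume in app.volumes]
    return items


# A containerized AI model whose image already lives in version control,
# so only its output data needs to be preserved (hypothetical names).
model = Application("scoring", pod_specs=["scoring-pod"], volumes=["scoring-output"])
print(items_to_back_up(model, BackupContent.DATA_ONLY))
```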

The backup of a Kubernetes cluster can be performed with different levels of granularity:

  • Cluster level: All applications or volumes from all namespaces in the cluster are backed up. This kind of backup is useful to protect the whole Kubernetes cluster or to replicate it in another environment (e.g., the cloud).
  • Namespace level: All applications or volumes in a namespace are backed up. The aim of this kind of backup is to protect user-specific virtual environments.
  • Label level: All applications identified by a label are backed up. This level simplifies the protection of a group of similar applications, without having to back up each application individually.
  • Application and volume level: This is the lowest level of protection and is used to protect a single application and a single Persistent Volume.
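
The four levels of granularity differ only in how the set of applications is selected. The following Python sketch illustrates that selection under simplifying assumptions: the App class, the select_for_backup function, the level strings, and all namespace and application names are hypothetical, and volumes are left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class App:
    namespace: str
    name: str
    labels: dict[str, str] = field(default_factory=dict)


def select_for_backup(apps: list[App], level: str,
                      namespace: Optional[str] = None,
                      label: Optional[tuple[str, str]] = None,
                      name: Optional[str] = None) -> list[App]:
    """Return the applications covered by a backup at the given level."""
    if level == "cluster":
        return list(apps)
    if level == "namespace":
        return [app for app in apps if app.namespace == namespace]
    if level == "label":
        if label is None:
            raise ValueError("label level needs a (key, value) pair")
        key, value = label
        return [app for app in apps if app.labels.get(key) == value]
    if level == "application":
        return [app for app in apps
                if app.namespace == namespace and app.name == name]
    raise ValueError(f"unknown backup level: {level!r}")


apps = [
    App("team-a", "scoring", {"tier": "inference"}),
    App("team-a", "feature-store", {"tier": "data"}),
    App("team-b", "scoring", {"tier": "inference"}),
    App("team-b", "etl", {"tier": "data"}),
]

for level, kwargs in [
    ("cluster", {}),
    ("namespace", {"namespace": "team-a"}),
    ("label", {"label": ("tier", "inference")}),
    ("application", {"namespace": "team-b", "name": "etl"}),
]:
    chosen = select_for_backup(apps, level, **kwargs)
    print(level, [f"{app.namespace}/{app.name}" for app in chosen])
```

The label level is the only one that cuts across namespaces, which is why it suits groups of similar applications owned by different teams.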

A backup can be restored in three different ways, as illustrated below:

  • In-cluster/In-place: The backed-up application is redeployed in its original cluster under the same name. The restored application and/or its data overwrite the existing ones.
  • In-cluster/Out-place: The backed-up application is redeployed in its original cluster but under a different name.
  • Cross-cluster: The backed-up application is deployed in a different cluster, whether on-premises, in the cloud, or in a managed cloud service.

HPE-containers-AI-clusters.png
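
The three restore options differ in the target cluster, the target name, and whether the original is overwritten. A minimal Python sketch, assuming a backup is identified only by its source cluster and application name, could look like this. RestoreMode, BackupRecord, RestorePlan, and plan_restore are hypothetical names and not part of any product API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RestoreMode(Enum):
    IN_PLACE = "in-cluster/in-place"
    OUT_PLACE = "in-cluster/out-place"
    CROSS_CLUSTER = "cross-cluster"


@dataclass(frozen=True)
class BackupRecord:
    source_cluster: str
    app_name: str


@dataclass(frozen=True)
class RestorePlan:
    cluster: str
    app_name: str
    overwrites_original: bool


def plan_restore(backup: BackupRecord, mode: RestoreMode,
                 new_name: Optional[str] = None,
                 target_cluster: Optional[str] = None) -> RestorePlan:
    """Work out where a backup lands and whether it replaces the original."""
    if mode is RestoreMode.IN_PLACE:
        # Same cluster, same name: the restored copy replaces the original.
        return RestorePlan(backup.source_cluster, backup.app_name, True)
    if mode is RestoreMode.OUT_PLACE:
        if not new_name or new_name == backup.app_name:
            raise ValueError("out-place restore needs a different application name")
        return RestorePlan(backup.source_cluster, new_name, False)
    if mode is RestoreMode.CROSS_CLUSTER:
        if not target_cluster or target_cluster == backup.source_cluster:
            raise ValueError("cross-cluster restore needs a different target cluster")
        return RestorePlan(target_cluster, new_name or backup.app_name, False)
    raise ValueError(f"unknown restore mode: {mode!r}")


backup = BackupRecord(source_cluster="on-prem", app_name="scoring")
print(plan_restore(backup, RestoreMode.IN_PLACE))
print(plan_restore(backup, RestoreMode.OUT_PLACE, new_name="scoring-test"))
print(plan_restore(backup, RestoreMode.CROSS_CLUSTER, target_cluster="cloud"))
```

Only the in-place option overwrites existing state in this model, so it is the one that calls for the most care before it is run.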

How HPE can help protect your Kubernetes environment

The HPE and Commvault solution provides an end-to-end enterprise-grade, scale-out, fully integrated software and hardware platform to back up your container environments wherever they live (on-premises or in the cloud) and to restore wherever they are needed. The main advantages are:

  • Scale-out solution
  • Support for all major operating systems, applications, and databases on virtual and physical servers, NAS shares, cloud-based infrastructures, and mobile devices
  • Support for protecting Kubernetes, OpenShift Container Platform (OCP), and Rancher container environments, as well as VM environments, applications, and files
  • Simplified management through a single console to view, manage, and access all functions and all data and information across the enterprise
  • Multiple protection methods including backup and archive, snapshot management, replication, and content indexing for eDiscovery
  • Efficient storage management using deduplication for disk and tape
  • Support for HPE Apollo and HPE ProLiant systems optimized for Commvault software

Dig deeper

Learn more about the HPE Container Platform built on Kubernetes.

Check out this technical paper offering a step-by-step guide on how to protect your containers: Data protection for Kubernetes using Commvault backup & recovery software and HPE Apollo servers


Andrea Fabrizi
Hewlett Packard Enterprise

twitter.com/HPE_AI
linkedin.com/showcase/hpe-ai/
hpe.com/us/en/solutions/artificial-intelligence.html

About the Author

Andrea Fabrizi is the Strategic Portfolio Manager for Big Data and Analytics at HPE.