Reliable Backup Strategy: Secure Your Entire Kubernetes Cluster with Velero to AWS S3

Backing up an entire Kubernetes cluster using Velero to AWS S3, or any other object storage solution, serves several important purposes:

Data Resilience and Disaster Recovery:
- In the event of data loss, accidental deletions, or cluster failures, having regular backups is crucial for data resilience. Velero enables you to capture the entire state of your Kubernetes cluster, including application configurations and persistent volumes. Storing these backups in AWS S3 ensures data durability and provides a reliable mechanism for disaster recovery.

Application Consistency:
- By backing up the entire cluster, you capture the relationships and configurations of all the applications and services running in your Kubernetes environment. This ensures that when you restore from a backup, the entire cluster is brought back to a consistent state.

Incremental Backups and Versioning:
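- Velero backs up Kubernetes resource manifests in full on each run, while volume data can be captured incrementally: EBS snapshots are incremental by design, and Velero's file-system backup (Restic or Kopia) uploads only changed data. This reduces backup duration and storage requirements for large volumes. AWS S3 also supports object versioning once it is enabled on the bucket, providing a history of changes to your backup objects over time.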

Cluster Migration and Replication:
- Backing up the entire cluster facilitates smooth migration between Kubernetes clusters or replication of clusters. You can use Velero to move your applications and data from one cluster to another, helping in scenarios like moving from a development cluster to a production cluster.

Comprehensive Backup Strategy:
- Backing up the entire cluster ensures that all components, including custom resources, namespaces, and persistent volumes, are included in the backup. This comprehensive approach ensures that no critical data or configurations are left out.

Regulatory Compliance:
- In some industries or environments, regulatory compliance requires organizations to have robust data backup and recovery strategies. Using Velero to back up to AWS S3, which complies with various security and compliance standards, helps organizations meet these requirements.

Ease of Management and Automation:
- Velero provides a straightforward and automated way to manage backups and restores. The integration with AWS S3 simplifies storage management, and the backups can be scheduled and automated, reducing the manual effort required.

In summary, backing up the entire Kubernetes cluster using Velero to AWS S3 ensures data protection, consistency, and recoverability, providing a safety net for unexpected events and supporting efficient cluster management practices.

What is Velero?

Velero is an open-source tool for backing up, restoring, and migrating Kubernetes cluster resources and persistent volumes. Originally developed by Heptio under the name Ark (Heptio was later acquired by VMware), it is maintained in the vmware-tanzu GitHub organization. Velero provides a way to protect and recover a cluster together with its deployed microservices and their data.

Prerequisites for Velero Setup:

To initiate the Velero setup, ensure the following components are in place:

Active Kubernetes Cluster:
- Confirm the presence of a functioning Kubernetes cluster where you plan to implement Velero.

Object Storage Solution with S3 API Support:
- Select an object storage solution compatible with the S3 API. Options include Amazon S3, Google Cloud Storage, or MinIO.

For this scenario, let’s consider the use of Amazon EKS as the Kubernetes cluster and an Amazon S3 bucket for storing backup data as objects.

Step 1: Create AWS S3 bucket

Begin by creating an S3 bucket dedicated to storing your Kubernetes cluster backups.

aws s3 mb s3://your-custom-bucket-name

Note: In the command above and in those below, replace your-custom-bucket-name with your chosen bucket name.
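
S3 bucket names must be 3 to 63 characters long, use only lowercase letters, digits, dots, and hyphens, and start and end with a letter or digit, so a name with uppercase letters or underscores is rejected. The following is a minimal sketch that checks a name before creating the bucket. The function names and example values are hypothetical, and the check covers only these basic rules:

```shell
#!/bin/sh
# Hypothetical helper: checks the basic S3 bucket naming rules
# (3-63 characters; lowercase letters, digits, dots, hyphens;
# starts and ends with a letter or digit). It does not cover every
# rule, such as the ban on names formatted like IP addresses.
is_valid_bucket_name() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'
}

# Hypothetical wrapper: creates the bucket only if the name passes the check.
create_backup_bucket() {
    bucket=$1
    region=$2
    if ! is_valid_bucket_name "$bucket"; then
        echo "invalid bucket name: $bucket" >&2
        return 1
    fi
    aws s3 mb "s3://$bucket" --region "$region"
}

# Example (values are placeholders):
# create_backup_bucket your-custom-bucket-name us-east-1
```

Passing the region explicitly keeps the bucket in the same region that is later configured for Velero, instead of relying on the CLI profile's default.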

Enhance Security and Configuration:

To bolster security and configure the bucket appropriately, execute the following commands:

Block Public Access to Bucket:

aws s3api put-public-access-block --bucket your-custom-bucket-name --public-access-block-configuration "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"

Enable Data Encryption at Rest:

aws s3api put-bucket-encryption --bucket your-custom-bucket-name --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'

Enforce bucket policy to ensure data is encrypted in transit:

aws s3api put-bucket-policy --bucket your-custom-bucket-name --policy file://policy-file.json

Policy File (policy-file.json):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSSLRequestsOnly",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::your-custom-bucket-name",
                "arn:aws:s3:::your-custom-bucket-name/*"
            ],
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}
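
To avoid editing the bucket name by hand in two places, the policy file can be generated from a single variable. This is a sketch only; the function name `render_tls_policy` is hypothetical, and the JSON it prints mirrors the policy shown above:

```shell
#!/bin/sh
# Hypothetical helper: prints the HTTPS-only bucket policy for the given bucket.
render_tls_policy() {
    bucket=$1
    cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSSLRequestsOnly",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::${bucket}",
                "arn:aws:s3:::${bucket}/*"
            ],
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}
EOF
}

# Example (bucket name is a placeholder):
# render_tls_policy your-custom-bucket-name > policy-file.json
```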

These steps create an S3 bucket, block public access to it, enable default server-side encryption, and apply a policy that allows connections only over HTTPS. Replace your-custom-bucket-name in the commands and the policy file with your own bucket name.
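
Before moving on, it can be worth reading the settings back from AWS to confirm they were applied. A minimal sketch, assuming the AWS CLI is configured for the account that owns the bucket; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: prints the bucket's public access block,
# default encryption, and bucket policy so they can be reviewed.
show_bucket_hardening() {
    bucket=$1
    aws s3api get-public-access-block --bucket "$bucket" &&
    aws s3api get-bucket-encryption --bucket "$bucket" &&
    aws s3api get-bucket-policy --bucket "$bucket" --query Policy --output text
}

# Example (bucket name is a placeholder):
# show_bucket_hardening your-custom-bucket-name
```

Each call fails with an error if the corresponding setting is missing, so the chain stops at the first step that was not applied.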

IAM Role Setup:

To enable Velero pods to interact with your S3 bucket for backup and restore operations, an IAM role needs to be established. While this role can be attached at the node level, it’s considered best practice to attach it to the pod’s service account, so that only the Velero pods receive these permissions rather than every pod on the node.

Create Role:

aws iam create-role --role-name your-velero-role --assume-role-policy-document file://your-trust-policy.json

Trust Relationship Policy (Service Account Attachment):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::YOUR_ACCOUNT_ID:oidc-provider/YOUR_EKS_OIDC_PROVIDER_URL"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "YOUR_EKS_OIDC_PROVIDER_URL:sub": "system:serviceaccount:YOUR_EKS_NAMESPACE:YOUR_SERVICE_ACCOUNT_NAME"
        }
      }
    }
  ]
}

Note: Replace placeholders such as YOUR_ACCOUNT_ID, YOUR_EKS_OIDC_PROVIDER_URL, YOUR_EKS_NAMESPACE, and YOUR_SERVICE_ACCOUNT_NAME with your specific values.
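
The account ID and OIDC provider values can be looked up with the AWS CLI rather than copied from the console. The sketch below assumes an IAM OIDC provider has already been associated with the EKS cluster; the function names and the cluster name are hypothetical:

```shell
#!/bin/sh
# Strips the scheme from an OIDC issuer URL, since the trust policy
# expects the provider without the leading "https://".
oidc_provider_from_issuer() {
    printf '%s\n' "${1#https://}"
}

# Hypothetical helper: looks up the OIDC provider of an EKS cluster.
lookup_oidc_provider() {
    cluster=$1
    issuer=$(aws eks describe-cluster --name "$cluster" \
        --query 'cluster.identity.oidc.issuer' --output text) || return 1
    oidc_provider_from_issuer "$issuer"
}

# Hypothetical helper: looks up the current AWS account ID.
lookup_account_id() {
    aws sts get-caller-identity --query Account --output text
}

# Example (cluster name is a placeholder):
# lookup_oidc_provider your-eks-cluster
# lookup_account_id
```

The two outputs map to YOUR_EKS_OIDC_PROVIDER_URL and YOUR_ACCOUNT_ID in the trust policy.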

Trust Relationship (Node Attachment):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Attach Policy to Role:

aws iam put-role-policy --role-name your-velero-role --policy-name velero-policy --policy-document file://velero-policy.json

Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::your-custom-bucket-name",
        "arn:aws:s3:::your-custom-bucket-name/*"
      ],
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject",
        "s3:ListBucket"
      ]
    },
    {
      "Effect": "Allow",
      "Resource": "*",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ]
    }
  ]
}

Remember to replace placeholders with your specific information. These steps ensure the IAM role grants Velero the permissions it needs to read and write backup objects in AWS S3 and to create, describe, and delete EBS snapshots.
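
The Helm values used later need this role's ARN, and it is also useful to confirm that the inline policy was stored. A minimal sketch using the AWS CLI; the function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: prints the ARN of an IAM role by name.
lookup_role_arn() {
    aws iam get-role --role-name "$1" --query 'Role.Arn' --output text
}

# Hypothetical helper: prints the inline policy attached to the role,
# to confirm that the put-role-policy call took effect.
show_role_policy() {
    aws iam get-role-policy --role-name "$1" --policy-name "$2"
}

# Example (role and policy names are placeholders):
# lookup_role_arn your-velero-role
# show_role_policy your-velero-role velero-policy
```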

Velero Installation using Helm:

To install Velero, we’ll leverage the Helm chart. You can either use the provided values.yaml file or create a customized one by referring to Velero’s default values.yaml file.

values.yaml:

initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.7.1
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
configuration:
  backupStorageLocation:
    - name: "aws"
      provider: "velero.io/aws"
      bucket: BUCKET_NAME    # Replace with the name of your created bucket
      default: true
      config:
        region: AWS_REGION    # Specify the region where your bucket is located
  volumeSnapshotLocation:
    - name: aws
      provider: velero.io/aws
      config:
        region: AWS_REGION    # Specify the region where your volume(s) are located
serviceAccount:
  server:
    create: true
    name: velero
    annotations:
      eks.amazonaws.com/role-arn: IAM_ROLE_ARN    # Provide the ARN of the IAM role created earlier
schedules:
  eks-cluster:
    disabled: false
    schedule: "0 0 * * *"  # CRON expression: run a backup every day at 00:00
    template:
      ttl: "240h"  # Backups will be automatically deleted after 10 days

Add Helm Repository:

helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts

Install Velero using Helm:

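helm install velero vmware-tanzu/velero --namespace <YOUR_NAMESPACE> -f values.yaml --create-namespace

Replace placeholders like BUCKET_NAME, AWS_REGION, IAM_ROLE_ARN, and <YOUR_NAMESPACE> with your specific values. The namespace and the service account name (velero in the values above) must match the ones used in the IAM trust policy, otherwise the pod cannot assume the role. If you choose a namespace other than velero, pass --namespace to the Velero CLI commands later in this guide, since the CLI targets the velero namespace by default. Following these steps will deploy Velero on your Kubernetes cluster using Helm, configured to interact with AWS S3 for backup storage.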
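
After the install, a quick check confirms that the Velero pod is running and that it can reach the bucket. This sketch assumes the Helm release is named velero, so that the chart creates a deployment named velero; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: waits for the Velero deployment to roll out, then
# lists the backup storage locations Velero knows about.
# Assumes the Helm release created a deployment named "velero".
check_velero_install() {
    namespace=$1
    kubectl rollout status deployment/velero --namespace "$namespace" --timeout=120s &&
    kubectl get backupstoragelocations.velero.io --namespace "$namespace"
}

# Example (namespace is a placeholder):
# check_velero_install velero
```

If the IAM role and bucket are configured correctly, the storage location should eventually be reported as available; if it is not, the Velero pod logs usually show the underlying S3 or STS error.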

Testing Velero Backup and Restore:

Now that all the necessary components are in place, let’s verify that Velero can successfully perform backup and restore operations. To do this, we’ll run a manual backup and restore procedure, starting with the installation of the Velero CLI.

Install Velero CLI:

Mac:

brew install velero

Linux:

wget https://github.com/vmware-tanzu/velero/releases/download/v1.12.1/velero-v1.12.1-linux-amd64.tar.gz
tar -xvf velero-v1.12.1-linux-amd64.tar.gz
cd velero-v1.12.1-linux-amd64
sudo mv velero /usr/local/bin/velero

Windows:

choco install velero

For installation on other platforms, please refer to the official documentation.

These commands make the Velero CLI available on your system. Once it is installed, we can proceed with testing Velero’s backup and restore functionality on the Kubernetes cluster.
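
A short check shows whether the CLI can reach the Velero server in the cluster and whether the schedule from values.yaml was created. The function name is hypothetical; the namespace is passed explicitly because the CLI otherwise assumes the velero namespace:

```shell
#!/bin/sh
# Hypothetical helper: prints the client and server versions, then lists
# the backup schedules defined in the given namespace.
check_velero_cli() {
    namespace=$1
    velero version --namespace "$namespace" &&
    velero schedule get --namespace "$namespace"
}

# Example (namespace is a placeholder):
# check_velero_cli velero
```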

Backup and Restore Demonstration:

Before we proceed with manual backup and restore, let’s create a deployment on the Kubernetes cluster:

kubectl create deployment nginx --image=nginx

Now, let’s initiate a manual backup:

velero backup create demo --include-namespaces="*"

--include-namespaces="*": This flag specifies the namespaces to include in the backup. Here the asterisk (*) is a wildcard meaning that all namespaces are included, which is also Velero’s default when the flag is omitted.
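
The same flag also accepts a comma-separated list when only some namespaces should be backed up. A minimal sketch; the function names, backup name, and namespaces are hypothetical:

```shell
#!/bin/sh
# Joins its arguments with commas, the list format --include-namespaces expects.
join_namespaces() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Hypothetical helper: backs up only the given namespaces and waits
# for the backup to finish.
backup_namespaces() {
    name=$1
    shift
    velero backup create "$name" --include-namespaces "$(join_namespaces "$@")" --wait
}

# Example (backup and namespace names are placeholders):
# backup_namespaces demo-apps default monitoring
```

The `--wait` flag makes the command block until the backup finishes, which is convenient in scripts.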

To verify the completion of the backup, let’s check the backup status:

velero backup describe demo
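
In automation, it can be more convenient to poll the backup's phase than to read the describe output. The sketch below reads the phase from the Backup custom resource with kubectl; the function name, the 5-second polling interval, and the default of 60 attempts are assumptions:

```shell
#!/bin/sh
# Hypothetical helper: polls the Backup custom resource until it reaches a
# terminal phase. $1 is the backup name, $2 the Velero namespace, and
# $3 an optional maximum number of attempts (default 60).
wait_for_backup() {
    backup=$1
    namespace=$2
    attempts=${3:-60}
    while [ "$attempts" -gt 0 ]; do
        phase=$(kubectl get backups.velero.io "$backup" --namespace "$namespace" \
            -o jsonpath='{.status.phase}') || return 1
        case $phase in
            Completed)
                echo "backup $backup completed"
                return 0 ;;
            PartiallyFailed|Failed|FailedValidation)
                echo "backup $backup ended in phase $phase" >&2
                return 1 ;;
        esac
        attempts=$((attempts - 1))
        sleep 5
    done
    echo "timed out waiting for backup $backup" >&2
    return 1
}

# Example (names are placeholders):
# wait_for_backup demo velero
```

If the backup ends in a failed phase, `velero backup logs demo` shows the details.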

Additionally, let’s confirm, in the S3 console or with the AWS CLI, that the backup is stored in the designated S3 bucket.
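
From the command line, the backup objects can be listed with the AWS CLI. This sketch assumes no prefix is configured on the backup storage location, in which case Velero writes each backup under backups/<backup-name>/; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: lists the objects Velero wrote for one backup.
# Assumes the backup storage location has no prefix configured.
list_backup_objects() {
    bucket=$1
    backup=$2
    aws s3 ls "s3://$bucket/backups/$backup/"
}

# Example (bucket name is a placeholder):
# list_backup_objects your-custom-bucket-name demo
```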

Now, for demonstration purposes, let’s delete the previously created nginx deployment:

kubectl delete deployment nginx

If you are restoring into a new EKS cluster instead, first repeat the steps above to set up Velero on that cluster.

Restoring the Backup:

Run the restore from the demo backup:

velero restore create --from-backup demo
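
When no name is given, Velero generates a restore name from the backup name and a timestamp. Naming the restore explicitly makes it easier to inspect afterwards. A minimal sketch; the function name and the restore name are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: restores from a backup under an explicit restore name,
# waits for it to finish, then prints a summary of the result.
restore_from_backup() {
    restore=$1
    backup=$2
    velero restore create "$restore" --from-backup "$backup" --wait &&
    velero restore describe "$restore"
}

# Example (names are placeholders):
# restore_from_backup demo-restore demo
```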

Let’s verify if the nginx deployment was successfully restored:

kubectl get deploy

Excellent! Our Kubernetes cluster is now safeguarded. In the event of cluster unresponsiveness or accidental service deletions, recovery is straightforward with just a single command.

Note: When restoring to a different cluster, ensure that the backup location is correctly configured with the appropriate bucket and region.
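
On the target cluster, it helps to confirm that Velero sees the backups from the shared bucket, and optionally to mark the storage location read-only so the new cluster cannot modify them during the restore. The sketch below assumes the storage location is named aws, as in the values.yaml above; the function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: marks a backup storage location read-only so the
# new cluster cannot create or delete backup objects during a restore.
set_location_read_only() {
    location=$1
    namespace=$2
    kubectl patch backupstoragelocations.velero.io "$location" \
        --namespace "$namespace" --type merge \
        --patch '{"spec":{"accessMode":"ReadOnly"}}'
}

# Hypothetical helper: lists storage locations and the backups synced from them.
list_synced_backups() {
    namespace=$1
    velero backup-location get --namespace "$namespace" &&
    velero backup get --namespace "$namespace"
}

# Example (names are placeholders):
# set_location_read_only aws velero
# list_synced_backups velero
```

Velero syncs backups from the bucket periodically, so a backup taken on the source cluster may take a short while to appear in the list on the target cluster.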