TechAnek

How to Automatically Reload Production Data into a Staging Database Daily Using Jenkins

Keeping your staging database updated with fresh production data is crucial for accurate testing and debugging. It helps developers catch bugs early by working with real-world scenarios. However, doing this manually every day can be time-consuming and error-prone. That’s where Jenkins, a powerful automation tool, comes in. In this blog, you’ll learn how to automate the daily process of reloading production data into your staging environment using Jenkins. This not only saves time but also ensures consistency across environments.

Why Reload Production Data into Staging?

Reloading production data into your staging environment ensures your tests and development work closely mirror real-world usage. Here are some key reasons why it’s important:

  • Using production-like data helps uncover bugs that may not appear with dummy or outdated test data
  • Developers can reproduce and diagnose issues more accurately with real data.
  • Quality assurance teams can validate features under realistic data scenarios, ensuring better release quality.
  • With current data in staging, stakeholders can review features and fixes in a near-production setting.
  • Testing with real data helps prevent unexpected behaviour after deployment.

Keeping staging in sync with production ultimately leads to more stable and reliable software releases.

What You’ll Need Before You Start

Before setting up the automation to reload production data into your staging database using Jenkins, make sure you have the following tools and access in place:

  • A working Jenkins installation (freestyle or pipeline jobs supported).
  • Basic shell scripting knowledge using Bash, or a language like Python.
  • For securely transferring data between servers using scp or rsync.
  • To securely manage database usernames, passwords, and SSH keys.

Make sure to follow proper security practices, especially when dealing with production data, including masking sensitive information where necessary.

Jenkins Pipeline Breakdown: Automating the Reload Process

The pipeline performs the following steps:

  1. Creates a Read Replica of the production DB.
  2. Promotes the replica to a standalone instance.
  3. Deletes the old staging DB (after disabling deletion protection).
  4. Renames the replica to match the staging DB identifier.
  5. Reconfigures access credentials (including resetting the admin password).

All of this happens automatically, so you don’t have to manually copy snapshots or run scripts.

Create Replica of PROD DB

This stage is responsible for creating a read replica of your production database and promoting it to a standalone instance.

				
					environment {
    SOURCE_CRED = credentials('<prod-db-creds-id>')         // Contains prod DB username/password
    DEST_CRED   = credentials('<stage-db-creds-id>')        // Contains stage DB username/password
    ADMIN_CRED  = credentials('<stage-admin-creds-id>')     // Admin-level credentials for stage DB

    SOURCE_ID   = "prod-db-instance-id"
    REPLICA_ID  = "prod-replica-id"
    DEST_ID     = "stage-db-id"
    DEST_HOST   = "stage-db-host"
    DEST_DB     = "stage-db-name"
}


stage('Create and Promote Replica') {
    steps {
        sh '''
            set -euo pipefail
            set -x

            echo "Creating read replica of production DB..."
            aws rds create-db-instance-read-replica \
                --db-instance-identifier "$REPLICA_ID" \
                --source-db-instance-identifier "$SOURCE_ID" \
                --db-instance-class db.m5.large \
                --availability-zone us-east-1f

            echo "Waiting for replica to become available..."
            until aws rds describe-db-instances --db-instance-identifier "$REPLICA_ID" | grep '"DBInstanceStatus": "available"'; do
                sleep 30
            done

            echo "Promoting replica to standalone..."
            aws rds promote-read-replica --db-instance-identifier "$REPLICA_ID"

            echo "Waiting for promoted replica to become available..."
            until aws rds describe-db-instances --db-instance-identifier "$REPLICA_ID" | grep '"DBInstanceStatus": "available"'; do
                sleep 30
            done
        '''
    }
}

				
			

first, we need to define all necessary environment variables. These include Jenkins credentials for accessing the databases, and the identifiers for AWS RDS instances.

Then we will safely replicate production data without putting load on the live database, this stage involves creating a read replica and preparing it to serve as the new staging environment.

  • Create Replica: An Amazon RDS read replica is created from the existing production database instance. This ensures that the replica contains an up-to-date copy of the production data, without introducing risk or performance issues to the live system. Depending on the size of your database, this process can take several minutes.

  • Promote Replica: Once the replica is fully synchronized, it is promoted to a standalone RDS instance. At this point, it functions independently of the production database and is ready to be used as the new staging environment.

This approach allows you to build a fresh staging instance using real production data—without introducing any downtime or performance degradation to your live system. It also offers a reliable way to test with realistic data while preserving the integrity of both environments.

Replace Stage DB with Promoted Replica

This stage deletes the current staging DB, and renames the newly promoted replica to take its place. It also resets credentials for the new staging instance.

				
					        stage('Replace Stage DB with Promoted Replica') {
            steps {
                sh '''
                    #!/bin/bash
                    set -euo pipefail
                    set -x

                    echo "Disabling deletion protection for existing stage DB..."
                    aws rds modify-db-instance --db-instance-identifier "$DEST_ID" --no-deletion-protection --apply-immediately

                    echo "Deleting old stage DB instance..."
                    aws rds delete-db-instance --db-instance-identifier "$DEST_ID" --skip-final-snapshot

                    echo "Waiting for stage DB to be deleted..."
                    while aws rds describe-db-instances --db-instance-identifier "$DEST_ID" 2>&1 | grep -q '"DBInstanceStatus": "deleting"'; do
                        sleep 30
                    done

                    echo "Renaming promoted replica to stage DB identifier..."
                    aws rds modify-db-instance --db-instance-identifier "$REPLICA_ID" --new-db-instance-identifier "$DEST_ID" --apply-immediately

                    echo "Waiting for renamed instance to be available..."
                    sleep 150
                    until aws rds describe-db-instances --db-instance-identifier "$DEST_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done

                    echo "Re-enabling deletion protection..."
                    aws rds modify-db-instance --db-instance-identifier "$DEST_ID" --deletion-protection --apply-immediately
                    sleep 60

                    echo "Waiting again for DB to be available after modifications..."
                    until aws rds describe-db-instances --db-instance-identifier "$DEST_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done

                    echo "Updating stage DB master password..."
                    aws rds modify-db-instance \
                        --db-instance-identifier "$DEST_ID" \
                        --master-user-password "$DEST_CRED_PSW"

                    echo "Waiting for DB to become available after password change..."
                    sleep 180
                    until aws rds describe-db-instances --db-instance-identifier "$DEST_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done

                    echo "Changing admin user password via MySQL..."
                    mysql --user="$DEST_CRED_USR" --password="$DEST_CRED_PSW" --host="$DEST_HOST" -e \
                        "ALTER USER '${ADMIN_CRED_USR}'@'%' IDENTIFIED BY '${ADMIN_CRED_PSW}'; FLUSH PRIVILEGES;"
                '''
            }
        }
				
			

This stage is responsible for creating a read replica of your production database and promoting it to a standalone instance.This final stage transitions the newly created database replica into an active staging environment.

  • Delete Old Staging DB: The existing staging database instance is removed to avoid conflicts and reduce resource usage.

  • Rename Replica: The promoted read replica is renamed to match the original staging database identifier. This ensures seamless integration with existing services and environment configurations.

  • Reconfigure Credentials: All necessary database users are recreated, passwords are reset, and admin access is restored to match staging requirements.

The result is a clean, up-to-date staging database that accurately reflects production, while remaining isolated and secure for testing and development purposes.

Automate It Daily

To ensure your staging environment is always refreshed with the latest production data, you can schedule this Jenkins pipeline to run automatically during off-peak hours—typically late at night or early morning, when system traffic is minimal.

				
					triggers {
    cron('0 2 * * *') // Runs daily at 2:00 AM
}
				
			

This configuration schedules the pipeline to run every day at 2:00 AM, a common low-traffic period. Running during off-hours helps minimize potential impact on resource usage, especially when creating and promoting database replicas.

Complete Jenkins Pipeline: Automate Staging DB Refresh from Production

Here’s a full working Jenkins pipeline that automates the process of creating a read replica from your production database and promoting it to replace the staging environment

				
					pipeline {
    agent any
    options {
        timestamps()
    }
    
    triggers {
        cron('0 2 * * *') 
    }

    environment {
        SOURCE_CRED = credentials('<your-prod-db-credentials-id>')
        DEST_CRED   = credentials('<your-stage-db-credentials-id>')
        ADMIN_CRED  = credentials('<your-stage-admin-credentials-id>')

        SOURCE_ID   = "<your-prod-db-id>"
        REPLICA_ID  = "<your-replica-db-id>"
        DEST_ID     = "<your-stage-db-id>"
        DEST_DB     = "<your-stage-db-name>"
        DEST_HOST   = "<your-stage-db-host>"
        BKP_TABLE   = "<your-table-name>"
    }

    stages {
        stage('Create Replica of PROD DB') {
            steps {
                sh '''
                    #!/bin/bash
                    set -euo pipefail
                    set -x

                    export BKP_PATH="${WORKSPACE}"
                    export DATE=$(date +"%d-%b-%Y")

                    echo "Creating read replica of production DB..."
                    aws rds create-db-instance-read-replica \
                        --db-instance-identifier "$REPLICA_ID" \
                        --source-db-instance-identifier "$SOURCE_ID" \
                        --db-instance-class db.m5.large \
                        --availability-zone us-east-1f

                    echo "Waiting for replica to become available..."
                    until aws rds describe-db-instances --db-instance-identifier "$REPLICA_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done

                    echo "Promoting replica to standalone..."
                    aws rds promote-read-replica --db-instance-identifier "$REPLICA_ID"

                    echo "Waiting for promoted replica to become available..."
                    until aws rds describe-db-instances --db-instance-identifier "$REPLICA_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done
                '''
            }
        }

        stage('Replace Stage DB with Promoted Replica') {
            steps {
                sh '''
                    #!/bin/bash
                    set -euo pipefail
                    set -x

                    echo "Disabling deletion protection for existing stage DB..."
                    aws rds modify-db-instance --db-instance-identifier "$DEST_ID" --no-deletion-protection --apply-immediately

                    echo "Deleting old stage DB instance..."
                    aws rds delete-db-instance --db-instance-identifier "$DEST_ID" --skip-final-snapshot

                    echo "Waiting for stage DB to be deleted..."
                    while aws rds describe-db-instances --db-instance-identifier "$DEST_ID" 2>&1 | grep -q '"DBInstanceStatus": "deleting"'; do
                        sleep 30
                    done

                    echo "Renaming promoted replica to stage DB identifier..."
                    aws rds modify-db-instance --db-instance-identifier "$REPLICA_ID" --new-db-instance-identifier "$DEST_ID" --apply-immediately

                    echo "Waiting for renamed instance to be available..."
                    sleep 150
                    until aws rds describe-db-instances --db-instance-identifier "$DEST_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done

                    echo "Re-enabling deletion protection..."
                    aws rds modify-db-instance --db-instance-identifier "$DEST_ID" --deletion-protection --apply-immediately
                    sleep 60

                    echo "Waiting again for DB to be available after modifications..."
                    until aws rds describe-db-instances --db-instance-identifier "$DEST_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done

                    echo "Updating stage DB master password..."
                    aws rds modify-db-instance \
                        --db-instance-identifier "$DEST_ID" \
                        --master-user-password "$DEST_CRED_PSW"

                    echo "Waiting for DB to become available after password change..."
                    sleep 180
                    until aws rds describe-db-instances --db-instance-identifier "$DEST_ID" | grep '"DBInstanceStatus": "available"' > /dev/null; do
                        sleep 30
                    done

                    echo "Changing admin user password via MySQL..."
                    mysql --user="$DEST_CRED_USR" --password="$DEST_CRED_PSW" --host="$DEST_HOST" -e \
                        "ALTER USER '${ADMIN_CRED_USR}'@'%' IDENTIFIED BY '${ADMIN_CRED_PSW}'; FLUSH PRIVILEGES;"
                '''
            }
        }
    }

    post {
        success {
            echo "Database replication pipeline completed successfully!"
        }
        failure {
            echo "Database replication pipeline failed!"
        }
    }
}

				
			

Conclusion

By automating the daily refresh of your staging database with Jenkins, you not only streamline repetitive tasks but also create a more stable and reliable development environment. The combination of AWS RDS capabilities and simple scripting makes the process both powerful and adaptable. As your infrastructure grows, this foundation can evolve—integrating alerts, security controls, and rollback mechanisms—to support a more resilient and scalable pipeline. Ultimately, it’s about building confidence in every release by ensuring your staging environment always reflects the real world.

Leave a Reply

Your email address will not be published. Required fields are marked *