Multi-Site Active-Active DR Strategy in SITE Cloud

Cloud Product Team

9 min read

November 6, 2025

Implementing a Multi-Region Active/Active Disaster Recovery Strategy

In this post, you'll learn how to implement an Active/Active Disaster Recovery (DR) strategy to run your workload and serve requests concurrently from the Riyadh and Jeddah SITE Cloud Regions. This strategy is designed to ensure high availability and business continuity, allowing your workload to remain accessible and operational despite major outage events, such as natural disasters, systemic technical failures, or significant human error.

Multi-Site Active/Active Architecture

The architecture illustrates how to leverage SITE Cloud Regions as your active sites, establishing a multi-Region Active/Active deployment.

Workloads: Each Region hosts a highly available (HA) workload stack deployed across multiple Availability Zones (Z1, Z2). This design provides resilience against localized failures within a single site.

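Data Replication: Data is replicated continuously, either synchronously or asynchronously, from the primary database in Riyadh to the database instance in Jeddah. Synchronous replication minimizes data loss at the cost of added write latency between Regions, while asynchronous replication keeps writes fast but can lose the most recent transactions during a failover.

Backup and Recovery: A backup and recovery mechanism is configured in both Regions. This protects against logical disasters, such as data corruption or accidental deletion, by enabling point-in-time recovery (PITR) to the last known good state.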

Figure 1: Architecture diagram for active-active multi-site


Global Traffic Routing

Each regional application stack is designed to serve production traffic. The architecture utilizes the SITE Cloud Global Load Balancer (GLB), a highly available and scalable cloud Domain Name System (DNS) service, for traffic steering. The SITE Cloud GLB supports various routing policies, including:

Active-Active Configuration

Two or more sites are mapped to the Active-Active GLB configuration. In response to a DNS query, the GLB returns one of the configured IP addresses for the active endpoints. The selection is typically non-deterministic across sites (e.g., random or round-robin, without session persistence). This configuration assumes identical workload setups (or symmetrical scaling) across all active sites, so each site must be sized to carry production traffic, which makes cost optimization a significant factor.
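The selection behavior described above can be illustrated with a small Python sketch. It is not the SITE Cloud GLB implementation or API: the `Endpoint` type, the `resolve_active_active` function, and the IP addresses (taken from the reserved documentation ranges) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    site: str             # illustrative site label
    ip: str               # documentation-range address, not a real SITE Cloud IP
    healthy: bool = True  # result of the most recent health check


def resolve_active_active(endpoints, rng=random):
    """Return the IP of one healthy endpoint, chosen uniformly at random."""
    candidates = [e for e in endpoints if e.healthy]
    if not candidates:
        raise LookupError("no healthy endpoints available")
    return rng.choice(candidates).ip


endpoints = [
    Endpoint("riyadh", "192.0.2.10"),
    Endpoint("jeddah", "198.51.100.10"),
]

# Each simulated DNS query may return either site; there is no session persistence.
print([resolve_active_active(endpoints) for _ in range(4)])
```

Because consecutive queries from the same client can land on different sites, the application should not assume that a user stays pinned to one Region.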

Active-Passive Configuration

Two or more sites are mapped to an Active-Passive setup. Traffic is only routed to the secondary (passive) site if the primary (active) site is deemed unhealthy or all its pool members are unavailable.
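As a rough sketch of the Active-Passive rule, the following Python function walks an ordered list of site pools and answers from the first one that still has a healthy member. The function name, the data layout, and the addresses are hypothetical and are not part of the SITE Cloud GLB interface.

```python
def resolve_active_passive(pools):
    """Pick the first site, in priority order, that has a healthy pool member.

    pools: ordered list of (site, [(ip, healthy), ...]); the first entry is
    the primary (active) site, later entries are passive sites.
    """
    for site, members in pools:
        healthy_ips = [ip for ip, healthy in members if healthy]
        if healthy_ips:
            return site, healthy_ips
    raise LookupError("no healthy pool members in any site")


pools = [
    ("riyadh", [("192.0.2.10", False), ("192.0.2.11", False)]),  # primary, all members down
    ("jeddah", [("198.51.100.10", True)]),                       # secondary
]
print(resolve_active_passive(pools))
```

In this example every member of the primary pool is unavailable, so the answer comes from the secondary site; if any primary member were healthy, the secondary would receive no traffic.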

Active/Active Data Replication

SITE Cloud DBaaS-PostgreSQL utilizes its Cluster-to-Cluster (C2C) feature to support PostgreSQL replication. This typically involves a primary cluster replicating to one or more standby/secondary clusters.

Bi-directional Replication Considerations

While bi-directional replication can be configured for a multi-active architecture, it introduces significant complexity. Native PostgreSQL replication (including logical replication) does not resolve write conflicts automatically. Implementing bi-directional data flow therefore requires meticulous application design to ensure disjoint write sets and prevent data conflicts (e.g., ensuring that any given record is written in only one Region).
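One way to keep write sets disjoint is to give every record a single owning Region and forward writes that arrive elsewhere. The Python sketch below shows the idea under stated assumptions: ownership is derived from a hash of the record key, and the names `REGIONS`, `home_region`, and `route_write` are hypothetical. Real designs often assign ownership by tenant or customer geography instead.

```python
import hashlib

REGIONS = ("riyadh", "jeddah")  # illustrative Region labels


def home_region(record_key: str) -> str:
    """Map a record key to exactly one owning Region, deterministically."""
    digest = hashlib.sha256(record_key.encode("utf-8")).digest()
    return REGIONS[digest[0] % len(REGIONS)]


def route_write(record_key: str, local_region: str) -> str:
    """Apply the write locally only if this Region owns the record."""
    owner = home_region(record_key)
    if owner == local_region:
        return "apply-locally"
    return f"forward-to:{owner}"


for key in ("order-1001", "order-1002", "order-1003"):
    print(key, home_region(key), route_write(key, "riyadh"))
```

Because each key has exactly one owner, two Regions never accept concurrent writes to the same record, which is the property the replication layer cannot enforce on its own.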

Reads from Replicas

To optimize performance, read operations are typically distributed across one or more replica nodes (standbys/secondaries). This offloads read traffic from the primary instance, reducing its transaction load and enhancing overall system responsiveness, which is particularly beneficial for read-heavy applications.
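A minimal read/write splitting sketch in Python is shown below. It assumes the application chooses a connection target per statement; the `ReadWriteRouter` class and the target names are hypothetical, and the statement classification is deliberately naive (it only checks for a leading `SELECT` without `FOR UPDATE`). Note that reads served by an asynchronous replica can be slightly stale.

```python
import itertools


class ReadWriteRouter:
    """Naive router: plain SELECTs go to replicas, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas in order; fall back to the primary if none exist.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql):
        statement = sql.strip().upper()
        is_plain_read = statement.startswith("SELECT") and "FOR UPDATE" not in statement
        if is_plain_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
print(router.target_for("SELECT * FROM orders"))           # served by a replica
print(router.target_for("INSERT INTO orders VALUES (1)"))  # served by the primary
```

Reads that must observe a write made moments earlier are usually sent to the primary as well, since a replica may not have applied that write yet.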

Automated Failover

In an Active/Active multi-Region strategy, if a Region's workload becomes inoperable or reports a failure, the automated failover process will reroute traffic away from the impacted Region to the remaining healthy Region.

This is managed using SITE Cloud Global Load Balancer (GLB) health checks. For the failover to execute rapidly and meet your Recovery Time Objective (RTO) targets, it is crucial to set a low Time-To-Live (TTL) value on the associated DNS records. A low TTL ensures that DNS resolvers refresh their cached records quickly and pick up the IP addresses of the healthy endpoints.
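To see why the TTL matters for RTO, a back-of-the-envelope estimate helps. The Python sketch below assumes a simple model that is not taken from SITE Cloud documentation: a site is marked unhealthy after a fixed number of consecutive failed checks, and a client may keep using a cached DNS answer for up to one full TTL after that. The parameter names and sample values are illustrative, and some resolvers hold records longer than the TTL, so treat the result as an approximation.

```python
def worst_case_failover_seconds(check_interval_s, failures_to_unhealthy, dns_ttl_s):
    """Approximate worst-case time until clients stop reaching a failed site."""
    # Time for the health checker to declare the site unhealthy.
    detection = check_interval_s * failures_to_unhealthy
    # A resolver may have cached the stale answer just before detection.
    return detection + dns_ttl_s


# Same health-check settings, two different TTLs (illustrative values).
print(worst_case_failover_seconds(10, 3, 30))   # 60
print(worst_case_failover_seconds(10, 3, 300))  # 330
```

With identical health-check settings, raising the TTL from 30 to 300 seconds moves the estimate from one minute to five and a half minutes, so the TTL often dominates the client-visible recovery time.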

Cloud Security Overview

Network Security

North-South Traffic: All external (North-South) network traffic is inspected by Next-Generation Firewalls (NGFW), providing essential Layer-7 application-layer security.

Web Application Firewall (WAF): A Cloud WAF is integrated with the Application Load Balancers to provide robust defense against common web exploits and vulnerabilities.

Endpoint Security

Hardened Images: Only security-hardened Virtual Machine (VM) images are provided by default, reducing the attack surface.

Microsegmentation: VMs are deployed with default microsegmentation, which offers Layer-7 firewalling capabilities that actively prevent lateral threat movement across the internal network.

Threat Monitoring: VMs include built-in Endpoint Protection Platform (EPP) and Endpoint Detection and Response (EDR) security agents.

24/7 SOC and NOC: The EPP and EDR agents enable 24/7 threat monitoring and incident response by the SITE Cloud Security Operations Center (SOC) and Network Operations Center (NOC).

Conclusion and Trade-offs

The multi-site Active/Active strategy is the optimal choice for workloads that demand the fastest recovery (lowest Recovery Time Objective, RTO) and the least data loss (lowest Recovery Point Objective, RPO). A multi-Region implementation provides the maximum geographical separation and operational independence between sites, and also offers low-latency access for a distributed user base. These benefits come with trade-offs: implementing, synchronizing, and operating this strategy, especially across multiple Regions, is typically more complex and significantly more expensive than simpler DR architectures.
