Doris provides disaster recovery capabilities that protect against data loss caused by hardware failures, software errors, and operator mistakes. By combining cross-cluster data replication, backup and restore, and recycle bin recovery, you can keep data available and recoverable across failures of different scope, from a single dropped table to the loss of an entire region.
Different disaster recovery capabilities apply to different failure scenarios. Combine them based on your business needs:
| Capability | Applicable Scenario | Recovery Granularity | Recovery Speed | Typical Use Case |
|---|---|---|---|---|
| Cross-cluster data replication | Cluster-level, data-center-level, or region-level failures | Cluster, database, table | Seconds to minutes (near real-time) | Multi-region active-active, region-level disaster recovery, read-write separation |
| Backup and restore | Data corruption, long-term archiving, migration | Database, table, partition | Minutes to hours (depending on data volume) | Periodic snapshots, cross-cluster migration, compliance archiving |
| Recycle bin recovery | Short-term human errors such as accidentally dropping tables or databases | Table, database | Seconds | Accidental deletion recovery, short-term retention |
The three capabilities differ in data protection strength, operational cost, and dependencies:
| Dimension | Cross-cluster Data Replication | Backup and Restore | Recycle Bin Recovery |
|---|---|---|---|
| Data timeliness | Near real-time (full sync, then incremental sync) | Periodic snapshots | State at the moment of deletion |
| External dependencies | Requires an additional target Doris cluster | Requires remote storage such as object storage or HDFS | None (retained locally within the cluster) |
| Retention period | Continuous replication | Retained according to backup policy | Configurable retention period, automatically cleaned up on expiration |
| Operational cost | High (operating two clusters) | Medium (periodic jobs + storage cost) | Low (enabled by default) |
Doris Cross-Cluster Replication (CCR) supports near-real-time data replication between Doris clusters. It keeps copies of critical data on multiple physically isolated clusters, which enables region-level disaster recovery.
A company deploys two Doris clusters in different cities, with cluster A as the primary and cluster B as the backup. With cross-cluster data replication, when cluster A is interrupted by a natural disaster, cluster B can take over the workload and minimize downtime.
For detailed usage, see Cross-Cluster Replication Overview, Quick Start, and User Manual.
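As a rough illustration of the source-side preparation, the sketch below enables binlog on the objects to be replicated. It assumes a CCR-capable Doris release in which replication is driven by binlog. `example_db` and `example_tbl` are hypothetical names. Creating the replication job itself is covered in the Quick Start and is not shown here.

```sql
-- Assumption: binlog must be enabled on the source objects before CCR can sync them.
-- Enable binlog for a whole database (hypothetical name: example_db).
ALTER DATABASE example_db SET PROPERTIES ("binlog.enable" = "true");

-- Or enable it for a single table only (hypothetical name: example_tbl).
ALTER TABLE example_db.example_tbl SET ("binlog.enable" = "true");
```

Enabling binlog at the database level suits database-wide replication, while the table-level form limits the overhead to the tables you actually replicate.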
Doris provides backup and restore functionality for periodically saving data snapshots, preventing data loss caused by unexpected events, and supporting data migration and long-term archiving.
A company periodically backs up its data and stores the backup files in an object storage service (such as Amazon S3). When an important table is dropped by mistake and is no longer in the recycle bin, the company restores it from the most recent snapshot and resumes normal business operations.
For detailed usage, see Data Backup and Data Restore.
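The backup and restore cycle described above can be sketched as follows. This is a minimal example, not a complete procedure. It assumes a repository named `example_repo` has already been created, and the database, table, and snapshot label names are hypothetical. Check the exact syntax and properties against the Data Backup and Data Restore pages for your version.

```sql
-- Back up one table into the repository (all names are hypothetical).
BACKUP SNAPSHOT example_db.snapshot_label_1
TO example_repo
ON (example_tbl);

-- Check the progress of the backup job.
SHOW BACKUP FROM example_db;

-- List the snapshots stored in the repository to find the backup timestamp.
SHOW SNAPSHOT ON example_repo WHERE SNAPSHOT = "snapshot_label_1";

-- Restore the table from that snapshot.
-- Replace the placeholder with the timestamp reported by SHOW SNAPSHOT.
RESTORE SNAPSHOT example_db.snapshot_label_1
FROM example_repo
ON (example_tbl)
PROPERTIES
(
    "backup_timestamp" = "<timestamp-from-show-snapshot>"
);
```

Backup and restore jobs run asynchronously, so poll `SHOW BACKUP` or `SHOW RESTORE` until the job finishes rather than assuming the statement has completed the work.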
Doris provides a recycle bin feature that offers a quick way to recover recently deleted data and reduce the impact of operational mistakes.
A team accidentally drops an important table during routine operations. With the recycle bin feature, they quickly recover the deleted data, avoid a complex backup and restore process, and keep the business running.
For detailed usage, see Recycle Bin.
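A minimal sketch of recovering from an accidental drop is shown below. The database, table, and partition names are hypothetical, and the statements only succeed while the dropped object is still inside the retention period.

```sql
-- List the objects currently held in the recycle bin.
SHOW CATALOG RECYCLE BIN;

-- Recover a dropped table (hypothetical names).
RECOVER TABLE example_db.example_tbl;

-- Recover a dropped database.
RECOVER DATABASE example_db;

-- Recover a dropped partition of an existing table.
RECOVER PARTITION p1 FROM example_db.example_tbl;
```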
To recover an accidentally dropped table or database, try recycle bin recovery first. It completes in seconds and does not depend on external storage. If the retention period has passed, use backup and restore to recover from the most recent snapshot.
Cross-cluster replication does not replace backup and restore. Cross-cluster replication targets near-real-time disaster recovery and high availability, while backup and restore targets periodic snapshots and long-term archiving. Combine them to cover different failure scenarios.
The recycle bin does not keep deleted data permanently. Deleted data is kept only for the configurable retention period and is automatically cleaned up after it expires. For long-term retention, use backup and restore.
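As an illustration of adjusting the retention period, the sketch below assumes that the FE configuration item `catalog_trash_expire_second` controls how long dropped metadata stays recoverable. Verify the item name and default value against the Recycle Bin page for your version. The value of three days is an arbitrary example.

```sql
-- Inspect the current retention period, in seconds.
-- Assumption: catalog_trash_expire_second is the relevant FE configuration item.
ADMIN SHOW FRONTEND CONFIG LIKE 'catalog_trash_expire_second';

-- Extend the retention period to 3 days (259200 seconds) at runtime.
ADMIN SET FRONTEND CONFIG ("catalog_trash_expire_second" = "259200");
```

A change made with `ADMIN SET FRONTEND CONFIG` applies at runtime only. To keep it across FE restarts, also set the item in the FE configuration file.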
Use cross-cluster data replication to deploy primary and standby Doris clusters in different regions. When the primary cluster fails, the standby cluster takes over the workload.
Backup files can be stored in remote storage such as object storage (for example, Amazon S3) or HDFS. Keeping them outside the source cluster avoids sharing a single point of failure with it.
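A hedged sketch of registering an S3 bucket as a backup repository follows. The repository name, bucket path, endpoint, and region are hypothetical, and the credentials are placeholders. The property keys shown follow the `s3.*` naming used in recent releases, and older releases may expect different keys, so confirm them in the Data Backup page before use.

```sql
-- Register an S3 location as a backup repository (all values are examples).
CREATE REPOSITORY `example_repo`
WITH S3
ON LOCATION "s3://example-bucket/doris-backup"
PROPERTIES
(
    "s3.endpoint" = "https://s3.us-east-1.amazonaws.com",
    "s3.region" = "us-east-1",
    "s3.access_key" = "<your-access-key>",
    "s3.secret_key" = "<your-secret-key>"
);

-- Confirm that the repository was created.
SHOW REPOSITORIES;
```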