When disaster strikes on a business database, a company could lose all connection to its data or (even worse) its entire data file. Internal and client-facing applications may also cease to function.
Here are some common causes of a database disaster:
- Power Failure
- Hardware Failure
- Virus Attack
- Natural Disaster
- Human Error
Note: The term High Availability and Disaster Recovery is sometimes used interchangeably. High Availability is the term that shows the percentage of time a system (server) needs to be available. In some environments, it goes as high as 99.999% - only 5.26 minutes of downtime per calendar year.
A disaster recovery plan (DRP) must be in place when a disaster occurs. A DRP is a set of steps that is performed automatically to switch users to an alternate connection to the database. These action plans are put together in advance and documented in order to prevent any catastrophic data loss or incidents. It is a good practice to simulate a failure by creating a disaster situation, so your company can test your solution before a real disaster strikes.
A thorough plan must consider numerous factors:
- Sensitivity of Data
- Data Loss Tolerance
- Required Availability
The DRP can then be based on one or a combination of following solutions:
- Database Mirroring
- Log Shipping
- Failover Clustering
Each solution has its own advantages, implementation costs, and planning time. A SQL Server disaster recovery plan should include one or more available solutions.
Database Mirroring maintains a second identical copy (a Mirror Database) of the source database (a Principal Database), increasing the availability of the database.
These two databases form a relationship – a Database Mirroring Session - during which the data from the Principal Database is copied and replicated to the Mirror Database. The two instances or servers that participate in a Database Mirroring Session are called Partners.
An optional third SQL Server instance may be used as a Witness Server to oversee the Database Mirroring Session and intervene if required. For some businesses, this is required.
Architecture of Database Mirroring Technique:
Whenever a new transaction is committed on the Principal Database, the Database Mirroring Session ensures that the same transaction is carried out on the Mirror Database as well.
There are three different modes of operation:
Mode 1 - High Safety with Manual Failover
In this mode, only the Principal Database and the Mirror Database participate in the Database Mirroring Session - there is no Witness Server. Each new transaction is first executed and committed on the Principal Database, and then immediately replicated on the Mirror Database.
Mode 2 - High Safety with Automatic Failover
In this mode, all three servers - Principal Server, Mirror Server and Witness Server - participate in the Database Mirroring Session.
There is one advantage to having a Witness Server: Automatic Failover
In Mode 1, we must manually bring the Mirror Database up and set it as Principal Database so it starts serving the user/application.
In Mode 2, the Witness Server will immediately notify the Mirror Server of the Principal Server’s failure and convert the Mirror Database to the Principal Database. The newly converted database would start serving the user/application without much delay. This entire process takes place within a few seconds.
Mode 3 - High Performance
In this mode, the Principal Database and the Mirror Database participate in the Database Mirroring Session. The participation of a Witness Server is optional.
Whenever a new transaction is first executed and committed on the Principal Database, a confirmation/acknowledgement is immediately given to the user/application. The Mirror Database queues the transaction. The Principal Server executes and commits the transaction to the Mirror database. The Principal Server acknowledges the transaction details to the application and user directly immediately, saving time.
This improves performance, but decreases the data’s reliability. If a Mirror Database/Server goes down, the transaction queued to the Mirror Database may be lost before getting applied on the Mirror Database. For businesses in which every byte of data is important (e.g. financial institutions with bank transactions) the risk associated with this mode is too high, and an alternative method must be used.
Log Shipping maintains multiple copies of the source database.
This solution is implemented by creating and scheduling 3 jobs:
Backup, Copy and Restore.
Backup Job – This job is created on the Primary Server (SQLServer_A)
Copy and Restore Jobs – These jobs are created on the Secondary Server (SQLServer_B)
The Backup, Copy and Restore jobs are scheduled to run back-to-back. Whenever a backup is taken, the backup file is copied immediately and restored onto the Secondary Server.
This method automates the backup-copy-restore process on two SQL instances. Unlike Database Mirroring, log shipping can have more than one instance of the Secondary Database.
Replication also follows the concept of maintaining multiple copies of the source database.
Instead of having the data flow directly from the source to the destination, the source server copies the data to the destination server through a Distributor (a centralized entity).
Source Database: Publishers
Databases that provide data. A replication setup may have one or more Publishers.
Destination Database: Subscribers
Destination (or secondary) Databases that receive data from the Publishers through the Replication Setup.
SQL Server supports three types of replication.
Snapshot Replication:The Publisher makes a snapshot (copy) of the database and makes it available for one or more Subscribers.
Transactional Replication:Dedicated agent jobs monitor for any changes on the Publisher (source database) and transmit the changes to the Subscribers.
Merge Replication:Publishers and Subscribers independently make changes to the database. Merge Agent monitors the changes on both Publisher and Subscriber, and if needed it modifies the databases.
Failover Clustering is a solution where the SQL Server instance is installed on the shared storage. It provides the infrastructure to supports high-availability, disaster recovery scenarios of databases and applications. If a cluster node fails, the services hosted on this node are automatically/manually moved to another available node. A Failover is a movement of services from one node to another.
A failover can take a few seconds, during which the database will not be accessible.