???What strategies can be applied to database recovery???

 

Develop a Backup and Recovery Strategy

 

“A database can become unusable because of hardware or software failure, or both. You may, at one time or another, encounter storage problems, power interruptions, and application failures, and different failure scenarios require different recovery actions. Protect your data against the possibility of loss by having a well rehearsed recovery strategy in place.

 

 

Unplanned downtime is an unfortunate

fact of life...

Up to 80% of all unplanned downtime is caused by

software or human error

Up to 70% of recovery is “think time”!

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

???What questions should we ask when developing strategies???

1.      Will the database be recoverable?

2.      How much time can be spent recovering the database?

3.      How much time will pass between backup operations?

4.      How much storage space can be allocated for backup copies and archived logs?

5.      Will table space level backups be sufficient, or will full database backups be necessary?

 

???OK, Now that we asked…what must be included???

1.      A database recovery strategy should ensure that all information is available when it is required for database recovery.

2.      It should include a regular schedule for taking database backups and, in the case of partitioned database systems, include backups when the system is scaled (when database partition servers or nodes are added or dropped).

3.      Your overall strategy should also include procedures for recovering command scripts, applications, user-defined functions (UDFs), stored procedure code in operating system libraries, and load copies.

Let’s look at some terms:

·        The concept of a database backup is the same as any other data backup: taking a copy of the data and then storing it on a different medium in case of failure or damage to the original.

·        The rebuilding of the database is called recovery.

·        Version recovery is the restoration of a previous version of the database, using an image that was created during a backup operation.

·        Rollforward recovery is the reapplication of transactions recorded in the database log files after a database or a table space backup image has been restored.

·        Crash recovery is the automatic recovery of the database if a failure occurs before all of the changes that are part of one or more units of work (transactions) are completed and committed.

Recovery Log Files and the Recovery History File

Graphic shows a database with related recovery log files and recovery history file.

·        Each database includes recovery logs, which are used to recover from application or system errors. In combination with the database backups, they are used to recover the consistency of the database right up to the point in time when the error occurred.

·        The recovery history file contains a summary of the backup information that can be used to determine recovery options, if all or part of the database must be recovered to a given point in time. It is used to track recovery-related events such as backup and restore operations, among others. This file is located in the database directory.

·        The table space change history file, which is also located in the database directory, contains information that can be used to determine which log files are required for the recovery of a particular table space.

These strategies should be used to quickly recover in the event of a database failure.

High availability strategies

Technique

Usage

Advantages

Disadvantages

Object-level recovery using Export Import and SQL*Loader

Uses Export/ Import to protect data

Fast object- level recovery

Difficult to scale; you must be aware of object associations

Failover systems using hardware redundancy

Failover provided by using another node

No data loss due to redundant system

No scalability; costly

Oracle standby databases

Primary database's redo log keeps another database updated which can be used during recovery

Fast recovery; failover; disaster recovery possible

Data loss possible; complicated setup and maintenance; potential of replicating data-base corruption

Oracle Symmetric Replication

Uses Oracle's replication feature to provide high availability

No data loss; failover; disaster recovery possible; both databases can be used simultaneously

Slow recovery due to use of transactions; use of two-phase commit can lead to additional problems while maintaining the database's consistency

Oracle Parallel Server

Clustering solution that allows failover to another instance; recovery can proceed simultaneously and is done by the surviving instances

No data loss; fast failover; protects against node and cache failures; high scalability; load balancing

Tuning can be difficult; application design plays significant part in strategy's success

Triple mirroring

Uses a third hardware that is a mirror

Fast hot backups; fast recovery

Cost of triple writes and resilvering

EMC SRDF facility

Physical I/O- based replication

No data loss; failover; disaster recovery possible; faster than Oracle Symmetric Replication

Potential of replicating database corruption

Customized store-and- forward replication

Makes use of Oracle8 features such as advanced queuing or trigger-based asynchronous replication

No data loss; fast recovery

Complex; serializing of transactions

General steps to recover a database system

  1. Detect the failure. The detection of an outage is usually simple: Either the database isn't responding to the application or the system has displayed explicit error messages. However a problem such as a corrupt control file may not be detected while the database is running.
  2. Analyze the failure. You should analyze the type and extent of the failure; the recovery procedure will depend on this analysis. This task can take a significant amount of time in large systems.
  3. Determine the components of the database that need recovery. This task can also be significant in large systems. You need to determine which components (such as a table) are lost and then determine whether you need to recover the tablespace or a data file.
  4. Determine the dependencies between components to be recovered. Usually the components aren't isolated; loss or recovery of a database object can affect other objects. For example if a table needs recovery you'll also have to recreate the indexes. This step isn't done automatically by the recovery of the table.
  5. Determine the location of the backup. The closer the backup is to where the recovery is to be performed the lesser is the MTTR.

Location factors

If the backup is on a disk is the disk on-site or off-site? Is the disk local or network? Do you have mirrored copies? Are you recovering from a cold or a hot backup? If the backup is on tape is the tape on-site or off-site? Do you need additional components to access the tape?

  1. Perform the restore. This involves restoring the physical file from disk or tape and placing it at a location where the database can access the file for recovery purposes. The time to restore is affected by file location file size file format (raw export blocks or extracts) and possibilities of restore parallelism.
  2. Replay redo logs (for archived databases) and resync the database components.”

 

 

Tutorial Sources and Links to more information:

http://www-rohan.sdsu.edu/doc/oracle/server803/A54640_01/ch4_rcvs.htm#421360

http://searchdatabase.techtarget.com/bestWebLinks/0,289521,sid13_tax281653,00.html

http://searchstorage.techtarget.com/tip/1,289483,sid5_gci866733,00.html

http://plutonium.cs.umanitoba.ca/DB2Docs/index.htm?openup=admin/c0005945.htm

http://itpapers.zdnet.com/search.aspx?dtid=2&scid=76

https://aurora.vcu.edu/db2help/db2d0/frame3.htm#db2d015