Saturday, January 20, 2018

Introduction of Backup & Recovery


Why we need backup & recovery ?

  • Protected the database from failure
  • Minimize data loss
  • Increase mean time between failure
  • decrease mean time to recover


One of a database administrator’s (DBA) major responsibilities is to ensure that the database is available for use.
To protect the data from the various types of failures that can occur, the DBA must back up the database regularly. Without a current backup, it is impossible for the DBA to get the database up and running if there is a file loss, without losing data.

Categories of Failures:
=======================================

Different types of failures may occur in an Oracle database environment. These include:

  • Statement failure
  • User process failure
  • User error
  • Instance failure
  • Media failure
  • Network failure


Each type of failure requires a varying level of involvement by the DBA to recover effectively from the situation. In some cases, recovery depends on the type of backup strategy that has been implemented.



Statement Failure:
=======================================

Statement failure occurs where there is a logical failure in the handling of a statement in an Oracle program. Types of statement failures include:

  • A logical error occurs in the application.
  • The user attempts to enter invalid data into the table, perhaps violating integrity constraints.
  • The user attempts an operation with insufficient privileges, such as an insert on a table using only SELECT privileges.
  • The user attempts to create a table but exceeds the user’s allotted quota limit.
  • The user attempts an INSERT or UPDATE on a table, causing an extent to be allocated, but insufficient free space is available in the tablespace.




Resolutions for Statement Failures
=============================================

DBA intervention after statement failures will vary in degree, depending on the type of failure, and may include the following:

  • Fix the application so that logical flow is correct. Depending on your environment this may be an application developer task rather than a DBA task.
  • Modify the SQL statement and reissue it. This may also be an application developer task rather than a DBA task.
  • Provide the necessary database privileges for the user to complete the statement successfully.
  • Issue the ALTER USER command to change the quota limit.
  • Add file space to the tablespace. Technically, the DBA should make sure this does not happen; however, in some cases it may be necessary to add file space. A DBA can also use the RESIZE and AUTOEXTEND options for data files.


When a statement failure is encountered, it is likely that the Oracle server or the operating system will return an error code and a message. The failed SQL statement is automatically rolled back, then control is returned to the user program. The application developer or DBA can use the Oracle error codes to diagnose and help resolve the failure.


oracle provide later resuming, the execution of large database operations in the event of space allocation failures. This enables an administrator to take corrective action, instead of the Oracle database server returning an error to the user. After the error condition is corrected, the suspended operation automatically resumes. This feature is called resumable space allocation and the statements that are affected are called resumable statements



User Process Failures
===============================

A user’s process may fail for a number of reasons; however, the more common causes include:

The user performed an abnormal disconnect in the session. For example, a user issues a [Ctrl] + [Break] in SQL*Plus while connected to a database in a client-server configuration.
• The user’s session was abnormally terminated. One possible scenario is the user rebooted the client while connected to a database in a client-server configuration.
• The user’s program raised an address exception which terminated the session. This is common if the application does not properly handle exceptions when they are raised.

Resolution of User Process Failures
================================

The DBA will rarely need to take action to resolve user process errors. The user process cannot continue to work, although the Oracle server and other user processes will continue to function.

PMON Background Process


  • The PMON background process is usually sufficient for cleaning up after an abnormally terminated user process.
  • When the PMON process detects an abnormally terminated server process, it rolls back the transaction of the abnormally terminated process, and releases any resources and locks it has acquired.



User Errors:
=============

DBA intervention is usually required to recover from user errors. Common Types of User Errors
  • The user accidentally drops or truncates a table.
  • The user deletes all rows in a table.
  • The user commits data, but discovers an error in the committed data.



A key issue in any database and application environment is to make sure that users are properly trained and are aware of database availability and integrity implications
A DBA should understand the types of applications and business operations that may result in loss of data from user errors and how to implement recovery measures for those situations. Some recovery situations may be quite extensive, such as restoring the database to a point-in time just prior to the error, exporting the lost data, and then importing that data back into the database from which it was lost.
  • Recover from a valid backup.
  • Import the table from an export file.
  • Use LogMiner to determine the time of error.
  • Recover with a point-in-time recovery.
  • Use LogMiner to perform object-level recovery.
  • Use FlashBack to view and repair historical data



Instance Failure:
================

An instance failure may occur for numerous reasons:
  • A power outage occurs that causes the server to become unavailable.
  • The server becomes unavailable due to hardware problems such as a CPU failure,memory corruption, or an operating system crash.
  • One of the Oracle server background processes (DBWn, LGWR, PMON, SMON,CKPT) experiences a failure.




To recover from instance failure, the DBA:


  • Starts the instance by using the “startup” command. The Oracle server will automatically recover, performing both the roll forward and rollback phases.
  • Investigates the cause of failure by reading the instance alert.log file and any other trace files that were generated during the instance failure.


Instance Recovery
=====================

Instance recovery restores a database to its transaction-consistent state just prior to instance failure. The Oracle server automatically performs instance recovery when the database is opened if it is necessary.
No recovery action needs to be performed by you. All required redo information is read by SMON. To recover from this type of failure, start the database:

After the database has opened, notify users that any data that they did not commit must be re-entered


There may be a time delay between starting the database and the “Database opened” notification—this is the roll forward phase that takes place while the database is mounted.

– SMON performs the roll forward process by applying changes recorded in the online redo log files from the last checkpoint.
– Rolling forward recovers data that has not been recorded in the database files, but has been recorded in the online redo log, including the contents of rollback segments.

Rollback can occur while the database is open, because either SMON or a server process can perform the rollback operation. This allows the database to be available for users more quickly.



Media Failures
=================

Media failure involves a physical problem when reading from or writing to a file that is necessary for the database to operate. Media failure is the most serious type of failure because it usually requires DBA intervention.
Common Types of Media Related Problems
  • The disk drive that held one of the database files experienced a head crash.
  • There is a physical problem reading from or writing to the files needed for normal database operation.
  • A file was accidentally erased.


Resolutions for Media Failures
================================================

A tested recovery strategy is the key component to resolving media failure problems. The ability of the DBA to minimize down time and data loss as a result of media failure depends on the type of backups that are available. A recovery strategy, therefore, depends on the following:

  • The backup method you choose and which files are affected.
  • The Archivelog mode of operation of the database. If archiving is used, you can apply archived redo log files to recover committed data since the last backup


No comments:

Post a Comment